How to programmatically clean old drafts and purge trash in Confluence with Python?

Hello!

In this article, we will learn how to do maintenance with Python and the Confluence REST API, and how to set the script up in your CI, e.g. Atlassian Bamboo. It is also very easy to automate and to extend the functionality.

Nowadays, many Confluence instances have collaborative editing enabled or have gone a long time without maintenance. For example, on our instance the DB backup in text format shrank by about 16% after this cleanup.

 

Let's start with the trash cleaner.

  1. The algorithm is simple: get all pages from the trash of a space, then remove them.
  2. The REST API reference is located here, e.g. https://docs.atlassian.com/ConfluenceServer/rest/6.11.0/
  3. The language should be easy to implement in, so it is Python with the requests module.
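
The three steps above can be sketched against the raw REST API. This is only a rough sketch, not the article's own code. It assumes that `GET /rest/api/content` with `spaceKey` and `status=trashed` lists trashed pages and that `DELETE /rest/api/content/{id}` with `status=trashed` purges one, as described in the Confluence Server REST reference. The names `list_trashed_pages`, `purge_trash`, `base_url` and the `session` argument are chosen for illustration.

```python
def list_trashed_pages(session, base_url, space_key, limit=100):
    """Return one batch of trashed pages for a space (assumed endpoint)."""
    response = session.get(
        base_url + "/rest/api/content",
        params={"spaceKey": space_key, "status": "trashed",
                "type": "page", "limit": limit},
    )
    response.raise_for_status()
    return response.json().get("results", [])


def purge_trash(session, base_url, space_key, limit=100):
    """Purge trashed pages batch by batch; return the number purged."""
    purged = 0
    while True:
        # Always ask for the first batch: purged pages leave the listing.
        pages = list_trashed_pages(session, base_url, space_key, limit)
        if not pages:
            return purged
        for page in pages:
            response = session.delete(
                base_url + "/rest/api/content/" + str(page["id"]),
                params={"status": "trashed"},
            )
            response.raise_for_status()  # stop on the first failed purge
            purged += 1
```

With the requests module, `session` would be a `requests.Session()` with `session.auth = (user, password)` set before the call.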

 To make working with the raw REST API more comfortable, I use the Python module atlassian-python-api, so the code is small and simple.

def clean_pages_from_space(confluence, space_key):
    """
    Remove all pages from trash for related space
    :param confluence: Confluence client from atlassian-python-api
    :param space_key: key of the space whose trash is purged
    :return: None
    """
    limit = 500
    flag = True
    while flag:
        # start stays at 0 because purged pages drop out of the listing
        values = confluence.get_all_pages_from_space_trash(space=space_key, start=0, limit=limit)
        if len(values) == 0:
            flag = False
            print("For space {} trash is empty".format(space_key))
        else:
            for value in values:
                print(value['title'])
                confluence.remove_page_from_trash(value['id'])

Feel free to use the full example script: https://github.com/atlassian-python-api/atlassian-python-api/blob/master/examples/confluence-trash-cleaner.py
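
One thing to watch with this loop: because it always re-reads from `start=0`, it would keep running if a purge silently had no effect (for example, because of missing permissions). Here is a minimal sketch of a guarded variant. It uses the same two wrapper methods as the function above; `purge_space_trash` and the stop condition are illustrative additions, not part of the library or the linked example.

```python
def purge_space_trash(confluence, space_key, limit=500):
    """Purge a space's trash; fail fast if a purged page shows up again."""
    purged_ids = set()
    while True:
        values = confluence.get_all_pages_from_space_trash(
            space=space_key, start=0, limit=limit)
        if not values:
            return len(purged_ids)
        for value in values:
            if value["id"] in purged_ids:
                # The page was purged earlier but is still listed:
                # stop instead of looping forever.
                raise RuntimeError(
                    "Page {} is still in the trash".format(value["id"]))
            confluence.remove_page_from_trash(value["id"])
            purged_ids.add(value["id"])
```

The return value is the number of purged pages, which is handy for a CI log line.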

 

 

The next step is cleaning draft pages.

Of course, in this use case we need a threshold to determine how old a draft must be before we remove it.

Therefore I use a variable:

import datetime

DRAFT_DAYS = 30

def clean_draft_pages_from_space(confluence, space_key, count, date_now):
    """
    Remove draft pages from space using datetime.now
    :param confluence:
    :param space_key:
    :param count:
    :param date_now:
    :return: int counter
    """
    pages = confluence.get_all_draft_pages_from_space(space=space_key, start=0, limit=500)
    for page in pages:
        page_id = page['id']
        draft_page = confluence.get_draft_page_by_id(page_id=page_id)
        last_date_string = draft_page['version']['when']
        last_date = datetime.datetime.strptime(last_date_string.replace(".000", "")[:-6], "%Y-%m-%dT%H:%M:%S")
        if (date_now - last_date) > datetime.timedelta(days=DRAFT_DAYS):
            count += 1
            print("Removing page with page id: " + page_id)
            confluence.remove_page_as_draft(page_id=page_id)
            print("Removed page with date {}".format(last_date_string))
    return count

https://github.com/atlassian-python-api/atlassian-python-api/blob/master/examples/confluence-draft-page-cleaner.py
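
The age check in the function above can also be written as a small standalone helper. This is a sketch under two assumptions: the `when` value looks like `2019-03-01T10:15:30.000+03:00` (milliseconds plus a UTC offset, which is what the `.replace(".000", "")[:-6]` slicing implies), and Python 3.7 or newer is used so that `%z` accepts an offset with a colon. `parse_when` and `is_older_than` are hypothetical names.

```python
import datetime


def parse_when(value):
    """Parse a timestamp such as '2019-03-01T10:15:30.000+03:00'."""
    return datetime.datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")


def is_older_than(value, days, now=None):
    """Return True if the timestamp is more than `days` days before `now`."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    return (now - parse_when(value)) > datetime.timedelta(days=days)
```

Unlike the slicing approach, this keeps the UTC offset, so the comparison is timezone-aware; `now` must therefore be an aware datetime as well.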

 

That's all. I hope it helps you clean up your Confluence easily. Next time I will show how to clean up page versions and attachment versions, because these use cases free up a lot of disk space.

P.S. Set the script up in your CI system so that the task can be delegated to other teammates.
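
For a CI job, credentials are usually passed as environment variables rather than hard-coded in the script. A minimal sketch of that idea follows; the variable names (`CONFLUENCE_URL`, `CONFLUENCE_USER`, `CONFLUENCE_PASSWORD`, `CONFLUENCE_SPACES`) and the `load_settings` helper are assumptions made for illustration, not something Bamboo or atlassian-python-api defines.

```python
import os

# Hypothetical variable names; configure them in the CI plan.
REQUIRED = ("CONFLUENCE_URL", "CONFLUENCE_USER",
            "CONFLUENCE_PASSWORD", "CONFLUENCE_SPACES")


def load_settings(env=None):
    """Read connection settings and a comma-separated list of space keys."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return {
        "url": env["CONFLUENCE_URL"],
        "username": env["CONFLUENCE_USER"],
        "password": env["CONFLUENCE_PASSWORD"],
        "space_keys": [key.strip()
                       for key in env["CONFLUENCE_SPACES"].split(",")
                       if key.strip()],
    }
```

The `url`, `username` and `password` values can then be passed to `Confluence(url=..., username=..., password=...)`, and the cleaner functions called once per entry in `space_keys`.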


 

Cheers,

Gonchik Tsymzhitov

10 comments

Hi, Gonchik

Purging the trash works well.

But for cleaning draft pages, the method get_all_draft_pages_from_space returns only the current user's draft pages in the space.

 

Best regards,

Pavel Dmitriev

Hi Pavel, 

 

Thanks for the feedback.

That sounds reasonable. But I tested it on my Confluence instance, and the results were the same.

Anyway, I will follow up on your idea and compare on other instances.

 

Thanks! 

Cheers,

Gonchik Tsymzhitov 

Gonchik, thanks.

Also, the block that should delete the draft page contains no delete action:

if (date_now - last_date) > datetime.timedelta(days=DRAFT_DAYS):
    count += 1
    print("Removed page with date {}".format(last_date_string))

Best regards,

Pavel Dmitriev

@Pavel The examples have been adjusted. (FYI: they show how you can do it based on the wrapper.)

You can see it here:

https://github.com/AstroMatt/atlassian-python-api/pull/74

Also, I have changed the logic of a few methods.

Also, you may run into this error in your scripts: https://confluence.atlassian.com/confkb/removing-orphaned-draft-316113059.html

 

 

 

Cheers,

Gonchik Tsymzhitov

How can I use it with Confluence version 4.2.3? Thanks a lot.

Regards, Hsu Yao Chang

@Anderson Hsu Sorry for the delay. I would recommend looking for an XML-RPC call that does this.

Unfortunately, I don't have a 4.2.3 instance to test it on.

What about upgrading your instance?

Hello @Gonchik Tsymzhitov, thank you for providing the link. I tried a few of the provided scripts and they worked great, thanks for that. But I am unable to get confluence-trash-cleaner.py to run. When I run confluence-trash-cleaner.py -vvv, it doesn't give me any output or throw any errors. I have checked my space and I still see the pages in the trash. I also made sure I have space admin permissions. Can you please help? Below is the code. Did I miss something?

Note: I only want to purge the pages from the trash for a single space.

#!/usr/bin/python 
# coding=utf-8
from atlassian import Confluence

confluence = Confluence(
    url='https://confluence-site-url',
    username='sachin',
    password='***********')


def clean_pages_from_space(confluence, TEST):
    """
    Remove all pages from trash for related space
    :param confluence:
    :param space_key:
    :return:
    """
    limit=500
    flag = True
    step = 0
    while flag:
        values = confluence.get_all_pages_from_space_trash(space=space_key, start=0, limit=limit)
        step += 1
        if len(values) == 0:
            flag = False
            print("For space {} trash is empty".format(space_key))
        else:
            for value in values:
                print(value['title'])
                confluence.remove_page_from_trash(value['id'])

I have also tried the one from this link 

@sachin gangam 

Please add the following lines so that any errors are logged:

import logging

logging.basicConfig(level=logging.DEBUG)
