How to programmatically clean old drafts and purge trash with Python for Confluence?

Hello!

In this article, we will learn how to do Confluence maintenance with Python and the Confluence REST API, and how to run the script from your CI server, e.g. Atlassian Bamboo. The approach is easy to automate and to extend.

Nowadays, many Confluence instances have collaborative editing enabled or have not been maintained for a long time. For example, on our instance the cleanup reduced the size of the text-format DB backup by about 16%.

 

Let's start with the trash cleaner.

  1. The algorithm is simple: get all pages from the trash, then remove them.
  2. The REST API reference is located here, e.g. https://docs.atlassian.com/ConfluenceServer/rest/6.11.0/
  3. The implementation language is Python with the requests module, which keeps things easy.

To make the raw REST API more comfortable to use, I'm using the Python module atlassian-python-api. Hence the code is small and simple.

def clean_pages_from_space(confluence, space_key):
    """
    Remove all pages from the trash of the given space
    :param confluence: Confluence client from atlassian-python-api
    :param space_key: key of the space whose trash is purged
    :return:
    """
    limit = 500
    while True:
        # Always read from start=0: purged pages drop out of the listing
        values = confluence.get_all_pages_from_space_trash(space=space_key, start=0, limit=limit)
        if len(values) == 0:
            print("For space {} trash is empty".format(space_key))
            break
        for value in values:
            print(value['title'])
            confluence.remove_page_from_trash(value['id'])

Feel free to use the full example script: https://github.com/atlassian-python-api/atlassian-python-api/blob/master/examples/confluence-trash-cleaner.py
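The loop above keeps running for as long as the trash listing is non-empty, so it would spin forever if a page could not be purged. Here is a minimal sketch of the same idea with a bounded number of passes. The `max_passes` parameter and the `FakeConfluence` class are hypothetical additions for illustration only; the two client methods are the ones already used in this article.

```python
def purge_space_trash(confluence, space_key, limit=500, max_passes=100):
    """Purge the trash of one space and return the number of purge calls made.

    max_passes is a hypothetical safety bound, not part of the original script.
    """
    removed = 0
    for _ in range(max_passes):
        # Always read from start=0: purged pages drop out of the listing.
        pages = confluence.get_all_pages_from_space_trash(
            space=space_key, start=0, limit=limit
        )
        if not pages:
            break
        for page in pages:
            confluence.remove_page_from_trash(page["id"])
            removed += 1
    return removed


class FakeConfluence:
    """Hypothetical in-memory stand-in, used only to try the loop offline."""

    def __init__(self, trash):
        self.trash = list(trash)

    def get_all_pages_from_space_trash(self, space, start=0, limit=500):
        return self.trash[start:start + limit]

    def remove_page_from_trash(self, page_id):
        self.trash = [p for p in self.trash if p["id"] != page_id]
```

With a real instance you would pass the atlassian-python-api client instead of the fake one. The return value is handy for logging how much was purged per space.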

 

 

The next step is cleaning draft pages.

Of course, in this use case we need some threshold to determine how old a draft must be before we remove it.

Therefore I am using a variable:

import datetime

DRAFT_DAYS = 30

def clean_draft_pages_from_space(confluence, space_key, count, date_now):
    """
    Remove draft pages from space using datetime.now
    :param confluence: Confluence client from atlassian-python-api
    :param space_key: key of the space to clean
    :param count: running counter of removed drafts
    :param date_now: current date, e.g. datetime.datetime.now()
    :return: int counter
    """
    pages = confluence.get_all_draft_pages_from_space(space=space_key, start=0, limit=500)
    for page in pages:
        page_id = page['id']
        draft_page = confluence.get_draft_page_by_id(page_id=page_id)
        last_date_string = draft_page['version']['when']
        # Drop the milliseconds and the trailing UTC offset (e.g. +03:00) before parsing
        last_date = datetime.datetime.strptime(last_date_string.replace(".000", "")[:-6], "%Y-%m-%dT%H:%M:%S")
        if (date_now - last_date) > datetime.timedelta(days=DRAFT_DAYS):
            count += 1
            print("Removing page with page id: " + page_id)
            confluence.remove_page_as_draft(page_id=page_id)
            print("Removed page with date {}".format(last_date_string))
    return count

Full example: https://github.com/atlassian-python-api/atlassian-python-api/blob/master/examples/confluence-draft-page-cleaner.py
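The string slicing in the function above only works when the timestamp ends with ".000" plus a six-character offset such as "+03:00". As a sketch of a more tolerant age check, assuming the `when` field is an ISO 8601 string, the standard library can do the parsing. The helper names `parse_confluence_timestamp` and `is_stale` are hypothetical, and treating offset-less timestamps as UTC is an assumption.

```python
import datetime

DRAFT_DAYS = 30  # same threshold as in the article


def parse_confluence_timestamp(when):
    """Parse a string such as '2019-01-15T10:20:30.000+03:00' into an aware datetime."""
    # fromisoformat() before Python 3.11 rejects a trailing 'Z', so normalise it.
    if when.endswith("Z"):
        when = when[:-1] + "+00:00"
    parsed = datetime.datetime.fromisoformat(when)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=datetime.timezone.utc)
    return parsed


def is_stale(when, now, days=DRAFT_DAYS):
    """Return True if the draft was last modified more than `days` days before `now`."""
    return (now - parse_confluence_timestamp(when)) > datetime.timedelta(days=days)
```

Note that `now` must be timezone-aware here, e.g. `datetime.datetime.now(datetime.timezone.utc)`, otherwise the subtraction raises a TypeError.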

 

That's all. I hope it helps you clean up your Confluence easily. Next time I will show how to clean page versions and attachment versions, because those use cases will free up a lot of disk space.

P.S. Let's put it into a CI system so that it can be delegated to other teammates.
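When the script runs from CI, the connection details usually should not be hard-coded. Below is a minimal sketch that reads them from environment variables. The variable names (`CONFLUENCE_URL`, `CONFLUENCE_USER`, `CONFLUENCE_PASSWORD`, `DRAFT_DAYS`) and the `load_settings` helper are assumptions for illustration, not something Bamboo or atlassian-python-api defines.

```python
import os

# Hypothetical variable names; adjust them to whatever your CI plan exposes.
REQUIRED = ("CONFLUENCE_URL", "CONFLUENCE_USER", "CONFLUENCE_PASSWORD")


def load_settings(env=None):
    """Collect connection settings from the environment, failing early if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing CI variables: " + ", ".join(missing))
    return {
        "url": env["CONFLUENCE_URL"].rstrip("/"),
        "username": env["CONFLUENCE_USER"],
        "password": env["CONFLUENCE_PASSWORD"],
        # Optional override of the draft age threshold, defaulting to 30 days.
        "draft_days": int(env.get("DRAFT_DAYS", "30")),
    }
```

The `url`, `username` and `password` values can then be passed to the `Confluence` client from atlassian-python-api, and `draft_days` can replace the hard-coded constant.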


Cheers,

Gonchik Tsymzhitov

5 comments

Hi, Gonchik

Purging the trash works well.

But as for cleaning draft pages: the method get_all_draft_pages_from_space returns only the current user's draft pages in the space.

 

Best regards,

Pavel Dmitriev

Hi Pavel, 

 

Thanks for the feedback.

That sounds like a reasonable point. But I have tested it on my Confluence instance, where the results were the same.

Anyway, I will follow up on your idea and compare on other instances.

 

Thanks! 

Cheers,

Gonchik Tsymzhitov 

Gonchik, thanks.

And there is no action in the block for deleting the draft page:

        if (date_now - last_date) > datetime.timedelta(days=DRAFT_DAYS):
            count += 1
            print("Removed page with date {}".format(last_date_string))

Best regards,

Pavel Dmitriev

@Pavel The examples have been adjusted. (FYI: they show how you can do it based on the wrapper.)

You can see here:

https://github.com/AstroMatt/atlassian-python-api/pull/74

Also, I have changed the logic of a few methods.

Also, you may run into this error in your scripts: https://confluence.atlassian.com/confkb/removing-orphaned-draft-316113059.html

 

 

 

Cheers,

Gonchik Tsymzhitov

How can I use it in Confluence version 4.2.3? Thanks a lot.

Regards, Hsu Yao Chang
