
Extracting all attachments from a space

Tony Rice June 2, 2023

We're retiring a Confluence server. Some spaces are migrating to another server, but many just aren't updated any more, so we wanted to take snapshots of those spaces for later reference.

We exported each space as PDF, a reference to squirrel away somewhere for later. But the PDF export only includes image attachments; it ignores all other file types.

The XML export does include all attachments, but names the files with their internal ID rather than something more human readable.

So I threw together this Python script, which parses the entities.xml file in an XML export, gathers the filenames from there, and maps them to their IDs. It creates a folder named for the space, then subfolders named for each page with an attachment, and copies the attachment files into them.

I hope it is useful to someone else.

https://github.com/rtphokie/confluence_attachment_extract
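For anyone who wants a feel for the approach before opening the repository, here is a rough sketch of the idea described above. It is not the script from the repo. It assumes that entities.xml holds `<object class="Page">` and `<object class="Attachment">` elements with an `<id>` child and a `title` property, that each attachment points at its page through a `containerContent` property, and that the export stores files as `attachments/<page id>/<attachment id>/<version>`. Check those assumptions against your own export; the function names `read_attachments`, `safe_name`, and `extract` are made up for illustration.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def _prop(obj, name):
    """Return the <property name="..."> child of an element, or None."""
    for child in obj.findall("property"):
        if child.get("name") == name:
            return child
    return None


def _obj_id(elem):
    """Return the text of an element's direct <id> child, or None."""
    if elem is None:
        return None
    id_elem = elem.find("id")
    if id_elem is None or not id_elem.text:
        return None
    return id_elem.text.strip()


def read_attachments(entities_xml):
    """Return (page id -> title, attachment id -> (page id, filename)).

    Assumed layout: the filename lives in the attachment's "title" property.
    """
    root = ET.parse(str(entities_xml)).getroot()
    page_titles, attachments = {}, {}
    for obj in root.iter("object"):
        title = _prop(obj, "title")
        name = title.text if title is not None else None
        if not name:
            continue
        if obj.get("class") == "Page":
            page_titles[_obj_id(obj)] = name
        elif obj.get("class") == "Attachment":
            page_id = _obj_id(_prop(obj, "containerContent"))
            attachments[_obj_id(obj)] = (page_id, name)
    return page_titles, attachments


def safe_name(name):
    """Replace path separators so a title cannot escape its folder."""
    return name.replace("/", "_").replace("\\", "_")


def extract(export_dir, space_dir):
    """Copy attachments into space_dir/<page title>/<original filename>."""
    export_dir, space_dir = Path(export_dir), Path(space_dir)
    page_titles, attachments = read_attachments(export_dir / "entities.xml")
    copied = []
    for att_id, (page_id, filename) in attachments.items():
        src_dir = export_dir / "attachments" / str(page_id) / str(att_id)
        if not src_dir.is_dir():
            continue  # no files on disk for this attachment id
        files = [p for p in src_dir.iterdir() if p.is_file()]
        if not files:
            continue
        # Assumption: files are named by version number; keep the highest.
        src = max(files, key=lambda p: int(p.name) if p.name.isdigit() else -1)
        dest_dir = space_dir / safe_name(page_titles.get(page_id) or str(page_id))
        dest_dir.mkdir(parents=True, exist_ok=True)
        dest = dest_dir / safe_name(filename)
        shutil.copy2(src, dest)
        copied.append(dest)
    return copied
```

Calling `extract("path/to/unzipped-export", "MYSPACE")` would then fill a `MYSPACE` folder with one subfolder per page. Two attachments with the same name on the same page would overwrite each other in this sketch, so treat it as a starting point only.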

1 comment

Craig Nodwell
Community Leader
June 3, 2023

Value here!
@Tony Rice thank you sir!
