Does anyone know how to export files attached in Confluence as-is? I need to extract the files as-is (PDF, Word, XLS...) and save them to a drive.

 

2 answers

Hello!

Please follow the steps below to recover the attachment files. There are three options; choose the one that best fits your needs.

Choice A - Recover Attachments By Filename

Best if you know each filename you need to restore, especially if you want just a few files:

  1. Unzip the backup directory and open entities.xml.
  2. Search entities.xml for the filename and find the attachment object with that filename. Locate its page and attachment id.
  3. Using the page and attachment id from entities.xml, go to the attachments directory and open the directory named with that page id. Locate the file named with the attachment id.
  4. Rename the file to the original filename and test it.
  5. Repeat for each file.
  6. To import each file back into Confluence, upload to the original page by attaching the file from within Confluence.
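
Steps 2 and 3 can be scripted. The following is a minimal Python sketch, not an official Atlassian tool. It assumes attachment objects in entities.xml look like the `SAMPLE` below (an `object` element with `class="Attachment"`, an `id` child, a `fileName` property, and a `content` property holding the page id). Element and property names vary between Confluence versions, so check your own file first. `find_attachments` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# ASSUMED layout of entities.xml; verify against your own backup.
SAMPLE = """<hibernate-generic>
  <object class="Attachment" package="com.atlassian.confluence.pages">
    <id name="id">2001</id>
    <property name="fileName"><![CDATA[report.pdf]]></property>
    <property name="content" class="Page" package="com.atlassian.confluence.pages">
      <id name="id">1001</id>
    </property>
  </object>
  <object class="Attachment" package="com.atlassian.confluence.pages">
    <id name="id">2002</id>
    <property name="fileName"><![CDATA[budget.xls]]></property>
    <property name="content" class="Page" package="com.atlassian.confluence.pages">
      <id name="id">1001</id>
    </property>
  </object>
</hibernate-generic>"""


def find_attachments(root, filename):
    """Return (page_id, attachment_id) pairs for attachment objects named `filename`."""
    matches = []
    for obj in root.iter("object"):
        if obj.get("class") != "Attachment":
            continue
        name = (obj.findtext("property[@name='fileName']") or "").strip()
        if name != filename:
            continue
        attachment_id = (obj.findtext("id") or "").strip()
        page_id = (obj.findtext("property[@name='content']/id") or "").strip()
        matches.append((page_id, attachment_id))
    return matches


print(find_attachments(ET.fromstring(SAMPLE), "report.pdf"))  # [('1001', '2001')]
```

For a real backup, replace `ET.fromstring(SAMPLE)` with `ET.parse("entities.xml").getroot()`. Very large exports may need a streaming parser instead.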

Choice B - Restore Files By Page

Best if you only want to restore attachments for certain pages:

  1. Unzip the backup directory and open entities.xml.
  2. Search entities.xml for the page title and find the page object with that title. Locate its page id.
  3. Go to the attachments directory and open the directory named with that page id. Each file in the directory is an attachment that must be renamed.
  4. Search entities.xml for attachment objects with that page id. Every attachment object for the page will have an attachment id and filename.
  5. Rename the file with that attachment id to the original filename and test it.
  6. Repeat for each page.
  7. To import each file back into Confluence, upload to the original page by attaching the file from within Confluence.
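
Once the attachment ids and filenames for a page are known (step 4), steps 3 to 5 amount to copying each file out under its original name. Here is a minimal Python sketch, assuming the layout described above, where the directory for a page id holds one file per attachment id. In some backups the attachment id may itself be a directory of version files, which this sketch skips. The function name, the `names_by_id` mapping, and the paths are illustrative, not part of Confluence.

```python
import shutil
from pathlib import Path


def restore_page_attachments(attachments_dir, page_id, names_by_id, out_dir):
    """Copy each file in attachments_dir/page_id to out_dir under its original filename.

    names_by_id maps attachment id (str) -> original filename, as read from entities.xml.
    Returns the list of filenames that were restored.
    """
    src_dir = Path(attachments_dir) / str(page_id)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    restored = []
    for src in sorted(src_dir.iterdir()):
        filename = names_by_id.get(src.name)
        if filename is None or not src.is_file():
            continue  # unknown attachment id, or not a plain file: leave it alone
        shutil.copy2(src, out / filename)
        restored.append(filename)
    return restored


# Hypothetical usage:
# restore_page_attachments("backup/attachments", "1001",
#                          {"2001": "report.pdf", "2002": "budget.xls"}, "restored/1001")
```

Copying rather than renaming leaves the unzipped backup untouched, so the run can be repeated if a mapping turns out to be wrong.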

Choice C - Restore All Files

Best if you have a small backup but want to restore many or all of the attachments inside:

 

The following process applies to space exports only. Site XML backups do not require the page id to be updated manually, because their page ids are persistent.

  1. Unzip the backup directory and open entities.xml.
  2. Go to the attachments directory and open any directory. The directory name is a page id. Each of the files in the directory is an attachment that must be renamed.
  3. Search entities.xml for attachment objects with that page id. When one is found, locate the attachment id and filename.
  4. Rename the file with that attachment id to the original filename and test it.
  5. Find the next attachment id and rename it. Repeat for each file in the directory.
  6. Once all files in the current directory are renamed to their original filenames, search entities.xml for the page id, i.e. the directory name. Find the page object with that page id and locate its page title.
  7. Rename the directory to the page title and move on to the next directory. Repeat for each un-renamed directory in the attachments directory.
  8. To import each file back into Confluence, upload to the original page by attaching the file from within Confluence.
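
For a whole backup, it can help to build the full rename plan from entities.xml before touching any files. The Python sketch below is one possible approach under stated assumptions: page objects are assumed to carry a `title` property and attachment objects a `fileName` property plus a `content` property with the page id, as in the `SAMPLE`. Check these names against your own export. `safe_name` and `build_rename_plan` are hypothetical helpers. `safe_name` exists because page titles can contain characters such as `/` that are not valid in directory names.

```python
import re
import xml.etree.ElementTree as ET

# ASSUMED layout of entities.xml; verify against your own backup.
SAMPLE = """<hibernate-generic>
  <object class="Page" package="com.atlassian.confluence.pages">
    <id name="id">1001</id>
    <property name="title"><![CDATA[Q1/Q2 Plans]]></property>
  </object>
  <object class="Attachment" package="com.atlassian.confluence.pages">
    <id name="id">2001</id>
    <property name="fileName"><![CDATA[report.pdf]]></property>
    <property name="content" class="Page" package="com.atlassian.confluence.pages">
      <id name="id">1001</id>
    </property>
  </object>
  <object class="Attachment" package="com.atlassian.confluence.pages">
    <id name="id">2002</id>
    <property name="fileName"><![CDATA[budget.xls]]></property>
    <property name="content" class="Page" package="com.atlassian.confluence.pages">
      <id name="id">1001</id>
    </property>
  </object>
</hibernate-generic>"""


def safe_name(title):
    """Replace characters that are invalid in common file systems with underscores."""
    return re.sub(r'[\\/:*?"<>|]+', "_", title).strip() or "untitled"


def build_rename_plan(root):
    """Map page id -> (new directory name, {attachment id: original filename})."""
    titles, files = {}, {}
    for obj in root.iter("object"):
        obj_id = (obj.findtext("id") or "").strip()
        if obj.get("class") == "Page":
            titles[obj_id] = (obj.findtext("property[@name='title']") or "").strip()
        elif obj.get("class") == "Attachment":
            page_id = (obj.findtext("property[@name='content']/id") or "").strip()
            name = (obj.findtext("property[@name='fileName']") or "").strip()
            files.setdefault(page_id, {})[obj_id] = name
    # Fall back to the page id when no title was found for a directory.
    return {pid: (safe_name(titles.get(pid, pid)), names) for pid, names in files.items()}


for page_id, (dir_name, names) in build_rename_plan(ET.fromstring(SAMPLE)).items():
    print(page_id, "->", dir_name, names)
```

Printing the plan first makes it easy to spot duplicate titles or filenames before any directory is renamed.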

All these steps are documented here: https://confluence.atlassian.com/display/DOC/Retrieving+File+Attachments+from+a+Backup

If my answer helps you, please select the Accept Answer button.

If you have any questions, please let me know.

Kind regards,

Luiz Maia
Atlassian Support

Do you have any idea how to extract text from PDF files? There is something wrong with my PDF reader. Any suggestions would be appreciated. Thanks in advance.

-----------------------------------------------

Tags: pdf extraction
