
How can I use `rest/api/content` to download all the 800k pages of my Confluence wiki without timing out?

Franck Dernoncourt July 20, 2022

I want to download all 800k pages of my Confluence wiki.

I'd like to use:

curl -u wikiusername:wikipassword "https://wiki.hostname.com/rest/api/content?start=1"

and simply increase start from 1 to 800000.

However, the response time grows as start increases, and requests begin to time out at around start=80,000:

| start   | response time (seconds) |
|---------|-------------------------|
| 1       | 0.4                     |
| 1,000   | 2.5                     |
| 10,000  | 9                       |
| 50,000  | 112                     |
| 100,000 | timeout                 |

How can I use rest/api/content to download all the 800k pages of my Confluence wiki without timing out?

1 answer

0 votes
Nic Brough -Adaptavist-
Community Leader
July 25, 2022

This happens because, when you make the REST call, the server tries to build the whole response in memory before it can send it back. If the response is too large, the process fails.

You're going to need to page through what you're trying to download. You can't do it in one massive chunk (unless you increase the server memory to something massive).
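As a rough illustration of paging, here is a minimal Python sketch that walks `rest/api/content` in fixed-size pages. It assumes the endpoint accepts `start` and `limit` query parameters and returns JSON with a `results` list. The helper names (`iter_content`, `make_fetcher`) and the page size are made up for this example. It only shows the paging loop and does not by itself guarantee that requests with very large `start` values will stop being slow.

```python
import base64
import json
import urllib.parse
import urllib.request


def iter_content(fetch_json, base_url, limit=100, start=0):
    """Yield content items page by page until an empty page comes back.

    fetch_json is any callable that takes a URL and returns the decoded
    JSON body as a dict (hypothetical helper, see make_fetcher below).
    """
    while True:
        query = urllib.parse.urlencode({"start": start, "limit": limit})
        page = fetch_json(f"{base_url}/rest/api/content?{query}")
        # Assumption: the items of the current page are under "results".
        results = page.get("results", [])
        if not results:
            return
        yield from results
        # Advance by what was actually returned, in case the server
        # caps the page size below the requested limit.
        start += len(results)


def make_fetcher(username, password, timeout=60):
    """Build a fetch_json callable that uses HTTP basic auth (like curl -u)."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()

    def fetch_json(url):
        request = urllib.request.Request(
            url, headers={"Authorization": f"Basic {token}"}
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)

    return fetch_json


# Example usage (not executed here; hostname and credentials are the
# placeholders from the question):
#   fetch = make_fetcher("wikiusername", "wikipassword")
#   for item in iter_content(fetch, "https://wiki.hostname.com", limit=100):
#       print(item.get("id"), item.get("title"))
```

Passing the fetch function in as a parameter keeps the paging loop separate from the HTTP details, so the loop can be tried against canned data before pointing it at a real server.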

I'd also like to ask why. What is this download going to do for you? I'm thinking there may be a better option (like parsing a backup).
