List all pages in a space with page titles and IDs

Vikas Shrivastava
I'm New Here
Those new to the Atlassian Community have posted less than three times. Give them a warm welcome!
July 29, 2018

Can anyone please help me get a list of all pages, showing page titles and page IDs, within a Confluence space using a bash or Python script?

I want to generate a list of all pages showing the page title and page ID.

Thanks in advance

Vikas

2 answers

2 votes
Zak Laughton
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
July 31, 2018

Hi Vikas!

There are a few ways to accomplish this:

REST API

The API will likely be the best way to retrieve data from Confluence in a bash or Python script. See Confluence REST API Examples for examples of terminal and Python commands that use the API.

The following URL will return a JSON list of all pages in the instance (replace <base-URL> with the base URL for your instance):

http://<base-URL>/rest/api/content?type=page&start=0&limit=99999

You can then use Python to parse the JSON and find the ID and title of each page (useful article on JSON parsing with Python: Working with JSON data in Python).
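
As a minimal sketch of that parsing step, assuming the response has a top-level "results" array whose entries carry "id" and "title" fields (the helper name extract_ids_and_titles and the sample payload are made up for illustration):

```python
import json


def extract_ids_and_titles(response_text):
    """Return (id, title) pairs from a content-list JSON response."""
    payload = json.loads(response_text)
    # Each entry in "results" is assumed to describe one page.
    return [(item["id"], item["title"]) for item in payload.get("results", [])]


# Hypothetical sample payload, trimmed to the two fields used here.
sample = '{"results": [{"id": "123", "title": "Home"}, {"id": "456", "title": "Release notes"}]}'

for page_id, title in extract_ids_and_titles(sample):
    print(page_id, title)
```

In a real script, response_text would be the body returned by the URL above rather than a hard-coded string.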

Database

While the REST API would be most convenient to use with a Python/bash script, you can also get all the page titles and IDs from the database with the following query:

SELECT title, contentid
FROM content
WHERE contenttype = 'PAGE'
AND prevver IS NULL
AND content_status = 'current';

I hope this helps!
-Zak

0 votes
antony terrence June 1, 2021

@Zak Laughton When I use the following, I get only 200 results. Is that set by the Confluence Server admin?

http://<base-URL>/rest/api/content?type=page&start=0&limit=99999
Zhiwei Deng
June 1, 2021

This URL works for me:

https://<base-URL>/wiki/rest/api/space/{SPACE_KEY}/content?start=0&limit=9999&type=page

But there are still some problems:

1. The results are still capped: the limit is 1000.

2. When I add the parameter expand=children.page, the limit parameter has no effect (in fact, the limit drops back to 200).

@Vikas Shrivastava @antony terrence 

antony terrence June 1, 2021

I had to get the first set of results and then keep looping for as long as the response contained a next link. Even with the limit set to 99999, I get a maximum of 500 results per call. So a simple task like getting the details of all pages takes multiple calls. I am sure there are areas where Atlassian could reduce the number of calls required, and this scenario is one of them. The depth parameter does not work either.
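
A rough Python sketch of that loop, for anyone who wants it. It assumes each response carries a "results" array and, when more batches remain, a "_links.next" path that is relative to the base URL (on some instances the base includes a context path such as /wiki). The function names, the example host, and the fake batches are hypothetical; the fetch function is injectable so the loop can be tried offline.

```python
import json
import urllib.request


def fetch_json(url, headers=None):
    """GET a URL and decode the JSON body (standard library only)."""
    request = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def list_all_pages(base_url, first_path, fetch=fetch_json):
    """Collect (id, title) pairs by following _links.next until it is absent."""
    pages = []
    path = first_path
    while path:
        data = fetch(base_url + path)
        pages.extend((item["id"], item["title"]) for item in data.get("results", []))
        # The last batch is assumed to have no "next" link, which ends the loop.
        path = data.get("_links", {}).get("next")
    return pages


# Offline demonstration with a fake two-batch server (made-up data).
fake_batches = {
    "https://wiki.example.com/rest/api/content?type=page&limit=2": {
        "results": [{"id": "1", "title": "A"}, {"id": "2", "title": "B"}],
        "_links": {"next": "/rest/api/content?type=page&limit=2&start=2"},
    },
    "https://wiki.example.com/rest/api/content?type=page&limit=2&start=2": {
        "results": [{"id": "3", "title": "C"}],
        "_links": {},
    },
}
print(list_all_pages(
    "https://wiki.example.com",
    "/rest/api/content?type=page&limit=2",
    fetch=fake_batches.__getitem__,
))
```

Against a real instance you would drop the fetch argument and pass authentication headers through fetch_json.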

Zhiwei Deng
June 3, 2021

Yes. In the end, I made multiple calls to get all the pages. But I found another problem: the "children" field is also limited when I add the parameter expand=children.page.

(The limit is 25.) So I can't generate the tree structure. This is confusing.

https://<base-URL>/wiki/rest/api/space/{SPACE_KEY}/content?expand=children.page&type=page&limit=9999
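
One possible workaround for the truncated "children" list, offered as a sketch rather than a confirmed fix: request expand=ancestors instead and rebuild the tree on the client from each page's parent. This assumes every result carries an "ancestors" list ordered from the root down to the direct parent. The helper names and the sample data are hypothetical.

```python
from collections import defaultdict


def build_tree(pages):
    """Map each parent page ID to the IDs of its direct children.

    Top-level pages (no ancestors) are stored under the key None.
    """
    children = defaultdict(list)
    for page in pages:
        ancestors = page.get("ancestors") or []
        # Assumption: the last ancestor is the direct parent.
        parent_id = ancestors[-1]["id"] if ancestors else None
        children[parent_id].append(page["id"])
    return dict(children)


def print_tree(children, titles, parent_id=None, depth=0):
    """Print the hierarchy with two spaces of indentation per level."""
    for page_id in children.get(parent_id, []):
        print("  " * depth + f"{titles[page_id]} ({page_id})")
        print_tree(children, titles, page_id, depth + 1)


# Made-up sample in the assumed response shape.
pages = [
    {"id": "1", "title": "Home", "ancestors": []},
    {"id": "2", "title": "Guides", "ancestors": [{"id": "1"}]},
    {"id": "3", "title": "Install", "ancestors": [{"id": "1"}, {"id": "2"}]},
]
tree = build_tree(pages)
print_tree(tree, {p["id"]: p["title"] for p in pages})
```

Because every page reports its own parent, the 25-item cap on "children" no longer matters; the full page list still has to be collected across multiple calls first.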
Pankaj Rana May 23, 2023

I am seeing a 404 Not Found error when using the above API call.

Admin October 2, 2023

Hi guys, I used this script to list all pages from a specific space via the API:

# $serverUrl, $SpaceKey and $headers must be defined beforehand
$url = "https://$($serverUrl)/rest/api/space/$($SpaceKey)/content/page?limit=99999"
$response = Invoke-RestMethod -Method "GET" -Headers $headers -Uri $url -UseBasicParsing
$allSpacePages = @($response.results)

# Follow _links.next until the server stops returning one.
# A while loop (not do/while) avoids a bad request when the first batch is also the last.
while ($null -ne $response._links.next) {
    $url = "https://$($serverUrl)$($response._links.next)"
    $response = Invoke-RestMethod -Method "GET" -Headers $headers -Uri $url -UseBasicParsing
    $allSpacePages += $response.results
}

 



This really does go through every "_links.next" (I tested it step by step in Postman) until that property is null, and it returns about 6,500 pages from the space. But when I listed all pages from the space via an SQL query:

 

 

SELECT * FROM [Cfl-Db].[dbo].[CONTENT]
WHERE CONTENTTYPE = 'PAGE' AND SPACEID = 51118093
ORDER BY TITLE

 


I got about twice as many pages: roughly 12,000!


So the question is: why didn't the API call list all existing pages?

I am using Confluence Data Center 7.18.3.


Thank you for your answers :)



Admin October 3, 2023


Guys, my fault :( I realized that the DB query returns pages in every status (drafts, deleted, and current).
But even so, the DB still returns more pages than the API.

Fixed query:

SELECT * FROM [$db].[dbo].[CONTENT]
WHERE CONTENTTYPE = 'PAGE' AND CONTENT_STATUS = 'current' AND SPACEID = $spaceId
ORDER BY TITLE
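
To see which pages account for the remaining gap, one option is to diff the two ID sets. A rough sketch, assuming you have exported the CONTENTID column from the query above and collected the "id" values from the API results (both lists below are made-up sample data, and missing_from_api is a hypothetical helper):

```python
def missing_from_api(db_ids, api_ids):
    """Return IDs present in the database export but absent from the API results."""
    # API IDs are assumed to arrive as strings and database IDs as numbers,
    # so both sides are normalised to strings before comparing.
    api_set = {str(i) for i in api_ids}
    return sorted({str(i) for i in db_ids} - api_set)


db_ids = [101, 102, 103, 104]
api_ids = ["101", "103"]
print(missing_from_api(db_ids, api_ids))  # ['102', '104']
```

Looking up a few of the missing IDs in the database may show what they have in common, for example pages the API user is not permitted to view.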
