Can we get the Confluence page ID via REST for a given page URL?

srinivasp
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
November 20, 2017

I need to get the Confluence page ID and page owner for a given Confluence URL using the REST API. Is this possible?

3 answers

1 accepted

6 votes
Answer accepted
AnnWorley
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
November 21, 2017

Hi Srinivas,

This page has an example of how to get the page ID and creator if you have the URL (which includes the space key and page title):

Confluence REST API Examples

Please see the heading, "Find a page by title and space key".

There are no page owners in Confluence, but perhaps finding the creator will be of help.
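As a rough sketch of the "find a page by title and space key" lookup, the helper below splits a `/display/<SPACE>/<Title>` URL and queries `/rest/api/content` with `spaceKey`, `title` and `expand=history`. The function names are hypothetical, `session` is assumed to be an already authenticated `requests.Session`, and the response shape (`results[0].id`, `results[0].history.createdBy.displayName`) is an assumption based on the typical output of that endpoint.

```python
from urllib.parse import unquote_plus, urlsplit


def parse_display_url(page_url):
    """Split a /display/<SPACE>/<Title> URL into (base_url, space_key, title)."""
    parts = urlsplit(page_url)
    prefix, sep, rest = parts.path.partition("/display/")
    if not sep or "/" not in rest:
        raise ValueError("not a /display/<SPACE>/<Title> URL: " + page_url)
    space_key, _, raw_title = rest.partition("/")
    # Keep any context path (e.g. /wiki) that precedes /display/.
    base_url = f"{parts.scheme}://{parts.netloc}{prefix}"
    # Titles in display URLs use "+" for spaces; unquote_plus also decodes %20.
    return base_url, space_key, unquote_plus(raw_title)


def find_page(session, page_url):
    """Return (page_id, creator_name), or None if no page matches.

    Assumption: `session` is an authenticated requests.Session.
    """
    base_url, space_key, title = parse_display_url(page_url)
    response = session.get(
        base_url + "/rest/api/content",
        params={"spaceKey": space_key, "title": title, "expand": "history"},
    )
    response.raise_for_status()
    results = response.json()["results"]
    if not results:
        return None
    page = results[0]
    # Assumed response shape: history.createdBy.displayName holds the creator.
    return page["id"], page["history"]["createdBy"]["displayName"]
```

With `import requests`, a session created via `requests.Session()` and its `auth` attribute set, `find_page(session, "https://confluence.mydomain.com/display/TST/My+Specific+Page")` would return the ID and the creator's display name.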

Please let us know if you have any follow-up questions.

Thanks,

Ann

srinivasp
November 21, 2017

Thank you

Hung Nguyen February 25, 2020

In some cases, the page is given as a URL with no title at all. It may be a tiny URL. Is there any way to get the page ID from that using the REST API?

Hung Nguyen April 2, 2020

My workaround for this case (no title, no space) is to take one extra step: 'get' the content behind the tiny URL first, then extract the space, title and ID from the metadata in that content.

They are stored in the attributes named ajs-space-key, ajs-page-title and ajs-latest-page-id.
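The extraction step can be sketched with the standard library's HTML parser. This assumes the values appear as `<meta name="ajs-..." content="...">` tags in the page HTML, as in the example line quoted later in this thread; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class AjsMetaParser(HTMLParser):
    """Collect <meta name="ajs-..." content="..."> tags into a dict."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.startswith("ajs-"):
            self.meta[name] = attributes.get("content")


def extract_ajs_meta(html_text):
    """Return all ajs-* meta values found in the given page HTML."""
    parser = AjsMetaParser()
    parser.feed(html_text)
    return parser.meta
```

Given the text of the downloaded page, `extract_ajs_meta(res.text).get("ajs-latest-page-id")` would yield the ID, or `None` if the tag is absent (for example, when a login page was returned instead).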

Phaneendra chitta September 30, 2021

Hi,

@AnnWorley Can I get only the page ID as a JSON response instead of all the page information?

Andrey Tetyuev July 13, 2023

@Phaneendra chitta: AFAIK there is no way to get the page ID from a short link other than to "open the link" (e.g. via curl) and then read the ID from the retrieved page content, as described below in the comment from @Partha Kaushik on 24.06.2021.

I don't know why retrieving the page content from the short link should be avoided, but if it really is strictly prohibited for some reason, a workaround could be to iterate through all pages on the Confluence server via the REST API (i.e. get all spaces, then within each space iterate over all pages and their sub-pages, starting from the space homepage). During the iteration, compare the short link of each page with the requested one to find the page ID. But this workaround is really ugly and will take longer than downloading the page content directly...
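For completeness, here is a rough sketch of that brute-force workaround. It simplifies the space-by-space walk into a single paginated listing of `/rest/api/content` across all spaces, and it rests on several assumptions: `session` is an authenticated `requests.Session`, each result carries its short link under `_links.tinyui` (a `/x/...` path), and the response advertises further batches via `_links.next`. The function name is hypothetical.

```python
def find_page_id_by_tiny_link(session, base_url, tiny_path, page_size=100):
    """Scan all visible pages; return the ID of the one whose tiny link matches.

    Assumptions: results expose "_links" -> "tinyui", and "_links" -> "next"
    is present while more batches remain.
    """
    start = 0
    while True:
        response = session.get(
            base_url + "/rest/api/content",
            params={"type": "page", "start": start, "limit": page_size},
        )
        response.raise_for_status()
        data = response.json()
        results = data["results"]
        for page in results:
            if page.get("_links", {}).get("tinyui") == tiny_path:
                return page["id"]
        # Stop when the batch is empty or the server reports no further batch.
        if not results or "next" not in data.get("_links", {}):
            return None
        start += len(results)
```

Every batch costs one request, so on a large instance this is far slower than fetching the single page behind the short link.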

1 vote
Partha Kaushik
I'm New Here
Those new to the Atlassian Community have posted less than three times. Give them a warm welcome!
June 24, 2021

Here is a code snippet in Python:

import re, sys, requests
# In the page URL, replace the + symbols with %20
confPage = "https://confluence.mydomain.com/display/TST/My%20Specific%20Page"
# confUsername and confPassword must be defined before this point
auth = (confUsername, confPassword)
headers = { 'Content-type': 'application/json', 'Accept': 'application/json' }
# If the instance requires login, pass headers=headers and auth=auth to this call
res = requests.get(confPage)
#print (res.content)
# The page HTML contains e.g. <meta name="ajs-page-id" content="544428643"> , need that page-id
match = re.search(r'<meta name="ajs-page-id" content="(\d+)"', res.text)
if match is None: sys.exit("Page ID not found")
pageID = match.group(1)
print (" Page ID: " + pageID)

Rakesh Narayanaswamy
September 21, 2023

confPage = https://confluence.mydomain.com/display/TST/My%20Specific%20Page did not work for me. Instead, I had to use the REST endpoint:

confPage = https://confluence.mydomain.com/display/rest/api/content/464194260

Are you sure we can use the normal https URL that we use in the browser in a script?

Andrey Tetyuev June 5, 2024

The snippet works; you just forgot to pass the auth and headers arguments in the call to requests.get() ;-)
So the correct call would look like:

res = requests.get(confPage, headers = headers, auth = auth)

I would also recommend storing the username and password in user-specific local environment variables on the PC where the script runs. The script can then read those environment variables at runtime and use the values in the call to requests.get().
It would also be good to give the local variables names that differ from the function arguments (e.g. my_auth and my_headers).

Last hint: if someone would like to access Confluence using a PAT (instead of a username and password), the following could help:

my_pat = os.getenv('YOUR_NAME_OF_ENV_VARIABLE_CONTAINING_PAT')
my_headers = {"Content-type": "application/json", "Accept": "application/json", "Authorization": "Bearer " + my_pat}
res = requests.get(confPage, headers=my_headers )
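The environment-variable suggestion applies to the username and password combination as well. A minimal sketch, assuming the hypothetical variable names CONFLUENCE_USER and CONFLUENCE_PASSWORD (any names will do as long as the script and the environment agree):

```python
import os


def load_basic_auth(user_var="CONFLUENCE_USER", password_var="CONFLUENCE_PASSWORD"):
    """Build the (user, password) tuple for requests from environment variables.

    The variable names are hypothetical defaults, not anything Confluence defines.
    """
    missing = [name for name in (user_var, password_var) if not os.environ.get(name)]
    if missing:
        # Fail early with a clear message instead of sending empty credentials.
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return os.environ[user_var], os.environ[password_var]
```

The result can be passed directly, e.g. `requests.get(confPage, headers=my_headers, auth=load_basic_auth())`.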

 

0 votes
Dara O hEidhin June 4, 2024

Some URLs do not contain a title, page ID or space. The following grabs the redirect URL for such a URL and extracts the page ID, which can then be used to fetch the page using the Python API.

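``` python

import re, requests

confpage = "https://<MYSITE>.atlassian.net/l/cp/blahblahblahNO_TITLE_NO_ID"

# Replace the placeholders with real credentials
auth = ("<username>", "<password>")
headers = {"Content-type": "application/json", "Accept": "application/json"}
# requests follows redirects by default; r.url is the final URL
r = requests.get(confpage, headers=headers, auth=auth)

# \d+ rather than \d{8}: page IDs are not always 8 digits long
match = re.search(r"/pages/(\d+)", r.url)
print(match.group(1) if match else "No page ID in " + r.url)

```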
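A resolved URL can carry the ID in more than one shape. The sketch below handles two forms that are assumed here to be the common ones: a `pageId` query parameter (as in `/pages/viewpage.action?pageId=...`) and a numeric `/pages/<id>` path segment. The function name is hypothetical, and it returns `None` rather than raising when neither form is present.

```python
import re
from urllib.parse import parse_qs, urlsplit


def page_id_from_url(url):
    """Return the page ID embedded in a Confluence URL, or None if there is none."""
    parts = urlsplit(url)
    # Form 1: .../pages/viewpage.action?pageId=464194260
    query_ids = parse_qs(parts.query).get("pageId")
    if query_ids and query_ids[0].isdigit():
        return query_ids[0]
    # Form 2: .../spaces/<KEY>/pages/544428643/<Title>
    match = re.search(r"/pages/(\d+)(?:/|$)", parts.path)
    return match.group(1) if match else None
```

Applied to the final URL of a response (`page_id_from_url(r.url)`), this gives a quick check of whether the redirect exposed the ID at all.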

Andrey Tetyuev June 5, 2024

I've just tried the proposed approach and it doesn't work in my case (using Confluence on a cloud server): response.url contains exactly the same page URL as in the request (i.e. it is not "resolved" to something like https://Some_Site_Url/*page* ).
Accordingly, the regex doesn't find anything.
I would recommend the approach described in this comment:
https://community.atlassian.com/t5/Confluence-questions/Re-Can-we-get-Confluence-Page-ID-via-REST-for-a-given-pa/qaq-p/1733664/comment-id/212025#M212025

Currently I don't know of any way other than parsing the page content to get its ID.

Dara O hEidhin June 5, 2024


A couple of things:

  1. When you paste the URL into a browser, does it redirect to a https://Some_Site_Url/*page* URL? If it doesn't, then the approach above is likely to fail.
  2. If it does, try
    r = requests.get(confpage, headers=headers, auth=auth, allow_redirects=True)

  3. It may also be worth looking at
    r.headers['Location']
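Two details of the requests library matter for these checks: GET requests follow redirects by default, and a `Location` header belongs to the redirecting response, not to the final one. A small sketch of how both could be inspected, assuming `session` is an authenticated `requests.Session` (the helper names are hypothetical):

```python
def redirect_chain(response):
    """Return every URL visited, from the first request to the final one."""
    # response.history holds the intermediate redirect responses, oldest first.
    return [hop.url for hop in response.history] + [response.url]


def first_redirect_target(session, url):
    """Request `url` without following redirects; return Location or None."""
    response = session.get(url, allow_redirects=False)
    # Only a redirect response (3xx) carries a Location header.
    return response.headers.get("Location")
```

If `redirect_chain(r)` contains a single URL, the server answered the short link directly and no redirect took place, so the ID has to come from the page content instead.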


Andrey Tetyuev June 7, 2024

1) No, the URL is not redirected in the browser to something different (so it would not contain the page ID). Therefore you can't rely on this method, because it depends on the server behavior and on the specific page URL requested.
The solution that parses the page content works out of the box.

3) The headers contain no 'Location' item (nor 'location'). I assume it is only filled in when a redirect happens; without a redirect, the item doesn't exist. There are also no other items in the headers that contain the page ID.

So the most reliable way to get the ID is the one described in this comment:
https://community.atlassian.com/t5/Confluence-questions/Re-Can-we-get-Confluence-Page-ID-via-REST-for-a-given-pa/qaq-p/1733664/comment-id/212025#M212025
(the code works with the minor fix described in my reply to it)
