
Export page to pdf using rest call


I'd like to export a page to PDF with a REST request. I know the page URL, including the page ID, but I don't know which request or wiki call to use.

I haven't been able to find any Confluence REST calls related to PDFs.

I'm using Python and the REST API.

3 answers

I'm working on the same type of issue, but only for PDFs. If you're using Python, you can use the Confluence class from the atlassian-python-api package (https://atlassian-python-api.readthedocs.io/confluence.html) and its export_page function to get the PDF.

confluence.export_page(page_id)

As the doc indicates, if you're using Confluence Cloud, make sure when you instantiate confluence that you set the api_version to cloud.


confluence = Confluence( url='https://yourcompany.atlassian.net/', api_version='cloud', username=username, password=password)

It works for me and exports the PDF.

That might have worked in the past, but it no longer works. It now gives the HTML for the page that waits for the .PDF file to be ready before displaying a link to download it.

I finally got it working:

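from atlassian import Confluence

confluence = Confluence(
    url="https://yourcompany.atlassian.net",
    username=user,
    password=password,  # "pass" is a reserved word in Python, so use another name
    api_version="cloud",
)
# Write the returned PDF bytes to a local file.
with open("output.pdf", "wb") as pdf_file:
    pdf_file.write(confluence.get_page_as_pdf("123456"))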

Note that you have to add api_version="cloud" to the init, not cloud=True!

With api_version="cloud", the API wrapper uses get_pdf_download_url_for_confluence_cloud to wait for the long-running task to complete and return the correct URL.

See https://github.com/atlassian-api/atlassian-python-api/blob/717656071b8d352ef349ce4ed8dace1e8c7084ad/atlassian/confluence.py#L2463

PS: Atlassian people, this is confusing: why have both a cloud=True and an api_version="cloud" option in the init? From the source code, it doesn't look like the cloud=True parameter is ever used.


Dirk, I think you need to post that about the constructor argument as an issue at https://github.com/atlassian-api/atlassian-python-api if you want it to get any attention since that has to do with the atlassian-python-api package and not the API itself.

Even though that will apparently work, it relies on undocumented calls intended for use by the Web UI that might change without notice at any time.

What I finally ended up doing instead is using wkhtmltopdf to generate a pdf locally from the HTML in the body in export_view format after saving that to a local file.
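A minimal sketch of the first half of that approach (fetching the body in export_view format and saving it to a local file), assuming the content endpoint with expand=body.export_view and basic auth with an API token. The function names, the base URL, and the output path are placeholders chosen for illustration, not code from the answer above.

```python
import base64
import json
import urllib.request


def export_view_url(base_url: str, page_id: str) -> str:
    """Build the content URL that requests the body in export_view format."""
    return f"{base_url.rstrip('/')}/rest/api/content/{page_id}?expand=body.export_view"


def extract_export_view_html(payload: dict) -> str:
    """Pull the rendered HTML out of the JSON returned by the content endpoint."""
    return payload["body"]["export_view"]["value"]


def save_export_view(base_url: str, page_id: str, username: str, token: str, out_path: str) -> None:
    """Fetch the page and write its export_view HTML to out_path (makes a network call)."""
    credentials = base64.b64encode(f"{username}:{token}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        export_view_url(base_url, page_id),
        headers={"Authorization": f"Basic {credentials}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    with open(out_path, "w", encoding="utf-8") as html_file:
        html_file.write(extract_export_view_html(payload))
```

On Confluence Cloud the base URL would typically end in /wiki, for example save_export_view("https://company.atlassian.net/wiki", "123456", user, token, "page.html").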

In order to make wkhtmltopdf access the images using the API login, I generated the basic auth header (base64 encoding of <username>:<token>), used the `--custom-header` option to pass that in, and used the `--custom-header-propagation` option to have that included in the additional requests. In order to keep from including the auth header value in the process info for wkhtmltopdf (which is a possible security hole) I am passing the options in through stdin using wkhtmltopdf's `--read-args-from-stdin` option.
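Here is a rough sketch of how those wkhtmltopdf options could be assembled from Python. The helper names and file paths are hypothetical, and the double-quote style used for the stdin line is an assumption about how wkhtmltopdf parses it, so check it against your wkhtmltopdf version.

```python
import base64
import subprocess


def basic_auth_header(username: str, token: str) -> str:
    """Return the basic auth header value: base64 of <username>:<token>."""
    raw = f"{username}:{token}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_stdin_args(username: str, token: str, html_path: str, pdf_path: str) -> str:
    """Build the single argument line that is fed to wkhtmltopdf through stdin.

    Assumes the paths contain no double quotes.
    """
    header = basic_auth_header(username, token)
    return (
        f'--custom-header Authorization "{header}" '
        f'--custom-header-propagation "{html_path}" "{pdf_path}"\n'
    )


def run_wkhtmltopdf(stdin_line: str) -> None:
    """Run wkhtmltopdf so the auth header never appears in the process arguments."""
    subprocess.run(
        ["wkhtmltopdf", "--read-args-from-stdin"],
        input=stdin_line,
        text=True,
        check=True,
    )
```

Only --read-args-from-stdin shows up in the process list; the header travels over the pipe, which is the point of the approach described above.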

I submitted the suggestion about the constructor argument as an issue to the Python API repo: https://github.com/atlassian-api/atlassian-python-api/issues/997

This is how I solved it. I used curl with the command below, where pageId=123456 is the ID number of the page. I call that line from a Windows command prompt via MATLAB.

curl -v -L -u "user_name:password" -H "X-Atlassian-Token: no-check" "https://confluence/spaces/flyingpdf/pdfpageexport.action?pageId=123456" --output "c:\output_dir\confluence_title.pdf"

Is it a typo that your pageId is "23456" in your description and "123456" in your curl command, or do you need to add a "1" at the beginning?

Yes, that was a typo. I fixed it and thanks for the catch.

Does it work? I think it just renders a page with a link to the download.


Like the function in the atlassian-python-api package, this might have worked in the past but now gets the content of an HTML page instead.

I agree with @Steve Jorgensen: the result is an HTML page that monitors the long-running task, not the PDF itself. Did anybody find a workaround for this?

The end of that curl call specifies a save location of the pdf. Are you sure that the pdf is not being saved there? In the example below that's "c:\output_dir\confluence_title.pdf".

 

curl -v -L -u "user_name:password" -H "X-Atlassian-Token: no-check" "https://confluence/spaces/flyingpdf/pdfpageexport.action?pageId=123456" --output "c:\output_dir\confluence_title.pdf"

I had to slightly adjust the curl command (I am using Confluence Cloud, which may be the reason). The file 'test.pdf' is written, but it contains the HTML that renders the progress bar I also see when I navigate directly to the link in a browser.

The command I use is:

curl -v -L -u "user_name:password" -H "X-Atlassian-Token: no-check" "https://company.atlassian.net/wiki/spaces/flyingpdf/pdfpageexport.action?pageId=123456" --output "test.pdf"
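To tell quickly whether the saved file is a real PDF or the HTML progress page, one option is to check the first bytes: PDF files start with the marker %PDF-. The sketch below is only an illustration, and the function names are made up for this example.

```python
def classify(head: bytes) -> str:
    """Classify the first bytes of a download as 'pdf', 'html', or 'unknown'."""
    stripped = head.lstrip()
    if stripped.startswith(b"%PDF-"):
        return "pdf"
    lowered = stripped[:1024].lower()
    if lowered.startswith(b"<!doctype html") or b"<html" in lowered:
        return "html"
    return "unknown"


def classify_file(path: str) -> str:
    """Read the start of a file on disk and classify it."""
    with open(path, "rb") as downloaded:
        return classify(downloaded.read(1024))
```

For example, classify_file("test.pdf") returning "html" would confirm that curl saved the progress page rather than the export.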
