Export pages/space to HTML - inter-page links are created in two different ways

James Hurrell July 1, 2013

We are using Confluence 4.2.12 and I make regular use of the "HTML Export" option to export a specific space into HTML format for re-publishing to clients. This works well. However, I am increasingly coming across a problem with the way in which links between pages are created in the exported HTML files.

For the majority of the exported pages, when one page references another, the link is created correctly as follows:

<a href="test%2page.html">test page2</a> - this then renders as test+page.html in the browser and will call the local copy of the required page. This is fine.

However, sometimes, the link is created as follows with the fully qualified domain name:

<a href="https://confluence.domain.com/display/SPACE-NAME/test+page2">test page2</a>

Obviously this breaks the links in our published doc system, as the link points to the public address of the Confluence system.

I cannot understand why the links are created differently in the export, because within Confluence itself I am creating the links in exactly the same way - using the "Links" button in the editor and then searching for the page.

Does anyone have any ideas?

2 answers

0 votes
Deleted user February 24, 2022

This problem still persists in the Confluence Cloud version. I am glad it is working for you in the server version, but the links break arbitrarily in the Cloud version. Sometimes they work, other times they don't, and there is no pattern to this behavior.

maker February 24, 2022

Sondra, my solution was to write a Python script to run after the export.

It unzips the archive and hunts for the “bad” links, fixing them as necessary.

It was fairly involved to get it working reliably enough for production workflows (I added other stuff like improved styling of callouts, handling other bugs Atlassian won’t fix, etc.), but it worked well.

If you have a dev resource you may want to hit them up for this. Otherwise, lmk I could spec a project to create a solution for your group.
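For anyone wanting a starting point, here is a minimal sketch of what such a post-export script might look like. It is not the actual script described above. It assumes the link shapes from the original question (absolute links of the form `https://confluence.domain.com/display/SPACE-NAME/test+page2`, and local files named after the last URL segment plus `.html`), and it assumes all exported pages sit in one directory. `BASE_URL`, `SPACE_KEY`, `fix_links`, and `fix_export` are hypothetical names; Cloud exports may use different URL and file-name patterns, so check a few real links first.

```python
import re
import zipfile
from pathlib import Path
from urllib.parse import quote, unquote

# Assumed values taken from the example in the question; adjust for your site.
BASE_URL = "https://confluence.domain.com"
SPACE_KEY = "SPACE-NAME"

# Matches href="<BASE_URL>/display/<SPACE_KEY>/<title>" with no query or anchor.
LINK_RE = re.compile(
    r'href="' + re.escape(BASE_URL) + r"/display/" + re.escape(SPACE_KEY)
    + r'/([^"#?]+)"'
)


def fix_links(html, local_files):
    """Rewrite absolute page links to local ones when the target was exported."""

    def repl(match):
        # Assumption: the exported file is the last URL segment plus ".html".
        filename = unquote(match.group(1)) + ".html"
        if filename in local_files:
            # Percent-encode the file name, e.g. "+" becomes "%2B".
            return 'href="' + quote(filename, safe="") + '"'
        return match.group(0)  # target is not in the export: leave the link alone

    return LINK_RE.sub(repl, html)


def unzip_export(archive, dest):
    """Extract the export archive into dest."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)


def fix_export(pages_dir):
    """Fix every .html file in pages_dir; return the number of files changed."""
    root = Path(pages_dir)
    local_files = {p.name for p in root.glob("*.html")}
    changed = 0
    for page in root.glob("*.html"):
        original = page.read_text(encoding="utf-8")
        fixed = fix_links(original, local_files)
        if fixed != original:
            page.write_text(fixed, encoding="utf-8")
            changed += 1
    return changed
```

Note that simply stripping the leading host and path from the link is not enough under these assumptions: what remains (`test+page2`) has no `.html` extension and is not percent-encoded, so it would not match the local file name.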

Deleted user February 25, 2022

Would this Python script work on the Cloud version? Would eliminating all special characters (/,%&) from headings and links help? I already tried running something that stripped out the leading https:/obscura.atlassian.net/somedirectoryname/someothername/ from the links, and this did not work. Let me know some more details and I could bring it up with management. Thanks

maker March 7, 2022

Hey Sondra, it does work for the Cloud version. I think that the linking problem is unrelated to the special characters in headings issue. If you are still in need, I can send you some details by email. Do you have one you can post here? (I don't see a way to DM.)

Deleted user March 7, 2022

Thank you, yes, my email is sondra.menthers@msg.com. We were exporting from Confluence to HTML and then to Paligo (an XML solution) and then to a website using an HTML5 layout. A source at Paligo removed all the special characters (this is a common problem for browsers) and it worked for that particular Confluence space instance, but not for other spaces. So hopefully your solution will work across all Confluence spaces. Thanks again.

0 votes
James Hurrell July 1, 2013

Actually, I have just, rather embarrassingly, realised why this is: if the referenced page is not exported with the source page, then the link will be created using the fully qualified domain name. Question answered!!
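To see which pages an export is missing, one option is to scan the exported HTML for links that still point at the Confluence host. The sketch below is only an illustration using the Python standard library; `CONFLUENCE_HOST`, `LinkCollector`, and `links_to_confluence` are hypothetical names, and the host is the one from the example in the question.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed host, taken from the example link in the question.
CONFLUENCE_HOST = "confluence.domain.com"


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to_confluence(html, host=CONFLUENCE_HOST):
    """Return the links in html that point at the Confluence host."""
    collector = LinkCollector()
    collector.feed(html)
    return [href for href in collector.hrefs if urlparse(href).netloc == host]
```

Each link it returns names a page that was probably not included in the export, so adding those pages to the export selection should turn the links back into local ones.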

maker January 27, 2020

James, IDK if you are still doing HTML exports in this way, but I have a similar question. Please let me know if you can help.
