Broken links after XML space import

 

Hello support,

I have run a successful space import from my PreProd Confluence instance to my Production instance.

In general everything is fine, but a few internal links are broken.

  • On the original wiki (exported), PageA links to PageB and PageC:
    • The link to PageB is a link by page title
    • The link to PageC is a link of the form pages/viewpage.action?pageId={id}
  • On the target wiki (imported), PageA still links to PageB, but the link to PageC is now broken:
    • The link to PageB still uses the target page title
    • The link to PageC became a create+edit link with the target title


In storage format, the link from PageA to PageC is a link by page title, not by a hardcoded pageId:

<ac:link><ri:page ri:content-title="Product Discontinuation (ABC)" /><ac:plain-text-link-body><![CDATA[product discontinuation]]></ac:plain-text-link-body></ac:link> 

This storage format is the same in both versions of the wiki (exported and imported). The page with that title (PageC) also exists in both wikis, BUT when viewing this page, the URL shows the pageId rather than the page title:

  • Original wiki: pages/viewpage.action?pageId=60065714
  • Imported wiki: pages/viewpage.action?pageId=79495906

It looks like Confluence has all the information it needs to resolve these links (page titles are unique within a space). How can I recover from this? Is there a way to force a reindex or something similar?

Is it possible to force Confluence to regenerate the page link from the title, so that

  • pages/viewpage.action?pageId=79495906

becomes:

  • display/spacekey/Product+Discontinuation+(ABC)

?


I am pretty sure that if I can trigger this kind of "reindexing", the internal links will work again.

UPDATE

In this case, the issue happens with pages that have parentheses in the title (page title = "Product Discontinuation (ABC)", for instance). I just saw this related KB article:

https://confluence.atlassian.com/confkb/confluence-page-urls-contain-pageid-instead-of-the-page-title-278692715.html


But I don't understand one detail: here '(' and ')' are forbidden characters, but if I append a random suffix to my page title, my link changes from:
/pages/viewpage.action?pageId=79495906
to
/display/spacekey/Product+Discontinuation+%28ABC%29+title

So... it looks like these characters can be handled in the page URL?

We are currently using Confluence 5.6.3.
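The encoded form in that second URL matches ordinary form-style percent-encoding. The sketch below uses Python's standard library only to illustrate that; it is not Confluence's own code, and the rules Confluence applies when deciding to fall back to a pageId URL may be different.

```python
from urllib.parse import quote_plus

# Page title after appending the suffix "title", as described above.
title = "Product Discontinuation (ABC) title"

# quote_plus turns spaces into "+" and percent-encodes "(" and ")".
encoded = quote_plus(title)
print(encoded)  # Product+Discontinuation+%28ABC%29+title
```

So the parentheses themselves can be represented in a URL; whether Confluence chooses to build a title-based URL for a given title is a separate question.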


Thanks and best regards

Colin

 

 

3 answers

1 accepted


Hello,

I had a deeper look at the DB and found the root cause.

All my dead links actually involved more than the parentheses in my example: the original titles also contained a typographic quotation mark.

This is the state of the DB in both wikis (exported from, and imported into):

  • exported from: (OK)
  • 60065326 is the page id of "PageA"
  • 60065714 is the page id of "Product Discontinuation (PLM ABC’s)"
  • For 60065326 : LINKS.DESTPAGETITLE = Product Discontinuation (PLM ABC’s)
  • For 60065326 : BODYCONTENT.BODY = <ri:page ri:content-title="Product Discontinuation (PLM ABC&rsquo;s)" />
  • For 60065714 : CONTENT.TITLE = Product Discontinuation (PLM ABC’s)

 

  • imported to: (Fail)
  • 79495974 is the page id of "PageA"
  • 79495906 is the page id of "Product Discontinuation (PLM ABC's)"
  • 79495974 : LINKS.DESTPAGETITLE = Product Discontinuation (PLM ABC’s)
  • 79495974 : BODYCONTENT.BODY = <ri:page ri:content-title="Product Discontinuation (PLM ABC&rsquo;s)" />
  • 79495906 : CONTENT.TITLE = Product Discontinuation (PLM ABC's)

 

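In the imported wiki's DB, the ’ (right single quotation mark) in CONTENT.TITLE became a ' (plain apostrophe), while LINKS.DESTPAGETITLE and the page body still reference the title with ’.

As a result, all my links based on that page title were dead.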
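The mismatch is easy to miss by eye because the two characters look almost identical. A small sketch in Python (standard library only, using the titles quoted above) makes the difference visible by decoding the entity from the link and printing the code points that differ:

```python
import html
import unicodedata

# Title as referenced by the link in BODYCONTENT.BODY (entity-encoded).
link_target = html.unescape("Product Discontinuation (PLM ABC&rsquo;s)")
# Title as stored in CONTENT.TITLE after the import.
imported_title = "Product Discontinuation (PLM ABC's)"

print(link_target == imported_title)  # False

# Report the first position where the two strings differ.
for a, b in zip(link_target, imported_title):
    if a != b:
        print(f"U+{ord(a):04X} {unicodedata.name(a)}")  # U+2019 RIGHT SINGLE QUOTATION MARK
        print(f"U+{ord(b):04X} {unicodedata.name(b)}")  # U+0027 APOSTROPHE
        break
```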

 

To fix that, I had to run a SQL query to repair all the page titles containing ' in order to get the ’ back:

UPDATE [db].[CONTENT]
SET TITLE = REPLACE(TITLE, '(PLM ABC''s)', '(PLM ABC’s)')
WHERE SPACEID = {space_id} AND TITLE like '%(PLM ABC''s)%'

 

After that query, all the links were valid.
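Before running an update like this against a real Confluence database, it can help to rehearse it on throwaway data. The sketch below does that with Python's built-in sqlite3 module and an in-memory table. The table is a toy stand-in: only the TITLE and SPACEID columns come from the query above, while the CONTENTID column name and the space ids 1 and 2 are assumptions made for the example.

```python
import sqlite3

# Toy stand-in for the CONTENT table; not the real Confluence schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CONTENT (CONTENTID INTEGER, SPACEID INTEGER, TITLE TEXT)")
conn.executemany(
    "INSERT INTO CONTENT VALUES (?, ?, ?)",
    [
        (79495974, 1, "PageA"),
        (79495906, 1, "Product Discontinuation (PLM ABC's)"),
        (1, 2, "Page in another space (PLM ABC's)"),  # must stay untouched
    ],
)

bad, good = "(PLM ABC's)", "(PLM ABC\u2019s)"

# Same shape as the UPDATE above, with bound parameters instead of inline quoting.
cursor = conn.execute(
    "UPDATE CONTENT SET TITLE = REPLACE(TITLE, ?, ?) "
    "WHERE SPACEID = ? AND TITLE LIKE ?",
    (bad, good, 1, f"%{bad}%"),
)
changed = cursor.rowcount
titles = dict(conn.execute("SELECT CONTENTID, TITLE FROM CONTENT").fetchall())
print(changed, titles[79495906])
```

Bound parameters avoid the doubled '' escaping, and the row in the other space shows that the SPACEID filter limits the change to one space.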

I think there is a bug somewhere in the export/import process: the right single quotation mark (’) became a plain single quote (').

 

Atlassian has created a ticket for the corresponding issue:
https://jira.atlassian.com/browse/CONF-41354


I can see one "painful" solution here:

Write a script (SQL / Java ... ?) that:

  • scans all page titles and replaces / removes reserved characters
  • scans all page content and replaces / removes reserved characters inside the <ac:link><ri:page ri:content-title attribute value

Isn't there a better and faster solution?
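For reference, the string-rewriting part of such a script could look roughly like the Python sketch below. It only shows the transformation on a title and on a storage-format body. The REPLACEMENTS mapping and the function names are hypothetical, and how the rewritten values would be written back into Confluence (SQL, API, or otherwise) is deliberately left out.

```python
import html
import re

# Hypothetical mapping; which characters count as "reserved" is an assumption.
REPLACEMENTS = {"\u2019": "'", "\u2018": "'"}

# Matches the value of a ri:content-title attribute in storage format.
TITLE_ATTR = re.compile(r'(ri:content-title=")([^"]*)(")')


def normalize_title(title: str) -> str:
    """Replace each mapped character in a page title."""
    for old, new in REPLACEMENTS.items():
        title = title.replace(old, new)
    return title


def normalize_links(body: str) -> str:
    """Apply normalize_title to every ri:content-title value in a page body."""
    def fix(match: re.Match) -> str:
        decoded = html.unescape(match.group(2))  # &rsquo; -> U+2019
        encoded = html.escape(normalize_title(decoded), quote=True)
        return match.group(1) + encoded + match.group(3)
    return TITLE_ATTR.sub(fix, body)


body = '<ac:link><ri:page ri:content-title="Product Discontinuation (PLM ABC&rsquo;s)" /></ac:link>'
fixed = normalize_links(body)
print(fixed)
```

Titles and link attributes must go through the same normalization, otherwise the two sides end up mismatched again.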


Hi Colin,

You can contact Atlassian Support here:  https://support.atlassian.com/customer/servicedesk-portal

Regards,

Kay
