
How do I find problem pages with invalid XML characters?

Betsy Ostrander
Rising Star
May 21, 2012

We used the Confluence CLI to create a PILE of pages and attachments from our old SharePoint site. Some of these must be giving JIRA heartburn.

We have Application Links working between Confluence and JIRA, but when you search for Confluence pages from JIRA and pick the Development or Production space (where the PILES got added), the search craps out with:

The JIRA server was contacted but has returned an error response. We are unsure of the result of this operation.

It works fine in all of the other spaces. In the log, I see the error (Failed to parse Confluence Remote API response), followed about 500 lines later by:

Caused by: org.xml.sax.SAXParseException: Character reference "&#7" is an invalid XML character.

I would like to be able to clean up the pages and/or attachments that are causing this. Is there an EASY way to find the pages/attachments? And also an EASY way to find the offending content within them?

Maybe a SQL query from someone who knows the db structure well??


3 votes
Answer accepted
HuseinA
Rising Star
September 12, 2012

JIRA used to use atlassian-xml-cleaner-0.1.jar to clean invalid characters from an XML backup, as described here. I believe the same can be done in Confluence as well.

You could either export your whole Confluence to XML or export certain pages or a Space to XML:

  1. Extract the entities.xml from the backup file (ZIP)
  2. Clean the invalid characters:
    java -jar atlassian-xml-cleaner-0.1.jar entities.xml > entities-clean.xml
  3. Rename entities-clean.xml to entities.xml and replace the original entities.xml in the backup ZIP file with the clean one (the file must be named entities.xml, otherwise the import won't work)
  4. Restore it
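
The cleaner strips the bad characters but does not tell you where they were. If you also want to locate them, something like the following might help. It is only a minimal sketch and not part of the Atlassian tool: it assumes the extracted entities.xml is UTF-8 text, and the names find_invalid and is_valid_xml_char are made up for illustration. It reports the line and column of every raw character or numeric character reference (such as the one in the error above) that falls outside the ranges XML 1.0 allows.

```python
import re
import sys

# Numeric character references such as &#7; or &#x1F;
CHAR_REF = re.compile(r"&#(x[0-9A-Fa-f]+|[0-9]+);")


def is_valid_xml_char(cp):
    """True if the code point is allowed by the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_invalid(text):
    """Yield (line, column, description) for each invalid character or reference."""
    # Split on "\n" only: str.splitlines() would also split on some of the
    # control characters we are trying to find.
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if not is_valid_xml_char(ord(ch)):
                yield lineno, col, f"raw character U+{ord(ch):04X}"
        for m in CHAR_REF.finditer(line):
            ref = m.group(1)
            cp = int(ref[1:], 16) if ref[0] == "x" else int(ref)
            if not is_valid_xml_char(cp):
                yield lineno, m.start() + 1, f"character reference {m.group(0)}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumption: the file is UTF-8; undecodable bytes are replaced, not reported.
    with open(sys.argv[1], encoding="utf-8", errors="replace", newline="") as fh:
        for lineno, col, what in find_invalid(fh.read()):
            print(f"line {lineno}, column {col}: {what}")
```

Saved as, say, find_invalid.py, it could be run as `python find_invalid.py entities.xml`. The XML surrounding each reported line should then point to the page or attachment the text belongs to.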