UWC is unable to convert the HTML tags to correct form in Confluence

Satyam Roy November 16, 2013

Hi,

I am using the UWC to convert our MediaWiki data to Confluence. I am able to create the text files for all the data successfully, but I found that some pages have HTML tags in the text files.

e.g. <font face="Arial" color="black" size="6" font-weight="bold">Title of Page</font>

After checking thoroughly, I found that this is how the data is stored in the MediaWiki database, so it gets carried over into the generated text files.

When this text file is used to create pages in Confluence, the tag is not converted to h1 or h2, so the page created in Confluence shows the text with the HTML tags still in it.

So my question is: does the UWC support this type of syntax, or do I need to write my own parser to handle such occurrences of HTML tags?

Secondly, I have a problem with bullet points as well. Say there are 4 bullet points under a topic: the list style is kept for the first 3, but the last one comes through as a literal "*". This happens only sometimes, and only on some pages.

e.g.

In the text file

* Topic 1

* Topic 2

* Topic 3

In Confluence

  • Topic 1
  • Topic 2

* Topic 3

Thirdly, the attachments are not being carried over to Confluence. I tried to find where MediaWiki stores the attachments but could not locate it.

Any help would be appreciated.

2 answers

1 accepted

0 votes
Answer accepted
Satyam Roy January 2, 2014

I had to write many regular expressions and put them in the mediawiki.converter.properties file to get it to work. In some cases I changed the existing regular expressions to suit my requirements. This way it worked.
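
For anyone hitting the same problem, here is a minimal sketch of the kind of regular expression involved, written in Python only so it can be tried outside the UWC. It is not the actual rule set used here and not UWC property syntax. It assumes, purely for illustration, that a <font> tag with size="6" should become a Confluence h1 heading and that every other <font> tag can be dropped while keeping its text. The function name convert_font_tags is hypothetical.

```python
import re

# Assumption for illustration: a <font> tag with size="6" marks a page title
# and should become a Confluence wiki-markup h1 heading.
FONT_HEADING = re.compile(
    r'<font\b[^>]*\bsize="6"[^>]*>(.*?)</font>',
    re.IGNORECASE | re.DOTALL,
)

# Any other opening or closing <font> tag is removed; the inner text is kept.
FONT_TAG = re.compile(r'</?font\b[^>]*>', re.IGNORECASE)


def convert_font_tags(text: str) -> str:
    """Rewrite size-6 <font> spans as h1 headings, then strip leftover font tags."""
    text = FONT_HEADING.sub(lambda m: "h1. " + m.group(1).strip(), text)
    return FONT_TAG.sub("", text)


if __name__ == "__main__":
    sample = '<font face="Arial" color="black" size="6" font-weight="bold">Title of Page</font>'
    print(convert_font_tags(sample))
```

The same patterns would still need to be adapted to the regular expression flavour and property format that the converter file expects.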

0 votes
sandeep sankalapur December 17, 2013

Hi Satyam, MediaWiki has nothing called attachments as such. You will find an 'Images' directory; that is where the files you are calling attachments are stored.
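
To help locate a specific file inside that directory, here is a rough Python sketch. It assumes a default MediaWiki setup, where the upload directory is named images (lowercase), hashed upload directories are enabled, and the first letter of a file name is capitalised. Under those assumptions a file sits two levels down, in folders named after the first one and first two hex characters of the MD5 hash of its name. The function name mediawiki_upload_path is hypothetical, and a customised installation may lay files out differently.

```python
import hashlib
from pathlib import PurePosixPath


def mediawiki_upload_path(filename: str, root: str = "images") -> PurePosixPath:
    """Guess where a default MediaWiki install stores an uploaded file.

    Assumes hashed upload directories and first-letter capitalisation,
    which are the defaults but can be changed per installation.
    """
    # MediaWiki stores titles with underscores instead of spaces.
    name = filename.replace(" ", "_")
    # By default the first letter of a title is upper-cased.
    name = name[:1].upper() + name[1:]
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # Layout: <root>/<first hex char>/<first two hex chars>/<file name>
    return PurePosixPath(root) / digest[0] / digest[:2] / name


if __name__ == "__main__":
    print(mediawiki_upload_path("my diagram.png"))
```

If the printed path does not exist on the server, the wiki is probably configured differently, and searching the images directory by file name is the safer fallback.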
