How can I handle special characters when importing Word documents?

July 11, 2017

We have a set of old Microsoft Word documents (primarily .doc) that we want to import into Confluence (5.9.12). Most content imports fine, but Word's "special characters" that were inserted as symbols do not. For example, a μ (mu) symbol from these documents shows up as an empty box in the Confluence web interface. I can import a test .doc file containing both a proper Unicode μ and the non-functional symbol from the Word documents: the Unicode character works where Word's "symbol" doesn't. So the problem seems to lie in however Microsoft Word stores these special characters. Does anyone know of a way for Confluence to handle this or, failing that, a way to convert these odd characters into their Unicode equivalents before importing?
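To illustrate the "convert before importing" idea, here is a minimal Python sketch. It assumes that Word stores Symbol-font characters in the Unicode Private Use Area as U+F000 plus the Symbol font code, which I have not confirmed for every document. The table `SYMBOL_TO_UNICODE` and the function `replace_symbol_pua` are hypothetical names, and the table is deliberately partial.

```python
# Partial, illustrative table (an assumption, not a complete mapping):
# Symbol-font code (low byte of the private-use code point) -> Unicode.
SYMBOL_TO_UNICODE = {
    0x61: "\u03b1",  # alpha
    0x62: "\u03b2",  # beta
    0x6D: "\u03bc",  # mu
    0xB0: "\u00b0",  # degree sign
    0xB1: "\u00b1",  # plus-minus sign
}


def replace_symbol_pua(text):
    """Replace known U+F0xx Symbol-font characters; leave the rest alone."""
    out = []
    for ch in text:
        cp = ord(ch)
        low = cp & 0xFF
        if 0xF000 <= cp <= 0xF0FF and low in SYMBOL_TO_UNICODE:
            out.append(SYMBOL_TO_UNICODE[low])
        else:
            # Unmapped characters pass through so they can be reviewed later.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(replace_symbol_pua("10 \uf06dm at 25 \uf0b0C"))
```

Characters that are not in the table are left untouched, so anything unexpected remains visible as a box and is not silently dropped.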

A bit more gory detail on my troubleshooting:

If I copy and paste the box character into a text file and check that one-character file byte for byte, I see 0xef81ad (the UTF-8 encoding of U+F06D, a Private Use Area code point), which matches what I get if I copy the character directly from the Word document. I can also do a manual search-and-replace in the Confluence web interface for that specific character (literally pasting in the box symbol) and put μ in its place. The replacement leaves alone the other special characters that look identical but are different (such as the one for a degree symbol). So Confluence does seem to have all the information after importing. The display is garbled only because it doesn't know that 0xef81ad should be shown as a mu character. I'm experimenting with the XML-RPC API to see whether I can do a batch search-and-replace, but I still need to figure out all the possible characters we might run into and make sure I can actually get at that odd text via the API.
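For the "figure out all the possible characters" part, a small Python sketch along these lines could decode the bytes and tally every private-use character in a piece of page text. The function name `find_private_use` is hypothetical, and fetching the page text from Confluence is left out.

```python
from collections import Counter


def find_private_use(text):
    """Count characters in the BMP Private Use Area (U+E000..U+F8FF)."""
    return Counter(ch for ch in text if 0xE000 <= ord(ch) <= 0xF8FF)


if __name__ == "__main__":
    # The three bytes seen in the one-character text file.
    ch = bytes.fromhex("ef81ad").decode("utf-8")
    print(f"U+{ord(ch):04X}")

    # Made-up sample text standing in for imported page content.
    sample = "5 \uf06dg, 37 \uf0b0C, 2 \uf06dL"
    for char, count in find_private_use(sample).items():
        print(f"U+{ord(char):04X} occurs {count} time(s)")
```

Running this over the imported pages should give a list of distinct code points to map by hand, so the identical-looking boxes do not have to be told apart by eye.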

Thanks in advance for any ideas,

Jesse
