Unicode in online editor

Michael Maxwell January 6, 2018

I recently imported a local Git project into Bitbucket.  Everything went well, but when I fixed a minor typo using the online source editor, it re-encoded a non-ASCII character that had been stored as UTF-8.  Specifically, it changed the copyright symbol (U+00A9) from its two-byte UTF-8 encoding into the single byte A9, which is not a valid UTF-8 sequence on its own.

I verified that this is where the character got changed by doing a diff between the two versions, the relevant lines being:

-# Copyright Â© 2018 Center for Advanced Study of Language University of Maryland
+# Copyright © 2018 Center for Advanced Study of Language University of Maryland
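Besides the diff, one could confirm the damage by scanning the raw bytes of each version for sequences that fail to decode as UTF-8. The following is only a minimal sketch: the helper name find_invalid_utf8 and the shortened sample lines are made up for illustration, and the "after" sample assumes the editor wrote the symbol as a lone byte A9.

```python
def find_invalid_utf8(data: bytes) -> list:
    """Return (offset, byte value) for each position where UTF-8 decoding fails."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # If the rest of the data decodes cleanly, there is nothing more to report.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            bad = pos + err.start          # absolute offset of the offending byte
            problems.append((bad, data[bad]))
            pos = bad + 1                  # resume scanning just past it
    return problems


# Hypothetical, shortened stand-ins for the two versions of the line.
before = "# Copyright \u00a9 2018".encode("utf-8")    # symbol stored as C2 A9
after = "# Copyright \u00a9 2018".encode("latin-1")   # symbol stored as a lone A9 (assumption)

for label, data in (("before", before), ("after", after)):
    hits = find_invalid_utf8(data)
    print(label, [(offset, hex(value)) for offset, value in hits])
```

On a real checkout, the same function could be applied to the result of open(path, "rb").read() for each version of the file.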

The version before my edit contains the byte sequence C2 A9: the online diff tool shows C2 as an A with a circumflex (Â) and A9 as a copyright symbol.  The version after the edit shows just A9, i.e. a single 8-bit character.  C2 A9 happens to be exactly the byte sequence UTF-8 uses to encode U+00A9.  So it appears that when I edited the file using the online editor, it changed the valid UTF-8 sequence C2 A9 into a lone A9, which, as I said, is invalid UTF-8.
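For anyone who wants to check the byte-level claims, here is a small Python sketch. It shows the UTF-8 encoding of U+00A9, why a tool that reads those bytes as single-byte characters displays an A-circumflex followed by a copyright symbol, and that a lone A9 does not decode as UTF-8. The use of Latin-1 (ISO-8859-1) as the single-byte encoding is an assumption; the post does not establish which encoding the editor actually wrote.

```python
copyright_sign = "\u00a9"  # U+00A9 COPYRIGHT SIGN

utf8_bytes = copyright_sign.encode("utf-8")      # the two bytes C2 A9
latin1_bytes = copyright_sign.encode("latin-1")  # the single byte A9 (assumed editor output)

print(utf8_bytes.hex(), latin1_bytes.hex())

# Interpreting the valid UTF-8 bytes one byte at a time (as Latin-1) yields
# U+00C2 (A with circumflex) followed by U+00A9, matching the diff display.
as_single_bytes = utf8_bytes.decode("latin-1")

# A lone A9 is a continuation byte with no lead byte, so UTF-8 decoding fails.
try:
    latin1_bytes.decode("utf-8")
    lone_a9_is_valid_utf8 = True
except UnicodeDecodeError:
    lone_a9_is_valid_utf8 = False

print(lone_a9_is_valid_utf8)
```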

I'm using Firefox, and it reports the page containing the editor as UTF-8, so I don't think the browser is at fault.

Is the Bitbucket online editor unsafe with Unicode?  If so, that's fine; I normally edit only in my own editor.  I just saw a typo and decided to fix it on the spot, but I could have pushed the fix from my own computer instead.

0 answers
