I recently imported a local Git project into Bitbucket. Everything went well, but when I fixed a minor typo using the online source editor, it re-encoded a non-ASCII UTF-8 character. Specifically, it changed the copyright symbol (U+00A9) from its two-byte UTF-8 encoding into the single byte A9, which on its own is not a valid UTF-8 sequence.
I verified that this edit is where the character changed by diffing the two versions; the relevant lines are:
-# Copyright Â© 2018 Center for Advanced Study of Language University of Maryland
+# Copyright © 2018 Center for Advanced Study of Language University of Maryland
The version before my edit contains the byte sequence C2 A9: the online diff tool renders C2 as Â (A with a circumflex) and A9 as the copyright symbol. The version after the edit contains just A9, a single 8-bit character. C2 A9 is exactly the byte sequence UTF-8 uses to encode U+00A9. So it appears that the online editor replaced the valid UTF-8 sequence C2 A9 with a lone A9, which, as I said, is invalid UTF-8.
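For anyone who wants to check the byte-level reasoning locally, here is a minimal Python sketch. It assumes the diff tool decodes each byte as Latin-1 (which would explain the Â©), and the helper names `is_valid_utf8` and `invalid_utf8_lines` are my own, not part of any Bitbucket or Git tooling.

```python
# U+00A9 (copyright sign) encodes in UTF-8 as the two bytes C2 A9.
COPYRIGHT = "\u00a9"
utf8_bytes = COPYRIGHT.encode("utf-8")
assert utf8_bytes == b"\xc2\xa9"

# Assumption: a viewer that decodes those two bytes as Latin-1
# would show A-circumflex followed by the copyright sign.
assert utf8_bytes.decode("latin-1") == "\u00c2\u00a9"

# The lone byte A9 is how Latin-1 encodes the same character.
assert COPYRIGHT.encode("latin-1") == b"\xa9"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def invalid_utf8_lines(data: bytes) -> list:
    """Return the 1-based numbers of lines that are not valid UTF-8.

    Splitting on the newline byte is safe because 0x0A never occurs
    inside a multi-byte UTF-8 sequence.
    """
    return [
        number
        for number, line in enumerate(data.split(b"\n"), start=1)
        if not is_valid_utf8(line)
    ]


if __name__ == "__main__":
    before = b"# Copyright \xc2\xa9 2018\n"  # bytes before the online edit
    after = b"# Copyright \xa9 2018\n"       # bytes after the online edit
    print(invalid_utf8_lines(before))  # no invalid lines expected
    print(invalid_utf8_lines(after))   # line 1 expected to be flagged
```

To scan a real file, the bytes from `open(path, "rb").read()` can be passed to `invalid_utf8_lines`.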
I'm using Firefox, and it reports the encoding of the editor page as UTF-8, so I don't think the browser is at fault.
Is the Bitbucket online editor unsafe with Unicode? If so, that's fine; I normally only edit files in my own editor. I just saw a typo and decided to fix it there, but I could have pushed the fix from my own computer instead.