Garbled text display in "Log View"

In Sourcetree's Log View, the datagrid with the columns "Graph", "Description", "Commit", "Author" and "Date" shows commit messages and author names that contain accented Unicode characters as garbled text with strange characters.

But on the panel below, which displays the selected commit's full message, text appears correct.

Other applications display the text correctly; only Sourcetree has this problem.

On Mac OS X 10.8.2

2 answers

1 accepted

SourceTree supports UTF everywhere, and accented characters (and Japanese in fact) are very common. Is it possible the encoding used here is something else, perhaps one of the Latin ASCII subsets from another platform?

Hello Steve, thank you for your answer.

The commit messages are written in Portuguese, and encoded as UTF-8.

The messages were written in SourceTree itself or in SmartGit; it makes no difference. If I write accented characters in a commit message in SourceTree, for instance, and then commit, they appear garbled in the Log View.

All other applications (including command-line git) correctly show the messages.

Even Sourcetree shows the text correctly in several places, except on the Log View's main datagrid, as explained above.

I include below, in this message, a screenshot that may help you understand what I'm talking about.

I created a test message having several accented characters (in this case: ÁèíçããâêÇ) to make a more obvious example.

As you can see, the selected commit message appears correct on the commit details pane, but appears garbled on the log list.
All other messages also display the same problem.

The Author names also appear garbled.

I have highlighted some areas of the application's interface where the problems are clearly visible.

I hope this helps to diagnose the problem.

I just copied & pasted those characters in your comment above into a commit in SourceTree and it worked fine for me:

Are you able to give us a copy of the repo to investigate? You can do it privately at https://support.atlassian.com if you like.

Great! That means there is hope!

Have you any idea of what could be done to investigate this issue further?

The problem is, the way things are, I will have to give up using SourceTree, for it's too unpleasant to see a Log View of garbled messages.

Are you sure this isn't a bug in SourceTree that only happens on certain repo configurations? SmartGit and the command line Git show the log fine.

What can I do?

Thanks

Unfortunately, the repo contains a private company project which I cannot divulge. But thank you for your offer!

The problem also occurs on other repos, so it's not just a weird accident with a specific repo.

I decided to investigate the issue further, and after many tests, I discovered the source of the error:

The problem starts on a commit whose author name has accented characters that are encoded in a specific way.

When displaying a log where such a commit exists (even if just one), SourceTree displays all commit messages and author names with an incorrect encoding (it probably starts using the encoding of the offending commit).

I exported 2 patches to demonstrate the problem:

In the first patch excerpt, you can see the author name (Cláudio Silva) encoded as UTF-8 (this is just one line from the patch file):

From: =?UTF-8?q?Cla=CC=81udio=20Silva?= <claudio.silva@impactwave.com>

The accented A is encoded as 3 bytes (a xCC x81): a plain "a" followed by the combining acute accent U+0301, which is xCC x81 in UTF-8. This commit causes no problems in SourceTree.

Now, here's the author name from an offending (error inducing) commit:

From: =?UTF-8?q?Cl=E1udio=20Silva?= <claudio.silva@impactwave.com>

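The accented A is encoded as just 1 byte (xE1). That value matches the character's Unicode code point (U+00E1) and its ISO-8859-1 encoding, but it is NOT valid UTF-8, where the same character would be the two bytes xC3 xA1 (see this: Unicode Character 'LATIN SMALL LETTER A WITH ACUTE' (U+00E1)).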
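To make the difference between the two author headers concrete, here is a minimal Python sketch that decodes both quoted-printable payloads and checks them. The names (GOOD, BAD, payload_bytes, is_valid_utf8) are hypothetical, and reading the stray byte as ISO-8859-1 is an assumption about how the offending commit was produced.

```python
import quopri
import unicodedata

# Payloads of the two encoded-words quoted above
# (the text between "?q?" and "?=" in each From: line).
GOOD = b"Cla=CC=81udio=20Silva"
BAD = b"Cl=E1udio=20Silva"


def payload_bytes(q_payload: bytes) -> bytes:
    """Turn a quoted-printable header payload into the raw bytes it carries."""
    return quopri.decodestring(q_payload, header=True)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


good = payload_bytes(GOOD)
bad = payload_bytes(BAD)

print(good, is_valid_utf8(good))  # "a" + combining accent: valid UTF-8
print(bad, is_valid_utf8(bad))    # lone 0xE1 byte: not valid UTF-8

# Assumption: the bad name was written as ISO-8859-1 (Latin-1).
# Under that assumption both headers spell the same name once the
# good one is normalized to its precomposed (NFC) form.
same_name = unicodedata.normalize("NFC", good.decode("utf-8")) == bad.decode("latin-1")
print(same_name)
```

So both headers carry the same visible name, but only the first is byte-for-byte valid UTF-8, even though both are labeled UTF-8.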

All it takes is just one commit with an author name encoded like this to make SourceTree go mad!...

Nevertheless, command line Git and SmartGit display the logs just fine, and are unaffected by this. At most, the incorrectly encoded characters may appear garbled, but all other text appears fine.

The problematic commit was probably created on another application (perhaps on Windows, with msysgit or with SmartGit, I don't know).

So, in conclusion, I believe making SourceTree able to handle incorrectly encoded strings without going nuts would be a nice enhancement to the software.

May I suggest bringing up this issue to the development team?

Best regards.

Would you mind attaching the patch that reproduces this either here (as an attachment rather than inline, the encoding seems to have been lost) or against https://jira.atlassian.com/browse/SRCTREE-1285 ? That will make it easier to make sure we test this case directly.

OK, that makes sense, thanks for the detailed analysis. I've seen this problem once before in fact, and the issue is the way that Cocoa deals with character encoding - basically if one character in the stream fails UTF decoding, it refuses to decode the entire stream as UTF, meaning it falls back on a simpler encoding (which then breaks the other extended UTF characters). It doesn't appear to be possible to tell it to skip the offending characters. SourceTree loads the log in bulk for performance reasons which is why this problem can leak across up to 200 lines when it occurs.
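The behavior described above can be imitated in a few lines of Python. This is only an illustration of an all-or-nothing decode, not SourceTree's or Cocoa's actual code: the function name is hypothetical, and Latin-1 stands in for the unnamed "simpler encoding".

```python
def decode_batch_all_or_nothing(raw: bytes) -> str:
    """Decode a whole batch as strict UTF-8, or fall back for the whole batch."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: Latin-1 plays the role of the simpler fallback encoding.
        return raw.decode("latin-1")


good_text = "Teste: ÁèíçããâêÇ"
clean_line = good_text.encode("utf-8")   # a correctly encoded log line
bad_line = b"Cl\xe1udio Silva"           # one line with a stray non-UTF-8 byte

# Decoded on its own, the clean line round-trips.
alone = decode_batch_all_or_nothing(clean_line)

# Decoded in the same batch as the bad line, the clean line is garbled too,
# because the fallback is applied to every byte of the batch.
mixed = decode_batch_all_or_nothing(clean_line + b"\n" + bad_line).split("\n")

print(alone == good_text)
print(mixed[0] == good_text)
print(mixed[0])
```

The single bad byte in the second line is enough to change how the first line is decoded, which matches the symptom of one commit garbling the rest of the batch.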

I'm guessing that SmartGit works because Java is more tolerant of bad encoding. Command-line git is fine because it does one line at a time.
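As a rough sketch of those two guesses, the Python below shows a tolerant decode that replaces invalid bytes and a line-by-line decode where only the failing line falls back. Neither is taken from SmartGit or git; the function names are hypothetical and the Latin-1 fallback is an assumption.

```python
from typing import List


def decode_tolerant(raw: bytes) -> List[str]:
    """Decode the whole batch as UTF-8, replacing invalid bytes with U+FFFD."""
    return raw.decode("utf-8", errors="replace").split("\n")


def decode_line_by_line(raw: bytes) -> List[str]:
    """Decode each line separately; only a failing line uses the fallback."""
    result = []
    for line in raw.split(b"\n"):
        try:
            result.append(line.decode("utf-8"))
        except UnicodeDecodeError:
            # Assumption: Latin-1 is used as the fallback for a bad line.
            result.append(line.decode("latin-1"))
    return result


good_text = "Teste: ÁèíçããâêÇ"
batch = good_text.encode("utf-8") + b"\n" + b"Cl\xe1udio Silva"

tolerant = decode_tolerant(batch)
per_line = decode_line_by_line(batch)

print(tolerant)
print(per_line)
```

In both variants the correctly encoded line stays intact, and any damage is confined to the line that contains the invalid byte.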

I've tried to find a workaround for this in the past and not managed it (without horribly killing performance), but I'll try again. The one case this happened in before became a non-issue because it faded into history really fast, but obviously this is more of a problem for you - the problem will go away eventually once that commit drops out of the first 200 lines in the log (after that it won't make the decoding fail for the entire first batch). We'll track it here: https://jira.atlassian.com/browse/SRCTREE-1285

Done!

Thank you very much for looking into this issue.

Best regards.

Hi, I think I have the same exact problem. Any news related to this matter?
