
SourceTree - Russian charset support

haiflive July 11, 2013

Hello, I found a bug in SourceTree.

The program does not display Russian characters (the charset looks like cp1251).

(Screenshot: charset bug)

My system configuration is:

Windows 7 x64, Service Pack 1, language is Russian.

Thanks.

4 answers

0 votes
Vitaliy Timoshenko August 13, 2016

I have the same problem in SourceTree. As you can see, part of the output is in Unicode and in Russian, but filenames are shown as unknown symbols. Apparently they are handled in an ASCII codepage. As a result I cannot add the file to a commit. It's a bug either way. Apparently the problem is not limited to Cyrillic; it should affect any non-Latin characters.

tools->options->utf-8

hg add -y .hgignore src\img\06_������������_m.jpg
skipping unreadable pattern file 'C:\Users\Виталий\Documents\hgignore_global.txt': No such file or directory
src\img\06_????????????_m.jpg: �������������� ������ � ����� �����,
.hgignore already tracked!
Выполнено с ошибками, см. выше.

Errors in the filename and in part of the message generated by SourceTree (after "m.jpg:"). Somebody said it's all UTF-8; evidently not.

tools->options->windows-1251

hg add -y .hgignore src\img\06_������������_m.jpg
skipping unreadable pattern file 'C:\Users\Виталий\Documents\hgignore_global.txt': No such file or directory
src\img\06_????????????_m.jpg: Синтаксическая ошибка в имени файла,
.hgignore already tracked!
Выполнено с ошибками, см. выше.

errors in filename and username!?

tools->options->koi8-r

hg add -y .hgignore src\img\06_������������_m.jpg
skipping unreadable pattern file 'C:\Users\п▓п╦я┌п╟п╩п╦п╧\Documents\hgignore_global.txt': No such file or directory
src\img\06_????????????_m.jpg: яХМРЮЙЯХВЕЯЙЮЪ НЬХАЙЮ Б ХЛЕМХ ТЮИКЮ,
.hgignore already tracked!
Выполнено с ошибками, см. выше.

Filename, username, and messages each look like they are in a different codepage, and all of them are wrong.
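
The garbled strings in the koi8-r run are consistent with bytes being decoded under the wrong codepage. Here is a minimal Python sketch, assuming (a guess from the visible patterns, not something confirmed by SourceTree or hg) that the hg error message arrives as cp1251 bytes and the user path as UTF-8 bytes. The helper name `misdecode` is made up for illustration.

```python
# Sketch only: reproduces the garbled strings from the runs above.
# Assumption: the hg message is cp1251 bytes and the path is UTF-8 bytes.

def misdecode(text: str, actual: str, assumed: str) -> str:
    """Encode text as `actual`, then decode those bytes as `assumed`."""
    return text.encode(actual).decode(assumed, errors="replace")

message = "Синтаксическая ошибка в имени файла"
user = "Виталий"

# cp1251 bytes read as KOI8-R: compare with the message in the koi8-r run
as_koi8 = misdecode(message, "cp1251", "koi8-r")
# UTF-8 bytes read as KOI8-R: compare with the user name in the koi8-r run
user_as_koi8 = misdecode(user, "utf-8", "koi8-r")
# cp1251 bytes read as UTF-8: invalid sequences become U+FFFD
as_utf8 = misdecode(message, "cp1251", "utf-8")

print(as_koi8)
print(user_as_koi8)
print(as_utf8)
```

If that assumption holds, no single value of the codepage option can fix every field at once, because the fields do not share one encoding.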

But I found one strange thing: that file has the Russian letter "Ё" [yo] in its name ("CYRILLIC CAPITAL LETTER IO", codes U+0401/U+0451 in Unicode). When I remove this letter and commit:

hg add -y src\img\06_������������_m.jpg
skipping unreadable pattern file 'C:\Users\Виталий\Documents\hgignore_global.txt': No such file or directory
hg commit -y --logfile C:\Users\Виталий\AppData\Local\Temp\tzaur2jz.hwv .hgignore src\img\06_������������_m.jpg
hg push --new-branch default
pushing to bla-bla-bla
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 2 changes to 2 files
Успешно выполнено.

What is this, guys?! What kind of "full UTF-8 support" is this in a modern tool? Let's not support the letter W either (after all, it is made of two Vs, so the user can write VV instead of W)! That would be funny at the very least, wouldn't it?

P.S. Strange that it's already August 2016 and this still isn't fixed.

gedeonych December 27, 2018

P.S. Strange that it's already December 2018 and this still isn't fixed.

Max Reshetov November 8, 2019

P.S. Strange that it's already November 2019 and this still isn't fixed...

Sur0vy June 11, 2021

I have the same problem (I think).

If I select the win-1251 code page, the text in the commit tree looks like this:

работа над прошивкой main платы - (работа с асинхронным двигателем + с датчиком температуры) Sur0vy <Sur0vy@xxxxx.ru> daca7cf4d1f55321e3290a2da219e6897eabe5eb 03.06.2021 10:36:46

and file names look normal (even in Russian).

But if I change the code page to UTF-8, all of my commits look normal, but Russian file names are corrupted:

docs_desc\������������_.doc

How can I solve this?
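
The symptoms suggest the commit messages are UTF-8 while the file names are in cp1251, so one global code page cannot decode both. One common workaround in tools that face this mix (not a SourceTree feature, only a sketch) is to decode each field separately: try strict UTF-8 first and fall back to a legacy code page when that fails. The function `decode_field`, the cp1251 fallback, and the sample file name are all hypothetical.

```python
def decode_field(raw: bytes, fallback: str = "cp1251") -> str:
    """Decode as strict UTF-8; if that fails, use the fallback code page."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback, errors="replace")

# Assumed inputs: a UTF-8 commit message and a cp1251 file name.
commit_message = "работа над прошивкой".encode("utf-8")
file_name = "документ.doc".encode("cp1251")  # hypothetical file name

print(decode_field(commit_message))
print(decode_field(file_name))
# Decoding the UTF-8 message as cp1251 gives garbling like in the commit tree:
print(commit_message.decode("cp1251"))
```

This heuristic can misfire on the rare legacy-encoded string that also happens to be valid UTF-8, so it is a mitigation rather than a guarantee.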

0 votes
Alexey Makarenya June 6, 2016

Let me add some info.
The current version of SourceTree tries to detect a file's encoding by calling the IMultiLanguage2::DetectInputCodepage method. If the file contains a BOM, this method detects the encoding successfully. If the file contains only ASCII characters, codepage detection is unnecessary. But if the file contains non-ASCII characters and no BOM, DetectInputCodepage rolls the dice. Or maybe it checks the phase of the moon, or the temperature on Mars... I don't know what it actually does, but it is not codepage detection. The result is random. At the same time, some build systems require that files must NOT include a BOM; PHP is one example. So if my system requires files without a BOM, I can't use any characters except ASCII.
Strictly speaking, I can use any codepage, but SourceTree may (and often does) display the file with the wrong encoding.
The current version of SourceTree trusts this foolish DetectInputCodepage entirely, and it gives no way to override its bad prophecies.
And this is a bug, not a feature.
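
To illustrate why a BOM makes detection easy and its absence makes it a guess, here is a small Python sketch. It does not reproduce what DetectInputCodepage does internally; `detect_bom` and `plausible_decodings` are hypothetical helpers, and the candidate list is an arbitrary choice.

```python
import codecs

# Longest BOMs first: the UTF-32 LE BOM starts with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def detect_bom(data: bytes):
    """Return a codec name if data starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None

def plausible_decodings(data: bytes, candidates=("cp1251", "koi8-r", "latin-1")):
    """Decode data under every candidate codec that accepts it without error."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            pass
    return results

no_bom = "Привет".encode("cp1251")  # Cyrillic text saved without a BOM

print(detect_bom(codecs.BOM_UTF8 + b"<?php"))  # BOM present: unambiguous
print(detect_bom(no_bom))                      # no BOM: nothing to go on
print(plausible_decodings(no_bom))             # several codecs accept the bytes
```

Every single-byte candidate accepts the same bytes and yields different text, which is why a detector without a BOM has to guess and why a manual override is useful.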

Alexey Makarenya June 6, 2016

Of course, decompiling EncodingTools.dll, patching it by hand, and replacing it in the folder saves the situation. But it's too risky a solution.

0 votes
stevestreeting
Rising Star
July 18, 2013

Actually, starting with SourceTree for Windows 1.1 we'll support non-UTF-8 encodings, specifically old-style Windows codepages. https://jira.atlassian.com/browse/SRCTREEWIN-169

We can mostly auto-detect this in the diff view, but you should set your default codepage option in Tools > Options after release so that other output will be decoded correctly.

0 votes
KieranA
Rising Star
July 11, 2013

Hi haiflive,

We generally don't support anything beyond UTF-8 encoding as this covers most bases. UTF-8 does support Russian character sets, too (as shown here). If you can use that encoding instead then that'd solve the problem. For us to support all character sets would involve a lot of work so we generally support the most popular ones that have support for the majority of characters used across languages (UTF-8, Latin-1).

Hope that helps
