11

About a week ago I noticed that the file list in µTorrent would hang for a fraction of a second whenever a file with a long Japanese file name was visible. I found it curious, but I didn't have time to look into it then, especially since it seemed to be limited to µTorrent.

However, today I realized that it is not. For example, if I save a text file with a long multibyte-character file name and open it in Notepad, I get some strange results. When I try to resize the window, everything slows to a crawl. I can, however, release my grip on the window and watch my cursor split in two: one controlled by me, and the other a sort of "ghost cursor" (for lack of a better word) that replays the dragging motion I originally made with the mouse. This only happens with file names of this nature, and I have tested it in applications other than Notepad and µTorrent as well.

I've tried searching for clues as to what is causing this strange behavior, but I cannot find anything. Does anyone here have any idea what's going on?

Unfortunately, I cannot take a screenshot of this as it seems like all screenshot applications hang until the resizing is complete before taking the shot...

Edit: I've recorded a video demonstrating the problem. I'm not sure whether this will help in identifying the cause but it should at least be better than my explanation above:

https://vimeo.com/58619918

Edit 2: Here's a sample file as requested. Note that it's simply an empty file with a long multibyte filename: http://goo.gl/bgnGP (And for those of you with a browser which can't handle the filename, here's a zip file: https://dl.dropbox.com/u/55495248/multibyte.zip)

Merigrim
  • I was going to upload it to YouTube at first, but apparently it's impossible without "upgrading" your account to show your real name. No thanks. I hope Vimeo's okay. – Merigrim Jan 31 '13 at 12:31
  • Could you tell us some details about the computer? In particular, which video card do you use (or is it integrated graphics on the chip)? Are the video drivers up to date? Rendering problems can be caused by the video hardware or drivers, not just Windows. – woliveirajr Feb 05 '13 at 15:19
  • @woliveirajr Sure. Here's a stripped DxDiag.txt (contains info about CPU, GPU, memory, etc.): http://pastebin.com/eYvS8mGL I think it's been a month or two since I updated my video drivers, I'll give it a go. – Merigrim Feb 05 '13 at 18:34
  • Try the first answer to the question http://superuser.com/questions/371282/my-windows-7-has-suddenly-stopped-displaying-unicode-symbols and see if it helps... – woliveirajr Feb 07 '13 at 11:03
  • and also (in the same link above) the note about http://support.microsoft.com/kb/2505438 – woliveirajr Feb 07 '13 at 11:04
  • Could you provide some more details about your computer's specs? It's possible that the file name is getting pushed to memory and your computer is having trouble cycling through events with such a large text object. –  Mar 01 '13 at 21:32
  • Just watched your video link. I think the above is likely the cause. Whatever multibyte character combination you're using for the file name is so large that when it's loaded into memory, it's causing the application that's loading it to lag behind. Does it cause any other issues, e.g. screen tearing? Does it cause any issues with other programs? I would just stay away from giving files such long names to begin with. Remember, a file name is just a pointer, so it will be called as many times as the data that it points to is needed. –  Mar 01 '13 at 21:37
  • @Ash The only problem that arises is the delay, and it seems to be there for all programs which I have tested so far. No screen tearing or anything else I can think of. (Accidental submit, more to come in a minute) – Merigrim Mar 02 '13 at 00:38
  • @Ash (Continued) If I get what you're saying correctly, the window caption string is repeatedly read when rendering the application, thus slowing down the rendering process? While I could accept this as the cause, it seems strange. My computer, while not top-of-the-line anymore, has good components (most of which you can find in the link above; it doesn't say, but I have 8GB DDR3 1600MHz RAM). – Merigrim Mar 02 '13 at 00:45
  • Are you sure it's only with Japanese names? Because since "recently", µTorrent has been doing things like "loading metadata" pretty much all the time for me. – Ariane Mar 07 '13 at 12:24
  • @Ariane Yes, Japanese only (and other East Asian languages too I would presume). I haven't encountered any problems with other names. =( – Merigrim Mar 07 '13 at 16:04
  • Then perhaps it's a character set thing. This is just a guess, but efficiency can be super important to some people in torrents, and most files destined for download/sharing have an Internet-friendly, ASCII-ish name. So maybe, to increase performance, they changed the character set to something other than what they had before, because it would be faster for the vast majority of users (especially since there's a language selection in the program), and they only scan and switch to another character set in exceptional cases. Maybe try setting the program to JP if you can and see. – Ariane Mar 08 '13 at 19:01
  • @Ariane I think it's a good guess, but in this case it's not limited to µTorrent. The problem occurs in applications like Notepad as well. – Merigrim Mar 09 '13 at 01:17
  • Could you post an example file for testing? – OakNinja Mar 09 '13 at 02:34
  • @OakNinja I added a link in the description. =) – Merigrim Mar 09 '13 at 10:13
  • Thank you - but you'll have to ZIP it. Both IE and FF hate that filename :) – OakNinja Mar 10 '13 at 01:15
  • @OakNinja I see, I haven't used either of those in years, so I didn't even consider that possibility. =P I have added a zip file now, thanks for pointing that out. – Merigrim Mar 10 '13 at 01:25
  • The erroneous link in IE really just adds to your question - it shows how fragmented the web (and digital text in general) really is. Character encodings are hell! – OakNinja Mar 10 '13 at 10:29
  • @Merigrim : I think the file name did not survive the zipping and/or uploading. The file's name is î®é-é+é¦éóüBéÃéñéÁé-é¦é¦Äûé¬ïNé½é-éóé_éÀé®üHûlé-é¦é-é¤éÞò¬é®éÞé_é¦é±üBî®é-é+é¦éóüBéÃéñéÁé-é¦é¦Äûé¬ïNé½é-éóé_éÀé®üHûlé-é¦é-é¤éÞò¬é®éÞé_é¦é±üB to me, both inside WinRAR and once the file is extracted. Not so Asian. – Ariane Mar 10 '13 at 20:51
  • @Ariane Huh, that's strange. Are you sure the problem isn't on your end? It extracts fine for me (using 7-Zip and even with just the regular explorer). – Merigrim Mar 11 '13 at 01:43
  • @Merigrim With my very limited knowledge of Japanese and massive help from the IME keyboard layout, I just created a file named あの日は雨でした.txt, successfully made it into a Zip, and extracted it back with WinRAR. The name stayed the same. Either 7-Zip and WinRAR don't communicate well, or it's the upload/download process that poses a problem. ----- Oh, and uhm, for your file, Windows Explorer gives a slightly different result from WinRAR's: î®é─é╚é│éóüBéÃéñéÁé─é▒é╠Äûé¬ïNé½é─éóé▄éÀé®üHûlé═é│é┴é¤éÞò¬é®éÞé▄é╣é±üBî®é─é╚é│éóüBéÃéñéÁé─é▒é╠Äûé¬ïNé½é─éóé▄éÀé®üHûlé═é│é┴é¤éÞò¬é®éÞé▄é╣é±üB.txt – Ariane Mar 11 '13 at 22:20
  • @Ariane I see. Try this one, I used Windows Explorer to create the archive instead of 7-Zip: https://dl.dropbox.com/u/55495248/multibyte2.zip – Merigrim Mar 12 '13 at 02:40
  • Still the same... I guess the one to blame is Dropbox. – Ariane Mar 12 '13 at 23:06
  • @Ariane I don't think so, if I download the file from Dropbox it still extracts just fine. =/ Can you try creating the file yourself? The filename I use is `見てなさい。どうしてこの事が起きていますか?僕はさっぱり分かりません。見てなさい。どうしてこの事が起きていますか?僕はさっぱり分かりません。.txt` – Merigrim Mar 13 '13 at 00:53
  • Seems to work fine for me. How about you try downloading it from my own Dropbox stuff? http://dl.dropbox.com/u/51617032/file.zip When I download it myself, it seems fine. – Ariane Mar 13 '13 at 02:03
  • Can you go to "Control Panel\All Control Panel Items\Display" and check in the sidebar for "adjust ClearType text" and toggle the setting? – uncovery Mar 13 '13 at 06:46
  • I downloaded the file and can't reproduce any of your problems (the text is displayed correctly and everything is smooth). I also have a lot of Japanese text in µTorrent lists, with no issues there either. – キキジキ Mar 17 '13 at 07:17
  • Do you get the same behaviour when your desktop theme is set to Windows 7 basic (i.e. a non Aero theme)? – aportr May 07 '13 at 08:31
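
The garbled name Ariane reports in the comments looks like Shift-JIS bytes displayed through a Western European code page. Here is a minimal Python sketch of that effect, assuming the archive stored the file name as CP932 (Shift-JIS) bytes and the extracting machine decoded them as CP850. Both code pages are guesses inferred from the garbled output; nothing in the thread confirms them.

```python
# Assumption: the ZIP stored the name as Shift-JIS (CP932) bytes and the
# extracting system interpreted those same bytes as CP850.
name = "見てなさい。"  # first sentence of the sample file name

raw = name.encode("cp932")      # two bytes per character in Shift-JIS
garbled = raw.decode("cp850")   # each byte becomes one unrelated character
print(garbled)

# The damage is reversible as long as no byte was substituted along the way.
restored = garbled.encode("cp850").decode("cp932")
print(restored == name)
```

If that assumption holds, the file contents and the name's bytes are intact and only the interpretation differs between machines, which would explain why the archive extracts fine for Merigrim but not for Ariane.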

2 Answers

1

I can explain how Unicode is handled, but I cannot directly answer your question. I have seen slowness the first time text is drawn in such a font, but once that is done, it gets fast again...

Unicode is divided into what we call planes. Each plane holds 65,536 code points. In many situations a font covers only a few hundred characters, in part to avoid very large files but also because that is enough for many languages (English, French, German...). Asian languages, however, need much larger fonts. A reasonably complete Japanese font has to cover thousands of kanji in addition to the kana, and the CJK Unified Ideographs block alone spans U+4E00 to U+9FFF. Chinese needs even more (especially traditional Chinese!)
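
To put numbers on this, here is a small Python sketch that reports which Unicode plane each character of the sample name falls in and how many bytes the name takes in UTF-16 and UTF-8. It uses the standard definition of a plane as a block of 65,536 code points. The shortened `name` string is taken from the file name quoted in the comments, and the variable names are only illustrative.

```python
# Shortened version of the sample file name quoted in the comments.
name = "見てなさい。どうしてこの事が起きていますか"

# A plane is a block of 65,536 code points, so the plane number is the
# code point shifted right by 16 bits.
planes = {ord(ch) >> 16 for ch in name}
print("planes used:", sorted(planes))

# Size of the same string in the two encodings most relevant on Windows.
utf16 = name.encode("utf-16-le")
utf8 = name.encode("utf-8")
print(len(name), "characters,", len(utf16), "UTF-16 bytes,", len(utf8), "UTF-8 bytes")
```

Every character here sits in plane 0, so the string itself is only two bytes per character in UTF-16. The amount of text data is tiny, which fits the idea that any cost comes from selecting fonts and rendering glyphs rather than from the size of the name.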

When rendering such text, the system has to select the right font (if one font does not cover all the characters, the operating system switches between fonts for you; that happens under the hood). That is time-consuming. Plus, the first time the system draws text in a font, it needs to load that font from disk. Since fonts for Asian languages are large, that takes time too.

Finally, and this is most likely what you are encountering, the characters (or glyphs) are generally more complex, which means more time to render them. That could be done by the video card with OpenGL/D3D, but for fonts that does not work so well: you lose a lot of quality (although font quality under MS-Windows...). So it is most often done by the processor.

One last note, although I really doubt it is a concern: by default Win7 makes the window edges semi-transparent. That could add to the problem. This part of the rendering, however, is almost certainly done with accelerated 2D/3D functions on your video card.

Alexis Wilke
-1

If your PC renders a multibyte character, it goes slower because it may need more than one instruction to process the character.

A 64-bit version could get a 64-bit name in 1 call, process it in 1 call and store it in 1 call = 3 calls.

A 32-bit version would have to work with the first 32 bits, then the other 32, and then combine both operations:

get the 64-bit name in 3 calls, process it in 3 calls and store it in 3 calls = 9 calls.

AskPGSV