9

On Windows 7, dir or tree can't show Unicode characters, even when cmd is started with cmd /U

I press Windows key + R to open the Run dialog and type cmd /U, hoping the console will then handle Unicode.

But when I use dir or tree /F, filenames containing Unicode characters are not displayed correctly. (In Windows Explorer, the file manager, they display fine.)

Is there a way to handle this? To get Unicode characters for test filenames, you can go to

http://news.google.com/news?edchanged=1&ned=tw

and you will be able to get many Unicode characters there (UTF-8)

nonopolarity

4 Answers

9

Change the font for the console window to a TrueType font, such as Lucida Console or Consolas. With raster fonts you are restricted to the OEM character set.

cmd /u only changes output piped into files, not what you see on screen.

PowerShell by default uses a TrueType font which is why it worked for you.

This has nothing to do with cmd.
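To make the point about `cmd /u` concrete, here is a minimal Python sketch. It assumes a listing was captured with something like `cmd /u /c dir /b > listing.txt`, in which case the redirected bytes are UTF-16LE. The helper name `decode_cmd_u_output` and the sample filenames are hypothetical, and the bytes are simulated here rather than read from a real file.

```python
def decode_cmd_u_output(raw: bytes) -> str:
    """Decode bytes assumed to come from redirected `cmd /u` output (UTF-16LE)."""
    text = raw.decode("utf-16-le")
    # Strip a byte order mark in case some tool in the pipeline added one.
    return text.lstrip("\ufeff")


# Simulated redirected output; in practice this would be
# open("listing.txt", "rb").read() (hypothetical file name).
sample = "私.txt\r\nplain.txt\r\n".encode("utf-16-le")

names = decode_cmd_u_output(sample).splitlines()
print(len(names), "names decoded")
```

The names survive the round trip regardless of which font the console window uses, which is why the redirected file and the on-screen display have to be treated as separate problems.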

Joey
  • 1
    Even if I change to Lucida or Consolas and run cmd /u, I don't get the Unicode characters. – Snark Apr 10 '11 at 08:07
  • 2
    If you see boxes, then it *does* indeed work. The console subsystem does not support font switching so it can only use glyphs from the single font you specified. And since neither font has glyphs for Han ideographs you'll see only boxes. However, the text *is* there; you can copy and paste it, for example. You won't see anything different in PowerShell, though, unless you use the PowerShell ISE (which is not a console application and therefore not subject to the same limitations). – Joey Apr 10 '11 at 08:31
  • 1
    @Joey The problem is entirely a result of using CMD. CMD *cannot* display Unicode characters. CMD can display DBCS characters, but only that of your system locale. Change your system locale to Shift-JIS, reboot, and you'll be able to show Japanese (Shift-JIS only, not Unicode) characters. The CMD command line processor supports Unicode, thus you can pipe output to files and open them in Notepad, for example, but you can't *display* Unicode. – Jeff Dec 02 '14 at 03:09
  • 1
    @Jeff it can't display a lot of characters simply because those characters don't exist in the current font. It can't substitute glyphs from another font the way a GUI does, because the font policy in cmd is very strict: it only accepts certain fixed-width fonts. It isn't because of the locale, because if I set the codepage to Vietnamese, cmd can still display Russian, Turkish, Japanese... characters without problem, provided that the characters are available in the font. – phuclv Dec 02 '14 at 07:51
  • It is not really true that `This has nothing to do with cmd`. When I run `dir /s` in the console, Unicode characters are shown correctly, whereas when I redirect the output into a file, they are garbled. Only `chcp 65001` solves the problem. – Suncatcher May 29 '18 at 12:41
  • @Suncatcher: Redirection is a feature of `cmd`, so of course, if you use such a feature, `cmd` is involved. Try selecting and copying Unicode characters in the console, though, which has nothing to do with `cmd`. – Joey May 29 '18 at 13:58
1

https://stackoverflow.com/questions/10764920/utf-16-on-cmd-exe

  1. Open/run cmd.exe
  2. Click on the icon at the top-left corner of the window
  3. Select Properties
  4. Go to the Font tab
  5. Select Lucida Console and click OK
  6. Type chcp 10000 at the prompt
  7. Finally, run dir /b

Also from https://stackoverflow.com/questions/379240/is-there-a-windows-command-shell-that-will-display-unicode-characters/24135341#24135341

  1. CHCP 65001
  2. DIR > UTF8.TXT
  3. TYPE UTF8.TXT
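As a rough illustration of why switching to code page 65001 matters for redirected output, the Python sketch below encodes one filename under an OEM code page and under UTF-8. Python's codecs stand in for the console's conversion here, so this is an analogy under that assumption, not a claim about the exact bytes cmd writes. The filename is made up.

```python
name = "私.txt"  # hypothetical filename containing a Han ideograph

# An OEM code page such as 437 has no slot for the ideograph, so it is lost
# and replaced by a placeholder during encoding.
oem_bytes = name.encode("cp437", errors="replace")

# Code page 65001 is UTF-8, which can encode every Unicode character.
utf8_bytes = name.encode("utf-8")

oem_roundtrip = oem_bytes.decode("cp437")
utf8_roundtrip = utf8_bytes.decode("utf-8")

print(utf8_roundtrip == name)  # the UTF-8 round trip preserves the name
print(oem_roundtrip == name)   # the OEM round trip does not
```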
0

Reg file

Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Console]
"CodePage"=dword:fde9

Command Prompt

REG ADD HKCU\Console /v CodePage /t REG_DWORD /d 0xfde9

PowerShell

sp -t d HKCU:\Console CodePage 0xfde9

Cygwin

regtool set /user/Console/CodePage 0xfde9
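The value `0xfde9` used in all four variants is simply code page 65001 (UTF-8) written in hexadecimal. A short Python check of the arithmetic, with variable names chosen only for illustration:

```python
CODE_PAGE_HEX = "fde9"  # value taken from the registry snippets above

# Convert the hexadecimal DWORD value to decimal.
code_page = int(CODE_PAGE_HEX, 16)

# The same value padded to the eight hex digits regedit shows when exporting.
dword_literal = f"dword:{code_page:08x}"

print(code_page)
print(dword_literal)
```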

0

It's not just a command prompt problem, but a Windows problem in general. The C "wide-character" functions in Windows (namely wprintf) do not support Unicode.

user541686
  • Proof for your claim? MSDN constantly mentions the wide-char variants for Unicode. – Joey Apr 10 '11 at 08:34
  • 1
    [Here](https://connect.microsoft.com/VisualStudio/feedback/details/101864/wprintf-has-no-unicode-support)'s one. – user541686 Apr 10 '11 at 08:37
  • you are aware that that is a bug that was reported on a *prerelease version* of Visual Studio *2005* and has been long fixed (not to mention that we're at VS 2010 by now ...)? Note also that the Windows API can easily be used and there is no particular reason to use the C standard library in places where no portability problems are likely to be encountered. – Joey Apr 10 '11 at 20:18
  • @Joey: No, I wasn't aware of that. However, right now I'm using Visual Studio 2008 and calling `wprintf(L"私")`, and it's definitely *not* printing anything Unicode. Do you have any example of complex scripts that *do* get printed with `wprintf`? (Locale changing shouldn't be needed because the strings are UTF-16.) – user541686 Apr 10 '11 at 20:24
  • 1
    The Windows console does support Unicode because it uses Unicode behind the scenes http://stackoverflow.com/questions/1259084/what-encoding-code-page-is-cmd-exe-using?lq=1 http://stackoverflow.com/questions/2213541/vietnamese-character-in-net-console-application-utf-8?lq=1 http://stackoverflow.com/questions/388490/unicode-characters-in-windows-command-line-how?lq=1 In Windows 7, codepage 65001 can't display the characters correctly, but you can copy them to a text editor to see the correct output. In Windows 8 it displays UTF-8 without problem. – phuclv Jun 10 '14 at 09:56
  • @LưuVĩnhPhúc: The question specifically said the console *cannot **show*** Unicode characters, *not* that it cannot *copy* them... so being able to copy-paste to see it correctly doesn't change anything about my answer. I haven't tried it on Win8 but I'm skeptical it would actually display Unicode properly... have you tried something complex like Chinese? – user541686 Jun 10 '14 at 11:46
  • Sorry, I haven't tested Chinese characters on Windows 8. But the inability to display all Unicode characters is actually a font problem, not a matter of wprintf lacking Unicode support, so your last statement is not correct. Chinese and Japanese locale versions of Windows use a different font that I can't select in other locales. I don't know why they still don't allow all monospace fonts and font substitution in the console, even though this was pointed out decades ago. – phuclv Jun 10 '14 at 12:10
  • I have tested and indeed wprintf supports Unicode on Windows – phuclv Jun 10 '14 at 12:10
  • @LưuVĩnhPhúc CMD cannot display Unicode characters. It can display DBCS characters. If you're seeing DBCS characters, it's because your system locale is set to something appropriate. For example, Japanese Shift-JIS (code page 932) will allow the display of Japanese characters. This is via DBCS, however, not Unicode. – Jeff Dec 02 '14 at 03:17
  • @Jeff Did you have a look at the pages in my comment above? cmd supports UTF-16 natively, and UTF-8 if set to codepage 65001. You can write Unicode characters there without changing the codepage. It's just that the display is not correct in some environments. – phuclv Dec 02 '14 at 03:42
  • @LưuVĩnhPhúc as I said, CMD cannot *display* Unicode. I don't know what you think you're seeing displayed, but it's not Unicode. Given that the entire question was about *displaying* Unicode characters, I don't see the relevancy of CMD being Unicode "behind the scenes" and being able to copy/paste or pipe Unicode output. Your statement that the inability to display Unicode is a *font problem* is incorrect. It's because CMD is a DBCS program, not a Unicode program. – Jeff Dec 02 '14 at 06:01
  • @Jeff no, it supports Unicode. It's just the DBCS codepoints converted to Unicode. Don't you see the pictures with characters from different codepages? How can you display that in a specific codepage? And most of those codepages aren't DBCS either because they only use 1 byte for encoding the characters. – phuclv Dec 02 '14 at 07:43