On a Windows 11 device, I have the regional settings (opened by running "intl.cpl") set to Italian for non-Unicode applications. As a result, applications like WinMerge guess codepage 1252, which is what you would expect for Western Europe. The same is true for WordPad.

However, every terminal application I run (Windows Terminal, cmd.exe, TCC/LE) still starts with cp 850 (DOS Multilingual Latin I), which encodes accented letters differently.
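
To illustrate the mismatch, here is a minimal sketch using Python's built-in `cp1252`, `cp850` and `cp858` codecs. The sample string and the helper name `hex_bytes` are only illustrative assumptions, not something taken from the tools mentioned above.

```python
# Compare how the same Italian accented letters are encoded in the
# "ANSI" code page (1252) and in the "OEM" code pages (850 and 858).
letters = "àèéìòù"  # sample text, chosen only for illustration


def hex_bytes(text, codec):
    """Return the bytes of `text` in `codec` as space-separated hex."""
    return " ".join(f"{b:02X}" for b in text.encode(codec))


for codec in ("cp1252", "cp850", "cp858"):
    print(f"{codec:7}", hex_bytes(letters, codec))

# Bytes written as cp1252 but interpreted as cp850 come out as
# different characters, which is the garbling seen in the console.
mojibake = letters.encode("cp1252").decode("cp850")
print(mojibake)
```

The letters exist in all three code pages, but 850 and 858 place them at different byte values than 1252, so text saved by a GUI editor is displayed incorrectly in a console that is still on 850.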

I can still run CHCP 1252 in every console, but isn't this strange behavior?

EDIT: To add to the fun, if I run bash from cmd.exe or TCC/LE, the accented letters appear to be correct, but when I exit back to the previous command processor I find that the codepage has been switched to cp 858 (which, in any case, does not encode the accented letters the way 1252 does) ...

LuC
  • Besides setting the locale for non-Unicode applications, did you also "copy the locale settings for the welcome screen and new users"? IIRC that also sets it for system accounts, which in turn forces it onto CMD.exe. (A reboot is needed to fully activate the setting. Without a reboot it is not fully functional.) – Tonny Mar 09 '23 at 10:43
  • @Tonny note that it was already the **default** system setting on the freshly installed OS with the Italian language. Anyway, I also rebooted after every change, and I have only one admin user here – LuC Mar 09 '23 at 11:11
  • You didn't mention this was a native Italian install, but if that doesn't help either I'm all out of ideas. Nice to hear that even in 2023 Windows is still a mess when it comes to proper language handling. I'm very happy I only need to deal with English and Dutch so I can usually avoid this. – Tonny Mar 09 '23 at 11:49
  • I fear it has all grown this way over time for compatibility with existing systems, like many solutions before (and after). Unicode arrived some years after PCs, and it wouldn't have been feasible with the memory footprint of 1980 anyway. With a language like Italian we have only a few diacritics; others are in bigger trouble, yup! – LuC Mar 09 '23 at 13:01
  • Western European languages are reasonably well catered for. Everything else... I had to deal with Thai some years ago... The database and its clients handled it well (internally UCS-2), but it was hosted on an old Unix system without support for Thai and had to print to a printer hosted on Windows (winprinter). We had to print everything as "plain text" from the DB (which left the UCS-2 intact) to file and then run the files through special conversion software that converted to the Windows code page. From those we could then create PDFs to be sent to the Windows system for printing. Very messy... – Tonny Mar 09 '23 at 16:50

1 Answer

No, this has always been the case for the Windows console, due to its roots as literally a "virtualized MS-DOS" window before the NT era.

Similar to GUI apps, the console in Windows NT has a separate set of wchar_t-based APIs which programs can use to directly output Unicode text without any codepage conversion.

(Alternatively, in Windows 10 and later, enabling the option to use UTF-8 (65001) as the "ANSI" codepage for non-wide APIs will also select UTF-8 as the "OEM" console codepage.)
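
To see which codepages are actually in effect, a rough sketch like the following can query them through `ctypes`. It assumes a Windows Python with an attached console; the function name `windows_code_pages` is hypothetical, while `GetACP`, `GetOEMCP`, `GetConsoleCP` and `GetConsoleOutputCP` are the kernel32 calls being wrapped.

```python
import ctypes
import sys


def windows_code_pages():
    """Return the active code pages, or None when not running on Windows."""
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.windll.kernel32
    return {
        "ANSI (GetACP)": kernel32.GetACP(),
        "OEM (GetOEMCP)": kernel32.GetOEMCP(),
        # The two console calls return 0 if no console is attached.
        "console input (GetConsoleCP)": kernel32.GetConsoleCP(),
        "console output (GetConsoleOutputCP)": kernel32.GetConsoleOutputCP(),
    }


pages = windows_code_pages()
if pages is None:
    print("Not on Windows; nothing to query.")
else:
    for name, code_page in pages.items():
        print(f"{name}: {code_page}")
```

On a setup like the one in the question, one would expect the ANSI value to be 1252 and the OEM and console values to be 850 until `chcp` changes the console ones.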

u1686_grawity
  • Note that this is also true for PowerShell, which is supposed to be far more modern than the DOS-like consoles. – LuC Mar 09 '23 at 11:02
  • Using UTF-8 was my first attempt, but that gives some more trouble - a `type` or `cat` works, while some other tools/commands, both in TCC/LE and bash, don't always like the way UTF-8 encodes some of the characters in two bytes... – LuC Mar 09 '23 at 11:07