
Considering UTF-8 + Windows CMD nightmare...

After reading this question, are these solutions only partial? Is there a way to globally set the character set/encoding in a cmd environment? It seems that the CHCP command does not change the stdout/stderr encodings.

To check it: write a program that fills a file with Latin/Korean/Ukrainian strings.

On direct output, the file will be OK if you set the encoding properly in your source code (I checked it with Java, which has easy encoding settings for files). But if you redirect your output into a log file, you will simply get a series of ???????????????????? in it ...

Redirection could be useful too, like this:

PROMPT> myprog < inputdata.txt > outputdata.txt

Am I missing something? Is it cmd that converts stdout badly, or Java that adapts System.out depending on the cmd encoding? I have not found any way to redefine the System.out/err encoding.

Grubert
  • Read http://ss64.com/nt/chcp.html and [this detailed analysis](http://stackoverflow.com/a/17177904/3439404) in the great answer by @andrewdotn to another question on SO. FYI, I have the `DejaVu Sans Mono` font installed. – JosefZ Jun 17 '15 at 15:58
  • To answer the question of whether it's cmd or the program: try pasting the character into cmd. If it shows up there, then cmd is fine, i.e. the font supports it. I find that type can display a file with unusual characters if it is Unicode LE (run xxd -p file and look for fffe at the start; saving a file in Notepad as 'Unicode' produces Unicode little endian), but more cannot display these characters. – barlop Jun 17 '15 at 23:49
  • a related question but for C# http://stackoverflow.com/questions/30904504/font-is-right-why-cant-i-get-this-unicode-character-to-display-in-this-c-sharp – barlop Jun 18 '15 at 02:15
  • I find that for redirection, UTF-8 works in C# though Unicode doesn't. – barlop Jun 20 '15 at 18:01
  • Many thanks for your answers, I finally got it: whatever the session settings are, you must redefine stdout and stderr. For Java, do something like: `myStdOut = new PrintWriter(new OutputStreamWriter(System.out, "UTF8"));` See this post: https://poeticcode.wordpress.com/2009/01/19/systemout-and-utf8/ . Many thanks to this contributor. I am not sure at this time how to deal with System.in. – Grubert Jun 22 '15 at 08:57
  • @Grubert Put better, you mean `PrintWriter out = new PrintWriter(new OutputStreamWriter(System.out)); out.println("some-utf8-string");` I'm not in front of Java right now, but you could experiment with InputStreamReader(System.in) and readLine(). You should ask on Stack Overflow; it's a coding issue, as you know. – barlop Jun 25 '15 at 11:58
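
The approach discussed in the comments above (wrapping System.out in a UTF-8 writer, and experimenting with InputStreamReader on System.in) could be sketched roughly as follows. This is only a minimal illustration, not code from the thread: the class name `Utf8Filter` and the method `copyLines` are hypothetical, and it assumes the input file is UTF-8 encoded.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical example class: copies stdin to stdout as UTF-8 text.
class Utf8Filter {
    // Decodes the input as UTF-8 and encodes the output as UTF-8,
    // independent of the console code page or the JVM default charset.
    public static void copyLines(InputStream rawIn, OutputStream rawOut) throws IOException {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(rawIn, StandardCharsets.UTF_8));
        PrintWriter out = new PrintWriter(
                new OutputStreamWriter(rawOut, StandardCharsets.UTF_8));
        String line;
        while ((line = in.readLine()) != null) {
            out.println(line); // line terminator is the platform's line separator
        }
        out.flush(); // PrintWriter buffers, so flush before returning
    }

    public static void main(String[] args) throws IOException {
        copyLines(System.in, System.out);
    }
}
```

Once compiled, it could be run in the same way as the command in the question, for example `java Utf8Filter < inputdata.txt > outputdata.txt`. Because the charset is passed explicitly on both sides, the redirected file should not depend on the CHCP setting.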

1 Answer


Considering UTF-8 + Windows CMD nightmare...

Works for C#.

It should work for Java too; maybe you are doing it incorrectly. You should post your problem code on Stack Overflow and ask where you are going wrong with the encoding statements.

To check it: write a program that fills a file with Latin/Korean/Ukrainian strings.

I have done something like that in C#.

On direct output,

You mean on the display.

the file will be OK if you set the encoding properly in your source code (I checked it with Java, which has easy encoding settings for files). But if you redirect your output into a log file, you will simply get a series of ???????????????????? in it ...

You have to get the encoding statement correct in your code; then the > redirection will work.
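
As a rough Java illustration of what a correct encoding statement might look like (a sketch under assumptions, not the asker's code): the standard `System.setOut` and `System.setErr` calls can install a `PrintStream` that always encodes as UTF-8. The class name `Utf8Stdout` and the helper `utf8Stream` are hypothetical. The sample string uses `\u` escapes so that the result does not depend on the encoding of the source file.

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical example class: forces stdout and stderr to UTF-8.
class Utf8Stdout {
    // Wraps a raw byte stream in an auto-flushing PrintStream that encodes as UTF-8.
    public static PrintStream utf8Stream(OutputStream raw) throws UnsupportedEncodingException {
        return new PrintStream(raw, true, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Replace the default streams, which otherwise use a platform-dependent charset.
        System.setOut(utf8Stream(new FileOutputStream(FileDescriptor.out)));
        System.setErr(utf8Stream(new FileOutputStream(FileDescriptor.err)));

        // Latin, Korean and Ukrainian sample text, written with escapes.
        System.out.println("caf\u00E9 \uD55C\uAD6D\uC5B4 "
                + "\u0443\u043A\u0440\u0430\u0457\u043D\u0441\u044C\u043A\u0430");
    }
}
```

Running `java Utf8Stdout > outputdata.txt` should then produce a UTF-8 file whatever the CHCP setting is. Whether the same text displays correctly in the console window is a separate matter that still depends on the active code page and font.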

I haven't had to change CHCP just to redirect non-ASCII Unicode characters to a file.

Redirection could be useful too, like this:

PROMPT> myprog < inputdata.txt > outputdata.txt

Am I missing something? Is it cmd that converts stdout badly, or Java that adapts System.out depending on the cmd encoding? I have not found any way to redefine the System.out/err encoding.

It's all an issue with your Java code.

See it work here in C#:

https://stackoverflow.com/questions/30904504/font-is-right-why-cant-i-get-this-unicode-character-to-display-in-this-c-sharp

And look at my comment on Htin's answer. But that's for C#.

If you want it for Java, post a demonstrative piece of code with your question on Stack Overflow. It's a programming issue that you have.

barlop