I'm making manual backups of remote Windows shares to my Ubuntu server. It's as simple as mounting the shares with mount -t cifs and running rsync on that.

My problem is with accents or special chars like the EURO currency symbol WITHIN files (not only in file names). So for instance, if I edit/view a text file on a remote Windows 7 host with, say, notepad or wordpad, I can see the euro symbol €. However, when I rsync the file to Ubuntu Linux, the text file contains a weird symbol instead of the currency sign, whether I view it with cat, mc, nano or gedit.

I also tried changing the locale on Ubuntu but had no luck. I can type the euro symbol and create a new text file with it just fine directly on the Ubuntu host. So maybe the problem is with either mount.cifs or rsync. I read about the iocharset option for mount and tried a few values but had no luck.

With rsync I'm using the -a option only.

Any suggestions? What can I try/test?

Thanks

[UPDATE 02/2017] On my Windows 7 system I ran:

chcp
Active code page: 850

However, on my Ubuntu box the following command does NOT produce a file from which I can correctly display the euro symbol with 'cat' or 'more'.

sudo iconv -f CP850 -t UTF-8 /Windows/share/README.txt > /tmp/README.txt 

However, this other command DOES.

sudo iconv -f CP1252 -t UTF-8 /Windows/share/README.txt > /tmp/README.txt 

Why?
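
A likely explanation, offered as a sketch rather than a confirmed diagnosis: `chcp` reports the console (OEM) code page, whereas Notepad's "ANSI" save format uses the Windows ANSI code page, which would be Windows-1252 on a Western European system. Assuming the file stores the euro sign as the single byte 0x80 (an assumption, though consistent with the CP1252 conversion working), the Python snippet below shows how that byte decodes under each code page. The helper names `decode_as` and `can_encode` are made up for illustration.

```python
import unicodedata

# Assumption: the euro sign is stored as the single byte 0x80, as it would be
# in a file saved with the Windows-1252 ("ANSI") encoding.
EURO_BYTE = b"\x80"


def decode_as(raw: bytes, encoding: str) -> str:
    """Decode raw bytes using the given code page."""
    return raw.decode(encoding)


def can_encode(text: str, encoding: str) -> bool:
    """Return True if the code page can represent every character in text."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # Print character names instead of the characters themselves, so the
    # output does not depend on the terminal's own encoding.
    print(unicodedata.name(decode_as(EURO_BYTE, "cp1252")))  # EURO SIGN
    print(unicodedata.name(decode_as(EURO_BYTE, "cp850")))   # LATIN CAPITAL LETTER C WITH CEDILLA
    print(can_encode("\u20ac", "cp850"))                     # False
```

Under that assumption, reading the file as CP850 turns the byte into Ç, and CP850 has no euro sign at all, so `iconv -f CP850` cannot produce one from this file.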

Unfortunately I can't run iconv on thousands of files and furthermore, do that after each rsync.
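
If converting after each sync turns out to be acceptable after all, the per-file step could be scripted. The following is a minimal Python sketch, assuming every matching file really is Windows-1252 text; the function name `transcode_tree`, its parameters, and the `*.txt` pattern are hypothetical choices for illustration, not part of rsync or iconv.

```python
from pathlib import Path


def transcode_tree(src_root, dst_root, src_encoding="cp1252", pattern="*.txt"):
    """Write UTF-8 copies of matching files from src_root under dst_root.

    Assumption: every matching file is text in src_encoding. A byte that is
    undefined in that encoding raises UnicodeDecodeError instead of being
    converted silently.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    converted = []
    for src in sorted(src_root.rglob(pattern)):
        if not src.is_file():
            continue
        # Mirror the relative path of the source file below dst_root.
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        # Work on bytes so that Windows line endings (CRLF) stay untouched.
        text = src.read_bytes().decode(src_encoding)
        dst.write_bytes(text.encode("utf-8"))
        converted.append(dst)
    return converted
```

For example, `transcode_tree("/Windows/share", "/tmp/share-utf8")` would leave the mounted share untouched and write the UTF-8 copies to a separate tree (the destination path here is only a placeholder). Writing to a separate tree rather than converting in place keeps the rsync target byte-identical to the source, so later rsync runs do not see the converted files as changed.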

I found out that I can avoid running iconv by enabling the WINDOWS-1252 encoding in the Terminal Shell. Then a 'cat' or 'more' on the Windows file correctly displays the euro symbol.

However, opening the same Windows file with gedit through the "Files" browser in Ubuntu wrongly displays the euro symbol again. So I guess I should either enable WINDOWS-1252 system-wide on Ubuntu (how can I do that?) or force the Windows system to use UTF-8 (I don't know how to do that either).

  • The text file is saved with a different character encoding than you are using on Ubuntu. Look at this answer http://unix.stackexchange.com/questions/78776/characters-encodings-supported-by-more-cat-and-less – Linkan Jan 18 '17 at 17:44
  • The content is likely the same. You probably need `iconv -f cp1252 -t UTF-8 your-file.txt` assuming your Windows system's ANSI code page is cp1252 and your Ubuntu server's locale and the terminal use utf-8. – jfs Jan 19 '17 at 04:43
  • Did you use the command `dos2unix` to convert your text files? `dos2unix -n yourfile.txt yourfile_linux.txt` – kcdtv Feb 07 '17 at 23:05
