I'm making manual backups of remote Windows shares to my Ubuntu server. It's as simple as mounting the shares with mount -t cifs and running rsync on that.
My problem is with accented letters and special characters such as the euro currency symbol inside files (not only in file names). For instance, if I edit or view a text file on a remote Windows 7 host with Notepad or WordPad, I can see the euro symbol €. However, after I rsync the file to Ubuntu Linux, the text file shows a weird symbol in place of the currency sign, whether I view it with cat, mc, nano or gedit.
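To see what the copied file really contains, the raw bytes can be dumped with od. The sketch below uses a temporary stand-in file rather than the real share, and assumes the euro sign was saved as the single byte 0x80, which is how Windows-1252 encodes it; the variable names are only illustrative.

```shell
# Stand-in for the copied file (assumption: Windows-1252, where the euro sign is byte 0x80).
sample=$(mktemp)
printf '\200' > "$sample"

# Dump the bytes as hex and strip od's padding.
raw_hex=$(od -An -tx1 "$sample" | tr -d ' \n')
echo "$raw_hex"   # 80 here; a UTF-8 euro sign would show as e282ac

rm -f "$sample"
```

Running the same od command on the real file would show whether the bytes were changed during the copy or are just being displayed with the wrong encoding.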
I also tried changing the locale on Ubuntu, without success. I can type the euro symbol and create a new text file with it just fine directly on the Ubuntu host, so maybe the problem is with either mount.cifs or rsync. I read about the iocharset option for mount and tried a few values, but that did not help either.
With rsync I'm using the -a option only.
Any suggestions? What can I try/test?
Thanks
[UPDATE 02/2017] On my Windows 7 system I ran:
chcp
Active code page: 850
However, on my Ubuntu box the following command does NOT produce a file from which I can correctly display the euro symbol with 'cat' or 'more'.
sudo iconv -f CP850 -t UTF-8 /Windows/share/README.txt > /tmp/README.txt
However, this other command DOES.
sudo iconv -f CP1252 -t UTF-8 /Windows/share/README.txt > /tmp/README.txt
Why?
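A likely explanation, as far as I can tell: chcp reports the console (OEM) code page, whereas Notepad saves text in the ANSI code page, which is Windows-1252 on Western European systems. The single-byte test below is a sketch of the difference, assuming the file stores the euro sign as byte 0x80; the variable names are only illustrative.

```shell
# The same byte, 0x80, decoded with each code page.
as_cp1252=$(printf '\200' | iconv -f CP1252 -t UTF-8)
as_cp850=$(printf '\200' | iconv -f CP850 -t UTF-8)

printf 'CP1252: %s\n' "$as_cp1252"   # CP1252: €
printf 'CP850:  %s\n' "$as_cp850"    # CP850:  Ç
```

CP850 has no euro sign at all, so converting from it cannot produce one.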
Unfortunately, I can't run iconv by hand on thousands of files, let alone repeat that after each rsync.
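For completeness, the per-file conversion could be scripted roughly as below, though it would still have to be rerun after every rsync. This is only a sketch: convert_tree is a hypothetical helper, not part of rsync or iconv, and it assumes every .txt file is Windows-1252 and that no file name contains a newline.

```shell
# convert_tree SRC DST: hypothetical helper that writes UTF-8 copies of
# all .txt files under SRC into the same relative paths under DST.
convert_tree() {
    src=${1%/}   # drop a trailing slash so the prefix strip below matches
    dst=${2%/}
    find "$src" -type f -name '*.txt' | while IFS= read -r file; do
        rel=${file#"$src"/}                  # path relative to SRC
        mkdir -p "$dst/$(dirname "$rel")"    # recreate the subdirectory
        # Assumption: the source encoding is CP1252 for every file.
        iconv -f CP1252 -t UTF-8 "$file" > "$dst/$rel"
    done
}
```

It would be called as convert_tree /Windows/share /some/output/dir, where the output directory is a placeholder. The originals are left untouched, so rsync would not see them as changed on the next run.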
I found out that I can avoid running iconv by enabling the WINDOWS-1252 encoding in the Terminal Shell. Then a 'cat' or 'more' on the Windows file correctly displays the euro symbol.
However, opening the same Windows file with gedit through the "Files" browser in Ubuntu displays the euro symbol incorrectly again. So I guess I should either enable WINDOWS-1252 system-wide on Ubuntu (how can I do that?) or force the Windows system to save files as UTF-8 (I don't know how to do that either).