14

I have file names in several encodings on a reiserfs-mounted hard drive: CP1251, KOI-8, UTF-8 and ASCII. I need to convert all of them to UTF-8, recursively. Is there a utility that will detect the source encoding and convert it to UTF-8, or do I have to write a Python script?

Pablo
  • 2,597

3 Answers

24

Use convmv, a CLI tool that converts file names between different encodings.

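To convert from (-f) each of these encodings to (-t) UTF-8, run the following. By default convmv only prints what it would do; add --notest to actually rename the files, and -r to recurse into directories. Pure ASCII names are already valid UTF-8, so the last command changes nothing.

convmv -f CP1251 -t UTF-8 inputfile
convmv -f KOI8-R -t UTF-8 inputfile
convmv -f ASCII  -t UTF-8 inputfile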

If you also want to convert the file contents, use iconv, a CLI tool that converts text between encodings.

To convert from (-f) each of these encodings to (-t) UTF-8, run the following, writing to an output file different from the input file:

iconv -f CP1251 -t UTF-8 inputfile > outputfile
iconv -f KOI8-R -t UTF-8 inputfile > outputfile
iconv -f ASCII  -t UTF-8 inputfile > outputfile
2

Nope. One of the big downsides to the old code page system is that there is no way to detect which one is being used; you must simply know that a priori. If you do know which files are using which encoding then you can convert the names using something like:

mv somefile "$(echo somefile | iconv -f CP1251 -t UTF-8)"
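
Although the legacy code pages cannot be told apart reliably, names that are already valid UTF-8 (plain ASCII included) can at least be recognized and left alone. The following is a minimal sketch, assuming an iconv that exits non-zero on invalid input (as glibc's does); the function names is_utf8 and to_utf8_name are hypothetical. Note that a legacy-encoded name can occasionally happen to be valid UTF-8 by coincidence, so this is a heuristic, not a detector.

```shell
#!/bin/sh
# Hypothetical helpers; they assume iconv exits non-zero on invalid input.

# Succeeds if the string is already valid UTF-8 (pure ASCII counts too).
is_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Usage: to_utf8_name SOURCE_ENCODING NAME
# Prints NAME converted to UTF-8, or unchanged if it is already valid UTF-8.
to_utf8_name() {
    if is_utf8 "$2"; then
        printf '%s\n' "$2"
    else
        printf '%s' "$2" | iconv -f "$1" -t UTF-8 && echo
    fi
}
```

With these, a rename could be written as mv -- "$f" "$(to_utf8_name CP1251 "$f")", and files whose names are already UTF-8 would be left untouched.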
psusi
  • 38,031
1

The same iconv solution that @psusi suggests, but with a loop and a wildcard, as a one-line sh script:

for f in /path/*.txt; do mv "$f" "$(echo "$f" | iconv -f 866 -t UTF-8)"; done

Reading the wildcard expansion from a pipeline:

printf '%s\n' * | while IFS= read -r f; do mv "$f" "$(echo "$f" | iconv -f 866 -t UTF-8)"; done
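
Since the question asks for a recursive conversion, the loop can be extended with find. This is only a sketch under several assumptions: all non-UTF-8 names under the directory share one known source encoding, no name contains a newline, find supports -mindepth (GNU and BSD find do), and iconv exits non-zero on invalid input. The function name rename_tree is hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: rename every entry under DIR from one assumed
# source encoding to UTF-8.
# Usage: rename_tree SOURCE_ENCODING DIR
rename_tree() {
    enc=$1
    dir=$2
    # -depth lists a directory's contents before the directory itself,
    # so renaming a parent never invalidates a path still to be processed.
    find "$dir" -depth -mindepth 1 | while IFS= read -r path; do
        base=${path##*/}      # last path component
        parent=${path%/*}     # everything before it
        # Names that are already valid UTF-8 (ASCII included) are skipped.
        if printf '%s' "$base" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
            continue
        fi
        # Skip names that cannot be decoded from the assumed encoding.
        new=$(printf '%s' "$base" | iconv -f "$enc" -t UTF-8) || continue
        # Never overwrite an existing entry.
        [ -e "$parent/$new" ] || mv -- "$path" "$parent/$new"
    done
}
```

For example, rename_tree CP1251 /path would convert only the last component of each path, so directories and the files inside them are handled independently. Trying it on a copy of the data first is advisable.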
oklas
  • 191