I have file names in several encodings on a reiserfs-mounted hard drive: CP1251, KOI-8, UTF-8 and ASCII. I need to convert all of them to UTF-8, recursively. Is there a utility that will detect the source encoding and convert it to UTF-8, or do I have to write a Python script?
14
Pablo
- 2,597
3 Answers
24
Use convmv, a CLI tool that converts file names between different encodings.
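To convert from these encodings (-f) to UTF-8 (-t), run the commands below. By default convmv only prints what it would rename; add --notest to actually rename, and -r to recurse into directories. KOI-8 is written as KOI8-R here, assuming the Russian variant is meant (convmv --list shows the supported names). Pure ASCII names are already valid UTF-8, so they need no conversion:
convmv -f CP1251 -t UTF-8 -r --notest /path/to/dir
convmv -f KOI8-R -t UTF-8 -r --notest /path/to/dir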
In addition, if you want to convert the file contents, use iconv, a CLI tool that converts text between encodings.
To convert from these encodings (-f) to UTF-8 (-t), do the following. outputfile must be a different file from inputfile, and KOI8-R assumes the Russian variant of KOI-8 is meant:
iconv -f CP1251 -t UTF-8 inputfile > outputfile
iconv -f KOI8-R -t UTF-8 inputfile > outputfile
iconv -f ASCII -t UTF-8 inputfile > outputfile
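If you want to convert a file's contents in place, redirecting iconv's output onto its own input would truncate the file before it is read. A minimal sketch that goes through a temporary file instead; convert_content is a hypothetical helper name, not part of iconv, and it assumes you already know the source encoding:

```shell
# Hypothetical helper: convert the contents of file $1 from encoding $2
# to UTF-8 in place. Returns non-zero and leaves the file untouched if
# iconv rejects the input.
convert_content() {
    tmpfile=$(mktemp) || return 1
    status=0
    # Convert into the temporary file first; only overwrite the original
    # when iconv succeeded. cat into the original keeps its permissions.
    iconv -f "$2" -t UTF-8 "$1" > "$tmpfile" && cat "$tmpfile" > "$1" || status=$?
    rm -f "$tmpfile"
    return "$status"
}

# Usage (not run here):
#   convert_content notes.txt CP1251
```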
Danny Beckett
- 109
Marcos Roriz Junior
- 4,667
2
Nope. One of the big downsides of the old code page system is that there is no reliable way to detect which one is in use; you simply have to know it a priori. If you do know which files use which encoding, you can convert the names with something like:
mv somefile "$(echo somefile | iconv -f CP1251 -t UTF-8)"
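One thing that can be checked is whether a name is already valid UTF-8, because iconv exits with an error on byte sequences that are invalid in the source encoding. A rough sketch built on that; is_valid_utf8 and rename_to_utf8 are made-up helper names, the legacy encoding still has to be supplied by you, and a legacy name that happens to form valid UTF-8 would be skipped:

```shell
# Return 0 if the string in $1 is already valid UTF-8 (plain ASCII counts).
is_valid_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Hypothetical helper: rename file $1 (a name in the current directory)
# from encoding $2 to UTF-8, unless its name is already valid UTF-8.
rename_to_utf8() {
    old=$1
    from=$2
    if is_valid_utf8 "$old"; then
        return 0    # nothing to do for ASCII or UTF-8 names
    fi
    new=$(printf '%s' "$old" | iconv -f "$from" -t UTF-8) || return 1
    mv -- "$old" "$new"
}

# Usage (not run here):
#   for f in *; do rename_to_utf8 "$f" CP1251; done
```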
psusi
- 38,031
1
The same iconv solution that @psusi suggests, but with a loop and a wildcard.
As a one-line sh script (the quotes keep names with spaces intact):
for f in /path/*.txt; do mv -- "$f" "$(printf '%s' "$f" | iconv -f 866 -t UTF-8)"; done
Reading the file names from a pipeline instead:
printf '%s\n' * | while IFS= read -r f; do mv -- "$f" "$(printf '%s' "$f" | iconv -f 866 -t UTF-8)"; done
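Since the question asks for a recursive conversion, the loop can be extended with find. This is only a sketch under a few assumptions: convert_tree is a hypothetical name, every non-UTF-8 name under the directory is assumed to use the single encoding you pass in, and names containing newlines are not handled:

```shell
# Hypothetical helper: recursively rename everything under directory $1
# from encoding $2 to UTF-8, skipping names that are already valid UTF-8.
convert_tree() {
    # -depth lists a directory's contents before the directory itself,
    # so children are renamed while their parent path is still valid.
    find "$1" -depth | while IFS= read -r path; do
        dir=$(dirname -- "$path")
        base=$(basename -- "$path")
        # Already valid UTF-8 (this includes plain ASCII): leave it alone.
        if printf '%s' "$base" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
            continue
        fi
        # Convert only the last path component, not the whole path.
        new=$(printf '%s' "$base" | iconv -f "$2" -t UTF-8) || continue
        mv -- "$path" "$dir/$new"
    done
}

# Usage (not run here):
#   convert_tree /path CP1251
```

Running it once per known encoding on the matching subtree is safer than running it over a mix, because the second pass cannot tell which encoding a remaining name uses.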
oklas
- 191