6

I have a rar file. Extracting it produces a file with a Chinese name, which Nautilus shows as:

��������ѧ.������.����������ѧ������.2008.djvu (invalid encoding)

In terminal it is shown as:

<BD><FC><B4><FA><D7><E9><BA><CF>ѧ.<CD><F5><CC><EC><C3><F7>.<B4><F3><C1><AC><C0><U+DE64><B4><F3>ѧ<B3><F6><B0><E6><C9><E7>.2008.djvu

The content of the rar file listed by unrar is correct:

$ unrar l 近代组合学.王天明.大连理工大学出版社.2008.rar
UNRAR 3.93 freeware      Copyright (c) 1993-2010 Alexander Roshal    
Archive 近代组合学.王天明.大连理工大学出版社.2008.rar    
 Name             Size   Packed Ratio  Date   Time     Attr      CRC   Meth Ver
-------------------------------------------------------------------------------
 近代组合学.王天明.大连理工大学出版社.2008.djvu  6190416  6187189  99% 03-06-11 10:33  .....A.   98320D40 m3g 2.9
-------------------------------------------------------------------------------
    1          6190416  6187189  99%

The file cannot be opened unless I rename it to something like 1.djvu.

Why are the characters not shown properly for the Chinese name of the extracted file, when I can create a directory or file with a Chinese name myself?

What should I do?

Thanks and regards!

Tim

4 Answers

6

Perhaps the graphical archive program does not understand Chinese. Try extracting the archive using the command-line:

  1. Open a terminal.
  2. Navigate to the directory containing the file:

    cd /path/to/directory/
    

    You can use the Tab key to complete filenames and directory names. Press Tab twice to list the possible completions when there is more than one.

  3. Run the unrar program to unpack filename.rar:

    unrar x filename.rar
    

    Tab completion works for the filename here too.

  4. The contents of the archive will be visible in the current directory.
Lekensteyn
3

I had the same issue with a rar file that contained names with Cyrillic letters. I was able to fix it by reinstalling unrar, as suggested here:

$ sudo apt-get remove rar
$ sudo apt-get remove unrar
$ sudo apt-get install unrar

It turned out that Ubuntu installs the open-source version of the rar and unrar utilities by default: "unrar 0.0.1 Copyright (C) 2004 Ben Asselstine, Jeroen Dekkers". This version does not handle non-ASCII characters well.

After the reinstall, unrar comes from the "restricted" (proprietary software) repository, which has to be enabled in your update settings: "UNRAR 5.40 freeware Copyright (c) 1993-2016 Alexander Roshal".

This version handles Unicode characters; at least it worked for me with Cyrillic letters.

Note that removing the open-source version of rar/unrar also fixed an issue with GUI software: Rar archive with Cyrillic letters
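To check which of the two builds you actually have, you can look at the banner it prints. This is only a rough sketch: `unrar_flavor` is a hypothetical helper name, and it matches nothing more than the two copyright strings quoted above.

```shell
# Hypothetical helper: guess the unrar build from its banner text.
# It only knows the two copyright strings quoted in this answer.
unrar_flavor() {
  case "$1" in
    *"Alexander Roshal"*) echo "non-free" ;;
    *"Ben Asselstine"*)   echo "free" ;;
    *)                    echo "unknown" ;;
  esac
}

# Sample banners copied from this answer.
unrar_flavor 'UNRAR 5.40 freeware Copyright (c) 1993-2016 Alexander Roshal'
unrar_flavor 'unrar 0.0.1 Copyright (C) 2004 Ben Asselstine, Jeroen Dekkers'

# On a real system (assuming the build prints its banner when run
# without arguments):
#   unrar_flavor "$(unrar 2>&1 | head -n 3)"
```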

2

Looks like the filename uses a different character encoding than your environment. The character ѧ (CYRILLIC SMALL LETTER LITTLE YUS) is most likely not part of a Chinese file name.

Do you have any information about the operating system and language settings the file has been created in? Do you know which character encodings are common to encode Chinese file names?

If you know the filename's encoding, you can use convmv (not installed by default) to convert it to the encoding you use (most likely UTF-8).
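One way to test a guess about the encoding is to push a few of the raw bytes through iconv. The sketch below assumes the name was written in GBK, which is only a guess based on the bytes in the terminal output. It takes the two bytes D1 A7 that show up as "ѧ" and decodes them both ways:

```shell
# The two bytes D1 A7 (octal \321\247) show up as "ѧ" in the mangled name.
bytes='\321\247'

# Read as UTF-8, these bytes are U+0467, the Cyrillic letter.
as_utf8=$(printf "$bytes")

# Read as GBK (an assumption about where the archive came from),
# the same bytes are a single Chinese character.
as_gbk=$(printf "$bytes" | iconv -f GBK -t UTF-8)

echo "UTF-8 reading: $as_utf8"
echo "GBK reading:   $as_gbk"

# If the GBK reading matches the name unrar lists, convmv can rename the
# extracted file. Without --notest it only prints what it would do:
#   convmv -f gbk -t utf-8 *.djvu
#   convmv -f gbk -t utf-8 --notest *.djvu
```

If the second line prints a character that appears in the name listed by unrar, the guess is probably right. If not, try another source encoding such as Big5 before renaming anything.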

1

Try this:

unrar --enable-charset x $1