Questions tagged [character-set]

Refers to text input character sets such as Unicode or ASCII.

What Questions Should Have This Tag?

Questions include:

  • Questions about character sets in Ubuntu: using them, installing them, and so on.

Basic Definitions:

Character Encoding - In computing, a character encoding represents a repertoire of characters by mapping each character to a sequence of code units, such as bytes.
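
As a rough illustration of that definition, the sketch below encodes one word under three encodings and prints the resulting bytes. The sample word is taken from a filename quoted in a question further down this page; the choice of the three encodings is only an assumption made for the example.

```python
# Encode the same text under three encodings and compare the bytes.
# "\u00da" is the character Ú (LATIN CAPITAL LETTER U WITH ACUTE).
text = "\u00datflutningur"

# Illustrative selection of encodings; any codec Python knows would work here.
encoded = {name: text.encode(name) for name in ("utf-8", "latin-1", "utf-16-le")}

for name, data in encoded.items():
    # Show the codec name, the byte count, and the bytes as hex pairs.
    print(f"{name:10} {len(data):2d} bytes  {data.hex(' ')}")
```

The text is the same in every case; only the byte sequence differs, which is why a file written in one encoding and read in another shows up with replacement characters.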


Brief Introduction To The Subject:

In computer science, the terms "character encoding", "character map", "character set" and "code page" were historically synonymous, as the same standard would specify a repertoire of characters and how they were to be encoded into a stream of code units – usually with a single character per code unit. The terms now have related but distinct meanings, reflecting the efforts of standards bodies to use precise terminology when writing about and unifying many different encoding systems. Regardless, the terms are still used interchangeably, with character set being nearly ubiquitous.
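
The distinction between a character's identity in a character set and its encoded code units can be sketched as follows. This is a minimal example, not a definitive treatment: the helper name `describe` is hypothetical, and CP437 is used only as one example of a legacy code page.

```python
def describe(ch: str) -> dict:
    """Return the code point of one character and its code-unit counts."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        # UTF-8 uses 8-bit code units, so the byte count is the unit count.
        "utf8_units": len(ch.encode("utf-8")),
        # UTF-16 uses 16-bit code units, so divide the byte count by two.
        "utf16_units": len(ch.encode("utf-16-le")) // 2,
    }

# A, ç, ■ and one character outside the Basic Multilingual Plane.
for ch in ("A", "\u00e7", "\u25a0", "\U0001F600"):
    print(repr(ch), describe(ch))

# A legacy code page maps the same ■ character to a single byte.
print("\u25a0".encode("cp437"))
```

Only the first character fits the historical "one character per code unit" pattern in every encoding shown; the others need several code units in UTF-8, and the last one needs two in UTF-16.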


Important Links For Learning More:

55 questions
21
votes
3 answers

How to write en and em dashes?

I understand that to be able to type en and em dashes I have to configure the COMPOSE key, or something like that, but I am not sure about that, nor about how I get from there to typing en and em dashes in addition to the minus sign. How do I…
user364819
20
votes
2 answers

Unable to mount cifs with iocharset=utf8

When I try to mount a cifs share with option iocharset=utf8 I receive the error: mount error 79 = Can not access a needed shared library. What shared library am I missing?
Uggla
  • 303
16
votes
3 answers

Opening a non-utf8 encoded text file

I sometimes need to open text files that are encoded in EUC-KR. man gedit gives: --encoding Set the character encoding to be used for opening the files listed on the command line. This can be used to open specific…
user85023
10
votes
2 answers

filename encoding issue

I am getting a file with a Faroese name and trying to save it in a PHP script: 2010_08_Útflutningur.xls Ubuntu 10.04 LTS is saving it as: 2010_08_�tflutningur.xls (invalid encoding) I've installed and run utf8-migration-tool, but with no…
7
votes
2 answers

Cedilla no longer working with US Intl. keyboard layout

I'm using Ubuntu 20.04.4 LTS with the US Intl. keyboard layout, but when I type ', then c, I'm now getting a ć instead of the regular ç. How can I fix this?
6
votes
2 answers

Why Chinese for my uploads?

Didn't use to happen. Now happens with any upload of csv data, regardless of content (and it isn't Chinese!). I checked cat /etc/default/locale LANG="en_US.UTF-8" My language settings are all English
6
votes
3 answers

Best practice to replace unknown chars from unknown charsets in filenames?

For example, I have a file called Porträt.pdf, but the filename was created with a charset that isn't displayed properly in Ubuntu, as in the following example. What would be the best practice for renaming such chars in filenames, when you have several filenames…
NES
  • 33,935
5
votes
3 answers

How to print the ■ character in linux terminal using C?

This char is 254 in the extended ASCII table, and 25A0 in Unicode. If I run putchar(254) the terminal does not recognize the char, as I think it does not use extended ASCII.
5
votes
1 answer

Ubuntu 20.04 how to remove duplicate packages after upgrade

I noticed that in my Ubuntu 20.04, I have some duplicate packages (from Ubuntu 18.04): Calculator, System monitor, etc. For instance, when I search for System monitor, here is the result: In addition, the old packages appear with…
user545149
4
votes
3 answers

How to print an octal value's corresponding UTF-8 character in bash?

I expected printf %s '\' to do the trick, but it doesn't: printf %s '\101' Outputs: \101
kos
  • 41,268
4
votes
2 answers

Is there a way to tell what encoding is used for the name and content of a file?

Is there a way to tell what encoding is used for the name and content of a file? Both GUI and terminal solutions (preferred) are fine. Thanks and regards!
Tim
  • 26,107
4
votes
2 answers

I am unable to type Polish characters when connecting via ssh

I'm running Ubuntu Server 12.04 on VirtualBox. When connecting via console, I am able to type in Polish characters just fine (both on the command line as well as in VIM). When connecting via PuTTY, I am unable to type Polish chars (AltGr+a produces…
dreamwalker
  • 171
  • 1
  • 6
3
votes
1 answer

What is the "Character Map" application used for?

Ubuntu comes pre-installed with an application called "Character Map". In the terminal, it can be launched as "charmap". What is this application typically used for? What function does it provide people that Ubuntu sees as so necessary as to include…
Anon
  • 12,339
3
votes
2 answers

w3m charset source_dump not working correctly

I am trying to use the following command but can't get the right output: $ w3m -dump_source google.com r���G��2�Ph��ү�f�� ����?�l���%Y:���c(�����������Y\��s8Ư| ��;����1ʹ��D��^�lK���٥r��\���Սk�V��Ϸv���{��r�����~s\��~?�ML7���¹���ƿ�qm��h��q�(��:wZ…
2
votes
2 answers

Unable to change gnome-terminal default character encoding

I have tried everything that I could find on the internet. $> gconftool-2 --type string --set /apps/gnome-terminal/profiles/Default/encoding "en_US.UTF-8" $> cat /etc/environment ... LC_ALL="en_US.UTF-8" $>…