Sublime Forum

utf8 or utf16

#1

Hi all,

I have a question. While customizing my Sublime Text setup I stumbled across the config setting "default_encoding": "UTF-8". As I understood it, UTF-8 is a subset of UTF-16, which can encode a much larger variety of symbols.
Why is UTF-8 the default and not UTF-16?
What happens to unusual UTF-16 symbols in a file when I open it with the default UTF-8 encoding, or when I save it as UTF-8?
And is it to be expected at all that UTF-16 will become important in programming?


#2

After googling, I think they are just two different encodings of Unicode and are not in a subset relationship.

I guess people use UTF-8 more simply because of (maybe) better compatibility, and because most code consists of ASCII characters, which UTF-8 encodes more efficiently (only 1 byte each).
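To make the efficiency point concrete, here is a small Python sketch. The sample line is made up for illustration, and "utf-16-le" is chosen only so that no byte-order mark is added to the count. It also shows that both encodings decode back to the same text, so neither is a subset of the other.

```python
# Compare the encoded size of a typical ASCII-only line of source code.
line = 'print("hello world")'  # hypothetical sample line

utf8_bytes = line.encode("utf-8")
utf16_bytes = line.encode("utf-16-le")  # explicit byte order, so no BOM is prepended

# Both byte sequences represent exactly the same Unicode text.
same_text = utf8_bytes.decode("utf-8") == utf16_bytes.decode("utf-16-le")

print(len(line), len(utf8_bytes), len(utf16_bytes), same_text)
```

For ASCII-only text like this, UTF-8 uses one byte per character and UTF-16 uses two.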


#3

UTF-16 has little advantage over UTF-8. It is still variable-width, and for almost all text it needs the same number of bytes or more to represent the same Unicode characters. Never use UTF-16.
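A rough way to check the variable-width claim in Python. The four sample characters are my own picks, not anything special, and the helper name encoded_sizes is made up for this sketch.

```python
def encoded_sizes(ch: str) -> tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 byte count) for a string."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))

samples = [
    "A",           # U+0041, ASCII letter
    "\u00e9",      # U+00E9, e with acute accent
    "\u4e2d",      # U+4E2D, a CJK ideograph
    "\U0001F600",  # U+1F600, an emoji outside the Basic Multilingual Plane
]

sizes = {ch: encoded_sizes(ch) for ch in samples}
for ch, (u8, u16) in sizes.items():
    print(f"U+{ord(ch):04X}: UTF-8 {u8} bytes, UTF-16 {u16} bytes")
```

The emoji takes 4 bytes in UTF-16 (a surrogate pair), so UTF-16 is not fixed-width. The CJK character is the kind of case where UTF-16 is actually smaller (2 bytes versus 3), which is why the claim above is "almost all" rather than "all".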

Unless you are optimizing the internal string representation of a programming language for obscure reasons (hint: you’re not), just use UTF-8 for everything. And even in that case, you’d want to look at a fixed-width encoding like UTF-32, which needs even more bytes.
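For illustration only, a minimal Python sketch of what fixed-width buys you: with UTF-32 the n-th code point sits at a known byte offset. The sample string and the helper code_point_at are hypothetical, not part of any library.

```python
# Four code points: ASCII, Latin-1 range, CJK, and an emoji.
text = "A\u00e9\u4e2d\U0001F600"

utf8 = text.encode("utf-8")
utf32 = text.encode("utf-32-le")  # explicit byte order, so no BOM is prepended

def code_point_at(data: bytes, n: int) -> str:
    """Return the n-th code point of UTF-32-LE data by direct byte offset."""
    return data[4 * n : 4 * n + 4].decode("utf-32-le")

print(len(utf8), len(utf32), code_point_at(utf32, 3) == text[3])
```

Every code point costs exactly 4 bytes in UTF-32, so indexing is trivial, but the same text is noticeably larger than in UTF-8.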
