> ASCII compatibility would be Somebody Else's Problem
for many of those outside "A" in ASCII (euphemism for America :) there were already a ton of problems, so endianness was the least of them (i personally never hit this problem)
// disclaimer: i'm not that serious about predominance of Latin script, this is sorta irony
Depending on the level of abstraction you're living at - and that depends on the overall goal, performance constraints, environmental integration, OS / machine heterogeneity etc. - it may or may not be a problem.
It's easy to dismiss if you have all the time in the world and a deep stack of abstractions.
If you're doing deep packet analysis on UTF-16 text in a router, things may be different.
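To make the byte-order point concrete, here is a minimal Python sketch (not anything from a real router; the sample string is made up) of what happens when UTF-16 bytes are read with the wrong endianness, and how a byte order mark resolves it when one is present:

```python
# Sketch: the same two characters serialized in both byte orders.
text = "hi"
be = text.encode("utf-16-be")  # high byte first
le = text.encode("utf-16-le")  # low byte first

# Reading big-endian bytes as little-endian "works", but yields
# valid-looking wrong characters instead of an error.
misread = be.decode("utf-16-le")

# With a BOM (FE FF = big-endian) the generic "utf-16" codec
# picks the byte order itself and strips the BOM.
with_bom = b"\xfe\xff" + be
decoded = with_bom.decode("utf-16")

print(be, le)
print(ascii(misread), decoded)
```

If you only ever get text through a high-level API, this is handled for you; if you're looking at raw bytes without a BOM, you have to know or guess the order.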
thanks, my question was exactly about the issues met by people living at other levels of abstraction.
i'm not a native english speaker and i'm a newb to HN, so sorry that i phrased my sincere question so that it looked like an arrogant statement: 'there are no issues, what are you talking about, i don't even know what LE and BE mean'.
> for many of those outside "A" in ASCII (euphemism for America :)
Abbreviation for 'American', in fact. No euphemisms needed.
(ASCII = American Standard Code for Information Interchange)
> there were already a ton of problems, so endianness was the least
I can appreciate this. However, UTF-8 also has desirable properties like 'dropping a single byte only means you lose one character, as opposed to potentially losing the whole file', and 'you can often tell if a multi-byte UTF-8 sequence has been corrupted without doing complex analysis'.
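A rough Python sketch of both properties, assuming a single byte gets dropped in transit (`drop_byte` is a hypothetical helper and the sample string is made up). UTF-16 stands in here as the encoding that loses alignment:

```python
text = "naïve café"

def drop_byte(data: bytes, index: int) -> bytes:
    """Hypothetical helper: simulate losing one byte in transit."""
    return data[:index] + data[index + 1:]

# Index 3 is the second byte of 'ï' in UTF-8,
# and the high byte of 'a' in UTF-16-LE.
damaged_utf8 = drop_byte(text.encode("utf-8"), 3)
damaged_utf16 = drop_byte(text.encode("utf-16-le"), 3)

# Lenient decoding: UTF-8 resynchronizes at the next lead byte,
# UTF-16 stays misaligned for the rest of the data.
recovered_utf8 = damaged_utf8.decode("utf-8", errors="replace")
recovered_utf16 = damaged_utf16.decode("utf-16-le", errors="replace")

# Strict decoding: the broken UTF-8 sequence is detected immediately.
try:
    damaged_utf8.decode("utf-8")
    corruption_detected = False
except UnicodeDecodeError:
    corruption_detected = True

print(ascii(recovered_utf8))
print(ascii(recovered_utf16))
print(corruption_detected)
```

In the UTF-8 case only the 'ï' turns into a replacement character; in the UTF-16 case everything after the dropped byte decodes to unrelated characters.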
> i'm not that serious about predominance of Latin script, this is sorta irony
Heh. ASCII can't even encode the entirety of the Latin script: Ask a Frenchman how he spells 'café', or a German how he spells 'straße', and notice how important characters are missing from ASCII.
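You can check this in a couple of lines of Python (`fits_in_ascii` is just a hypothetical helper name for illustration):

```python
def fits_in_ascii(word: str) -> bool:
    """Hypothetical helper: True if every character is in the 7-bit ASCII range."""
    try:
        word.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

results = {w: fits_in_ascii(w) for w in ("cafe", "café", "straße")}

# UTF-8 handles the same words, spending two bytes on each non-ASCII letter.
e_acute = "é".encode("utf-8")
eszett = "ß".encode("utf-8")

print(results)
print(e_acute, eszett)
```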
what are the issues?