Encoding gotchas with Arabic in Visual Studio 2005

My current project in Iraq is the first time I’ve developed software in another language, and more to the point, in a non-Latin character set.

The English alphabet, which we share with most of the languages of Western Europe and the Americas, descends from the Latin alphabet of Roman times, which is why we call our character set ‘Latin’ and not ‘English’ or whatever. There are plenty of other alphabets out there, including Cyrillic (used by Russian, among others) and Arabic.

To be represented in digital form, each of these alphabets has at least one (and, confusingly, sometimes more than one) ‘code page’, which is simply a standard mapping from each letter in the alphabet to a number. So a Latin ‘A’ is assigned the number 65 in almost all Latin code pages, while the Arabic beh (ب) is assigned its own number in Arabic code pages.
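
To make the idea of a code page concrete, here is a minimal Python sketch (not the Visual Studio tooling itself) that looks up the number a code page assigns to a letter. The helper name `number_in_code_page` is my own invention for illustration, and the code pages chosen (Windows-1252 for Latin, Windows-1256 and ISO 8859-6 for Arabic) are just common examples, not necessarily the ones this project uses.

```python
def number_in_code_page(char: str, code_page: str) -> int:
    """Return the single-byte number that `code_page` assigns to `char`.

    Hypothetical helper for illustration only. Raises ValueError if the
    character does not map to exactly one byte in that code page, and
    UnicodeEncodeError if the code page cannot represent it at all.
    """
    encoded = char.encode(code_page)
    if len(encoded) != 1:
        raise ValueError(f"{char!r} is not a single byte in {code_page}")
    return encoded[0]


BEH = "\u0628"  # Arabic letter beh

# Latin 'A' in a Latin code page and in an Arabic one.
latin_a = number_in_code_page("A", "cp1252")
latin_a_in_arabic_page = number_in_code_page("A", "cp1256")

# Arabic beh in two different Arabic code pages.
beh_windows = number_in_code_page(BEH, "cp1256")
beh_iso = number_in_code_page(BEH, "iso8859_6")

# The same number read back through a Latin code page yields a
# different letter, which is the root of most encoding surprises.
misread = bytes([beh_windows]).decode("cp1252")

print("A in cp1252:", latin_a)
print("A in cp1256:", latin_a_in_arabic_page)
print("beh in cp1256:", beh_windows)
print("beh in iso8859_6:", beh_iso)
print("beh's cp1256 byte read as cp1252:", misread)
```

The last line is the interesting one: a number only means a particular letter once you know which code page it was written in, so the same byte can come back as an entirely different character if the reader assumes the wrong one.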
