Unicode characters for Chinese and Japanese numbers
[29 Nov 2008]
I hosted the 20th Math Carnival here on squareCircleZ last November.
To give the post an Asian flavor, I used Chinese/Japanese numerals and Devanagari (Hindi) characters. I copied and pasted the characters from various Web sources and the post looked fine when I published it. However, I was a little concerned about the “health” of these characters at the time, and I said:
“Depending on your browser and existing fonts, hopefully you are seeing Chinese & Devanagari (Hindi) characters, and not splodges.”
Since then, WordPress (the blog engine I use) has undergone several updates and some of these have involved a re-write of the blog database. Unfortunately, the re-writes have mangled math symbols that I have used in various posts (including even the humble apostrophe: ', which looks ugly when chewed).
I edited the mess each time, replacing the mangled characters with more copy-pasted ones, but I got sick of having to do it after WordPress upgrades.
I then went looking for the Unicode character equivalents, which I should have used in the first place. According to Internet.com’s Webopedia, Unicode is:
“A standard for representing characters as integers. Unlike ASCII, which uses 7 bits for each character, Unicode uses 16 bits, which means that it can represent more than 65,000 unique characters. This is a bit of overkill for English and Western-European languages, but it is necessary for some other languages, such as Greek, Chinese and Japanese. Many analysts believe that as the software industry becomes increasingly global, Unicode will eventually supplant ASCII as the standard character coding format.”
I looked all over the place for Japanese/Chinese Unicode characters and finally found them tucked away in this Unicode Kanji Code Table. But the characters for numbers are mixed up with hundreds of other characters and are almost impossible to find.
So for those of you who publish Japanese or Chinese characters on the Web, here are the Unicode characters for Japanese/Chinese numerals.
For the math students: in each code, the number after the x is hexadecimal (base 16), where the decimal number 10 is written “a”, 11 is “b”, 12 is “c”, 13 is “d”, 14 is “e” and 15 is “f”. (See also Hexadecimal Numbers in a previous IntMath Newsletter.)
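To make the hexadecimal idea concrete, here is a small Python sketch. It takes the code for 百 (100), which is code point U+767E, written as the HTML entity `&#x767e;`, and converts the part after the x to an ordinary decimal number. The variable names are only for illustration.

```python
# The HTML numeric character reference for 百 (100).
entity = "&#x767e;"

# Strip the leading "&#x" and the trailing ";" to leave the hex digits.
hex_digits = entity[3:-1]

# int(..., 16) reads the string as a base-16 number:
# 7*16**3 + 6*16**2 + 7*16 + 14 = 30334
code_point = int(hex_digits, 16)

print(hex_digits, "->", code_point)
print(chr(code_point))  # the character at that code point
```

Here “e” stands for 14, so the last hex digit contributes 14 to the total.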
Note: You may need East Asian fonts on your system to see the Japanese/Chinese characters in the table below.
|Hindu-Arabic Numeral||Chinese/Japanese numeral||Unicode|
|100:||百||&#x767e; (Japanese: hyaku, Chinese: bai)|
|1000:||千||&#x5343; (Japanese: sen, Chinese: qian)|
|10,000:||万||&#x4e07; (Japanese: man)|
|10,000:||萬||&#x842c; (Chinese: wan)|
|10^8:||億||&#x5104; (Japanese: oku)|
|10^8:||亿||&#x4ebf; (Chinese: yi)|
|10^12:||兆||&#x5146; (Japanese: chou, Chinese: jhao)|
I’ve included the Japanese and Chinese readings for the larger numbers.
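If you would rather not look the codes up by hand, the following Python sketch shows one way to generate them from the characters themselves. It is only an illustration, not how the table was produced: the function name `to_entity` and the dictionary `numerals` are made up for this example, and the characters are the ones listed in the table.

```python
import html

def to_entity(ch):
    """Return the hexadecimal HTML numeric character reference for one character."""
    return "&#x{:x};".format(ord(ch))

# Numerals from the table; the labels are just descriptive strings.
numerals = {
    "100": "百",
    "1000": "千",
    "10,000 (Japanese)": "万",
    "10,000 (Chinese)": "萬",
    "10^8 (Japanese)": "億",
    "10^8 (Chinese)": "亿",
    "10^12": "兆",
}

for label, ch in numerals.items():
    entity = to_entity(ch)
    # html.unescape converts the reference back to the character,
    # which checks that the round trip is lossless.
    assert html.unescape(entity) == ch
    print(label, ch, entity)
```

Pasting the generated `&#x...;` codes into a post, instead of the raw characters, means the text is stored as plain ASCII, so it should be less likely to be mangled when a database is re-written.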