Unicode 5.1.0 was released over a month ago, and Google now supports this new version of the standard.
What is Unicode?
Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language.
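To make the "unique number" idea concrete, here is a minimal Python sketch that prints the code point for a few characters. The sample characters and the helper name code_point_label are my own choices for illustration, not something taken from the Unicode announcement.

```python
# Each character has exactly one code point, whatever the platform or program.
# The samples below are illustrative picks: Latin, Devanagari, Tamil, Malayalam.
samples = ["A", "\u0905", "\u0ba4", "\u0d2e"]


def code_point_label(ch: str) -> str:
    """Return the conventional U+XXXX label for a single character."""
    return f"U+{ord(ch):04X}"


for ch in samples:
    print(ch, code_point_label(ch))
```

The same number comes back on any machine that runs this, which is the whole point of the standard.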
Why Unicode?
Without a single standard, any given computer (especially a server) needs to support many different encodings; yet whenever data is passed between different encodings or platforms, that data always runs the risk of corruption.
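The corruption risk is easy to reproduce. The sketch below, assuming a Malayalam word stored as UTF-8 and a receiver that wrongly reads it as Latin-1 (both choices are mine, for illustration), shows that the wrong decoding raises no error and simply yields garbage.

```python
# The word "Malayalam" written in Malayalam script, given as escapes so the
# example does not depend on the encoding of this file.
text = "\u0d2e\u0d32\u0d2f\u0d3e\u0d33\u0d02"

utf8_bytes = text.encode("utf-8")

# A receiver that assumes the wrong encoding gets no error, just wrong text.
garbled = utf8_bytes.decode("latin-1")

# Decoding with the encoding that was actually used recovers the original.
restored = utf8_bytes.decode("utf-8")

print(len(text), "characters became", len(garbled), "garbled characters")
print("restored correctly:", restored == text)
```

Latin-1 accepts every byte value, so nothing fails loudly here; that silence is what makes mixed encodings risky.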
Unicode 5.1.0 contains over 100,000 characters, and provides significant additions and improvements that extend text processing for software worldwide. The new standard has significant character additions for Indic and South East Asian scripts, expanded identifier specifications for Indic and Arabic scripts, and improvements in the processing of many Indic scripts. Now you can type in Indian languages such as Malayalam, Tamil, and Hindi more efficiently.
Altogether, it’s good for the Indian languages. Google says:
The new Unicode 5.1.0 is now available in search, so people speaking languages such as Malayalam can now search for words containing the new characters in Unicode 5.1.0.
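As a rough illustration of what "new characters" means for Malayalam, the sketch below looks up the six Malayalam chillu letters (U+0D7A to U+0D7F), which were added in Unicode 5.1. Picking these as the example is my assumption; neither the post nor the quote names specific characters. It uses Python's standard unicodedata module and only works if the interpreter's character database is at version 5.1 or later.

```python
import unicodedata

# Version of the Unicode Character Database bundled with this Python build.
print("Unicode database version:", unicodedata.unidata_version)

# Malayalam chillu letters, chosen here as sample Unicode 5.1 additions.
chillu_letters = [chr(cp) for cp in range(0x0D7A, 0x0D80)]

# Map each character to its official name; "<unknown>" would mean the
# bundled database predates the character.
names = {ch: unicodedata.name(ch, "<unknown>") for ch in chillu_letters}

for ch, name in names.items():
    print(f"U+{ord(ch):04X}", name)
```

Software with an older database would treat these code points as unassigned, which is why a search engine has to move to the new version before such words can be matched.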
You can find the Unicode Character Code Charts for all supported scripts here.
You can read the official announcement and a graph showing Unicode Statistics here.



