Softpedia
 


February 4th, 2012, 14:48 GMT

Unicode Now Used by More than 60 Percent of the Web

[Figure: Unicode adoption on the web]

Google is celebrating the adoption of Unicode on the web and has announced that it will soon switch to the latest version available, Unicode 6.1, which was released on January 31st.

"We’ve long used Unicode as the internal format for all the text Google searches and processes: any other encoding is first converted to Unicode. Version 6.1 just released with over 110,000 characters; soon we’ll be updating to that version and to Unicode’s locale data from CLDR 21 (both via ICU)," Google's Mark Davis announced.

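As a rough illustration of what "first converted to Unicode" can mean in practice, here is a minimal Python sketch. It assumes the source encoding of each document is already known; the helper name `to_unicode` and the sample text are hypothetical and do not describe Google's actual pipeline.

```python
def to_unicode(raw: bytes, declared_encoding: str) -> str:
    """Decode bytes stored in some encoding into Unicode text (a Python str).

    Hypothetical helper for illustration only.
    """
    # errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
    return raw.decode(declared_encoding, errors="replace")


# The same Russian word as it might be stored by two different pages.
koi8_bytes = "русский".encode("koi8_r")
utf8_bytes = "русский".encode("utf-8")

# The bytes differ, but the text is identical once both are converted to Unicode.
assert koi8_bytes != utf8_bytes
assert to_unicode(koi8_bytes, "koi8_r") == to_unicode(utf8_bytes, "utf-8")
```

Converting everything to one internal representation means later steps, such as comparing or indexing text, never need to know which encoding a page originally used.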
The latest version of the Unicode standard adds 732 new characters, along with plenty of technical changes.

"Unicode was invented to solve that problem: to encode all human languages, from Chinese (中文) to Russian (русский) to Arabic (العربية), and even emoji symbols like [...] or [...]; it encodes nearly 75,000 Chinese ideographs alone. In the ASCII encoding, there wasn’t even enough room for all the English punctuation (like curly quotes), while Unicode has room for over a million characters," Davis explained.

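The gap between ASCII and Unicode that Davis describes can be checked with a short Python sketch. The helper name `fits_in_ascii` is hypothetical and used only for illustration.

```python
import sys


def fits_in_ascii(text: str) -> bool:
    """Return True if every character in text can be encoded as ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


straight = '"quoted"'            # straight quotes, U+0022
curly = "\u201cquoted\u201d"     # curly quotes, U+201C and U+201D

assert fits_in_ascii(straight)
assert not fits_in_ascii(curly)

# ASCII defines 128 code points; Unicode's code space runs from U+0000 to U+10FFFF.
assert sys.maxunicode + 1 == 1_114_112
```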
Along with announcing that it will be switching over to the latest standard soon, Google also provided an update on how Unicode is doing on the web.

Even though Unicode was created at about the same time as the web, adoption has been a long battle: it only started being used in a meaningful way in 2003 and 2004. Even then, it didn't really take off until 2006.

After that, usage shot up, while the other popular encodings, basic US-only ASCII and the more extended Latin encodings, began to lose market share.

It was only in 2008 that Unicode became the most popular encoding on the web. Since then, it has continued to grow, and it is now used by more than 60 percent of web pages. If ASCII is not counted separately, since it is a subset of most other encodings, Unicode accounts for almost 80 percent of pages published online, as far as Google is aware.
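The reason ASCII pages can be folded into the Unicode count is that pure-ASCII bytes mean the same thing in many encodings, UTF-8 included. A minimal Python sketch follows; the list of encodings is chosen for illustration and is not taken from Google's data, and `is_pure_ascii` is a hypothetical helper.

```python
def is_pure_ascii(raw: bytes) -> bool:
    """Return True when every byte is in the 7-bit ASCII range (0x00-0x7F)."""
    return all(byte < 0x80 for byte in raw)


page = "Unicode on the web".encode("ascii")
assert is_pure_ascii(page)

# Pure-ASCII bytes decode to the same text under each of these encodings,
# so such a page is also a valid UTF-8 page.
decoded = {enc: page.decode(enc) for enc in ("ascii", "latin-1", "cp1252", "utf-8")}
assert len(set(decoded.values())) == 1

# Once a non-ASCII character appears, the byte sequences diverge.
word = "café"
assert word.encode("latin-1") != word.encode("utf-8")
assert not is_pure_ascii(word.encode("utf-8"))
```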
