Oct 18, 2010 13:49 GMT  ·  By

A new free and open source tool from Microsoft Research is designed to use crowdsourcing to build multilingual content on Wikipedia.

WikiBhasha Beta is currently available for download from the Redmond company and can be used to expand the number of languages in which Wikipedia content is available.

According to the software giant, WikiBhasha is based on the work done with WikiBABEL, a Microsoft Research project set up to let a language community collaboratively create parallel linguistic data.

“WikiBhasha beta enables Wikipedia users and contributors to explore and source content from English Wikipedia articles, to translate the content into a set of target languages, and to use the content with user additions and corrections for contribution to the target language Wikipedia,” Microsoft stated.

“The content creation workflow is flexible enough to accommodate new content creation, at the same time preserving reusable information, such as references and templates.”

WikiBhasha is designed to work in tandem with Microsoft’s machine translation technology, but with one limitation.

Users cannot translate between all the language pairs supported by Microsoft Translator, only from English as the source language into any of the other languages the technology supports.
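The restriction can be pictured as a simple pair check. The sketch below is only an illustration and not WikiBhasha's actual code: the function name and the short list of language codes are assumptions made for the example, not an official list from Microsoft Translator.

```python
# Hypothetical illustration of the English-only source restriction.
# The language codes are placeholders, not an official Translator list.
SOURCE_LANGUAGE = "en"
ASSUMED_TRANSLATOR_LANGUAGES = {"en", "hi", "ar", "de", "ja"}


def is_supported_pair(source: str, target: str) -> bool:
    """Return True only for English into another supported language."""
    return (
        source == SOURCE_LANGUAGE
        and target != SOURCE_LANGUAGE
        and target in ASSUMED_TRANSLATOR_LANGUAGES
    )
```

Under these assumptions, English to Hindi is accepted, while Hindi to English or German to Japanese is rejected, even though a general-purpose translation service might handle both.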

All content translated by contributors will be submitted to the appropriate Wikipedias, according to the software giant.

Users interested in contributing will need to install the WikiBhasha beta, a browser application that is activated on Wikipedia articles.

“It features an intuitive and simple UI layer that stays on the target language Wikipedia for the entire content creation process,” the company stated.

“This UI layer integrates content discovery, linguistic and collaborative services, focusing the user primarily on content creation in the target Wikipedia.

“A simple 3-step process guides the user in the content discovery and sourcing from English Wikipedia articles, composing target language Wikipedia article and, finally, publication in target Wikipedia. While a typical session may be to enhance a target language Wikipedia article, new articles may also be created following similar process.”

The name WikiBhasha combines “Wiki” and “Bhasha” (“language” in Hindi and Sanskrit).

Microsoft Research released WikiBhasha Beta as an open-source MediaWiki extension. It can also be used as a user gadget in Wikipedia, and the wikibhasha.org site, hosted on Windows Azure, offers an installable bookmarklet.