Oct 30, 2010 11:24 GMT  ·  By

Crowdsourcing can easily break all of the limits associated with the traditional translation process: high costs, restricted resources, and the inherent errors caused by discrepancies between linguistic skills and domain knowledge.

I recently had a chance to chat with Rob Vandenberg, President and CEO of Lingotek. Following our discussion, it became clear that for companies and organizations with a global presence, tapping the unused talent of their communities to adapt content to different markets worldwide is a much smarter decision than using a professional translation agency.

With an impressive list of over 100 customers, both federal and commercial, ranging from the U.S. Federal Government to Novell, eBay, and Adobe, Lingotek offers the Collaborative Translation Platform, a web-based crowdsourcing translation tool that enables enterprises not only to easily serve content to a global audience, but also to drive brand engagement and increase loyalty within their existing communities.

1. Please tell Softpedia readers about your company. If there’s any way to make translation exciting, I think we’re going to try and do it. Lingotek is a five year old software company, and we have focused on developing a new web-based Collaborative Translation Platform.

It’s web-based software that is very focused on content owners, those who originate the content or who have a community. They can inject content into the Lingotek platform via web APIs (application programming interfaces) and then assign various workflows. The workflow component notifies translators when content is available to be translated and is suited to them based on their language skills, domain expertise, etc.

That content is served up to them in a translation workbench. The workbench shows the source content segment by segment, or sentence by sentence, and allows users to translate it.
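The notification step described above can be pictured with a small sketch. It is only an illustration of matching content to translators by language pair and domain expertise; the `Translator` and `ContentItem` classes and the `translators_to_notify` function are hypothetical names, not part of Lingotek’s actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Translator:
    name: str
    language_pairs: set            # e.g. {("en", "es")}; hypothetical structure
    domains: set = field(default_factory=set)


@dataclass
class ContentItem:
    title: str
    source_lang: str
    target_lang: str
    domain: str


def translators_to_notify(item, translators):
    """Return the names of translators whose language pair and domain match the item."""
    pair = (item.source_lang, item.target_lang)
    return [
        t.name
        for t in translators
        if pair in t.language_pairs and item.domain in t.domains
    ]


if __name__ == "__main__":
    crowd = [
        Translator("ana", {("en", "es")}, {"photography"}),
        Translator("ben", {("en", "de")}, {"photography"}),
        Translator("cy", {("en", "es")}, {"networking"}),
    ]
    item = ContentItem("Retouching tutorial", "en", "es", "photography")
    print(translators_to_notify(item, crowd))
```

A real workflow would presumably also track deadlines and review stages; this sketch covers only the matching rule mentioned in the interview.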

In this process there are various resources available to the translator, including machine translation, essentially a statistical engine that connects the source content with its automated translation. It gives you a good gist, or overall view, of what the content is about, but it’s not necessarily of high enough quality that you can print it on your website or in collateral.

Another resource is translation memory. Every time someone translates a sentence from one language into another, the source and target sentences are stored together as a parallel corpus. That parallel corpus is called translation memory.

As people use our system or load it with data and source content reappears, the platform recognizes the segments that have already been translated by a human translator and offers the option to use that translation. In other words, translation memory lets users reuse the work of other human translators.
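To make the idea of translation memory concrete, here is a minimal sketch of a store of source and target segment pairs with exact-match lookup. The `TranslationMemory` class and its methods are hypothetical and say nothing about how Lingotek implements the feature; the whitespace and case normalization is an assumption made for the example.

```python
class TranslationMemory:
    """Stores human translations keyed by language pair and source segment."""

    def __init__(self):
        self._segments = {}

    @staticmethod
    def _normalize(text):
        # Assumption: collapse whitespace and ignore case when matching segments.
        return " ".join(text.split()).lower()

    def store(self, source, target, source_lang, target_lang):
        """Remember a human translation of one segment."""
        key = (source_lang, target_lang, self._normalize(source))
        self._segments[key] = target

    def lookup(self, source, source_lang, target_lang):
        """Return a stored translation for a reappearing segment, or None."""
        key = (source_lang, target_lang, self._normalize(source))
        return self._segments.get(key)


if __name__ == "__main__":
    tm = TranslationMemory()
    tm.store("Save the file.", "Guarde el archivo.", "en", "es")
    # The same segment reappears in new content, so the earlier work is offered.
    print(tm.lookup("Save the  file.", "en", "es"))
    print(tm.lookup("Open the file.", "en", "es"))
```

This sketch handles exact matches only; offering the stored translation as an option, rather than applying it automatically, mirrors the behavior described in the interview.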

At a high level, what we’re trying to do is deliver a global web-based platform to enable translation. That’s where the company started. We have a history with our federal government here in the United States; one of our investors is In-Q-Tel.

So they’re very active in translation, having invested a lot in our natural language processing technology, which enables our platform to be better than other things on the market.

What we have focused on in the past few years is developing a strong commercial strategy centered around crowdsourcing. We see an interesting and compelling trend in things like social media, social networks, enterprise mash-ups, and collaboration platforms.

We have the ability to say, “Hey Adobe, you have a wide base of users around the world, and that’s going to be great, because these are people who are underserved in content in their own language. How can you bring together the content and the community into a place or platform where they can actually start to translate for you, for their own benefit?”

One of our big clients is Adobe, which has 650 product groups around the world. What they’ve done is what we call the Lingotek inside strategy: embedding the Lingotek platform and translation tools into the content, and enabling the community to select the content that they’re interested in and would like to have translated into their own language.

We have a number of customers across various verticals: Novell, eBay, the Library of Congress. Many clients are using Lingotek to enable the community to participate more actively with their brands and content.

We enable content owners to connect with their users around their content, and we see it have huge benefits: reducing the cost of translation, enabling relevant content to be translated, bringing the community and the content owner more closely together, and building a stronger brand relationship.

2. Do you have a proprietary machine translation technology? We’ve always been machine translation agnostic. By that I mean that there are different machine translation engines that are more appropriate for certain language pairs or certain kinds of content. We plug in things like Google Translate, Microsoft Translator, Language Weaver, etc.

So there are various engines that are more appropriate to different types of languages. That being said, I think that one of the most interesting things that we’re seeing in this space is open source machine translation, specifically an engine called Moses, which has evolved to the point where a lot of people are running it independently.

We are looking at moving it to the Cloud. We’re hosted on Amazon’s EC2 (Elastic Compute Cloud), and we’re looking at how we can put an open source machine translation engine into the Cloud to serve up to our customers. We don’t have our own proprietary engine, but we’re looking into building one.
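Being “machine translation agnostic” can be sketched as a router that picks an engine per language pair. The example below is an assumption-laden illustration: the engines are stand-in functions rather than calls to Google Translate, Microsoft Translator, Language Weaver, or Moses, and `make_router` is a hypothetical helper, not a Lingotek interface.

```python
def make_router(engines, routes, default):
    """Build a translate function that selects an engine by language pair.

    engines: dict mapping an engine name to a callable(text, source_lang, target_lang)
    routes:  dict mapping (source_lang, target_lang) to an engine name
    default: engine name used when no route is configured for the pair
    """
    def translate(text, source_lang, target_lang):
        engine_name = routes.get((source_lang, target_lang), default)
        output = engines[engine_name](text, source_lang, target_lang)
        return engine_name, output

    return translate


def stub_engine(label):
    # Stand-in for a real engine; it only tags the text so the routing is visible.
    return lambda text, src, tgt: f"[{label} {src}->{tgt}] {text}"


if __name__ == "__main__":
    engines = {"engine_a": stub_engine("A"), "engine_b": stub_engine("B")}
    routes = {("en", "ja"): "engine_b"}  # hypothetical preference for one pair
    translate = make_router(engines, routes, default="engine_a")
    print(translate("Hello", "en", "ja"))
    print(translate("Hello", "en", "fr"))
```

Keeping the engines behind a common callable signature is what would let a new engine, open source or otherwise, be plugged in without changing the calling code.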

3. Is there a customer that you’d like to highlight as representative for Lingotek? I tend not to highlight the U.S. Federal Government, for somewhat obvious reasons. They don’t want to be highlighted. And I’m more interested in the commercial strategy and crowdsourcing translation.

One we clearly talked about is Adobe. Because we’re evolving, what we’re doing now with Novell is the next one I’d highlight. What I like about the Novell strategy is that there’s a community of 100,000 users for Drupal.

We’ve done an integration with Drupal, and we have integrated with various content containers: Microsoft SharePoint, Alfresco, Oracle CM, Jive Software. But the Drupal one is interesting because Drupal is a widely used open source content management system that we see being used to run a lot of communities.

Novell said “we really want to work with you, we really want our community to engage more deeply with us across language, developing a multilingual community.” The answer was Lingotek inside, in-place translation within the Drupal system.

4. What deployment options are there for the Lingotek Collaborative Translation Platform? There are two basic deployment options. We can deliver a premises-based deployment. We do so for all of our government clients, and for a few of our larger corporate clients.

We tend to promote a SaaS (software as a service) solution, and it has been widely adopted. It can be used as a standalone SaaS solution to log in, load content, and translate.

An option within this SaaS approach is Lingotek inside. So if you have a live website or content system, you can use the Lingotek SaaS solution to embed Lingotek workbenches and workflows.

We’re very mindful of the fact that people have different approaches and different ideas on how to do things, so we’re enabling mash-ups and even enabling people to sell their own UI. We have a call center in China, and Adobe China and other folks are using it in China because there’s a very stringent network and the Great Firewall of China.

We tend to be very flexible about things, and think that one UI is probably not going to satisfy every use case. So we’re enabling many UIs for many audiences which I think is critical from an enterprise software perspective going forward.

5. What languages does Lingotek support? The short answer is “Yes,” we support all languages. The longer answer is that there are certain challenges, and there are dead languages that we don’t support.

But we’ve done a lot of work with some of the most challenging character sets, such as Simplified Chinese without punctuation. How do we determine where one sentence ends and another begins?

There are no technical restrictions on what languages we can support. Someone asked me once, “Do you support Klingon?” Klingon is a made-up Star Trek language, but I guess that it’s also a real language, in the sense that it’s been documented. And technically we do: we’ve had some users mess around and do translations in Klingon in our system.

There are unique requirements in certain language pairs that require additional natural language processing work for the translations to work. We’ve done a lot of work with Middle Eastern languages and Far Eastern languages, and clearly the Western European languages have been covered.

In our global data warehouse of translation memory, I think 112 languages are represented, but there are hundreds more. So it’s a work in progress, but there are certainly no technical restrictions to stop us from supporting all languages.

6. What can you tell me about the quality of crowdsourcing translations? We have the concept of a content value index, determined by how quickly you want the translation and how high you want the quality to be. So if you want it quick and dirty, you have the machine translation, automated and fast.

If you want it to be a higher level of quality, you have the community or the crowd post-edit that machine translation, fix the mistakes, vote on each other’s work, have thresholds for voting, etc. And then you can escalate it to introducing professional translators.

We do enable professional translation as well, with capabilities such as word counting, deadlines, and a payment system.
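The escalation from machine output to crowd-approved edits can be sketched as a simple voting threshold. This is only an illustration of the idea described above; `choose_translation`, the vote counts, and the tier labels are hypothetical and do not describe how the content value index is actually computed.

```python
def choose_translation(machine_output, crowd_edits, vote_threshold):
    """Pick the text to publish for one segment.

    crowd_edits: list of (text, votes) pairs from community post-editing.
    Returns (text, tier), where tier is "crowd" if any edit reached the
    threshold and "machine" otherwise.
    """
    approved = [(votes, text) for text, votes in crowd_edits if votes >= vote_threshold]
    if approved:
        # Prefer the approved edit with the most votes.
        votes, text = max(approved, key=lambda pair: pair[0])
        return text, "crowd"
    return machine_output, "machine"


if __name__ == "__main__":
    edits = [("Guarde el archivo.", 4), ("Salve el archivo.", 1)]
    print(choose_translation("Salvar el archivo.", edits, vote_threshold=3))
    print(choose_translation("Salvar el archivo.", edits, vote_threshold=5))
```

A further tier, escalating segments to professional translators when quality requirements are highest, would follow the same pattern but is left out of this sketch.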

I think the real question is how do I know if the translation is any good? Well, there are certain controls you can include in the review process. The review process is different from the translation. You can have different people do it, set thresholds, etc.

We have a case study for eBay: in a blind survey, their in-house professional translators chose the crowdsourced translation ahead of the professional translation. And this is not an aberration.

The reason for this is the fact that the crowd is a community that is centered around the content. And so in many ways what’s relevant for quality is the domain expertise. Linguistic skills and domain expertise. I think that linguistic skills are pretty widely available, over half the world speaks more than one language.

So being an expert at Adobe products, Novell products, eBay categories, etc. makes you more likely to produce a high-quality translation with more appropriate word selection, because you actually know what the content is about.

7. What benefits do customers see from using Lingotek? It’s an easy answer for Chief Financial Officers that want to save money on translations. People saved anywhere from 15% to as much as 90%. I think that’s great and that’s a business decision.

But I think that what’s really great is that the actual community, the consumers, are able to select what content is being translated. They choose to participate. Users are able not just to generate content but also to select content that is relevant and translate it, as opposed to being told by the company that this is what you get in your language.

Corporations tell regions of the world ‘this is what you get.’ We’re trying to get out of that. So they see great benefit, as the community really embraces the opportunity for them to select what content they want.

But who are these people who would participate actively in their community to the point where they would volunteer their time, efforts, skills, and expertise to translate on behalf of a content owner? Well, guess what, they are people that care deeply about Adobe, eBay, etc. So you have people that are not just consumers, but collaborators who feel, “I am part of that organization.”

It’s a hugely different level of loyalty. There’s certainly the possibility to compensate these people. Sun Microsystems takes people that participate actively in community translation on stage at their annual conference and says, ‘These are our top translators, here’s a certificate, we recognize your contributions, we value them.’

This is what people care about. A whole different level of brand engagement.

8. How does Lingotek stand out from the crowd? We see a lot of the established products as clunky enterprise software: client-server, not easy to use, overly complicated, without web-based architecture, etc.

Lingotek’s concept is an easy-to-use, web-based, collaborative environment with an enhanced UX that is fun and productive. The advances we’re trying to accelerate are things like getting the best machine translation possible, using open source engines, and integrating into existing content management systems.

We’re creating more of a connection with content offering and translations so that they exist in-line with each other, and not separately. The antiquated way to do things is to package up your files, send them to an agency, the agency assigns people to them.

But if you have the crowd selecting content and doing translations in place, lots of things can get done, and nobody’s really approaching it in that way, a fully integrated platform that the content owner controls, with all these resources available to the community like machine translation, translation memory, etc.

9. What does the future hold for Lingotek’s Collaborative Translation Platform, which is currently at version 5? We have a very productive engineering team, and we do releases every month. So there’s a very agile, iterative development process, and we tend to ship a new major version each year. We’re five years in, so we’re on 5.0; 6.0 will come in our sixth year. We’re on track for the summer of next year.