Along with several new features related to captions

Nov 20, 2009 08:44 GMT

YouTube has introduced some pretty interesting technology that enables users to add automatic text captions to videos. The feature is only available on a few channels for now, as the speech recognition technology it uses isn't exactly perfect, but Google says it should improve in time. The video site has allowed users to upload their own captions for about a year, but the process is time-consuming and only a few users have taken advantage of it so far.

“Since the original launch of captions in our products, we’ve been happy to see growth in the number of captioned videos on our services, which now number in the hundreds of thousands,” Ken Harrenstien, a Google software engineer working on the features, wrote.

“However, like everything YouTube does, captions face a tremendous challenge of scale...To help address this challenge, we've combined Google's automatic speech recognition (ASR) technology with the YouTube caption system to offer automatic captions, or auto-caps for short,” he added.

There are several related features being introduced at the same time, but the most interesting, albeit the least developed, is the automatic captions technology. Where the feature is available, users can activate captions on a video-by-video basis from the menu in the bottom-right corner of the player. It only works for English videos for now and is limited to several educational channels and Google's own channels, as the technology is far from perfect. As it improves, though, Google wants to expand it to other channels and eventually, perhaps, to all YouTube videos.

Less interesting, but certainly more practical for the moment, is another new feature also related to captions. Machine transcriptions aren't perfect, but creating captions for your videos manually can be a daunting task. You not only have to provide the text, but also time it so it's synced with the audio and the video on the screen. Now YouTube has introduced auto timing, which enables users to just provide a text file of the dialogue in the video and, using the same speech-recognition technology applied in auto captions, the site will take care of syncing the text to the video.
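To make the idea of auto timing more concrete, here is a minimal Python sketch of the general approach: take the lines of a plain-text transcript, pair them with per-word timings, and write out timed captions. It is not YouTube's actual implementation, which has not been published in detail. The `word_times` input is a hypothetical stand-in for what a speech recognizer might return, the names `Cue`, `align_lines` and `to_srt` are made up for illustration, and the output uses the common SubRip (.srt) layout purely as an example of a timed caption file.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the beginning of the video
    end: float    # seconds from the beginning of the video
    text: str


def format_timestamp(seconds: float) -> str:
    """Render seconds as a SubRip timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def align_lines(lines, word_times):
    """Give each transcript line the time span of the words it contains.

    word_times is a hypothetical recognizer output: one (start, end) pair
    per spoken word, in order. This sketch assumes every line is non-empty
    and that the transcript and the recognizer agree on the word count.
    """
    cues = []
    index = 0
    for line in lines:
        count = len(line.split())
        span = word_times[index:index + count]
        # The line starts with its first word and ends with its last one.
        cues.append(Cue(span[0][0], span[-1][1], line))
        index += count
    return cues


def to_srt(cues):
    """Serialize cues as numbered SubRip blocks separated by blank lines."""
    blocks = []
    for number, cue in enumerate(cues, start=1):
        timing = f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}"
        blocks.append(f"{number}\n{timing}\n{cue.text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    transcript = ["Hello and welcome", "to the show"]
    timings = [(0.0, 0.4), (0.5, 0.7), (0.8, 1.3),
               (1.5, 1.6), (1.7, 1.8), (1.9, 2.5)]
    print(to_srt(align_lines(transcript, timings)))
```

A real system would have to cope with words the recognizer mishears or skips, which is exactly the hard part that the uploader no longer has to do by hand; the sketch sidesteps it by assuming a perfect word-for-word match.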

Finally, because Google already uses fairly comprehensive translation technology in products such as Google Translate, it was easy for the dev team to let users automatically translate captions into any of the 51 languages Translate currently supports. This is another example of something only Google can do at the moment, as very few other companies have both powerful speech recognition and translation technologies that they can simply take and stitch onto their other products.

There are several reasons why captions, and especially automatic captions, are important. The most obvious one is that they help people with hearing problems enjoy, to a certain degree, the growing amount of video content available online. Another advantage is that a text transcription of what's being said makes videos easier to follow for people learning English and, once translated, makes it possible for those who don't speak any English at all to understand the videos, again to a certain degree. But in the long run, perhaps the biggest opportunity is that captions allow Google to search within the content of a video for relevant keywords, opening up a lot of information that was otherwise “invisible” to the search engine.