But only for API users so far

Sep 30, 2009 13:58 GMT  ·  By

Google is once again putting its massive computing capabilities and technology portfolio to good use to offer new features and tools for its products. The company has announced that it has enabled optical character recognition (OCR) for files uploaded to Google Docs through an API it now provides. The OCR works on reasonably high-resolution images of typewritten or printed text, and it is still very much in the experimental phase.

“Optical Character Recognition (OCR), allows your application to create editable Google Documents from high-resolution images containing text (such as faxes or scanned letters). To perform OCR on a .png, .jpg, or .gif upload, add the ocr=true parameter onto your upload request,” Jaron Schaeffer and Eric Bidelman, from the Google Docs team, wrote on the Google Data APIs blog. “OCR will only work well on high-resolution images. The quality of the extracted text isn't perfect yet, but we're busy improving it!”
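As a rough illustration of what adding the parameter looks like in practice, the Python sketch below appends ocr=true to an upload URL. The endpoint shown is a made-up placeholder and the helper name with_ocr is hypothetical; neither comes from Google's announcement, and a real request would also need authentication and the image data in its body.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

# Placeholder endpoint for illustration only (an assumption, not the real
# Documents List Data API upload URL).
UPLOAD_URL = "https://docs.example.com/feeds/upload"


def with_ocr(url: str) -> str:
    """Return the given URL with the ocr=true query parameter set."""
    parts = urlparse(url)
    # Keep existing query parameters, dropping any earlier ocr value.
    pairs = [(k, v) for k, v in parse_qsl(parts.query) if k != "ocr"]
    pairs.append(("ocr", "true"))
    return urlunparse(parts._replace(query=urlencode(pairs)))


if __name__ == "__main__":
    print(with_ocr(UPLOAD_URL))
```

The helper only builds the request URL; sending the upload itself is left out because the exact request format is not described in the announcement.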

The feature isn't available to everyday users and hasn't been integrated into the service itself yet, but the underlying technology has been enabled and third-party developers can access it through the Documents List Data API by adding the 'ocr=true' parameter. Regular users, though, can try out the new technology through the sample page and uploader that Google has provided as an example use case.

There are several limitations at the moment: the feature only accepts a few common image formats, and the images have to be rather high-resolution to get the best results. There is also a file size limit of 10 MB and a maximum resolution of 25 megapixels. The results are hit or miss, but for high-quality images the OCR tends to do a decent job, recognizing a fair number of the characters. Unfortunately, the process is still rather buggy and prone to errors, which is not entirely surprising for a first release but means it is certainly not ready for production use. Still, as the technology matures, it could prove a very compelling reason to use Google Docs.
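A developer could check these limits before uploading. The following Python sketch is one way to do that, under a few assumptions that are not from Google: "10 MB" is read as 10 × 1024 × 1024 bytes, "25 megapixels" as 25,000,000 pixels, and the function name ocr_upload_problems is hypothetical. The caller is expected to supply the image's size and dimensions.

```python
import os

# Formats named in Google's announcement.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".gif"}
# Assumption: "10 MB" interpreted as 10 * 1024 * 1024 bytes.
MAX_BYTES = 10 * 1024 * 1024
# Assumption: "25 megapixels" interpreted as 25,000,000 pixels.
MAX_PIXELS = 25_000_000


def ocr_upload_problems(filename, size_bytes, width, height):
    """Return a list of reasons the image would likely be rejected."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'none'}")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 10 MB")
    if width * height > MAX_PIXELS:
        problems.append("image exceeds 25 megapixels")
    return problems


if __name__ == "__main__":
    # A 3000 x 2000 PNG of about 1 MB passes all three checks.
    print(ocr_upload_problems("scan.png", 1_000_000, 3000, 2000))
```

An empty list means the image fits the stated limits; it says nothing about whether the resolution is high enough for the OCR to produce good text.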