Softpedia
 


June 22nd, 2010, 09:35 GMT

Google Docs Adds OCR Conversion of PDF and Image Files

The new conversion option in Google Docs
One of Google’s greatest strengths is that it can develop a technology once and then apply it across several products, all of which benefit. For example, Google has been working on optical character recognition (OCR) technology for years and uses it extensively in its Google Books project. Recently, it also started testing OCR integration with Google Docs. Now, it has quietly added support for converting PDF and image files to native documents using OCR.

When uploading a file to Google Docs, you now have the choice to “Convert text from PDF or image files to Google Docs documents.” Selecting this option triggers the OCR technology, and Docs then tries to decipher the files and present them as plain text in the document editor. The downside is that most of the formatting is dropped, but what you get in return is a document you can edit.

However, the warning you get at the beginning of any converted document makes it pretty clear that the technology isn’t perfect. “This document contains text automatically extracted from a PDF or image file. Formatting may have been lost and not all text may have been recognized,” the notification reads. At least you get a rendered image of every page in a PDF file inserted in the new document so you can compare the results with the original.

The quality of the OCR varies greatly from file to file and some are going to be inherently harder to convert. In my testing, Google Docs performed flawlessly, without a single error as far as I can tell. Of course, the original PDF document was fairly high quality, so that helped greatly.

Others have reported poorer results. Still, it’s an interesting feature, and it should come in handy for those who don’t regularly need these kinds of tools. And since this is just the first iteration, you can expect the technology and the feature itself to improve over time.

A converted PDF document in Google Docs
The original PDF document


