How-to: Google Docs OCR
Update May 1, 2015: (a9t9) launched its very own free and open-source Online OCR service - try it out and let us know how it compares.
How to convert an image or (an image inside a) PDF into text using Google’s free OCR converter. The confusing part is: There is no button, no checkbox and no mention of OCR in the Google Docs of 2015. So if you cannot find it, it is not you being stupid…
Google OCR works indirectly by automatically scanning and converting images to text – but only if Google Docs “thinks” there is text in an image. In other words: If there is no text that Google can recognize – nothing happens. But now, step by step:
OCR with Google Docs/Google Drive
If you are not there, leave Google Docs and go to Google Drive
Upload the image to Google Drive
Start the OCR conversion: Right-click on the file and select “Open with Google Docs”
Done: Now your image is inside a Google Doc and the extracted text is below the image
As with many things, once you know how, it’s easy. The OCR’ed text is below the embedded image.
If there is no text, then Google could not extract anything and fails silently.
Example Google Doc with the conversion result https://docs.google.com/document/d/1AUPOWk9laXMLD0G-WT7DtFQUAOANCBRjF29-zmAN71o/edit?usp=sharing
Automatic OCR on uploads: In GDrive (not Google Docs!) click on the cog at the top right, then select “Settings”, and in the popup that opens you can change the upload settings. For an automatic conversion to the Google Docs format (and the automatic document OCR that comes with it) check the box at “Convert Uploads” as shown in the screenshot:
Of course, what automatic OCR does not solve is my #1 complaint with Google OCR: One never knows if and when the OCR process kicks in, and if it does not, why not.
Google OCR API
Many have asked me, but unlike Baidu, Google offers no dedicated OCR API. If you insist on using Google for OCR, you can only use the Google Drive REST API to upload/insert documents to Google Docs. Basically you use the API to replicate the manual process described above. The API takes parameters such as:
- ocr - Whether to attempt OCR on .jpg, .png, .gif, or .pdf uploads. (Default: false)
- ocrLanguage - If ocr is true, hints at the language to use. Valid values are BCP 47 codes.
- Details here: Google Drive REST API
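To make the two parameters above concrete, here is a minimal sketch that only builds the request URL for such an upload; it does not send anything. The endpoint (the Drive REST API v2 upload URL) and the extra uploadType and convert parameters are assumptions based on the v2 API of that time and are not taken from this article, so check them against the current Drive documentation. The function name build_ocr_upload_url is made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: Drive REST API v2 upload endpoint; verify against the current docs.
DRIVE_V2_UPLOAD_URL = "https://www.googleapis.com/upload/drive/v2/files"


def build_ocr_upload_url(ocr_language=None):
    """Return an upload URL that asks Drive to attempt OCR on the uploaded file.

    ocr_language is an optional BCP 47 code (for example "en") passed as the
    ocrLanguage hint described above.
    """
    params = {
        "uploadType": "multipart",  # assumption: metadata and file sent in one request
        "convert": "true",          # assumption: convert the upload to a Google Doc
        "ocr": "true",              # from the list above; the default is false
    }
    if ocr_language:
        params["ocrLanguage"] = ocr_language
    return DRIVE_V2_UPLOAD_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_ocr_upload_url("en"))
```

An actual upload would be an authenticated (OAuth 2.0) POST of the image or PDF to this URL. As with the manual process, the response may well contain a document without any text if Google did not recognize anything.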
Some more OCR tips for better results:
1. Examples of file types suitable for OCR:
- Image or PDF files obtained using flatbed scanners
- Photos taken with digital cameras or mobile phones
- Screenshots (e.g. from YouTube videos)
2. For best results, the image or PDF files need to meet certain requirements:
Resolution: High-resolution files work best. Google recommends that each line of text in the document be at least 10 pixels high.
Orientation: Only documents with horizontal left-to-right text are recognized. If you’ve scanned or captured a document in a different orientation, you can use a program like “Windows Photo Viewer” (part of Windows!) to rotate the image before uploading it to Google Drive.
File size limitations: The maximum size for images (.jpg, .gif, .png) and PDF files (.pdf) is 2 MB. For PDF files, Google looks only at the first 10 pages when searching for text to extract.
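Since Google fails silently, it can help to check a file against these limits before uploading. The following is a small sketch under stated assumptions: “2 MB” is read as 2 * 1024 * 1024 bytes, the file type is judged by extension only, and the function name ocr_upload_problems is made up for illustration. It does not check resolution, orientation or PDF page count.

```python
import os

# File types listed above as OCR candidates.
OCR_EXTENSIONS = {".jpg", ".gif", ".png", ".pdf"}

# Assumption: the "2 MB" limit is interpreted as 2 * 1024 * 1024 bytes.
MAX_BYTES = 2 * 1024 * 1024


def ocr_upload_problems(filename, size_bytes):
    """Return a list of reasons why a file may be skipped by Google OCR.

    An empty list means the file passes the extension and size checks.
    """
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in OCR_EXTENSIONS:
        problems.append("unsupported file type: " + (ext or "(none)"))
    if size_bytes > MAX_BYTES:
        problems.append("file larger than 2 MB")
    return problems


if __name__ == "__main__":
    print(ocr_upload_problems("scan.png", 500000))
    print(ocr_upload_problems("scan.tiff", 3 * 1024 * 1024))
```

For a file on disk, call it as ocr_upload_problems(path, os.path.getsize(path)).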
3. Further reading:
- Manual upload settings in Google Drive (outdated? Does not reflect 2014/2015)
- Optical Character Recognition in Google Drive (confusing, at least for me. That is why I wrote this tutorial)
- Alternatives to Google OCR: Best Online OCR Software (Review)
- New: Try the open-source OCR project “(a9t9) OCR” - it includes a fast and free Online OCR service and a free OCR API.