by Dan Delong
Optical Character Recognition is not only useful for creating editable text documents; it can also make published works searchable, whether in doc, rtf, html, or pdf format. Moreover, once converted to text, such files take a lot less space, whether for storage or for emailing.
Free OCR does not require that the text-containing image you decide to OCR be in a particular image format, or that you have created the image using a scanner. That said, Free OCR supports a wide variety of Twain and WIA scanners, which will produce a flatter-looking page, at a preferred dots-per-inch resolution (200 dpi minimum, 300 dpi or better preferred), than a digital camera will.
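To make the resolution guidance above concrete, here is a minimal Python sketch. It is not part of Free OCR; the function names `rate_resolution` and `pixel_size` are made up for illustration, and the only figures taken from the article are the 200 dpi minimum and the 300 dpi preference. The letter-size page (8.5 x 11 inches) is an assumption.

```python
# Thresholds from the article's guidance: 200 dpi minimum, 300 dpi or better preferred.
MIN_DPI = 200
PREFERRED_DPI = 300


def rate_resolution(dpi: int) -> str:
    """Classify a scan resolution against the article's guidance (hypothetical helper)."""
    if dpi <= 0:
        raise ValueError("dpi must be a positive number")
    if dpi < MIN_DPI:
        return "too low"
    if dpi < PREFERRED_DPI:
        return "acceptable"
    return "preferred"


def pixel_size(width_in: float, height_in: float, dpi: int) -> tuple[int, int]:
    """Pixel dimensions of a page of the given size (in inches) at a given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


if __name__ == "__main__":
    # Assumed page size: US letter, 8.5 x 11 inches.
    for dpi in (96, 200, 300, 600):
        width_px, height_px = pixel_size(8.5, 11, dpi)
        print(f"{dpi} dpi: {width_px} x {height_px} pixels, {rate_resolution(dpi)}")
```

The point of the arithmetic is simply that a page captured at a low dpi gives the OCR engine far fewer pixels per letter to work with, which is why the resolution of the source image matters so much.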
[The latest version of the software used to actually perform this character recognition is from the year 2015. The Tesseract OCR PDF engine was developed in the 80s and 90s by HP, and is now maintained by Google.]
If you have a multi-page tiff or pdf file, Free OCR can load all of its pages and read them one by one.
After installing, I would recommend clicking on the "Open Help" button to understand the technique used for recreating the text in a word processor so that it looks like the original layout. Some of the more expensive, non-free programs automate this process, but Free OCR requires a few steps to insert an image or to reproduce certain font characteristics.
If Microsoft Word is installed on your computer, each section of recognized text can be transferred directly to Word. If Word is not installed, just copy and paste into any open word processor.
The example I tried was a low-resolution (96 dpi) jpeg image of a newspaper scan. Since Word was not available on my laptop, I copied and pasted the resultant text into TextMaker (a SoftMaker word processor, very much like Word). The image of Gary Lautens was simply captured with Free OCR's "Copy selection to clipboard" function, by drawing an outline around his image and right-clicking on the menu item that looks like ants crawling around a rectangle.
Lots of character recognition mistakes were made, but I put that down to the low resolution of the image file used for this experiment.
So, I found a file at 600dpi, with lots of text. It was bang on... error free!
Platform: Windows XP to Win 10
Language: English and 10 others
Download Size: 11 MB
Installed Size: 20 MB
Download Site Here