Dr James Bayley

Creating successful projects

OneNote does OCR better than PhantomPDF

I used to use Adobe Acrobat Professional but I got tired of the cost and licensing hassles. Now I use PhantomPDF Business Edition, which costs less and works well.

I recently scanned an antiquarian document as PDF. This resulted in a PDF wrapper around some images. I used PhantomPDF for OCR and it was OK, but there were lots of suspects (words the OCR engine flagged as uncertain). I don’t think that Adobe Acrobat would have done much better.

Then I simply printed the PDF document to OneNote and right-clicked on the image to extract the text. It did an awesome job and it is free.

Thoughts on OCR and AI

There were still a few errors, which suggests that Microsoft have not applied much AI to the problem. My document contained words in Victorian and Indian English such as “dawk” and “waggon”. AI could have determined the nature of the document and improved the recognition by comparing the text with corpora of similar documents. This seems like an easy win, and a quick search does show some people pursuing this approach.
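To make the idea concrete, here is a minimal sketch of corpus-based post-correction in Python. It is not what OneNote, PhantomPDF or any other product actually does. It only illustrates the principle using the standard library's difflib. The word list, the function names and the similarity cutoff are all my own assumptions; a real system would use a large period corpus and a proper language model.

```python
import difflib
import re

# Hypothetical mini-vocabulary standing in for a corpus of Victorian and
# Indian English; a real system would load many thousands of words.
PERIOD_WORDS = ["dawk", "waggon", "palanquin", "bungalow"]


def correct_token(token, vocabulary, cutoff=0.8):
    """Return the closest vocabulary word, or the token unchanged.

    The cutoff is an assumed similarity threshold (0 to 1); a higher value
    makes the correction more conservative.
    """
    lowered = token.lower()
    if lowered in vocabulary:
        return token
    matches = difflib.get_close_matches(lowered, vocabulary, n=1, cutoff=cutoff)
    if not matches:
        return token
    best = matches[0]
    # Keep a leading capital if the OCR output had one.
    return best.capitalize() if token[0].isupper() else best


def correct_text(text, vocabulary, cutoff=0.8):
    """Apply correct_token to every alphanumeric run in the OCR text."""
    return re.sub(
        r"[A-Za-z0-9]+",
        lambda m: correct_token(m.group(0), vocabulary, cutoff),
        text,
    )


if __name__ == "__main__":
    print(correct_text("The dawlk waqgon arrived", PERIOD_WORDS))
```

The cutoff matters a great deal with short words: set it too low and a perfectly good word like “dark” would be “corrected” to “dawk”. That is exactly where knowing the nature of the document, rather than just the spelling, would help.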

Dr James Bayley

02/06/2018

applications, Uncategorized

ocr, OneNote, pdf, PhantomPDF
