An email came through my inbox a couple of weeks ago from someone who wanted to automate getting the text from an image after OneNote runs OCR (optical character recognition) on it. This request bubbles up now and then, and this time Jeff Cardon kicked out a quick app to get the OCR text.
Now, this is a pretty basic application that is limited in scope. It’s not very friendly or documented, but it gets the text. I saw it come through with Jeff’s reply to the email and asked if it was OK to give it away. He obliged, so if you want to extend it, feel free. All I ask is that you let me know what you do with it.
Some ideas I had for this (mostly from a perspective of “teach yourself a little about the OneNote API”), in case you want to run with it:
- Add an update command so you can change the text OneNote uses
- Show the image being used – a thumbnail image would be even better
- Let you choose any page rather than the active page
- Filter the pages from the previous idea to only show pages that have OCR data on them
This only works on the active page (the page currently in view). You don’t get prompted for this when you start it, but if there is no image with OCR data on the page, you do get notified.
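If you want a feel for what "getting the OCR text" involves before opening the project, here is a rough Python sketch. It is not Jeff's code. It assumes you already have the page XML as a string (fetching it from OneNote is not shown), and it assumes OCR results sit in `OCRText` elements nested under `Image` elements, which is how I remember the page schema. The function name `extract_ocr_text` and the sample page are made up for illustration, and tags are matched by local name because the XML namespace differs between OneNote versions.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def extract_ocr_text(page_xml):
    """Return the OCR text of every image on the page that has any.

    Assumption: OCR text is stored in OCRText elements beneath Image
    elements. Matching is done on local names so the namespace URI,
    which varies by OneNote version, does not matter.
    """
    root = ET.fromstring(page_xml)
    texts = []
    for image in root.iter():
        if local_name(image.tag) != "Image":
            continue
        for node in image.iter():
            if local_name(node.tag) == "OCRText" and node.text and node.text.strip():
                texts.append(node.text.strip())
    return texts


# Hypothetical sample page: one image with OCR data, one without.
SAMPLE_PAGE = """<one:Page xmlns:one="http://schemas.microsoft.com/office/onenote/2010/onenote">
  <one:Image>
    <one:OCRData lang="en-US">
      <one:OCRText><![CDATA[Hello from OCR]]></one:OCRText>
    </one:OCRData>
  </one:Image>
  <one:Image/>
</one:Page>"""

if __name__ == "__main__":
    texts = extract_ocr_text(SAMPLE_PAGE)
    if texts:
        print("\n".join(texts))
    else:
        # Mirrors the app's behavior of notifying you when nothing is found.
        print("No image with OCR data on this page.")
```

The empty-list case is what the "you do get notified" behavior corresponds to: a page with images but no OCR data simply yields nothing to show.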
To get to the exe file, look in the folder named “Bin\Release”. The code is also included here. Again, this is just a little demo to complete this task. I hope you like it!
Questions, comments, concerns and criticisms always welcome,