In previous posts I introduced some candidates for OCR, such as C# – OCR library candidates and C# – An example of OCR web service. Last year Microsoft published a new machine learning project called Project Oxford. It provides APIs for computer vision, face, emotion, video, speech, speaker recognition and more. The project is really interesting, and you can learn more about it from its homepage. In this small post, I would like to illustrate how to call the OCR service, which is just one part of the Computer Vision APIs.
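To give a rough idea of what such a call involves, here is a minimal Python sketch using only the standard library. The endpoint URL, the query parameters and the `Ocp-Apim-Subscription-Key` header are assumptions based on how the Computer Vision OCR API was commonly called, so check them against the official documentation. The helper names `build_ocr_request`, `extract_text` and `recognize` are made up for this illustration.

```python
import json
import urllib.request

# Assumption: OCR endpoint of the Computer Vision API; verify against the official docs.
OCR_URL = "https://api.projectoxford.ai/vision/v1/ocr"


def build_ocr_request(image_bytes, subscription_key, language="unk"):
    """Build (but do not send) a POST request carrying the raw image bytes."""
    url = f"{OCR_URL}?language={language}&detectOrientation=true"
    headers = {
        "Content-Type": "application/octet-stream",
        # Assumption: the subscription key is passed in this header.
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    return urllib.request.Request(url, data=image_bytes, headers=headers, method="POST")


def extract_text(ocr_result):
    """Flatten a regions -> lines -> words JSON result into plain text."""
    lines = []
    for region in ocr_result.get("regions", []):
        for line in region.get("lines", []):
            lines.append(" ".join(word["text"] for word in line.get("words", [])))
    return "\n".join(lines)


def recognize(image_bytes, subscription_key):
    """Send the image to the service and return the recognized text."""
    request = build_ocr_request(image_bytes, subscription_key)
    with urllib.request.urlopen(request) as response:
        return extract_text(json.load(response))
```

With a valid key, `recognize(open("scan.png", "rb").read(), "YOUR_KEY")` would return the recognized text line by line. The file name and key here are placeholders.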
Years ago I wrote a small post, C# – OCR library candidates, comparing the OCR libraries Tesseract and Microsoft Office Document Imaging. Tesseract is an open source OCR engine. Unfortunately, its accuracy is still too low for use in commercial products. Last week I wanted to build a small OCR web service to train myself and to test Tesseract again. The result was still as bad as last time (I guess ABBYY FineReader may be the best OCR SDK, but I have no full version for testing). Although Tesseract is not able to recognize complex documents, I still used it for this example because I had no better candidate. The sample OCR web service works quite simply: it receives a file uploaded by the client, runs OCR and returns the text. No big deal.
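The receive, recognize, return flow described above can be sketched in a few lines. This is only a Python illustration built on the standard library, not the actual service from the post. It assumes the client sends the raw file bytes as the POST body, and `run_ocr` is a hypothetical placeholder where a real OCR engine such as Tesseract would be called.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_ocr(image_bytes):
    """Hypothetical placeholder: a real service would call an OCR engine here."""
    return f"<recognized text for {len(image_bytes)} bytes>"


class OcrHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # 1. Receive the uploaded file (assumed to be the raw request body).
        length = int(self.headers.get("Content-Length", 0))
        image_bytes = self.rfile.read(length)

        # 2. Run OCR on the uploaded bytes.
        body = run_ocr(image_bytes).encode("utf-8")

        # 3. Return the recognized text as plain text.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the sketch quiet; a real service would log requests.
        pass


def make_server(port=8080):
    """Create the server; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), OcrHandler)
```

Calling `make_server().serve_forever()` starts the service locally, and any HTTP client can then POST a file to it and read the text from the response.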