Is it possible to detect text, symbols, and components directly in a scanned PDF file with TensorFlow or a similar tool?

Question

I need to extract information from PDF documents produced by a scanner. The program has to be trainable in some way, so that it can learn what the different figures mean. Most of this should happen without human intervention, so that it simply returns a result after the file has been scanned. Does anyone know whether this is possible with a machine learning program, or in some alternative way?

score 1 · Answer 1 · answered Jan 16 '19 at 10:36

Yes, that's possible. I am working on a project in which I have to detect text in images. I did a quick search and found these two algorithms:

1. EAST (Efficient and Accurate Scene Text Detector)
EAST is based on machine learning: it is a fully convolutional neural network that predicts text boxes directly from the image. Here are some links link1 link2 explaining how to use it with an example, and how to use Tesseract to extract the detected text.

2. CTPN (Connectionist Text Proposal Network)
This algorithm is based on machine learning. Here is its link on GitHub. In the description, you will find a link to a pre-trained model that you can use. Alternatively, you can prepare your own data and train your own model.
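For the EAST route in item 1, a rough sketch of the detect-then-read pipeline could look like the code below. It is not taken from the linked tutorials. It assumes the commonly distributed pre-trained graph `frozen_east_text_detection.pb` and its two output layer names, plus the `opencv-python`, `numpy`, and `pytesseract` packages with a Tesseract install. The function names (`east_input_size`, `decode_east`, `detect_and_read`) are made up for illustration, and the decoding step ignores box rotation for simplicity. A scanned PDF would first have to be rendered to page images, which is not shown here.

```python
import math


def east_input_size(width, height, multiple=32):
    """Round a size down to a multiple of 32 (EAST expects such inputs)."""
    new_w = max(multiple, (width // multiple) * multiple)
    new_h = max(multiple, (height // multiple) * multiple)
    return new_w, new_h


def decode_east(scores, geometry, min_confidence=0.5):
    """Convert EAST score/geometry maps into [x, y, w, h] boxes.

    scores has shape (1, 1, rows, cols); geometry has shape (1, 5, rows, cols)
    with four edge distances and an angle per cell. Each cell covers 4x4
    input pixels. Rotation is used only to locate the box corner.
    """
    boxes, confidences = [], []
    score_map, geo = scores[0][0], geometry[0]
    for y in range(len(score_map)):
        for x in range(len(score_map[y])):
            score = float(score_map[y][x])
            if score < min_confidence:
                continue
            d0, d1, d2, d3 = (float(geo[c][y][x]) for c in range(4))
            angle = float(geo[4][y][x])
            cos, sin = math.cos(angle), math.sin(angle)
            h, w = d0 + d2, d1 + d3
            end_x = x * 4.0 + cos * d1 + sin * d2
            end_y = y * 4.0 - sin * d1 + cos * d2
            boxes.append([int(end_x - w), int(end_y - h), int(w), int(h)])
            confidences.append(score)
    return boxes, confidences


def detect_and_read(image_path, model_path="frozen_east_text_detection.pb"):
    """Detect text regions with EAST, then OCR each region with Tesseract."""
    # Imported here so the helpers above stay usable without these packages.
    import cv2
    import numpy as np
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    orig_h, orig_w = image.shape[:2]
    new_w, new_h = east_input_size(orig_w, orig_h)
    ratio_w, ratio_h = orig_w / new_w, orig_h / new_h

    net = cv2.dnn.readNet(model_path)
    # blobFromImage resizes the page to (new_w, new_h) and subtracts the mean.
    blob = cv2.dnn.blobFromImage(image, 1.0, (new_w, new_h),
                                 (123.68, 116.78, 103.94),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    scores, geometry = net.forward(["feature_fusion/Conv_7/Sigmoid",
                                    "feature_fusion/concat_3"])

    boxes, confidences = decode_east(scores, geometry)
    # Non-maximum suppression removes heavily overlapping detections.
    keep = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)

    results = []
    for i in np.array(keep).flatten():
        x, y, w, h = boxes[int(i)]
        # Scale back to the original image and clamp to its borders.
        x0, y0 = max(0, int(x * ratio_w)), max(0, int(y * ratio_h))
        x1 = min(orig_w, int((x + w) * ratio_w))
        y1 = min(orig_h, int((y + h) * ratio_h))
        if x1 <= x0 or y1 <= y0:
            continue
        crop = cv2.cvtColor(image[y0:y1, x0:x1], cv2.COLOR_BGR2RGB)
        # --psm 7 tells Tesseract to treat the crop as a single text line.
        text = pytesseract.image_to_string(crop, config="--psm 7")
        results.append(((x0, y0, x1, y1), text.strip()))
    return results
```

Calling `detect_and_read("page.png")` (a hypothetical file name) would return a list of bounding boxes with the text read from each. Symbols and components that are not text would still need a separately trained detector.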

I tried both of them, and the CTPN model gave better results, especially when the image contains large text.
