I'm trying to run OCR on a PDF with a two-page layout: each landscape page of the PDF contains two portrait pages side by side, the left half being one page and the right half the next. This layout sometimes confuses Tesseract. Can I tell Tesseract about the layout, or efficiently split the original PDF into single pages before running it through Tesseract?
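To make the splitting idea concrete, here is a minimal sketch of what I have in mind, assuming each PDF page has already been rasterised to an image and that the two pages meet exactly at the horizontal midpoint. The names `half_page_boxes`, `split_spread`, and the `gutter` parameter are hypothetical, made up for illustration. The boxes use the `(left, upper, right, lower)` convention that Pillow's `Image.crop` accepts.

```python
from typing import List, Tuple

# Crop box in pixels: (left, upper, right, lower), the convention used by Pillow's Image.crop.
Box = Tuple[int, int, int, int]


def half_page_boxes(width: int, height: int, gutter: int = 0) -> List[Box]:
    """Return crop boxes for the left and right halves of a two-page spread.

    Assumption: the pages meet at the horizontal midpoint of the image.
    `gutter` (hypothetical parameter) trims that many pixels on each side
    of the centre line, e.g. to drop a binding shadow.
    """
    if width < 2 or height < 1:
        raise ValueError("image is too small to split")
    mid = width // 2
    if gutter < 0 or gutter >= mid:
        raise ValueError("gutter must be non-negative and smaller than half the width")
    left = (0, 0, mid - gutter, height)
    right = (mid + gutter, 0, width, height)
    return [left, right]


def split_spread(image, gutter: int = 0):
    """Crop a Pillow-style image into [left_page, right_page].

    Relies only on `image.size` (width, height) and `image.crop(box)`.
    """
    width, height = image.size
    return [image.crop(box) for box in half_page_boxes(width, height, gutter)]
```

Each cropped half could then be passed to Tesseract on its own, for example through `pytesseract.image_to_string`, assuming Pillow and pytesseract are installed. Rasterising the PDF pages in the first place is a separate step that this sketch does not cover.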