I recently scanned a book into a 600-page PDF file. However, the pages are randomly skewed/rotated clockwise or counterclockwise. Is there any software that can correct this automatically? I know Acrobat Pro can, but is there any free Ubuntu software or script?
3 Answers
Deskew
Deskew is a command-line tool for deskewing scanned text documents. It uses the Hough transform to detect "text lines" in the image. As output, you get an image rotated so that the lines are horizontal.
Installation: Download the latest release. It's written in Pascal, but seems well maintained.
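Deskew works on one image at a time, so a scanned book needs a loop over the page images. Below is a minimal sketch of such a loop, not a tested recipe. It assumes the binary is on your PATH as deskew, that the pages already exist as PNG files, and that the folder names pages and deskewed (both hypothetical) suit you. The only Deskew option used is -o, which sets the output file name.

```shell
#!/bin/sh
# Sketch: run Deskew over every page image in a folder.
# Assumptions: "deskew" is on PATH; the directory names below are hypothetical.
in_dir="pages"
out_dir="deskewed"

# Map an input path to its output path, keeping the file name.
out_path() {
    printf '%s/%s\n' "$out_dir" "$(basename "$1")"
}

deskew_all() {
    mkdir -p "$out_dir"
    for page in "$in_dir"/*.png; do
        [ -e "$page" ] || continue    # skip if the glob matched nothing
        deskew -o "$(out_path "$page")" "$page"
    done
}

# Only run when the tool is actually installed.
if command -v deskew >/dev/null 2>&1; then
    deskew_all
else
    echo "deskew not found on PATH; nothing done" >&2
fi
```

The originals are left untouched, so you can compare a few pages before combining the results back into a PDF.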
pagetools: Page Layout Detection Tools
Automatic deskew and bounding box determination for scanned page images
sudo apt install pagetools
Last Update: 2013-03-22
This is almost automatic, starting with a multipage .pdf:
Install scantailor-advanced
- Open Gnome Software (install it if absent). Note: this does not work with App Center/Snap Store.
- Search for scantailor and select the one with Ubuntu as its source (snap); avoid the Flathub one.
Split the pdf into .png files
gs -dBATCH -dNOPAUSE -sDEVICE=pnggray -r300 -dUseCropBox -sOutputFile=filename-%03d.png multipage.pdf
Launch scantailor-advanced
- For "New Project", select the folder with the .png files
- In the left menu, go carefully through each option, one by one: click its title, define the settings, and then press the play icon
- Use "Apply to"/"Change" with "All pages", especially in the last option, "Output"
Go to the output folder with the .tif files
Combine them with
convert *.tif Desired_Name.pdf
If that command fails because there are more than 50 pages, use something like this: https://pastebin.com/pTsggARx
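If you would rather not rely on an external paste, one possible workaround for the failure with many pages is to convert the pages in smaller batches and then join the partial PDFs. This is only a sketch and not necessarily what the linked script does. It assumes ImageMagick's convert and pdfunite (from poppler-utils) are installed, and the batch size of 50 and the part-NNNN.pdf names are illustrative choices.

```shell
#!/bin/sh
# Sketch: build the PDF in batches, then merge the parts.
# Assumptions: "convert" (ImageMagick) and "pdfunite" (poppler-utils) are installed.
# Note: any existing part-*.pdf files in this folder would be merged and then deleted.
batch_size=50
output="Desired_Name.pdf"

# Name of the n-th partial PDF, zero-padded so the parts sort correctly.
part_name() {
    printf 'part-%04d.pdf\n' "$1"
}

combine_in_batches() {
    part=0
    count=0
    set --                              # collect the current batch in "$@"
    for page in *.tif; do
        [ -e "$page" ] || continue      # skip if the glob matched nothing
        set -- "$@" "$page"
        count=$((count + 1))
        if [ "$count" -eq "$batch_size" ]; then
            part=$((part + 1))
            convert "$@" "$(part_name "$part")" || return 1
            set --
            count=0
        fi
    done
    if [ "$count" -gt 0 ]; then         # leftover pages in the last batch
        part=$((part + 1))
        convert "$@" "$(part_name "$part")" || return 1
    fi
    [ "$part" -gt 0 ] || { echo "no .tif files found" >&2; return 1; }
    pdfunite part-*.pdf "$output" && rm -f part-*.pdf
}

if command -v convert >/dev/null 2>&1 && command -v pdfunite >/dev/null 2>&1; then
    combine_in_batches || echo "nothing combined" >&2
else
    echo "convert or pdfunite not found; nothing done" >&2
fi
```

Run it from the output folder that holds the .tif files. Page order follows the shell's alphabetical sorting, which matches the zero-padded names produced by the gs command above.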
Do you mean skewed (as in, stretched in some way) or rotated?
I'm assuming you mean rotated, since I honestly don't think it's possible for your scanner to mess the image up that badly!
If you just need to rotate, I would recommend PDF-Shuffler, a GUI-based program that makes going through each page and rotating it as necessary a lot less painful. Have a look. I'm sure there are other programs that could do the same thing.
Unfortunately, I don't know of any software that can look over all the pages in your PDF and decide for you which ones need to be rotated, let alone transformed in some more complex way.
EDIT: If your file were a native PDF that could be converted into PostScript (.ps) format, I think there might be a way to auto-rotate pages using Ghostscript. However, to my knowledge, you can't do this with scanned pages, because the auto-rotate feature relies on interpreting the text direction, which can only come from a native PDF or PS document. I'm not completely sure; I will look into this a little more.
