577

I need to convert PDF pages to images. My file has a background image with some text on top of it, and when I save it as an image only the background image gets saved.

Is there any software that can convert the complete page to an image?

Zanna
  • 72,312

13 Answers

759

You can use pdftoppm from the poppler-utils package to convert a PDF to a PNG:

pdftoppm input.pdf outputname -png

This will output each page in the PDF using the format outputname-01.png, with 01 being the index of the page.

Converting a single page or a range of pages of the PDF

pdftoppm input.pdf outputname -png -f {page} -singlefile

Change {page} to the page number. It's indexed at 1, so -f 1 would be the first page.

If you'd like to work on a range of pages, you can also specify a number for the flag -l (last page), so having -f 1 -l 30 would specify the pages from 1 to 30.

Note again that .png will be appended to outputname automatically, so there's no need to include the extension. Also, -singlefile removes the -01 suffix cited above, since the output is known to have only one file.

Specifying the converted image's resolution

The default resolution for this command is 150 DPI. Increasing it will result in both a larger file size and more detail.

To increase the resolution of the converted PDF, add the options -rx {resolution} and -ry {resolution}. For example:

pdftoppm input.pdf outputname -png -rx 300 -ry 300
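Putting the options above together, a small wrapper could look like the following. This is only a sketch: the function name pdf_pages_to_png and its argument order are made up for illustration, and it uses nothing beyond the pdftoppm flags already shown in this answer.

```shell
# Hypothetical helper: render pages FIRST..LAST of a PDF to PNG at a given DPI.
# Usage: pdf_pages_to_png input.pdf outputname first last [dpi]
pdf_pages_to_png() {
    if [ "$#" -lt 4 ]; then
        echo "usage: pdf_pages_to_png input.pdf outputname first last [dpi]" >&2
        return 1
    fi
    # Fall back to the 150 DPI default mentioned above when no DPI is given.
    _dpi="${5:-150}"
    pdftoppm "$1" "$2" -png -f "$3" -l "$4" -rx "$_dpi" -ry "$_dpi"
}
```

For example, pdf_pages_to_png input.pdf outputname 1 30 300 would render pages 1 to 30 at 300 DPI.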
enzotib
  • 96,093
374

You can use ImageMagick for this. Note that newer versions of ImageMagick have disabled the ability to convert PDF files to images, because of security vulnerabilities that are being exploited in the wild. See the comments for more details and for a workaround.

  1. Install imagemagick by running:

    sudo apt install imagemagick
    
  2. Using a terminal where the PDF is located:

    • For the full document:

      convert -density 150 input.pdf -quality 90 output.png
      
    • For a single page:

      convert -density 150 input.pdf[666] -quality 90 output.png
      

Whereby:

  • PNG, JPG or (virtually) any other image format can be chosen.

  • -density xxx will set the DPI to xxx (common are 150 and 300).

  • -quality xxx will set the compression to xxx for PNG, JPG and MIFF file formats (100 means no compression).

  • [666] will convert only the 667th page to PNG (zero-based numbering so [0] is the 1st page).

  • All other options (such as trimming, grayscale, etc.) can be viewed on the ImageMagick website.
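Because the [N] selector is zero-based, it is easy to be off by one page. A minimal sketch of a wrapper that takes an ordinary 1-based page number and does the subtraction for you (the name convert_page and its arguments are hypothetical, and it assumes the convert command is allowed to read PDFs on your system):

```shell
# Hypothetical helper: convert one page, given as a 1-based page number.
# ImageMagick's [N] selector is zero-based, so subtract 1 before using it.
# Usage: convert_page input.pdf page_number output.png [dpi]
convert_page() {
    if [ "$#" -lt 3 ] || [ "$2" -lt 1 ]; then
        echo "usage: convert_page input.pdf page_number output.png [dpi]" >&2
        return 1
    fi
    _index=$(( $2 - 1 ))
    # Same options as the commands above; DPI defaults to 150.
    convert -density "${4:-150}" "${1}[${_index}]" -quality 90 "$3"
}
```

So convert_page input.pdf 667 output.png runs the same command as the single-page example above with [666].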

Flimm
  • 44,031
Binarylife
  • 16,662
33

IIRC GIMP is capable of using PDFs, i.e. converting them into images. So if you want to edit the images right away - GIMP is your friend.

tesseract
  • 486
20

The currently accepted answer does the job but results in an output which is larger in size and suffers from quality loss.

The method in the answer given here results in an output which is comparable in size to the input and doesn't suffer from quality loss.

TL;DR: use pdfimages: pdfimages -j input.pdf output

Quoting the linked answer:

It's not clear what you mean by "quality loss". That could mean a lot of different things. Could you post some samples to illustrate? Perhaps cut the same section out of the poor quality and good quality versions (as a PNG to avoid further quality loss).

Perhaps you need to use -density to do the conversion at a higher dpi:

convert -density 300 file.pdf page_%04d.jpg

(You can prepend -units PixelsPerInch or -units PixelsPerCentimeter if necessary. My copy defaults to ppi.)

Update: As you pointed out, gscan2pdf (the way you're using it) is just a wrapper for pdfimages (from poppler). pdfimages does not do the same thing that convert does when given a PDF as input.

convert takes the PDF, renders it at some resolution, and uses the resulting bitmap as the source image.

pdfimages looks through the PDF for embedded bitmap images and exports each one to a file. It simply ignores any text or vector drawing commands in the PDF.

As a result, if what you have is a PDF that's just a wrapper around a series of bitmaps, pdfimages will do a much better job of extracting them, because it gets you the raw data at its original size. You probably also want to use the -j option to pdfimages, because a PDF can contain raw JPEG data. By default, pdfimages converts everything to PNM format, and converting JPEG > PPM > JPEG is a lossy process.

So, try

pdfimages -j file.pdf page

You may or may not need to follow that with a convert to .jpg step (depending on what bitmap format the PDF was using).

I tried this command on a PDF that I had made myself from a sequence of JPEG images. The extracted JPEGs were byte-for-byte identical to the source images. You can't get higher quality than that.
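Since pdfimages only exports embedded bitmaps and ignores text and vector drawing, it can help to check what a PDF contains before choosing a tool. A rough sketch, assuming your pdfimages supports the -list option and prints two header lines before the per-image rows (the function names are made up for illustration):

```shell
# Hypothetical helper: count the embedded bitmap images in a PDF.
# Assumes `pdfimages -list` prints two header lines, then one line per image.
count_embedded_images() {
    pdfimages -list "$1" | tail -n +3 | wc -l | tr -d ' '
}

# Hypothetical helper: print a hint about which approach probably fits.
suggest_tool() {
    _n=$(count_embedded_images "$1")
    if [ "$_n" -eq 0 ]; then
        echo "no embedded images: render pages instead (e.g. pdftoppm or convert)"
    else
        echo "$_n embedded image(s): pdfimages -j can extract them as stored"
    fi
}
```

Note that a non-zero count does not mean the page is only an image. If there is text on top of a background image, as in the question, pdfimages will still drop the text, and rendering the page is the safer choice.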

12

If your PDFs are scanned, the images are already stored as part of the PDF. You simply need to extract them with pdfimages:

pdfimages my-file.pdf prefix 
VitoshKa
  • 264
5

If you only want to convert a specific page of a PDF to a PNG, you can pipe pdftk to convert (described above) like this:

pdftk document.pdf cat 12 output - | convert - document-page-12.png
IQAndreas
  • 3,298
3

pdftocairo file.pdf -png (was posted by Anthony Ebert as a comment at How to convert PDF to image?)

3

You can do this with ghostscript:

gs -dSAFER -dBATCH -dNOPAUSE -r300 -sDEVICE=png16m -dFirstPage=1 -dLastPage=1 -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -sOutputFile=output.png input.pdf

See https://www.ghostscript.com/doc/9.52/Devices.htm for details
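To render several pages instead of one, the same command can be wrapped in a function and given a %d pattern in the output file name so that each page goes to its own file. A sketch only: the function name gs_pages_to_png and its argument order are hypothetical, while the gs switches are the ones used above.

```shell
# Hypothetical helper: render pages FIRST..LAST to PNG files with Ghostscript.
# Usage: gs_pages_to_png input.pdf outprefix first last [dpi]
gs_pages_to_png() {
    if [ "$#" -lt 4 ]; then
        echo "usage: gs_pages_to_png input.pdf outprefix first last [dpi]" >&2
        return 1
    fi
    # %03d in the output name makes gs write one numbered file per page.
    gs -dSAFER -dBATCH -dNOPAUSE -r"${5:-300}" -sDEVICE=png16m \
       -dFirstPage="$3" -dLastPage="$4" \
       -dTextAlphaBits=4 -dGraphicsAlphaBits=4 \
       -sOutputFile="$2-%03d.png" "$1"
}
```

For example, gs_pages_to_png input.pdf output 2 4 renders pages 2 to 4 at 300 DPI.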

3

You can use convert and specify a higher density using the -density option.

E.g. convert -density 300 foo.pdf bar.png

Arjun
  • 139
3

To get a single page from gm convert, add [N] (with N the page number starting at 0) to the PDF name, i.e. gm convert foo.pdf[11] out.png to get the 12th page from the PDF.

For pdftoppm use -f N -singlefile, where N is the page number starting at 1, i.e. pdftoppm -png -f 12 -singlefile foo.pdf out for the same result. With -png, it appears to always add ".png" to the output filename and there is no way to stop this.

jkt123
  • 3,600
user3080602
  • 591
2

Master PDF Editor (ver 2.2) has this option built in. Open the PDF file and then go to File > Export to > Images. It presents a dialog where you can define different options for the output. Extremely useful. Hope this info helps.

Zanna
  • 72,312
Rush
  • 29
2

PDF Mod also allows exporting images of all or individual pages of PDF files.

  • Open PDF file in PDF Mod
  • Select page(s)
  • Edit > Export image(s)
Zanna
  • 72,312
nhylated
  • 471
1

For high-quality output, mutool does a great job if the output resolution is specified to a high value (e.g., above 250). mutool comes from the mupdf-tools package, associated with the MuPDF viewer. The command can also do the opposite task, converting png back to pdf.

mutool convert -O resolution=600 -o out-pdf.png in-pdf.pdf