I have been using pdfimages to extract images from PDFs. It extracts all of the following types of images:

image - an opaque image

mask - a monochrome mask image

smask - a soft-mask image

stencil - a monochrome mask image used for painting a color or pattern

How can I extract only the opaque images and exclude the mask, smask and stencil images?

1 Answer

I know that I'm late to the party, but here are my two cents:

1) Extract all the images (regular images and smask images) with pdfimages

pdfimages -j file.pdf images/image

2) Obtain the smask image numbers and remove the corresponding files (beware of the zero-padded numbers in the file names)

pdfimages -list file.pdf  | grep smask | column -t|awk '{print $2}' | xargs -I '{}' printf "%03d\n" '{}' | xargs -I '{}' rm images/image-'{}'.ppm
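
The same idea can be extended to mask and stencil images as well. What follows is only a sketch, not a tested recipe: it assumes that `pdfimages -list` prints two header lines followed by one row per image, with the image number in the second column and the type in the third. The function names `list_non_opaque` and `remove_listed` are made up for this example. Files are matched by number with any extension, since the extension depends on the image encoding and the options used.

```shell
#!/bin/sh

# Read `pdfimages -list` output on stdin and print the zero-padded number
# of every entry whose type is not "image" (i.e. mask, smask, stencil).
# Assumption: two header lines, number in column 2, type in column 3.
list_non_opaque() {
    awk 'NR > 2 && $3 != "image" { printf "%03d\n", $2 }'
}

# Read zero-padded numbers on stdin and delete the matching extracted
# files, whatever their extension. $1 is the prefix given to pdfimages.
remove_listed() {
    prefix=$1
    while read -r num; do
        rm -f -- "$prefix-$num".*
    done
}

# Hypothetical usage, after the extraction from step 1:
#   pdfimages -list file.pdf | list_non_opaque | remove_listed images/image
```

Filtering on the type column with awk avoids accidental matches that `grep smask` could get from other columns, and the `NR > 2` test skips the header.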
Juanan