
I have PDF files that I need to convert to another format. The files contain images and text, and I need the pictures.

I tried converting with AbiWord, but unfortunately the result contains only the text.

Command I use:

abiword --to=doc file.pdf

I think the "odt" format would be best, unless there's another way to extract the same pictures.

Zanna

2 Answers


If there are only a few figures to extract from the PDF file, you can use a GUI-based method, which requires an image editor (GIMP is likely already installed). Here is a detailed walkthrough (excuse me if you are an expert on this, but it might be a useful reference for others):

  • Open the PDF file and zoom in on one picture so that it covers most of the screen (for example, in Evince, press F11 and adjust the zoom). The larger the picture is on screen, the better the quality of the extracted file.
  • Press the Print Screen key on the keyboard.
  • When prompted, select Open with: and choose your favorite photo editor (probably GIMP).
  • Use the rectangle selection tool to select the area of the picture you want to extract. To enable this tool, use the Toolbox panel or press R. Once you have selected the area, copy it with Ctrl+C and paste it as a new image with Shift+Ctrl+V. Then select Export As from the File menu, or press Shift+Ctrl+E, and save the picture in any format you like by changing the file extension or choosing a type from the menu at the bottom.

If you have many pictures to extract, or you prefer a command-line method, use the pdfimages tool. To use it, install the poppler-utils package (you might have it already):

sudo apt-get install poppler-utils

Then open a terminal window, go to the folder where your PDF file is, and run:

pdfimages -j file.pdf photo

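This will extract the pictures from file.pdf and save them in the current folder as photo-000.jpg, photo-001.jpg, etc. The -j option writes images that are stored as JPEG inside the PDF as .jpg files; any other images are saved as .ppm or .pbm files.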
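Since the question mentions several PDF files, the same command can be wrapped in a loop. The following is only a minimal sketch: the function name extract_all, the "<name>-images" folder layout, and the "photo" prefix are illustrative assumptions, not part of pdfimages itself.

```shell
#!/bin/sh
# Sketch: run pdfimages over every PDF in a folder (default: current folder).
# extract_all, the "<name>-images" folders and the "photo" prefix are
# illustrative choices, not something pdfimages requires.
extract_all() {
    dir=${1:-.}
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue    # skip the unexpanded pattern when no PDF exists
        out="$dir/$(basename "$pdf" .pdf)-images"
        mkdir -p "$out"
        # -j writes JPEG-encoded images as .jpg; other images come out as .ppm/.pbm
        pdfimages -j "$pdf" "$out/photo"
    done
}

extract_all "$@"
```

Saved as a script, it can be run with no argument to process the current folder, or with a folder path as its only argument. To check what a file contains before extracting anything, pdfimages -list file.pdf prints a table of the embedded images.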

Zanna

Open the PDF file with OpenOffice/LibreOffice, delete the text, and save the file as ODT. ODT stands for OpenDocument Text; it is the native text-document format of OpenOffice and LibreOffice.

ipse lute