I'm able to view an epub file in, say, okular, select all the text and copy-paste into a text editor. I'd like a command line method - anyone know of such a thing?
5 Answers
I don't know if Calibre is worth installing for your job, but if you have it you could use the powerful ebook converter:
ebook-convert input.epub output.txt
The output format is deduced from the output file's extension.
I imagine there are also XML tools or scripts (XSLT, for instance) that can turn an epub into text, since an epub is basically XHTML in a ZIP archive.
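To illustrate the "XHTML in a ZIP archive" point, here is a minimal sketch using only the Python standard library. It is not from any of the tools mentioned here. The names epub_to_text and TextExtractor are made up for this example. It assumes a DRM-free epub whose content documents are UTF-8 XHTML, and it reads META-INF/container.xml and the OPF spine to get the chapters in reading order.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile
from html.parser import HTMLParser
from urllib.parse import unquote

# XML namespaces used by the EPUB container file and the OPF package file.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping <head>, <script> and <style>."""

    SKIP_TAGS = {"head", "script", "style"}
    BLOCK_TAGS = {"p", "div", "br", "li", "tr", "blockquote",
                  "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            # End of a block-level element: start a new line.
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def epub_to_text(path):
    """Return the text of an epub, chapters in spine (reading) order."""
    with zipfile.ZipFile(path) as zf:
        # container.xml points at the OPF package file.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", CONTAINER_NS).get("full-path")
        opf_dir = posixpath.dirname(opf_path)

        package = ET.fromstring(zf.read(opf_path))
        # The manifest maps item ids to file paths relative to the OPF file.
        manifest = {
            item.get("id"): item.get("href")
            for item in package.findall(".//opf:manifest/opf:item", OPF_NS)
        }

        chapters = []
        for itemref in package.findall(".//opf:spine/opf:itemref", OPF_NS):
            href = manifest.get(itemref.get("idref"))
            if href is None:
                continue
            member = posixpath.normpath(posixpath.join(opf_dir, unquote(href)))
            parser = TextExtractor()
            parser.feed(zf.read(member).decode("utf-8", errors="replace"))
            parser.close()
            chapters.append("".join(parser.parts).strip())
    return "\n\n".join(chapters)
```

Calling print(epub_to_text("input.epub")) from a small script and redirecting stdout gives roughly what the copy-paste approach does. It makes no attempt to re-flow text or handle footnotes or tables, so the dedicated tools are likely to produce cleaner output.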
An alternative is epub2txt by Kevin Boone, available on Github.
epub2txt is a simple command-line utility for extracting text from EPUB documents and, optionally, re-flowing it to fit a text display of a particular number of columns. It is written entirely in ANSI-standard C.
Usage example:
epub2txt input.epub > output.txt
MuPDF can convert from epub to html and txt. To install it:
sudo apt install mupdf mupdf-tools
To use it:
mutool convert -o somefilename.txt somefilename.epub
The output format (txt here) is inferred from the extension of the file name given to the -o option.
See mutool convert documentation for more information.
Maybe Calibre can suit your needs.
See What formats does calibre support conversion to/from? for information on supported formats.
To convert an epub document to plain text from terminal:
pandoc input.epub | lynx --stdin --dump > output.txt
This assumes pandoc and lynx are already installed. With no output format specified, pandoc writes HTML to standard output, which lynx then renders as plain text.