5

I am an electronics engineer and I regularly view PDF schematics. I often want to search a schematic for a single component, e.g. "R1".

The problem is that searching for "R1" matches all the R[tens] and R[hundreds] on the schematic as well. So I would like to be able to use a regex in my search, or at least have tighter control of the search (e.g. search whole word only).

Has anyone here found a good PDF tool on Ubuntu which supports these features?

αғsнιη
  • 36,350

3 Answers

3

Install pdfgrep:

sudo apt-get install pdfgrep

Then use the -C option together with word-boundary anchors:

pdfgrep -C 0 '\<WORD\>' file.pdf

or use \b...\b instead of \<...\>.

From man pdfgrep:

-C, --context NUM
      Print at most NUM characters of context around each match.
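To see what the word-boundary anchors buy you without needing a PDF at hand, here is a small sketch that runs the same pattern through grep -E on a few made-up lines. The sample component list and the file name schematic.pdf are placeholders, and the sketch assumes grep and pdfgrep interpret \< and \> the same way.

```shell
# Made-up lines standing in for the text layer of a schematic (hypothetical data).
sample=$(printf 'R1 10k\nR10 4k7\nR100 1k\nC1 100n\n')

# A plain search for R1 also hits the lines with R10 and R100.
plain_count=$(printf '%s\n' "$sample" | grep -cE 'R1')

# With word boundaries, only the line with the standalone R1 matches.
whole_word_count=$(printf '%s\n' "$sample" | grep -cE '\<R1\>')

echo "plain: $plain_count, whole word: $whole_word_count"

# The corresponding pdfgrep call on a real file (schematic.pdf is a placeholder);
# -n additionally prints the page number of each match.
# pdfgrep -n -C 0 '\<R1\>' schematic.pdf
```

The plain pattern counts three lines and the anchored one counts a single line, which is the behaviour wanted for component designators.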

If you prefer a graphical viewer, I also found JPedal (30-day trial). Download it and launch it from the command line:

java -jar jpedal-trial.jar

Now press Ctrl+F, type the word you want to search for, and tick "Find Whole Words Only" in the menu behind the down-arrow icon to match whole words only.

αғsнιη
  • 36,350
3

If you are fine with creating an index of your documents you could use Recoll which is a full-on desktop search engine. For screenshots and installation instructions please take a look at this answer.

Recoll searches are constructed using a powerful query language that supports wildcards and modifiers (e.g. proximity and slack).

For instance, the query "R1"l would only yield whole-word results. This is because the l modifier turns off stemming. (In this specific example you wouldn't even need the modifier because Recoll doesn't expand sequences of numbers by default).

Glutanimate
  • 21,763
1

If the problem is just to limit the search to whole words, that is easy enough. Just add spaces before and after your search string, like so: " R1 ". I use this trick in Evince all the time.
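One caveat is worth illustrating: a literal " R1 " only matches when there really is a space on both sides. The sketch below imitates the trick with grep -F on made-up text; it is not Evince itself, and how a viewer treats line starts and punctuation depends on how the PDF's text layer was produced.

```shell
# Made-up lines (hypothetical data): R1 at a line start, between spaces,
# and followed by a comma.
sample=$(printf 'R1 10k\nsee R1 and R10\nreplace R1, then R2\n')

# Fixed-string search for " R1 " needs a space on both sides.
space_count=$(printf '%s\n' "$sample" | grep -cF ' R1 ')

# A word-boundary search finds the standalone R1 on every line.
boundary_count=$(printf '%s\n' "$sample" | grep -cE '\<R1\>')

echo "spaces: $space_count, boundaries: $boundary_count"
```

Here the space-padded search finds one of the three lines, so the trick is quick but can miss designators next to punctuation or at the edge of a line.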

Brian Z
  • 712