7

What's the best way to search my file system on Ubuntu and get results almost instantly? I have used Catfish, Tracker, and the usual search tool provided with Ubuntu.

Tracker finds nothing, the Ubuntu search tool is too slow, and Catfish finds nothing most of the time. I have a lot of PDF and DjVu files that I want to access. On Windows, there is a program called Search Everything that returns results almost instantly. I want a similar Linux tool.

Please provide as detailed an answer as possible, as I'm a newbie in Linux. If such a tool doesn't exist for Ubuntu, what's the chance that I can find one in another Linux distribution, e.g. Mandriva or Red Hat?

Glutanimate
  • 21,763
Nabil
  • 71

7 Answers

10

Recoll can do this for you. It features full-text indexing for almost every document type you can imagine and a result overview sorted by page numbers for PDF documents.
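
To give a rough idea of what "full-text indexing" means, here is a minimal Python sketch of an inverted index: each word maps to the set of documents that contain it, so a query is a set lookup rather than a scan of every file. This is only an illustration and not how Recoll itself is implemented; the function names and the sample documents are made up, and real indexers first extract the text from PDF or DjVu files.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the documents that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)


# Hypothetical sample data standing in for text extracted from real files.
docs = {
    "thesis.pdf": "Spectral methods for graph partitioning",
    "notes.djvu": "Lecture notes on graph theory",
    "recipe.txt": "How to bake bread",
}
index = build_index(docs)
print(sorted(search(index, "graph")))         # both graph documents
print(sorted(search(index, "graph theory")))  # only the lecture notes
```

Building the index is the slow part, which is why a tool like this needs to index your files once before searches become fast.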

You can install it through the Software Center (search for Recoll) or get the newest version through the Recoll PPA (including a Unity lens/scope). First add the official Recoll repository:

sudo add-apt-repository ppa:recoll-backports/recoll-1.15-on
sudo apt-get update

If you are on Ubuntu 13.04 or below, you will have to install recoll-lens:

sudo apt-get install recoll recoll-lens

For Ubuntu 13.10 and up use unity-scope-recoll instead:

sudo apt-get install unity-scope-recoll

If this is the first time you are installing from a PPA, make sure you read these first:

What are PPAs and how do I use them?

Are PPA's safe to add to my system and what are some "red flags" to watch out for?

You will have to execute Recoll at least once to build your search index before being able to use the Recoll lens/scope.

More extensive documentation on how to use Recoll can be found here.

Glutanimate
  • 21,763
4

To search by file name only (ignoring content), you can use the locate tool. It is very fast.

locate '*.pdf'

will list all PDF files. See the manual page for more info.

$ locate --help
Usage: locate [OPTION]... [PATTERN]...

Search for entries in a mlocate database.

  -b, --basename         match only the base name of path names
  -c, --count            only print number of found entries
  -d, --database DBPATH  use DBPATH instead of default database (which is
                         /var/lib/mlocate/mlocate.db)
  -e, --existing         only print entries for currently existing files
  -L, --follow           follow trailing symbolic links when checking file
                         existence (default)
  -h, --help             print this help
  -i, --ignore-case      ignore case distinctions when matching patterns
  -l, --limit, -n LIMIT  limit output (or counting) to LIMIT entries
  -m, --mmap             ignored, for backward compatibility
  -P, --nofollow, -H     don't follow trailing symbolic links when checking file
                         existence
  -0, --null             separate entries with NUL on output
  -S, --statistics       don't search for entries, print statistics about each
                         used database
  -q, --quiet            report no error messages about reading databases
  -r, --regexp REGEXP    search for basic regexp REGEXP instead of patterns
      --regex            patterns are extended regexps
  -s, --stdio            ignored, for backward compatibility
  -V, --version          print version information
  -w, --wholename        match whole path name (default)
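
The reason locate is so fast is that it queries a prebuilt database instead of walking the disk on every search. The database is refreshed periodically by updatedb, so a brand-new file may not show up until you run sudo updatedb. The Python sketch below imitates that idea on a small scale: walk the tree once, keep the paths in memory, then answer queries from the list. It is only an illustration, not how mlocate is implemented; build_name_index and query are hypothetical names, and the matching roughly mimics locate -b -i.

```python
import fnmatch
import os
import tempfile


def build_name_index(root):
    """Walk the tree once and record every file path (the slow step)."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    return paths


def query(paths, pattern):
    """Match the pattern against base names only, ignoring case."""
    pattern = pattern.lower()
    return [p for p in paths
            if fnmatch.fnmatchcase(os.path.basename(p).lower(), pattern)]


# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "books"))
    for name in ("a.pdf", os.path.join("books", "B.PDF"), "c.txt"):
        open(os.path.join(root, name), "w").close()
    index = build_name_index(root)
    matches = query(index, "*.pdf")
    print(sorted(os.path.relpath(p, root) for p in matches))
```

Once the list is built, each query only scans strings in memory, which is why repeated searches feel instant.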
Volker Siegel
  • 13,295
Anwar
  • 77,855
1

I also do a lot of searching through very large libraries of PDFs. For me, this is the #1 frustration of Linux that makes me miss MS Windows. I've tried it all at this point, and the solution I have settled on for now is to use the following programs in combination.

Unfortunately, none of these seem to be in the Ubuntu repositories at the moment, and they may be unstable. So if Recoll (now in the default repository for Ubuntu 14.04, I believe) or something else works for you, it is better to stick with that.

1) Synapse

Installation: Read this post for details, but basically you can install it by running the following commands in a terminal.

sudo apt-add-repository ppa:synapse-core/testing
sudo apt-get update
sudo apt-get install synapse

Positive

  • Very fast, smart search results
  • If what you want doesn't come up right away, you can press down and tab to find more with "locate".

Negative

  • Only searches filenames, not text inside.
  • Seems to miss a lot, especially before you try "locate".

2) Launchy

Installation: Download the package here.

Positive:

  • Almost as fast as Synapse
  • Results are very comprehensive.

Negative:

  • Also only searches filenames.
  • Probably the buggiest of these three.

3) DocFetcher

Installation: Unless you can find it in a repository somewhere, you are stuck with the portable version. Download it here and follow the instructions.

Positive:

  • Searches inside the text of your PDFs
  • Comprehensive but relevant results, in a logical order (I usually find the results in Recoll or Tracker to be completely screwy in comparison)
  • Full document preview pane so you can see more of the file before you open it (not just a few lines)
  • Reasonably fast

Negative:

  • Hard to install and run natively in Ubuntu (e.g. without Java runtime)
  • Much slower than the apps that only search filenames

Hopefully Dash will catch up and make all of this obsolete, but in the meantime these three are mostly what I am using.

Other options maybe worth trying:

  • Gnome-Do might be a worthy alternative to Synapse, but last I checked it can only index 5000 files, and that is not enough for me
  • pdfgrep is sometimes useful but slow and has no GUI that I am aware of
Brian Z
  • 712
0

You can also use gnome-search-tool. You can install it with:

sudo apt-get install gnome-search-tool

Raja G
  • 105,327
  • 107
  • 262
  • 331
0

The following Python code returns search results very quickly. Just change the second parameter in fnmatch.fnmatch(file, '*.txt') to whatever you are looking for. Note that os.listdir('.') only lists the current directory, so subdirectories are not searched.

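import fnmatch
import os

# List entries in the current directory only (not recursive).
for file in os.listdir('.'):
    # Keep names matching the shell-style pattern.
    if fnmatch.fnmatch(file, '*.txt'):
        print(file)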
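
The snippet above only looks at one directory. If you want to search subdirectories as well, a possible variation is to combine os.walk with fnmatch.filter, as in the sketch below. The find_files helper is a made-up name for illustration, and the demo runs on a temporary directory so it does not depend on your files. Keep in mind that this walks the disk on every call, so on a large tree it will be slower than an indexed tool such as locate.

```python
import fnmatch
import os
import tempfile


def find_files(root, pattern):
    """Yield paths under root whose file name matches the glob pattern."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)


# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("top.txt", os.path.join("sub", "deep.txt"), "skip.pdf"):
        open(os.path.join(root, rel), "w").close()
    found = sorted(os.path.relpath(p, root) for p in find_files(root, "*.txt"))
    print(found)
```

To search your own files, call find_files with a real starting directory, for example find_files(os.path.expanduser('~'), '*.pdf').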
noel
  • 304
  • 1
  • 14
0

Another option is Synapse, which integrates Zeitgeist results.
I have a lot of documents on my system, and was surprised at how fast Synapse was able to find the files I need.

sudo apt-get install synapse

cheers

DrewG
  • 1
0

For a command-line option, "The Silver Searcher" (ag) is, in my opinion, simply the best. It is far faster than find and awk, and has simpler usage:

ag <pattern> [path]

Install on Ubuntu 14.04:

sudo apt-get install silversearcher-ag

Take a look at the speed comparisons in the project README:

https://github.com/ggreer/the_silver_searcher