46

I saw this one-liner recently:

$ ps -ef | grep [f]irefox 

thorsen   16730     1  1 Jun19 ?        00:27:27 /usr/lib/firefox/firefox ...

So it seems to return the list of processes with "firefox" in the data but leaving out the grep process itself, and therefore seems roughly equivalent to:

ps -ef |grep -v grep| grep firefox

I can't understand how it works though. I've looked at the man page on grep and elsewhere but haven't found an explanation.

And to compound the mystery if I run:

$ ps -ef | grep firefox  > data
$ grep [f]irefox data

thorsen   15820 28618  0 07:28 pts/1    00:00:00 grep --color=auto firefox
thorsen   16730     1  1 Jun19 ?        00:27:45 /usr/lib/firefox/firefox ....

the [t]rick seems to stop working!

Someone here will know what's going on I'm sure.

Thanks.

ish
  • 141,990
Thorsen
  • 866

3 Answers

68

The square bracket expression is part of grep's character class pattern matching.

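The grep program by default understands POSIX basic regular expressions. With those you can define character classes, each of which matches exactly one character from the listed set. For example, ps -ef | grep [ab9]irefox would find "airefox", "birefox", or "9irefox" if those existed. The class never matches the two characters "ab" together; a line containing "abirefox" would still be printed, but only because it contains "birefox".

The command grep [a-zA-Z0-9]irefox would likewise find any line in which a single letter or digit is followed by "irefox".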
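To see the "exactly one character" behaviour without depending on what is running on your machine, here is a small sketch that feeds grep a made-up word list instead of ps output. The words are invented for illustration; -c counts matching lines and -x requires the pattern to match the whole line.

```shell
# Made-up input: one candidate word per line.
words='airefox
birefox
9irefox
abirefox
firefox'

# Default behaviour: a line matches if any substring matches the pattern.
substring_hits=$(printf '%s\n' "$words" | grep -c '[ab9]irefox')

# With -x the pattern must match the entire line.
whole_line_hits=$(printf '%s\n' "$words" | grep -c -x '[ab9]irefox')

echo "substring matches:  $substring_hits"
echo "whole-line matches: $whole_line_hits"
```

The substring count includes "abirefox" because it contains "birefox"; the whole-line count does not. "firefox" matches neither way, since "f" is not in the class.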

So ps -ef | grep firefox searches for lines with firefox in them. Since the grep process's own command line contains "firefox", grep finds that line as well. By adding the brackets, we search for the character class "[f]" (which consists of only the letter "f" and is therefore equivalent to a plain "f") followed by "irefox". The advantage of the brackets is that the literal string "firefox" no longer appears in the grep command line. Hence, grep itself won't show up in the grep result.
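The effect can be reproduced with canned text standing in for the ps output. This is only a sketch: the two-line "process lists" below are shortened, made-up versions of the lines in the question.

```shell
# Simulated ps output when the filter was started as: grep firefox
plain=$(printf '%s\n' \
  'thorsen 16730 /usr/lib/firefox/firefox' \
  'thorsen 15820 grep firefox' | grep -c 'firefox')

# Simulated ps output when the filter was started as: grep [f]irefox
bracketed=$(printf '%s\n' \
  'thorsen 16730 /usr/lib/firefox/firefox' \
  'thorsen 15820 grep [f]irefox' | grep -c '[f]irefox')

echo "lines matched with plain pattern:     $plain"
echo "lines matched with bracketed pattern: $bracketed"
```

In the second case the text "grep [f]irefox" has a "]" between the "f" and the "i", so it never contains the seven characters "firefox" in a row.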

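Because not very many people are familiar with square brackets as character classes, or with regular expressions in general, the second result might look a little mysterious. There, the file data was written by a plain grep firefox, so it already contains that grep's own line, and [f]irefox matches the "firefox" in it.

If you want to fix the second result, use the brackets in the first command instead: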

ps -ef | grep [f]irefox  > data
grep firefox data
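As a rough illustration of why the saved file behaves differently, the sketch below writes two shortened, made-up lines (modelled on the output in the question) to a temporary file and then searches it with the bracketed pattern.

```shell
# Temporary stand-in for the "data" file from the question.
data=$(mktemp)

# What a plain "ps -ef | grep firefox > data" would have captured:
# the real process plus the grep process itself.
printf '%s\n' \
  'thorsen 15820 grep --color=auto firefox' \
  'thorsen 16730 /usr/lib/firefox/firefox' > "$data"

# The bracketed pattern still matches both stored lines.
stored_hits=$(grep -c '[f]irefox' "$data")
rm -f "$data"

echo "lines matched in the saved file: $stored_hits"
```

The brackets only help when they appear in the command line that ps sees, not when searching text that was captured earlier.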

(Reference)

dessert
  • 40,956
jokerdino
  • 41,732
13

The reason is that the string

grep firefox

matches the pattern firefox, but the string

grep [f]irefox

does not match the pattern [f]irefox (which is equivalent to the pattern firefox).

That's why the first grep matches its own process command line, while the second doesn't.

0

Daniel's answer is spot-on, but there's one interesting complication brought to mind by jokerdino's (largely incorrect) answer about shell escaping.

First of all, notice that ps's unfiltered output will contain a line corresponding to the grep process launched by your shell. If you're running grep firefox at the moment you run ps, you'll see it in the output:

$ ps
thorsen   15820 28618  0 07:28 pts/1    00:00:00 grep firefox
thorsen   23983     1  1 Jun19 ?        00:12:34 some other process ....

If you then take ps's output and filter it through that grep process — grepping ps's output for strings that match the regex firefox — then, well, that line will match!

$ ps | grep firefox
thorsen   15820 28618  0 07:28 pts/1    00:00:00 grep firefox
                                                      ^^^^^^^ Found it!

But if you launched grep with arguments that do not match the regex you're grepping for, then that line of ps's output will not match the regex.

$ ps | grep 'f[ij]refo*x'

The unfiltered output will contain a line like

thorsen   15820 28618  0 07:28 pts/1    00:00:00 grep f[ij]refo*x

but the filtered output won't, because that line doesn't contain any substrings matching the regex f[ij]refo*x. (That line doesn't contain firefx, or fjrefx, or firefox, or fjrefoox, or...)
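If you want to check that claim without a live ps, here is a sketch that runs the same regex over a hand-written list. The candidate strings are the ones named above plus the grep command line itself.

```shell
# Strings the regex should match, followed by the grep command line.
candidates='firefx
fjrefx
firefox
fjrefoox
grep f[ij]refo*x'

# Count how many of the five lines match.
matched=$(printf '%s\n' "$candidates" | grep -c 'f[ij]refo*x')

# Test the command line on its own (grep -c prints 0 when nothing matches).
self=$(printf '%s\n' 'grep f[ij]refo*x' | grep -c 'f[ij]refo*x')

echo "matching lines: $matched"
echo "command line matches itself: $self"
```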

But, as jokerdino pointed out, there can be something else going on here, too, because bracket characters are also magic to most shells. When you write

ls foo*.[ch]

the Bash shell actually looks at what files are available in your current working directory and expands that glob into, like,

ls foo.c foobar.c foobar.h

If you don't want shell globbing to happen, then you must backslash-escape the magic special characters *, [, ] or enclose them in single-quotes:

$ ls foo*.[ch]
foo.c   foobar.c        foobar.h

$ ls 'foo.[ch]'
ls: foo.[ch]: No such file or directory

Globbing also becomes a no-op if Bash can't find any matching files in the current directory:

$ rm foo*.[ch]
$ ls foo*.[ch]
ls: foo*.[ch]: No such file or directory
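The three cases (glob expands, glob is quoted, glob matches nothing) can be tried safely in a scratch directory. This is a sketch assuming Bash or a POSIX sh with default options; with nullglob or failglob enabled, or in zsh, the unmatched case behaves differently. The variable names are made up for the demo.

```shell
# Work in a scratch directory so the demo cannot touch real files.
old_dir=$PWD
scratch=$(mktemp -d)
cd "$scratch" || exit 1

touch foo.c foobar.c foobar.h

set -- foo*.[ch]        # unquoted: the shell expands the glob
matched_count=$#

set -- 'foo*.[ch]'      # quoted: passed through literally
quoted_arg=$1

rm -f foo.c foobar.c foobar.h
set -- foo*.[ch]        # nothing matches now, so the pattern is left as-is
unmatched_arg=$1

cd "$old_dir" && rm -rf "$scratch"

echo "files matched:   $matched_count"
echo "quoted argument: $quoted_arg"
echo "unmatched glob:  $unmatched_arg"
```

Using set -- here just captures what the shell would hand to a command such as ls as its arguments.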

So, when you wrote

$ grep [f]irefox

without any single-quotes, it caused grep to look for lines matching the regex [f]irefox precisely because there was no file matching the glob [f]irefox in your current working directory! This doesn't relate to your actual observations, but it's interesting to note that you could have observed the following behavior:

$ cd /usr
$ ps -ef | grep [f]irefox 
thorsen   16730     1  1 Jun19 ?        00:27:27 /usr/lib/firefox/firefox ....

$ cd /usr/lib
$ ps -ef | grep [f]irefox
thorsen   15820 28618  0 07:28 pts/1    00:00:00 grep --color=auto firefox
thorsen   16730     1  1 Jun19 ?        00:27:27 /usr/lib/firefox/firefox ....

In the second case here, because the current directory has an entry named firefox, the unquoted argument [f]irefox is expanded by Bash before grep gets to see it, and you end up grepping for the regex firefox instead of [f]irefox. The solution would be to add single-quotes:

$ ps -ef | grep '[f]irefox'
thorsen   16730     1  1 Jun19 ?        00:27:27 /usr/lib/firefox/firefox ....
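Here is a sketch of what grep would actually receive as its pattern in each situation, again in a throwaway directory rather than /usr/lib. It assumes Bash or a POSIX sh with default globbing options; the empty file named firefox stands in for the /usr/lib/firefox directory entry.

```shell
old_dir=$PWD
scratch=$(mktemp -d)
cd "$scratch" || exit 1

set -- [f]irefox        # no file named firefox here: pattern stays intact
arg_without_file=$1

: > firefox             # create an entry named firefox, as in /usr/lib
set -- [f]irefox        # now the shell expands the glob before grep runs
arg_with_file=$1

set -- '[f]irefox'      # single quotes keep the brackets in every case
arg_quoted=$1

cd "$old_dir" && rm -rf "$scratch"

echo "no matching file:   $arg_without_file"
echo "matching file:      $arg_with_file"
echo "quoted, with file:  $arg_quoted"
```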

I recommend adding single-quotes around every argument, to any command, that includes "shell metacharacters" such as *, [, (, {, =, ,, ;, etc., and especially around regexes!