183
System information as of Fri Mar  9 19:40:01 KST 2012

  System load:    0.59               Processes:           167
  Usage of /home: 23.0% of 11.00GB   Users logged in:     1
  Swap usage:     0%                 IP address for eth1: 192.168.0.1

  => There is 1 zombie process.

  Graph this data and manage this system at https://landscape.canonical.com/

10 packages can be updated.
4 updates are security updates.

Last login: Fri Mar  9 10:23:48 2012
a@SERVER:~$ ps auxwww | grep 'Z'
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
usera     13572  0.0  0.0   7628   992 pts/2    S+   19:40   0:00 grep --color=auto Z
a@SERVER:~$ 

How to find that zombie process?

muru
Pablo

9 Answers

223

To kill a zombie (process) you have to kill its parent process (just like real zombies!), but the question was how to find it.

Find the zombie (The question answered this part):

a@SERVER:~$ ps aux | grep 'Z'

This prints zombies along with every other line that contains a capital Z, so the grep command itself shows up too:

USER       PID     %CPU %MEM  VSZ    RSS TTY      STAT START   TIME COMMAND
usera      13572   0.0  0.0   7628   992 pts/2    S+   19:40   0:00 grep --color=auto Z
usera      93572   0.0  0.0   0      0   ??       Z    19:40   0:00 something

Find the zombie's parent:

a@SERVER:~$ pstree -p -s 93572

Will give you:

init(1)---cnid_metad(1311)---cnid_dbd(5145)

In this case you probably do not want to kill that parent process, and a single zombie is harmless, but killing the immediate parent process (5145) should get rid of it: the orphaned zombie is then adopted by init, which reaps it.
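If pstree is not installed, the same chain can be read from plain ps output. The following is only a sketch: ancestors is a hypothetical helper name, and it assumes its input is a two-column PID/PPID table such as the one printed by ps axo pid=,ppid=.

```shell
#!/bin/sh
# ancestors PID: read "PID PPID" rows on stdin and print the chain from
# PID up to the topmost known ancestor. The helper name is illustrative.
ancestors() {
    awk -v pid="$1" '
        { parent[$1] = $2 }              # remember each process parent
        END {
            # Walk upwards until we reach a PID that is not in the table
            # (PID 0 is normally not listed by ps).
            while (pid in parent && pid != 0) {
                printf "%s%s", sep, pid
                sep = " <- "
                pid = parent[pid]
            }
            print ""
        }'
}
```

Running ps axo pid=,ppid= | ancestors 93572 would print the zombie's PID followed by each ancestor, ending at PID 1.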


Duncanmoo
63

Even though this question is old I thought everyone deserved a more reliable answer:

ps axo pid=,stat=

This will emit two whitespace-delimited columns, the first of which is a PID and the second of which is its state.

I don't think even GNU ps provides a way to filter by state directly, but you can reliably do this with awk:

ps axo pid=,stat= | awk '$2~/^Z/ { print }'

You now have a list of PIDs which are zombies. Since you know the state it's no longer necessary to display it, so that can be filtered out.

ps axo pid=,stat= | awk '$2~/^Z/ { print $1 }'

Giving a newline-delimited list of zombie PIDs.

You can now operate on this list with a simple shell loop:

for pid in $(ps axo pid=,stat= | awk '$2~/^Z/ { print $1 }') ; do
    echo "$pid" # do something interesting here
done

ps is a powerful tool and you don't need to do anything complicated to get process information out of it.

(Meaning of different process states here - https://unix.stackexchange.com/a/18477/121634)
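Since a zombie only goes away once its parent deals with it, it can help to ask ps for the parent PID in the same pass. This is a sketch under assumptions: zombies_with_parents is a made-up helper name, and it expects the four columns produced by ps axo pid=,ppid=,stat=,comm= on its standard input.

```shell
#!/bin/sh
# zombies_with_parents: read "PID PPID STAT COMM" rows on stdin and print
# "PID PPID COMM" for every row whose state starts with Z.
# Only the first word of COMM is kept, so names containing spaces are cut.
zombies_with_parents() {
    awk '$3 ~ /^Z/ { print $1, $2, $4 }'
}
```

Typical use would be ps axo pid=,ppid=,stat=,comm= | zombies_with_parents, which gives one line per zombie with its parent next to it.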

Alex Punnen
Sorpigal
9

Less is more though:

ps afuwwx | less +u -p'^(\S+\s+){7}Z.*'

That is: give me a forest (tree) of all users' processes, in user-oriented format, with unlimited width, on any tty, and open it half a screen above the first line whose 8th column starts with a Z, and why not highlight the whole line.

User oriented format seems to mean: USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND so the Zombie status will show up in the 8th column.

You can throw in an N before the p if you want line numbers, and a J if you want an asterisk at the match. Sadly if you use G to not highlight the line that asterisk will not show, though J creates space for it.

You end up getting something that looks like:

…
  root      2919  0.0  0.0  61432  5852 ?      Ss Jan24 0:00 /usr/sbin/sshd -D
  root     12984  0.0  0.1 154796 15708 ?      Ss 20:20 0:00  \_ sshd: lamblin [priv]
  lamblin  13084  0.0  0.0 154796  9764 ?      S  20:20 0:00      \_ sshd: lamblin@pts/0
* lamblin  13086  0.0  0.0  13080  5056 pts/0  Z  20:20 0:00          \_ -bash <defunct>
  lamblin  13085  0.0  0.0  13080  5056 pts/0  Ss 20:20 0:00          \_ -bash
  root     13159  0.0  0.0 111740  6276 pts/0  S  20:20 0:00              \_ su - nilbmal
  nilbmal  13161  0.2  0.0  13156  5004 pts/0  S  20:20 0:00                  \_ -su
  nilbmal  13271  0.0  0.0  28152  3332 pts/0  R+ 20:20 0:00                      \_ ps afuwwx
  nilbmal  13275  0.0  0.0   8404   848 pts/0  S+ 20:20 0:00                      \_ less +u -Jp^(\S+\s+){7}Z.*
…

You could follow this up with (and it'll detect if your terminal likes -U Unicode or -A Ascii):

pstree -psS <PID LIST>

OR just, you know, use the up-arrow in less to follow that tree/forest through the hierarchy; which is what I was recommending with the "Less is more" approach.

dlamblin
7

I usually find them on my server with

ps aux | grep 'defunct'
Marco
3

ps aux | awk '{ print $8 " " $2 }' | grep -w Z

From: http://www.cyberciti.biz/tips/killing-zombie-process.html

From the comments an improved one:

for p in $(ps jauxww | grep Z | grep -v PID | awk '{print $3}'); do
    for every in $(ps auxw | grep $p | grep cron | awk '{print $2}'); do
        kill -9 $every;
    done;
done;

Careful though: this one does not just list anything, it also kills the matching cron processes with kill -9.
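Before killing anything, a dry run that only prints the parent PIDs is safer. The sketch below is an illustration, not the cyberciti script: zombie_parents is a hypothetical helper, and it assumes two-column input as printed by ps axo ppid=,stat=.

```shell
#!/bin/sh
# zombie_parents: read "PPID STAT" rows on stdin and print each parent PID
# that has at least one zombie child, once, in ascending numeric order.
# Nothing is signalled here; the output is only a list to review.
zombie_parents() {
    awk '$2 ~ /^Z/ { print $1 }' | sort -un
}
```

Run it as ps axo ppid=,stat= | zombie_parents and inspect each PID (for example with ps -fp) before deciding whether to restart or kill that parent.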

Neil
Rinzwind
1

While both dlamblin's and Sorpigal's answers are excellent and accomplish the job nicely, I just wanted to document how to use awk to find the STAT column when switching between ps output formats, instead of hard-coding it:

ps au | awk '{
    if (NR==1) {
        for (i=1; i<=NF; i++) {
            if ($i=="STAT")
                stat=i
        };
        print
    } else if ($stat~/^Z/)
        print
}'

This will list all processes in ps's "user-oriented format" and pipe the result into awk. On the header line (NR, the record number, is 1) it walks all fields (up to NF, the number of fields), stores the number of the field that equals the string "STAT" in the variable stat, and then prints the line. For every other record (line), it checks that column against a regex for a leading capital Z and prints the line only if it matches.

Condensed version:

ps au|awk '{if(NR==1){for(i=1;i<=NF;i++){if($i=="STAT")stat=i};print}else if($stat~/^Z/)print}'
eMPee584
0

I suggest this command:

ps aux | awk '$8 ~ /^[Zz]/ { printf("%s, PID = %d\n", $8, $2); }'
-1

Why not tell ps exactly which fields we want? Let's try:

read -r zSTAT zPPID zPID zCMD <<< "$(ps -xao stat,ppid,pid,cmd | awk '$1 ~ /^Z/ {print $1, $2, $3, $4; exit}')"
[[ -n ${zPPID} ]] && echo "Zombie found! PID: ${zPID} (${zCMD}), parent to kill: ${zPPID}"

This is quick and works well, but be careful: many processes are zombies only for a few milliseconds before their parent reaps them. So keep a counter and react only to zombies that are still there after the third detection in a row. In my case I then have to kill the parent process, because it is no longer working as expected and keeps the CPU at its highest frequency, heating it up and wasting a lot of power.
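The "third detection in a row" idea can be sketched by taking several snapshots of zombie PIDs and keeping only the PIDs that occur in all of them. The helper name seen_n_times is an assumption made for this example, and the approach assumes each snapshot lists a PID at most once and that PIDs are not reused between snapshots.

```shell
#!/bin/sh
# seen_n_times N: read PIDs (one per line, several snapshots concatenated)
# on stdin and print those that occur exactly N times, i.e. the ones that
# were present in every one of N snapshots.
seen_n_times() {
    sort | uniq -c | awk -v n="$1" '$1 == n { print $2 }'
}
```

For three snapshots one second apart, something like for i in 1 2 3; do ps axo pid=,stat= | awk '$2 ~ /^Z/ { print $1 }'; sleep 1; done | seen_n_times 3 would print only the zombies that survived all three checks.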

Checking whether the zombie has been reaped can be quicker if we ask for only one field:

zombie=$(ps -xao pid | awk -v pid="${zPID}" '$1 == pid {print $1}')
[[ -n ${zombie} ]] && sudo kill -KILL "${zPPID}"
6a5h4
-1

To list zombie processes, try this command:

ps j | awk '$7 ~ "Z"'

You may need to change $7 depending on your operating system.

This also will return the list of their parent process ids (PPID).

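To try to kill the zombies (after testing the above command), try the command below. Note that a zombie has already exited, so signalling it directly usually has no effect; it disappears only once its parent reaps it or exits: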

kill -9 $(ps j | awk 'NR>1 && $7 ~ "Z" {print $2}')

To identify their parents, try with pstree, like:

$ ps j | awk 'NR>1 && $7 ~ "Z" {print $2}' | xargs -L1 pstree -sg
systemd(1)───sshd(1036)───sshd(2325)───sshd(2325)───bash(2383)───zombie(2430)
systemd(1)───sshd(1036)───sshd(2325)───sshd(2325)───bash(2383)───zombie(2431)
systemd(1)───sshd(1036)───sshd(2325)───sshd(2325)───bash(2383)───zombie(2432)
kenorb