20

How can I show progress, bar or percentage, when unzipping large files?

unzip zipfile.zip does not show any progress info.

muru
JPX

6 Answers

27

Without installing anything else, the easiest way is to have it print a dot for every file that is extracted or processed using awk.

unzip -o source.zip -d /destDirectory | awk 'BEGIN {ORS=" "} {print "."}'

If it is a large zip file, then you can elect to print a dot for every 10th or 20th file like this:

unzip -o source.zip -d /destDirectory | awk 'BEGIN {ORS=" "} {if(NR%10==0)print "."}'

Just change the "10" in the NR%10 piece to whatever increment you want.
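If you would rather see a percentage than dots, the same awk approach can be extended, as long as you know roughly how many lines unzip will print. The following is only a sketch: percent_lines is a made-up helper name, and it assumes the entry count from unzip -Z -1 is close to the number of lines unzip prints (unzip also prints a header line, so the value is capped at 100).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read lines on stdin and print a running "NN%" readout,
# given the expected total number of lines as the first argument.
percent_lines() {
    awk -v total="$1" '
        total > 0 {
            p = int(NR * 100 / total)
            if (p > 100) p = 100   # unzip prints extra lines (e.g. a header), so cap at 100
            printf "\r%3d%%", p    # \r redraws the same terminal line
            fflush()               # push the update out immediately
        }
        END { print "" }'
}

# Usage (file and directory names are assumptions):
# unzip -o source.zip -d /destDirectory | percent_lines "$(unzip -Z -1 source.zip | wc -l)"
```

The total comes from a second, quick pass over the archive listing, so nothing extra needs to be installed.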

Alternatively, you can install the pv command. It doesn't work very well with unzip, but it gives a one-line view that is not totally terrible.

Install pv:

sudo apt install pv

Unzip with pv:

unzip -o source.zip -d /destDirectory | pv -l >/dev/null

This shows output that looks like this:

28.2k 0:00:03 [9.36k/s] [        <=>                       ]

Because pv does not know in advance how many lines unzip will print, it cannot show a percentage or a meaningful progress bar here, only a running line count and rate.

dessert
Scott
16

Another way to show zip/unzip progress is to use 7-Zip. In the latest version, 16.02 (published 2016-05-21), it shows the progress as a percentage.

The p7zip packages for version 16.02 have been available in the Ubuntu repository since release artful/17.10. Older Ubuntu releases only have p7zip version 9.20.1, which has no progress indicator. I manually installed the p7zip 16.02 packages (p7zip, p7zip-full and p7zip-rar) on Ubuntu xenial/16.04 from the bionic repository; there seem to be no other dependencies.

7z x source.zip -o/destDirectory

Note that there must be no space between the "-o" and the destination directory name.

palto
4

You might want to use tqdm. It is a Python library, but it has a CLI too, and it shows the real progress during extraction, not only after everything is done:

unzip zipfile.zip | tqdm > /dev/null

Showing the progress as percentage is more difficult since the number of lines that unzip would print is unknown. Getting the number of files from the unzip -l zipfile.zip output is possible:

n_files=`unzip -l zipfile.zip | tail -n 1 | xargs echo -n | cut -d' ' -f2`

and then you may wish to:

unzip -o zipfile.zip | tqdm --desc extracted --unit files --unit_scale --total $n_files > /dev/null

Final output:

extracted: 100%|███████████████████████████████| 15126/15126 [00:00<00:00, 16218.28files/s]

But unzip's output may contain more than the names of the extracted files: it also reports other operations (such as inflating or linking), depending on the contents of your zip file. So n_files does contain the right number of files, but the number of output lines may be greater. Once tqdm goes over 100%, it switches to the default progress output, as in the first example above.

The Ubuntu package is currently named python3-tqdm. Of course, man tqdm is very helpful.
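One way to keep the line count closer to the file count is to filter unzip's output before it reaches tqdm. This is a rough sketch, not something from the tqdm or unzip docs: entry_lines is a hypothetical helper, and the list of prefixes (creating, inflating, extracting, linking) is an assumption based on what unzip typically prints, so other compression methods may use prefixes that are not covered.

```shell
#!/usr/bin/env bash
# Hypothetical filter: keep only the lines that report one archive entry each,
# dropping the "Archive:" header and any other status lines.
# Assumes unzip prefixes entry lines with one of the verbs listed below.
entry_lines() {
    grep --line-buffered -E '^ *(creating|inflating|extracting|linking): '
}

# Usage (assumes n_files was computed as shown above):
# unzip -o zipfile.zip | entry_lines | tqdm --desc extracted --unit files --total "$n_files" > /dev/null
```

The --line-buffered option makes grep pass each line on immediately, so the progress display still updates while unzip is running.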

zezollo
2

You can create a simple wrapper for that:

function punzip {
    # Use the number of archive entries as pv's expected line count.
    unzip "$1" | pv -l -s "$(unzip -Z -1 "$1" | wc -l)" > /dev/null
}

And then use it as follows:

$ punzip file.zip

It might be useful if there are a lot of small files in an archive. But if files are large, it is better to use something like this:

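function plunzip {
    # Read entry names line by line so names containing spaces survive;
    # entries ending in "/" are directories and are skipped.
    unzip -Z -1 "$1" | grep -v '/$' | while IFS= read -r f; do
        [[ "$f" == */* ]] && mkdir -p "${f%/*}"
        echo "Extracting $f"
        # -p writes only the file data to stdout (-c would also print the file name).
        # The 4th column of the zipinfo listing is the uncompressed size, used as pv's total.
        unzip -p "$1" "$f" \
            | pv -s "$(unzip -Z "$1" "$f" | awk '{print $4}')" \
            > "$f"
    done
}

It will show a progress bar for each individual file.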

1

Alternatively, you can watch the size of the destination directory you are unzipping your files into. For example you can use:

watch -n 60 du -sh dest_directory

where dest_directory is where unzip is writing to.

-s makes du print only the total for the directory, and -h makes the size human readable. watch -n 60 reruns the command every 60 seconds; change the interval to suit your setup.

Because it calls du (disk usage) in a loop, it may be less efficient than the other answers. However, if you call unzip with options like -q, unzip no longer writes to stdout, so the answers that depend on that output cannot be used and this may be your only option. It can also be faster, because -q speeds up unzip when the archive contains a large number of small files.
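If you want a rough percentage instead of a raw size, you can compare the directory size with the total uncompressed size that unzip -l prints on its last line. The sketch below is only an approximation under a few assumptions: percent_of and extract_percent are made-up names, du -sb (apparent size in bytes) is a GNU du option, the destination directory is assumed to start out empty, and directory entries add a little to the du figure.

```shell
#!/usr/bin/env bash
# Hypothetical helper: integer percentage of written bytes against total bytes.
percent_of() {
    local written=$1 total=$2
    if [ "$total" -gt 0 ]; then
        echo "$(( written * 100 / total ))%"
    else
        echo "n/a"   # avoid dividing by zero for an empty archive
    fi
}

# Hypothetical wrapper: compare the destination directory with the archive total.
extract_percent() {
    local zip=$1 dest=$2 total written
    total=$(unzip -l "$zip" | tail -n 1 | awk '{print $1}')   # total uncompressed bytes
    written=$(du -sb "$dest" | cut -f1)                        # -b is GNU du only
    percent_of "$written" "$total"
}

# Usage (names are assumptions), run in a second terminal while unzip is working:
# while sleep 5; do extract_percent zipfile.zip dest_directory; done
```

A plain sleep loop is used here instead of watch, because watch starts a new shell and would not see functions defined in the current one.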

1

Similar to Scott's answer, this converts each output line to a progress indicator, and drops the newlines (alternative to awk):

unzip 'file.zip' | sed 's/.*/./' | tr -d '\n'

However, the pipes get buffered, so for realtime progress you need to add stdbuf -o0:

unzip 'file.zip' | stdbuf -o0 sed 's/.*/./' | stdbuf -o0 tr -d '\n'

https://unix.stackexchange.com/q/515522/455274

MichaelK