3

I have an external drive that's filling with zipped files. I'd like to get a larger drive but also store everything uncompressed. Is there a way to check how much space they will need so I can avoid buying a drive that doesn't provide any future proofing? I've tried 7z l but that only shows 1 file at a time and I have thousands.

2 Answers

5

You can use awk to sum up the output of unzip -l like this:

for zipfile in *.zip; do unzip -l "$zipfile" | tail -n 1; done | awk '{ sum += $1 }; END { print sum }'

Here:

  1. The for loop (everything from for to done) iterates over all ZIP files in the current directory.
  2. In the loop body, unzip -l lists the contents of each ZIP file; the last line of that listing holds the total uncompressed size of the files in the archive.
  3. tail -n 1 extracts this last line.
  4. Finally, the output of the whole loop is piped to awk, which sums the first field of each line and prints the total in its END block.
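The four steps above can also be wrapped in small shell functions, which keeps the file names quoted. This is a minimal sketch rather than part of the original one-liner: the function names sum_first_field and total_unzipped_bytes are made up for illustration, and printf "%.0f" is used on the assumption that some awk implementations do not print very large sums in full with a plain print.

```shell
# Sum the first whitespace-separated field of every line read from stdin.
# "%.0f" is an assumption to keep large totals from being abbreviated.
sum_first_field() {
    awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}

# Print the total uncompressed size in bytes of the ZIP files given as
# arguments, using the summary (last) line of each "unzip -l" listing.
total_unzipped_bytes() {
    for zipfile in "$@"; do
        unzip -l "$zipfile" | tail -n 1
    done | sum_first_field
}
```

Calling total_unzipped_bytes ./*.zip should then give the same number as the one-liner, while also coping with file names that contain spaces.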

You can prettify this output by adding formats for KB and MB, for example like this:

for zipfile in *.zip; do unzip -l "$zipfile" | tail -n 1; done | awk '{ sum += $1 }; END { printf "%d BYTES\n%d KB\n%d MB\n", sum, sum / 1024, sum / 1024 / 1024 }'

(Suggested in a comment by @pLumo.)
To make this recursive, so that ZIP files in subdirectories are included as well, you can use the following command:

shopt -s globstar; for zipfile in **/*.zip; do unzip -l "$zipfile" | tail -n 1; done | awk '{ sum += $1 }; END { printf "%d BYTES\n%d KB\n%d MB\n", sum, sum / 1024, sum / 1024 / 1024 }'; shopt -u globstar

HINT: Be aware of the difference between file size and size on disk! The totals above are file sizes; the space actually occupied on the drive is usually somewhat larger.
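To get a feeling for that difference, each file can be rounded up to a whole number of filesystem blocks before summing. The following is only a rough sketch: the function name size_on_disk is made up, the default block size of 4096 bytes is an assumption that depends on how the new drive is formatted, and filesystem metadata is ignored entirely.

```shell
# Read one file size in bytes per line on stdin and print the total after
# rounding every file up to a whole number of blocks.
# $1: block size in bytes (assumed default: 4096).
size_on_disk() {
    awk -v bs="${1:-4096}" '
        { blocks = int(($1 + bs - 1) / bs); total += blocks * bs }
        END { printf "%.0f\n", total }
    '
}
```

The input has to be per-file sizes, for example the first column of the per-file rows of unzip -l, not the per-archive summary lines used above.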

zx485
0

You could use

find . \( -name '*.zip' -o -name '*.7z' -o -name '*.rar' \) -exec bash -c '7z l "$1" | tail -n 1 | tr -s " " | cut -d" " -f3' _ {} \; | grep '[^[:blank:]]' | paste -sd + - | bc | numfmt --to=si

which gives you the sum of all the uncompressed file sizes, as required.

To cover other archive formats, add further -o -name '*.ext' tests inside the parentheses before the -exec ... stanza, as the -o -name '*.rar' test does for RAR files.

Using find has the advantage that all archives on different directory levels are found and taken into account.
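The summing stage at the end of that pipeline can be tried in isolation. This is a small sketch with made-up function names (join_with_plus, sum_lines); it assumes bc is installed, as the command above already does.

```shell
# Join the numbers read from stdin (one per line) into a single
# expression such as "10+20+30".
join_with_plus() {
    paste -sd + -
}

# Sum the numbers read from stdin by letting bc evaluate that expression.
sum_lines() {
    join_with_plus | bc
}
```

For example, printf '%s\n' 10 20 30 | sum_lines prints 60; appending | numfmt --to=si, as in the command above, turns a large byte count into a human-readable figure.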

emk2203