5

I use the following code at the end of one of my scripts to tally up the number of files I have processed and moved into the destination directory.

# Report on Current Status
echo -n "Cropped Files: "
ls "${Destination}" | wc -l

My problem lies with how I handle duplicate files. Right now, I check for the file's presence first, because my script is destructive to the source files I am processing. If a file of that name has already been processed, I alter the filename as follows.

Duplicate file: foo.pdf

Changed name: foo.x.pdf

If there is already a foo.x.pdf, then I rename again to foo.xx.pdf, and repeat as necessary. I intend to go in later, evaluate each 'version', and select the best one to keep on hand. But herein lies my problem: I would like to count only the files whose names do not contain .x., .xx., and so on. How do I strip these out of the ls output so that wc -l counts the unique files only?
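To make the renaming scheme concrete, it could be sketched roughly as below. This is only an illustration, not my actual script: the function name unique_name and the throwaway directory are hypothetical, and it assumes every filename has an extension.

```shell
#!/bin/sh
# Hypothetical helper (not from the real script): print a path inside
# directory $1 that does not exist yet, inserting .x, .xx, ... before
# the extension of filename $2 as needed.
# Assumption: the filename contains at least one dot (an extension).
unique_name() {
    dir=$1
    file=$2
    base=${file%.*}     # foo.pdf -> foo
    ext=${file##*.}     # foo.pdf -> pdf
    suffix=
    candidate=$file
    while [ -e "$dir/$candidate" ]; do
        suffix="${suffix}x"
        candidate="$base.$suffix.$ext"
    done
    printf '%s\n' "$dir/$candidate"
}

# Demo in a throwaway directory that already holds two versions.
demo=$(mktemp -d)
touch "$demo/foo.pdf" "$demo/foo.x.pdf"
next=$(unique_name "$demo" foo.pdf)
echo "${next##*/}"      # prints foo.xx.pdf
rm -r "$demo"
```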

TL;DR: How do I get the count of files in a given directory that do not contain a given substring in their filename?

wjandrea
  • 14,504

3 Answers

9

To count the files in a directory whose names do not end in .x.pdf, try:

find "${Destination}" -mindepth 1 ! -name '*.x.pdf' -printf '1' | wc -c

To count the files whose names do not end in a period, one or more x characters, a period, and pdf (foo.x.pdf, foo.xx.pdf, and so on), try:

find "${Destination}" -mindepth 1 ! -regex '.*\.x+\.pdf' -printf '1' | wc -c

The commands above search recursively through subdirectories. If you don't want that, add the option -maxdepth 1. For example:

find "${Destination}" -mindepth 1 -maxdepth 1 ! -regex '.*\.x+\.pdf' -printf '1' | wc -c

Note that because we use -printf '1', this method is safe even if the directory contains files whose names contain newline characters.
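As a quick check, here is a sketch that runs the last command against a throwaway directory. The directory and file names are made up for the demonstration, and I have added -type f (not part of the commands above) so that subdirectories themselves are not counted; -printf requires GNU find.

```shell
#!/bin/sh
# Build a hypothetical test directory: two originals, two renamed
# duplicates, one file with a newline in its name, and a subdirectory.
demo=$(mktemp -d)
touch "$demo/a.pdf" "$demo/a.x.pdf" "$demo/a.xx.pdf" "$demo/b.pdf"
touch "$demo/new
line.pdf"
mkdir "$demo/subdir"

# Print one character per regular file that is not a .x+.pdf duplicate,
# then count the characters.
count=$(find "$demo" -mindepth 1 -maxdepth 1 -type f \
    ! -regex '.*\.x+\.pdf' -printf '1' | wc -c)
echo "Unique files: $count"   # a.pdf, b.pdf and the newline file: 3
rm -r "$demo"
```

Without -type f the subdirectory would be counted as well, which may or may not be what you want.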

John1024
  • 13,947
2

Without subdirectories:

echo $(($(for file in *.sh ; do echo -n 1+; done; echo 0;)))

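because the loop prints one 1+ per matching file, followed by a final 0, and the shell then evaluates the result as arithmetic: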

for file in *.sh ; do echo -n 1+; done; echo 0;
1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+0
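The same loop idea can be adapted to skip the renamed duplicates. The following is a sketch in plain POSIX sh; the helper name is_duplicate and the demo files are hypothetical, and it only treats names of the form something.x.pdf, something.xx.pdf, and so on as duplicates.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the name looks like foo.x.pdf,
# foo.xx.pdf, ... (a dot, only x characters, then .pdf).
is_duplicate() {
    stem=${1%.pdf}                  # foo.xx.pdf -> foo.xx
    case $stem in
        "$1") return 1 ;;           # name did not end in .pdf
        *.*)  tag=${stem##*.} ;;    # text after the last dot, e.g. xx
        *)    return 1 ;;           # no marker at all, e.g. foo.pdf
    esac
    case $tag in
        '' | *[!x]*) return 1 ;;    # empty, or something other than x
        *)           return 0 ;;
    esac
}

demo=$(mktemp -d)
touch "$demo/a.pdf" "$demo/a.x.pdf" "$demo/a.xx.pdf" "$demo/b.pdf" "$demo/tax.pdf"

count=0
for file in "$demo"/*; do
    [ -e "$file" ] || continue      # the glob stays literal if nothing matches
    is_duplicate "${file##*/}" || count=$((count + 1))
done
echo "Unique files: $count"         # a.pdf, b.pdf, tax.pdf: 3
rm -r "$demo"
```

The [ -e "$file" ] guard matters: in an empty directory the unexpanded pattern would otherwise be counted as one file.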
user unknown
  • 6,892
0

You can exclude files that match a pattern from the output of ls by using the option -I, --ignore=PATTERN (it can be given more than once):

ls -I "*.x*.pdf" "${Destination}" | wc -l

Alternatively, subtract the number of matching files from the total:

echo $(($(ls "${Destination}" | wc -l) - $(ls "${Destination}"/*.x*.pdf | wc -l)))
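One caveat worth checking is that the glob *.x*.pdf is looser than "one or more x": it also matches names such as budget.xlsx.pdf. The sketch below compares it with a stricter grep filter on made-up file names. It assumes GNU ls (for -I) and filenames without newlines, since it parses ls output.

```shell
#!/bin/sh
# Hypothetical demo directory: one original, two renamed duplicates,
# and one unrelated file whose name happens to contain ".x".
demo=$(mktemp -d)
touch "$demo/foo.pdf" "$demo/foo.x.pdf" "$demo/foo.xx.pdf" "$demo/budget.xlsx.pdf"

# Loose: the glob also hides budget.xlsx.pdf.
loose=$(ls -I '*.x*.pdf' "$demo" | wc -l)

# Strict: drop only names ending in .x.pdf, .xx.pdf, ... and count the rest.
strict=$(ls "$demo" | grep -Evc '\.x+\.pdf$')

echo "loose: $loose, strict: $strict"   # loose: 1, strict: 2
rm -r "$demo"
```

If none of your real filenames contain .x anywhere else, both approaches give the same count.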
pa4080
  • 30,621