61

I have a script which searches all files in multiple subfolders and archives them to a tar file. My script is

for FILE in `find . -type f  -name '*.*'`
  do
if [[ ! -f archive.tar ]]; then

  tar -cpf archive.tar $FILE
else 
  tar -upf archive.tar $FILE 
fi
done

The find command gives me the following output

find . -type f  -iname '*.*'
./F1/F1-2013-03-19 160413.csv
./F1/F1-2013-03-19 164411.csv
./F1-FAILED/F2/F1-2013-03-19 154412.csv
./F1-FAILED/F3/F1-2011-10-02 212910.csv
./F1-ARCHIVE/F1-2012-06-30 004408.csv
./F1-ARCHIVE/F1-2012-05-08 190408.csv

But the FILE variable only stores the first part of the path, ./F1/F1-2013-03-19, and then the next part, 160413.csv.

I tried using read with a while loop,

while read `find . -type f  -iname '*.*'`;   do ls $REPLY; done

but I get the following error

bash: read: `./F1/F1-2013-03-19': not a valid identifier

Can anyone suggest an alternative way?

Update

As suggested in the answers below I updated the scripts

#!/bin/bash

INPUT_DIR=/usr/local/F1
cd $INPUT_DIR
for FILE in "$(find  . -type f -iname '*.*')"
do
archive=archive.tar

        if [ -f $archive ]; then
        tar uvf $archive "$FILE"
        else
        tar -cvf $archive "$FILE"
        fi
done

The output that I get is

./test.sh
tar: ./F1/F1-2013-03-19 160413.csv\n./F1/F1-2013-03-19 164411.csv\n./F1/F1-2013-03-19 153413.csv\n./F1/F1-2013-03-19 154412.csv\n./F1/F1-2012-09-10 113409.csv\n./F1/F1-2013-03-19 152411.csv\n./.tar\n./F1-FAILED/F3/F1-2013-03-19 154412.csv\n./F1-FAILED/F3/F1-2013-03-19 170411.csv\n./F1-FAILED/F3/F1-2012-09-10 113409.csv\n./F1-FAILED/F2/F1-2011-10-03 113911.csv\n./F1-FAILED/F2/F1-2011-10-02 165908.csv\n./F1-FAILED/F2/F1-2011-10-02 212910.csv\n./F1-ARCHIVE/F1-2012-06-30 004408.csv\n./F1-ARCHIVE/F1-2011-08-17 133905.csv\n./F1-ARCHIVE/F1-2012-10-21 154410.csv\n./F1-ARCHIVE/F1-2012-05-08 190408.csv: Cannot stat: No such file or directory
tar: Exiting with failure status due to previous errors
Ubuntuser
  • 10,012

10 Answers

73

Using for with find is the wrong approach here; see, for example, this writeup about the can of worms you are opening.

The recommended approach is to use find, while and read as described here. Below is an example that should work for you:

find . -type f -name '*.*' -print0 | 
while IFS= read -r -d '' file; do
    printf '%s\n' "$file"
done

This way the filenames are delimited with null (\0) characters, which means that spaces and other special characters in the names will not cause problems.

In order to update an archive with the files that find locates, you can pass its output directly to tar:

find . -type f -name '*.*' -printf '%P\0' | 
tar --null -uf archive.tar -T -

Note that you do not have to distinguish between whether the archive exists or not; tar will handle it sensibly. Also note the use of -printf with %P here, which prints each path relative to the starting point and so avoids including the ./ bit in the archive.
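To see the pipeline end to end, here is a minimal sketch that builds a scratch directory with two of the file names from the question and archives them. It assumes GNU find and GNU tar; the scratch directory and the variable names are made up for the demo, and it uses -c rather than -u only because the directory starts out empty.

```shell
#!/bin/bash
# Sketch only: assumes GNU find (-printf) and GNU tar (--null, -T).
workdir=$(mktemp -d)            # hypothetical scratch directory for the demo
cd "$workdir"
mkdir -p F1 F1-ARCHIVE
touch "F1/F1-2013-03-19 160413.csv" "F1-ARCHIVE/F1-2012-06-30 004408.csv"

# %P prints each path relative to the starting point (no leading ./),
# and \0 terminates it with a null byte instead of a newline.
# --null must come before -T so that tar reads the list null-delimited.
find . -type f -name '*.csv' -printf '%P\0' |
    tar --null -T - -cf archive.tar

# List the archive to check that each name survived as a single entry.
listing=$(tar -tf archive.tar)
printf '%s\n' "$listing"
```

Matching on '*.csv' instead of '*.*' also keeps archive.tar itself out of the file list.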

Thor
  • 3,678
28

This works and is simpler:

find . -name '<pattern>' | while IFS= read -r LINE; do echo "$LINE" ; done

Credit to Rupa (https://github.com/rupa/z) for this answer.

ShawnMilo
  • 389
20

Try quoting the for loop like this:

for FILE in "`find . -type f  -name '*.*'`"   # note the quotation marks

Without quotes, bash doesn't handle spaces and newlines (\n) well at all...

Also try setting

IFS=$'\n'
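
For what it's worth, here is a rough sketch of the IFS approach on its own. It assumes the command substitution is left unquoted so that the shell actually splits it (a quoted substitution is passed along as one single string), that no file name contains a newline, and that globbing is switched off while the loop runs. The scratch directory and the counter are only there for the demo.

```shell
#!/bin/bash
# Sketch only: splits find output on newlines, so it breaks on names
# that themselves contain a newline.
workdir=$(mktemp -d)            # hypothetical scratch directory
touch "$workdir/a b.csv" "$workdir/c d.csv"

old_ifs=$IFS
IFS=$'\n'      # split only on newlines, not on spaces or tabs
set -f         # disable globbing so a name like '*.csv' is not expanded
count=0
for FILE in $(find "$workdir" -type f -name '*.*'); do
    printf 'found: %s\n' "$FILE"
    count=$((count + 1))
done
set +f         # restore globbing
IFS=$old_ifs   # restore the previous field separator
```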
kiri
  • 28,986
4

In addition to proper quoting, you can tell find to use a null separator, and then read and process the results in a while loop:

while IFS= read -r -d '' file; do
    something with "$file"
done < <(find  . -type f -name '*.*' -print0)

This should handle any filenames that are POSIX-compliant - see man find

   -print0
          True; print the full file name on the standard output, followed by a null character (instead of the newline character that -print uses). This allows file names that contain newlines or other types of white space to be correctly interpreted by programs that process the find output. This option corresponds to the -0 option of xargs.
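
If you would rather have the names in an array than process them inside the loop, something along these lines should work. This is a sketch that assumes bash 4.4 or newer (for mapfile -d); the scratch directory and the array name are made up.

```shell
#!/bin/bash
# Sketch only: needs bash 4.4+ because of mapfile -d.
workdir=$(mktemp -d)            # hypothetical scratch directory
touch "$workdir/F1-2013-03-19 160413.csv" "$workdir/F1-2013-03-19 164411.csv"

# -d '' makes mapfile split on null bytes; -t drops the delimiter.
mapfile -d '' -t files < <(find "$workdir" -type f -name '*.*' -print0)

printf 'collected %d files\n' "${#files[@]}"
for file in "${files[@]}"; do
    printf '%s\n' "$file"       # each element is one complete file name
done
```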
steeldriver
  • 142,475
2

I think you may be better off using find's -exec option.

find . -type f -name '*.*' -exec tar -cpf archive.tar {} +

Find then executes the command using a system call, so that spaces and newlines are preserved (rather than through a pipe, which would require quoting of special characters). Note that "tar -c" works whether or not the archive already exists, and that (at least with bash) neither {} nor + need to be quoted.
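
One caveat worth keeping in mind: with +, find may split a very long file list across several tar invocations, and tar -c would then overwrite the archive each time. A possible variation, sketched here with a made-up scratch directory, is to append with -r instead, so that every batch ends up in the same archive.

```shell
#!/bin/bash
# Sketch only: uses tar -r (append) so repeated invocations accumulate.
workdir=$(mktemp -d)            # hypothetical scratch directory
cd "$workdir"
mkdir F1
touch "F1/F1-2013-03-19 160413.csv" "F1/F1-2013-03-19 164411.csv"

# find hands each name to tar as a separate argument, so no quoting
# or word splitting is involved.
find . -type f -name '*.csv' -exec tar -rf archive.tar {} +

entries=$(tar -tf archive.tar | wc -l)
printf 'archive holds %d entries\n' "$entries"
```

Note that running this a second time appends the same files again, so remove the old archive first if you want a fresh one.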

Jim Van Zandt
  • 109
  • 1
  • 3
2

I did something like this to find files that may contain spaces.

IFS=$'\n'
for FILE in $(/usr/bin/find "$DST/shared" -name '*.nsf' | grep -v bookmark.nsf | grep -v names.nsf); do
    file "$FILE" | tee -a "$LOG"
done

Worked like a charm :)

Scott B
  • 21
1
find . <find arguments> -print0 | xargs -0 grep <pattern>
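
A concrete version of that one-liner might look like the following sketch. The scratch directory, file names and search pattern are invented for the demo; grep -l just prints the names of the files that match.

```shell
#!/bin/bash
# Sketch only: pairs find -print0 with xargs -0 so names with spaces
# reach grep as single arguments.
workdir=$(mktemp -d)            # hypothetical scratch directory
printf 'alpha\n' > "$workdir/first file.txt"
printf 'beta\n'  > "$workdir/second file.txt"

matches=$(find "$workdir" -type f -name '*.txt' -print0 | xargs -0 grep -l 'alpha')
printf '%s\n' "$matches"
```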
0

I had a similar problem in a script I used to convert audio files. The file names had spaces, which caused issues for the converted file names. This solution worked for me on OSX, using zsh:

  1. Get all the files using find. Cut out the slash and dot. Sort the output.
  2. Get the count of all the files.
  3. Use the count to loop through the files
  4. Use a combination of tail and head to select the files line by line (like a SQL cursor)
  5. Handle the file as necessary.

Since I was converting audio, I wanted to use the original filenames (including their spaces) for the new, converted audio files. The script I used includes FROM and TO parameters for specifying the audio formats. It also does some additional cutting in the loop to remove the extension. I was only interested in getting the complete file name, so I found it necessary to remove the extension before converting it using the TO variable.

#!/bin/zsh

# Convert all audio files in directory
# audioconvert [original] [converted]

export FROM=$1
export TO=$2

export FILES=$(find . -name "*.$FROM" | cut -d "/" -f 2 | sort)
export CNT=$(echo $FILES | wc -l)

while [ $CNT -gt 0 ]; do
    export song=$(echo $FILES | tail -n $CNT | head -n 1)
    export song_title=$(echo $song | cut -d . -f 1)

    ffmpeg -i $song $song_title.$TO

    let CNT=$CNT-1
done
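
For comparison, the same idea can probably be written in bash without the counter and the tail/head cursor, by reading null-delimited names and stripping the extension with parameter expansion. This is only a sketch: the extensions, scratch directory and array are example values, and it prints the ffmpeg command instead of running it.

```shell
#!/bin/bash
# Sketch only: derives the output name with ${song%.*} instead of cut.
FROM=wav       # example source extension (assumption for the demo)
TO=mp3         # example target extension (assumption for the demo)

workdir=$(mktemp -d)            # hypothetical scratch directory
touch "$workdir/My Song.$FROM" "$workdir/Another Track.$FROM"

planned=()
while IFS= read -r -d '' song; do
    target="${song%.*}.$TO"     # strip the last extension, append the new one
    planned+=("$target")
    # Print the command instead of running it; drop 'echo' to convert for real.
    echo ffmpeg -i "$song" "$target"
done < <(find "$workdir" -type f -name "*.$FROM" -print0 | sort -z)
```

Unlike cut -d . -f 1, the ${song%.*} expansion only removes the final extension, so a title that contains a dot stays intact.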

0

As minerz029 suggested, you need to quote the expansion of the find command. You also need to quote all the substitutions of $FILE in your loop.

for FILE in "$(find . -type f  -name '*.*')"
do
    if [ ! -f archive.tar ]; then
        tar -cpf archive.tar "$FILE"
    else 
        tar -upf archive.tar "$FILE" 
    fi
done

Note that the $() syntax should be preferred to the use of backticks; see this U & L question. I also removed the [[ keyword and replaced it by the [ command because it's POSIX.

Joseph R.
  • 320
0

Most answers here break if there is a newline character in the filename. I have used bash for more than 15 years, but only interactively.

In Python you can use os.walk(): http://docs.python.org/2/library/os.html#os.walk

And the tarfile module: http://docs.python.org/2/library/tarfile.html#tar-examples

guettli
  • 1,765