I have a list of filenames inside a file called list_of_files.txt.
I want to copy the contents of each file in that list into another file called all_compounds.sdf.
How should I do this from the command line?
Don't use simple command substitution to get the filenames (it breaks easily on spaces and other special characters). Use something like xargs (the -d and -a options are GNU extensions):
xargs -d '\n' -a list_of_files.txt cat > all_compounds.sdf
Or a while read loop:
while IFS= read -r file; do cat "$file"; done < list_of_files.txt > all_compounds.sdf
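As a self-contained illustration of the loop above, here is a minimal sketch run in a scratch directory. The sample file names and contents are made up for the demo, and the extra `|| [ -n "$file" ]` test is an addition that keeps the last name when the list has no trailing newline.

```shell
# Work in a scratch directory so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical sample files; one name contains a space.
printf 'alpha\n' > 'compound one.sdf'
printf 'beta\n'  > compound2.sdf

# The list deliberately lacks a newline after the last name.
printf 'compound one.sdf\ncompound2.sdf' > list_of_files.txt

# IFS= and -r keep each line exactly as written;
# "|| [ -n "$file" ]" processes a final line that has no trailing newline.
while IFS= read -r file || [ -n "$file" ]; do
    cat -- "$file"
done < list_of_files.txt > all_compounds.sdf

cat all_compounds.sdf
```

The redirection sits after `done`, so the output file is opened once for the whole loop rather than once per file.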
To use command substitution safely, at least set IFS to just the newline and disable globbing (wildcard expansion):
(set -f; IFS=$'\n'; cat $(cat list_of_files.txt) > all_compounds.sdf)
The surrounding parentheses () are to run this in a subshell, so that your current shell isn't affected by these changes.
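To see the difference those two settings make, here is a small sketch comparing the plain substitution with the guarded one. The file names are invented for the demo, and a literal newline is used for IFS instead of $'\n' so that it also runs in shells without that syntax.

```shell
# Scratch directory with hypothetical sample files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf 'alpha\n' > 'compound one.sdf'
printf 'beta\n'  > compound2.sdf
printf 'compound one.sdf\ncompound2.sdf\n' > list_of_files.txt

# Naive form: the default IFS splits "compound one.sdf" into two words,
# so cat is asked for files that do not exist and reports failure.
naive_status=0
cat $(cat list_of_files.txt) > naive.sdf 2>/dev/null || naive_status=$?

# Guarded form, run in a subshell so the settings do not leak out.
(
    set -f          # disable globbing
    IFS='
'                   # split on newlines only
    cat $(cat list_of_files.txt) > all_compounds.sdf
)
safe_status=$?

echo "naive exit status: $naive_status, safe exit status: $safe_status"
```

The naive run should end with a non-zero status and an incomplete naive.sdf, while the guarded run picks up both files.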
Quick and dirty way...
cat $(cat list_of_files.txt) >> all_compounds.sdf
Please note: this only works if the filenames in your list are very well behaved. Things will go wrong if they contain spaces, newlines, or any characters that have special meaning to the shell. Use this answer instead for reliable results.
cat concatenates files. It also prints their contents.
Using command substitution, command2 $(command1), you can pass the output of command1 (cat list...) as arguments to command2 (cat), which concatenates the files.
Then use redirection >> to send the output to a file instead of printing to stdout. If you want to see the output, use tee instead:
cat $(cat list_of_files.txt) | tee -a all_compounds.sdf
(I have used >> instead of >, and tee with the -a switch, so that if the output file already exists the new content is appended to it rather than overwriting it.)
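One consequence of appending is worth seeing in practice: running the command twice leaves two copies of everything in the output. A minimal sketch, using made-up, well-behaved file names and hypothetical output names (appended.sdf, overwritten.sdf):

```shell
# Scratch directory with hypothetical, well-behaved file names.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf 'alpha\n' > a.sdf
printf 'beta\n'  > b.sdf
printf 'a.sdf\nb.sdf\n' > list_of_files.txt

# Append form, run twice: the second run adds a second copy (4 lines total).
cat $(cat list_of_files.txt) >> appended.sdf
cat $(cat list_of_files.txt) >> appended.sdf

# Overwrite form, run twice: the file is truncated each time (2 lines total).
cat $(cat list_of_files.txt) > overwritten.sdf
cat $(cat list_of_files.txt) > overwritten.sdf

wc -l < appended.sdf
wc -l < overwritten.sdf
```

So if you expect to re-run the command, > (or tee without -a) may be the better choice.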
While GNU awk is a text-processing utility, it can run external shell commands via its system() function (which is also part of POSIX awk). We can use that to our advantage like so:
$ awk '{cmd=sprintf("cat \"%s\"",$0); system(cmd)}' file_list.txt
The idea here is simple: we read the file line by line, and from each line we build a formatted string such as cat "File name.txt", which is then passed to system().
And here it is in action:
$ ls
file1.txt file2.txt file3 with space.txt file_list.txt
$ awk '{cmd=sprintf("cat \"%s\"",$0); system(cmd)}' file_list.txt
Hi, I'm file2
Hi, I'm file1
Hi, I'm file3
So we've done the big part of the task already: we printed all the files on the list. The rest is simple: redirect the final output into the summary file with the > operator.
awk '{cmd=sprintf("cat \"%s\"",$0); system(cmd)}' file_list.txt > output.txt
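One caveat with wrapping the name in double quotes: the shell started by system() still interprets characters such as $, backticks, and " inside it. A possible variant, sketched here under the assumption that names never contain newlines, wraps each name in single quotes and escapes any single quote inside it. The q variable and the sample file names are made up for the demo.

```shell
# Scratch directory with hypothetical, awkward file names.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf 'one\n' > "it's \$HOME.txt"          # literal name: it's $HOME.txt
printf 'two\n' > 'file3 with space.txt'
printf '%s\n' "it's \$HOME.txt" 'file3 with space.txt' > file_list.txt

# q holds a single quote; every ' in the name becomes '\'' and the whole
# name is wrapped in single quotes, so the shell takes it literally.
awk -v q="'" '{
    name = $0
    gsub(q, q "\\" q q, name)
    system("cat -- " q name q)
}' file_list.txt > output.txt

cat output.txt
```

With the double-quoted version, the first name would have had $HOME expanded by the shell before cat ever saw it.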