I want to clone a very large directory (many terabytes, made up of multi-gigabyte files) to a directory on another drive. I have been using this command:
ionice -c 3 rsync -avz /path/to/sourcedir/ /path/to/destdir/
The process takes over a day and more often than not gets interrupted, hence the use of rsync to be able to resume without restarting from zero. The theory should be that the above command is idempotent, so when anything fails I should just be able to reissue the same command to let it work out where it was interrupted and continue from there.
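To make "reissue the same command" concrete, here is a minimal sketch of a retry wrapper. The function name retry_until_success and the delay argument are my own hypothetical choices, not anything rsync provides; it only re-runs the command until it exits with status 0 and does nothing to verify the copied content.

```shell
# Hypothetical helper (not part of rsync): re-run a command until it succeeds.
# Usage: retry_until_success DELAY_SECONDS COMMAND [ARGS...]
retry_until_success() {
    delay=$1
    shift
    # "$@" is the command to run; the loop ends once it exits with status 0.
    until "$@"; do
        echo "command exited with status $?; retrying in ${delay}s" >&2
        sleep "$delay"
    done
}

# Example invocation with the command from this question, retrying every 60 s:
# retry_until_success 60 ionice -c 3 rsync -avz /path/to/sourcedir/ /path/to/destdir/
```

A zero exit status only says that rsync itself reported no error, so this is a convenience for resuming, not evidence that the two trees match.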
Now, because the point of the operation is to retire and recycle the source drive, before doing that I wanted to be super-sure that all files had been properly copied. So I used the approach in this question to compare each file byte by byte. Sure enough, there were a number of files that had a different hash.
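A check of this kind can be sketched roughly as follows. This is an illustration under assumptions, not necessarily the exact method from the linked question: it assumes GNU coreutils (sha256sum with --quiet and -c) and find, and the function name verify_copy is hypothetical.

```shell
# Hypothetical helper: check that every regular file under SRC also exists
# under DST with the same SHA-256 digest. Assumes GNU sha256sum and find.
# Usage: verify_copy SRC DST
verify_copy() {
    src=$1
    dst=$2
    sums=$(mktemp) || return 1
    # Record a digest for every regular file, with paths relative to SRC.
    ( cd "$src" && find . -type f -exec sha256sum {} + ) > "$sums"
    # Re-check those digests against DST; --quiet prints only the failures.
    ( cd "$dst" && sha256sum --quiet -c "$sums" )
    status=$?
    rm -f "$sums"
    return "$status"
}

# Example invocation with the paths from this question:
# verify_copy /path/to/sourcedir /path/to/destdir
```

It returns non-zero when any file is missing or differs in the destination. It reads every byte on both drives, so on a tree of this size it takes a long time, and it does not notice extra files that exist only in the destination.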
So, the theory question: contrary to what I thought I understood, does rsync decide what to copy merely from file names, rather than from content, or at least length?
And the (more important) practice question: are there other options I could be using instead, to force rsync to produce an exact clone of the source directory? In particular, in the case in which rsync is launched when the dest directory already has a file with the same name as one in the source directory, but with different content, I want the command to ensure it is replaced (or "completed") with the actual original file from the source directory.