27

I've already copied terabytes of files with rsync but I forgot to use --archive to preserve files' special attributes.

I tried running rsync again, this time with --archive, but it was far slower than I expected. Is there an easy way to do this faster by copying only the metadata recursively?

Mohammad

5 Answers

25

OK, you can copy owner, group, permissions and timestamps using the --reference option of chown, chmod and touch. Here is a script to do so:

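#!/bin/bash
# Filename: cp-metadata
# Usage: cp-metadata SOURCE_DIR DEST_DIR

# Dry run by default: commands are printed instead of executed.
myecho=echo
src_path="$1"
dst_path="$2"

# NUL-delimited names so that whitespace, backslashes and newlines in paths survive.
find "$src_path" -print0 |
  while IFS= read -r -d '' src_file; do
    # Quote the prefix so it is stripped literally, not treated as a glob pattern.
    dst_file="$dst_path${src_file#"$src_path"}"
    # chown first: changing ownership can clear setuid/setgid bits set by chmod.
    $myecho chown --reference="$src_file" "$dst_file"
    $myecho chmod --reference="$src_file" "$dst_file"
    $myecho touch --reference="$src_file" "$dst_file"
  done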

You should run it with sudo (to allow chown) and with two parameters: the source and destination directories. As written, the script only echoes the commands it would run. Once you are satisfied with the output, change the line myecho=echo to myecho= so that the commands are actually executed.
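To see what the --reference options do before pointing the script at real data, a minimal sketch like the following can be run on a throwaway tree. The temporary directory, the file name a.txt and the helper name copy_meta are made up for illustration, and chown is skipped unless the sketch runs as root.

```shell
#!/bin/bash
# Sketch only: the temporary tree, a.txt and copy_meta are illustrative names.

# Copy mode and timestamps (and ownership, when root) from $1 onto $2.
copy_meta() {
  if [ "$(id -u)" -eq 0 ]; then
    # chown first, because changing ownership can clear setuid/setgid bits.
    chown --reference="$1" "$2"
  fi
  chmod --reference="$1" "$2"
  touch --reference="$1" "$2"
}

demo=$(mktemp -d)
mkdir "$demo/src" "$demo/dst"
echo data > "$demo/src/a.txt"
chmod 640 "$demo/src/a.txt"
touch -d '2001-02-03 04:05:06' "$demo/src/a.txt"

# Simulate the earlier copy that lost metadata: same content, default attributes.
cat "$demo/src/a.txt" > "$demo/dst/a.txt"

copy_meta "$demo/src/a.txt" "$demo/dst/a.txt"

# Print mode and mtime (seconds since the epoch) of both files for comparison.
stat -c '%a %Y %n' "$demo/src/a.txt" "$demo/dst/a.txt"
```

Both lines printed by stat should show the same mode and mtime; the file contents are never touched.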

enzotib
10

Treating the question as "rsync only has metadata to copy, so why is it so slow, and how can I make it faster?":

rsync usually uses equal mtimes as a heuristic to detect and skip unchanged files. Without --archive (specifically, without --times) the destination files' mtimes remain set to the time you rsync-ed them, while the source files' mtimes remain intact (ignoring manual trickery by you). Without external guarantees from you that the source files' contents haven't changed, rsync has to assume they might have and therefore has to checksum them and/or copy them to the destination again. This, plus the fact that --whole-file is implied for local->local syncs, makes rsync without --times approximately equivalent to cp for local syncs.

Provided that updating the destination files' contents is acceptable, or if the source files are untouched since the original copy, you should find rsync --archive --size-only quicker than a naive rsync.

If in doubt as to what rsync is copying that is taking so long, rsync --archive --dry-run --itemize-changes ... tells you in exhaustive, if terse, detail.
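The two suggestions above can be tried safely on a scratch tree first. This is only a sketch: the temporary directory and file name are invented for illustration, and the rsync steps are skipped if rsync is not installed.

```shell
#!/bin/bash
# Sketch only: the temporary tree and file name are illustrative.
# Requires rsync; the rsync steps are skipped when it is not installed.

demo=$(mktemp -d)
mkdir "$demo/src" "$demo/dst"
echo data > "$demo/src/a.txt"
touch -d '2001-02-03 04:05:06' "$demo/src/a.txt"

# Simulate a copy made without --times: same content, fresh mtime.
cat "$demo/src/a.txt" > "$demo/dst/a.txt"

if command -v rsync >/dev/null 2>&1; then
  # Preview what rsync thinks needs doing, without changing anything.
  rsync --archive --size-only --dry-run --itemize-changes "$demo/src/" "$demo/dst/"

  # Real run: sizes match, so only attributes such as the mtime are updated.
  rsync --archive --size-only "$demo/src/" "$demo/dst/"
fi
```

In the itemized output, a line starting with "." marks an item whose data is not transferred (only attributes may change), whereas ">" or "<" marks an actual file transfer. If the preview shows mostly "." lines, the real run should be quick.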

muru
ZakW
7

WARNING: Without special workarounds, GNU cp --attributes-only will truncate the destination files, at least in Precise. See the edit below.

Original:

In this situation you probably want GNU cp's --attributes-only option, together with --archive, as it's tried and tested code, handles all filesystem-agnostic attributes and doesn't follow symlinks (following them can be bad!):

cp --archive --attributes-only /source/of/failed/backup/. /destination/

As with files, cp is additive with extended attributes: if both source and destination have extended attributes it adds the source's extended attributes to the destination (rather than deleting all of the destination's xattrs first). While this mirrors how cp behaves if you copy files into an existing tree, it might not be what you expect.

Also note that if you didn't preserve hard links the first time around with rsync but want to preserve them now then cp won't fix that for you; you're probably best off rerunning rsync with the right options (see my other answer) and being patient.

If you found this question while looking to deliberately separate and recombine metadata/file contents then you might want to take a look at metastore which is in the Ubuntu repositories.

Source: GNU coreutils manual


Edited to add:

cp from GNU coreutils 8.17 and later will work as described, but coreutils 8.16 and earlier will truncate files when restoring their metadata. If in doubt, don't use cp in this situation; use rsync with the right options and/or be patient.
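One way to guard against running an affected cp is to check its version first. The following is a rough sketch, not a tested safeguard: version_ge is an illustrative helper name, it relies on GNU sort's --version-sort, and it assumes the version number is the last word on the first line of cp --version.

```shell
#!/bin/bash
# Sketch only: version_ge is an illustrative helper, not part of coreutils.
# Relies on GNU sort's --version-sort.

# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n' "$1" "$2" | sort --version-sort | head -n 1)" = "$2" ]
}

# Assumption: the first line of "cp --version" ends with the version number.
cp_version=$(cp --version 2>/dev/null | head -n 1 | awk '{print $NF}')

if [ -n "$cp_version" ] && version_ge "$cp_version" 8.17; then
  echo "cp $cp_version: --attributes-only should not truncate destination files"
else
  echo "cp is older than 8.17 or not GNU cp: avoid --attributes-only"
fi
```

Even when the check passes, trying the command on a small scratch copy first is cheap insurance.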

I wouldn't recommend this unless you fully understand what you're doing, but earlier GNU cp can be prevented from truncating files using the LD_PRELOAD trick:

/*
 * File: no_trunc.c
 * Author: D.J. Capelis with minor changes by Zak Wilcox
 *
 * Compile:
 * gcc -fPIC -c -o no_trunc.o no_trunc.c
 * gcc -shared -o no_trunc.so no_trunc.o -ldl
 *
 * Use:
 * LD_PRELOAD="./no_trunc.so" cp --archive --attributes-only <src...> <dest>
 */

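#define _GNU_SOURCE
#include <dlfcn.h>
#include <sys/types.h>  /* mode_t */
/* Pretend <fcntl.h> was already included so that <bits/fcntl.h> can be
 * pulled in directly for O_TRUNC without declaring open() itself. */
#define _FCNTL_H
#include <bits/fcntl.h>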

int (*_open)(const char *pathname, int flags, ...);
int (*_open64)(const char *pathname, int flags, ...);

int open(const char *pathname, int flags, mode_t mode) {
        _open = (int (*)(const char *pathname, int flags, ...)) dlsym(RTLD_NEXT, "open");
        flags &= ~(O_TRUNC);
        return _open(pathname, flags, mode);
}

int open64(const char *pathname, int flags, mode_t mode) {
        _open64 = (int (*)(const char *pathname, int flags, ...)) dlsym(RTLD_NEXT, "open64");
        flags &= ~(O_TRUNC);
        return _open64(pathname, flags, mode);
}
ZakW
3

I had to do this for a copy on a remote computer, so I couldn't use --reference.

I used this to make the script...

find -printf "touch -d \"%Tc\" \"%P\"\n" >/tmp/touch.sh

But make sure there aren't any filenames with " in them first...

find | grep '"'

Then copy touch.sh to your remote computer, and run...

cd <DestinationFolder>; sh /tmp/touch.sh

find -printf also has directives for the user and group names (%u and %g) if you want to copy those as well.
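A variant of the same idea, sketched here under a few assumptions, writes the timestamps as seconds since the epoch (%T@) so that the generated script does not depend on the locale or timezone of either machine. The two temporary directories stand in for the local source and the remote destination, and all file names are invented for illustration. It needs GNU find and GNU touch.

```shell
#!/bin/bash
# Sketch only: the temporary directories stand in for the local source and the
# remote destination; all names are illustrative. Requires GNU find and touch.

demo=$(mktemp -d)
mkdir -p "$demo/src/sub" "$demo/dst/sub"
echo one > "$demo/src/a.txt"
echo two > "$demo/src/sub/b.txt"
touch -d '2001-02-03 04:05:06' "$demo/src/a.txt"
touch -d '2002-03-04 05:06:07' "$demo/src/sub/b.txt"

# Simulate the remote copy: same contents, fresh timestamps.
cat "$demo/src/a.txt" > "$demo/dst/a.txt"
cat "$demo/src/sub/b.txt" > "$demo/dst/sub/b.txt"

# %T@ is the mtime in seconds since the epoch; %P is the path relative to the
# starting point. -mindepth 1 skips the starting directory, whose %P is empty.
# Names containing ", $, ` or \ would still break the generated script.
(cd "$demo/src" && find . -mindepth 1 -printf 'touch -d @%T@ "%P"\n') > "$demo/touch.sh"

# On a real remote machine this step would run after copying touch.sh over.
(cd "$demo/dst" && sh "$demo/touch.sh")
```

As with the original command, it is worth grepping the file list for awkward characters before trusting the generated script.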

niknah
2

In local transfers, where source and destination are both on locally mounted filesystems, rsync copies whole files by default (--whole-file is implied). To have it use its delta-transfer algorithm instead, you can use

rsync -a --no-whole-file source dest
enzotib