24

I've already copied terabytes of files with rsync but I forgot to use --archive to preserve files' special attributes.

I tried running rsync again, this time with --archive, but it was much slower than I expected. Is there an easy way to do this faster by copying only the metadata recursively?

Mohammad
  • By "metadata", do you mean file permissions and ownership, or more complicated things like extended file attributes? – Marcel Stimberg Aug 12 '11 at 08:55
  • Is the filesystem where the source files reside mounted locally or not? – enzotib Aug 12 '11 at 08:58
  • By metadata I mean permissions and time-stamps. Time-stamps are particularly important for me. – Mohammad Aug 12 '11 at 10:15
  • The filesystem is mounted locally for both source and destination. – Mohammad Aug 12 '11 at 10:16
  • Related questions: https://unix.stackexchange.com/questions/44253/how-to-clone-copy-all-file-directory-attributes-onto-different-file-directory and https://unix.stackexchange.com/questions/20645/clone-ownership-and-permissions-from-another-file – Sohail Si Jan 27 '22 at 16:48

5 Answers

24

OK, you can copy the owner, group, permissions and timestamps using the --reference option of chown, chmod and touch. Here is a script to do so:

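#!/bin/bash
# Filename: cp-metadata
# Usage: cp-metadata SOURCE_DIR DEST_DIR

myecho=echo   # set to empty (myecho=) to actually apply the changes
src_path="$1"
dst_path="$2"

# -print0 with read -d '' keeps names with spaces, backslashes or newlines intact
find "$src_path" -print0 |
  while IFS= read -r -d '' src_file; do
    dst_file="$dst_path${src_file#"$src_path"}"
    # chown first: changing ownership can clear setuid/setgid bits
    $myecho chown --reference="$src_file" "$dst_file"
    $myecho chmod --reference="$src_file" "$dst_file"
    $myecho touch --reference="$src_file" "$dst_file"
  done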

You should run it with sudo (to allow chown) and with two parameters: the source and destination directories. The script only echoes what it would do. If you are satisfied with the output, change the line myecho=echo to myecho=.
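
Once the script has been run for real, it may be worth checking that the metadata actually matches. The following is only a sketch under assumptions: list_meta is a hypothetical helper (not part of the script above), the demo trees and file name are made up, GNU stat and touch are assumed, and chown is left out so the demo runs without root.

```shell
#!/bin/bash
# list_meta DIR: print "path mode owner group mtime" for every entry below DIR.
list_meta() {
  (cd "$1" && find . -mindepth 1 -exec stat -c '%n %a %U %G %Y' {} + | sort)
}

# Throwaway demo trees; the file name and date are made up for illustration.
src=$(mktemp -d) dst=$(mktemp -d)
echo data > "$src/a"; echo data > "$dst/a"
chmod 640 "$src/a"
touch -d '2011-08-12 08:55:00' "$src/a"

# Copy mode and mtime the way the script does (chown omitted: no root needed).
chmod --reference="$src/a" "$dst/a"
touch -c --reference="$src/a" "$dst/a"

# Identical listings mean the metadata now matches.
if [ "$(list_meta "$src")" = "$(list_meta "$dst")" ]; then
  echo "metadata matches"
else
  echo "metadata differs"
fi
```

On real trees, running list_meta on the source and on the destination and comparing the two listings with diff shows exactly which entries still differ.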

enzotib
  • Yes, that's what I need: --reference in chmod. Thank you. And I really appreciate it if anyone could introduce something like chmod --reference for copying time-stamps. – Mohammad Aug 12 '11 at 11:14
  • @Mohammad: for that you can use `touch --reference=otherfile file`. Updated the answer – enzotib Aug 12 '11 at 11:18
  • That's great. Actually I was reading touch manual just now ;-) – Mohammad Aug 12 '11 at 11:20
  • Just a note: `touch` by design only changes the modification and access times, the "creation" time is not affected. (I think ext2/3 do not support changing ctime anyway, but it might matter if you're using NTFS or the like). – Amro Apr 12 '15 at 08:43
  • In case you want to only change metadata of *existing* files and don't need to ensure the existence of files, add a `-c` switch to the `touch` command to stop it creating empty files in the `$dst_path`. – Synchro Jun 20 '16 at 16:23
  • Note that if you've run rsync with the `--partial` flag, there may be unfinished files. After running this script, their dates will no longer be older than the source's, so they won't be updated under certain flags. – CervEd Apr 06 '21 at 15:38
  • I should also note that with this method, or the other answer's `cp --attributes-only`, going from `ext4` to `exfat`, the time was rounded. This appeared to cause trouble with `rsync --update` as the times differed. I didn't investigate much further but it appeared that attributes copied using `rsync` didn't have this problem. – CervEd Apr 18 '21 at 12:06
  • How can I do this against a remote server? Please see https://unix.stackexchange.com/questions/721195/how-can-i-clone-an-entire-remote-directory-tree-and-file-structure-but-have-empt – user658182 Oct 16 '22 at 15:36
9

Treating the question as "rsync only has metadata to copy, so why is it so slow, and how can I make it faster?":

rsync usually uses equal mtimes as a heuristic to detect and skip unchanged files. Without --archive (specifically, without --times) the destination files' mtimes remain set to the time you rsync-ed them, while the source files' mtimes remain intact (ignoring manual trickery by you). Without external guarantees from you that the source files' contents haven't changed, rsync has to assume they might have and therefore has to checksum them and/or copy them to the destination again. This, plus the fact that --whole-file is implied for local->local syncs, makes rsync without --times approximately equivalent to cp for local syncs.

Provided that updating the destination files' contents is acceptable, or if the source files are untouched since the original copy, you should find rsync --archive --size-only quicker than a naive rsync.

If in doubt as to what rsync is copying that is taking so long, rsync --archive --dry-run --itemize-changes ... tells you in exhaustive, if terse, detail.
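
The combination described above can be tried on scratch directories before touching real data. This is a minimal sketch, assuming rsync and GNU touch/stat are installed; the directories, file name and date are invented for the demo.

```shell
#!/bin/bash
# Demo on throwaway directories; paths and file names are made up.
command -v rsync >/dev/null || { echo "rsync not installed"; exit 0; }

src=$(mktemp -d) dst=$(mktemp -d)
echo "payload" > "$src/file"
touch -d '2011-08-12 08:55:00' "$src/file"

# First pass without --archive: contents are copied, mtime is not preserved.
rsync --recursive "$src/" "$dst/"

# Preview what a metadata-fixing pass would do, without changing anything.
rsync --archive --size-only --dry-run --itemize-changes "$src/" "$dst/"

# Real pass: same-size files are not re-copied, but their metadata is updated.
rsync --archive --size-only "$src/" "$dst/"

# Show both modification times side by side.
stat -c '%y %n' "$src/file" "$dst/file"
```

In the dry-run output, entries that only need a metadata update should start with a `.` rather than a transfer marker such as `>`, which is a quick way to confirm that no file contents would be re-sent.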

muru
ZakW
  • Very useful info. --archive --size-only is a great combo. Not only does it prevent recopying files that already exist in the destination, but it will also update their metadata. This was unexpected for me, because rsync's man page describes --size-only as "skipping" files whose sizes match. Turns out that it just skips the copy, but will still sync the metadata. Ideal. – Chad von Nau Mar 06 '13 at 05:59
  • I did so, and it doesn't modify change or birth times (reported by `stat`). – Yaroslav Nikitenko Aug 17 '22 at 08:21
7

WARNING: Without special workarounds, GNU cp --attributes-only will truncate the destination files, at least on Ubuntu 12.04 (Precise). See the edit below.

Original:

In this situation you probably want GNU cp's --attributes-only option, together with --archive, as it's tried and tested code, does all filesystem-agnostic attributes and doesn't follow symlinks (following them can be bad!):

cp --archive --attributes-only /source/of/failed/backup/. /destination/

As with files, cp is additive with extended attributes: if both source and destination have extended attributes it adds the source's extended attributes to the destination (rather than deleting all of the destination's xattrs first). While this mirrors how cp behaves if you copy files into an existing tree, it might not be what you expect.

Also note that if you didn't preserve hard links the first time around with rsync but want to preserve them now then cp won't fix that for you; you're probably best off rerunning rsync with the right options (see my other answer) and being patient.

If you found this question while looking to deliberately separate and recombine metadata/file contents then you might want to take a look at metastore which is in the Ubuntu repositories.

Source: GNU coreutils manual


Edited to add:

cp from GNU coreutils 8.17 and later works as described, but coreutils 8.16 and earlier truncates files when restoring their metadata. If in doubt, don't use cp in this situation; use rsync with the right options and/or be patient.
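
One way to act on that version boundary is to check the installed cp and try the command on scratch files first. The sketch below makes several assumptions: version_ge is a hypothetical helper built on GNU sort -V, the first line of cp --version is assumed to end in the version number (as GNU cp prints it), and the scratch trees stand in for the real ones.

```shell
#!/bin/bash
# version_ge A B: succeed if version A >= version B (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Assumes GNU cp, whose first --version line ends in the version number.
cp_version=$(cp --version | head -n1 | awk '{print $NF}')

# Scratch trees standing in for the real source and destination.
src=$(mktemp -d) dst=$(mktemp -d)
echo "payload" > "$src/file"; echo "payload" > "$dst/file"
touch -d '2011-08-12 08:55:00' "$src/file"

if version_ge "$cp_version" 8.17; then
  cp --archive --attributes-only "$src/." "$dst/"
  # A non-empty destination file means nothing was truncated.
  [ -s "$dst/file" ] && echo "contents intact, metadata copied"
else
  echo "coreutils $cp_version would truncate files; use rsync instead"
fi
```

Only after the scratch run leaves the destination file non-empty would it make sense to point the same cp command at the real trees.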

I wouldn't recommend this unless you fully understand what you're doing, but earlier GNU cp can be prevented from truncating files using the LD_PRELOAD trick:

/*
 * File: no_trunc.c
 * Author: D.J. Capelis with minor changes by Zak Wilcox
 *
 * Compile:
 * gcc -fPIC -c -o no_trunc.o no_trunc.c
 * gcc -shared -o no_trunc.so no_trunc.o -ldl
 *
 * Use:
 * LD_PRELOAD="./no_trunc.so" cp --archive --attributes-only <src...> <dest>
 */

#define _GNU_SOURCE
#include <dlfcn.h>
#define _FCNTL_H
#include <bits/fcntl.h>

extern int errorno;

int (*_open)(const char *pathname, int flags, ...);
int (*_open64)(const char *pathname, int flags, ...);

int open(const char *pathname, int flags, mode_t mode) {
        _open = (int (*)(const char *pathname, int flags, ...)) dlsym(RTLD_NEXT, "open");
        flags &= ~(O_TRUNC);
        return _open(pathname, flags, mode);
}

int open64(const char *pathname, int flags, mode_t mode) {
        _open64 = (int (*)(const char *pathname, int flags, ...)) dlsym(RTLD_NEXT, "open64");
        flags &= ~(O_TRUNC);
        return _open64(pathname, flags, mode);
}
ZakW
  • `errorno` should be `errno`, right? – enzotib Apr 14 '15 at 07:41
  • A quick test removing it seems to work, so I guess I perpetuated a redundancy/mistake in [the original](https://stackoverflow.com/a/69884), but everyone will be on newer coreutils by now anyway. – ZakW Apr 15 '15 at 09:26
  • but what you call `rsync` with the right options is an answer to another question... – Jean Paul May 22 '19 at 12:01
  • On MacOS, those attributes are not available. The command `cp` does not have `--archive` or `--attributes-only`. You can use `cp -a sourcefile destfile`. The `-a` will copy attributes, date, etc, but unfortunately copies the contents as well. – Sohail Si Jan 27 '22 at 13:09
  • None of these work on MacOS. For a solution for MacOS, see my answer here: https://unix.stackexchange.com/questions/20645/clone-ownership-and-permissions-from-another-file – Sohail Si Jan 27 '22 at 16:40
3

I had to do this remotely to another computer, so I couldn't use --reference.

I used this to make the script...

find -printf "touch -d \"%Tc\" \"%P\"\n" >/tmp/touch.sh

But make sure there aren't any filenames with " in them first...

find | grep '"'

Then copy touch.sh to your remote computer, and run...

cd <DestinationFolder>; sh /tmp/touch.sh

find -printf also has directives for the user and group names (%u and %g) if you want to copy those.
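
A variation on the same idea avoids generating shell commands at all, which sidesteps the quoting problem with " in file names. This is only a sketch under assumptions: it uses GNU find and touch, carries the mtime as epoch seconds (sub-second precision is dropped), still assumes no tabs or newlines in file names, and the scratch directories stand in for the two machines.

```shell
#!/bin/bash
# Scratch directories standing in for the source and destination machines.
src=$(mktemp -d) dst=$(mktemp -d) list=$(mktemp)
echo x > "$src/my file"; echo x > "$dst/my file"
touch -d '2011-08-12 08:55:00' "$src/my file"

# On the source machine: one "epoch-mtime<TAB>relative-path" line per entry.
(cd "$src" && find . -mindepth 1 -printf '%T@\t%P\n') > "$list"

# On the destination machine, after copying the list file over:
tab=$(printf '\t')
(cd "$dst" && while IFS=$tab read -r ts path; do
  # ${ts%.*} strips the fractional seconds; -c avoids creating missing files.
  touch -c -d "@${ts%.*}" "$path"
done) < "$list"

# Show both modification times (epoch seconds) for comparison.
stat -c '%Y %n' "$src/my file" "$dst/my file"
```

Because the list holds data rather than commands, a file name containing quotes or $ is passed to touch as-is instead of being interpreted by the shell.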

niknah
  • Thanks for the ideas to a) "just use a shell script" and b) to generate said script using `find`. I was in the same situation - forgot to copy attributes, source and destination disks were already in different machines and didn't *really* want to reverse that. – i336_ Nov 08 '18 at 12:32
2

In local transfers, where source and destination are both on locally mounted filesystems, rsync copies whole files by default (--whole-file is implied). To make it use its delta-transfer algorithm instead, you can use

rsync -a --no-whole-file source dest
enzotib
  • I tried rsync with --no-whole-file and --progress and I can still see the copying progress (about 30 MB/s); so I guess it is not fast enough, yet. I am losing my hope on rsync... – Mohammad Aug 12 '11 at 10:40
  • This option is used to tell `rsync` not to use the shortcut when files are both in local path, but it does not prevent `rsync` from copying the content. – Jean Paul May 22 '19 at 11:45