112

I'm running rsync to sync a directory onto my external USB HDD. It's about 150 gigs of data. 50000+ files I would guess.

It's running its first sync at the moment, but it's copying files at a rate of only 1-5 MB/s. That seems incredibly slow for a USB 2.0 enclosure. There are no other transfers happening on the drive either.

Here are the options I used:

rsync -avz --progress /mysourcefolder /mytargetfolder

I'm running Ubuntu Server 9.10.

Graham Leggett
Jake Wilson
    are you sure you're getting a USB2 connection? does a (non-rsync) copy or other write operation run at normal speeds? if not, have you tried a copy/other write op with another USB port/cable? – quack quixote Feb 17 '10 at 00:14
  • See also http://serverfault.com/questions/43014/copying-a-large-directory-tree-locally-cp-or-rsync - where people also propose using two piped `tar` commands or `cpio`. – Blaisorblade Feb 23 '13 at 17:32
  • As @tom-hale pointed out, in your case compression makes no sense, because you are copying between local filesystems. Compression only makes sense when you copy between two hosts over a network. – mac13k Dec 21 '20 at 13:38

12 Answers

68

If you're using rsync over a fast network, or disk to disk on the same machine,

not using compression (-z)

and using --inplace

speeds it up to the performance of the hard drives or the network.

Compression uses lots of CPU.

Not using --inplace makes the hard drive thrash a lot (rsync writes to a temp file before creating the final file).

Compression and not using --inplace are better for doing it over the internet (a slow network).

NEW: Be aware of the destination... if NTFS "compression" is enabled there, it severely slows down large files (I'd say 200MB+); rsync almost seems stalled, and this is the cause.
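
A minimal sketch of what that looks like for the OP's local disk-to-disk case (reusing the paths from the question, dropping -z and adding --inplace):

rsync -a --inplace --progress /mysourcefolder /mytargetfolder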

Scott Kramer
60

First - the number of files in this case is going to be a major factor. At 150GB across 50,000+ files, that's an average size of roughly 3MB each, so there's probably an I/O bottleneck influencing the speed in the OP's case. More here - that's a pretty dry read, but the cover picture is worth it.

So, using rsync to copy to an empty directory? Here are some ways to speed it up:

  1. No -z - definitely don't use -z as in the OP.
  2. --no-compress might speed you up. This could have the biggest impact... my test was 13,000 files, total size 200MB, using rsync 3.1.3, syncing to a different partition on the same internal SSD. With --no-compress I get 18 MBps, and without it I get 15 MBps. cp, by the way, gets 16 MBps. That's a much smaller average file size, though. Also - I can't find any documentation for --no-compress; I learned about it from this post on stackexchange.com.
  3. -W to copy files whole - always use this if you don't want rsync to compare differences; never mind that the point of rsync is to compare differences and only update the changes.
  4. -S to handle sparse files well - it can't hurt if you don't have sparse files.
  5. --exclude-from or something similar to exclude files you might not need will cut down the total time, but it won't increase your transfer speed.
  6. Sending the output to a file instead of the terminal may help, like this: rsync -a /source /destination >/somewhere/rsync.out 2>/somewhere/rsync.err - the first > writes a file with all the output you would normally see, and the 2> captures the error messages.
  7. Finally, running multiple instances of rsync for different parts of your transfer could be a big help (see the sketch after this list).
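
As a rough sketch of item 7 (not from the original answer - /source, /destination, and the parallelism level of 4 are placeholders), you can give each top-level entry its own rsync process:

find /source -mindepth 1 -maxdepth 1 -print0 | xargs -0 -P4 -I{} rsync -a {} /destination/

This only helps if the bottleneck is per-file overhead rather than raw disk bandwidth.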

My command would be:

rsync -avAXEWSlHh /source /destination --no-compress --info=progress2 --dry-run

If all looked well, I'd delete "--dry-run" and let it go. A, X, and E cover extended attributes and permissions not covered by -a; l is for soft links, H is for hard links, and h is for human-readable output.

Updating an already synced directory on a USB drive, or the same drive, or over a network, will all require different rsync commands to maximize transfer speed.

Bonus - here's the rsync man page, and if you want to test your hard drive speed, bonnie++ is a good option, and for your network speed, try iperf.


*The post is almost ten years old, but search engines sure like it, and I keep seeing it. It's a good question, and I don't think the top answer to "how to speed up rsync" should be "use cp instead."

Fin Hirschoff
54

Use the -W option. This disables delta/diff comparisons. When the file time/sizes differ, rsync copies the whole file.

Also remove the -z option. This is only useful for compressing network traffic.

Now rsync should be as fast as cp.
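
Applied to the command from the question, that would look something like this (a sketch, not from the original answer):

rsync -avW --progress /mysourcefolder /mytargetfolder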

vdboor
  • Minor note: ``-z`` is only useful for **low speed** network traffic. If your network is fast enough, it'll slow things down, since you'll be limited by CPU. – WhyNotHugo Jul 08 '13 at 02:28
  • These tips vastly sped up the transfer of my files between two NAS devices, thanks! – djhworld Sep 22 '13 at 13:30
  • But note that the man page says for `-W`: "This is the default when both the source and destination are specified as local paths, but only if no batch-writing option is in effect." – GuoLiang Oon Jul 05 '17 at 08:15
  • `-W` was the key in my case (over local network). I went from ~3.5 MB/s to ~35 MB/s. A 10x factor!! – logoff Jan 09 '21 at 12:19
  • The `-W` also worked spectacularly for me. A 10x factor, exactly like logoff. Especially super-useful for big files! – robertspierre Jan 29 '21 at 02:25
  • Do not forget `--inplace` here – Philippe Remy Jun 17 '23 at 16:32
51

For the first sync just use

cp -a  /mysourcefolder /mytargetfolder

rsync only adds overhead when the destination is empty.

Also, the -z option is probably killing your performance; you shouldn't be using it if you are not transferring data over a slow link.

user23307
  • rsync is so called because it's for *remote* synchronization and is not really appropriate for a locally-connected volume for this very reason. – msanford Jun 27 '10 at 22:30
  • It's supposed to be usable also for local transfers, and it's much more flexible. It's only possibly overkill for the first sync. – Blaisorblade Feb 23 '13 at 17:31
  • rsync is also a one-way sync. Very good for backing up to a server or from a server. However, if you want local TWO-way sync to a removable drive, you may want to check out csync https://www.csync.org/get-it/ - not to be confused with csync2, which is a completely different project. – Jesse the Wind Wanderer Dec 31 '14 at 13:32
  • `rsync -avz --progress /mysourcefolder/ /mytargetfolder` or you'll get a copy of `mysourcefolder` inside of `mytargetfolder` rather than mirroring the contents – editor Dec 01 '17 at 08:16
  • This answer does not answer the question. The question was about how to optimize rsync - not replace it with the cp command. – oemb1905 Nov 25 '18 at 04:08
  • I disagree that this should be done on the first sync: what if it's interrupted? I find myself hitting this from time to time, and rsync can continue where cp can't. Is there *significant* overhead? If yes, one may argue for cp. If not, it's irrelevant. – Mayou36 Oct 14 '22 at 09:04
  • If it's the first sync, check out Fin's answer: https://superuser.com/a/1361692/977552 – evantkchong Jan 16 '23 at 07:14
5

You definitely want to give rclone a try. This thing is crazy fast:

$ tree /usr
[...]
26105 directories, 293208 files

$ sudo rclone sync /usr /home/fred/temp -P -L --transfers 64
Transferred:   17.929G / 17.929 GBytes, 100%, 165.692 MBytes/s, ETA 0s
Errors:        75 (retrying may help)
Checks:        691078 / 691078, 100%
Transferred:   345539 / 345539, 100%
Elapsed time:  1m50.8s

This is a local copy from and to a LITEONIT LCS-256 (256GB) SSD.

You can add --ignore-checksum on the first run to make it even faster.
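
For illustration (a sketch reusing the paths from the run above), that first sync could then be:

$ sudo rclone sync /usr /home/fred/temp -P -L --transfers 64 --ignore-checksum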

Frédéric N.
5

Avoid

  • -z/--compress: compression will only load up the CPU as the transfer isn't over a network but over RAM.
  • --append-verify: resume an interrupted transfer. This sounds like a good idea, but it has a dangerous failure case: any destination file the same size as (or greater than) the source will be IGNORED. Also, it checksums the whole file at the end, meaning no significant speed-up over --no-whole-file while adding a dangerous failure case.

Use

  • -S/--sparse: turn sequences of nulls into sparse blocks
  • --partial or -P which is --partial --progress: save any partially transferred files for future resuming. Note: files won't have a temporary name, so ensure that nothing else is expecting to use the destination until the whole copy has completed.
  • --no-whole-file so that anything that needs to be resent uses delta transfer. Reading half of a partially transferred file is often much quicker than writing it again.
  • --inplace to avoid a file copy (but only if nothing is reading the destination until the whole transfer completes); a combined sketch follows this list
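
Putting the "Use" flags together for a local copy might look something like this (a sketch with placeholder paths, not from the original answer; --inplace already implies --partial):

rsync -aS --no-whole-file --inplace /source/ /destination/
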
Tom Hale
3

You don't say what size distribution your files have. If there are many small files, this will reduce the overall transfer rate by increasing head-movement latency on both the source and destination drives, as the tool opens new files and the OS keeps directory entries and other metadata up to date during the transfer (such as the filesystem's journal, if you are using metadata journaling as ext3/ext4 and NTFS do by default). A file copy process only "gets into its stride" with larger objects, when a simple bulk transfer is happening.

David Spillett
1

I found my answer here:

https://gist.github.com/KartikTalwar/4393116

rsync -aHAXxv --numeric-ids  --progress -e 'ssh -T -c aes128-gcm@openssh.com -o Compression=no -x ' <source_dir> user@<host>:<dest_dir>

Some explanation of the switches below:

rsync (Everyone seems to like -z, but it is much slower for me)

a: archive mode - recursive, preserves owner, preserves permissions, preserves modification times, preserves group, copies symlinks as symlinks, preserves device files.
H: preserves hard-links
A: preserves ACLs
X: preserves extended attributes
x: don't cross file-system boundaries
v: increase verbosity
--numeric-ids: don't map uid/gid values by user/group name
--progress: show progress during transfer

ssh

T: turn off pseudo-tty to decrease cpu load on destination.
c aes128-gcm@openssh.com: use the weakest but fastest SSH encryption.
o Compression=no: Turn off SSH compression.
x: turn off X forwarding if it is on by default.
mati kepa
0

I don't have any reason to suspect rsync is the culprit for this slow speed.

I would suspect that the drive itself or its filesystem is the issue.

In fact, a few factors hint at a possible cause. You mention it's a USB drive, and you're transferring 150GB at once. I would hazard a guess that your USB drive uses Shingled Magnetic Recording (SMR).

SMR drives overlap the tracks in such a way that reading is done as normal, but every write to the drive, since it overwrites the neighbouring tracks as well, requires the drive to re-write a sizeable chunk of the drive, and the writing is slow. The miracle is that these drives perform as normal most of the time, because the drive remaps writes to a temporary holding area that doesn't use SMR, and then re-writes them to the drive later in the background. But this holding area is only a few GB in size, so for any continuous transfers longer than, say, 10GB, the write speed drops dramatically down to around 5MB/s. If you stop writing to the drive and let it idle for about 10 minutes, its performance will return to normal because it has had time to clear its temporary holding area, but will fall again if you do another multi-GB sized continuous write.

Despite some comments to the contrary, rsync is nice and fast when operating locally, because it alters its setup accordingly: when operating locally, rsync uses no delta transfer. And since compressing the transfer would make no sense when it's essentially just going over a local pipe, I doubt the -z has any effect, although the documentation doesn't say. Even if it does, it shouldn't be clamping the speed to <5MB/s, as its compression is capable of many times that speed.

And, --inplace only affects rsync's behaviour when the file exists at the destination and only part of the file needs to be updated. Since delta transfer is disabled when rsync is operating locally, I believe it should have no effect.

thomasrutter
0

Avoid using the -v, --verbose option, except when debugging. Especially when redirecting the output to some file. In my case, running rsync with -v was 8x slower than letting it run silently. Consider using the --log-file parameter instead.
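
For instance, a quiet run that still keeps a record could look something like this (a sketch; the paths are placeholders, not from the original answer):

rsync -a --log-file=/somewhere/rsync.log /source /destination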

Illya Moskvin
  • Other answers here already contain good optimizations. This is one I haven't seen mentioned yet. For context, I invoked `rsync` in a "User data" script that runs on AWS EC2 instance startup. The user data script writes to `/var/log/cloud-init-output.log`. I was syncing about 25,000 files, so it needed to write 25K lines to the log file. Running rsync silently took about 12 seconds. Running the same command with `-v` took 100 seconds. Huge difference! – Illya Moskvin Aug 24 '21 at 00:59
0

I made rsync to my locally attached USB drives run ten times faster by doing the following:

sudo sysctl -w vm.dirty_bytes=50331648
sudo sysctl -w vm.dirty_background_bytes=16777216

These lower the kernel's dirty page-cache thresholds, so data is flushed to the slow USB drive in smaller batches instead of piling up in RAM.
  • No need for any ssh as there is no network: --rsh=/usr/bin/rsh

  • No need to burn cycles compressing as there is no network: --no-compress

  • --inplace This option changes how rsync transfers a file when its data needs to be updated: instead of the default method of creating a new copy of the file and moving it into place when it is complete, rsync instead writes the updated data directly to the destination file.

    rsync -ta -H --inplace --no-compress --rsh=/usr/bin/rsh
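
Filled in with placeholder source and destination paths (not part of the original answer), the full invocation would be along the lines of:

    rsync -ta -H --inplace --no-compress --rsh=/usr/bin/rsh /source/ /destination/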

user939857
0

As I didn't see this answer here yet: what helped me speed up a very slow resync copy inside a Hyper-V console was to "quiet" the "pretty verbose" default output.

The console output was slowing everything down, and adding -q to the rsync command sped everything up a lot.
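
For illustration (a sketch using the paths from the question, with the other flags omitted), the quiet form is simply:

rsync -aq /mysourcefolder /mytargetfolder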

Best regards Marko
