Quickly Transferring Files in Linux: Transfer Files Intelligently with rsync

Posted on Oct 13, 2014

This weekend I spent some time migrating my Minecraft server a good 700 miles closer to my house. The new server is ~30 milliseconds closer. If you play games, you probably know how huge that can be.

So, how did I move data to the new server? FTP, right? No. Definitely no.

Why not FTP?

FTP is plain-text

Unless you're using something like FTPS, you'll be sending your server credentials across the wire in plain text, so anyone sitting between you and your new server has all the info they need to log in.

Per-file overhead can kill speed

My Minecraft server had just over 1000 files in the world save. Every file sent over FTP needs its own round of commands and, by default, its own data connection. Across a thousand files that overhead adds up, and it shows itself as downtime. Not so cool.

No Deltas

If I use FTP I have to fully stop the server before sending any data across since there isn't a practical way for me to know if something changed after I take the first cut. That means even more downtime.

How about rsync?

As you can probably guess, we can avoid all of these problems with rsync. It has much lower per-file overhead, only sends files that have changed, and usually runs over SSH [1].

We'll copy our local ‘minecraft’ folder to our new server as jdoe at example.com with one quick command.

rsync -avz --progress minecraft jdoe@example.com:/home/jdoe

sending incremental file list
created directory test
minecraft/
minecraft/ForgeModLoader-server-0.log
     1452252 100%  135.37MB/s    0:00:00 (xfer#1, to-check=1038/1040)
minecraft/ForgeModLoader-server-0.log.lck
           0 100%    0.00kB/s    0:00:00 (xfer#2, to-check=1037/1040)
minecraft/ForgeModLoader-server-1.log
     1525594 100%   76.57MB/s    0:00:00 (xfer#3, to-check=1036/1040)
...
sent 904047183 bytes  received 18332 bytes  139087002.31 bytes/sec
total size is 903869415  speedup is 1.00

So, how did that strange line work?

Decomposing the Command

There's actually a lot going on in this short command. Let's start with -a. This one letter tells rsync that we want to archive this folder, which among other things means recursing into subdirectories and preserving file permissions and ownership [2].

Next was -v for verbose. That might seem a bit unimpressive after all that was packed into the previous argument, but without it, rsync sits with a blank screen until the transfer completes, leaving you to guess whether everything is working properly.

After that we've got -z, which compresses the data in transit. This increases CPU demand, but can greatly lower the amount of bandwidth required, at least for data that isn't already compressed.

Finally, we've added --progress so that during file transfer we can see speed, time elapsed, and percentage complete. Like -v this is optional, but ends up being quite nice.

This brings us to the source for our file copy. For our case, it is our local folder name without a trailing slash. The lack of slash is important here. Since we didn't include a slash, the source folder is copied into the destination folder. With a slash, the contents of our minecraft folder would have been copied into our destination, without a parent ‘minecraft’ folder.
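The trailing-slash rule is easy to get backwards, so here is a small local sketch of both forms. The scratch directories (without_slash, with_slash) and the file level.txt are hypothetical names used only for this demo.

```shell
# Hypothetical scratch area for comparing the two source forms.
demo=$(mktemp -d)
mkdir -p "$demo/minecraft/world" "$demo/without_slash" "$demo/with_slash"
echo "seed=1234" > "$demo/minecraft/world/level.txt"

# No trailing slash: the minecraft folder itself is created inside the destination.
rsync -a "$demo/minecraft" "$demo/without_slash"

# Trailing slash: only the contents of minecraft are copied into the destination.
rsync -a "$demo/minecraft/" "$demo/with_slash"

# List every copied file so the two layouts can be compared side by side.
find "$demo/without_slash" "$demo/with_slash" -type f
```

The first copy ends up at without_slash/minecraft/world/level.txt, while the second lands at with_slash/world/level.txt with no parent minecraft folder.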

Lastly, we add our destination folder. In this case, we're taking the form ‘username@server:/filepath’. Because this form puts a single colon after the host, rsync connects through a remote shell, which is SSH by default.
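If the single-letter flags are hard to remember, the same transfer can be spelled with rsync's long options, which may read better in a script. This is only a sketch: jdoe@example.com is the placeholder from above, so the command is printed rather than run, and --rsh=ssh just makes the default transport explicit.

```shell
# Long-option spelling of the same transfer (--archive is -a, --verbose is -v,
# --compress is -z, --rsh is -e). The remote is a placeholder, so we only
# print the command instead of executing it.
remote="jdoe@example.com:/home/jdoe"
cmd="rsync --archive --verbose --compress --progress --rsh=ssh minecraft $remote"
echo "$cmd"
```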

Copying the Delta

After getting the first cut of the data onto the server, I spent a bit of time verifying that everything was working. Once I was satisfied that it was, it was time to copy over any changes from the old server.

An important detail here is that not only can files be updated and created, they can also be deleted. With FTP, this would have forced us to take a brand new cut of the data. With rsync, we'll modify our previous command so that it handles deletions, and for now only shows us the differences.

rsync -avz --progress --delete --dry-run minecraft jdoe@example.com:/home/jdoe

sending incremental file list
minecraft/world/DIM1/region/
deleting minecraft/world/DIM1/region/r.0.0.mca

sent 27571 bytes  received 114 bytes  55370.00 bytes/sec
total size is 902472679  speedup is 32597.89 (DRY RUN)

We've made two changes: --delete and --dry-run. We added --delete in order to delete any files on the target server that are not present on the source server. Alongside it, we added --dry-run for sanity. This gives us a last chance to see what would happen if we were to run the command. Since a typo in the source or destination could cause data loss, I highly advise using a dry run before deleting.

In this case, everything looks correct, so we'll remove the --dry-run. A second or two later, our destination folder should be a mirror of our source folder.
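If you'd like to rehearse this before pointing it at a real server, the dry-run-then-delete sequence can be tried locally. This is a sketch under assumed names: the scratch directories and the two region files are hypothetical stand-ins for the world save.

```shell
# Hypothetical scratch area standing in for the old and new servers.
demo=$(mktemp -d)
mkdir -p "$demo/minecraft/region" "$demo/dest"
echo "chunk data" > "$demo/minecraft/region/r.0.0.mca"
echo "chunk data" > "$demo/minecraft/region/r.0.1.mca"
rsync -a "$demo/minecraft" "$demo/dest"

# Remove a file on the source side after the first cut.
rm "$demo/minecraft/region/r.0.0.mca"

# Dry run: reports the deletion but leaves the destination untouched.
rsync -av --delete --dry-run "$demo/minecraft" "$demo/dest"
after_dry_run=$(ls "$demo/dest/minecraft/region")
echo "after dry run: $after_dry_run"

# Real run: the stale file is removed from the destination.
rsync -av --delete "$demo/minecraft" "$demo/dest"
after_real_run=$(ls "$demo/dest/minecraft/region")
echo "after real run: $after_real_run"
```

After the dry run both region files are still in the destination; only the real run removes r.0.0.mca.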

Footnotes


  1. While rsync can be used over its native protocol, that usually isn't the case.

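  2. -a is shorthand for -rlptgoD, so it also tells rsync to copy symbolic links as links (rather than following them), preserve modification times and group, and copy device and special files. Ownership and device files are only preserved when the receiving side runs as the super-user. In this case we didn't care about any of that functionality.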