I used to not back up my data. I know, I know, I was just asking for trouble. When I came to my senses (fortunately before losing anything important), I started looking for the cheapest backup solution possible. Here’s what I came up with.[1]
The gold standard for backups is “3–2–1”: three copies of your data, two of which are on different local devices, and one of which is offsite. Don’t settle for less, because you don’t have to.
The first copy is easy – it’s the one you’ve already got on your hard drive. One down, two to go.
The second copy is local, but on a different device. The benefit to having a second local copy is that in the event that your hard drive blows up, you can get your data back immediately, without having to download anything. It also means you don’t fully depend on your remote copy.
The conventional way to make a second copy is using an external hard drive, and you can get a decent one of those for $50. But we’re going for a dramatically cheap backup solution here, so how about we consider a different option: USB drives.
We gotta put that USB drive somewhere though, so let’s buy a Raspberry Pi. It’ll be our cheap little backup server. If we’re just backing up one computer, we could plug the USB straight into it, but this way we can back up multiple machines using the same setup. A Pi and a USB drive together will cost you a good $35, but you’ve probably got a USB drive lying around already, and maybe you already have a Pi too. So, optimistically, we’re at $0 so far.
Start the Pi and mount your USB drive at /opt/usb1. Copy your public ssh key into /home/pi/.ssh/authorized_keys and make sure you can ssh into the Pi from your local machine.
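If your key isn’t on the Pi yet, ssh-copy-id does both steps in one go (raspberrypi.local is the Pi’s default mDNS hostname; substitute its IP address if that doesn’t resolve):

```
ssh-copy-id pi@raspberrypi.local
ssh pi@raspberrypi.local   # should log you in without a password prompt
```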
Make a folder on the Pi at /opt/usb1/backups/, and run chown -R pi:pi /opt/usb1/ so that the pi user can write to it.
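All together, on the Pi (the /dev/sda1 device name is an assumption; check yours with lsblk):

```
lsblk                            # find your USB drive (assumed /dev/sda1 here)
sudo mkdir -p /opt/usb1
sudo mount /dev/sda1 /opt/usb1
sudo mkdir -p /opt/usb1/backups
sudo chown -R pi:pi /opt/usb1/
```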
Here’s the script we’re going to use to back up our stuff.
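Something along these lines, as a minimal sketch – the pi@raspberrypi.local address and the source path are assumptions, so adjust them to your setup:

```zsh
#!/usr/bin/env zsh
set -e

# What to back up, and where to (the address and paths are assumptions)
SRC="$HOME/"
PI="pi@raspberrypi.local"
DEST="/opt/usb1/backups"

# Name the snapshot after the current date and time, e.g. 2020-May-24_13:00:00
SNAPSHOT="$(date +%Y-%b-%d_%H:%M:%S)"

# --link-dest makes rsync hardlink any file that hasn't changed since the
# last snapshot instead of storing a second copy. On the very first run,
# "last" doesn't exist yet; rsync warns and simply copies everything.
rsync -a --link-dest="$DEST/last" "$SRC" "$PI:$DEST/$SNAPSHOT/"

# Point the "last" symlink at the snapshot we just made
ssh "$PI" "ln -sfn $DEST/$SNAPSHOT $DEST/last"
```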
Notice that this is a zsh script. I find it nicer to script in than bash. sudo apt install zsh and you’ve got it. Not everyone has apt, of course – use whichever package manager comes with your system; I’ll stick with apt in my examples here.
Run ./snapshot.zsh on your computer and, if you’ve set everything up right, you should see all your data zap itself over to your Raspberry Pi.
Here’s how this script works: the first time you run it, it copies all your data into a folder named with the date and time, and creates a symbolic link to that folder called last, so last always points to your most recent backup.
Every subsequent time you run it, the script finds the difference between the current state of your files and your most recent backup, and uploads only the new data.
If you change 10% of a file, the 90% that stays the same is copied over from the previous backup, instead of being sent over the wire.
If you don’t change the file at all (here’s the cool part) it hardlinks the file from your last backup, so that it doesn’t consume any extra disk space.
That’s the magic of rsync’s --link-dest.
This way, your remote drive winds up looking like this:
```
/opt/usb1/backups:
2020-May-24_13:00:00/
2020-May-26_07:21:22/
2020-May-26_13:52:31/
2020-May-27_13:00:00/
2020-May-27_18:48:47/
2020-May-28_13:40:33/
2020-May-29_13:00:00/
2020-May-30_13:00:00/
2020-May-31_13:00:00/
last/
```
Every one of those folders is a snapshot of your data at that exact point in time. This is (fun fact) exactly how Mac’s Time Machine backups work. If your USB drive starts to get too full, just delete some of the old snapshots.
So that’s the second local copy done. What about the third one?
Well, in the spirit of frugality, I use the cheapest cloud storage I can find: Backblaze B2. B2 is S3-compatible object storage at $0.005/GB/month. Storing 10 GB for a year will cost you 60 cents.
I like B2 because it’s cheap even when you’re not buying storage in bulk. B2 works out to $60/TB/year, which is good, but not unheard-of. I don’t have a whole terabyte of valuable data though. I only have a few gigs, and B2 lets me store that for basically nothing.
That being said, I’m not married to Backblaze and you shouldn’t be either. Vendor lock-in is not fun. So these scripts will work with any S3-compatible object storage API.
Speaking of which, here are the scripts I use to back my data up to the cloud. Put them on the Raspberry Pi.
The first one tars & compresses your most recent backup, encrypts it using pyAesCrypt, and uploads it to B2 (or any other cloud storage that’s compatible with the S3 API).
Make sure you install pyAesCrypt on the Pi before running the script: pip3 install pyAesCrypt.
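A minimal sketch of what such a script could look like – the paths, the password file, and the encrypt_and_upload.py name are all assumptions, and the uploader it imports is the next script:

```python
#!/usr/bin/env python3
# encrypt_and_upload.py (hypothetical name): tar, compress, encrypt, upload.
import subprocess
import pyAesCrypt

from upload import upload_file  # the uploader script below (assumed name)

SNAPSHOT = "/opt/usb1/backups/last/"
ARCHIVE = "/tmp/backup.tar.gz"
ENCRYPTED = "/tmp/backup.tar.gz.aes"

# Tar & gzip the most recent snapshot
subprocess.run(["tar", "-czf", ARCHIVE, "-C", SNAPSHOT, "."], check=True)

# Encrypt the archive; the password file location is an assumption
with open("/home/pi/.backup_password") as f:
    password = f.read().strip()
pyAesCrypt.encryptFile(ARCHIVE, ENCRYPTED, password, 64 * 1024)

# Upload it under a fixed name, so the cloud keeps it as a new version
upload_file(ENCRYPTED, "backup.tar.gz.aes")
```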
The next one is simple enough; it handles the actual file uploading. It depends on AWS’s boto3 Python API, so install that using pip3 install boto3.
If you want to isolate these Python dependencies in a virtualenv, that’s probably smart, but I haven’t.
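Here’s a sketch of the uploader, saved as upload.py to match the import above (the s3.conf format and variable names are my guesses; the boto3 calls themselves are the real API):

```python
#!/usr/bin/env python3
# upload.py (hypothetical name): push one file to S3-compatible storage.
import boto3


def read_conf(path):
    """Parse simple KEY=value lines into a dict (the format is an assumption)."""
    conf = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#"):
                key, _, value = line.partition("=")
                conf[key] = value.strip().strip('"')
    return conf


def s3_client():
    s3conf = read_conf("/home/pi/s3.conf")
    return boto3.client(
        "s3",
        endpoint_url=s3conf["S3_ENDPOINT"],
        aws_access_key_id=s3conf["S3_KEY_ID"],
        aws_secret_access_key=s3conf["S3_SECRET"],
    )


def upload_file(local_path, key):
    bucket = read_conf("/home/pi/backup.conf")["BUCKET"]
    s3_client().upload_file(local_path, bucket, key)


if __name__ == "__main__":
    import sys
    upload_file(sys.argv[1], sys.argv[2])
```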
The last script is a bit more complicated. See, B2 and other S3-compatible object storage systems let you version a file: every time you upload a new backup under the same name, you’re adding a new version of the same file. You can expire these versions after a certain amount of time, but I didn’t want to do that, so I wrote this script, which preserves the three most recent versions of a file and removes the rest. That way, if I stop doing backups for some reason, the old versions won’t expire.
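Something in this direction, reusing the helpers from upload.py (list_object_versions and delete_object are the real boto3 calls; pagination is glossed over for brevity):

```python
#!/usr/bin/env python3
# prune_versions.py (hypothetical name): keep the newest $COPIES versions
# of each object in the bucket and delete the rest.
from upload import read_conf, s3_client


def prune(bucket, copies):
    s3 = s3_client()
    versions = s3.list_object_versions(Bucket=bucket).get("Versions", [])

    # Group the versions by object key
    by_key = {}
    for v in versions:
        by_key.setdefault(v["Key"], []).append(v)

    for key, vs in by_key.items():
        # Newest first; everything past the first `copies` entries goes
        vs.sort(key=lambda v: v["LastModified"], reverse=True)
        for v in vs[copies:]:
            s3.delete_object(Bucket=bucket, Key=key, VersionId=v["VersionId"])


if __name__ == "__main__":
    conf = read_conf("/home/pi/backup.conf")
    prune(conf["BUCKET"], int(conf["COPIES"]))
```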
There are two config files you need to run these scripts. backup.conf holds information about your backup:
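Something like this, say (the variable names are my guesses, chosen to match the sketches above):

```
# backup.conf
BUCKET="myname-backups"
COPIES=3
```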
s3.conf holds the authentication information to upload to the cloud:
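Under the same assumptions (the endpoint shown is just the shape of Backblaze’s S3 endpoints; use your bucket’s actual region):

```
# s3.conf
S3_ENDPOINT="https://s3.us-west-002.backblazeb2.com"
S3_KEY_ID="your-key-id"
S3_SECRET="your-application-key"
```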
Log into your cloud storage account and create a bucket called myname-backups (or whatever – just make sure it’s the same as the one in the config file).
The $COPIES parameter in backup.conf is set to 3, which tells the script to keep three versions of your data in the cloud – the most recent copy, plus the last two versions.
Set it to whatever you want, depending on whether you want to maximize safety or minimize spending.
To copy your backup to the cloud, run this:
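With the hypothetical script names from the sketches above, that’s something like:

```
python3 /home/pi/encrypt_and_upload.py && python3 /home/pi/prune_versions.py
```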
Dump that into a script, run it once a week as a cron job for the pi user, and you’ve got your third copy of the data in the cloud.
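For instance, with the one-liner above saved as cloud_backup.zsh (another assumed name), run crontab -e as the pi user and add a line like:

```
# Every Monday at 13:00
0 13 * * 1 /home/pi/cloud_backup.zsh
```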
That’s it! 3–2–1 — three copies, two on different local devices, one off-site. With nothing more than a USB stick, a Raspberry Pi, and a few cents a year.
So far, we’ve only talked about using a USB drive as a medium to store your backups. USB drives are a fairly cheap solution, but they’re not the most reliable thing in the world. If you want something really robust, get a network-attached storage device and set up a RAID array.[2] But on this blog, cheap never precludes robust: let’s get a second USB drive and set up a fake RAID.
Plug both USBs into the Pi, mount them at /opt/usb1 and /opt/usb2 (or wherever), and copy the following into a file called mirror_usb.zsh.
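Here’s a sketch of what could go in it – the two-pass rsync is one way to get the decay detection promised below, and the mail(1) alert (plus you@example.com) is an assumption:

```zsh
#!/usr/bin/env zsh
set -e

SRC=/opt/usb1/
DST=/opt/usb2/

# First, mirror any legitimate changes (new snapshots, deleted old ones)
rsync -a --delete "$SRC" "$DST"

# Then compare the drives byte-for-byte. Anything that still differs after
# the mirror pass has matching size and timestamp but different content,
# which smells like bit rot on one of the drives.
ROT="$(rsync -acn --itemize-changes --delete "$SRC" "$DST")"
if [[ -n "$ROT" ]]; then
    printf 'mirror_usb: possible data decay:\n%s\n' "$ROT" \
        | mail -s "USB mirror alert" you@example.com
fi
```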
Add mirror_usb.zsh to your crontab, run it once per day, and voila: your second USB drive has become an automated mirror of the first one, which will detect and alert you to any data decay.
If you want to get really fancy, you can set up a little Flask server on your Pi to receive those data-decay alerts. You could have it also keep track of the most recent backup from your computer, and the last time a backup was uploaded to the cloud, and put this information in a little web dashboard, but this is all beyond the scope of this article.
I just wanted to share some handy shell scripts.
[1] Caveat: I use a fair amount of shell scripting in here. If you use Windows, you’ll have to use WSL or something to run the scripts.
[2] Full disclosure: I actually did get a used NAS instead of using a USB stick.