Adam's blog: My new home data backup strategy

23 Mar 2026, 943 words

A few years back, I described how my off-site home data backup strategy worked. Since then, it had become due for a big overhaul, which I have now completed. First, let’s go over how the old setup worked and how I improved on it.

Previous setup

In the past, my setup was fairly simple. I had one main server in the basement and one Raspberry Pi (RPi) in another house. Connected to the main server were two sets of HDDs – one for primary data, and a second for full backups. The RPi also had its own set of HDDs attached, but only for full backups. One crontab job rsynced data from the main HDDs to the backup HDDs, while a second rsynced the same data to the off-site RPi.
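A hypothetical recreation of what those two weekly crontab jobs could have looked like (the paths, schedule, and hostname are made up for illustration):

```shell
# Local mirror: main HDDs -> backup HDDs, every Sunday at 01:00.
0 1 * * 0  rsync -a --delete /mnt/main/ /mnt/backup/
# Off-site copy: main HDDs -> the RPi in the other house, every Sunday at 03:00.
0 3 * * 0  rsync -a --delete /mnt/main/ rpi.example.net:/mnt/offsite/
```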

This setup worked well enough for multiple years, but it had two main drawbacks:

  1. As the backup ran only once a week, any changes made between runs were unprotected for up to a week.
  2. The upload to the off-site RPi was so slow that the backup completed successfully only when little new data had been added, so in practice it finished only once every few months.

It was evident that something needed to change; a week of work is a week of work, after all. You don’t want to lose it.

Current setup

My current setup is much more layered. The first stage is daily snapshots using my backup-prepare tool. It creates a full snapshot of a directory’s current state either via hardlinks or via BTRFS subvolumes; the latter is instant and is the method I use. Every midnight, a read-only snapshot is created and kept for 8 days. Unless there is a hard drive failure, I lose at most a day’s work if a restore is needed.
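A minimal sketch of what the nightly BTRFS variant boils down to, assuming the data lives on a subvolume at /mnt/data and snapshots land in /mnt/data/.snapshots (the paths and pruning rule are illustrative; my actual backup-prepare tool wraps this logic):

```shell
# Create an instant, read-only snapshot named after today's date.
btrfs subvolume snapshot -r /mnt/data "/mnt/data/.snapshots/$(date +%F)"
# Prune read-only snapshots older than 8 days.
find /mnt/data/.snapshots -mindepth 1 -maxdepth 1 -mtime +8 \
    -exec btrfs subvolume delete {} \;
```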

The second layer is a weekly data backup using borg. Borg is great at deduplicating data and never re-uploads anything it has already stored. After the first full backup completes, every subsequent backup is much faster, as it is only differential. This solution scales well: it works for my VPSes with 10 GiB of disk space up to my data disks with tens of TBs. I have created a custom wrapper (+ example config) that works great with the daily snapshots. Each snapshot directory is suffixed with the current date, which would normally make borg rescan every file as new instead of using its file cache, so the wrapper uses bwrap to make the latest snapshot always appear at the same path. An added benefit of running borg on read-only snapshots rather than the live data is that I never get “file changed during backup” warnings.
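A rough sketch of the bwrap idea (not my actual wrapper, which does more): bind-mount the newest dated snapshot to a fixed path so borg sees the same source path every week and its file cache stays valid. The repository URL, paths, and archive naming are all illustrative.

```shell
# Find the newest dated snapshot directory.
LATEST=$(ls -d /mnt/data/.snapshots/* | sort | tail -n 1)
# Run borg in a sandbox where that snapshot always appears at /mnt/backup-src.
bwrap --dev-bind / / --ro-bind "$LATEST" /mnt/backup-src -- \
  borg create --stats "ssh://backup-vps/./repo::{now:%Y-%m-%d}" /mnt/backup-src
```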

Because borg can work over SSH, the borg repository is not stored on the backed-up device itself, but on one of my VPSes. This VPS runs my dropbear-backup Docker image. The main advantage is that I can easily pair SSH keys with different subfolders, making it a breeze to create a new account with restricted access (only for borg, sftp, or rsync).
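As an illustration of that pairing, here is what such restricted authorized_keys entries can look like in general (key material truncated, paths and key names made up; this is not copied from my image): each key is locked to one subfolder and one tool via a forced command.

```
command="borg serve --restrict-to-path /backups/laptop",no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-ed25519 AAAA... laptop-borg
command="rrsync -ro /backups/nas",no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-ed25519 AAAA... nas-rsync
```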

Still, we are talking about backing up tens of terabytes of data, which would cost a lot of money if it were all stored on the VPS disk. Instead, I use an rclone Docker plugin with an rclone-compatible storage provider mounted inside the dropbear-backup container. This setup makes my storage virtually unlimited, while the VPS itself has only a few GB of local disk. As a bonus, I can add a transparent layer of encryption on top of the cloud storage using rclone’s crypt remote.
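For reference, a crypt remote is a few lines of rclone configuration layered over an existing remote. The remote names, provider, and bucket below are invented for the example:

```
[cloud]
type = s3
provider = Other
# ...credentials...

[backup-crypt]
type = crypt
remote = cloud:my-backup-bucket
password = <obscured password>
```

Anything written to backup-crypt: is encrypted (file contents and, optionally, names) before it ever reaches the provider.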

The cloud storage can be considered warm; if needed, I can recover the files quickly. However, if a ransomware attack hit the VPS, it could very well encrypt the whole warm storage, as it is not backed up itself. To mitigate this, every week the entire warm cloud storage is synced to my local Raspberry Pi with multiple external HDDs attached. Together, they have enough capacity to hold the entire cloud storage plus 2 months of weekly snapshots. As the RPi is limited by residential network speeds, I consider it cold storage; recovering a lot of data from it would take several weeks at least.
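One plausible way to get those weekly snapshots with rclone (a sketch, not my actual job; the remote name and cold-storage layout are assumptions): rclone's --backup-dir moves files that were changed or deleted since the last run into a dated history directory instead of discarding them.

```shell
# Pull the warm cloud storage down to the RPi, keeping a dated history
# of anything that changed or disappeared since the previous weekly run.
rclone sync backup-crypt: /mnt/cold/current \
    --backup-dir "/mnt/cold/history/$(date +%F)"
```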

However, we are still not done. Having only 2 months of history may not be enough for all my use cases, so I also back up all the data from the RPi to freezing-cold Backblaze storage, which offers versioning and object locking, preventing a ransomware attack from destroying everything. Now we are done.

If this was a lot to take in, here is a diagram detailing the levels:

Diagram of my backup solution

Closing words

My personal setup may be overkill for many, but it might still be a useful pointer on where to start when establishing your own backup procedures. I opted for more storage capacity, accepting possible downtime after a hard drive failure, instead of some form of RAID. However, if you would like to sleep better at night, I would suggest, for example, RAID 1 with BTRFS. That way you keep all the other features, such as CoW and (almost) instant snapshots.
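If you go that route, BTRFS RAID 1 is a one-liner at creation time, and an existing single-disk filesystem can be converted in place after adding a second device (device names below are examples):

```shell
# Create a fresh two-disk BTRFS RAID 1 (mirrored data and metadata).
mkfs.btrfs -m raid1 -d raid1 /dev/sdb /dev/sdc
# Or convert an existing mounted filesystem after adding a second device:
btrfs device add /dev/sdc /mnt/data
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/data
```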

Honorable mentions

I also considered other services for a while, but decided they were not the right match for me. That does not mean they won’t be the best match for you, though.
