TrueNASGuide

TrueNAS Snapshot and Replication Strategy

Snapshots protect against accidents. Replication protects against fires. Here is a practical TrueNAS snapshot schedule and replication setup for a home

By TrueNASGuide Editorial · 8 min read

The first thing to internalize: a ZFS snapshot is not a backup. It lives on the same pool as the data it captures. If the pool is gone — pool corruption, fire, theft, an rm -rf followed by a zfs destroy -r tank — your snapshots are gone with it. Snapshots protect against accidents. Replication to a separate system protects against everything else.

Both layers matter. This guide covers how to build them on TrueNAS.

What a snapshot actually is

A ZFS snapshot is a read-only, point-in-time view of a dataset. It captures the dataset’s exact state at the moment it was taken. Because ZFS is copy-on-write, taking a snapshot is essentially free: ZFS just marks the current block pointers as “preserved” and starts writing new data to new blocks.

A snapshot grows in size only as the underlying dataset diverges from it. A snapshot of an unchanged dataset takes essentially zero additional space.
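To make the space behaviour concrete, here is a toy model of copy-on-write accounting. It is a sketch for intuition only and not how ZFS is implemented: the `ToyDataset` class, its methods, and the snapshot name are all hypothetical.

```python
# Toy model of copy-on-write space accounting. Illustrative only:
# ToyDataset and its methods are hypothetical, not a ZFS API.

class ToyDataset:
    def __init__(self):
        self.live = {}          # file offset -> block id referenced right now
        self.snapshots = {}     # snapshot name -> frozen copy of the block map
        self._next_block = 0

    def write(self, offset):
        """Every write lands in a fresh block; old blocks are never modified."""
        self.live[offset] = self._next_block
        self._next_block += 1

    def snapshot(self, name):
        """A snapshot only records which blocks are referenced at this moment."""
        self.snapshots[name] = dict(self.live)

    def preserved_blocks(self, name):
        """Blocks the snapshot still holds that the live dataset no longer uses."""
        live_blocks = set(self.live.values())
        return sum(1 for b in self.snapshots[name].values() if b not in live_blocks)


ds = ToyDataset()
for offset in range(4):
    ds.write(offset)
ds.snapshot("hourly-01")
unchanged = ds.preserved_blocks("hourly-01")   # nothing has diverged yet

ds.write(0)    # overwrite: the old block is now held only by the snapshot
ds.write(1)    # overwrite
ds.write(99)   # brand-new data: the snapshot does not grow
diverged = ds.preserved_blocks("hourly-01")

print(unchanged, diverged)
```

In this model the snapshot costs nothing until existing blocks are overwritten, and writing new data at a fresh offset leaves it untouched.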

Snapshots let you:

  • Recover an individual file that was deleted or modified.
  • Recover an entire dataset to a prior state (a “rollback”).
  • Replicate to another system using zfs send / zfs receive.

The snapshot schedule we recommend

There is no single correct schedule, but a defensible default for a home NAS:

Dataset         Snapshot interval   Retention
tank/users      Hourly              Keep 24 hourly, 14 daily, 8 weekly, 12 monthly
tank/photos     Daily               Keep 30 daily, 12 monthly
tank/media      Weekly              Keep 8 weekly
tank/apps       Daily               Keep 14 daily
tank/vm         Hourly              Keep 12 hourly, 7 daily
tank/scratch    None                n/a (designate as not-snapshotted)

The thinking:

  • High-churn user data gets frequent snapshots and longer retention so you can recover from “I deleted the wrong thing three weeks ago” without paying much in storage.
  • Append-mostly media does not need frequent snapshots — you rarely need to roll back a movie collection.
  • VMs are higher risk and benefit from hourly snapshots, but their snapshots can be larger because VM disk activity is constant.
  • Scratch data should explicitly opt out, because there is no point preserving point-in-time views of throwaway data.

Configure these in TrueNAS at Data Protection → Periodic Snapshot Tasks. Use recursive snapshots at a parent dataset level when the retention policies match across children.
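Before clicking through the UI, it can help to write the schedule down as data and count what you are about to create. The sketch below assumes each retention tier becomes its own periodic snapshot task with its own lifetime; the dictionary layout and function names are illustrative, not a TrueNAS format.

```python
# The retention table from this article as plain data.
# Assumption: one periodic snapshot task per (dataset, tier) pair.
SCHEDULE = {
    "tank/users":   {"hourly": 24, "daily": 14, "weekly": 8, "monthly": 12},
    "tank/photos":  {"daily": 30, "monthly": 12},
    "tank/media":   {"weekly": 8},
    "tank/apps":    {"daily": 14},
    "tank/vm":      {"hourly": 12, "daily": 7},
    "tank/scratch": {},   # explicitly not snapshotted
}


def steady_state_count(tiers):
    """Approximate number of snapshots kept once every tier has filled up."""
    return sum(tiers.values())


def task_list(schedule):
    """One (dataset, tier, keep) entry per retention tier."""
    return [
        (dataset, tier, keep)
        for dataset, tiers in schedule.items()
        for tier, keep in tiers.items()
    ]


counts = {dataset: steady_state_count(tiers) for dataset, tiers in SCHEDULE.items()}
for dataset, n in counts.items():
    print(f"{dataset:13s} {n:3d} snapshots at steady state")
print(len(task_list(SCHEDULE)), "snapshot tasks in total")
```

The counts are an upper estimate, since a daily snapshot and an hourly snapshot taken at the same moment may overlap in practice.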

The retention math

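Long retention is cheap as long as the dataset is not in heavy churn. Snapshots hold on to old blocks only when data is overwritten or deleted, so a dataset that rewrites or deletes about 1 GB per day adds roughly 1 GB per day of preserved blocks. With 12 months of daily snapshots on that dataset, the snapshots can be holding up to about 365 GB of preserved blocks. Data that is only appended and never changed adds almost nothing.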
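As a rough worked example of that arithmetic, the sketch below multiplies daily churn by the longest retention window to get an upper bound on the space snapshots can pin. The function name and the 20 GB/day VM figure are assumptions for illustration; measure your own churn before trusting any number.

```python
def pinned_upper_bound_gb(churn_gb_per_day: float, longest_retention_days: int) -> float:
    """Upper bound on space held by snapshots.

    churn_gb_per_day is the amount of existing data overwritten or deleted
    per day. Blocks rewritten more than once within the window are counted
    each time, so the real figure is usually lower.
    """
    return churn_gb_per_day * longest_retention_days


# The article's example: 1 GB/day of churn, 12 months of retention.
users_bound = pinned_upper_bound_gb(1.0, 365)

# Hypothetical VM dataset: 20 GB/day of churn, 7 days of retention.
vm_bound = pinned_upper_bound_gb(20.0, 7)

print(f"users: up to {users_bound:.0f} GB, vm: up to {vm_bound:.0f} GB")
```

The comparison shows why a busy VM dataset with one week of retention can pin a comparable amount of space to a quiet dataset with a full year.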

For VMs and database datasets, snapshots grow faster — every write to a zvol or VM image creates new blocks. If you find your snapshots are eating significantly more space than you expect, look at the zfs list -t snapshot -o name,used,refer output to spot the offenders, and consider whether your VM datasets need their own shorter retention.
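If the list is long, a small script can rank datasets by snapshot usage. This sketch assumes the output was captured with `zfs list -H -p -t snapshot -o name,used,refer`, where `-H` gives tab-separated output without headers and `-p` gives exact byte counts. The snapshot names and sizes in `SAMPLE` are made up.

```python
# Hypothetical captured output of:
#   zfs list -H -p -t snapshot -o name,used,refer
SAMPLE = "\n".join([
    "tank/vm@auto-hourly-01\t5000000000\t200000000000",
    "tank/vm@auto-hourly-02\t7000000000\t201000000000",
    "tank/users@auto-daily-01\t300000000\t50000000000",
    "tank/photos@auto-daily-01\t0\t900000000000",
])


def parse_snapshots(text):
    """Turn tab-separated zfs list output into a list of dicts."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        name, used, refer = line.split("\t")
        dataset, _, snapshot = name.partition("@")
        rows.append({
            "dataset": dataset,
            "snapshot": snapshot,
            "used": int(used),
            "refer": int(refer),
        })
    return rows


def used_by_dataset(rows):
    """Sum per-snapshot 'used' for each dataset, largest first."""
    totals = {}
    for row in rows:
        totals[row["dataset"]] = totals.get(row["dataset"], 0) + row["used"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = used_by_dataset(parse_snapshots(SAMPLE))
for dataset, used in ranking:
    print(f"{dataset:12s} {used / 1e9:8.1f} GB unique to individual snapshots")
```

Per-snapshot `used` counts only blocks unique to that one snapshot, so the sums understate the total; the dataset's `usedbysnapshots` property reports the full amount held by all of its snapshots together.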

What snapshots cannot do alone

Snapshots cannot save you from:

  • The pool itself being lost (drive failures beyond your redundancy, ZFS metadata corruption, controller wipe).
  • Physical loss — theft, fire, flood.
  • A rogue process that destroys the dataset with zfs destroy -r — the snapshots go with it.
  • Ransomware that goes unnoticed for longer than your retention window, or that encrypts data fast enough that the snapshots you reach for already contain encrypted data. (Mitigated somewhat by frequent snapshots and not exposing destructive privileges over SMB.)

For all of these you need replication to a separate target. Snapshots and replication are also a separate layer from disk health: scrubs and SMART tests catch a degrading drive before it fails, but they cannot recover data the pool no longer has a good copy of. You want both layers.

Replication: the second layer

TrueNAS supports two main replication patterns:

  1. TrueNAS-to-TrueNAS over SSH using zfs send / zfs receive. Native, efficient, incremental. This is the standard for a primary NAS replicating to a backup NAS.
  2. Cloud sync tasks to S3, B2, etc. Slower, and incremental per file rather than per block, but useful as a tertiary off-site backup.

For a home setup with two NAS systems (often a primary in a “main” room and a secondary in a basement or at a relative’s house), TrueNAS-to-TrueNAS over SSH is the right pattern.

Setting up TrueNAS-to-TrueNAS replication

Conceptually:

  1. On the destination NAS: create a user account for replication (or use root if the source is fully trusted) and add the source NAS’s SSH key.
  2. On the source NAS: go to Data Protection → Replication Tasks → Add.
  3. Choose Local source → Remote destination, point at the destination NAS over SSH, and select the datasets to replicate.
  4. Configure the schedule — typically once daily, after the day’s snapshots have completed.
  5. For encrypted source datasets, decide whether to replicate raw (preserves encryption end-to-end; the destination cannot read the data without the keys) or to receive and re-encrypt at the destination.

The first replication run will be a full transfer. After that, replications are incremental and proportional to the data that changed since the last successful run.
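To show the difference between the full and incremental runs, the sketch below builds the kind of `zfs send` / `zfs receive` pipeline that this replication pattern rests on. It only constructs the command strings and runs nothing. It is not what the TrueNAS replication task literally executes, and the SSH target, dataset names, and snapshot names are placeholders.

```python
import shlex


def replication_pipeline(dataset, new_snap, prev_snap=None, raw=False,
                         ssh_target="replica@backup-nas",
                         dest_dataset="backup/users"):
    """Build (but do not run) a zfs send | ssh zfs receive command line.

    ssh_target and dest_dataset are placeholder values for illustration.
    """
    send = ["zfs", "send"]
    if raw:
        send.append("-w")                             # raw send: blocks stay encrypted
    if prev_snap is not None:
        send += ["-i", f"{dataset}@{prev_snap}"]      # incremental from prev_snap
    send.append(f"{dataset}@{new_snap}")
    receive = ["ssh", ssh_target, "zfs", "receive", dest_dataset]
    return f"{shlex.join(send)} | {shlex.join(receive)}"


full = replication_pipeline("tank/users", "day1")
incremental = replication_pipeline("tank/users", "day2", prev_snap="day1")
raw_incremental = replication_pipeline("tank/users", "day2", prev_snap="day1", raw=True)

print(full)
print(incremental)
print(raw_incremental)
```

The incremental form only works if the destination still has the earlier snapshot, which is one reason to keep snapshot retention on both sides comfortably longer than the replication interval.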

Bandwidth

A first full replication of a 10 TB pool over a 100 Mbit/s residential upload link takes around 10 days of continuous transfer. Plan accordingly — many home users seed the first replication by physically moving the destination NAS to the same location as the source, completing the initial sync over LAN, and then placing the destination at its remote site.
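The estimate is easy to reproduce for your own pool size and link speed. This sketch assumes decimal units (1 TB = 10^12 bytes) and a hypothetical `efficiency` factor for protocol overhead and contention; the 0.9 value is a guess, not a measurement.

```python
def transfer_days(size_tb: float, link_mbit_s: float, efficiency: float = 1.0) -> float:
    """Days of continuous transfer for size_tb terabytes over a link.

    efficiency is the fraction of the nominal link rate actually achieved.
    """
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_mbit_s * 1e6 * efficiency)
    return seconds / 86400


ideal = transfer_days(10, 100)                        # full line rate
realistic = transfer_days(10, 100, efficiency=0.9)    # assumed 10% overhead
over_lan = transfer_days(10, 1000)                    # seeding over gigabit LAN

print(f"100 Mbit/s, ideal:      {ideal:.1f} days")
print(f"100 Mbit/s, 90% usable: {realistic:.1f} days")
print(f"1 Gbit/s LAN:           {over_lan:.1f} days")
```

At full line rate the 10 TB transfer comes out a little over 9 days, which is why seeding over the LAN is usually worth the trip.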

Off-site target options

  • A second NAS at home, on a different circuit. Cheapest. Protects against pool loss and most accidents. Does not protect against physical loss of the house.
  • A second NAS at a family member’s or friend’s house. The best of the common options. Run replication over a WireGuard or Tailscale tunnel for security and to avoid exposing SSH directly to the internet.
  • A rented off-site target (a cheap cloud VM with attached storage, or a hosted ZFS target). Costs ongoing money but gives you geographic separation without coordinating with another household.
  • Encrypted off-site backup to S3/B2 via TrueNAS Cloud Sync tasks. Tertiary. Use for the truly irreplaceable subset (family photos, document archives), not the full pool.

Test your restore

A backup you have never restored from is not a backup; it is a hope. Twice a year, pick a random file from a snapshot, restore it, and verify it opens. Once a year, do a full dataset restore from the secondary NAS into a test dataset on the primary and verify a sampling of files. Replication can fail silently in ways that show up only at restore time.
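"Verify it opens" can be backed up with a checksum comparison, which catches damage that a quick look would miss. The sketch below compares SHA-256 digests of an original and a restored copy; the file names are hypothetical and the demo writes throwaway files to a temporary directory so it can run anywhere.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original, restored):
    """True when the restored copy is byte-identical to the original."""
    return sha256_of(original) == sha256_of(restored)


# Self-contained demo with throwaway files standing in for real data.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "original.jpg"
    restored = Path(tmp) / "restored.jpg"
    damaged = Path(tmp) / "damaged.jpg"
    original.write_bytes(b"family photo bytes" * 1000)
    restored.write_bytes(original.read_bytes())
    damaged.write_bytes(b"family photo bytes" * 999)

    good = restore_matches(original, restored)
    bad = restore_matches(original, damaged)

print(good, bad)
```

Comparing against the live file only makes sense for files that have not changed since the snapshot; for anything else, keeping a manifest of known-good checksums is a more reliable reference.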

What to put in tank/scratch

It is genuinely useful to have one dataset that is explicitly not snapshotted, not replicated, and not protected. This is where you put:

  • Build artifacts and temporary downloads.
  • VM working files you would not miss.
  • Anything where preservation has negative value (e.g., disk images you want garbage-collected quickly).

Without this, snapshot retention will gradually accumulate stuff you don’t actually want to keep, in places you didn’t notice.

Concrete starting recommendation

For a typical home TrueNAS with a primary pool of family photos, documents, media, and a couple of VMs:

  1. Set up the snapshot schedule above using Periodic Snapshot Tasks.
  2. Acquire a secondary TrueNAS (refurbished, smaller pool; it does not need to mirror the primary’s topology, only to have room for its used data).
  3. Set up nightly replication of tank/users, tank/photos, tank/apps, and tank/vm to the secondary.
  4. Set up a tertiary monthly Cloud Sync task of tank/photos and tank/users only to a B2 or S3 bucket, with client-side encryption.
  5. Test a file restore from the secondary monthly. Test a full-dataset restore yearly.

This is more work than “just turn on snapshots.” It is also the difference between losing six hours of work to a typo and losing six years of irreplaceable data to a hardware failure or a burglary.
