ZFS Replication on Proxmox: Sync Data Between Nodes
Learn how to set up ZFS replication on Proxmox to sync datasets between nodes for disaster recovery, offsite backup, and HA storage redundancy.
If you're running Proxmox VE with ZFS and you're not replicating your datasets to at least one other node, you're one drive failure away from a very bad day. ZFS replication using zfs send and zfs receive is one of the most powerful and underutilized tools in the homelab admin's toolkit — and Proxmox makes it surprisingly approachable once you understand the mechanics.
This guide walks you through setting up incremental ZFS replication between two Proxmox nodes, automating it with a cron job, and verifying your replicas are actually usable when you need them.
Why ZFS Replication Over Other Backup Methods
Proxmox Backup Server (PBS) is excellent for VM and container backups, but ZFS replication serves a different purpose. With zfs send/receive, you're transferring an exact, block-level replica of your dataset as of a snapshot, and with the right flags you can preserve dataset properties (-p) and ship already-compressed blocks as-is (-c).
This approach has some distinct advantages:
- Incremental transfers — after the initial full send, only changed blocks are transferred
- Dataset-level granularity — replicate individual datasets, not entire pools
- Consistent snapshots — ZFS snapshots are crash-consistent by default, and combined with guest quiescing (the QEMU guest agent) you can get application-consistent replicas
- Fast recovery — the replica is immediately mountable on the destination node
It's not a replacement for PBS, but it complements it well. Use PBS for point-in-time VM recovery, and ZFS replication for fast node failover.
Prerequisites
Before you start, make sure you have:
- Two Proxmox nodes, each with a ZFS pool
- SSH access between nodes (ideally key-based)
- Root access, or a privileged user granted ZFS delegation rights (see the zfs allow sketch after this list)
- Network connectivity between nodes (a dedicated storage VLAN is ideal)
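If you'd rather not run replication as root, ZFS permission delegation can grant a dedicated user just the operations it needs. A minimal sketch, assuming a user named replication (the username is our choice for illustration, not something Proxmox creates):

# On pve1 (source): allow snapshotting, holding, and sending the data tree
zfs allow replication send,snapshot,hold rpool/data

# On pve2 (destination): allow receiving and managing the replica tree
zfs allow replication receive,create,mount,destroy rpool/replicas

On Linux, mounting received filesystems as a non-root user has extra kernel restrictions, so many setups still run the receive side as root; treat this as optional hardening.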
For this guide, we'll call the source node pve1 (IP: 192.168.10.11) and the destination node pve2 (IP: 192.168.10.12). Our source dataset is rpool/data/vm-100-disk-0.
Setting Up SSH Key Authentication
ZFS replication over SSH requires passwordless login between nodes. On pve1, generate an SSH key pair if you don't already have one:
ssh-keygen -t ed25519 -C "pve1-zfs-replication" -f /root/.ssh/zfs_replication
Copy the public key to pve2:
ssh-copy-id -i /root/.ssh/zfs_replication.pub root@192.168.10.12
Test the connection:
ssh -i /root/.ssh/zfs_replication root@192.168.10.12 "hostname"
You should see pve2 printed back. If you get prompted for a password, the key wasn't accepted — double-check the authorized_keys file on the destination.
Restricting SSH Access for Replication
For better security, restrict what the replication key can do on pve2. Edit /root/.ssh/authorized_keys on pve2 and prefix the key entry:
command="zfs receive -F rpool/replicas",no-port-forwarding,no-X11-forwarding,no-pty ssh-ed25519 AAAA...
This limits what the key can do, but note that a forced command runs in place of whatever the client requests: a key locked to the literal zfs receive -F rpool/replicas above can only ever receive into that one dataset, which breaks the per-dataset receives and remote snapshot pruning used by the script later in this guide. A common workaround is to point command= at a small wrapper that validates the requested command instead.
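Here is a minimal sketch of such a wrapper, saved on pve2 as /usr/local/bin/zfs-receive-only.sh (the filename and the allowed patterns are our own choices for illustration):

#!/bin/sh
# /usr/local/bin/zfs-receive-only.sh
# sshd exposes the command the client asked for in SSH_ORIGINAL_COMMAND;
# allow only ZFS operations under the replica tree and reject everything else.
case "$SSH_ORIGINAL_COMMAND" in
    "zfs receive -F rpool/replicas/"*|"zfs list "*)
        exec /bin/sh -c "$SSH_ORIGINAL_COMMAND"
        ;;
    *)
        echo "Rejected command: $SSH_ORIGINAL_COMMAND" >&2
        exit 1
        ;;
esac

Point the command= prefix in authorized_keys at this script and make it executable; every connection with that key is then vetted through it before anything runs.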
Understanding ZFS Snapshots and Send/Receive
Before automating anything, it helps to understand the underlying primitives.
Creating a Snapshot
zfs snapshot rpool/data/vm-100-disk-0@$(date +%Y%m%d-%H%M%S)
This creates a point-in-time snapshot. List your snapshots:
zfs list -t snapshot rpool/data/vm-100-disk-0
Initial Full Send
For the very first replication, you need to send the entire dataset. On pve1:
zfs snapshot rpool/data/vm-100-disk-0@initial
zfs send -v rpool/data/vm-100-disk-0@initial | \
    ssh -i /root/.ssh/zfs_replication root@192.168.10.12 \
    "zfs receive -F rpool/replicas/vm-100-disk-0"
The -v flag shows transfer progress. For large datasets over a LAN, expect roughly 500MB/s–1GB/s depending on your hardware and compression settings.
Incremental Sends
After the initial send, subsequent syncs only transfer changed blocks:
zfs snapshot rpool/data/vm-100-disk-0@$(date +%Y%m%d-%H%M%S)
zfs send -v -i \
    rpool/data/vm-100-disk-0@initial \
    rpool/data/vm-100-disk-0@20260321-060000 | \
    ssh -i /root/.ssh/zfs_replication root@192.168.10.12 \
    "zfs receive -F rpool/replicas/vm-100-disk-0"
The -i flag specifies the previous snapshot as the incremental base. ZFS calculates the diff and sends only what changed.
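If you want to see how much data an incremental send will move before you run it, zfs send has a dry-run mode (using the snapshot names from the example above):

# Estimate the incremental stream size without actually sending anything
zfs send -n -v -i rpool/data/vm-100-disk-0@initial \
    rpool/data/vm-100-disk-0@20260321-060000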
Automating Replication with a Script
Manually running these commands works, but you'll want this automated. Here's a production-ready replication script:
#!/bin/bash
# /usr/local/bin/zfs-replicate.sh
# ZFS incremental replication script for Proxmox
set -euo pipefail

# Configuration (the dataset variables use ${VAR:-default} so the
# multi-dataset wrapper further below can override them per run)
SOURCE_DATASET="${SOURCE_DATASET:-rpool/data/vm-100-disk-0}"
DEST_HOST="${DEST_HOST:-192.168.10.12}"
DEST_DATASET="${DEST_DATASET:-rpool/replicas/vm-100-disk-0}"
SSH_KEY="/root/.ssh/zfs_replication"
SNAP_PREFIX="auto"
KEEP_SNAPS=7
LOG="/var/log/zfs-replicate.log"

log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG"; }

SSH="ssh -i $SSH_KEY -o BatchMode=yes -o ConnectTimeout=10"

# Create new snapshot
NEW_SNAP="${SOURCE_DATASET}@${SNAP_PREFIX}-$(date +%Y%m%d-%H%M%S)"
log "Creating snapshot: $NEW_SNAP"
zfs snapshot "$NEW_SNAP"

# Find the most recent previous snapshot (second newest, since we just created one)
PREV_SNAP=$(zfs list -H -t snapshot -o name "$SOURCE_DATASET" |
    grep "@${SNAP_PREFIX}-" | tail -2 | head -1)

if [ -z "$PREV_SNAP" ] || [ "$PREV_SNAP" = "$NEW_SNAP" ]; then
    log "No previous snapshot found — performing full send"
    zfs send -v "$NEW_SNAP" |
        $SSH root@$DEST_HOST "zfs receive -F $DEST_DATASET"
else
    log "Incremental send from $PREV_SNAP to $NEW_SNAP"
    zfs send -v -i "$PREV_SNAP" "$NEW_SNAP" |
        $SSH root@$DEST_HOST "zfs receive -F $DEST_DATASET"
fi

# Prune old snapshots on the source (keep last N)
log "Pruning snapshots, keeping the last $KEEP_SNAPS"
zfs list -H -t snapshot -o name "$SOURCE_DATASET" |
    grep "@${SNAP_PREFIX}-" |
    head -n "-${KEEP_SNAPS}" |
    xargs -r -I{} zfs destroy {}

# Prune destination snapshots too
$SSH root@$DEST_HOST "zfs list -H -t snapshot -o name $DEST_DATASET |
    grep '@${SNAP_PREFIX}-' |
    head -n -${KEEP_SNAPS} |
    xargs -r -I{} zfs destroy {}"

log "Replication complete"
Make it executable:
chmod +x /usr/local/bin/zfs-replicate.sh
Test it manually first:
/usr/local/bin/zfs-replicate.sh
Check /var/log/zfs-replicate.log for output.
Scheduling with Cron
Once the script is working, add it to cron. Edit the root crontab on pve1:
crontab -e
Add a schedule — hourly replication is reasonable for most homelab workloads:
# ZFS replication - every hour
0 * * * * /usr/local/bin/zfs-replicate.sh >> /var/log/zfs-replicate.log 2>&1
For production workloads where you want tighter RPO (recovery point objective), run every 15 minutes:
*/15 * * * * /usr/local/bin/zfs-replicate.sh >> /var/log/zfs-replicate.log 2>&1
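If a transfer ever takes longer than the interval (a large burst of changed data, a slow link), two runs can overlap and fight over snapshots. Wrapping the job in flock is a simple guard; this is an addition to the guide's cron line, not something the script requires:

*/15 * * * * flock -n /run/zfs-replicate.lock /usr/local/bin/zfs-replicate.sh >> /var/log/zfs-replicate.log 2>&1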
Replicating Multiple Datasets
If you have multiple VMs to replicate, create a wrapper script that overrides the dataset variables and calls the replication script once per dataset (this relies on the ${VAR:-default} environment overrides in the script above):
#!/bin/bash
# /usr/local/bin/zfs-replicate-all.sh
DATASETS=(
    "rpool/data/vm-100-disk-0"
    "rpool/data/vm-101-disk-0"
    "rpool/data/subvol-200-disk-0"
)

for dataset in "${DATASETS[@]}"; do
    SOURCE_DATASET="$dataset" \
    DEST_DATASET="rpool/replicas/${dataset##*/}" \
        /usr/local/bin/zfs-replicate.sh
done
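If you go this route, make the wrapper executable and point the cron entry at it instead of the single-dataset script:

chmod +x /usr/local/bin/zfs-replicate-all.sh

0 * * * * /usr/local/bin/zfs-replicate-all.sh >> /var/log/zfs-replicate.log 2>&1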
Using Proxmox's Built-In ZFS Replication
Proxmox VE has a built-in ZFS replication feature in the web UI, which simplifies this significantly for VMs and containers managed by Proxmox.
Enabling Replication via the GUI
- In the Proxmox web interface, select your VM or CT
- Click Replication in the left sidebar
- Click Add
- Select the Target node from the dropdown
- Set the Schedule (e.g., */15 for every 15 minutes)
- Click Create
Proxmox handles the snapshot management and SSH transport automatically. The built-in replication is tied to the Proxmox cluster, so both nodes need to be in the same cluster.
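The same job can also be created from the CLI with pvesr. A sketch for replicating VM 100 to pve2 every 15 minutes, where the job ID 100-0 follows Proxmox's vmid-number convention:

pvesr create-local-job 100-0 pve2 --schedule "*/15"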
Checking Replication Status
From the CLI on pve1:
pvesr status
This shows all configured replication jobs, their last run time, and whether they succeeded. You can also trigger a job to run immediately:
pvesr schedule-now <job-id>
And watch the logs (on current Proxmox VE releases the replication runner is part of the pvescheduler service; older releases used a dedicated pvesr timer):
journalctl -u pvescheduler -f
Verifying Your Replicas
A backup you haven't tested is not a backup — it's a hope. Periodically verify your replicas are actually usable.
List Replicated Snapshots on Destination
On pve2:
zfs list -t snapshot rpool/replicas/vm-100-disk-0
You should see snapshots corresponding to your replication schedule.
Mount and Inspect a Replica
# On pve2, clone the most recent snapshot so you can inspect it
zfs clone rpool/replicas/vm-100-disk-0@auto-20260321-060000 \
    rpool/test/vm-100-verify

# Filesystem datasets (container subvols) can be mounted; VM-disk zvol clones appear under /dev/zvol/ instead
zfs mount rpool/test/vm-100-verify
ls /rpool/test/vm-100-verify

# Clean up when done
zfs unmount rpool/test/vm-100-verify
zfs destroy rpool/test/vm-100-verify
Test Failover
For VMs, practice actually starting the replica on pve2. In the Proxmox web UI on pve2, you'll need to create a VM config that points to the replicated dataset, then boot it. This is the only way to confirm your replica is actually bootable.
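A sketch of what that can look like from the CLI on pve2, assuming you've added rpool/replicas to Proxmox as a ZFS storage named replicas and are using a throwaway VMID; all of the IDs and names here are illustrative:

# Clone the latest replica snapshot to a writable volume named for the test VM
zfs clone rpool/replicas/vm-100-disk-0@auto-20260321-060000 rpool/replicas/vm-9100-disk-0

# Create a throwaway VM, attach the cloned disk, and boot it
qm create 9100 --name vm100-failover-test --memory 4096 --net0 virtio,bridge=vmbr0
qm set 9100 --scsi0 replicas:vm-9100-disk-0 --boot order=scsi0
qm start 9100

When you're done, qm stop 9100 followed by qm destroy 9100 cleans up the test VM along with the cloned disk it owns.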
Performance Tuning for Replication
Large datasets over slow links require some tuning.
Compression in Transit
If your nodes communicate over a slower link (1GbE or WAN), enable SSH compression:
zfs send -v -i $PREV_SNAP $NEW_SNAP | \
ssh -C -i /root/.ssh/zfs_replication root@$DEST_HOST \
"zfs receive -F $DEST_DATASET"
For WAN replication, mbuffer helps smooth out I/O bursts:
zfs send -v -i $PREV_SNAP $NEW_SNAP | \
mbuffer -s 128k -m 1G | \
ssh -i /root/.ssh/zfs_replication root@$DEST_HOST \
"mbuffer -s 128k -m 1G | zfs receive -F $DEST_DATASET"
Install mbuffer on both nodes: apt install mbuffer
Bandwidth Limiting
To avoid saturating your network during business hours, use pv to cap throughput:
zfs send -i $PREV_SNAP $NEW_SNAP | \
pv -L 50m | \
ssh -i /root/.ssh/zfs_replication root@$DEST_HOST \
"zfs receive -F $DEST_DATASET"
The -L 50m flag limits to 50MB/s. Install with apt install pv.
Monitoring and Alerting
Replication silently failing is worse than not having replication — you think you're protected but aren't.
Simple Age-Based Alert
Add this check to your monitoring or run it via cron:
#!/bin/bash
# Alert if the most recent replica snapshot is older than 2 hours
DATASET="rpool/replicas/vm-100-disk-0"
MAX_AGE_SECONDS=7200

# Newest snapshot on the replica, sorted by creation time (newest first)
LATEST=$(zfs list -H -t snapshot -o name -S creation "$DATASET" | head -1)

# Creation time as a Unix timestamp (-p makes zfs print machine-parseable values)
LATEST_TS=$(zfs get -Hp -o value creation "$LATEST")

AGE=$(( $(date +%s) - LATEST_TS ))

if [ "$AGE" -gt "$MAX_AGE_SECONDS" ]; then
    echo "WARNING: ZFS replica $DATASET is ${AGE}s old (max ${MAX_AGE_SECONDS}s)"
    exit 1
fi
For more sophisticated monitoring, you can feed replica snapshot age into Prometheus (for example via the node exporter's textfile collector or a Proxmox exporter) and alert on it from Grafana if you've already set up a Proxmox monitoring stack.
Common Issues and Fixes
"cannot receive incremental stream: most recent snapshot does not match"
This happens when the source and destination snapshots have diverged. The safest fix is to destroy the destination dataset and start a fresh full send:
# On pve2
zfs destroy -r rpool/replicas/vm-100-disk-0
Then re-run the initial full send (a plain zfs send without -i, as shown earlier) before resuming incremental replication.
Replication is very slow
Check whether your ZFS dataset has compression=off. Enabling LZ4 compression on the source is nearly free and speeds up reads, but note that zfs send decompresses data by default; pair it with zfs send -c (compressed send) so already-compressed blocks go over the wire as-is:
zfs set compression=lz4 rpool/data
New data written after this will be compressed. Existing data won't be retroactively compressed.
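A sketch of the incremental send with compressed blocks kept intact, using the same variables as the replication script above:

zfs send -c -v -i "$PREV_SNAP" "$NEW_SNAP" | \
    ssh -i /root/.ssh/zfs_replication root@192.168.10.12 \
    "zfs receive -F rpool/replicas/vm-100-disk-0"

The destination writes the blocks exactly as they were stored on the source, so that data keeps the source's compression.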
SSH connection refused
Verify the destination node's firewall allows SSH from the source IP. In Proxmox's built-in firewall, check that port 22 is open for your storage VLAN subnet.
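If you use the Proxmox firewall, a host-level rule on pve2 along these lines allows it. This is a sketch using the storage subnet from this guide's examples; adjust the source to match your network:

# /etc/pve/nodes/pve2/host.fw
[RULES]
IN SSH(ACCEPT) -source 192.168.10.0/24 -log nolog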
Conclusion
ZFS replication transforms your Proxmox setup from "I have backups" to "I have instant failover." The initial setup takes an hour or two, but once automated, it runs invisibly in the background — creating a continuously updated copy of your critical datasets on another node.
The key points to take away: start with a full send, automate incremental syncs with cron or Proxmox's built-in replication, keep 5–7 snapshots on both ends, and — most importantly — actually test your replicas by mounting them and booting from them. An untested replica is just disk space with extra hope.
Pair this with Proxmox Backup Server for VM-level point-in-time recovery, and you've got a genuinely robust data protection strategy that rivals enterprise solutions at homelab cost.