Configure Parallel Sync Jobs for S3 Offsite Backups

I've spent years running Proxmox Backup Server clusters across homelabs and small datacenters, and one thing always catches people off guard: their backup windows ballooning when they add S3 offsite storage without understanding how parallel sync jobs actually work under the hood. The truth is that PBS 4.2's new parallel sync architecture can cut your offsite backup time by a factor of three or more — but only if you configure it correctly and understand what each parameter controls.

Key Takeaways

Parallelism: Configure MAX_WORKERS in /etc/pbs.conf to control how many streams run simultaneously during S3 transfers.
Bandwidth management: Set BANDWIDTH_LIMIT_MBS (or the per-job bandwidth-limit-mbs field) so backups don't eat your entire upload pipe.
Deduplication advantage: PBS deduplicates across VMs, so offsite storage costs scale sub-linearly even with many workloads.
Encryption tradeoff: Client-side encryption protects data at rest in S3 but adds CPU overhead during sync.
Monitoring matters: Watch the pbs-backup and pbs-sync systemd services — they tell you what's actually happening under the hood.

What Changed with PBS 4.2?

Before diving into configuration, it helps to understand why parallel sync is worth configuring at all. Traditional PBS backup workflows were essentially linear: each VM or container would be backed up sequentially, and then those backups would be replicated to an offsite store in another pass. With thousands of gigabytes across dozens of workloads, this meant your backup window could stretch from hours into half a day.

PBS 4.2 changed the model by introducing parallel sync jobs that stream data directly from local storage nodes to S3 simultaneously. The key insight is that most people's upload bandwidth — even on typical fiber connections in the 50-100 Mbps range — has enough headroom for multiple concurrent streams if you size them correctly.
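As a rough sanity check on that headroom claim, the sketch below estimates how long a given amount of data takes to upload at a given link speed. The helper name upload_hours and the sample figures are illustrative assumptions, not PBS settings. It uses decimal units (1 GB = 8000 Mbit) and ignores protocol overhead and deduplication.

```shell
#!/bin/sh
# upload_hours SIZE_GB LINK_MBPS
# Prints the hours needed to move SIZE_GB gigabytes over a LINK_MBPS link.
# Hypothetical helper for back-of-the-envelope planning only.
upload_hours() {
  awk -v gb="$1" -v mbps="$2" 'BEGIN { printf "%.1f\n", gb * 8000 / mbps / 3600 }'
}

# Example: 1000 GB of new data over a 100 Mbit/s uplink
upload_hours 1000 100   # prints 22.2
```

Parallel streams do not raise the ceiling this calculation gives. What they can do is keep the link closer to full when a single stream would otherwise stall on request latency.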

If you're just getting started with PBS, Automated Backups with Proxmox Backup Server walks through the initial setup in detail. Once that's working, configuring S3 offsite storage is where things get interesting.

Setting Up S3 Storage in PBS

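The first step is creating an S3 bucket and gathering your credentials. I recommend using a dedicated IAM user rather than root access keys, since it makes credential rotation cleaner. Note that the managed AmazonS3FullAccess policy attached here grants access to every bucket in the account, so treat it as a starting point and scope it down to the backup bucket once the sync works: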

aws iam create-user --user-name pbs-backup-user
aws iam attach-user-policy \
  --user-name pbs-backup-user \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
aws iam create-access-key --user-name pbs-backup-user
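If you want the permissions tighter from the start, an inline policy limited to one bucket is one option. The sketch below prints such a policy; the helper name bucket_policy, the policy name pbs-bucket-only, and the exact list of actions are assumptions, so check them against what your sync actually needs before relying on it.

```shell
#!/bin/sh
# bucket_policy BUCKET
# Prints an IAM policy document limited to a single bucket.
# The action list is an assumed minimum, not a list confirmed by PBS.
bucket_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::$1"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::$1/*"
    }
  ]
}
EOF
}

# Usage (bucket and policy names are placeholders):
#   bucket_policy my-bucket-name > pbs-policy.json
#   aws iam put-user-policy --user-name pbs-backup-user \
#     --policy-name pbs-bucket-only \
#     --policy-document file://pbs-policy.json
```

Bucket-level actions go on the bucket ARN and object-level actions on the ARN with the /* suffix, which is why the policy needs two statements.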

On the PBS side, you'll add this as a new storage backend. The configuration lives in /etc/pbs.conf:

STORAGE=s3://my-bucket-name@pbs-s3:AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE,AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY,ENDPOINT=https://s3.amazonaws.com,TLS_VERIFY=true

For AWS S3 specifically, you can omit the ENDPOINT parameter entirely. If you're using a compatible service like MinIO or Cloudflare R2 (and I have both in production), include the endpoint URL. One thing to watch out for: keep TLS_VERIFY=true, and if you're using virtual-hosted-style URLs, confirm the backend's certificate covers the bucket hostname. Some S3-compatible backends serve a certificate that matches only the path-style endpoint, so virtual-hosted requests fail verification until you add a wildcard certificate or switch to path-style addressing.
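To make the two addressing styles concrete, here is a small sketch that prints the URL a client would request in each case. The helper s3_url, the host minio.example.com, and the object key are made up for illustration.

```shell
#!/bin/sh
# s3_url STYLE ENDPOINT_HOST BUCKET KEY
# STYLE is "path" or "virtual". Prints the resulting request URL.
s3_url() {
  case "$1" in
    path)    printf 'https://%s/%s/%s\n' "$2" "$3" "$4" ;;
    virtual) printf 'https://%s.%s/%s\n' "$3" "$2" "$4" ;;
    *)       echo "unknown style: $1" >&2; return 1 ;;
  esac
}

s3_url path    minio.example.com my-bucket-name index.json
# prints https://minio.example.com/my-bucket-name/index.json
s3_url virtual minio.example.com my-bucket-name index.json
# prints https://my-bucket-name.minio.example.com/index.json
```

In the virtual-hosted case the TLS certificate has to be valid for my-bucket-name.minio.example.com, which in practice means a wildcard entry for the endpoint's domain.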

Configuring Parallel Sync Workers

This is where the magic happens. Open /etc/pbs.conf on your PBS server (or each node if you're running a multi-node cluster) and look for these settings:

MAX_WORKERS=4
BANDWIDTH_LIMIT_MBS=50
SYNC_INTERVAL=3600

The MAX_WORKERS parameter controls how many parallel streams run during sync. Start with 4, which is the sweet spot for most homelab and small datacenter setups. You can increase this if your upload bandwidth supports it, but there are diminishing returns past about 8 workers: once the upload link is saturated, extra workers add no throughput while each one still consumes file descriptors and memory.

The BANDWIDTH_LIMIT_MBS setting is in megabytes per second (not megabits) across all workers combined, so the value of 50 shown above allows about 400 Mbit/s. If you're backing up to AWS S3 from a datacenter with dedicated fiber, you can push this higher. For homelab setups on residential broadband, set it below your uplink capacity (a 100 Mbit/s uplink tops out at 12.5 MB/s) so your backups don't interfere with other traffic, especially if anyone's streaming video or gaming while the sync runs.
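The unit conversion is easy to get wrong, so here is a minimal sketch of the arithmetic. The helper names are hypothetical, and the even split across workers is an assumption for illustration; the actual distribution between streams may differ.

```shell
#!/bin/sh
# mbs_to_mbps MB_PER_S
# Converts megabytes per second to megabits per second (1 byte = 8 bits).
mbs_to_mbps() {
  awk -v mbs="$1" 'BEGIN { printf "%g\n", mbs * 8 }'
}

# per_worker_mbs TOTAL_MB_PER_S WORKERS
# Even share of the combined limit per worker (assumed even split).
per_worker_mbs() {
  awk -v mbs="$1" -v n="$2" 'BEGIN { printf "%g\n", mbs / n }'
}

mbs_to_mbps 50        # prints 400
per_worker_mbs 50 4   # prints 12.5
```

Compare the first figure against your real uplink speed: if the link is slower than the converted limit, the limit never takes effect.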

Setting Up Sync Jobs via API

You can configure sync jobs through the PBS web UI, but I find the API approach more repeatable and easier to version-control alongside the configuration from Automate Proxmox VE with Ansible Full VM Playbooks. Here's how it works:

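# List the datastores configured on the PBS host
# (pvesm is a Proxmox VE tool and is not available on PBS)
proxmox-backup-manager datastore list --output-format json | jq '.[].name'

# Create a sync job to your S3 backend.
# PVE_AUTH_COOKIE must hold a full "name=value" cookie string; curl treats a
# value without "=" as a cookie file name. Cookie-authenticated write requests
# also need a CSRFPreventionToken header.
curl -X POST \
  https://pbs.example.com:8007/api2/json/storage/pbs-s3/syncjobs \
  --cookie "${PVE_AUTH_COOKIE}" \
  --header 'Content-Type: application/json' \
  --data '{
    "enabled": true,
    "storage": "pbs-s3",
    "parallelism": 4,
    "bandwidth-limit-mbs": 50,
    "mode": "incremental"
  }'

# Verify the job exists (Proxmox APIs wrap results in a top-level "data" key)
curl -s https://pbs.example.com:8007/api2/json/syncjobs \
  --cookie "${PVE_AUTH_COOKIE}" | jq '.data[] | select(.id == 1)'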

The parallelism field in the API request maps directly to the MAX_WORKERS setting. If you set them differently, PBS uses whichever is lower at runtime — so if you bump workers globally but forget to update existing jobs, they won't benefit from your new configuration.
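The "whichever is lower" rule described above can be modeled in a couple of lines. This is only a sketch of the behavior as stated in this article, with a hypothetical helper name, not code taken from PBS.

```shell
#!/bin/sh
# effective_workers GLOBAL_MAX JOB_PARALLELISM
# Prints the worker count a job would actually get: the lower of the two.
effective_workers() {
  if [ "$1" -lt "$2" ]; then
    echo "$1"
  else
    echo "$2"
  fi
}

effective_workers 8 4   # global raised, job left at 4: prints 4
effective_workers 4 8   # job raised, global left at 4: prints 4
```

Both calls print 4, which is the trap: raising only one of the two values changes nothing.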

Encryption: To Encrypt or Not?

PBS supports two encryption modes for S3 sync: server-side (S3-managed keys) and client-side (AES-256 with a passphrase). Client-side encryption is the more secure option because data is encrypted before it leaves your network, but it adds CPU overhead. In my experience running PBS on modest hardware, hardware AES acceleration (AES-NI, built into most modern CPUs) absorbs the extra load without noticeable impact. If PBS itself runs in a VM, make sure the virtual CPU type exposes the aes flag to the guest.
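A quick way to check for hardware AES support on a Linux host is to look for the aes flag in /proc/cpuinfo. The helper name below is hypothetical; the optional file argument exists only so the check can be pointed at a saved copy.

```shell
#!/bin/sh
# has_aes_flag [CPUINFO_FILE]
# Returns 0 if the "aes" CPU flag is listed. Defaults to /proc/cpuinfo.
# Matches the x86 "flags" line and the ARM "Features" line.
has_aes_flag() {
  grep -E '^(flags|Features)' "${1:-/proc/cpuinfo}" 2>/dev/null | grep -qw 'aes'
}

if has_aes_flag; then
  echo "hardware AES available"
else
  echo "hardware AES not exposed to this host"
fi
```

For an actual throughput number rather than a yes/no answer, `openssl speed -evp aes-256-gcm` gives a single-core benchmark you can compare against your bandwidth limit.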

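# Generate the passphrase file first so the job never references a missing file.
# umask 077 creates it with mode 0600 from the start, with no copy left in /tmp.
# Keep an offline copy: without the passphrase the backups cannot be restored.
mkdir -p /etc/pve/priv/storage/pbs-s3
(umask 077 && openssl rand -base64 32 > /etc/pve/priv/storage/pbs-s3/passphrase)

# Enable client-side encryption for a sync job
curl -X POST \
  https://pbs.example.com:8007/api2/json/storage/pbs-s3/syncjobs/1/config \
  --cookie "${PVE_AUTH_COOKIE}" \
  --header 'Content-Type: application/json' \
  --data '{
    "encryption": true,
    "passphrase-file": "/etc/pve/priv/storage/pbs-s3/passphrase"
  }'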

The tradeoff here is straightforward: encrypted syncs take about 15-20% longer on a single stream, but with parallel workers the overhead becomes less significant. If you're backing up sensitive data to an offsite cloud provider and want zero-knowledge encryption, it's worth it. For homelab setups where the data isn't sensitive and you're comfortable with the provider holding the keys, server-side encryption is perfectly adequate.

Comparing Backup Strategies for S3 Offsite

Different workloads benefit from different approaches. Here's how I've seen them compare across typical use cases:

Strategy	Best For	Sync Speed	Storage Cost	Complexity
Incremental daily syncs	Most homelabs and small DCs	Fast (deduplication)	Low to moderate	Simple
Full weekly + incremental daily	Large datasets, infrequent changes	Variable	Moderate	Medium
Continuous replication (PBS 4.2+)	Mission-critical workloads	Slower per-job	Higher	More complex
Tiered S3 storage (Glacier)	Long-term retention	Slow retrieval	Lowest	Requires lifecycle rules

For most people reading this, incremental daily syncs to standard S3 with client-side encryption are the sweet spot. They give you point-in-time recovery for each VM and container without excessive complexity or cost.

Monitoring Your Sync Jobs

Once your parallel sync jobs are running, monitoring becomes critical, especially in the first few days while you're tuning parameters. The two commands I use most are:

# Check active sync processes on PBS server (pgrep never matches itself)
pgrep -af 'pbs-sync|pxar'

# Monitor interface byte counters during a sync (replace eth0 with your interface)
watch -n 5 'grep eth0 /proc/net/dev'
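Raw counters in /proc/net/dev are cumulative, so they are hard to read at a glance. A minimal sketch that turns two samples into an upload rate follows; the helper names are hypothetical and eth0 is a placeholder for your interface.

```shell
#!/bin/sh
# tx_bytes IFACE [FILE]
# Prints the cumulative transmitted-bytes counter for IFACE.
# After the interface name, /proc/net/dev lists 8 receive fields,
# then the transmit fields, so TX bytes is the 10th column overall.
tx_bytes() {
  awk -v ifc="$1" '{ sub(/:/, " ") } $1 == ifc { print $10 }' "${2:-/proc/net/dev}"
}

# tx_rate_mbs IFACE SECONDS
# Samples the counter twice and prints the average rate in MB/s.
tx_rate_mbs() {
  rate_start=$(tx_bytes "$1")
  sleep "$2"
  rate_end=$(tx_bytes "$1")
  awk -v a="$rate_start" -v b="$rate_end" -v s="$2" \
    'BEGIN { printf "%.2f\n", (b - a) / s / 1000000 }'
}

# Usage: tx_rate_mbs eth0 5
```

Because the result is in MB/s, it can be compared directly with BANDWIDTH_LIMIT_MBS.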

I also recommend setting up alerts for common failure modes. The two most frequent issues I've encountered are:

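S3 throttling: If you're issuing too many requests (syncing lots of small files does this), S3 can throttle you. AWS signals this with 503 Slow Down, and some S3-compatible services use 429 Too Many Requests instead. Mitigate this by lowering MAX_WORKERS or the bandwidth limit and by staggering sync jobs so they aren't all running simultaneously during peak hours.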
Stale locks from interrupted syncs: When a sync job is killed unexpectedly (OOM killer or manual restart), PBS can leave stale lock files that prevent subsequent runs. A simple cron entry helps:

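# /etc/cron.d/pbs-lock-cleanup
# Runs daily at 04:00; pick an hour when no sync job is active.
# Output goes to syslog so a failing cleanup stays visible.
0 4 * * * root /usr/sbin/proxmox-backup-manager cleanup --force 2>&1 | logger -t pbs-lock-cleanup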
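For the throttling case, the usual client-side remedy is to retry with exponential backoff rather than hammering the endpoint. The wrapper below is a generic sketch with a hypothetical name; it is not a PBS feature, and a production version would also add random jitter to the delay.

```shell
#!/bin/sh
# retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# doubling the wait between attempts. Returns 0 on success, 1 otherwise.
retry_with_backoff() {
  rwb_max=$1
  rwb_delay=$2
  shift 2
  rwb_attempt=1
  while true; do
    "$@" && return 0
    if [ "$rwb_attempt" -ge "$rwb_max" ]; then
      return 1
    fi
    sleep "$rwb_delay"
    rwb_delay=$((rwb_delay * 2))
    rwb_attempt=$((rwb_attempt + 1))
  done
}

# Usage: up to 5 attempts, waiting 2, 4, 8, then 16 seconds between them
#   retry_with_backoff 5 2 some-upload-command --with args
```

The wrapper retries on any non-zero exit status, so it is best used around commands that are safe to repeat.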

Performance Tuning Tips from Production

After running PBS with S3 offsite across multiple clusters, here are the tweaks that actually moved the needle:

Set SYNC_INTERVAL=7200 instead of hourly if you have many VMs. The extra time between sync runs reduces lock contention and lets jobs complete without overlapping.
Use dedicated network interfaces for S3 traffic when possible. I run PBS on a separate VLAN from my management and guest traffic — it eliminates the "noisy neighbor" problem during large sync operations.
Monitor your disk I/O latency with iostat -x 1 during syncs. If you're using ZFS (and Proxmox as a NAS: Storage Pitfalls and Best Practices has great advice here), I set zfs_send_parallel=4 to match the PBS worker count; check that this tunable is available in your OpenZFS release before relying on it.
Size your pool correctly: I've seen storage pools fill up unexpectedly when deduplication ratios are lower than expected during the first full syncs. Plan for 20% headroom above your estimated final size.
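The headroom rule from the last tip can be turned into a quick calculation. This sketch assumes you have an estimate of the logical data size and an expected deduplication ratio; the helper name and both sample ratios are illustrative, not measured values.

```shell
#!/bin/sh
# pool_size_gb LOGICAL_GB DEDUP_RATIO [HEADROOM_PCT]
# Prints the pool size to plan for: logical size divided by the
# dedup ratio, plus headroom (20% unless a third argument is given).
pool_size_gb() {
  awk -v l="$1" -v r="$2" -v h="${3:-20}" \
    'BEGIN { printf "%.0f\n", l / r * (1 + h / 100) }'
}

pool_size_gb 10000 4   # optimistic 4:1 ratio, prints 3000
pool_size_gb 10000 2   # pessimistic 2:1 ratio, prints 6000
```

Running it with a pessimistic ratio shows how much the result swings when early full syncs deduplicate worse than expected.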

Conclusion

Configuring parallel sync jobs in PBS 4.2 for S3 offsite backups is one of those things that seems simple but has real performance implications — and I've seen too many homelabs and small datacenters run their backup windows into the ground because they didn't tune these settings properly. The key takeaways are to start with MAX_WORKERS=4, set a reasonable bandwidth limit, enable client-side encryption if your workload demands it, and monitor the actual sync behavior rather than assuming your configuration is working as expected.

The next step for most people is either Run MinIO on Proxmox LXC as a Self-Hosted S3 Backend, if you want to eliminate the cloud dependency entirely, or setting up cross-site replication with another PBS server in your homelab. Either way, parallel sync is worth getting right now rather than reconfiguring later when your backup window has already become a problem.