Build a Software-Defined Datacenter with Proxmox VE
Build a Proxmox VE 9.1 SDDC with Ceph shared storage, SDN overlay networking, and HA clustering — VMware vSphere capabilities at zero licensing cost.
A software-defined datacenter on Proxmox VE means compute, storage, and networking are all managed in software — no proprietary SAN required, no $45,000 VMware license, no vendor lock-in. By the end of this guide, you'll have a three-node Proxmox VE 9.1 cluster running shared Ceph 18.2 storage, SDN overlay networking, HA policies that automatically restart VMs on node failure, and Proxmox Datacenter Manager as a unified control plane. This is the architecture I run in production on commodity hardware — it scales from a three-machine homelab to a small business deployment without redesigning anything.
Key Takeaways
- Minimum hardware: Three nodes with at least 32 GB RAM, one dedicated OSD disk per node, and a separate Ceph replication network.
- Shared storage is the HA dependency: Without Ceph, a VM's disk exists only on the node that failed, so HA has nothing to restart it from until that node returns or you restore a replicated copy or backup.
- SDN replaces static bridges: Proxmox SDN VNet zones let you define VM networks cluster-wide without touching /etc/network/interfaces on each node.
- HA requires quorum: A three-node cluster tolerates exactly one node failure; a two-node cluster needs a dedicated quorum device.
- PDM is alpha but functional: Proxmox Datacenter Manager 0.5 handles monitoring and VM management reliably — don't expose it publicly without TLS.
What the Proxmox SDDC Stack Looks Like
Before diving into steps, it helps to map the Proxmox components against the VMware equivalents most people are replacing:
| VMware Component | Proxmox Equivalent | Status |
|---|---|---|
| vSphere / ESXi | Proxmox VE 9.1 + KVM | Stable, GA |
| vSAN | Ceph 18.2 (Reef) | Stable, integrated |
| NSX / vDS | Proxmox SDN 1.x | Stable, built-in |
| vCenter | Proxmox Datacenter Manager 0.5 | Alpha |
| vSphere HA | Proxmox HA Manager | Stable |
| DRS auto-migration | Not available | Manual migration only |
The honest gap is automated workload balancing (DRS). Proxmox HA restarts a VM on the best available node after a failure, but it won't proactively migrate VMs based on live CPU load. For most workloads this isn't an issue — you pin workloads at creation time and live-migrate manually when needed.
Step 1: Build the Three-Node Cluster
Everything else in this stack depends on a functioning corosync cluster. On your first node:
pvecm create my-sddc --link0 192.168.10.1
On each additional node, join with:
# Run on node2 — substitute its own corosync ring IP
pvecm add 192.168.10.1 --link0 192.168.10.2
Verify cluster health before proceeding:
pvecm status
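You want Quorate: Yes and Nodes: 3. If pvecm add hangs, check the network path first. Current Proxmox VE releases run corosync 3 over kronosnet, which uses unicast UDP rather than multicast, so the usual culprits are a firewall or switch ACL dropping UDP 5405–5412 between the ring addresses, or a --link0 address the other nodes cannot reach. On a node that is already in the cluster, check the state of each corosync link:
corosync-cfgtool -s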
Use a dedicated VLAN or physical interface for corosync traffic. Sharing the corosync link with VM traffic creates latency spikes that trigger spurious node fencing — I learned this the hard way after a busy backup window caused two unexpected failovers.
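The health check from this step is easy to script so you can rerun it after every change. This is a minimal sketch, assuming pvecm status prints "Quorate:" and "Nodes:" lines in its usual layout; check_cluster is a hypothetical helper name, not part of Proxmox.

```shell
#!/bin/sh
# check_cluster EXPECTED_NODES
# Reads `pvecm status` output on stdin. Succeeds only if the cluster
# reports itself quorate and the node count matches EXPECTED_NODES.
check_cluster() {
    awk -v want="$1" '
        $1 == "Quorate:" { quorate = $2 }
        $1 == "Nodes:"   { nodes = $2 }
        END { exit ((quorate == "Yes" && nodes + 0 == want + 0) ? 0 : 1) }
    '
}

# Usage on a cluster node:
#   pvecm status | check_cluster 3 && echo "cluster healthy"
```

Reading from stdin keeps the helper testable against saved output, which is handy when you want to compare the state before and after a change.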
Step 2: Deploy Shared Ceph Storage
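Ceph is what turns "three Proxmox nodes that talk to each other" into a real SDDC. With Ceph, a VM's disk lives on the cluster rather than on a single node, so HA can restart the VM on any surviving node without copying disk data first. Expect recovery in roughly two minutes; my measured figures are in Step 4.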
Install Ceph on All Three Nodes
Via the web UI: Node > Ceph > Install, or on each node via CLI:
pveceph install --version reef
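This guide uses Ceph 18.2 (Reef). Check the release notes for the Ceph release your Proxmox VE version ships by default, since newer Proxmox VE releases move to newer Ceph releases. Do not run mixed Ceph versions across nodes for longer than a rolling upgrade takes: upgrade node by node, with a health check between each.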
Bootstrap Monitors and Managers
On the first node, initialize Ceph and point it at a dedicated replication network:
pveceph init --network 10.10.10.0/24
pveceph mon create
pveceph mgr create
Then add monitors on the other two nodes (run on node2 and node3 respectively):
pveceph mon create
pveceph mgr create
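Run three monitors so Ceph keeps quorum through a failure. Monitors need a strict majority, so three of them tolerate the loss of one. If two of the three go down, quorum is lost and all Ceph I/O pauses until a majority is back.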
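The majority rule behind monitor quorum is simple arithmetic, and the same rule applies to corosync votes. A small sketch; the function names are my own:

```shell
#!/bin/sh
# majority N: smallest number of voters that forms a strict majority of N.
majority() { echo $(( $1 / 2 + 1 )); }

# tolerated_failures N: how many of N voters can fail with quorum intact.
tolerated_failures() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 2 3 5; do
    echo "$n voters: quorum needs $(majority "$n"), tolerates $(tolerated_failures "$n") failure(s)"
done
```

Two voters tolerate no failures at all, which is why going from one monitor to two buys nothing and why even counts are generally avoided.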
Add OSDs
Each node needs at least one dedicated block device — never the OS disk:
# Run on each node; substitute the correct device
pveceph osd create /dev/nvme1n1
With one OSD per node (three total), you get size-3 replication by default — every write goes to all three disks simultaneously. Effective usable space is one-third of raw capacity: three 1 TB NVMes yield roughly 1 TB usable.
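The capacity math is worth writing down before you buy disks. This sketch divides raw capacity by the replica count, then applies an 85% planning threshold. That threshold is my assumption, chosen to match Ceph's default nearfull warning level as I understand it, so adjust it to your own policy.

```shell
#!/bin/sh
# usable_gb OSD_COUNT OSD_SIZE_GB REPLICA_SIZE
# Raw capacity divided by the number of replicas kept of every object.
usable_gb() { echo $(( $1 * $2 / $3 )); }

# practical_gb USABLE_GB
# Capacity you should plan to fill, leaving 15% headroom (assumed threshold).
practical_gb() { echo $(( $1 * 85 / 100 )); }

usable=$(usable_gb 3 1000 3)
echo "usable: ${usable} GB, plan for about $(practical_gb "$usable") GB"
```

For the three 1 TB NVMe example this prints 1000 GB usable and about 850 GB as a planning figure.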
After adding all OSDs, check cluster health:
ceph status
Wait for HEALTH_OK before continuing. HEALTH_WARN (clock skew detected) means NTP is broken — fix it first:
systemctl enable --now chrony
chronyc makestep
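Rather than rerunning the status command by hand, you can poll until the cluster settles. A minimal sketch, assuming ceph health prints the state (HEALTH_OK, HEALTH_WARN, ...) as the first word of its output; wait_for_health_ok is a hypothetical helper.

```shell
#!/bin/sh
# wait_for_health_ok TRIES DELAY CMD...
# Runs CMD up to TRIES times, DELAY seconds apart, and succeeds as soon as
# the first word of its output is HEALTH_OK.
wait_for_health_ok() {
    tries="$1"; delay="$2"; shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        state=$("$@" 2>/dev/null | awk 'NR == 1 { print $1 }')
        if [ "$state" = "HEALTH_OK" ]; then
            return 0
        fi
        i=$((i + 1))
        echo "attempt $i: ${state:-no output}" >&2
        if [ "$i" -lt "$tries" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage on a cluster node (up to 10 minutes, polling every 10 seconds):
#   wait_for_health_ok 60 10 ceph health
```

Passing the command as arguments means the same loop works with any check that prints a state word first.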
Create the VM Storage Pool
pveceph pool create vm-pool --add_storages 1
This creates a Ceph RBD pool and registers it as Proxmox storage automatically. The storage entry takes the pool's name, so it should appear under Datacenter > Storage as vm-pool; confirm the exact ID with pvesm status before referencing it in later commands. On NVMe-to-NVMe with 3x replication, expect sequential writes around 600–800 MB/s and random 4K write IOPS around 15,000–25,000 on a lightly loaded cluster, which is more than sufficient for typical VM workloads.
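Before pointing VMs at the new storage, it is worth confirming that the entry exists and is active on every node. This sketch assumes the usual pvesm status column layout (name first, status third); storage_is_active is a hypothetical helper and STORAGE_ID is a placeholder for whatever ID your pool was registered under.

```shell
#!/bin/sh
# storage_is_active NAME
# Reads `pvesm status` output on stdin. Succeeds if a storage with that ID
# is listed and its Status column reads "active".
storage_is_active() {
    awk -v name="$1" '
        $1 == name && $3 == "active" { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on each node:
#   pvesm status | storage_is_active "$STORAGE_ID" || echo "storage missing or inactive"
```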
Step 3: Configure SDN Overlay Networking
Static Linux bridges (vmbr0, vmbr1) require manual duplication on every node. Proxmox SDN lets you define zones and VNets once, then push the config cluster-wide from a single command or web UI action.
Create a VXLAN Zone
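VXLAN zones use UDP port 4789 to tunnel overlay traffic between nodes. Create one via the web UI at Datacenter > SDN > Zones > Add > VXLAN, or via the API. SDN zone and VNet IDs are limited to eight characters, so if the API rejects a name used below, shorten it (for example, prod instead of production) and use the shortened ID consistently in every later command: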
pvesh create /cluster/sdn/zones \
--zone production \
--type vxlan \
--peers "192.168.10.1,192.168.10.2,192.168.10.3"
Make sure UDP 4789 is open between all nodes at the firewall and switch level. VXLAN failures are silent — VMs show as connected but cannot reach each other.
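Another quiet failure mode to rule out alongside the port is MTU: VXLAN wraps every frame in extra headers, so the overlay needs a smaller MTU than the underlay. The sketch below works through the arithmetic for an IPv4 underlay, assuming a standard 1500-byte physical MTU; the helper names are my own.

```shell
#!/bin/sh
# vxlan_inner_mtu UNDERLAY_MTU
# VXLAN over IPv4 adds 50 bytes: outer IP (20) + UDP (8) + VXLAN (8)
# + inner Ethernet header (14).
vxlan_inner_mtu() { echo $(( $1 - 50 )); }

# icmp_payload MTU
# Largest ping payload that fits in one packet: MTU minus IP (20) and ICMP (8).
icmp_payload() { echo $(( $1 - 28 )); }

underlay=1500
inner=$(vxlan_inner_mtu "$underlay")
echo "VNet MTU: $inner"
echo "test from inside a VM: ping -M do -s $(icmp_payload "$inner") <other-vm-ip>"
```

The ping command sets the don't-fragment flag, so if small pings between VMs on different nodes succeed but this one fails, suspect MTU rather than the firewall.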
Create VNets for Workload Segmentation
# Web tier
pvesh create /cluster/sdn/vnets \
--vnet vnet-web \
--zone production \
--tag 100
# Database tier
pvesh create /cluster/sdn/vnets \
--vnet vnet-db \
--zone production \
--tag 200
Apply the config to all cluster nodes simultaneously:
pvesh set /cluster/sdn
VNets appear as bridge interfaces that you assign to VMs just like vmbr0. The overlay handles cross-node VM communication transparently. For the VLAN fundamentals that inform good SDN zone design, Configuring VLANs on Proxmox with Linux Bridges covers the tagging concepts that map directly to SDN zone architecture.
Step 4: Enable High Availability
With Ceph storage and a quorate cluster, HA is now a configuration step rather than an infrastructure challenge. Create an HA group to control node preference:
ha-manager groupadd production-group \
--nodes "node1:3,node2:2,node3:1" \
--restricted 0
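Node priority (3 = most preferred) tells HA where to restart a VM first. --restricted 0 allows fallback to any cluster node if the preferred nodes are unavailable. Proxmox VE 9 also introduces HA rules with node affinity as the successor to groups, so if your release rejects groupadd, express the same preference as a node-affinity rule instead.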
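To make the priority string concrete, here is a simplified model of the preference order: among the nodes that are still online, the one with the highest number wins. It deliberately ignores any tie-breaking the real HA manager applies between nodes of equal priority, and pick_node is a hypothetical name.

```shell
#!/bin/sh
# pick_node "node1:3,node2:2,node3:1" "node2 node3"
# Prints the online node with the highest priority from the spec.
pick_node() {
    spec="$1"; online="$2"
    echo "$spec" | tr ',' '\n' | while IFS=: read -r node prio; do
        for up in $online; do
            if [ "$node" = "$up" ]; then
                echo "$prio $node"
            fi
        done
    done | sort -rn | awk 'NR == 1 { print $2 }'
}

pick_node "node1:3,node2:2,node3:1" "node1 node2 node3"   # all nodes up
pick_node "node1:3,node2:2,node3:1" "node2 node3"         # node1 has failed
```

With all nodes up the VM lands on node1; once node1 is gone, node2 is next in line.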
Enable HA for a specific VM:
ha-manager add vm:100 --state started --group production-group
Test it for real: power off a node while a managed VM is running. The surviving nodes wait until the failed node's watchdog is guaranteed to have fenced it, then restart the VM elsewhere. On my hardware, the VM is back online in 80–110 seconds from the moment the node powers off. Most of that time is the fencing delay, which exists to prevent the same VM from starting twice. /etc/pve/ha/manager_status only records the HA manager's runtime state and is not a place to tune it.
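During the failover test it helps to record where the VM was running before and after. This sketch assumes ha-manager status prints one line per resource in the form "service vm:100 (node1, started)", which is the layout shown in the HA manager documentation; service_location is a hypothetical helper.

```shell
#!/bin/sh
# service_location SID
# Reads `ha-manager status` output on stdin and prints "<node> <state>"
# for the given service ID, for example "node2 started".
service_location() {
    awk -v sid="$1" '
        $1 == "service" && $2 == sid {
            gsub(/[(),]/, "", $0)   # strip punctuation; awk re-splits the fields
            print $3, $4
        }
    '
}

# Usage on any surviving node:
#   ha-manager status | service_location vm:100
```

Run it once before powering off the node and again in a loop afterwards to see the moment the service moves.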
The critical gotcha: HA is disabled when the cluster loses quorum. If two of three nodes go down simultaneously, the surviving node will not take over VMs — it suspects a network partition rather than a genuine failure. Always run an odd number of nodes, or add a quorum device for a two-node cluster:
# Run on a cluster node; 192.168.1.100 is a lightweight VM or host outside the cluster running corosync-qnetd (the cluster nodes need the corosync-qdevice package)
pvecm qdevice setup 192.168.1.100
Step 5: Install Proxmox Datacenter Manager
PDM is your vCenter equivalent: a single web interface across multiple Proxmox clusters. Download the PDM 0.5.x ISO from the official Proxmox downloads page and install it as a dedicated 2 vCPU / 4 GB RAM VM with a 32 GB disk. After first boot, access the interface at https://<pdm-ip>:8443.
Create a Scoped API Token for PDM
Never use root credentials. Create a read-only audit token with the minimum required permissions:
# Run on your Proxmox cluster
pveum user add pdm-ro@pve
pveum aclmod / -user pdm-ro@pve -role PVEAuditor
pveum user token add pdm-ro@pve monitoring --privsep 0
In PDM, go to Remote > Add Remote, enter the cluster URL and the token string. PDM displays VM states, storage utilization, node health, and task logs across all connected clusters from a single view.
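You can verify the token outside PDM before pasting it in. Proxmox API tokens are sent in an Authorization header of the form PVEAPIToken=USER@REALM!TOKENID=SECRET. The sketch below only builds that header; token_header is a hypothetical helper, and the secret and node address in the usage line are placeholders.

```shell
#!/bin/sh
# token_header USER@REALM TOKEN_ID SECRET
# Prints the Authorization header the Proxmox VE API expects for a token.
token_header() {
    printf 'Authorization: PVEAPIToken=%s!%s=%s\n' "$1" "$2" "$3"
}

# Usage (SECRET is the value printed once when the token was created;
# curl must also trust the node certificate, e.g. via --cacert):
#   curl -s -H "$(token_header pdm-ro@pve monitoring "$SECRET")" \
#        "https://<node-ip>:8006/api2/json/nodes"
```

If the request returns node data, the token and its role are working and any remaining problem is on the PDM side.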
PDM 0.5 is appropriate for monitoring and read-heavy operations in production. VM migrations and creation work reliably but occasionally need a page refresh after submission; this is an alpha limitation, not a data integrity issue. Note that the PVEAuditor token above is read-only, so those management actions need a separate token with a broader role. Put PDM behind a reverse proxy with a valid TLS certificate; the default self-signed cert triggers browser warnings and breaks some API clients.
If you're building this as a replacement for a commercial VMware environment, Build a Private Cloud at Home with Proxmox VE gives useful context on how the simpler single-node setup compares to this multi-node architecture — helpful for explaining the tradeoffs to stakeholders.
Securing the SDDC
An SDDC has a larger attack surface than a single Proxmox node. Three non-negotiable steps:
Cluster-level firewall: Enable it at Datacenter > Firewall > Options > Firewall: Yes. This pushes iptables rules to all nodes simultaneously. Block everything except corosync (UDP 5405–5412), Ceph monitor (TCP 6789, 3300) and OSD (TCP 6800–7300), and your management VLAN.
# Verify firewall is active on all nodes
pvesh get /nodes/node1/firewall/options
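The port list above translates fairly directly into cluster firewall rules. This sketch prints rule lines in the /etc/pve/firewall/cluster.fw syntax for review rather than writing them anywhere. The /24 masks are my assumption based on the addresses used in this guide, and cluster_fw_rules is a hypothetical helper.

```shell
#!/bin/sh
# cluster_fw_rules COROSYNC_CIDR CEPH_CIDR
# Prints [RULES] entries allowing corosync and Ceph traffic from the
# given source networks. Review before adding them to cluster.fw.
cluster_fw_rules() {
    corosync_net="$1"; ceph_net="$2"
    cat <<EOF
[RULES]
IN ACCEPT -source $corosync_net -p udp -dport 5405:5412 # corosync
IN ACCEPT -source $ceph_net -p tcp -dport 3300 # Ceph monitor (msgr2)
IN ACCEPT -source $ceph_net -p tcp -dport 6789 # Ceph monitor (msgr1)
IN ACCEPT -source $ceph_net -p tcp -dport 6800:7300 # Ceph OSD
EOF
}

cluster_fw_rules 192.168.10.0/24 10.10.10.0/24
```

Add your management VLAN rules for the web UI and SSH before enabling the firewall, or you will lock yourself out.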
SSH hardening: Proxmox uses SSH between nodes for cluster operations such as migration, so you cannot disable it, but you can restrict it to key-based auth only and add fail2ban. The full checklist is in Hardening Proxmox VE: Firewall, fail2ban, and SSH Security.
API token scoping: Never run automation with root-level API tokens. The token you created for PDM (PVEAuditor) is a good template — scope any additional automation tokens to the minimum role required.
Automating Day-2 Operations
Once the cluster is validated, automate VM provisioning and routine maintenance. The Proxmox API is well-suited to Ansible:
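# Provision a VM onto Ceph storage with a specific VNet
- name: Create application VM
  hosts: proxmox_nodes[0]
  tasks:
    - name: Create VM
      # The storage ID before the colon must match the name pvesm status
      # reports for the Ceph pool; 32 is the disk size in GiB.
      ansible.builtin.command: >
        qm create 200
        --name app-server-01
        --memory 8192
        --cores 4
        --net0 virtio,bridge=vnet-web
        --scsi0 vm-pool:32,ssd=1
        --boot order=scsi0
        --agent enabled=1
      args:
        # Skip the task if the VM config already exists on this node
        creates: /etc/pve/qemu-server/200.conf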
For a complete playbook that handles cloud-init, template cloning, and idempotent VM state management across a cluster, Automate Proxmox VE with Ansible Full VM Playbooks covers the full pattern in production-ready form.
When This Complexity Is Worth It
This architecture makes sense if you're running 15+ VMs that need fault tolerance, replacing a commercial VMware or Nutanix deployment, or building a private cloud for a small business that can't afford downtime.
For a homelab with a single machine, it's overkill. Local ZFS storage with manual PBS backups covers 90% of homelab needs at a fraction of the operational burden. The SDDC setup pays off when a VM restart time measured in minutes is genuinely unacceptable.
One more realistic caveat on Ceph: if all three nodes use consumer NVMe drives on a 1 GbE network, your replication bottleneck is the network, not the disks. Use a dedicated 10 GbE or 25 GbE interface for Ceph replication traffic; the --network parameter in pveceph init is there for exactly this reason.
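A rough back-of-the-envelope model shows why the link matters. With size-3 replication the OSD that receives a write forwards two more copies over the network, so a node's sustained write rate is capped at about half its link speed. This is a simplification that ignores protocol overhead and assumes client and replication traffic share one link; the function names are my own.

```shell
#!/bin/sh
# link_mb_s GBIT: line rate in MB/s (1 Gbit/s = 125 MB/s).
link_mb_s() { echo $(( $1 * 1000 / 8 )); }

# write_ceiling_mb_s GBIT SIZE
# The primary OSD sends SIZE-1 replica copies out over the same link.
write_ceiling_mb_s() { echo $(( $1 * 1000 / 8 / ($2 - 1) )); }

for gbit in 1 10 25; do
    echo "${gbit} GbE: line rate $(link_mb_s "$gbit") MB/s, size-3 write ceiling ~$(write_ceiling_mb_s "$gbit" 3) MB/s per node"
done
```

On 1 GbE the ceiling is roughly 62 MB/s, well below what even a budget NVMe drive can sustain.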
Conclusion
You now have a functional software-defined datacenter: Ceph 18.2 shared storage, VXLAN overlay networking, HA automatic failover under two minutes, and Proxmox Datacenter Manager as the control plane — on commodity hardware, fully open source, zero licensing cost. The immediate next step is hardening cluster access and defining precise firewall rules; start with the checklist in Hardening Proxmox VE: Firewall, fail2ban, and SSH Security.