K3s Kubernetes Cluster on Proxmox VMs Setup Guide
Deploy a three-node K3s Kubernetes cluster on Proxmox VMs using cloud-init templates. From bare VMs to a working kubeconfig in 30 minutes, with Longhorn persistent storage.
Deploying K3s on Proxmox VMs gives you a production-ready Kubernetes cluster that boots from cloud-init templates in under 30 minutes. The end result: a three-node cluster — one control plane, two workers — with a working kubeconfig ready for kubectl and Longhorn handling persistent storage across both workers. This guide uses Proxmox VE 9.1 and K3s v1.32, the current stable release as of April 2026. If you've got Proxmox running and want Kubernetes without upstream complexity, K3s is the most direct path there.
Key Takeaways
- Template first: Build one Ubuntu 24.04 cloud-init template, clone it for every node — identical base, zero config drift.
- VM sizing: Control plane needs 2 vCPU and 4 GB RAM minimum; workers at 2 vCPU / 4 GB handle most homelab workloads comfortably.
- Networking: K3s uses Flannel VXLAN by default — a single Proxmox Linux bridge handles it without SDN or VLAN config.
- Storage: Longhorn needs a dedicated virtio disk per worker, not the OS disk — add it before deploying Longhorn.
- HA tradeoff: Single control plane is fine for homelabs; true HA requires three control-plane nodes with embedded etcd, worth it only for production.
Why K3s Instead of Full Kubernetes on Proxmox
K3s is a CNCF-certified Kubernetes distribution maintained by SUSE. It packages the entire control plane as a single ~70 MB binary, replaces etcd with SQLite by default (or embedded etcd for HA), and drops cloud-provider integrations that don't apply to Proxmox anyway.
For homelab and small production setups, the advantages over kubeadm-managed Kubernetes are concrete:
- Single binary install: No `kubeadm init`, no separate etcd cluster, no kubelet config juggling.
- Lower RAM floor: The K3s control plane idles around 500 MB vs. 1.5–2 GB for a full Kubernetes control plane.
- Auto-upgrade support: The `system-upgrade-controller` lets you roll cluster upgrades without SSH-ing into each node.
- Built-in ingress: Traefik ships as the default ingress controller — functional out of the box, swappable if you prefer ingress-nginx.
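For the auto-upgrade path, the `system-upgrade-controller` works from a small Plan manifest. A minimal sketch, assuming the controller is already installed in the `system-upgrade` namespace; the plan name and target version here are illustrative, not prescriptive:

```yaml
# Hypothetical upgrade plan targeting control-plane nodes.
# Workers get a second Plan with an inverted nodeSelector.
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: control-plane-upgrade
  namespace: system-upgrade
spec:
  concurrency: 1
  cordon: true
  serviceAccountName: system-upgrade
  nodeSelector:
    matchExpressions:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
  upgrade:
    image: rancher/k3s-upgrade
  version: v1.32.3+k3s1   # illustrative target release
```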
If you're already running Docker inside LXC containers on Proxmox and want to move toward orchestration, K3s is the lowest-friction upgrade path. Docker-in-LXC works well for a handful of services, but once you hit five or more containers that need health checks, scheduling, and rolling deployments, Kubernetes scheduling pays for itself immediately.
Hardware and VM Requirements
You don't need a dedicated machine. A single Proxmox host with 32 GB RAM and an NVMe drive can run this entire setup with room to spare for other VMs.
| Node | vCPU | RAM | OS Disk | Extra Disk | Role |
|---|---|---|---|---|---|
| k3s-control | 2 | 4 GB | 32 GB | — | Control plane |
| k3s-worker-1 | 2 | 4 GB | 32 GB | 50 GB (Longhorn) | Worker |
| k3s-worker-2 | 2 | 4 GB | 32 GB | 50 GB (Longhorn) | Worker |
The Longhorn disks are thin-provisioned virtio disks — Proxmox allocates storage lazily, so a 50 GB thin disk only consumes what Longhorn actually writes. On NVMe-to-NVMe, expect a 50 GB Longhorn volume to provision in under 10 seconds.
All three VMs on the same bridge (vmbr0) is sufficient for a homelab cluster. If you want to isolate cluster traffic from your LAN — and for anything exposed to the internet you should — see configuring VLANs on Proxmox with Linux bridges for a clean segmentation approach before you start.
Building the Base Ubuntu Cloud-Init Template
Every node in this cluster starts as a clone of the same base template. Get the template right and the rest is cloning plus one install command per node.
Download the Ubuntu 24.04 Cloud Image
SSH into your Proxmox host and pull the cloud image:
wget https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img \
-O /tmp/noble-server-cloudimg-amd64.img
Ubuntu cloud images ship with cloud-init pre-installed and the ubuntu user pre-configured for SSH key injection. No guest OS bootstrapping required.
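Before importing, it's worth failing fast on a corrupt download. A minimal sketch, where `verify_image` is a hypothetical helper; compare against the hash published in the `SHA256SUMS` file that sits alongside the image on cloud-images.ubuntu.com:

```shell
# verify_image FILE EXPECTED_SHA256
# Compares a file's SHA-256 against an expected value; non-zero exit on mismatch.
verify_image() {
    actual=$(sha256sum "$1" | awk '{ print $1 }')
    if [ "$actual" = "$2" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (got $actual)" >&2
        return 1
    fi
}

# Usage -- the hash argument is a placeholder, not the real image hash:
# verify_image /tmp/noble-server-cloudimg-amd64.img <sha256 from SHA256SUMS>
```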
Create and Configure the Template VM
# VM ID 9000 is a common convention for templates
qm create 9000 \
--name ubuntu-2404-cloud \
--memory 4096 \
--cores 2 \
--net0 virtio,bridge=vmbr0 \
--ostype l26 \
--agent enabled=1
# Import the cloud image as the primary disk
qm importdisk 9000 /tmp/noble-server-cloudimg-amd64.img local-lvm
# Attach it as a scsi disk and set boot order
qm set 9000 \
--scsihw virtio-scsi-pci \
--scsi0 local-lvm:vm-9000-disk-0,discard=on,ssd=1 \
--boot c \
--bootdisk scsi0
# Add the cloud-init drive
qm set 9000 --ide2 local-lvm:cloudinit
# Serial console is required for cloud-init display
qm set 9000 --serial0 socket --vga serial0
# Inject your SSH public key and set DHCP as the default
qm set 9000 \
--ciuser ubuntu \
--sshkeys ~/.ssh/authorized_keys \
--ipconfig0 ip=dhcp
Resize the OS Disk and Convert to Template
The cloud image ships with only a few gigabytes of virtual disk. Resize it before converting: Proxmox templates are read-only, so the template disk cannot be resized after conversion.
qm resize 9000 scsi0 32G
qm template 9000
That's the template. Every K3s node will be a full clone of VM 9000, booting with a fresh hostname and a DHCP address on first start. If the disk shows as unused0 after import instead of being attached, run qm set 9000 --scsi0 local-lvm:vm-9000-disk-0 to re-attach it — this happens when the storage name in the import path doesn't exactly match the storage ID in Proxmox.
How to Deploy the K3s Control Plane Node
Clone the Template and Assign a Static IP
Static IPs are not optional here. If the control plane IP changes after cluster initialization, the TLS certificates and Flannel overlay both break.
# Full clone so the VM is independent of the template (not a linked clone)
qm clone 9000 101 --name k3s-control --full
# Static IP for the control plane
qm set 101 --ipconfig0 ip=192.168.1.10/24,gw=192.168.1.1
qm start 101
Wait for cloud-init to finish its first-boot run (about 30 seconds), or poll it directly with `ssh ubuntu@192.168.1.10 cloud-init status --wait`, then SSH in:
ssh ubuntu@192.168.1.10
Install K3s on the Control Plane
curl -sfL https://get.k3s.io | sh -s - server \
--cluster-init \
--tls-san 192.168.1.10 \
--disable traefik \
--node-name k3s-control
Flags worth explaining:
- `--cluster-init` initializes embedded etcd, enabling HA expansion later if you add more control-plane nodes.
- `--tls-san 192.168.1.10` adds the control plane's IP to the TLS certificate SANs — required for `kubectl` connections from outside the VM.
- `--disable traefik` is personal preference; remove this flag if you want Traefik as your ingress controller out of the box.
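The same settings can live in K3s's config file instead of the install command, which keeps them durable across reinstalls and upgrades. A sketch of `/etc/rancher/k3s/config.yaml` equivalent to the flags above; create it before running the installer and the flags can be dropped entirely:

```yaml
# /etc/rancher/k3s/config.yaml -- read by the k3s server on startup.
# Keys mirror the CLI flag names.
cluster-init: true
tls-san:
  - 192.168.1.10
disable:
  - traefik
node-name: k3s-control
```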
K3s installs, starts, and enables the k3s systemd service in about 45 seconds on a modern CPU. Grab the node token for worker joins:
sudo cat /var/lib/rancher/k3s/server/node-token
Copy the kubeconfig to your local machine and fix the server address:
# Run from your local machine
scp ubuntu@192.168.1.10:/etc/rancher/k3s/k3s.yaml ~/.kube/config
sed -i 's/127.0.0.1/192.168.1.10/g' ~/.kube/config
chmod 600 ~/.kube/config
Verify:
kubectl get nodes
k3s-control should appear in Ready state within 60 seconds of the install completing.
Joining Worker Nodes to the Cluster
Clone the template twice more:
qm clone 9000 102 --name k3s-worker-1 --full
qm set 102 --ipconfig0 ip=192.168.1.11/24,gw=192.168.1.1
qm clone 9000 103 --name k3s-worker-2 --full
qm set 103 --ipconfig0 ip=192.168.1.12/24,gw=192.168.1.1
qm start 102 && qm start 103
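Taken together, the clone-and-configure steps for all three nodes are mechanical enough to script. A dry-run sketch, where `gen_clone_cmds` is a hypothetical helper and the VM IDs and IPs match this guide's plan; pipe the output to `sh` on the Proxmox host to actually apply it:

```shell
# gen_clone_cmds VMID NAME IP -- print the qm commands for one node.
gen_clone_cmds() {
    echo "qm clone 9000 $1 --name $2 --full"
    echo "qm set $1 --ipconfig0 ip=$3/24,gw=192.168.1.1"
}

# Dry run for all three nodes; review, then pipe to sh to execute.
for spec in "101 k3s-control 192.168.1.10" \
            "102 k3s-worker-1 192.168.1.11" \
            "103 k3s-worker-2 192.168.1.12"; do
    set -- $spec
    gen_clone_cmds "$1" "$2" "$3"
done
```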
SSH into each worker and run the K3s agent installer. Replace <your-node-token> with the token from the previous step:
curl -sfL https://get.k3s.io | \
K3S_URL=https://192.168.1.10:6443 \
K3S_TOKEN=<your-node-token> \
sh -s - agent \
--node-name k3s-worker-1
Repeat on k3s-worker-2 with --node-name k3s-worker-2. Each agent join completes in under 30 seconds. From your local machine:
kubectl get nodes -o wide
Expected output:
NAME STATUS ROLES AGE VERSION
k3s-control Ready control-plane,etcd,master 5m v1.32.3+k3s1
k3s-worker-1 Ready <none> 2m v1.32.3+k3s1
k3s-worker-2 Ready <none> 1m v1.32.3+k3s1
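If you script the joins, or later add more workers, the agent command varies only by node name. A sketch with a hypothetical `gen_join_cmd` helper; the printed commands still need to run on each worker over SSH, and the token placeholder stays yours to fill in:

```shell
# gen_join_cmd SERVER_URL TOKEN NODE_NAME -- print the k3s agent install command.
gen_join_cmd() {
    printf 'curl -sfL https://get.k3s.io | K3S_URL=%s K3S_TOKEN=%s sh -s - agent --node-name %s\n' \
        "$1" "$2" "$3"
}

for worker in k3s-worker-1 k3s-worker-2; do
    gen_join_cmd "https://192.168.1.10:6443" "<your-node-token>" "$worker"
done
```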
Adding Persistent Storage with Longhorn
K3s includes a local-path provisioner that creates node-local volumes — fine for stateless workloads, useless for anything that needs to survive pod rescheduling to a different node. Longhorn replicates block storage across your worker nodes and fixes this.
Add a Dedicated Disk to Each Worker
From the Proxmox host (VMs stay running — virtio hotplug works on Linux 5.x+ guests):
qm set 102 --virtio1 local-lvm:50,discard=on
qm set 103 --virtio1 local-lvm:50,discard=on
The disks appear immediately as /dev/vdb inside the VMs; verify with `lsblk`. Do not partition or format them — Longhorn manages the raw block device directly.
Install Longhorn Prerequisites on Each Worker
sudo apt-get install -y open-iscsi nfs-common
sudo systemctl enable --now iscsid
Skipping open-iscsi is the single most common reason Longhorn volumes get stuck in Attaching. The iSCSI initiator failure is silent — the pod just hangs.
Deploy Longhorn v1.7.1
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.7.1/deploy/longhorn.yaml
Watch the rollout (takes 3–4 minutes on first deploy):
kubectl -n longhorn-system get pods --watch
Once all pods are running, set Longhorn as the default storage class:
kubectl patch storageclass local-path \
-p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
kubectl patch storageclass longhorn \
-p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
Now any PVC without an explicit storage class gets a Longhorn volume replicated across both workers. Confirm with `kubectl get storageclass`: the `longhorn` class should be marked `(default)`.
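To confirm the default class actually provisions, a throwaway PVC is enough. A minimal sketch; the name `longhorn-test` is arbitrary, and the claim should reach Bound within a minute, after which it can be deleted:

```yaml
# test-pvc.yaml -- apply with: kubectl apply -f test-pvc.yaml
# No storageClassName given, so the cluster default (Longhorn) is used.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-test
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

Check with `kubectl get pvc longhorn-test`; a Bound status means Longhorn created and replicated the volume. Clean up with `kubectl delete pvc longhorn-test`.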
Common Gotchas and How to Fix Them
Nodes stuck in NotReady after join: Almost always a firewall issue. K3s needs ports 6443 (API), 8472/UDP (Flannel VXLAN), and 10250 (kubelet) open between nodes. If ufw is active on your VMs, it will silently drop VXLAN traffic. Either disable it or open those ports explicitly:
sudo ufw allow 6443/tcp
sudo ufw allow 8472/udp
sudo ufw allow 10250/tcp
Node IPs change after a DHCP lease renewal: This is why the static IP step matters. If you skipped it and used DHCP, the Flannel overlay breaks when the IP changes. Fix it by setting a static IP with `qm set --ipconfig0` on the affected VM, then rebooting it so cloud-init reapplies the network config.
kubectl get nodes shows the node as NotReady after a Proxmox host reboot: Check that the k3s and k3s-agent services started correctly. Cloud-init sometimes races with systemd on first boot after a snapshot restore.
sudo systemctl status k3s # control plane
sudo journalctl -u k3s-agent -f # workers
Template disk shows as unused0 after import: Re-attach it manually:
qm set 9000 --scsi0 local-lvm:vm-9000-disk-0
This happens when the storage name used in qm importdisk doesn't exactly match the storage pool ID shown in the Proxmox UI.
Securing the Cluster
K3s writes the kubeconfig at /etc/rancher/k3s/k3s.yaml with root-only permissions by default, but installs that set `--write-kubeconfig-mode 644` (a common convenience) leave it world-readable. Verify and tighten it:
sudo chmod 600 /etc/rancher/k3s/k3s.yaml
The node token in /var/lib/rancher/k3s/server/node-token grants full cluster join rights. Treat it like a root password and rotate it after initial setup.
For the Proxmox host layer, hardening Proxmox VE with firewall, fail2ban, and SSH security covers host-level lockdown you should do in parallel with cluster setup.
For disaster recovery: Proxmox Backup Server can snapshot all three K3s VMs at the hypervisor level, giving you a clean restore point before cluster upgrades or Kubernetes version bumps. Pair hypervisor snapshots with Longhorn's built-in snapshot support for application-level recovery.
Conclusion
You now have a three-node K3s v1.32 cluster on Proxmox VE 9.1: control plane with embedded etcd, two workers with Longhorn persistent storage, and a local kubeconfig ready for kubectl. The cloud-init template approach means adding a fourth node is a qm clone and a 30-second agent join — no manual OS setup. The logical next step is deploying an ingress controller (ingress-nginx requires two kubectl apply commands) and exposing your first service outside the cluster.