Proxmox Monitoring with Grafana, Prometheus & Loki
Deploy a full Grafana, Prometheus, and Loki observability stack on Proxmox VE using LXC containers to monitor node metrics, VMs, and logs in one dashboard.
If you're running Proxmox VE in a homelab or production environment and relying on the built-in web UI for monitoring, you're flying half blind. The Proxmox dashboard gives you a snapshot, but it won't alert you when a node's memory climbs to 95%, when a VM starts hammering your NVMe, or when a service inside an LXC starts spewing errors at 3am. A proper observability stack changes that entirely.
This guide walks you through deploying Grafana, Prometheus, and Loki on Proxmox VE — all running inside lightweight LXC containers. By the end you'll have real-time node metrics, VM stats, and log aggregation flowing into a unified Grafana dashboard you can actually act on.
Why Grafana + Prometheus + Loki?
These three tools form the most popular open-source observability stack in existence, and for good reason:
- Prometheus scrapes and stores time-series metrics (CPU, RAM, disk I/O, network throughput)
- Loki aggregates and indexes logs without the heavyweight indexing of Elasticsearch
- Grafana unifies both into dashboards, alerting, and exploration tools
All three are actively maintained, have massive community dashboards you can import, and run comfortably inside LXC containers with modest resource requirements.
Architecture Overview
Here's how the stack fits together on a typical Proxmox node:
┌─────────────────────────────────────────────────────┐
│ Proxmox Node                                        │
│                                                     │
│ ┌──────────────┐ ┌───────────────┐ ┌──────────────┐ │
│ │ pve-exporter │ │ node-exporter │ │   Promtail   │ │
│ └──────┬───────┘ └───────┬───────┘ └──────┬───────┘ │
│        │                 │                │         │
│ ┌──────▼─────────────────▼─────┐ ┌────────▼───────┐ │
│ │       LXC: Prometheus        │ │   LXC: Loki    │ │
│ └─────────────┬────────────────┘ └────────┬───────┘ │
│               │                           │         │
│ ┌─────────────▼───────────────────────────▼───────┐ │
│ │                  LXC: Grafana                   │ │
│ └─────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────┘
If you're running a Proxmox cluster, the exporters run on each node, but Prometheus, Loki, and Grafana only need to run once.
Step 1: Create the LXC Containers
We'll use three separate LXC containers to keep concerns isolated and make upgrades easier. Ubuntu 22.04 is a solid choice for all three.
Prometheus LXC
In the Proxmox UI, go to Create CT and use these settings:
- Template: Ubuntu 22.04
- Hostname: prometheus
- CPU: 2 cores
- RAM: 1024 MB (2048 MB if you have many targets)
- Disk: 20 GB (metrics retention storage — increase for longer history)
- Network: Static IP on your management VLAN (e.g. 192.168.10.20/24)
Or via CLI:
pct create 200 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
--hostname prometheus \
--cores 2 \
--memory 1024 \
--rootfs local-lvm:20 \
--net0 name=eth0,bridge=vmbr0,ip=192.168.10.20/24,gw=192.168.10.1 \
--unprivileged 1 \
--start 1
Repeat for Loki (CT 201, IP 192.168.10.21, 10 GB disk) and Grafana (CT 202, IP 192.168.10.22, 5 GB disk).
Step 2: Install Prometheus
SSH into the Prometheus container and run:
apt update && apt install -y wget
wget https://github.com/prometheus/prometheus/releases/download/v2.51.0/prometheus-2.51.0.linux-amd64.tar.gz
tar xvf prometheus-2.51.0.linux-amd64.tar.gz
mv prometheus-2.51.0.linux-amd64 /opt/prometheus
Create a dedicated system user:
useradd --no-create-home --shell /bin/false prometheus
chown -R prometheus:prometheus /opt/prometheus
Create the systemd service at /etc/systemd/system/prometheus.service:
[Unit]
Description=Prometheus
After=network.target
[Service]
User=prometheus
ExecStart=/opt/prometheus/prometheus \
--config.file=/opt/prometheus/prometheus.yml \
--storage.tsdb.path=/opt/prometheus/data \
--storage.tsdb.retention.time=30d \
--web.listen-address=0.0.0.0:9090
Restart=always
[Install]
WantedBy=multi-user.target
Enable and start it:
systemctl daemon-reload
systemctl enable --now prometheus
Prometheus should now be accessible at http://192.168.10.20:9090.
Step 3: Install the Proxmox VE Exporter
The pve-exporter exposes Proxmox node and VM metrics to Prometheus. Install it directly on your Proxmox host (not in a container). Note that on Proxmox VE 8 (Debian 12), pip blocks system-wide installs by default, so either use a virtual environment or pass --break-system-packages:
apt install -y python3-pip
pip3 install prometheus-pve-exporter
Create a config file at /etc/prometheus/pve.yml with a dedicated read-only Proxmox API user:
default:
  user: pve-exporter@pve
  password: yourpasswordhere
  verify_ssl: false
Before the exporter can authenticate, create that API user in Proxmox:
pveum user add pve-exporter@pve --password yourpasswordhere
pveum aclmod / -user pve-exporter@pve -role PVEAuditor
Create the systemd service at /etc/systemd/system/pve-exporter.service:
[Unit]
Description=Proxmox VE Prometheus Exporter
After=network.target
[Service]
ExecStart=/usr/local/bin/pve_exporter --config.file /etc/prometheus/pve.yml
Restart=always
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl enable --now pve-exporter
The exporter listens on port 9221 by default. Test it:
curl 'http://localhost:9221/pve?target=localhost'
You should see a wall of pve_* metrics.
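That wall of text is the Prometheus exposition format, which is simple enough to parse by hand if you ever want to sanity-check it programmatically. A minimal sketch — the sample lines below are illustrative of the exporter's pve_up gauge, not an exact capture:

```python
# Sample exposition-format output, illustrative of pve-exporter's pve_up gauge.
sample = """\
# HELP pve_up Node/VM/Storage online status
# TYPE pve_up gauge
pve_up{id="node/pve1"} 1.0
pve_up{id="qemu/100"} 1.0
"""

def parse_metrics(text):
    """Return (name, labels, value) tuples, skipping comment lines."""
    out = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        head, value = line.rsplit(" ", 1)
        name, _, labels = head.partition("{")
        out.append((name, labels.rstrip("}"), float(value)))
    return out

for name, labels, value in parse_metrics(sample):
    print(name, labels, value)
```

A value of 1.0 for a node or guest means it is online, which is exactly what the target-down alert in Step 10 keys off.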
Step 4: Install Node Exporter
Node exporter provides hardware and OS-level metrics. Install it on each Proxmox host:
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xvf node_exporter-1.7.0.linux-amd64.tar.gz
mv node_exporter-1.7.0.linux-amd64/node_exporter /usr/local/bin/
useradd --no-create-home --shell /bin/false node_exporter
Systemd service at /etc/systemd/system/node_exporter.service:
[Unit]
Description=Node Exporter
After=network.target
[Service]
User=node_exporter
ExecStart=/usr/local/bin/node_exporter
Restart=always
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl enable --now node_exporter
Node exporter listens on port 9100.
Step 5: Configure Prometheus Scrape Targets
Edit /opt/prometheus/prometheus.yml on your Prometheus container:
global:
  scrape_interval: 30s
  evaluation_interval: 30s

scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets:
          - '192.168.10.1:9100'   # proxmox-node-1
          - '192.168.10.2:9100'   # proxmox-node-2 (if clustered)

  - job_name: 'pve'
    metrics_path: /pve
    params:
      module: [default]
    static_configs:
      - targets:
          - '192.168.10.1'        # proxmox-node-1 (host only — no port)
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: '192.168.10.1:9221'   # the exporter's address
Restart Prometheus to apply:
systemctl restart prometheus
Navigate to http://192.168.10.20:9090/targets to confirm all targets show UP in green.
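Beyond the targets page, it's worth running a query or two in the expression browser (the Graph tab) to confirm data is actually flowing. These use standard node_exporter metric names:

```promql
# CPU busy percentage per node over the last 5 minutes
100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100

# Fraction of memory still available
node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes
```

If both return series for each node, the scrape pipeline is healthy end to end.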
Step 6: Install and Configure Loki
SSH into the Loki container and install it:
wget https://github.com/grafana/loki/releases/download/v2.9.4/loki-linux-amd64.zip
apt install -y unzip
unzip loki-linux-amd64.zip
mv loki-linux-amd64 /usr/local/bin/loki
chmod +x /usr/local/bin/loki
useradd --no-create-home --shell /bin/false loki
mkdir -p /opt/loki/data
chown -R loki:loki /opt/loki
Create a minimal config at /opt/loki/loki-config.yml:
auth_enabled: false

server:
  http_listen_port: 3100

ingester:
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
  chunk_idle_period: 5m
  chunk_retain_period: 30s

schema_config:
  configs:
    - from: 2024-01-01
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  boltdb_shipper:
    active_index_directory: /opt/loki/data/index
    cache_location: /opt/loki/data/cache
  filesystem:
    directory: /opt/loki/data/chunks

limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h

chunk_store_config:
  max_look_back_period: 0s

table_manager:
  retention_deletes_enabled: false
  retention_period: 0s
Systemd service at /etc/systemd/system/loki.service:
[Unit]
Description=Loki Log Aggregation
After=network.target
[Service]
User=loki
ExecStart=/usr/local/bin/loki --config.file=/opt/loki/loki-config.yml
Restart=always
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl enable --now loki
Step 7: Install Promtail on Proxmox Hosts
Promtail ships your Proxmox host logs to Loki. Install it on each Proxmox node:
wget https://github.com/grafana/loki/releases/download/v2.9.4/promtail-linux-amd64.zip
apt install -y unzip
unzip promtail-linux-amd64.zip
mv promtail-linux-amd64 /usr/local/bin/promtail
chmod +x /usr/local/bin/promtail
Create /etc/promtail/config.yml:
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://192.168.10.21:3100/loki/api/v1/push

scrape_configs:
  - job_name: syslog
    static_configs:
      - targets:
          - localhost
        labels:
          job: syslog
          host: proxmox-node-1
          __path__: /var/log/syslog

  - job_name: pve
    static_configs:
      - targets:
          - localhost
        labels:
          job: pve
          host: proxmox-node-1
          __path__: /var/log/pve/tasks/active
Create and enable a systemd service for Promtail the same way as for the other components, then start it.
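For reference, a minimal /etc/systemd/system/promtail.service mirroring the pattern used above — it runs as root so Promtail can read /var/log/syslog; a dedicated user in the adm group works too:

```ini
[Unit]
Description=Promtail Log Shipper
After=network.target

[Service]
# No User= line: runs as root so /var/log/syslog is readable.
# Alternatively, create a promtail user and add it to the adm group.
ExecStart=/usr/local/bin/promtail -config.file=/etc/promtail/config.yml
Restart=always

[Install]
WantedBy=multi-user.target
```

Enable it with systemctl daemon-reload followed by systemctl enable --now promtail.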
Step 8: Install and Configure Grafana
SSH into the Grafana container:
apt-get install -y apt-transport-https software-properties-common wget gpg
mkdir -p /etc/apt/keyrings
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor > /etc/apt/keyrings/grafana.gpg
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" > /etc/apt/sources.list.d/grafana.list
apt update && apt install -y grafana
systemctl enable --now grafana-server
Grafana is now running on port 3000. Access it at http://192.168.10.22:3000 with default credentials admin / admin.
Adding Data Sources
Navigate to Connections > Data Sources and add:
- Prometheus — URL: http://192.168.10.20:9090
- Loki — URL: http://192.168.10.21:3100
Click Save & Test for each to confirm connectivity.
Step 9: Import Community Dashboards
This is where the stack really pays off. The Grafana dashboard community has done most of the work for you.
Go to Dashboards > Import and enter these IDs:
| Dashboard | ID |
|---|---|
| Node Exporter Full | 1860 |
| Proxmox VE (pve-exporter) | 10347 |
| Loki Dashboard | 13639 |
For each: enter the ID, click Load, select your Prometheus or Loki data source, and click Import.
Within a few minutes you'll have comprehensive dashboards showing:
- Node Exporter Full — CPU steal, memory pressure, disk saturation, network errors, load average
- Proxmox VE — VM and LXC CPU/RAM usage, storage pool utilization, node status
- Loki — Log streams with live tail, regex filtering, error rate panels
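With the Loki data source wired up, Grafana's Explore view also accepts ad-hoc LogQL queries. The labels here match the Promtail config from Step 7:

```logql
# All syslog lines from node 1 containing "error" (case-insensitive)
{job="syslog", host="proxmox-node-1"} |~ "(?i)error"

# Per-second rate of error lines, averaged over 5-minute windows
rate({job="syslog"} |~ "(?i)error" [5m])
```

The second form returns a metric rather than log lines, which makes it usable in dashboard panels and alert rules.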
Step 10: Set Up Alerting
Grafana's built-in alerting can notify you via email, Slack, Telegram, PagerDuty, and more. Here's a practical alert for high memory usage:
- Open the Node Exporter Full dashboard
- Find the Memory Usage panel and click Edit
- Go to the Alert tab and click Create alert rule
- Set the condition: avg(node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) < 0.10
- Set the evaluation interval to 1m and the pending ("for") period to 5m
- Add a Contact Point under Alerting > Contact points
Useful alert rules to configure:
- Node memory available < 10%
- ZFS pool usage > 80%
- Node CPU usage > 90% sustained for 5 minutes
- Proxmox node offline (target down)
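Those rules translate to PromQL roughly as follows — metric names come from node_exporter and pve-exporter, and the thresholds are suggestions to tune for your environment:

```promql
# Node memory available < 10%
(node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) < 0.10

# Node CPU usage > 90% (pair with a 5m "for" duration in the alert rule)
100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100 > 90

# Storage/ZFS pool usage > 80% (pve-exporter's disk gauges)
pve_disk_usage_bytes / pve_disk_size_bytes > 0.80

# Scrape target down — covers an offline node or a dead exporter
up == 0
```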
Practical Tips for Production Use
Firewall the exporters. Node exporter and pve-exporter expose sensitive system data. Use the Proxmox firewall or iptables to restrict access to only the Prometheus container IP.
# On Proxmox host — allow only Prometheus to scrape
iptables -A INPUT -p tcp --dport 9100 -s 192.168.10.20 -j ACCEPT
iptables -A INPUT -p tcp --dport 9100 -j DROP
Set retention periods. Prometheus defaults to 15 days. The --storage.tsdb.retention.time=30d flag in the systemd service extends this. 30 days on 20 GB is comfortable for a single node.
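A quick back-of-the-envelope check on that claim. Prometheus's commonly cited on-disk cost is 1-2 bytes per sample after compression; the series count below is a rough guess for two nodes' worth of exporters, so treat the result as an order-of-magnitude estimate:

```python
# Rough Prometheus disk-usage estimate.
# Assumptions: ~2 bytes/sample on disk after compression,
# ~3000 active series across two nodes' exporters, 30s scrape interval.
bytes_per_sample = 2
active_series = 3000
scrape_interval_s = 30
retention_days = 30

samples_per_day = active_series * (86_400 // scrape_interval_s)
total_bytes = samples_per_day * bytes_per_sample * retention_days
print(f"{total_bytes / 1e9:.1f} GB")  # well under the 20 GB rootfs
```

Even doubling every assumption leaves you at around 2 GB for 30 days, which is why the 20 GB disk is comfortable for a small cluster.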
Use LXC snapshots before upgrades. Before upgrading Grafana or Prometheus, take a snapshot of the container:
pct snapshot 202 before-grafana-upgrade
You can roll back in seconds if something breaks.
Put Grafana behind a reverse proxy. Don't expose Grafana directly on port 3000. Use Nginx or Caddy in another LXC container with HTTPS and authentication:
server {
    listen 443 ssl;
    server_name grafana.yourdomain.local;

    # Placeholder paths — point these at your own certificate
    ssl_certificate     /etc/ssl/certs/grafana.crt;
    ssl_certificate_key /etc/ssl/private/grafana.key;

    location / {
        proxy_pass http://192.168.10.22:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
Conclusion
Deploying Grafana, Prometheus, and Loki on Proxmox transforms how you operate your infrastructure. Instead of reactively checking the web UI when something feels slow, you get proactive visibility into every node, every VM, and every log stream — all from a single dashboard.
The entire stack runs comfortably in three LXC containers using around 2.5 GB of RAM combined, which is a tiny tax for the operational insight you gain. Start with the community dashboards, add alerts for the metrics that matter most to your environment, and you'll catch problems before your users do.
If you're running a Proxmox cluster, the same setup scales naturally — just add each node's exporters as additional Prometheus scrape targets and you get a unified view across your entire cluster without deploying additional instances of Grafana or Loki.