Proxmox Monitoring with Grafana, Prometheus & Loki

Deploy a full Grafana, Prometheus, and Loki observability stack on Proxmox VE using LXC containers to monitor node metrics, VMs, and logs in one dashboard.

Proxmox Pulse
9 min read
grafana prometheus loki monitoring observability

If you're running Proxmox VE in a homelab or production environment and relying on the built-in web UI for monitoring, you're flying half blind. The Proxmox dashboard gives you a snapshot, but it won't alert you when a node's memory climbs to 95%, when a VM starts hammering your NVMe, or when a service inside an LXC starts spewing errors at 3am. A proper observability stack changes that entirely.

This guide walks you through deploying Grafana, Prometheus, and Loki on Proxmox VE — all running inside lightweight LXC containers. By the end you'll have real-time node metrics, VM stats, and log aggregation flowing into a unified Grafana dashboard you can actually act on.

Why Grafana + Prometheus + Loki?

These three tools form the most popular open-source observability stack in existence, and for good reason:

  • Prometheus scrapes and stores time-series metrics (CPU, RAM, disk I/O, network throughput)
  • Loki aggregates and indexes logs without the heavyweight indexing of Elasticsearch
  • Grafana unifies both into dashboards, alerting, and exploration tools

All three are actively maintained, have massive community dashboards you can import, and run comfortably inside LXC containers with modest resource requirements.

Architecture Overview

Here's how the stack fits together on a typical Proxmox node:

┌──────────────────────────────────────────────────────┐
│ Proxmox Node                                         │
│                                                      │
│  ┌──────────────┐ ┌───────────────┐ ┌────────────┐   │
│  │ pve-exporter │ │ node_exporter │ │  Promtail  │   │
│  └──────┬───────┘ └───────┬───────┘ └─────┬──────┘   │
└─────────┼─────────────────┼───────────────┼──────────┘
          │ scraped by      │               │ pushes to
    ┌─────▼─────────────────▼──────┐   ┌────▼──────────┐
    │       LXC: Prometheus        │   │   LXC: Loki   │
    └──────────────┬───────────────┘   └───────┬───────┘
                   │        queried by         │
           ┌───────▼───────────────────────────▼──────┐
           │               LXC: Grafana               │
           └──────────────────────────────────────────┘

If you're running a Proxmox cluster, the exporters run on each node, but Prometheus, Loki, and Grafana only need to run once.

Step 1: Create the LXC Containers

We'll use three separate LXC containers to keep concerns isolated and make upgrades easier. Ubuntu 22.04 is a solid choice for all three.

Prometheus LXC

In the Proxmox UI, go to Create CT and use these settings:

  • Template: Ubuntu 22.04
  • Hostname: prometheus
  • CPU: 2 cores
  • RAM: 1024 MB (2048 MB if you have many targets)
  • Disk: 20 GB (metrics retention storage — increase for longer history)
  • Network: Static IP on your management VLAN (e.g. 192.168.10.20/24)

Or via CLI:

pct create 200 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
  --hostname prometheus \
  --cores 2 \
  --memory 1024 \
  --rootfs local-lvm:20 \
  --net0 name=eth0,bridge=vmbr0,ip=192.168.10.20/24,gw=192.168.10.1 \
  --unprivileged 1 \
  --start 1

Repeat for Loki (CT 201, IP 192.168.10.21, 10 GB disk) and Grafana (CT 202, IP 192.168.10.22, 5 GB disk).
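
If you'd rather script those two as well, the equivalent CLI commands look like this (the 2-core / 1 GB sizing for Loki and Grafana is an assumption; both run comfortably at that size in a homelab):

pct create 201 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
  --hostname loki \
  --cores 2 \
  --memory 1024 \
  --rootfs local-lvm:10 \
  --net0 name=eth0,bridge=vmbr0,ip=192.168.10.21/24,gw=192.168.10.1 \
  --unprivileged 1 \
  --start 1

pct create 202 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
  --hostname grafana \
  --cores 2 \
  --memory 1024 \
  --rootfs local-lvm:5 \
  --net0 name=eth0,bridge=vmbr0,ip=192.168.10.22/24,gw=192.168.10.1 \
  --unprivileged 1 \
  --start 1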

Step 2: Install Prometheus

SSH into the Prometheus container and run:

apt update && apt install -y wget

wget https://github.com/prometheus/prometheus/releases/download/v2.51.0/prometheus-2.51.0.linux-amd64.tar.gz
tar xvf prometheus-2.51.0.linux-amd64.tar.gz
mv prometheus-2.51.0.linux-amd64 /opt/prometheus

Create a dedicated system user:

useradd --no-create-home --shell /bin/false prometheus
chown -R prometheus:prometheus /opt/prometheus

Create the systemd service at /etc/systemd/system/prometheus.service:

[Unit]
Description=Prometheus
After=network.target

[Service]
User=prometheus
ExecStart=/opt/prometheus/prometheus \
  --config.file=/opt/prometheus/prometheus.yml \
  --storage.tsdb.path=/opt/prometheus/data \
  --storage.tsdb.retention.time=30d \
  --web.listen-address=0.0.0.0:9090
Restart=always

[Install]
WantedBy=multi-user.target

Enable and start it:

systemctl daemon-reload
systemctl enable --now prometheus

Prometheus should now be accessible at http://192.168.10.20:9090.
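
Before moving on, you can sanity-check it from inside the container; Prometheus exposes a health endpoint for exactly this:

curl http://localhost:9090/-/healthy
# expected output: Prometheus Server is Healthy.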

Step 3: Install the Proxmox VE Exporter

The pve-exporter exposes Proxmox node and VM metrics to Prometheus. Install it directly on your Proxmox host (not in a container):

apt install -y python3-pip
pip3 install prometheus-pve-exporter

On Proxmox VE 8 (Debian 12), pip refuses system-wide installs by default; if you hit an "externally-managed-environment" error, add --break-system-packages or install into a venv.

Create the config directory (mkdir -p /etc/prometheus) and a config file at /etc/prometheus/pve.yml with a dedicated read-only Proxmox API user:

default:
  user: pve-exporter@pve
  password: yourpasswordhere
  verify_ssl: false

First, create that API user in Proxmox:

pveum user add pve-exporter@pve --password yourpasswordhere
pveum aclmod / -user pve-exporter@pve -role PVEAuditor
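
If you'd rather not keep a plaintext password around, pve-exporter also accepts Proxmox API tokens. A sketch of the token-based pve.yml, where the token name "monitoring" is an arbitrary choice:

# Create the token first and copy the secret it prints:
#   pveum user token add pve-exporter@pve monitoring --privsep 0
default:
  user: pve-exporter@pve
  token_name: monitoring
  token_value: <secret printed by pveum>
  verify_ssl: false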

Create the systemd service at /etc/systemd/system/pve-exporter.service:

[Unit]
Description=Proxmox VE Prometheus Exporter
After=network.target

[Service]
ExecStart=/usr/local/bin/pve_exporter --config.file /etc/prometheus/pve.yml
Restart=always

[Install]
WantedBy=multi-user.target

Enable and start it:

systemctl daemon-reload
systemctl enable --now pve-exporter

The exporter listens on port 9221 by default. Test it:

curl 'http://localhost:9221/pve?target=localhost'

You should see a wall of pve_* metrics.

Step 4: Install Node Exporter

Node exporter provides hardware and OS-level metrics. Install it on each Proxmox host:

wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xvf node_exporter-1.7.0.linux-amd64.tar.gz
mv node_exporter-1.7.0.linux-amd64/node_exporter /usr/local/bin/

useradd --no-create-home --shell /bin/false node_exporter

Systemd service at /etc/systemd/system/node_exporter.service:

[Unit]
Description=Node Exporter
After=network.target

[Service]
User=node_exporter
ExecStart=/usr/local/bin/node_exporter
Restart=always

[Install]
WantedBy=multi-user.target

Enable and start it:

systemctl daemon-reload
systemctl enable --now node_exporter

Node exporter listens on port 9100.
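
A quick test, same as before:

curl -s http://localhost:9100/metrics | head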

Step 5: Configure Prometheus Scrape Targets

Edit /opt/prometheus/prometheus.yml on your Prometheus container:

global:
  scrape_interval: 30s
  evaluation_interval: 30s

scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets:
          - '192.168.10.1:9100'  # proxmox-node-1
          - '192.168.10.2:9100'  # proxmox-node-2 (if clustered)

  - job_name: 'pve'
    metrics_path: /pve
    params:
      module: [default]
    static_configs:
      - targets:
          - '192.168.10.1'  # proxmox-node-1 (API target; the exporter address is set by relabeling below)
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: '192.168.10.1:9221'
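
Before restarting, it's worth validating the file with promtool, which ships in the same tarball:

/opt/prometheus/promtool check config /opt/prometheus/prometheus.yml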

Restart Prometheus to apply:

systemctl restart prometheus

Navigate to http://192.168.10.20:9090/targets to confirm all targets show UP in green.

Step 6: Install and Configure Loki

SSH into the Loki container and install it:

wget https://github.com/grafana/loki/releases/download/v2.9.4/loki-linux-amd64.zip
apt install -y unzip
unzip loki-linux-amd64.zip
mv loki-linux-amd64 /usr/local/bin/loki
chmod a+x /usr/local/bin/loki

useradd --no-create-home --shell /bin/false loki
mkdir -p /opt/loki/data
chown -R loki:loki /opt/loki

Create a minimal config at /opt/loki/loki-config.yml:

auth_enabled: false

server:
  http_listen_port: 3100

ingester:
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
  chunk_idle_period: 5m
  chunk_retain_period: 30s
  wal:
    dir: /opt/loki/data/wal  # the default WAL path is relative to the working directory and won't be writable by the loki user

schema_config:
  configs:
    - from: 2024-01-01
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  boltdb_shipper:
    active_index_directory: /opt/loki/data/index
    cache_location: /opt/loki/data/cache
    shared_store: filesystem
  filesystem:
    directory: /opt/loki/data/chunks

limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h

chunk_store_config:
  max_look_back_period: 0s

table_manager:
  retention_deletes_enabled: false
  retention_period: 0s

Systemd service at /etc/systemd/system/loki.service:

[Unit]
Description=Loki Log Aggregation
After=network.target

[Service]
User=loki
ExecStart=/usr/local/bin/loki --config.file=/opt/loki/loki-config.yml
Restart=always

[Install]
WantedBy=multi-user.target

Enable and start it:

systemctl daemon-reload
systemctl enable --now loki
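
Give it a few seconds, then check the readiness endpoint:

curl http://localhost:3100/ready
# prints "ready" once startup has finished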

Step 7: Install Promtail on Proxmox Hosts

Promtail ships your Proxmox host logs to Loki. Install it on each Proxmox node:

wget https://github.com/grafana/loki/releases/download/v2.9.4/promtail-linux-amd64.zip
unzip promtail-linux-amd64.zip
mv promtail-linux-amd64 /usr/local/bin/promtail
chmod a+x /usr/local/bin/promtail

Create the config directory (mkdir -p /etc/promtail) and the config at /etc/promtail/config.yml:

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml  # fine for testing; use a persistent path like /var/lib/promtail/positions.yaml in production

clients:
  - url: http://192.168.10.21:3100/loki/api/v1/push

scrape_configs:
  - job_name: syslog
    static_configs:
      - targets:
          - localhost
        labels:
          job: syslog
          host: proxmox-node-1
          __path__: /var/log/syslog

  - job_name: pve
    static_configs:
      - targets:
          - localhost
        labels:
          job: pve
          host: proxmox-node-1
          __path__: /var/log/pve/tasks/active

Create and enable a systemd service just like the others.
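
A minimal sketch at /etc/systemd/system/promtail.service (note the absence of a User= line: Promtail runs as root here so it can read /var/log/syslog):

[Unit]
Description=Promtail Log Shipper
After=network.target

[Service]
ExecStart=/usr/local/bin/promtail -config.file /etc/promtail/config.yml
Restart=always

[Install]
WantedBy=multi-user.target

Enable and start it:

systemctl daemon-reload
systemctl enable --now promtail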

Step 8: Install and Configure Grafana

SSH into the Grafana container:

apt-get install -y apt-transport-https software-properties-common wget gnupg
mkdir -p /etc/apt/keyrings
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor > /etc/apt/keyrings/grafana.gpg
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" > /etc/apt/sources.list.d/grafana.list
apt update && apt install -y grafana
systemctl enable --now grafana-server

Grafana is now running on port 3000. Access it at http://192.168.10.22:3000 with the default credentials admin / admin; you'll be prompted to set a new password on first login.

Adding Data Sources

Navigate to Connections > Data Sources and add:

  1. Prometheus — URL: http://192.168.10.20:9090
  2. Loki — URL: http://192.168.10.21:3100

Click Save & Test for each to confirm connectivity.
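
If you prefer configuration as code, Grafana can provision both sources at startup instead. A sketch, using the IPs above, at /etc/grafana/provisioning/datasources/datasources.yml (restart grafana-server after creating it):

apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://192.168.10.20:9090
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://192.168.10.21:3100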

Step 9: Import Community Dashboards

This is where the stack really pays off. The Grafana dashboard community has done most of the work for you.

Go to Dashboards > Import and enter these IDs:

Dashboard                    ID
Node Exporter Full           1860
Proxmox VE (pve-exporter)    10347
Loki Dashboard               13639

For each: enter the ID, click Load, select your Prometheus or Loki data source, and click Import.

Within a few minutes you'll have comprehensive dashboards showing:

  • Node Exporter Full — CPU steal, memory pressure, disk saturation, network errors, load average
  • Proxmox VE — VM and LXC CPU/RAM usage, storage pool utilization, node status
  • Loki — Log streams with live tail, regex filtering, error rate panels

Step 10: Set Up Alerting

Grafana's built-in alerting can notify you via email, Slack, Telegram, PagerDuty, and more. Here's a practical alert for high memory usage:

  1. Open the Node Exporter Full dashboard
  2. Find the Memory Usage panel and click Edit
  3. Go to the Alert tab and click Create alert rule
  4. Set the condition: avg(node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) < 0.10
  5. Set the evaluation interval to 1m with a 5m pending ("for") period
  6. Add a Contact Point under Alerting > Contact points

Useful alert rules to configure, with PromQL sketches after the list:

  • Node memory available < 10%
  • ZFS pool usage > 80%
  • Node CPU usage > 90% sustained for 5 minutes
  • Proxmox node offline (target down)
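
As a starting point, here are sketches of the matching PromQL expressions. Metric names come from node_exporter and pve-exporter; the pool-usage expression assumes your pools show up as pve-exporter storage entries, so adjust the id label filter to match your environment:

# Node memory available < 10%
(node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) < 0.10

# Storage/ZFS pool usage > 80% (pve-exporter storage metrics)
(pve_disk_usage_bytes{id=~"storage/.+"} / pve_disk_size_bytes{id=~"storage/.+"}) > 0.80

# Node CPU usage > 90% (evaluate with a 5m "for" duration)
(1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))) > 0.90

# Proxmox node offline (scrape target down)
up == 0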

Practical Tips for Production Use

Firewall the exporters. Node exporter and pve-exporter expose sensitive system data. Use the Proxmox firewall or iptables to restrict access to only the Prometheus container IP.

# On Proxmox host — allow only Prometheus to scrape
iptables -A INPUT -p tcp --dport 9100 -s 192.168.10.20 -j ACCEPT
iptables -A INPUT -p tcp --dport 9100 -j DROP
# Note: these rules don't survive a reboot; persist them with iptables-persistent
# or use the built-in Proxmox firewall instead.

Set retention periods. Prometheus defaults to 15 days. The --storage.tsdb.retention.time=30d flag in the systemd service extends this. 30 days on 20 GB is comfortable for a single node.

Use LXC snapshots before upgrades. Before upgrading Grafana or Prometheus, take a snapshot of the container:

pct snapshot 202 before-grafana-upgrade

You can roll back in seconds if something breaks.
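
Rolling back is a single command:

pct rollback 202 before-grafana-upgrade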

Put Grafana behind a reverse proxy. Don't expose Grafana directly on port 3000. Use Nginx or Caddy in another LXC container with HTTPS and authentication:

server {
    listen 443 ssl;
    server_name grafana.yourdomain.local;

    # Placeholder paths; point these at your actual certificate and key
    ssl_certificate     /etc/ssl/certs/grafana.crt;
    ssl_certificate_key /etc/ssl/private/grafana.key;

    location / {
        proxy_pass http://192.168.10.22:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

Conclusion

Deploying Grafana, Prometheus, and Loki on Proxmox transforms how you operate your infrastructure. Instead of reactively checking the web UI when something feels slow, you get proactive visibility into every node, every VM, and every log stream — all from a single dashboard.

The entire stack runs comfortably in three LXC containers using around 2.5 GB of RAM combined, which is a tiny tax for the operational insight you gain. Start with the community dashboards, add alerts for the metrics that matter most to your environment, and you'll catch problems before your users do.

If you're running a Proxmox cluster, the same setup scales naturally — just add each node's exporters as additional Prometheus scrape targets and you get a unified view across your entire cluster without deploying additional instances of Grafana or Loki.

Written by

Proxmox Pulse

Sysadmin-driven guides for getting the most out of Proxmox VE in production and homelab environments.
