Setting Up Monitoring

You have a server running. You SSH in to check on it. But you don’t want to SSH in every time you wonder “is the disk full?” or “did Docker crash?”.

Monitoring collects that data automatically and puts it on a dashboard. You open a browser, see everything at once, and move on.

What to Monitor

Start with these four things:

CPU - Is it pegged? That’s a problem.
Memory - Running out means the kernel starts killing processes.
Disk - Full disk means services stop writing. Logs, databases, Docker images all grow over time.
Network - Is the server reachable? How much traffic is passing through?

Add service-specific metrics later (Docker container status, database connections, Nginx request rates). Start with the basics.
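For comparison, these are roughly the checks you would otherwise run by hand over SSH. The sketch below wraps two of them in small shell functions. The function names are made up for illustration, and it assumes a Linux host with /proc/meminfo and a POSIX df.

```shell
# Hypothetical helpers (names are illustrative, not part of any tool).

# Percentage of a filesystem in use, parsed from POSIX-format df output.
disk_used_pct() {
  df -P "${1:-/}" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Percentage of memory in use, from MemTotal and MemAvailable (both in kB).
# Reads /proc/meminfo unless another file is given as the first argument.
mem_used_pct() {
  awk '/^MemTotal:/ { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { printf "%d\n", (1 - avail / total) * 100 }' "${1:-/proc/meminfo}"
}

# On the server:
#   uptime            # load averages (CPU pressure)
#   mem_used_pct      # memory in use, percent
#   disk_used_pct /   # root filesystem in use, percent
```

A monitoring stack collects these same numbers on a schedule and keeps the history, which is the part you cannot get from a one-off SSH session.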

Method 1: Prometheus and Grafana

This is the standard setup in the homelab world. Prometheus collects and stores metrics. Grafana turns them into dashboards. They run as Docker containers.

What You’re Setting Up

Three containers:

Prometheus - scrapes metrics from your server and stores them
Node Exporter - runs in a container with access to the host’s network and processes, and exposes the host’s system metrics (CPU, memory, disk, network)
Grafana - reads from Prometheus and shows dashboards

Steps

1. Create a directory

mkdir -p ~/monitoring && cd ~/monitoring

2. Create docker-compose.yml

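services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus
    # Lets this container reach services on the host (needs Docker 20.10+)
    extra_hosts:
      - "host.docker.internal:host-gateway"
    ports:
      - "9090:9090"
    restart: unless-stopped

  node-exporter:
    image: prom/node-exporter:latest
    # Report the host's filesystems instead of the container's
    command:
      - "--path.rootfs=/host"
    volumes:
      - /:/host:ro,rslave
    network_mode: host
    pid: host
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    volumes:
      - grafana-data:/var/lib/grafana
    ports:
      - "3000:3000"
    restart: unless-stopped

volumes:
  prometheus-data:
  grafana-data:

3. Create prometheus.yml

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "node"
    static_configs:
      # Node Exporter listens on the host, not inside the Prometheus container
      - targets: ["host.docker.internal:9100"]

Node Exporter exposes metrics on port 9100 of the host. Prometheus runs in its own container network, where localhost means the Prometheus container itself, so the target is host.docker.internal, which the extra_hosts entry in docker-compose.yml maps to the host. Prometheus scrapes it every 15 seconds. If a firewall is active on the host, it may need to allow the Docker bridge network to reach port 9100.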

4. Start everything

docker compose up -d
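
Before moving on to Grafana, it can help to confirm that all three pieces are answering. The sketch below is one way to do that from the server itself. It assumes the ports from docker-compose.yml above and that curl is installed. The function names are hypothetical.

```shell
# Hypothetical helper: basic liveness checks for the stack.
# Assumes it is run on the server, from the ~/monitoring directory.
check_stack() {
  docker compose ps                                              # all three services should be running
  curl -fsS http://localhost:9100/metrics | grep '^node_load1 '  # Node Exporter answers
  curl -fsS http://localhost:9090/-/healthy                      # Prometheus health endpoint
  curl -fsS http://localhost:3000/api/health                     # Grafana health endpoint
}

# Print the health of each scrape target ("up", "down" or "unknown")
# from the JSON that Prometheus's /api/v1/targets endpoint returns, read on stdin.
target_health() {
  grep -o '"health":"[a-z]*"' | cut -d '"' -f 4
}

# Usage on the server:
#   check_stack
#   curl -fsS http://localhost:9090/api/v1/targets | target_health
```

If target_health prints down, the Targets page at http://your-server-ip:9090/targets shows the scrape error.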

5. Open Grafana

Go to http://your-server-ip:3000. Log in with admin / admin. Grafana asks you to change the password on first login.

6. Add Prometheus as a data source

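In Grafana: Connections > Data sources > Add new data source (in older versions: Configuration > Data Sources > Add data source). Select Prometheus. Set the URL to http://prometheus:9090; Grafana reaches Prometheus by its Compose service name, not localhost. Click Save & test.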

7. Import a dashboard

Go to Dashboards > Import. Enter ID 1860 (that’s the standard Node Exporter dashboard by rfrail3). Click Load, select your Prometheus data source, and Import.

You now see CPU, memory, disk, and network graphs updating in real time.

What You Can See Now

CPU usage per core
Memory used, available, cached
Disk space and I/O
Network traffic in and out
System load and uptime

Keep this tab open. Check it when something feels slow. You’ll start noticing patterns.

Useful PromQL Queries

Prometheus has its own query language. A few to start with:

# CPU usage percentage
100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Memory usage percentage
(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100

# Disk space remaining on / (percent free)
node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} * 100

Paste these into Grafana’s Explore tab to try them.
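
The same queries can also be run from the command line through Prometheus’s HTTP API, which is handy for scripts. A minimal sketch, assuming Prometheus is reachable at http://localhost:9090 and curl is installed. promql and first_value are hypothetical helper names, PROM_URL is an assumed environment variable, and first_value only handles a result with a single series.

```shell
# Hypothetical helper: run an instant PromQL query against the HTTP API.
# PROM_URL is an assumed environment variable; it defaults to the local port.
promql() {
  curl -fsS -G "${PROM_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}

# Extract the sample value from a single-series instant-query response on stdin.
# The API returns each sample as "value":[<unix time>,"<value as a string>"].
first_value() {
  sed -n 's/.*"value":\[[0-9.]*,"\([^"]*\)"].*/\1/p'
}

# Usage on the server:
#   promql '(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100' | first_value
```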

Method 2: Netdata

Netdata is the opposite of Prometheus. One command, zero config, instant dashboard. Good for a single server. Great for getting started without learning PromQL.

Steps

bash <(curl -Ss https://my-netdata.io/kickstart.sh)

That’s it. Wait a minute, then open http://your-server-ip:19999.
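
To confirm the agent is up without opening a browser, you can ask its local API. A small sketch, assuming Netdata listens on the default port 19999 on the same machine and that the host uses systemd. netdata_version is a hypothetical helper name.

```shell
# Hypothetical helper: pull the "version" field out of the JSON
# that Netdata's /api/v1/info endpoint returns, read on stdin.
netdata_version() {
  grep -o '"version": *"[^"]*"' | head -n 1 | cut -d '"' -f 4
}

# Usage on the server:
#   systemctl status netdata --no-pager
#   curl -fsS http://localhost:19999/api/v1/info | netdata_version
```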

What You Get

Real-time graphs for everything: CPU, memory, disk, network, processes
Per-process monitoring (see which process is using the most CPU right now)
Alerts pre-configured (disk filling up, high CPU, OOM events)
A clean web UI that updates every second

Netdata is lightweight. It’s written in C and uses about 1% CPU and 100 MB of RAM on a typical server.

When to Use Each

Netdata is for quick checks and real-time debugging. Something feels slow? Open Netdata, look at the spike.

Prometheus + Grafana is for trend data and historical views. Want to know if your disk filled up last week? Prometheus has that answer. It keeps 15 days of data by default; the --storage.tsdb.retention.time flag changes that.
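
Historical questions like that map onto Prometheus’s range-query API. The sketch below asks for the free-space percentage on / over the last seven days at one-hour resolution. It assumes Prometheus is reachable at http://localhost:9090 and curl is installed. The function names and the PROM_URL variable are hypothetical.

```shell
# Print "start end" Unix timestamps covering the 7 days before the given time
# (defaults to now).
week_window() {
  end=${1:-$(date +%s)}
  echo "$((end - 7 * 86400)) $end"
}

# Hypothetical helper: free space on / as a percentage, hourly, for the last week.
disk_last_week() {
  set -- $(week_window)   # $1 = start, $2 = end
  curl -fsS -G "${PROM_URL:-http://localhost:9090}/api/v1/query_range" \
    --data-urlencode 'query=node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} * 100' \
    --data-urlencode "start=$1" \
    --data-urlencode "end=$2" \
    --data-urlencode "step=1h"
}
```

In day-to-day use a Grafana panel with the time range set to the last 7 days shows the same data. The API call is mainly useful for scripts.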

Run both if you want. They don’t conflict.

Your First Alert

Monitoring without alerts means you still have to check the dashboard. Set up one alert: disk space.

Grafana can send alerts to email, Discord, Slack, or any webhook. The simplest: add a Discord webhook and get notified when disk space drops below 20%.

In Grafana: Alerting > Contact points > Add contact point. Select Discord, paste your webhook URL. Then Alerting > Alert rules > Create alert rule. Use the disk space query from the PromQL section (percent free on /), set the condition to “when last() is below 20”, and route it to your Discord contact point.

Now you don’t check the dashboard. The dashboard checks you.
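
Before relying on the alert, it is worth checking that the webhook itself works. The sketch below posts a test message with curl. DISCORD_WEBHOOK_URL is an assumed environment variable holding your webhook URL, and both function names are hypothetical. Grafana’s contact point page also has its own test button.

```shell
# Build a minimal Discord webhook body: {"content":"<message>"}.
# Escapes only backslashes and double quotes, so keep messages to one line.
discord_payload() {
  msg=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"content":"%s"}' "$msg"
}

# Hypothetical helper: send a test message to the webhook.
discord_test() {
  curl -fsS -H 'Content-Type: application/json' \
    -d "$(discord_payload "${1:-Test message from the monitoring server}")" \
    "$DISCORD_WEBHOOK_URL"
}

# Usage on the server:
#   export DISCORD_WEBHOOK_URL='<your webhook URL>'
#   discord_test 'Disk alert test'
```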

Next Steps

You can see what your server is doing. Next: Backing Up Your Homelab so you don’t lose anything when something breaks.