PiHook: Hardening my Homelab Monitoring

Published on 2026.05.05

Introduction

Monitoring is the heartbeat of any homelab. After trying several heavy solutions, I decided to build something targeted: PiHook. It’s a Python-based service designed to monitor both my internal services (like Jellyfin and Gitea) and my external footprint, providing real-time alerts without the overhead of a full enterprise stack.

The Core Architecture

PiHook is built for stability and low resource usage, making it perfect for running on a Raspberry Pi or a background container in a Proxmox VM.

Key Features:

  • YAML-Based Configuration: I can add or remove services just by editing a simple services.yaml file.
  • Dual-Zone Monitoring: It distinguishes between local IP services and external DuckDNS/public URLs.
  • Discord Integration: Webhooks notify me as soon as a service is confirmed down or has recovered.
  • Persistent History: Every check is logged to a SQLite database, allowing for long-term health analysis.
  • System Awareness: It doesn't just watch services; it also monitors the host's CPU, RAM, and disk usage, and checks for pending apt updates.
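
To make the dual-zone idea concrete, here is a minimal sketch of one way the local/external split could be decided. It is not PiHook's actual code: the function name classify_zone, the example URLs, and the rule itself (private or loopback IP literal means local, everything else means external) are assumptions for illustration.

```python
import ipaddress
from urllib.parse import urlparse


def classify_zone(url: str) -> str:
    """Return "local" for private/loopback IP hosts, otherwise "external"."""
    host = urlparse(url).hostname or ""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (e.g. a DuckDNS name), so treat it as external.
        return "external"
    return "local" if (ip.is_private or ip.is_loopback) else "external"


# Hypothetical entries, shaped like what a services.yaml might load into.
services = [
    {"name": "Jellyfin", "url": "http://192.168.1.20:8096"},
    {"name": "Public site", "url": "https://example.duckdns.org"},
]
zones = {s["name"]: classify_zone(s["url"]) for s in services}
```

A hostname-based local service (say, a .lan name) would be classed as external by this rule, so a real config would probably want an explicit zone field as an override.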

Design Decisions

I wanted the system to be "set and forget." To achieve this, I implemented:

  • Maintenance Mode: A simple flag file (maintenance.flag) that silences all Discord alerts when I'm intentionally working on the rack.
  • Escalation Logic: It doesn't spam me on a single blip. It uses an escalation threshold (e.g., 3 consecutive failures) before firing a critical alert.
  • WSGI Dashboard: A lightweight Flask server provides a quick HTML status table at a glance.
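
The maintenance flag and the escalation threshold fit together roughly like the sketch below. This is an illustration under assumptions, not the real PiHook code: the AlertTracker class, its method names, and the "critical"/"recovered" labels are hypothetical; only the maintenance.flag file name and the threshold of 3 come from the description above.

```python
from pathlib import Path

MAINTENANCE_FLAG = Path("maintenance.flag")  # file name taken from the post
ESCALATION_THRESHOLD = 3  # consecutive failures before a critical alert


class AlertTracker:
    """Hypothetical helper that decides when an alert should be sent."""

    def __init__(self, threshold=ESCALATION_THRESHOLD, flag=MAINTENANCE_FLAG):
        self.threshold = threshold
        self.flag = flag
        self.failures = {}  # service name -> consecutive failure count

    def record(self, name, is_up):
        """Update the failure streak and return the alert to send, or None."""
        previous = self.failures.get(name, 0)
        if is_up:
            self.failures[name] = 0
            # Only announce a recovery if a critical alert was raised earlier.
            alert = "recovered" if previous >= self.threshold else None
        else:
            self.failures[name] = previous + 1
            # Fire exactly once, when the streak first reaches the threshold.
            alert = "critical" if self.failures[name] == self.threshold else None
        # Maintenance mode silences alerts but still tracks state.
        if self.flag.exists():
            return None
        return alert
```

With this shape, a single failed check followed by a success produces no alert at all, and creating the flag file (for example with touch maintenance.flag) mutes everything until it is deleted.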

Implementation Snippet

The heart of the service-checking logic handles retries and response time logging:

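import logging
import time

import requests

# url, name, retries, verify and the two timeouts come from the service's config entry.
# Attempt the request, retrying on failure (retries + 1 attempts in total).
curr = "DOWN"  # assume DOWN until one attempt succeeds
for attempt in range(retries + 1):
    try:
        start = time.monotonic()  # monotonic clock is safe against system time changes
        r = requests.get(url, timeout=(connect_timeout, read_timeout), verify=verify)
        resp_time = time.monotonic() - start
        if r.ok:
            curr = "UP"
            logging.info(f"✅ {name} is UP (resp_time={resp_time:.2f}s)")
            break
        # A non-2xx/3xx response counts as a failed attempt rather than being ignored.
        logging.warning(f"{name} returned HTTP {r.status_code} (attempt {attempt + 1})")
    except requests.RequestException as e:
        logging.warning(f"{name} request failed (attempt {attempt + 1}): {e}")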

Weekly Reporting

One of my favorite additions was the Weekly CSV Export. Every 7 days, the system prunes old logs and generates a CSV summary of uptime percentages and average response times. It keeps the database lean while giving me enough data to spot trends (like a server that's slowing down over time).
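
As a rough sketch of how such a weekly job could work with SQLite and the standard csv module: the table layout (a checks table with name, status, resp_time, checked_at), the function name weekly_report, and the choice to build the report before pruning are all assumptions for illustration, not PiHook's actual schema or code.

```python
import csv
import io
import sqlite3
import time

WEEK_SECONDS = 7 * 24 * 3600

# Assumed schema: one row per check, checked_at stored as a Unix timestamp.
SCHEMA = """
CREATE TABLE IF NOT EXISTS checks (
    name TEXT NOT NULL,
    status TEXT NOT NULL,      -- 'UP' or 'DOWN'
    resp_time REAL,            -- seconds, NULL when the request failed
    checked_at REAL NOT NULL
)
"""


def weekly_report(conn, now=None):
    """Return CSV text summarising the last 7 days, then prune older rows."""
    now = time.time() if now is None else now
    cutoff = now - WEEK_SECONDS
    rows = conn.execute(
        """
        SELECT name,
               100.0 * SUM(status = 'UP') / COUNT(*) AS uptime_pct,
               AVG(resp_time) AS avg_resp_time
        FROM checks
        WHERE checked_at >= ?
        GROUP BY name
        ORDER BY name
        """,
        (cutoff,),
    ).fetchall()

    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["name", "uptime_pct", "avg_resp_time_s"])
    for name, uptime, avg_resp in rows:
        avg_text = "" if avg_resp is None else f"{avg_resp:.3f}"
        writer.writerow([name, f"{uptime:.1f}", avg_text])

    # Drop rows older than the reporting window to keep the database lean.
    conn.execute("DELETE FROM checks WHERE checked_at < ?", (cutoff,))
    conn.commit()
    return out.getvalue()
```

Calling weekly_report(sqlite3.connect("pihook.db")) from a weekly timer or cron job and writing the returned text to a dated file would be one way to wire this up; the pihook.db file name is likewise just a placeholder.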

Conclusion

PiHook has become an essential part of my toolkit. It gives me peace of mind knowing that if a hard drive fails or my WAN IP changes, I'll know within minutes. Next on the roadmap: adding more granular Telegram support and integrating Grafana for visual dashboards.
