diff --git a/.env.example b/.env.example
new file mode 100644
index 0000000..29dd395
--- /dev/null
+++ b/.env.example
@@ -0,0 +1,42 @@
+# ─── uptop configuration ───────────────────────────────────
+# Copy to .env and edit. Only uncomment what you need.
+
+# ─── Core ──────────────────────────────────────────────────
+UPTOP_PORT=23234 # SSH server port
+UPTOP_HTTP_PORT=8080 # HTTP port (status page, push endpoints, metrics)
+UPTOP_DB_TYPE=sqlite # sqlite or postgres
+UPTOP_DB_DSN=/data/uptop.db # File path (SQLite) or connection string (Postgres)
+
+# ─── Security ─────────────────────────────────────────────
+# UPTOP_ADMIN_KEY=ssh-ed25519 AAAA... you@host # Seed first admin user on startup
+# UPTOP_KEYS=/data/authorized_keys # Path to authorized_keys file (one key per line)
+# UPTOP_SSH_HOST_KEY=/data/.ssh/id_ed25519 # SSH host key path (auto-generated if missing)
+# UPTOP_ENCRYPTION_KEY= # AES-256-GCM key for alert credentials (64 hex chars)
+# # Generate: openssl rand -hex 32
+# # Without this, alert passwords/tokens stored in plaintext
+# UPTOP_CLUSTER_SECRET= # Shared key for cluster API + import/export auth
+
+# ─── Status Page ──────────────────────────────────────────
+# UPTOP_STATUS_ENABLED=false # Enable public status page at /status
+# UPTOP_STATUS_TITLE=System Status # Status page heading
+
+# ─── TLS ──────────────────────────────────────────────────
+# UPTOP_TLS_CERT= # Path to TLS certificate (enables HTTPS)
+# UPTOP_TLS_KEY= # Path to TLS private key
+
+# ─── Clustering (leader/follower) ────────────────────────
+# See docs/clustering.md for setup guides.
+# UPTOP_CLUSTER_MODE=leader # leader, follower, or probe
+# UPTOP_PEER_URL= # Leader HTTP URL (required for follower and probe)
+
+# ─── Distributed Probing ─────────────────────────────────
+# UPTOP_NODE_ID= # Unique node identifier (required for probe mode)
+# UPTOP_NODE_NAME= # Human-readable node name
+# UPTOP_NODE_REGION= # Region tag (e.g. us-east, eu-west) for monitor routing
+# UPTOP_AGG_STRATEGY=any-down # How multi-probe results combine: any-down, majority-down, all-down
+
+# ─── Advanced ─────────────────────────────────────────────
+# UPTOP_INSECURE_SKIP_VERIFY=false # Skip TLS cert verification on monitored targets
+# UPTOP_ALLOW_PRIVATE_TARGETS=false # Allow monitoring RFC1918/loopback addresses
+# UPTOP_METRICS_PUBLIC=false # Expose /metrics without auth
+# UPTOP_CORS_ORIGIN= # Access-Control-Allow-Origin for /status/json
diff --git a/.gitignore b/.gitignore
index a3bc8dd..9037dde 100644
--- a/.gitignore
+++ b/.gitignore
@@ -25,3 +25,4 @@ authorized_keys
 tmp
 *.local.json
 *.local.md
+.env
diff --git a/README.md b/README.md
index 4099255..bd3938d 100644
--- a/README.md
+++ b/README.md
@@ -101,6 +101,8 @@
 
     go install gitea.lerkolabs.com/lerkolabs/uptop/cmd/uptop@latest
 
+**Upgrading:** Pull the new image (or binary) and restart. Database migrations run automatically on startup.
+
 ## Config as code
 
 Export your current monitors:
@@ -129,12 +131,30 @@ Full reference in [docs/config-as-code.md](docs/config-as-code.md).
 | `UPTOP_DB_DSN` | `uptop.db` | Database path or connection string |
 | `UPTOP_STATUS_ENABLED` | `false` | Enable public status page |
 | `UPTOP_STATUS_TITLE` | `System Status` | Status page title |
-| `UPTOP_CLUSTER_MODE` | `leader` | `leader` or `follower` |
+| `UPTOP_ENCRYPTION_KEY` | | AES-256-GCM key for alert credentials ([details](#encryption)) |
+| `UPTOP_CLUSTER_MODE` | `leader` | `leader`, `follower`, or `probe` |
 | `UPTOP_PEER_URL` | | Leader URL for follower nodes |
 | `UPTOP_CLUSTER_SECRET` | | Shared key for cluster + API auth |
 | `UPTOP_INSECURE_SKIP_VERIFY` | `false` | Skip TLS verification for checks |
+| `UPTOP_ALLOW_PRIVATE_TARGETS` | `false` | Allow monitoring RFC1918/loopback addresses |
 | `UPTOP_ADMIN_KEY` | | SSH public key seeded as first admin on startup |
 
+See [`.env.example`](.env.example) for all options including TLS, probes, and advanced settings.
+
+### Encryption
+
+Set `UPTOP_ENCRYPTION_KEY` to encrypt alert credentials (SMTP passwords, webhook URLs, API tokens) at rest with AES-256-GCM. Generate a key:
+
+    openssl rand -hex 32
+
+Without this, credentials are stored as plaintext in the database. uptop warns on startup if unset. To encrypt credentials on an existing install, run `uptop migrate-secrets` with the key set.
+
+## Clustering
+
+uptop supports three modes: **leader** (default single node), **follower** (HA failover — takes over if the leader goes down), and **probe** (stateless distributed checks from multiple regions).
+
+See [docs/clustering.md](docs/clustering.md) for setup guides, or the working examples in [`deploy/`](deploy/).
+
 ## Migrating from Uptime Kuma
 
 Export your Kuma backup JSON, then:
diff --git a/docs/clustering.md b/docs/clustering.md
new file mode 100644
index 0000000..7af982a
--- /dev/null
+++ b/docs/clustering.md
@@ -0,0 +1,80 @@
+# Clustering
+
+uptop supports three deployment modes for different reliability and coverage needs.
+
+## Single node (default)
+
+Out of the box, uptop runs as a standalone leader. One process, one database, runs all checks. No clustering config needed.
+
+## Leader + follower (HA failover)
+
+A follower is a standby replica that takes over if the leader goes down.
+
+**How it works:**
+- The follower polls the leader's `/api/health` endpoint every 5 seconds
+- After 3 consecutive failures (15 seconds), the follower promotes itself and starts running checks
+- When the leader recovers, the follower detects it and goes back to standby
+- Both nodes have their own database — they do not share state
+
+**Required env vars:**
+
+| Node | Variable | Value |
+|------|----------|-------|
+| Both | `UPTOP_CLUSTER_SECRET` | Same shared secret |
+| Follower | `UPTOP_CLUSTER_MODE` | `follower` |
+| Follower | `UPTOP_PEER_URL` | Leader's HTTP URL (e.g. `http://leader:8080`) |
+
+See [`deploy/docker-compose.cluster.yml`](../deploy/docker-compose.cluster.yml) for a working example.
+
+## Leader + probes (distributed monitoring)
+
+Probes are lightweight, stateless nodes that run checks from different locations and report results back to the leader.
+
+**How it works:**
+- A probe registers with the leader on startup
+- Every 30 seconds, it fetches check assignments filtered by its region
+- It runs the assigned checks (up to 10 concurrent) and posts results back
+- The leader aggregates results from all probes and triggers alerts based on the aggregation strategy
+- Probes have no database, no UI, and no configuration of their own
+
+**Required env vars:**
+
+| Node | Variable | Value |
+|------|----------|-------|
+| Both | `UPTOP_CLUSTER_SECRET` | Same shared secret |
+| Leader | `UPTOP_AGG_STRATEGY` | `any-down`, `majority-down`, or `all-down` |
+| Probe | `UPTOP_CLUSTER_MODE` | `probe` |
+| Probe | `UPTOP_PEER_URL` | Leader's HTTP URL |
+| Probe | `UPTOP_NODE_ID` | Unique identifier (e.g. `probe-us-east`) |
+| Probe | `UPTOP_NODE_REGION` | Region tag matching monitor assignments |
+
+Optional: `UPTOP_NODE_NAME` for a human-readable label in the TUI.
+
+See [`deploy/docker-compose.probe.yml`](../deploy/docker-compose.probe.yml) for a multi-region example.
+
+## Aggregation strategies
+
+When multiple probes check the same monitor, the leader combines their results:
+
+| Strategy | Behavior |
+|----------|----------|
+| `any-down` (default) | DOWN if **any** probe reports down |
+| `majority-down` | DOWN if **most** probes report down |
+| `all-down` | DOWN only if **all** probes report down |
+
+Set via `UPTOP_AGG_STRATEGY` on the leader.
+
+## Follower vs probe
+
+| | Follower | Probe |
+|---|---|---|
+| **Purpose** | Failover / redundancy | Distributed checks from multiple regions |
+| **Database** | Own database (independent) | None (stateless) |
+| **Runs checks** | Only when leader is down | Always, on assigned monitors |
+| **Scales to** | 1 follower per leader | Many probes per leader |
+
+## Security
+
+- Set `UPTOP_CLUSTER_SECRET` on all nodes. Without it, cluster API endpoints are unauthenticated.
+- Secrets are sent in HTTP headers (`X-Upkeep-Secret`). Use TLS or a reverse proxy for production.
+- uptop warns on startup if the cluster secret is missing or if cluster mode is active without TLS.
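
Note on the aggregation table in `docs/clustering.md`: the three `UPTOP_AGG_STRATEGY` values can be read as simple predicates over the per-probe results. The Go sketch below illustrates that reading only; it is not taken from uptop's source. The `Strategy` type, its constants, and `aggregateDown` are hypothetical names, and three behaviors are assumptions the documentation does not confirm: "most" is treated as a strict majority (more than half), an empty result set is never DOWN, and an unrecognized strategy falls back to `any-down`.

```go
package main

import "fmt"

// Strategy mirrors the documented UPTOP_AGG_STRATEGY values.
// The type and constant names are hypothetical, not uptop's own.
type Strategy string

const (
	AnyDown      Strategy = "any-down"
	MajorityDown Strategy = "majority-down"
	AllDown      Strategy = "all-down"
)

// aggregateDown reports whether a monitor counts as DOWN, given one
// entry per probe (true means that probe reported down).
// Assumptions: "majority" is strictly more than half, an empty result
// set is never DOWN, and unknown strategies behave like any-down.
func aggregateDown(s Strategy, probeDown []bool) bool {
	total := len(probeDown)
	if total == 0 {
		return false
	}
	down := 0
	for _, d := range probeDown {
		if d {
			down++
		}
	}
	switch s {
	case AllDown:
		return down == total
	case MajorityDown:
		return down*2 > total
	default: // AnyDown and unrecognized values
		return down > 0
	}
}

func main() {
	// One of three probes reports the target as down.
	results := []bool{true, false, false}
	for _, s := range []Strategy{AnyDown, MajorityDown, AllDown} {
		fmt.Printf("%s: down=%v\n", s, aggregateDown(s, results))
	}
}
```

With one of three probes failing, only `any-down` marks the monitor DOWN under these assumptions. How ties are handled (for example, one of two probes down under `majority-down`) is worth stating explicitly in the docs, since "most" leaves it open.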