uptop

Author	SHA1	Message	Date
lerko	0b64d13bb6	fix(security): serve /status/json through a public DTO The handler serialized raw models.Site — LastError internals, Hostname, Port, DNSServer, AlertID, intervals all public, and every future Site field public the day it's added. statusSite now exposes exactly what the status page renders: Name, Type, URL, Status, Paused, LastCheck, Latency. Replaces the vacuous TestStatusJSON_TokensStripped, which injected via UpdateSiteConfig (a no-op for unknown IDs) and asserted over zero sites. The new test seeds the store, starts the engine, waits for live state, and asserts internal fields are absent from the raw JSON.	2026-06-11 12:26:40 -04:00
lerko	809620340e	fix(security): close XFF bypass and three secret-leak paths Four fixes hardening the secrets and rate-limit posture a prior audit left or that regressed: X-Forwarded-For rate-limit bypass + memory DoS (ratelimit.go): clientIP returned the raw XFF header, so an attacker rotating it minted unlimited distinct limiter keys — never tripping the limit and growing the visitors map without bound. XFF is now honored only when the immediate peer is a configured trusted proxy (UPTOP_TRUSTED_PROXIES, CIDRs or bare IPs), using the right-most non-trusted hop; otherwise the key is the real RemoteAddr. The visitors map is bounded with LRU eviction as defense in depth. Export redaction denylist -> per-provider allowlist (server.go): the old six-key denylist missed the actual credentials — the webhook URL for discord/slack/webhook/ntfy/gotify and api_key for opsgenie — exporting them in the clear. redactByProvider keeps only known-safe keys per provider type and redacts everything else, so unknown/new keys fail safe. ImportData plaintext secrets (sqlstore.go): import inserted raw json.Marshal(settings), bypassing the encryption AddAlert/UpdateAlert use. It now routes through marshalSettings, so a restore with UPTOP_ENCRYPTION_KEY set stores enc:-prefixed ciphertext, not plaintext. Alert error credential leak (alert.go): provider Send returned the raw *url.Error, whose URL carries the secret (Telegram bot token in the path, webhook secrets in the URL); it was persisted to AlertHealth.LastError and shown in the TUI. sanitizeError strips the URL, keeping the operation and underlying cause. Tests cover trusted/untrusted XFF + spoofed-bypass + map bound, the allowlist per provider, encrypted-on-import round-trip, and URL-stripped errors. README documents UPTOP_TRUSTED_PROXIES. Full suite green under -race; golangci-lint clean.	2026-06-10 18:50:19 -04:00
lerko	8b39d4c1a1	fix(monitor): serialize DB writes through a single drained writer Every check spawned `go e.db.Save*(...)` with the error discarded: a fire-and-forget goroutine per log line, check, state change, and alert health update. SaveLog ran a full-table prune DELETE on every insert and SaveCheck a COUNT + conditional prune on every check, so the hot path amplified each write into several statements. Nothing tracked these goroutines, so at shutdown they raced the store's Close() — writes to a closing DB, silently swallowed. Introduce a single writer goroutine that drains a buffered channel of typed dbWrite values (log/check/state-change/alert-health). Writes are enqueued non-blocking; a saturated queue drops and notes it in the in-memory log rather than blocking the check loop. Write errors are now logged instead of discarded. Retention moves off the hot path: SaveLog and SaveCheck become plain INSERTs, and PruneLogs/PruneCheckHistory/PruneStateChanges run on a 10-minute timer inside the writer (single keep-newest-N-per-site pass via a window function). state_changes was previously never pruned — now bounded. Add Engine.Stop(): cancels the engine's context, then waits for the writer to drain every buffered write before returning. main wires it in before the deferred store Close() so no write races a closed DB. SQLite gains busy_timeout=5000 and synchronous=NORMAL, applied via the DSN so every pooled connection inherits them (a post-open PRAGMA only touches one connection); WAL moves to the DSN too. :memory: test DBs are left as-is. Tests: writer drains on Stop, Stop is idempotent, and the prune queries keep newest-N per site / N logs on real SQLite. Full suite green under -race.	2026-06-10 18:14:28 -04:00
lerko	21a1563e53	feat(monitor): auto-prune expired maintenance windows Background goroutine runs every 15 minutes, deletes maintenance windows that expired beyond the retention period (default 7 days). Configurable via UPTOP_MAINT_RETENTION env var (Go duration format). Closes #72	2026-06-05 18:27:42 -04:00
lerko	60592ef810	feat(tui): add SLA reporting view Full-screen SLA report accessible via [s] from detail panel. Computes uptime%, downtime, outage count, longest outage, MTTR, and MTBF from state_changes table. Includes daily breakdown with bar chart, switchable time periods (24h/7d/30d/90d), and scrollable viewport. LATE/STALE treated as UP for SLA purposes.	2026-06-04 14:24:39 -04:00
lerko	8f17deba67	chore: migrate module path to lerkolabs org Move Go module from gitea.lerkolabs.com/lerko/uptop to gitea.lerkolabs.com/lerkolabs/uptop. Updates all imports, go.mod, goreleaser owner, and README links.	2026-05-29 14:22:49 -04:00
lerko	026e969b74	chore: TUI screenshots, README polish, changelog rewrite (#32) - Add 6 TUI screenshots to assets/ (monitors, alerts, logs, nodes, detail, theme) - Rewrite README with hero image, badges, collapsible install sections - Rewrite changelog to match actual CalVer tag history - VHS tooling extracted to lerko/uptop-vhs Reviewed-on: lerko/uptop#32	2026-05-29 17:45:31 +00:00
lerko	bc3a44beac	feat: show error reason when monitors go DOWN Propagate check failure reasons through the entire stack: - Checker captures specific errors (DNS, timeout, HTTP status, SSL, etc.) - Engine tracks LastError, StatusChangedAt, LastSuccessAt per monitor - State transitions persisted to new state_changes table - Detail panel shows error reason, HTTP code, state duration, last success time, and last 5 state change events - Monitor table shows inline error preview for DOWN services - Alert messages include error reason - Probe nodes forward error reasons to leader 15 files changed across models, checker, engine, store, TUI, and probes.	2026-05-27 19:32:30 -04:00
lerko	9d12e3ecf1	chore: complete rename from go-upkeep to uptop - Module path: gitea.lerkolabs.com/lerko/uptop - Binary: cmd/uptop/ - All imports updated to full module path - Env vars: UPKEEP_* → UPTOP_* - Prometheus metrics: upkeep_* → uptop_* - Default DB: uptop.db - Docker image: lerko/uptop - All docs, compose files, CI updated Only remaining "go-upkeep" reference is the fork attribution in README.	2026-05-24 20:20:35 -04:00
lerko	c6d120d7a4	test(server): add HTTP handler tests for all API endpoints 24 tests covering push heartbeat, health check, backup export/import, probe registration/assignments/results, and status page endpoints. Tests verify auth enforcement (constant-time secret), method validation, input validation, token stripping on status JSON, and maintenance window overrides.	2026-05-23 21:10:32 -04:00
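
Note on 809620340e: the trusted-proxy rule in that message (honor X-Forwarded-For only when the immediate peer is a configured trusted proxy, then take the right-most non-trusted hop) can be sketched roughly as below. This is an illustration under assumptions, not the code in ratelimit.go: parseTrusted, isTrusted, and clientKey are hypothetical names, and the comma-separated list format for UPTOP_TRUSTED_PROXIES is assumed from the "CIDRs or bare IPs" wording.

```go
package main

import (
	"fmt"
	"net"
	"net/netip"
	"strings"
)

// parseTrusted turns a comma-separated list of CIDRs or bare IPs into
// prefixes. The list format is an assumption made for this sketch.
func parseTrusted(list string) ([]netip.Prefix, error) {
	var out []netip.Prefix
	for _, item := range strings.Split(list, ",") {
		item = strings.TrimSpace(item)
		if item == "" {
			continue
		}
		if p, err := netip.ParsePrefix(item); err == nil {
			out = append(out, p.Masked())
			continue
		}
		a, err := netip.ParseAddr(item)
		if err != nil {
			return nil, fmt.Errorf("bad trusted proxy %q: %w", item, err)
		}
		// A bare IP becomes a single-address prefix (/32 or /128).
		out = append(out, netip.PrefixFrom(a, a.BitLen()))
	}
	return out, nil
}

// isTrusted reports whether a falls inside any trusted prefix.
func isTrusted(a netip.Addr, trusted []netip.Prefix) bool {
	a = a.Unmap() // treat IPv4-mapped IPv6 addresses as IPv4
	for _, p := range trusted {
		if p.Contains(a) {
			return true
		}
	}
	return false
}

// clientKey picks the rate-limit key. The header is ignored unless the
// immediate peer is trusted; otherwise a client could mint new keys at will.
func clientKey(remoteAddr, xff string, trusted []netip.Prefix) string {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		host = remoteAddr
	}
	peer, err := netip.ParseAddr(host)
	if err != nil || !isTrusted(peer, trusted) {
		return host
	}
	// Walk right to left: hops appended by trusted proxies are skipped, and
	// the first non-trusted hop is the closest address we can vouch for.
	hops := strings.Split(xff, ",")
	for i := len(hops) - 1; i >= 0; i-- {
		a, err := netip.ParseAddr(strings.TrimSpace(hops[i]))
		if err != nil {
			break // malformed or empty hop: fall back to the peer
		}
		if !isTrusted(a, trusted) {
			return a.Unmap().String()
		}
	}
	return host
}

func main() {
	trusted, err := parseTrusted("10.0.0.0/8, 192.0.2.1")
	if err != nil {
		panic(err)
	}
	fmt.Println(clientKey("203.0.113.7:5555", "1.2.3.4", trusted))
	fmt.Println(clientKey("10.0.0.5:443", "1.2.3.4, 198.51.100.9, 10.0.0.9", trusted))
}
```

Taking the right-most non-trusted hop matters because everything to the left of it is client-supplied; the left-most entry is the one an attacker controls most directly.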

10 Commits
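
Note on 8b39d4c1a1: the single-writer pattern in that message (non-blocking enqueue onto a buffered channel, one goroutine applying writes, Stop draining the buffer before returning) might look roughly like the sketch below. All names here (write, writer, newWriter, enqueue) are hypothetical and do not come from the repository; the timer-driven pruning is omitted, and the sketch assumes producers have stopped enqueueing before Stop is called.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// write is a stand-in for a typed DB write (log, check, state change, ...).
type write struct {
	kind    string
	payload string
}

type writer struct {
	queue    chan write
	cancel   context.CancelFunc
	done     chan struct{}
	stopOnce sync.Once
	save     func(write) error // the actual store call, injected for the sketch
}

func newWriter(buf int, save func(write) error) *writer {
	ctx, cancel := context.WithCancel(context.Background())
	w := &writer{
		queue:  make(chan write, buf),
		cancel: cancel,
		done:   make(chan struct{}),
		save:   save,
	}
	go w.run(ctx)
	return w
}

// run is the only goroutine that touches the store.
func (w *writer) run(ctx context.Context) {
	defer close(w.done)
	for {
		select {
		case item := <-w.queue:
			w.apply(item)
		case <-ctx.Done():
			// Drain whatever is still buffered, then exit.
			for {
				select {
				case item := <-w.queue:
					w.apply(item)
				default:
					return
				}
			}
		}
	}
}

// apply logs write errors instead of discarding them.
func (w *writer) apply(item write) {
	if err := w.save(item); err != nil {
		fmt.Println("db write failed:", item.kind, err)
	}
}

// enqueue never blocks the caller; it reports false when the queue is full
// so the caller can record the drop.
func (w *writer) enqueue(item write) bool {
	select {
	case w.queue <- item:
		return true
	default:
		return false
	}
}

// Stop cancels the writer and waits for the drain. Safe to call repeatedly.
func (w *writer) Stop() {
	w.stopOnce.Do(w.cancel)
	<-w.done
}

func main() {
	var saved []write
	w := newWriter(16, func(item write) error {
		saved = append(saved, item) // only the writer goroutine appends
		return nil
	})
	for i := 0; i < 5; i++ {
		w.enqueue(write{kind: "check", payload: fmt.Sprint(i)})
	}
	w.Stop() // after this returns, reading saved is safe
	fmt.Println(len(saved), "writes applied")
}
```

Calling Stop before the store's Close gives the ordering the message describes: every buffered write is applied while the database is still open.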