Commit Graph

9 Commits

Author SHA1 Message Date
lerko 023234f4c3 fix(alert): email send respects context deadline
smtp.SendMail ignores context entirely — a blackholed SMTP server
hangs the alert goroutine for the OS TCP timeout (minutes), while the
30s context from the engine does nothing.

Replace with sendMailContext: dials with ctx deadline, sets connection
deadlines, handles STARTTLS and AUTH when advertised. Behavioral
parity with smtp.SendMail but cancellation works throughout.
2026-06-12 12:46:45 -04:00
lerko 809620340e fix(security): close XFF bypass and three secret-leak paths
CI / test (pull_request) Successful in 2m36s
CI / lint (pull_request) Successful in 56s
CI / vulncheck (pull_request) Successful in 46s
Four fixes hardening the secrets and rate-limit posture a prior audit
left or that regressed:

X-Forwarded-For rate-limit bypass + memory DoS (ratelimit.go): clientIP
returned the raw XFF header, so an attacker rotating it minted unlimited
distinct limiter keys — never tripping the limit and growing the visitors
map without bound. XFF is now honored only when the immediate peer is a
configured trusted proxy (UPTOP_TRUSTED_PROXIES, CIDRs or bare IPs), using
the right-most non-trusted hop; otherwise the key is the real RemoteAddr.
The visitors map is bounded with LRU eviction as defense in depth.

Export redaction denylist -> per-provider allowlist (server.go): the old
six-key denylist missed the actual credentials — the webhook URL for
discord/slack/webhook/ntfy/gotify and api_key for opsgenie — exporting
them in the clear. redactByProvider keeps only known-safe keys per
provider type and redacts everything else, so unknown/new keys fail safe.

ImportData plaintext secrets (sqlstore.go): import inserted raw
json.Marshal(settings), bypassing the encryption AddAlert/UpdateAlert
use. It now routes through marshalSettings, so a restore with
UPTOP_ENCRYPTION_KEY set stores enc:-prefixed ciphertext, not plaintext.

Alert error credential leak (alert.go): provider Send returned the raw
*url.Error, whose URL carries the secret (Telegram bot token in the path,
webhook secrets in the URL); it was persisted to AlertHealth.LastError
and shown in the TUI. sanitizeError strips the URL, keeping the operation
and underlying cause.

Tests cover trusted/untrusted XFF + spoofed-bypass + map bound, the
allowlist per provider, encrypted-on-import round-trip, and URL-stripped
errors. README documents UPTOP_TRUSTED_PROXIES. Full suite green under
-race; golangci-lint clean.
2026-06-10 18:50:19 -04:00
lerko f23014ab12 feat(alert): add Opsgenie provider
CI / test (pull_request) Successful in 2m37s
CI / lint (pull_request) Successful in 57s
CI / vulncheck (pull_request) Successful in 51s
Support Opsgenie Alert API v2 with US/EU endpoint selection,
configurable priority (P1-P5), and GenieKey auth. TUI form
includes API key, priority picker, and EU instance toggle.
2026-06-04 13:32:14 -04:00
lerko 8f17deba67 chore: migrate module path to lerkolabs org
CI / test (pull_request) Successful in 2m39s
CI / lint (pull_request) Successful in 1m6s
CI / vulncheck (pull_request) Successful in 46s
Move Go module from gitea.lerkolabs.com/lerko/uptop to
gitea.lerkolabs.com/lerkolabs/uptop. Updates all imports,
go.mod, goreleaser owner, and README links.
2026-05-29 14:22:49 -04:00
lerko 60b30935b3 fix(security): phase 1 critical fixes for public release
CI / test (pull_request) Successful in 4m40s
CI / lint (pull_request) Successful in 1m2s
- Redact PostgreSQL DSN password from stdout/logs
- Harden .dockerignore to exclude .ssh/, .claude/, *.db, *.local files
- SSRF protection: block private/loopback/link-local IPs by default
  (UPTOP_ALLOW_PRIVATE_TARGETS=true to override for homelab use)
- Fix email header injection via CRLF in monitor names
- AES-256-GCM encryption for alert credentials at rest
  (UPTOP_ENCRYPTION_KEY env var, migrate-secrets subcommand)
- TLS support for HTTP server (UPTOP_TLS_CERT/UPTOP_TLS_KEY)
  with HSTS header when TLS enabled
2026-05-25 11:26:47 -04:00
lerko 9d12e3ecf1 chore: complete rename from go-upkeep to uptop
CI / test (pull_request) Successful in 4m26s
CI / lint (pull_request) Successful in 1m11s
- Module path: gitea.lerkolabs.com/lerko/uptop
- Binary: cmd/uptop/
- All imports updated to full module path
- Env vars: UPKEEP_* → UPTOP_*
- Prometheus metrics: upkeep_* → uptop_*
- Default DB: uptop.db
- Docker image: lerko/uptop
- All docs, compose files, CI updated

Only remaining "go-upkeep" reference is the fork attribution in README.
2026-05-24 20:20:35 -04:00
lerko 8e6d97710b fix(alert): add context to Provider.Send, log alert failures
Provider.Send now accepts context.Context for timeout/cancellation.
HTTPProvider and NtfyProvider use NewRequestWithContext so HTTP alerts
respect the 30s deadline. triggerAlert logs send failures and config
load errors instead of silently swallowing them.
2026-05-23 13:18:04 -04:00
lerko 52a54f9c5c feat(alert): add Telegram, PagerDuty, Pushover, Gotify providers
Expand alert provider count from 5 to 9. All new providers use
the shared HTTPProvider with closure-based payload functions.
Includes TUI form support and tests for each provider.
2026-05-15 10:53:38 -04:00
lerko f023e38fdc refactor(monitor): encapsulate engine state, add graceful shutdown and tests
Replace all monitor package-level mutable state with Engine struct.
All state (liveState, logStore, histories, tokenIndex, HTTP clients)
is now encapsulated in Engine, created via NewEngine(store).

Key changes:
- Engine struct holds all monitor state with proper mutex protection
- Engine.Start(ctx) and monitorRoutine respect context cancellation
  for graceful shutdown — no more leaked goroutines
- cluster.runFollowerLoop also respects context for clean exit
- Token index (map[string]int) for O(1) push heartbeat lookup,
  replacing O(n) linear scan through LiveState
- UpdateSiteConfig preserves 8 runtime fields instead of copying
  17 config fields individually
- triggerAlert goroutines get 30s timeout context
- All consumers (TUI, server, cluster, main) receive *Engine via
  constructor/parameter — no package-level state access
- main.go creates context.WithCancel, passes to engine and cluster

First test suite: 12 tests across store and alert packages
- Store: CRUD for sites/alerts/users, push token generation,
  import/export round-trip, check history persistence
- Alert: Discord/Slack/Webhook payload format, HTTP 4xx error
  propagation, Ntfy headers, unknown provider returns nil
2026-05-15 08:21:17 -04:00