16 Commits

Author SHA1 Message Date
lerko 8f9210b451 feat: add --version flag with build metadata injection
Supports `goupkeep version`, `--version`, and `-v`. Prints version,
commit hash, and build date when injected via ldflags. Shows "dev"
for local builds. Dockerfile updated with ARGs for version injection.
2026-05-24 14:14:13 -04:00
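A minimal sketch of the build metadata injection described in 8f9210b451, assuming package-level string variables overridden with the linker's `-X` flag. The variable names (`version`, `commit`, `date`), the helper names, and the output format are illustrative assumptions, not taken from the repository.

```go
package main

import (
	"fmt"
	"os"
)

// Hypothetical variable names. They can be overridden at build time, e.g.:
//
//	go build -ldflags "-X main.version=1.2.3 -X main.commit=abc1234 -X main.date=2026-05-24"
var (
	version = "dev" // default for local builds
	commit  = ""
	date    = ""
)

// versionString formats the build metadata; fields that were not injected are omitted.
func versionString() string {
	s := "goupkeep " + version
	if commit != "" {
		s += " (" + commit + ")"
	}
	if date != "" {
		s += " built " + date
	}
	return s
}

// wantsVersion reports whether the first argument asks for version output.
func wantsVersion(args []string) bool {
	if len(args) == 0 {
		return false
	}
	switch args[0] {
	case "version", "--version", "-v":
		return true
	}
	return false
}

func main() {
	if wantsVersion(os.Args[1:]) {
		fmt.Println(versionString())
	}
}
```

In a Dockerfile, the same values would typically arrive as `ARG`s and be forwarded into the `-ldflags` string.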
lerko 359cff7292 chore: add golangci-lint config and fix all lint issues
Add .golangci.yml enabling errcheck, staticcheck, govet, gosec,
ineffassign, and unused linters. Fix 66 issues across 16 files:
- Check all unchecked errors (errcheck)
- Use HTTP status constants instead of numeric literals (staticcheck)
- Replace deprecated LineUp/LineDown with ScrollUp/ScrollDown (staticcheck)
- Convert sprintf+write patterns to fmt.Fprintf (staticcheck)
- Add ReadHeaderTimeout to http.Server (gosec)
- Remove unused types and functions (unused)
- Add nolint comments for intentional patterns (InsecureSkipVerify,
  math/rand for jitter, dialect-only SQL formatting)
2026-05-23 22:02:06 -04:00
lerko 4891843c94 fix: graceful shutdown for HTTP, SSH servers and database
HTTP and SSH servers now shut down cleanly on SIGINT/SIGTERM with a
30s timeout. Database connection closed via defer. Replaced log.Fatalf
in SSH goroutine with log.Printf + ErrServerClosed check to prevent
unclean process exits.
2026-05-23 13:23:27 -04:00
lerko ed082e4080 feat: persist logs to DB, load on startup
2026-05-16 15:25:08 -04:00
lerko ca5a42314f feat(cluster): add probe execution mode, check extraction, and result aggregation
Phase 2 of distributed probing:
- Extract check logic into standalone RunCheck() for use by probes
- Add probe cluster mode: stateless nodes that fetch assignments, execute
  checks, and report results to the leader
- Add multi-node result aggregation with configurable strategy
  (any-down, majority-down, all-down)
- Leader ingests probe results into engine live state and triggers alerts
- New env vars: UPKEEP_NODE_ID, UPKEEP_NODE_NAME, UPKEEP_NODE_REGION,
  UPKEEP_AGG_STRATEGY
- Example docker-compose.probe.yml with leader + 2 regional probes
2026-05-16 11:19:57 -04:00
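To illustrate the three aggregation strategies named in ca5a42314f, here is a small sketch. The function name `aggregateDown`, the boolean-slice input, the strict-majority rule for ties, and the handling of an empty report set are all assumptions; only the strategy names come from the commit message.

```go
package main

import "fmt"

// aggregateDown decides whether a monitor counts as down, given one result per
// probe node (true means that node observed the target as down).
func aggregateDown(strategy string, nodeDown []bool) (bool, error) {
	if len(nodeDown) == 0 {
		return false, nil // assumption: no reports means "not down"
	}
	down := 0
	for _, d := range nodeDown {
		if d {
			down++
		}
	}
	switch strategy {
	case "any-down":
		return down > 0, nil
	case "majority-down":
		return down*2 > len(nodeDown), nil // assumption: a tie is not a majority
	case "all-down":
		return down == len(nodeDown), nil
	default:
		return false, fmt.Errorf("unknown aggregation strategy %q", strategy)
	}
}

func main() {
	reports := []bool{true, false, false} // one of three probes reports down
	for _, s := range []string{"any-down", "majority-down", "all-down"} {
		down, err := aggregateDown(s, reports)
		fmt.Printf("%s: down=%v err=%v\n", s, down, err)
	}
}
```

With one of three probes reporting a failure, only `any-down` would trigger an alert, which is why the stricter strategies are useful for filtering out regional network noise.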
lerko 5b01b9ee30 feat(config): add config-as-code YAML import/export
Add declarative config-as-code support via YAML files. Monitors and
alerts can be exported, version controlled, and applied across instances.

- goupkeep export [-o file.yaml] dumps current state
- goupkeep apply -f file.yaml creates/updates to match desired state
- --dry-run shows planned changes without applying
- --prune deletes monitors/alerts not in the YAML
- Monitors and alerts matched by name; alert references resolved by name; nested group children supported
- CLI refactored to subcommands (apply, export, serve) with backward compat
- 24 tests covering apply, export, validation, round-trip idempotency
2026-05-15 20:40:49 -04:00
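The `apply` semantics in 5b01b9ee30 (match by name, create or update, delete only with `--prune`) can be sketched as a plan computation. Everything here is hypothetical: the `plan` type, `planApply`, and the representation of each monitor as a name mapped to a serialized config string are stand-ins for whatever the real implementation uses.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// plan lists, by name, what an apply run would change.
type plan struct {
	Create, Update, Delete []string
}

func (p plan) String() string {
	return fmt.Sprintf("create=[%s] update=[%s] delete=[%s]",
		strings.Join(p.Create, ","), strings.Join(p.Update, ","), strings.Join(p.Delete, ","))
}

// planApply compares current and desired state, both keyed by name.
// Entries missing from desired are deleted only when prune is true.
func planApply(current, desired map[string]string, prune bool) plan {
	var p plan
	for name, want := range desired {
		have, ok := current[name]
		switch {
		case !ok:
			p.Create = append(p.Create, name)
		case have != want:
			p.Update = append(p.Update, name)
		}
	}
	if prune {
		for name := range current {
			if _, ok := desired[name]; !ok {
				p.Delete = append(p.Delete, name)
			}
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(p.Create)
	sort.Strings(p.Update)
	sort.Strings(p.Delete)
	return p
}

func main() {
	current := map[string]string{"api": "interval=60", "blog": "interval=300", "old": "interval=60"}
	desired := map[string]string{"api": "interval=30", "blog": "interval=300", "docs": "interval=120"}
	fmt.Println(planApply(current, desired, false)) // like a dry run without --prune
	fmt.Println(planApply(current, desired, true))  // like a dry run with --prune
}
```

Applying the same desired state twice yields an empty plan the second time, which is the idempotency property the round-trip tests would check.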
lerko f023e38fdc refactor(monitor): encapsulate engine state, add graceful shutdown and tests
Replace all monitor package-level mutable state with Engine struct.
All state (liveState, logStore, histories, tokenIndex, HTTP clients)
is now encapsulated in Engine, created via NewEngine(store).

Key changes:
- Engine struct holds all monitor state with proper mutex protection
- Engine.Start(ctx) and monitorRoutine respect context cancellation
  for graceful shutdown — no more leaked goroutines
- cluster.runFollowerLoop also respects context for clean exit
- Token index (map[string]int) for O(1) push heartbeat lookup,
  replacing O(n) linear scan through LiveState
- UpdateSiteConfig preserves 8 runtime fields instead of copying
  17 config fields individually
- triggerAlert goroutines get 30s timeout context
- All consumers (TUI, server, cluster, main) receive *Engine via
  constructor/parameter — no package-level state access
- main.go creates context.WithCancel, passes to engine and cluster

First test suite: 12 tests across store and alert packages
- Store: CRUD for sites/alerts/users, push token generation,
  import/export round-trip, check history persistence
- Alert: Discord/Slack/Webhook payload format, HTTP 4xx error
  propagation, Ntfy headers, unknown provider returns nil
2026-05-15 08:21:17 -04:00
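A rough sketch of the token index idea from f023e38fdc: a mutex-protected `map[string]int` gives O(1) push heartbeat lookup. The type and method names below are hypothetical, the sketch omits the store and all other engine state, and it assumes the `int` value is a site ID.

```go
package main

import (
	"fmt"
	"sync"
)

// engine is a cut-down stand-in for the real Engine struct.
type engine struct {
	mu         sync.RWMutex
	tokenIndex map[string]int // push token -> site ID (assumed meaning)
}

func newEngine() *engine {
	return &engine{tokenIndex: make(map[string]int)}
}

// setToken registers or replaces the token for a site.
func (e *engine) setToken(token string, siteID int) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.tokenIndex[token] = siteID
}

// siteForToken resolves a heartbeat token with a single map lookup
// instead of scanning every site.
func (e *engine) siteForToken(token string) (int, bool) {
	e.mu.RLock()
	defer e.mu.RUnlock()
	id, ok := e.tokenIndex[token]
	return id, ok
}

func main() {
	e := newEngine()
	e.setToken("abc123", 7)
	id, ok := e.siteForToken("abc123")
	fmt.Println(id, ok)
}
```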
lerko a6bb9a7aff refactor(core): remove store global singleton, thread store explicitly
Remove store.Get()/SetGlobal()/Current. Store is now passed explicitly
to all consumers via constructor parameters and function arguments.

- TUI Model holds store field, set via InitialModel(isAdmin, store)
- monitor.StartEngine(s) and InitHistoryFromStore(s) accept store
- server.Start(cfg, s) closes over store in HTTP handlers
- main.go threads store to SSH server, TUI, monitor, server
- isKeyAllowed receives store as parameter

No more hidden dependency on package-level mutable state in store pkg.
Monitor package still uses package-level state (LiveState, etc.) — will
be encapsulated into Engine struct in Phase 7.
2026-05-15 00:45:07 -04:00
lerko d4f4012c8a refactor(store): add error returns to all Store interface methods
Every Store method now returns an error. Callers handle errors
gracefully — TUI logs to event log, server returns HTTP 500,
monitor engine logs and retries. All rows.Scan() errors are now
checked in sqlstore.go instead of silently appending corrupt data.

- GetSites, GetAllAlerts, GetAllUsers return ([]T, error)
- GetAlert returns (AlertConfig, error) instead of (AlertConfig, bool)
- AddSite, UpdateSite, DeleteSite, etc. all return error
- SaveCheck, LoadAllHistory, ExportData return error
- ~25 call sites updated across tui, server, monitor, main
2026-05-15 00:37:20 -04:00
lerko ab75f61c6b refactor(store): unify SQLite and Postgres into dialect-based SQLStore
Extract shared SQLStore with Dialect interface for the ~5% that
differs between backends (DDL, placeholders, sequence resets).

- New dialect.go: Dialect interface + placeholder rewriter (? → $N)
- New sqlstore.go: single implementation of all 19 Store methods
- sqlite.go: reduced from 286 to 83 lines (SQLiteDialect only)
- postgres.go: reduced from 266 to 78 lines (PostgresDialect only)
- main.go: use NewSQLiteStore/NewPostgresStore constructors

Zero CRUD logic duplication. Every future schema change written once.
2026-05-15 00:31:44 -04:00
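The placeholder rewriter mentioned in ab75f61c6b (`?` to `$N`) might look roughly like the sketch below. The function name is hypothetical, and the quoting rule is a simplification: it skips `?` inside single-quoted SQL literals but does not handle comments or dollar-quoted strings.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// rewritePlaceholders converts SQLite-style ? placeholders into
// Postgres-style $1, $2, ... in order of appearance.
func rewritePlaceholders(query string) string {
	var b strings.Builder
	n := 0
	inQuote := false
	for i := 0; i < len(query); i++ {
		c := query[i]
		switch {
		case c == '\'':
			// A doubled quote ('') toggles twice, so the state stays correct.
			inQuote = !inQuote
			b.WriteByte(c)
		case c == '?' && !inQuote:
			n++
			b.WriteByte('$')
			b.WriteString(strconv.Itoa(n))
		default:
			b.WriteByte(c)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(rewritePlaceholders("UPDATE sites SET name = ?, url = ? WHERE id = ?"))
}
```

Writing every query once with `?` and rewriting at the dialect boundary is what lets a single CRUD implementation serve both backends.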
lerko 4d5116644f fix(core): correctness and robustness fixes across all subsystems
- Move status page template to package-level template.Must (panic on
  parse error at init instead of nil deref at runtime)
- Fix XSS in import error responses (log detail server-side, return
  generic message to client)
- Handle ListenAndServe errors in HTTP and SSH servers
- Use defer resp.Body.Close() in all alert providers, check
  json.Marshal errors
- Share HTTP clients across checks instead of creating per-request
- Use http.NewRequestWithContext for per-site timeout control
- Support HTTP method field (was always GET despite DB storing method)
- Implement AcceptedCodes validation (was hardcoded >= 400 despite DB
  storing accepted code ranges)
- Add defer tx.Rollback() to ImportData for transaction safety
2026-05-15 00:00:02 -04:00
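As an illustration of the AcceptedCodes validation added in 4d5116644f, the sketch below checks a status code against a list of codes and ranges. The spec format (`"200-299,301"`), the function name, and the fallback to the old "below 400" rule for an empty spec are assumptions; the commit message does not say how accepted code ranges are stored.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// codeAccepted reports whether code matches spec, a comma-separated list of
// single codes and inclusive ranges such as "200-299,301" (assumed format).
func codeAccepted(spec string, code int) (bool, error) {
	if strings.TrimSpace(spec) == "" {
		return code < 400, nil // assumption: fall back to the previous hardcoded rule
	}
	for _, part := range strings.Split(spec, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		loStr, hiStr, isRange := strings.Cut(part, "-")
		low, err := strconv.Atoi(strings.TrimSpace(loStr))
		if err != nil {
			return false, fmt.Errorf("bad accepted code %q: %w", part, err)
		}
		high := low
		if isRange {
			high, err = strconv.Atoi(strings.TrimSpace(hiStr))
			if err != nil {
				return false, fmt.Errorf("bad accepted code range %q: %w", part, err)
			}
		}
		if code >= low && code <= high {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	for _, code := range []int{204, 301, 404} {
		ok, err := codeAccepted("200-299,301", code)
		fmt.Println(code, ok, err)
	}
}
```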
lerko e97780ad38 fix(tui,status,store): add delete confirm, input validation, XSS fix, history persistence
Prevent accidental deletes with y/n confirmation dialog. Validate all
numeric form inputs (interval, port, timeout, threshold, retries) with
range checks instead of silently defaulting to zero. Escape user-supplied
data in status page JavaScript to close XSS via monitor names. Persist
check history to new check_history table so sparklines and uptime
percentages survive restarts.
2026-05-14 20:51:06 -04:00
lerko 6d92df4f46 feat(importer): add Uptime Kuma backup converter with CLI and API
Convert Kuma monitorList/notificationList to go-upkeep Backup format.
Maps all monitor types (http, ping, port, dns, group), ntfy notifications
with auth, parent IDs, and alert assignments. Available via --import-kuma
flag and POST /api/import/kuma endpoint.
2026-05-14 17:30:17 -04:00
lerko f06dd5702b feat(models): widen Site struct and DB schema for ping, port, dns, group monitor types
Add Hostname, Port, Timeout, Method, Description, ParentID, AcceptedCodes,
DNSResolveType, DNSServer, and IgnoreTLS fields. Refactor AddSite/UpdateSite
to accept models.Site instead of individual params. Includes DB migrations
for existing databases, per-monitor timeout/TLS in the engine, new type
options in TUI forms, and TYPE column in the sites table.
2026-05-14 17:10:56 -04:00
lerko 11848ce674 fix(security): harden TLS, timeouts, validation, logging, and token generation
- Default TLS verification on, opt-in UPKEEP_INSECURE_SKIP_VERIFY
- Alert webhooks use 10s timeout client, close response bodies
- URL input validates http/https scheme for HTTP monitors
- Stdlib logs route to stderr instead of being discarded
- Panic on crypto/rand failure in token generation
- Cluster startup warnings for non-HTTPS and missing secret
- Replace demo SMTP creds with obvious placeholders
- Color-coded log entries and scroll hints in logs tab
2026-05-14 15:28:04 -04:00
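The "panic on crypto/rand failure" item in 11848ce674 can be sketched like this. The function name, the 16-byte length, and the hex encoding are assumptions; the point shown is that a failed read from the system random source aborts instead of silently producing a weak token.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newPushToken returns a random 32-character hex token.
// It panics if the system's secure random source is unavailable,
// since continuing would risk issuing a predictable token.
func newPushToken() string {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		panic("crypto/rand failed: " + err.Error())
	}
	return hex.EncodeToString(buf)
}

func main() {
	fmt.Println(newPushToken())
}
```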
lerko 02f0a39d97 feat: initial commit — uptime monitor (forked from go-upkeep)
Go-based uptime monitor with SQLite/Postgres storage, TUI dashboard,
SSH server, alerting, and clustering support.
2026-05-14 11:05:10 -04:00