Commit Graph

87 Commits

Author SHA1 Message Date
lerko f01533080f feat(tui): split available width evenly between NAME and HISTORY columns 2026-05-16 13:43:34 -04:00
lerko cc9dc24892 fix(tui): sort children by ID before status to prevent map-order shuffling 2026-05-16 13:36:49 -04:00
lerko 426c38ea94 fix(tui): use stable sort to prevent site list shuffling each tick 2026-05-16 13:33:20 -04:00
lerko 22c6022121 feat(tui): DOWN-first sort, health pulse, and site filter
- DOWN/SSL EXP monitors float to top of sites list
- Pulse indicator turns red when any monitor is down, green when healthy
- Press / to filter sites by name, Enter to lock filter, Esc to clear
- Active filter shown in status bar
2026-05-16 13:28:37 -04:00
lerko f2ea0dc758 feat(tui): bordered modals, welcome state, and dynamic name width
- Delete confirmation wrapped in rounded border box with danger color
- Empty sites view shows styled welcome box with onboarding hint
- NAME column width scales with terminal width (13-40 chars)
2026-05-16 12:56:09 -04:00
lerko 3bc8e31b89 fix(tui): make status bar and tab badges visible
- Tab badges now always show count (Sites (12)), not just on failure
- Status bar UP count uses green/red coloring instead of subtle gray
2026-05-16 12:28:09 -04:00
lerko 769954c8f5 feat(tui): add status bar, tab badges, and detail panel
Polish pass for TUI professionalism:
- Status bar replaces generic footer with live stats (UP/DOWN count,
  online probes) plus contextual key hints
- Tab badges show DOWN count on Sites tab and offline count on Nodes tab
- Detail panel (press i) shows full monitor info: URL, latency, uptime,
  SSL, probe results, sparkline — without entering edit mode
2026-05-16 12:25:46 -04:00
lerko 0396acdc59 feat(cluster): add region affinity, Nodes TUI tab, and probe metrics
Phase 3 of distributed probing:
- Add regions column to sites table for per-monitor probe affinity
- Region-filtered probe assignments (empty regions = all probes)
- New Nodes TUI tab showing connected probes with status/region/last-seen
- Regions input field in site form for configuring probe affinity
- Config-as-code support for regions (export/import/diff)
- Prometheus upkeep_probe_up metric with per-node labels
- Reindex TUI tabs: Sites, Alerts, Logs, Nodes, Users
2026-05-16 11:50:16 -04:00
lerko ca5a42314f feat(cluster): add probe execution mode, check extraction, and result aggregation
Phase 2 of distributed probing:
- Extract check logic into standalone RunCheck() for use by probes
- Add probe cluster mode: stateless nodes that fetch assignments, execute
  checks, and report results to the leader
- Add multi-node result aggregation with configurable strategy
  (any-down, majority-down, all-down)
- Leader ingests probe results into engine live state and triggers alerts
- New env vars: UPKEEP_NODE_ID, UPKEEP_NODE_NAME, UPKEEP_NODE_REGION,
  UPKEEP_AGG_STRATEGY
- Example docker-compose.probe.yml with leader + 2 regional probes
2026-05-16 11:19:57 -04:00
lerko ca9faa0acd feat(cluster): add distributed probing foundation — schema, models, and probe APIs
Add node-aware check history and probe registration infrastructure:
- ProbeNode model and nodes table (SQLite + Postgres)
- node_id column on check_history for multi-source tracking
- Store interface: RegisterNode, GetNode, GetAllNodes, DeleteNode, SaveCheckFromNode
- Dialect: UpsertNodeSQL (INSERT OR REPLACE / ON CONFLICT)
- API endpoints: POST /api/probe/register, GET /api/probe/assignments, POST /api/probe/results
- Backward compatible: existing SaveCheck wraps SaveCheckFromNode with empty node_id
2026-05-16 11:05:06 -04:00
lerko 5b01b9ee30 feat(config): add config-as-code YAML import/export
Add declarative config-as-code support via YAML files. Monitors and
alerts can be exported, version controlled, and applied across instances.

- goupkeep export [-o file.yaml] dumps current state
- goupkeep apply -f file.yaml creates/updates to match desired state
- --dry-run shows planned changes without applying
- --prune deletes monitors/alerts not in the YAML
- Matching by name, alert references by name, nested group children
- CLI refactored to subcommands (apply, export, serve) with backward compat
- 24 tests covering apply, export, validation, round-trip idempotency
2026-05-15 20:40:49 -04:00
lerko 9e5bb74c5c feat(tui): expose HTTP method and accepted status codes in monitor form
DB fields existed but were never surfaced in the TUI. Adds an HTTP
Settings form group with method select (7 methods) and accepted
codes input, visible only for HTTP monitors.
2026-05-15 15:42:51 -04:00
lerko b7b8aa6f03 feat(metrics): add Prometheus /metrics endpoint
Zero-dependency Prometheus text exposition format. Exposes monitor
up/down, latency, status code, check timestamps, pause state,
SSL cert expiry, and check counters — all from in-memory state.
2026-05-15 11:26:21 -04:00
lerko 52a54f9c5c feat(alert): add Telegram, PagerDuty, Pushover, Gotify providers
Expand alert provider count from 5 to 9. All new providers use
the shared HTTPProvider with closure-based payload functions.
Includes TUI form support and tests for each provider.
2026-05-15 10:53:38 -04:00
lerko f023e38fdc refactor(monitor): encapsulate engine state, add graceful shutdown and tests
Replace all monitor package-level mutable state with Engine struct.
All state (liveState, logStore, histories, tokenIndex, HTTP clients)
is now encapsulated in Engine, created via NewEngine(store).

Key changes:
- Engine struct holds all monitor state with proper mutex protection
- Engine.Start(ctx) and monitorRoutine respect context cancellation
  for graceful shutdown — no more leaked goroutines
- cluster.runFollowerLoop also respects context for clean exit
- Token index (map[string]int) for O(1) push heartbeat lookup,
  replacing O(n) linear scan through LiveState
- UpdateSiteConfig preserves 8 runtime fields instead of copying
  17 config fields individually
- triggerAlert goroutines get 30s timeout context
- All consumers (TUI, server, cluster, main) receive *Engine via
  constructor/parameter — no package-level state access
- main.go creates context.WithCancel, passes to engine and cluster

First test suite: 12 tests across store and alert packages
- Store: CRUD for sites/alerts/users, push token generation,
  import/export round-trip, check history persistence
- Alert: Discord/Slack/Webhook payload format, HTTP 4xx error
  propagation, Ntfy headers, unknown provider returns nil
2026-05-15 08:21:17 -04:00
lerko 0e6dc774cb refactor(tui): extract shared table rendering, fix cursor bounds
- New table_helpers.go with renderTable() and shared styles
- Remove 4 duplicated style blocks (header/cell/selected/border)
  from tab_alerts.go and tab_users.go
- All 3 tab views now use renderTable() for offset/end calc,
  selected row highlighting, and table construction
- Sites tab keeps siteGroupStyle via StyleOverride callback
- Clamp cursor to list length at end of refreshData() to prevent
  index-out-of-bounds after concurrent list changes
- Fix off-by-one in tab click handler (i <= maxTabs → i < tabCount)
2026-05-15 00:49:14 -04:00
lerko d6f33a4d1f refactor(alert): extract shared HTTPProvider for webhook-based alerts
Discord, Slack, and Webhook providers now use a single HTTPProvider
struct with a PayloadFunc for the only part that differs. Centralizes
response body handling and adds HTTP status code checking (4xx/5xx
now return errors instead of being silently ignored).

Email and Ntfy keep separate implementations (different protocols).
Adding a new HTTP-based alert provider is now a one-line PayloadFunc.
2026-05-15 00:46:05 -04:00
lerko a6bb9a7aff refactor(core): remove store global singleton, thread store explicitly
Remove store.Get()/SetGlobal()/Current. Store is now passed explicitly
to all consumers via constructor parameters and function arguments.

- TUI Model holds store field, set via InitialModel(isAdmin, store)
- monitor.StartEngine(s) and InitHistoryFromStore(s) accept store
- server.Start(cfg, s) closes over store in HTTP handlers
- main.go threads store to SSH server, TUI, monitor, server
- isKeyAllowed receives store as parameter

No more hidden dependency on package-level mutable state in store pkg.
Monitor package still uses package-level state (LiveState, etc.) — will
be encapsulated into Engine struct in Phase 7.
2026-05-15 00:45:07 -04:00
lerko d4f4012c8a refactor(store): add error returns to all Store interface methods
Every Store method now returns an error. Callers handle errors
gracefully — TUI logs to event log, server returns HTTP 500,
monitor engine logs and retries. All rows.Scan() errors are now
checked in sqlstore.go instead of silently appending corrupt data.

- GetSites, GetAllAlerts, GetAllUsers return ([]T, error)
- GetAlert returns (AlertConfig, error) instead of (AlertConfig, bool)
- AddSite, UpdateSite, DeleteSite, etc. all return error
- SaveCheck, LoadAllHistory, ExportData return error
- ~25 caller sites updated across tui, server, monitor, main
2026-05-15 00:37:20 -04:00
lerko ab75f61c6b refactor(store): unify SQLite and Postgres into dialect-based SQLStore
Extract shared SQLStore with Dialect interface for the ~5% that
differs between backends (DDL, placeholders, sequence resets).

- New dialect.go: Dialect interface + placeholder rewriter (? → $N)
- New sqlstore.go: single implementation of all 19 Store methods
- sqlite.go: reduced from 286 to 83 lines (SQLiteDialect only)
- postgres.go: reduced from 266 to 78 lines (PostgresDialect only)
- main.go: use NewSQLiteStore/NewPostgresStore constructors

Zero CRUD logic duplication. Every future schema change written once.
2026-05-15 00:31:44 -04:00
lerko 4d5116644f fix(core): correctness and robustness fixes across all subsystems
- Move status page template to package-level template.Must (panic on
  parse error at init instead of nil deref at runtime)
- Fix XSS in import error responses (log detail server-side, return
  generic message to client)
- Handle ListenAndServe errors in HTTP and SSH servers
- Use defer resp.Body.Close() in all alert providers, check
  json.Marshal errors
- Share HTTP clients across checks instead of creating per-request
- Use http.NewRequestWithContext for per-site timeout control
- Support HTTP method field (was always GET despite DB storing method)
- Implement AcceptedCodes validation (was hardcoded >= 400 despite DB
  storing accepted code ranges)
- Add defer tx.Rollback() to ImportData for transaction safety
2026-05-15 00:00:02 -04:00
lerko 77fa6324f2 style(tui): add fixed column widths to sites table
Use lipgloss StyleFunc to set per-column widths, with NAME as
the flex column absorbing remaining space. History column tied
to sparkWidth for consistency.
2026-05-14 22:07:32 -04:00
lerko c480f519c4 feat(tui): add monitor groups with collapse/expand and tree view
Groups act as visual organizers in the sites table. Monitors can be
assigned to a parent group via the form. Group rows show aggregated
worst-child status, children render with tree chars (├/└), and Space
toggles collapse/expand. Group form hides irrelevant connection and
advanced sections.
2026-05-14 21:15:34 -04:00
lerko cfcd71dabe refactor(tui): replace database ID column with row counter
Display sequential # instead of internal database IDs in sites, alerts,
and users tables for a cleaner view without gaps from deleted records.
2026-05-14 20:57:03 -04:00
lerko e97780ad38 fix(tui,status,store): add delete confirm, input validation, XSS fix, history persistence
Prevent accidental deletes with y/n confirmation dialog. Validate all
numeric form inputs (interval, port, timeout, threshold, retries) with
range checks instead of silently defaulting to zero. Escape user-supplied
data in status page JavaScript to close XSS via monitor names. Persist
check history to new check_history table so sparklines and uptime
percentages survive restarts.
2026-05-14 20:51:06 -04:00
lerko d5ab3a18a4 feat(tui,status): add per-site pause, fix viewport, polish status page
Per-site pause: [p] key toggles pause for selected monitor in TUI.
Paused monitors skip checks, persist to DB, show on status page.

Status page: replace full-page reload with fetch-based DOM updates
to eliminate scroll-jump on refresh. Add summary bar (UP/DOWN/PAUSED
counts), stale-data indicator, and fix SSL EXP CSS class bug.

TUI: constrain tables to terminal width via lipgloss .Width() to
prevent row wrapping that pushed header off-screen. Add MaxHeight
safety net. Bump subtle style from #383838 to #565f89 for
readability on dark terminals.
2026-05-14 18:46:17 -04:00
lerko f17f8c9f93 feat(tui): add mouse wheel scrolling for all tabs 2026-05-14 18:02:59 -04:00
lerko 6d92df4f46 feat(importer): add Uptime Kuma backup converter with CLI and API
Convert Kuma monitorList/notificationList to go-upkeep Backup format.
Maps all monitor types (http, ping, port, dns, group), ntfy notifications
with auth, parent IDs, and alert assignments. Available via --import-kuma
flag and POST /api/import/kuma endpoint.
2026-05-14 17:30:17 -04:00
lerko a1f22af179 feat(alerts): add ntfy notification provider with TUI support
POST to ntfy server/topic with title, priority, and optional basic auth.
TUI alert form includes ntfy type with server URL, topic, priority
selector (1-5), and credential fields.
2026-05-14 17:25:22 -04:00
lerko 93fe372497 feat(monitor): add ping, port, and DNS check routines
Implement checkPing (pro-bing ICMP), checkPort (TCP dial), and checkDNS
(miekg/dns) with per-monitor timeout, configurable DNS record types,
and fallback defaults. Groups skip checks entirely.
2026-05-14 17:22:57 -04:00
lerko f06dd5702b feat(models): widen Site struct and DB schema for ping, port, dns, group monitor types
Add Hostname, Port, Timeout, Method, Description, ParentID, AcceptedCodes,
DNSResolveType, DNSServer, and IgnoreTLS fields. Refactor AddSite/UpdateSite
to accept models.Site instead of individual params. Includes DB migrations
for existing databases, per-monitor timeout/TLS in the engine, new type
options in TUI forms, and TYPE column in the sites table.
2026-05-14 17:10:56 -04:00
lerko 2bf452d429 feat(tui): upgrade alerts tab with lipgloss table, click zones, colored types
Alerts tab now matches sites/users quality: bordered table, click-to-select,
color-coded provider types, and improved config column display.
2026-05-14 15:55:40 -04:00
lerko 11848ce674 fix(security): harden TLS, timeouts, validation, logging, and token generation
- Default TLS verification on, opt-in UPKEEP_INSECURE_SKIP_VERIFY
- Alert webhooks use 10s timeout client, close response bodies
- URL input validates http/https scheme for HTTP monitors
- Stdlib logs route to stderr instead of discard
- Panic on crypto/rand failure in token generation
- Cluster startup warnings for non-HTTPS and missing secret
- Replace demo SMTP creds with obvious placeholders
- Color-coded log entries and scroll hints in logs tab
2026-05-14 15:28:04 -04:00
lerko b7592ee9e5 feat(tui): upgrade users tab with lipgloss table, edit support, role select
Users tab now matches sites/alerts quality: lipgloss bordered table,
click-to-select zones, edit form with role picker, and UpdateUser
support across both store backends.
2026-05-14 15:28:04 -04:00
lerko c24bb7a0d4 fix(tui): forward all msg types to huh forms, improve row selection UX
Huh forms need all message types (timers, resize, etc.) not just
KeyMsg to function. Restructured Update to delegate all messages to
huh when in form state. Fixed selected row style to be visually
distinct from header (white text on darker bg). Moved click zone
from narrow ID cell to wider Name cell for better click targets.
2026-05-14 15:28:04 -04:00
lerko 5a8e016183 feat(tui): enhanced dashboard with lipgloss tables, huh forms, mouse support, and animations
Split monolithic tui.go into per-tab view files. Added sparkline history
tracking, color-coded latency/uptime, rounded bordered tables, Dracula-themed
huh forms with conditional groups, BubbleZone mouse click support for tabs
and rows, and Harmonica spring-physics pulse indicator. Includes --demo flag
for seeding test data.
2026-05-14 15:28:04 -04:00
lerko 02f0a39d97 feat: initial commit — uptime monitor (forked from go-upkeep)
Go-based uptime monitor with SQLite/Postgres storage, TUI dashboard,
SSH server, alerting, and clustering support.
2026-05-14 11:05:10 -04:00