Commit Graph

69 Commits

Author SHA1 Message Date
lerko fa96c5fd3f fix(tui): render sparkline 2 chars narrower than column to prevent wrapping
CI / test (pull_request) Successful in 3m5s
CI / lint (pull_request) Failing after 1m12s
CI / vulncheck (pull_request) Successful in 56s
Cell padding inside the table causes sparkline content at full column
width to wrap. Subtract 2 from sparkWidth for content rendering.
2026-05-28 16:01:44 -04:00
lerko c487a8eb26 fix(tui): compute table Width from column sum, not terminal width
CI / test (pull_request) Successful in 2m47s
CI / lint (pull_request) Failing after 1m6s
CI / vulncheck (pull_request) Successful in 56s
Table Width was set to terminal width, forcing lipgloss to redistribute
surplus space into columns like #. Now computed from sum of column widths
+ border overhead. Table is exactly as wide as needed, capped at terminal.
2026-05-28 15:57:33 -04:00
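
The width rule in this commit, together with the border overhead formula from 217276ca18 (2 borders + numCols-1 separators), can be sketched as below. This is a minimal illustration, not the project's code: the function names and the example column widths are assumptions.

```go
package main

import "fmt"

// borderOverhead returns 2 outer borders plus one separator between each
// pair of adjacent columns, as described in commit 217276ca18.
func borderOverhead(numCols int) int {
	if numCols <= 0 {
		return 0
	}
	return 2 + (numCols - 1)
}

// tableWidth sums the column widths, adds border overhead, and caps the
// result at the terminal width. Name and signature are hypothetical.
func tableWidth(colWidths []int, termWidth int) int {
	total := borderOverhead(len(colWidths))
	for _, w := range colWidths {
		total += w
	}
	if total > termWidth {
		return termWidth
	}
	return total
}

func main() {
	// Nine hypothetical column widths summing to 95.
	cols := []int{4, 20, 8, 8, 7, 8, 5, 5, 30}
	fmt.Println(borderOverhead(len(cols))) // 10
	fmt.Println(tableWidth(cols, 200))     // 105
	fmt.Println(tableWidth(cols, 80))      // 80
}
```

Because the table is never wider than its columns need, there is no surplus left for the table renderer to redistribute into narrow columns such as #.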
lerko 5401266e83 refactor(tui): two-tier responsive table layout (compact/wide at 120 cols)
CI / test (pull_request) Successful in 2m54s
CI / lint (pull_request) Failing after 1m12s
CI / vulncheck (pull_request) Successful in 56s
Replace continuous surplus distribution with two fixed layouts per table.
Breakpoint at 120 columns — matches how btop/k9s do it.

Compact (<120): short headers (LAT, UP%, RT, ST, MON, SENT, VER),
  tight fixed widths, no surplus guessing.

Wide (≥120): full headers (LATENCY, UPTIME, RETRIES, STATUS, MONITORS,
  LAST SENT, VERSION), generous widths.

Sites tab keeps content-aware NAME sizing + sparkline flex.
All other tabs (Alerts, Maint, Nodes, Users) use simple fixed tiers.

Removed old computeTableLayout/colDef/tierCol/pickTier — no longer needed.
2026-05-28 15:50:23 -04:00
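
A two-tier layout with one breakpoint can be sketched as follows. The header sets and the 120-column threshold come from the commit message; the identifiers wideBreakpoint and headersFor are assumptions made for illustration, and per-column widths are left out.

```go
package main

import "fmt"

// wideBreakpoint is the 120-column threshold named in the commit message.
const wideBreakpoint = 120

// Header sets as listed in the commit message.
var (
	compactHeaders = []string{"LAT", "UP%", "RT", "ST", "MON", "SENT", "VER"}
	wideHeaders    = []string{"LATENCY", "UPTIME", "RETRIES", "STATUS", "MONITORS", "LAST SENT", "VERSION"}
)

// headersFor picks one of the two fixed layouts. There is no continuous
// surplus distribution: width only decides which tier is used.
func headersFor(termWidth int) []string {
	if termWidth >= wideBreakpoint {
		return wideHeaders
	}
	return compactHeaders
}

func main() {
	for _, w := range []int{80, 119, 120, 200} {
		fmt.Println(w, headersFor(w)[0])
	}
}
```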
lerko a84f4894f8 fix(tui): maint tab — MONITORS is flex, dates get room, time adapts to width
CI / test (pull_request) Successful in 2m46s
CI / lint (pull_request) Failing after 1m7s
CI / vulncheck (pull_request) Successful in 51s
TITLE was flex eating all space while dates/monitors squeezed. Flipped:
MONITORS is now flex, TITLE fixed (12-24), dates min 14, STATUS min 11.
fmtMaintTime uses compact format (Jan 02) at narrow widths.
2026-05-28 15:40:18 -04:00
lerko d9dcd58b66 fix(tui): maint tab min widths fit 80-column terminals
CI / test (pull_request) Successful in 2m49s
CI / lint (pull_request) Failing after 1m12s
CI / vulncheck (pull_request) Successful in 56s
Reduce minimums so fixedMin=51 (was 71). At narrow: compact headers (ST, MON).
At wide: full headers (STATUS, MONITORS, STARTED) with expanded widths.
2026-05-28 15:31:01 -04:00
lerko 251c723fbd fix(tui): maint tab — bump MONITORS/STARTED/ENDS min widths, respect column width
CI / test (pull_request) Successful in 2m50s
CI / lint (pull_request) Failing after 1m1s
CI / vulncheck (pull_request) Successful in 56s
MONITORS min 13→15 (monitor names can be long, truncate to column width).
STARTED/ENDS min 10→14 (fits '18:30 May 28' = 12 chars + padding).
fmtMaintMonitorW truncates name to actual column width.
2026-05-28 15:24:25 -04:00
lerko eb61a0dd3c fix(tui): increase maint tab min column widths for TYPE/MONITORS/STATUS
CI / test (pull_request) Successful in 2m51s
CI / lint (pull_request) Successful in 1m6s
CI / vulncheck (pull_request) Successful in 51s
TYPE min 10→13 (fits 'maintenance'), MONITORS min 10→13, STATUS min 8→11
(fits 'SCHEDULED'). Prevents word wrapping.
2026-05-28 15:22:18 -04:00
lerko d05bbd007b feat(tui): responsive table layout for all tabs
CI / test (pull_request) Successful in 2m45s
CI / lint (pull_request) Successful in 1m12s
CI / vulncheck (pull_request) Successful in 1m6s
Extract shared computeTableLayout() into table_helpers.go — takes column
definitions with short/full headers, min/max widths, and a flex column
that absorbs surplus space. All tabs now use it:

- Alerts: CONFIG column is flex, NAME/TYPE/SENT expand with width
- Maint: TITLE column is flex, TYPE/MONITORS/STATUS/dates expand
- Nodes: NAME column is flex, REGION/LAST SEEN/VERSION expand
- Users: PUBLIC KEY column is flex, USERNAME expands
- Sites: uses same colDef type (keeps special dual-flex for NAME+HISTORY)

Headers auto-switch short/full based on available width across all tabs.
2026-05-28 15:20:12 -04:00
lerko 217276ca18 fix(tui): correct table border overhead calculation
CI / test (pull_request) Successful in 2m54s
CI / lint (pull_request) Successful in 1m6s
CI / vulncheck (pull_request) Successful in 56s
Was hardcoded to 30 — actual overhead is 2 (borders) + numCols-1
(separators) = 10 for 9 columns. The 20-char gap was being
redistributed by lipgloss into columns like # making them too wide.
2026-05-28 15:11:29 -04:00
lerko 2e489cdc1a fix(tui): apply Width/MaxWidth to header row cells too
CI / test (pull_request) Successful in 2m36s
CI / lint (pull_request) Successful in 1m1s
CI / vulncheck (pull_request) Successful in 54s
Header row was returning bare tableHeaderStyle with no width constraints,
letting lipgloss table-level Width() inflate the # column.
2026-05-28 15:04:38 -04:00
lerko 2569a252ff fix(tui): enforce column MaxWidth to prevent lipgloss redistribution
CI / test (pull_request) Successful in 2m50s
CI / lint (pull_request) Successful in 1m11s
CI / vulncheck (pull_request) Successful in 56s
lipgloss table with Width(tableWidth) redistributes surplus space across
all columns. Adding MaxWidth() caps each column to its computed width.
Also dump any remaining surplus into the HISTORY sparkline column.
2026-05-28 14:18:32 -04:00
lerko 9121b79582 fix(tui): prevent # and SSL columns from expanding unnecessarily
CI / test (pull_request) Successful in 2m48s
CI / lint (pull_request) Successful in 1m12s
CI / vulncheck (pull_request) Successful in 56s
Set min=max for columns that don't benefit from extra width.
Surplus space goes to sparkline instead.
2026-05-28 13:58:04 -04:00
lerko 2c78c60d08 fix(tui): set explicit NAME column width to match content truncation
CI / test (pull_request) Successful in 2m41s
CI / lint (pull_request) Successful in 1m11s
CI / vulncheck (pull_request) Successful in 56s
Was width=0 (auto) which let lipgloss over-allocate the column,
causing visible empty space between truncated names and TYPE column.
Now set to nameW explicitly so column width = truncation limit.
2026-05-28 13:52:24 -04:00
lerko c5477c7ef6 fix(tui): size NAME column to actual content, surplus goes to sparkline
CI / test (pull_request) Successful in 2m44s
CI / lint (pull_request) Successful in 1m12s
CI / vulncheck (pull_request) Successful in 56s
Compute max monitor name length and cap NAME column to that + 4 (icon/padding).
Extra space goes to HISTORY sparkline instead of dead whitespace.
2026-05-28 13:39:00 -04:00
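
The content-aware sizing described here amounts to "longest name + 4". A small sketch, assuming a hypothetical nameColumnWidth helper and counting runes rather than true terminal cell width:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// namePadding is the 4 extra cells for icon and padding from the commit message.
const namePadding = 4

// nameColumnWidth returns the longest name length plus padding.
// Rune count is a simplification: wide glyphs occupy two terminal cells.
func nameColumnWidth(names []string) int {
	longest := 0
	for _, n := range names {
		if l := utf8.RuneCountInString(n); l > longest {
			longest = l
		}
	}
	return longest + namePadding
}

func main() {
	fmt.Println(nameColumnWidth([]string{"api", "dashboard"})) // 13
}
```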
lerko ecdb1a6632 fix(tui): increase LAT/UPTIME min column widths to prevent wrapping
CI / test (pull_request) Successful in 2m59s
CI / lint (pull_request) Successful in 1m16s
CI / vulncheck (pull_request) Successful in 1m2s
LAT min 5→7 (fits '142ms' + padding), UPTIME min 5→8 (fits '100.0%' + padding).
2026-05-28 13:35:55 -04:00
lerko 82d7b2942b feat(tui): responsive table columns — expand headers with terminal width
CI / test (pull_request) Successful in 2m48s
CI / lint (pull_request) Successful in 1m17s
CI / vulncheck (pull_request) Successful in 51s
Replace hardcoded column widths with dynamic layout system:
- Each column has short/full header and min/max width
- At narrow terminals: LAT, UP%, RT, compact widths
- At wide terminals: LATENCY, UPTIME, RETRIES, expanded widths
- Surplus space distributed left-to-right across expandable columns
- Headers switch between short/full based on actual column width

Column definitions:
  # (4-6)  TYPE (8-10)  STATUS (8-10)  LAT/LATENCY (5-10)
  UP%/UPTIME (5-8)  SSL (5-7)  RT/RETRIES (5-9)
2026-05-28 13:28:08 -04:00
lerko af5246e777 chore(tui): visual polish — detail sections, column headers, alert detail
CI / test (pull_request) Successful in 2m41s
CI / lint (pull_request) Successful in 1m12s
CI / vulncheck (pull_request) Successful in 51s
Detail panel:
- Grouped fields into sections (ENDPOINT, TIMING, HTTP, CONFIG)
- Omit Timeout when 0 (unconfigured)
- Omit Method when default GET
- Show explicit "200-299" when AcceptedCodes empty

Table:
- LATENCY header → LAT (design short, never truncate)

Alerts:
- Press [i] for alert detail panel: full config, health status,
  send counts, last error
- Keybinding display updated with [i]Info

Bundled remaining UX polish items from screenshot review.
2026-05-28 13:18:27 -04:00
lerko 0aa2f9cd8a feat: alert channel health indicator + test alerts
CI / test (pull_request) Successful in 2m46s
CI / lint (pull_request) Successful in 1m1s
CI / vulncheck (pull_request) Successful in 51s
Track alert delivery health at runtime:
- AlertHealth struct: LastSendAt, LastSendOK, LastError, SendCount, FailCount
- triggerAlert records success/failure after each Send()
- Health data exposed via GetAlertHealth() for TUI

Alerts tab enriched:
- Health dot column: green (OK), red (failed), gray (never sent)
- LAST SENT column: relative time ("2m ago", "never")
- [t] key sends test notification through selected channel

Inspired by Grafana's contact point health columns.
2026-05-27 21:23:06 -04:00
lerko b14d5e19db feat: logs tab overhaul — severity tags, filtering, recovery durations
CI / test (pull_request) Successful in 2m36s
CI / lint (pull_request) Successful in 1m1s
CI / vulncheck (pull_request) Successful in 51s
Logs tab visual overhaul:
- Severity-classified entries: DOWN (red), UP (green), WARN (amber),
  SYS (cyan), info (gray) — rendered as inline tags, not whole-line color
- Column-aligned format: [timestamp] [severity tag] [message]
- Filter toggle (f key): All vs Important only (hides retry noise)
- Header shows entry count, filter state, hidden count

Engine log improvements:
- Recovery messages include downtime duration ("was down 14m")
- LATE transition logged ("heartbeat overdue")
- Push monitor recovery includes downtime duration
2026-05-27 20:14:43 -04:00
lerko 5dc31108f8 feat: proper push monitor lifecycle — PENDING, LATE, DOWN states
CI / test (pull_request) Successful in 2m41s
CI / lint (pull_request) Successful in 1m7s
CI / vulncheck (pull_request) Successful in 46s
Push monitors no longer lie about status:

- PENDING stays until first heartbeat (no auto-promote to UP)
- LATE state (amber) when overdue but within grace period
- DOWN only after grace period expires
- Grace period = interval/2, minimum 60s

RecordHeartbeat now handles all transitions:
- PENDING → UP (first heartbeat, logged)
- LATE → UP (late arrival, logged)
- DOWN → UP (recovery, alert + state change persisted)

TUI updates:
- LATE rendered in amber/warning color
- Status bar shows LATE count separately
- Tab badge shows ⚠ for late monitors
- Sort order: DOWN > LATE > UP > PENDING > PAUSED
- Detail panel shows error for LATE monitors

Inspired by Healthchecks.io state machine (new/up/grace/down).
2026-05-27 19:56:50 -04:00
lerko bc3a44beac feat: show error reason when monitors go DOWN
CI / test (pull_request) Successful in 2m42s
CI / lint (pull_request) Successful in 1m11s
CI / vulncheck (pull_request) Successful in 51s
Propagate check failure reasons through the entire stack:
- Checker captures specific errors (DNS, timeout, HTTP status, SSL, etc.)
- Engine tracks LastError, StatusChangedAt, LastSuccessAt per monitor
- State transitions persisted to new state_changes table
- Detail panel shows error reason, HTTP code, state duration, last
  success time, and last 5 state change events
- Monitor table shows inline error preview for DOWN services
- Alert messages include error reason
- Probe nodes forward error reasons to leader

15 files changed across models, checker, engine, store, TUI, and probes.
2026-05-27 19:32:30 -04:00
lerko 986f9f1d55 fix(security): phase 4 code quality and low-severity fixes
CI / test (pull_request) Successful in 4m24s
CI / lint (pull_request) Successful in 1m1s
- Fix limitStr to handle multi-byte UTF-8 characters correctly
- Sanitize log messages: strip ANSI escape sequences and newlines
- URL-encode probe node_id instead of string concatenation
- Fix follower resp.Body leak on non-200 responses
- Make SSH host key path configurable via UPTOP_SSH_HOST_KEY env var
- Add HTTP method checks on GET-only endpoints (405 for wrong methods)
- Extract magic numbers into named constants across monitor/store/server
- Standardize error output to stderr for all startup errors
2026-05-26 17:25:47 -04:00
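
Two of the fixes above, rune-safe truncation and log sanitizing, can be sketched as follows. The limitStr name comes from the commit message, but its signature and behavior here are assumptions, as are sanitizeLog and the choice to replace newlines with spaces. The pattern covers CSI escape sequences only.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// limitStr truncates to at most max runes, so a multi-byte character is
// never cut in half. Assumed signature; no ellipsis is appended.
func limitStr(s string, max int) string {
	if max <= 0 {
		return ""
	}
	r := []rune(s)
	if len(r) <= max {
		return s
	}
	return string(r[:max])
}

// ansiCSI matches CSI sequences such as "\x1b[31m" (colors, cursor moves).
var ansiCSI = regexp.MustCompile(`\x1b\[[0-9;?]*[ -/]*[@-~]`)

// newlines collapses CRLF, LF, and CR into a single space each.
var newlines = strings.NewReplacer("\r\n", " ", "\n", " ", "\r", " ")

// sanitizeLog strips escape sequences and flattens the message to one line.
func sanitizeLog(msg string) string {
	return newlines.Replace(ansiCSI.ReplaceAllString(msg, ""))
}

func main() {
	fmt.Println(limitStr("日本語サイト", 3))
	fmt.Printf("%q\n", sanitizeLog("\x1b[31mDOWN\x1b[0m\nnext line"))
}
```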
lerko 9d12e3ecf1 chore: complete rename from go-upkeep to uptop
CI / test (pull_request) Successful in 4m26s
CI / lint (pull_request) Successful in 1m11s
- Module path: gitea.lerkolabs.com/lerko/uptop
- Binary: cmd/uptop/
- All imports updated to full module path
- Env vars: UPKEEP_* → UPTOP_*
- Prometheus metrics: upkeep_* → uptop_*
- Default DB: uptop.db
- Docker image: lerko/uptop
- All docs, compose files, CI updated

Only remaining "go-upkeep" reference is the fork attribution in README.
2026-05-24 20:20:35 -04:00
lerko fee84c9363 fix(tui): tighten zebra row contrast for Tokyo Night and Gruvbox
CI / test (pull_request) Successful in 4m48s
CI / lint (pull_request) Successful in 1m11s
Previous ZebraBg was too far from Bg, washing out text on those
themes. Reduced to a 2-step shift for subtle row alternation.
2026-05-24 19:19:51 -04:00
lerko 87edd4aa40 feat(tui): swap light theme for Tokyo Night and Gruvbox
Light theme doesn't work well on dark terminals. Replace with
two proven dark palettes. Now 5 themes: Flexoki Dark, Tokyo Night,
Catppuccin Mocha, Nord, Gruvbox.
2026-05-24 19:10:29 -04:00
lerko 602f1b2c52 feat(tui): add theme system with 4 curated palettes
Flexoki Dark (default), Flexoki Light, Catppuccin Mocha, Nord.
Press T to cycle themes; selection persists in preferences.

All hardcoded colors replaced with theme-driven values.
Dedicated ZebraBg per theme for subtle row striping.
2026-05-24 19:05:40 -04:00
lerko 0a56f01929 fix(tui): guard max retries validator for group type
CI / test (pull_request) Successful in 4m40s
CI / lint (pull_request) Successful in 1m1s
Consistent with interval/timeout validators that already skip for
group monitors. Prevents potential validation block if field is
cleared while editing.
2026-05-24 17:45:19 -04:00
lerko b5b9cc81a5 fix(tui): skip irrelevant field validation by monitor type
URL, SSL threshold, and port validators blocked form progression
when editing monitors that don't use those fields (e.g. ping monitors
failing URL validation, non-SSL sites failing threshold check).

Scope each validator to fire only for its relevant monitor type.
2026-05-24 17:38:40 -04:00
lerko 359cff7292 chore: add golangci-lint config and fix all lint issues
Add .golangci.yml enabling errcheck, staticcheck, govet, gosec,
ineffassign, and unused linters. Fix 66 issues across 16 files:
- Check all unchecked errors (errcheck)
- Use HTTP status constants instead of numeric literals (staticcheck)
- Replace deprecated LineUp/LineDown with ScrollUp/ScrollDown (staticcheck)
- Convert sprintf+write patterns to fmt.Fprintf (staticcheck)
- Add ReadHeaderTimeout to http.Server (gosec)
- Remove unused types and functions (unused)
- Add nolint comments for intentional patterns (InsecureSkipVerify,
  math/rand for jitter, dialect-only SQL formatting)
2026-05-23 22:02:06 -04:00
lerko fb11e9ba85 fix(tui): stable monitor count and universal group icons
Site count in tab label and footer now reflects total monitors
(excluding groups) regardless of collapse state. Down count also
computed from all sites so collapsed groups with down children
still surface in the badge. Replaced Nerd Font folder glyphs
with standard Unicode triangles for cross-font compatibility.
2026-05-23 11:01:34 -04:00
lerko e84b64f8ed feat(tui): zebra striping, detail breadcrumb, sparkline stats, collapse persistence
Add alternating row backgrounds for easier table scanning. Detail panel
now shows breadcrumb path (Sites > Group > Name) and min/avg/max latency
stats below the sparkline. Group collapse state persists across restarts
via new preferences table in both SQLite and Postgres.
2026-05-22 20:53:23 -04:00
lerko 88e4f0ed69 fix(tui): group selection highlight, layout constants, group history graphs
Group rows now show selection background when navigated to. Layout
chrome extracted to named constants to prevent viewport drift. Groups
display aggregate history as dot sparkline (●) distinct from site
bar sparklines, with uptime computed from active children only.
Paused and maintenance children excluded from group aggregates.
2026-05-22 20:26:49 -04:00
lerko dc672d6cba fix(tui): exclude maintenance'd monitors from down count and pulse
Sites badge, status line, and pulse indicator now skip monitors under
maintenance when counting DOWN — consistent with group behavior.
2026-05-22 19:25:27 -04:00
lerko d437f54797 fix(tui): constrain form height to terminal and forward resize events
Forms overflowed past terminal because huh didn't know about the
surrounding chrome (header, footer, padding). Now sets WithHeight()
on every render and forwards WindowSizeMsg during form state.
2026-05-22 19:06:27 -04:00
lerko b146f34d19 feat: add incident management and maintenance windows
Maintenance windows suppress alerts during planned downtime while checks
continue running. Incidents provide informational tracking. Supports
targeting all monitors, single monitor, or group (applies to children).

New Maint tab in TUI with create/end/delete. Status page, JSON API, and
Prometheus metrics all reflect maintenance state.
2026-05-22 18:45:02 -04:00
lerko ea401136a9 fix(tui): correct viewport sizing and dynamic chrome calculation
Replace hardcoded row offset with counted chrome lines, account for
filter bar, and fix log viewport dimensions.
2026-05-22 18:19:08 -04:00
lerko 52c85b11b8 fix(tui): compute uptime from windowed statuses, not running counters
2026-05-16 14:58:34 -04:00
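
Computing uptime from the status window rather than from lifetime counters means the percentage reflects only the checks currently held in the history buffer. A minimal sketch, assuming a hypothetical windowUptime helper, a boolean per check, and 0 as the result for an empty window:

```go
package main

import "fmt"

// windowUptime returns the percentage of successful checks in the window.
// An empty window yields 0 here, which is an assumption.
func windowUptime(window []bool) float64 {
	if len(window) == 0 {
		return 0
	}
	up := 0
	for _, ok := range window {
		if ok {
			up++
		}
	}
	return 100 * float64(up) / float64(len(window))
}

func main() {
	fmt.Println(windowUptime([]bool{true, true, false, true})) // 75
}
```

An outage that has scrolled out of the window no longer drags the figure down, so the number matches what the sparkline shows.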
lerko 1eddb851b0 feat(tui): add type icons to sites table
Arrow-style icons per monitor type plus Nerd Font folder icons for
groups (closed when collapsed, open when expanded):
  → http, ↓ push, ↔ ping, ⊡ port, ◆ dns, / group
2026-05-16 14:35:38 -04:00
lerko adf46a1654 fix(tui): increase history buffer to 60 so sparkline fills completely
2026-05-16 14:01:25 -04:00
lerko fc7b6f72e1 fix(tui): sparkline right-aligned — current time at right edge, dots fill left
2026-05-16 13:57:41 -04:00
lerko 1917540731 fix(tui): sparkline now spans full column width
2026-05-16 13:49:20 -04:00
lerko f01533080f feat(tui): split available width evenly between NAME and HISTORY columns
2026-05-16 13:43:34 -04:00
lerko cc9dc24892 fix(tui): sort children by ID before status to prevent map-order shuffling
2026-05-16 13:36:49 -04:00
lerko 426c38ea94 fix(tui): use stable sort to prevent site list shuffling each tick
2026-05-16 13:33:20 -04:00
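
The two shuffling fixes above rest on the same idea: Go map iteration order is random, so the list needs a deterministic base order before a stable sort by status. A sketch with hypothetical types, simplified to "DOWN first, everything else after":

```go
package main

import (
	"fmt"
	"sort"
)

// site is a hypothetical stand-in for the project's monitor type.
type site struct {
	ID   int
	Down bool
}

// orderSites flattens the map, fixes a base order by ID, then stable-sorts
// DOWN entries to the top so ties keep their ID order on every tick.
func orderSites(byID map[int]site) []site {
	out := make([]site, 0, len(byID))
	for _, s := range byID {
		out = append(out, s)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	sort.SliceStable(out, func(i, j int) bool { return out[i].Down && !out[j].Down })
	return out
}

func main() {
	sites := map[int]site{
		3: {ID: 3}, 1: {ID: 1, Down: true}, 2: {ID: 2}, 4: {ID: 4, Down: true},
	}
	for _, s := range orderSites(sites) {
		fmt.Print(s.ID, " ")
	}
	fmt.Println() // 1 4 2 3
}
```

Without the ID pass, the stable sort would faithfully preserve whatever order the map happened to produce, and rows with equal status would still move around between ticks.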
lerko 22c6022121 feat(tui): DOWN-first sort, health pulse, and site filter
- DOWN/SSL EXP monitors float to top of sites list
- Pulse indicator turns red when any monitor is down, green when healthy
- Press / to filter sites by name, Enter to lock filter, Esc to clear
- Active filter shown in status bar
2026-05-16 13:28:37 -04:00
lerko f2ea0dc758 feat(tui): bordered modals, welcome state, and dynamic name width
- Delete confirmation wrapped in rounded border box with danger color
- Empty sites view shows styled welcome box with onboarding hint
- NAME column width scales with terminal width (13-40 chars)
2026-05-16 12:56:09 -04:00
lerko 3bc8e31b89 fix(tui): make status bar and tab badges visible
- Tab badges now always show count (Sites (12)), not just on failure
- Status bar UP count uses green/red coloring instead of subtle gray
2026-05-16 12:28:09 -04:00
lerko 769954c8f5 feat(tui): add status bar, tab badges, and detail panel
Polish pass for TUI professionalism:
- Status bar replaces generic footer with live stats (UP/DOWN count,
  online probes) plus contextual key hints
- Tab badges show DOWN count on Sites tab and offline count on Nodes tab
- Detail panel (press i) shows full monitor info: URL, latency, uptime,
  SSL, probe results, sparkline — without entering edit mode
2026-05-16 12:25:46 -04:00
lerko 0396acdc59 feat(cluster): add region affinity, Nodes TUI tab, and probe metrics
Phase 3 of distributed probing:
- Add regions column to sites table for per-monitor probe affinity
- Region-filtered probe assignments (empty regions = all probes)
- New Nodes TUI tab showing connected probes with status/region/last-seen
- Regions input field in site form for configuring probe affinity
- Config-as-code support for regions (export/import/diff)
- Prometheus upkeep_probe_up metric with per-node labels
- Reindex TUI tabs: Sites, Alerts, Logs, Nodes, Users
2026-05-16 11:50:16 -04:00
lerko 9e5bb74c5c feat(tui): expose HTTP method and accepted status codes in monitor form
DB fields existed but were never surfaced in the TUI. Adds an HTTP
Settings form group with method select (7 methods) and accepted
codes input, visible only for HTTP monitors.
2026-05-15 15:42:51 -04:00