viewLogsTab filtered logViewport.View() — the visible window — so the
entry count showed the window size and hidden lines reappeared while
scrolling. Filter and render now happen at content-set time from
engine.GetLogs(); the view only reads stored counts.
1. UpdateSite handles token-read Scan error instead of ignoring it.
sql.ErrNoRows (nonexistent site) passes through; real DB errors
surface.
2. RunCheck allowPrivate changed from variadic to real bool param.
Dead maxRequestBody duplicate removed from sqlstore.go.
3. Footer help bar documents [Space] for group collapse.
4. adjustCursor unified with clampCursor — one clamping path
instead of two with different semantics.
5. Compose cluster/probe example files annotate hardcoded secrets
with "EXAMPLE ONLY — rotate before use".
6. huhForm.WithHeight moved from View() to handleResize — no longer
mutates form state during render.
7. maxTableRows recalculated on filter enter/exit via recalcLayout()
— was only recalculated on resize, causing off-by-one when the
filter bar appeared/disappeared.
1. Rate limiter cleanup goroutine now stoppable via Stop() channel
instead of looping forever. Prevents goroutine leak in tests.
2. Dead WindowSizeMsg branch in handleFormMsg removed — top-level
Update handles resize before forms see it.
3. Probe results sorted by node ID — map iteration no longer
reorders rows every render.
4. fmtAlertConfig takes models.AlertConfig directly instead of an
anonymous struct the caller builds inline.
5. Backspace no longer aliases delete — d is the documented key.
Prevents accidental delete-confirm on habitual backspace.
6. SLA daily buckets use time.Date day arithmetic instead of
Add(-i*24h) — lands on midnight correctly across DST transitions.
1. Delete braille.go + braille_test.go — dead code, only referenced
by its own test. Can be re-added when latency charts are built.
2. Hoist duplicate `const sparkWidth = 40` (update.go + view_detail.go)
to package-level `detailSparkWidth`. Click-index resolution and
rendering now share one constant.
3. Remove tea.ClearScreen on every resize — caused full-screen flash
during continuous resizes. ctrl+l manual clear kept.
Cursor tracked by site ID instead of positional index. When the
list re-sorts every tick (sites change status), the selection stays
on the same monitor instead of silently jumping to whatever now
occupies that index position.
q now means "back" in detail, history, SLA, and alert-detail views
— consistent with muscle memory from navigating deeper views.
Only the dashboard q quits the app. ctrl+c always quits from
anywhere.
1. Alertless monitors no longer spam error logs — triggerAlert
returns early when alertID <= 0.
2. HTTP response body drained before close — enables connection
reuse via keep-alive instead of fresh TCP+TLS per check.
3. /api/backup/export enforces GET — was the only endpoint
accepting any HTTP method.
4. limitStr guards against max < 3 — prevents negative slice
index panic on very narrow terminals.
5. Filter input accepts multibyte characters — len(msg.Runes)
instead of len(msg.String()) for proper Unicode support.
6. Startup warning corrected — with no UPTOP_CLUSTER_SECRET,
endpoints reject (401), not accept. Warning now says so.
7. UPTOP_KEYS file open failure logged — was silently swallowed,
leaving operators with no admin seeded and no message.
Site now embeds SiteConfig (22 persistent fields) and SiteState
(11 ephemeral runtime fields). Field access unchanged via promotion
— site.Name and site.Status still work.
Store layer deals exclusively in SiteConfig — the DB never sees
runtime state. Engine's liveState keeps full Site composites.
UpdateSiteConfig reduced from 11-line field-by-field copy to
`existing.SiteConfig = cfg`.
RunCheck takes SiteConfig (only needs config fields). Checker is
now statically prevented from reading/writing runtime state.
Backup.Sites changed to []SiteConfig — exports no longer carry
zero-valued runtime fields. Import backward-compatible (json
ignores unknown fields).
New internal/store/storetest/mock.go provides BaseMock implementing
the full Store interface with no-op defaults and optional Func field
overrides. Each test file embeds BaseMock and shadows only the methods
it needs.
Removes ~400 lines of duplicated stub methods across 6 test files.
Adding a Store method now requires one addition (BaseMock) instead
of editing 6 files.
Replace ~150 bare status string comparisons with typed models.Status
constants (StatusUp, StatusDown, StatusPending, StatusLate, StatusStale,
StatusSSLExp). Single IsBroken() method replaces the duplicated
isBroken lambda in monitor.go and isDown function in sla.go.
Adding a new status value (e.g. DEGRADED) now requires one constant
definition instead of grep-and-pray across 16 files.
CheckResult.Status stays string — the checker is the boundary between
raw protocol results and typed status. Cast happens at the edge in
handleStatusChange.
Every Store interface method (except Close) now takes context.Context
as first parameter. All 54 db.Query/Exec/QueryRow calls in SQLStore
replaced with their *Context variants. DB operations now respect
cancellation and deadlines.
Context sources by caller:
- Engine dbWriter/poll/pruner: engine ctx from Start()
- HTTP handlers: r.Context()
- config.Apply/Export: caller-provided ctx
- TUI/main.go init: context.Background()
RunCheck and all sub-checks (HTTP/ping/port/DNS) accept parent ctx.
HTTP checks now inherit shutdown cancellation instead of rooting in
context.Background(). dbWrite.exec takes ctx so the writer goroutine
can cancel stuck DB operations.
DeleteSite/ImportData use BeginTx(ctx) instead of Begin().
The alert detail panel dumped a.Settings raw — SMTP passwords, bot
tokens, API keys on screen and into any recording or screen share. The
table view leaked the PagerDuty routing key, Pushover user key, and
full discord/slack/webhook URLs (the URL path is the credential).
The redaction allowlist moves from internal/server to
models.RedactAlertSettings so the backup export and the TUI render
through one policy. Panel keys are sorted so rows stop reshuffling
every tick; webhook URLs show scheme+host only; keys show
first4…last4.
Deletes, pause toggles, maintenance end, theme/collapse prefs, and all
four form submits wrote to the store synchronously on the UI goroutine;
with busy_timeout=5000 a contended DB froze input for up to 5s.
Writes now run through a writeCmd helper returning writeDoneMsg. The
in-memory engine/model mutations stay in Update so rows react
instantly; the reply logs failures and reloads tab data, so the UI
converges on what was actually written. Closures capture snapshotted
values only — never the model.
The #101 refactor stopped at the tick path; 'h' history and the SLA
view still queried state changes synchronously in Update, freezing the
UI for up to busy_timeout on a contended DB. Both now load through
Cmds with loading placeholders.
Also closes the remaining staleness holes in the async data flow:
- tabDataMsg carries a sequence number; out-of-order replies from
slower earlier loads are dropped instead of overwriting newer data
- history/SLA replies are dropped when the user has navigated to a
different site or period
- the open detail panel refreshes on the tab-data cadence instead of
loading once on entry and going stale
- initSiteHuhForm reads the m.alerts cache instead of hitting the store
applyTheme mutated ~18 package-global lipgloss styles while every SSH
session's tea.Program read them concurrently from its own goroutine.
Pressing T or opening a new connection raced other sessions' View and
bled themes across users.
Styles now live in an immutable per-Model struct built by newStyles;
free formatter helpers that consumed the globals became Model methods.
The TUI ran database queries on the UI goroutine: handleTick called
refreshData every second, which issued four blocking SQLite queries
(GetAllAlerts/GetAllUsers/GetAllNodes/GetAllMaintenanceWindows) and
swallowed their errors; viewDetailPanel ran GetStateChanges — a DB query
— inside View(), on every render (tick, keypress, mouse). A slow disk
stalled input and animation.
Split refreshData into refreshLive() (in-memory engine copies only —
sites + logs — safe every tick) and loadTabDataCmd(), a tea.Cmd that
loads the four DB-backed tables off the UI goroutine and returns a
tabDataMsg. handleTick now refreshes live state every tick but dispatches
the tab-data load only when older than tabRefreshTTL (5s), so tab-bar
counts stay fresh without a per-second query storm. Errors surface to the
log instead of being dropped, and a transient failure keeps the previous
data rather than blanking the view.
The detail panel's state-change history is loaded once on enter via
loadDetailCmd and cached on the model; viewDetailPanel reads the cache,
so View no longer touches the database. Init kicks an initial load so the
dashboard isn't empty on the first frame, and the bare time.Time tick
message is now a named tickMsg (no cross-message collision). The
test-alert handler's raw goroutine becomes a tea.Cmd.
Adds the package's first Update()-driven tests: tab-data load + apply,
error-keeps-previous-data, detail cache with a store-hit counter proving
View does zero IO across repeated renders, and the handleTick throttle.
Full suite green under -race; golangci-lint clean.
Click any sparkline character to see data point details — approximate
time, latency, and up/down status. Esc dismisses tooltip without
leaving detail view. Uses existing BubbleZone infrastructure with
zone-relative coordinate math for index resolution.
Replace misleading relative-only sparkline with dual-channel design:
bar height uses relative scaling (shows stability and anomalies),
color+brightness uses absolute thresholds (shows fast vs slow).
- Add brightness gradient within color bands (dim→bright as latency
increases toward the next threshold)
- Pass row background through sparkline rendering so zebra stripes
and selection highlights carry through ANSI sequences
- Cap sparkline width to 60 (matches maxHistoryLen) and column
width to 62 to eliminate trailing dead space
- Quiet group sparkline: subtle dots for healthy, bold red for down
- Add braille subpixel canvas (ported from meridian) for future
multi-row graph use
- STATUS column shows icon + clean state only (▲ UP, ▼ DOWN, ◆ LATE,
◆ STALE, ◇ PAUSED, ◼ MAINT, ○ PENDING). Error classification
(DNS/TLS/TMO) removed from STATUS — stays in NAME inline hint.
- Detail panel Last Check shows relative time ("12s ago") instead of
absolute timestamp.
- Extract shared fmtTimeAgo() to format.go, consolidate duplicate
formatters in tab_alerts.go and tab_nodes.go.
Sites table with many rows exceeded the fixed content height,
pushing footer down. MaxHeight now clips content that overflows
while Height still pads shorter content upward.
Each tab returned different leading newlines (Sites/tables: 1,
Logs: 3, empty states: varies). TrimSpace content before layout
so JoinVertical controls all spacing. Remove leading \n from
footer since JoinVertical handles gaps.
Replace string concatenation layout with lipgloss.JoinVertical
and fixed-height content area. Footer now stays at the same
vertical position regardless of tab content height. Uses
lipgloss.Height() to dynamically measure header/footer and
fill remaining space.
Logs were dumping all lines directly, pushing the dashboard
footer off screen. Now uses logViewport with proper height
accounting so footer stays visible and scrolling works.
- Extract divider() and emptyState() helpers to format.go
- All empty states now use bordered box with accent color
- Detail and alert detail panels get header/section dividers
- SLA label width 14→16 to match detail/alert panels
- Logs key hints moved from content to dashboard footer
- History/SLA panels use shared divider helper
- Show version in dashboard footer (wired from goreleaser ldflags)
- Cap name column at 35, raise sparkline minimum to 15 chars
- Preserve zebra background on group rows (was lost by style override)
- Group sparkline uses bullet • instead of heavy circle ●
Full-screen SLA report accessible via [s] from detail panel.
Computes uptime%, downtime, outage count, longest outage, MTTR,
and MTBF from state_changes table. Includes daily breakdown with
bar chart, switchable time periods (24h/7d/30d/90d), and
scrollable viewport. LATE/STALE treated as UP for SLA purposes.
New intermediate state between LATE and DOWN at the midpoint of
the grace period. Gives operators earlier warning that a push
monitor has gone quiet. Includes dedicated orange theme color
across all 5 themes and proper styling in dashboard, detail
panel, and history view.
Support Opsgenie Alert API v2 with US/EU endpoint selection,
configurable priority (P1-P5), and GenieKey auth. TUI form
includes API key, priority picker, and EU instance toggle.
Update() routed form and confirm-delete states before handling
time.Time ticks, so those handlers swallowed the tick command and
permanently broke the refresh loop. After opening any form or
delete dialog, the TUI stopped auto-refreshing until restarted.
Move time.Time and WindowSizeMsg handling before state dispatch
so ticks always fire regardless of view state.
Also fix fmtRetries off-by-one: FailureCount=1 displayed as 0/N
instead of 1/N due to an erroneous subtract-one.
- Replace deprecated LineUp/LineDown/HalfViewUp/HalfViewDown with
ScrollUp/ScrollDown/HalfPageUp/HalfPageDown
- Use tagged switch for mouse button dispatch
- Use fmt.Fprintf instead of WriteString(Sprintf)
Full-screen scrollable history view accessible via [h] from detail
panel. Shows all state transitions with computed outage durations,
event density sparkline for flapping detection, and summary stats.
- Detail panel STATE CHANGES now shows outage duration per recovery
- Event density sparkline highlights flapping periods
- Summary footer: event count, outage count, avg outage duration
- Vim-style navigation (j/k/g/G) + mouse scroll in history view
Categorize raw error strings into DNS/TCP/TLS/HTTP/ICMP/TMO/PRIV
so users get instant triage from the monitor list without opening
the detail panel.
- Status column shows DOWN:DNS, DOWN:TLS, DOWN:HTTP, etc.
- Inline NAME column errors prefixed with category tag [DNS], [TLS]
- Detail panel shows connection chain checklist for HTTP monitors
(✓ DNS → ✓ TCP → ✗ TLS → · HTTP) pinpointing failure layer
- All display-side only — no database or model changes
tui.go (1032→164) and tab_sites.go (993→482) violated "small functions"
and "testable in isolation" standards. Extracted 6 new files by concern:
- format.go: pure formatting functions (fmtLatency, fmtUptime, etc.)
- sparkline.go: sparkline rendering (latency, heartbeat, group)
- update.go: Update method decomposed into 15 named handlers
- view_dashboard.go: View, dashboard composition, tab bar, footer
- view_detail.go: site detail panel
- data.go: data refresh with extracted sortSitesForDisplay/filterSites
Added 17 unit tests for the newly-testable pure functions covering
format, sparkline, sort ordering, and filter logic. No behavioral
changes — strict move-and-extract refactor.
Move Go module from gitea.lerkolabs.com/lerko/uptop to
gitea.lerkolabs.com/lerkolabs/uptop. Updates all imports,
go.mod, goreleaser owner, and README links.
- Add 6 TUI screenshots to assets/ (monitors, alerts, logs, nodes, detail, theme)
- Rewrite README with hero image, badges, collapsible install sections
- Rewrite changelog to match actual CalVer tag history
- VHS tooling extracted to lerko/uptop-vhs
Reviewed-on: lerko/uptop#32
Track alert delivery health at runtime:
- AlertHealth struct: LastSendAt, LastSendOK, LastError, SendCount, FailCount
- triggerAlert records success/failure after each Send()
- Health data exposed via GetAlertHealth() for TUI
Alerts tab enriched:
- Health dot column: green (OK), red (failed), gray (never sent)
- LAST SENT column: relative time ("2m ago", "never")
- [t] key sends test notification through selected channel
Inspired by Grafana's contact point health columns.
Push monitors no longer lie about status:
- PENDING stays until first heartbeat (no auto-promote to UP)
- LATE state (amber) when overdue but within grace period
- DOWN only after grace period expires
- Grace period = interval/2, minimum 60s
RecordHeartbeat now handles all transitions:
- PENDING → UP (first heartbeat, logged)
- LATE → UP (late arrival, logged)
- DOWN → UP (recovery, alert + state change persisted)
TUI updates:
- LATE rendered in amber/warning color
- Status bar shows LATE count separately
- Tab badge shows ⚠ for late monitors
- Sort order: DOWN > LATE > UP > PENDING > PAUSED
- Detail panel shows error for LATE monitors
Inspired by Healthchecks.io state machine (new/up/grace/down).
Propagate check failure reasons through the entire stack:
- Checker captures specific errors (DNS, timeout, HTTP status, SSL, etc.)
- Engine tracks LastError, StatusChangedAt, LastSuccessAt per monitor
- State transitions persisted to new state_changes table
- Detail panel shows error reason, HTTP code, state duration, last
success time, and last 5 state change events
- Monitor table shows inline error preview for DOWN services
- Alert messages include error reason
- Probe nodes forward error reasons to leader
15 files changed across models, checker, engine, store, TUI, and probes.