Compare commits

41 Commits

Author SHA1 Message Date
lerko adf8fed44f docs(changelog): regenerate at HEAD before v0.1.0 tag
CI / test (pull_request) Successful in 1m49s
CI / lint (pull_request) Successful in 1m17s
CI / vulncheck (pull_request) Successful in 51s
Release Binaries / release (push) Successful in 2m22s
Release Docker / docker (push) Successful in 11m3s
2026-06-17 14:00:05 -04:00
lerko c2bfa5ad82 fix: resolve 4 tag-blocking issues for v0.1.0
CI / test (pull_request) Successful in 1m43s
CI / lint (pull_request) Successful in 1m11s
CI / vulncheck (pull_request) Successful in 51s
- README/CONTRIBUTING quick start: add UPTOP_ADMIN_KEY so SSH works on
  fresh DB, fix single-file go run path that doesn't compile
- apply --dry-run: assign placeholder IDs for new alerts and groups so
  resolveAlertID succeeds when monitors reference not-yet-created alerts
- deploy/*.yml: switch user-facing compose files from broken build
  context to image: lerkolabs/uptop:latest, fix dev context to ..
2026-06-16 20:32:41 -04:00
lerko 2e07e16b45 refactor(tui): restructure site form to 2 type-aware pages
CI / test (pull_request) Successful in 2m2s
CI / lint (pull_request) Successful in 1m17s
CI / vulncheck (pull_request) Successful in 56s
Replace 4-page paginated form (17 fields for HTTP) with a 2-page
type-aware layout. Page 1 shows core fields + type-specific target
(URL for HTTP, Hostname for ping, etc). Page 2 shows configuration
with pre-filled defaults. Group type gets 1 page.

Form rebuilds dynamically when monitor type changes, preserving
all entered values via pointer-bound siteFormData. Focus returns
to the Type select after rebuild so users can continue forward.
WithWidth set explicitly on rebuild to prevent placeholder truncation.
2026-06-16 19:39:52 -04:00
lerko dd34da4d67 fix(tui): sync selectedID on click so refreshLive doesn't revert cursor
handleClick set m.cursor but returned without calling syncSelectedID,
causing the next refreshLive tick to snap the cursor back to the
previously selected site.
2026-06-16 16:58:56 -04:00
lerko de51dde6e6 docs(changelog): regenerate full history for v0.1.0
CI / test (pull_request) Successful in 1m44s
CI / lint (pull_request) Successful in 1m6s
CI / vulncheck (pull_request) Successful in 57s
Replaces the stale CalVer changelog (last updated 2026-06-02, all
referenced tags since deleted) with git-cliff output simulated against
the v0.1.0 tag — matches what release-binaries.yml will publish as
release notes.
2026-06-12 19:38:55 -04:00
lerko e2024bcab1 fix(release): drop body-grep Security grouping, map polish type in cliff
The body=".*security" parser ran before the docs/chore skip rules, so
any commit merely mentioning security in its body landed in the
Security section (e.g. a docs commit and the CalVer->SemVer ci commit).
Real security fixes never reached it — ^fix matches first — so the rule
only ever produced miscategorized entries. Removed.

polish(...) commits had no parser, and git-cliff defaults the group to
the raw type — rendering a stray lowercase "polish" section. Mapped to
Changed.
2026-06-12 19:38:55 -04:00
lerko fb4e14ecd1 docs(readme): fix broken quick start, stale binary targets, canonical origin
- go run cmd/uptop/main.go broke when PR #108 split config.go out of
  main.go — use ./cmd/uptop package path (same fix #104 applied to
  Dockerfile and goreleaser)
- binary section claimed Linux amd64 only; goreleaser ships
  linux/darwin/windows x amd64/arm64 + deb/rpm since #104
- state canonical repo (Gitea) vs mirror (GitHub) up front
- UPTOP_ADMIN_KEY is the only documented first-run path
2026-06-12 19:38:55 -04:00
lerko 9ee5908af5 fix(version): fall back to embedded build info when ldflags absent
CI / test (pull_request) Successful in 1m49s
CI / lint (pull_request) Successful in 1m11s
CI / vulncheck (pull_request) Successful in 51s
go install module@tag compiles without GoReleaser's ldflags, so
--version reported "dev" and the TUI footer matched. The module
version and vcs stamps are embedded in every binary; read them via
debug.ReadBuildInfo when the ldflags defaults are untouched. Release
builds unchanged — ldflags still win.
2026-06-12 18:41:43 -04:00
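The fallback this commit describes could look roughly like the sketch below. It is not the project's code: the `version` variable, its `"dev"` default, and the `resolveVersion` helper are assumed names. Only `debug.ReadBuildInfo` and the `Main.Version` field come from the standard library.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// version is what a release build would overwrite with
// -ldflags "-X main.version=..."; "dev" is the untouched default (assumed).
var version = "dev"

// resolveVersion prefers the ldflags value and only falls back to the
// module version embedded in the binary when the default is untouched.
func resolveVersion(ldflagsVersion string, info *debug.BuildInfo, ok bool) string {
	if ldflagsVersion != "dev" {
		return ldflagsVersion // release builds: ldflags still win
	}
	if !ok || info == nil {
		return ldflagsVersion // no build info available
	}
	// "(devel)" is recorded for builds made from a local working tree.
	if v := info.Main.Version; v != "" && v != "(devel)" {
		return v
	}
	return ldflagsVersion
}

func main() {
	info, ok := debug.ReadBuildInfo()
	fmt.Println(resolveVersion(version, info, ok))
}
```

Passing the build info in as a parameter keeps the precedence rule testable without a real `go install` build.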
lerko eff67332aa fix(release): exclude rc tags from cliff tag_pattern so launch notes span full history
CI / test (pull_request) Successful in 1m48s
CI / lint (pull_request) Successful in 1m11s
CI / vulncheck (pull_request) Successful in 51s
ignore_tags drops rc-tagged commits from the final tag's section instead
of folding them forward — a simulated v0.1.0 rendered zero commits.
Excluding rc tags from tag_pattern makes finals span back to the last
real tag (full history for v0.1.0, verified 8.8KB in a scratch clone)
and rc tags render [Unreleased] with everything pending.
2026-06-12 17:47:48 -04:00
lerko dc4c5fdf8a fix(release): remove tagged scan image in cleanup step
CI / test (pull_request) Successful in 1m51s
CI / lint (pull_request) Successful in 1m12s
CI / vulncheck (pull_request) Successful in 51s
Release Binaries / release (push) Successful in 2m18s
Release Docker / docker (push) Successful in 10m52s
2026-06-12 17:21:42 -04:00
lerko 96eb3e8185 fix(release): scan gates docker push, rc tags spare :latest, mirror waits for stable assets
CI / test (pull_request) Successful in 1m52s
CI / lint (pull_request) Successful in 1m16s
CI / vulncheck (pull_request) Successful in 50s
rc.2 proved the grype gate was decorative — buildx pushed before the
scan ran, so a red run still shipped the image (and rc tags moved
:latest). Build amd64 locally, scan that, then run the multi-arch push
from the warm builder cache. :latest now only moves on non-rc tags.

mirror-release: poll until the Gitea asset count is stable across two
polls (GoReleaser uploads sequentially — assets>0 could mirror a partial
set) and stretch the timeout to 20 min since the release run can queue
behind the Docker job on the single runner.
2026-06-12 17:20:48 -04:00
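The "stable across two polls" rule from the mirror-release change is implemented in shell in the workflow. The Go sketch below restates the same loop for illustration only: `waitForStableCount` and its callback are hypothetical names, not part of the repository.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitForStableCount polls until count() reports the same non-zero value on
// two consecutive successful polls. A failed poll leaves the previous count
// untouched, mirroring the shell loop's behaviour.
func waitForStableCount(count func() (int, error), attempts int, interval time.Duration) (int, error) {
	prev := 0
	for i := 0; i < attempts; i++ {
		n, err := count()
		if err == nil {
			if n > 0 && n == prev {
				return n, nil
			}
			prev = n
		}
		time.Sleep(interval)
	}
	return 0, errors.New("asset count never stabilized")
}

func main() {
	counts := []int{0, 3, 7, 7} // simulated sequential upload progress
	i := 0
	n, err := waitForStableCount(func() (int, error) {
		c := counts[i]
		if i < len(counts)-1 {
			i++
		}
		return c, nil
	}, 40, time.Millisecond)
	fmt.Println(n, err)
}
```

A plain `assets > 0` check would have returned on the second poll with 3 assets; the stability check waits until the count stops moving.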
lerko 37bf443e29 fix(release): suppress wish GHSA alias in grype, fold rc tags into launch notes
CI / test (pull_request) Successful in 1m44s
CI / lint (pull_request) Successful in 1m11s
CI / vulncheck (pull_request) Successful in 51s
Release Binaries / release (push) Successful in 2m9s
Release Docker / docker (push) Successful in 10m18s
The existing .grype.yaml ignore listed the wish SCP traversal only by CVE
id; grype's db now matches it as GHSA-xjvp-7243-rg9h and ignores are
exact-id, so the rc.2 scan gate tripped on an already-triaged finding.
List both ids. Vulnerable SCP middleware is never compiled in; real fix
is the charm v2 stack migration (#126).

cliff.toml ignore_tags folds rc tags into the next real release so
v0.1.0's notes cover full history instead of commits-since-rc.2.
2026-06-12 17:02:55 -04:00
lerko f53dfa1c4c fix(release): repair pipeline defects found in v0.1.0-rc.1 rehearsal
CI / test (pull_request) Successful in 1m44s
CI / lint (pull_request) Successful in 1m12s
CI / vulncheck (pull_request) Successful in 50s
Release Binaries / release (push) Successful in 2m11s
Release Docker / docker (push) Failing after 10m3s
Four defects from the rc.1 dress rehearsal:

- Dockerfile pinned golang:1.26-alpine3.23 at a 1.26.3 digest while
  go.mod requires 1.26.4; golang images set GOTOOLCHAIN=local, so the
  build hard-fails. Pin 1.26.4-alpine3.23 explicitly.
- changelog.disable swallowed --release-notes (the flag is consumed by
  the changelog pipe), publishing empty release bodies. Re-enable.
- Remove the Gitea-side GitHub relay step: redundant with
  .github/workflows/mirror-release.yml, which runs on GitHub Actions
  with the built-in token and copies the canonical Gitea assets.
- mirror-release.yml: jq '.body // empty' treats "" as truthy so the
  notes fallback never fired; use select(). Mark rc tags --prerelease.
2026-06-12 16:16:28 -04:00
lerko 4070691407 docs: close pre-release documentation gaps
CI / test (pull_request) Successful in 2m0s
CI / lint (pull_request) Successful in 1m22s
CI / vulncheck (pull_request) Successful in 51s
Release Binaries / release (push) Failing after 8m31s
Release Docker / docker (push) Failing after 2m17s
- Docker compose: ping_group_range sysctl, without which ping monitors
  silently report DOWN in containers
- README: data retention table (1000 checks / 5000 state changes per
  monitor, 200 logs, pruned automatically), group-alert limitation note
- config-as-code: apply is not atomic + re-run convergence, backup
  redaction footgun (/api/backup/export redacts by default), opsgenie
  example (provider count was stale at 9), ntfy auth keys
2026-06-12 15:37:47 -04:00
lerko 6dfd56dcd4 ci(release): skip GitHub relay when GH_MIRROR_TOKEN absent
CI / test (pull_request) Successful in 1m55s
CI / lint (pull_request) Successful in 1m26s
CI / vulncheck (pull_request) Successful in 56s
Relay is on hold until after the rc dress rehearsal — without the
secret the step exits cleanly instead of failing the release run.
Adding the secret later enables it with no workflow change.
2026-06-12 15:31:57 -04:00
lerko 17b5557e23 test(importer): cover malformed Kuma backup input
Importer parses untrusted JSON on the migration onboarding path with no
coverage. Add malformed-input table (truncated, wrong types, null
lists), notification config edge cases, and field-mapping checks.
2026-06-12 15:31:57 -04:00
lerko dc27547ffb docs(monitor): document before-Start contract on engine setters
2026-06-12 15:31:57 -04:00
lerko 83ec6bee42 fix(tui): apply log filter to full log list, not viewport window
viewLogsTab filtered logViewport.View() — the visible window — so the
entry count showed the window size and hidden lines reappeared while
scrolling. Filter and render now happen at content-set time from
engine.GetLogs(); the view only reads stored counts.
2026-06-12 15:31:57 -04:00
lerko d538aad18e ci(release): relay release artifacts to GitHub mirror
GoReleaser publishes to exactly one SCM (Gitea); the push mirror carries
refs but not releases, so GitHub Releases — where the README points —
stayed empty. After the Gitea release, wait for the mirrored tag and
create the GitHub release with the same artifacts and notes.

Needs new Gitea secret GH_MIRROR_TOKEN (GitHub PAT with repo scope).
GITHUB_TOKEN is reserved by Gitea Actions, hence the different name.
2026-06-12 15:31:57 -04:00
lerko ab0a69d06b fix(cluster)!: rename X-Upkeep-Secret header to X-Uptop-Secret
CI / test (pull_request) Successful in 1m57s
CI / lint (pull_request) Successful in 1m27s
CI / vulncheck (pull_request) Successful in 56s
Last upkeep-era name in the wire protocol. Breaking for mixed-version
clusters, but zero installed base exists pre-v0.1.0 — free now, breaking
forever after first tag.
2026-06-12 14:27:44 -04:00
lerko 7bf278e538 docs(cluster): document split-brain limitation in failover
CI / test (pull_request) Successful in 1m56s
CI / lint (pull_request) Successful in 1m16s
CI / vulncheck (pull_request) Successful in 1m1s
No leader fencing exists — during a network partition both nodes run
checks and fire alerts independently. Document the behavior honestly:
duplicate alerts, doubled history, ~15s takeover, converges on heal.
2026-06-12 12:47:03 -04:00
lerko 023234f4c3 fix(alert): email send respects context deadline
smtp.SendMail ignores context entirely — a blackholed SMTP server
hangs the alert goroutine for the OS TCP timeout (minutes), while the
30s context from the engine does nothing.

Replace with sendMailContext: dials with ctx deadline, sets connection
deadlines, handles STARTTLS and AUTH when advertised. Behavioral
parity with smtp.SendMail but cancellation works throughout.
2026-06-12 12:46:45 -04:00
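A minimal sketch of the dial-with-deadline idea, assuming nothing about the real `sendMailContext` beyond what the message says. `dialSMTP` and `silentServerTimesOut` are hypothetical helpers; STARTTLS, AUTH, and the actual message send are left out.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/smtp"
	"time"
)

// dialSMTP applies the context deadline to the TCP dial and to every later
// read and write, so a silent server cannot hang the caller.
func dialSMTP(ctx context.Context, addr string) (*smtp.Client, error) {
	var d net.Dialer
	conn, err := d.DialContext(ctx, "tcp", addr)
	if err != nil {
		return nil, err
	}
	if deadline, ok := ctx.Deadline(); ok {
		// Bounds the greeting read inside smtp.NewClient and all later I/O.
		if err := conn.SetDeadline(deadline); err != nil {
			conn.Close()
			return nil, err
		}
	}
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		conn.Close()
		return nil, err
	}
	c, err := smtp.NewClient(conn, host)
	if err != nil {
		conn.Close()
		return nil, err
	}
	return c, nil
}

// silentServerTimesOut starts a listener that accepts but never speaks (a
// stand-in for a blackholed server) and reports whether dialSMTP gave up
// promptly instead of hanging.
func silentServerTimesOut(timeoutMillis int) bool {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false
	}
	defer ln.Close()
	done := make(chan struct{})
	defer close(done)
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		<-done // hold the connection open without sending the 220 greeting
	}()

	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMillis)*time.Millisecond)
	defer cancel()
	start := time.Now()
	c, err := dialSMTP(ctx, ln.Addr().String())
	if c != nil {
		c.Close()
	}
	return err != nil && time.Since(start) < 2*time.Second
}

func main() {
	fmt.Println("gave up in time:", silentServerTimesOut(200))
}
```

The key step is `conn.SetDeadline`: `DialContext` alone only bounds the connect, not the greeting read that follows.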
lerko 4328d25f22 fix(security): API import no longer replaces user accounts
Cluster-secret holder could POST a backup with their own admin key to
/api/backup/import, replacing all users — privilege escalation from
cluster-auth to admin. Also, Kuma imports produced zero users but
ImportWipe unconditionally deleted the users table — locking out all
accounts until restart reseeded UPTOP_ADMIN_KEY.

- Server handlers strip data.Users (set nil) before calling ImportData
- ImportData only wipes+replaces users when data.Users != nil
- New ImportWipeUsers dialect method separates user wipe from data wipe
- CLI restore (main.go) unchanged — full import still replaces users
2026-06-12 12:45:16 -04:00
lerko f745dcb21f fix(security): close DNS-rebind TOCTOU on ping/port checks
Pre-check resolved and validated the target IP, then runPingCheck and
runPortCheck re-resolved by hostname — a DNS rebind between the two
lookups could redirect to a private IP, bypassing the SSRF guard.

Resolve once in RunCheck, pin the validated IP, and pass it down:
- runPingCheck: SetIPAddr with pinned IP (skips internal resolve)
- runPortCheck: dial pinned IP literal instead of hostname

HTTP checks are unaffected (SafeDialContext resolves+validates at
dial time). DNS checks validate the server address, not the target.
2026-06-12 12:42:50 -04:00
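The resolve-once-and-pin pattern can be sketched as follows. This is an illustration under assumptions, not the project's `RunCheck`: `pinnedAddr` and `isPrivate` are hypothetical names, and the guard here is much simpler than a real SSRF filter.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"strconv"
	"time"
)

// isPrivate is a simplified stand-in for the project's SSRF guard.
func isPrivate(ip net.IP) bool {
	return ip.IsLoopback() || ip.IsPrivate() || ip.IsUnspecified() || ip.IsLinkLocalUnicast()
}

// pinnedAddr resolves host exactly once and returns an "ip:port" literal for
// the first address that passes the guard. Callers dial this literal, so no
// second DNS lookup can swap in a different address after validation.
func pinnedAddr(host string, port int) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	addrs, err := net.DefaultResolver.LookupIPAddr(ctx, host)
	if err != nil {
		return "", err
	}
	for _, a := range addrs {
		if !isPrivate(a.IP) {
			return net.JoinHostPort(a.IP.String(), strconv.Itoa(port)), nil
		}
	}
	return "", errors.New("no permitted address for " + host)
}

func main() {
	fmt.Println(pinnedAddr("127.0.0.1", 22))      // rejected: loopback
	fmt.Println(pinnedAddr("93.184.216.34", 443)) // accepted: IP literal, no lookup needed
}
```

Both calls in `main` use IP literals so the example runs without network access; with a hostname, the same function performs the single lookup.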
lerko e99e959b64 ci: switch versioning from CalVer to SemVer
CI / test (pull_request) Successful in 1m59s
CI / lint (pull_request) Successful in 1m27s
CI / vulncheck (pull_request) Successful in 56s
Go module tooling requires v-prefixed semver tags (go install @latest
ignores CalVer tags entirely), GoReleaser errors on non-semver tags,
and zero-padded CalVer months are invalid semver. Old CalVer tags and
releases were deleted due to pre-release security issues; relaunch
tags as v1.0.0.

- Workflow tag triggers: [0-9]* -> v[0-9]* (Gitea + GitHub relay)
- cliff.toml tag_pattern: regex v[0-9].* (was matching everything --
  tag_pattern is regex since git-cliff 1.4, not glob)
- Docker image tags drop the v prefix per registry convention
2026-06-12 11:13:18 -04:00
lerko c3eac80e14 fix(store): chmod SQLite DB files to 0600 on open
CI / test (pull_request) Successful in 1m57s
CI / lint (pull_request) Successful in 1m26s
CI / vulncheck (pull_request) Successful in 1m2s
Bare-metal installs created the DB with process umask (often 022),
making uptop.db, -wal, and -shm world-readable. These files contain
alert credentials and config. Now chmod 0600 after open. Missing
WAL/SHM siblings (not yet created) are silently skipped. Docker
installs were already mitigated by the non-root UID.
2026-06-12 09:51:11 -04:00
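A hedged sketch of the chmod-on-open step on a Unix-like system. `hardenDBFiles` and `demo` are assumed names; the real store code may order or report things differently.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// hardenDBFiles restricts the database file and its WAL/SHM siblings to the
// owner. Siblings that do not exist yet are skipped without error.
func hardenDBFiles(dbPath string) error {
	if err := os.Chmod(dbPath, 0o600); err != nil {
		return err
	}
	for _, suffix := range []string{"-wal", "-shm"} {
		err := os.Chmod(dbPath+suffix, 0o600)
		if err != nil && !errors.Is(err, fs.ErrNotExist) {
			return err
		}
	}
	return nil
}

// demo creates a world-readable DB and -wal file (no -shm), hardens them,
// and returns the resulting permission bits of the -wal file.
func demo() (fs.FileMode, error) {
	dir, err := os.MkdirTemp("", "uptop-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	db := filepath.Join(dir, "uptop.db")
	for _, p := range []string{db, db + "-wal"} {
		if err := os.WriteFile(p, nil, 0o644); err != nil {
			return 0, err
		}
	}
	if err := hardenDBFiles(db); err != nil {
		return 0, err
	}
	info, err := os.Stat(db + "-wal")
	if err != nil {
		return 0, err
	}
	return info.Mode().Perm(), nil
}

func main() {
	fmt.Println(demo())
}
```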
lerko 6cf0efed9b fix: seven fixes — token scan, variadic cleanup, TUI layout, compose secrets
CI / test (pull_request) Successful in 1m54s
CI / lint (pull_request) Successful in 1m27s
CI / vulncheck (pull_request) Successful in 1m1s
1. UpdateSite handles token-read Scan error instead of ignoring it.
   sql.ErrNoRows (nonexistent site) passes through; real DB errors
   surface.

2. RunCheck allowPrivate changed from variadic to real bool param.
   Dead maxRequestBody duplicate removed from sqlstore.go.

3. Footer help bar documents [Space] for group collapse.

4. adjustCursor unified with clampCursor — one clamping path
   instead of two with different semantics.

5. Compose cluster/probe example files annotate hardcoded secrets
   with "EXAMPLE ONLY — rotate before use".

6. huhForm.WithHeight moved from View() to handleResize — no longer
   mutates form state during render.

7. maxTableRows recalculated on filter enter/exit via recalcLayout()
   — was only recalculated on resize, causing off-by-one when the
   filter bar appeared/disappeared.
2026-06-12 09:36:00 -04:00
lerko 9115ab720c fix: six small fixes — rate limiter leak, DST SLA, probe sort, TUI cleanup
CI / test (pull_request) Successful in 1m55s
CI / lint (pull_request) Successful in 1m27s
CI / vulncheck (pull_request) Successful in 56s
1. Rate limiter cleanup goroutine now stoppable via Stop() channel
   instead of looping forever. Prevents goroutine leak in tests.

2. Dead WindowSizeMsg branch in handleFormMsg removed — top-level
   Update handles resize before forms see it.

3. Probe results sorted by node ID — map iteration no longer
   reorders rows every render.

4. fmtAlertConfig takes models.AlertConfig directly instead of an
   anonymous struct the caller builds inline.

5. Backspace no longer aliases delete — d is the documented key.
   Prevents accidental delete-confirm on habitual backspace.

6. SLA daily buckets use time.Date day arithmetic instead of
   Add(-i*24h) — lands on midnight correctly across DST transitions.
2026-06-12 09:18:52 -04:00
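Item 6 is easy to get wrong, so here is a small illustration of the difference. `dayStart` and `compare` are hypothetical helpers, and the New York zone and the date are chosen only because US clocks moved forward on 8 March 2026.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed zone data so the example runs without system tzdata
)

// dayStart returns local midnight i days before now's calendar day.
// time.Date normalizes out-of-range days, so d-i may be zero or negative.
func dayStart(now time.Time, i int) time.Time {
	y, m, d := now.Date()
	return time.Date(y, m, d-i, 0, 0, 0, 0, now.Location())
}

// compare computes the bucket start both ways for a fixed "now" on the day
// after the spring-forward transition.
func compare(daysBack int) (byDate, byAdd time.Time, err error) {
	loc, err := time.LoadLocation("America/New_York")
	if err != nil {
		return time.Time{}, time.Time{}, err
	}
	now := time.Date(2026, time.March, 9, 12, 0, 0, 0, loc)
	byDate = dayStart(now, daysBack)
	byAdd = dayStart(now, 0).Add(-time.Duration(daysBack) * 24 * time.Hour)
	return byDate, byAdd, nil
}

func main() {
	byDate, byAdd, err := compare(1)
	if err != nil {
		panic(err)
	}
	fmt.Println(byDate.Format(time.RFC3339)) // calendar arithmetic
	fmt.Println(byAdd.Format(time.RFC3339))  // fixed 24h subtraction
}
```

Because 8 March has only 23 hours in that zone, subtracting a fixed 24 hours from midnight on 9 March lands at 23:00 on 7 March, while the calendar version lands on midnight of 8 March.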
lerko edfe6122b1 fix: Kuma import tokens/paused, Docker hardening, migrate-secrets idempotency
CI / test (pull_request) Successful in 1m54s
CI / lint (pull_request) Successful in 1m27s
CI / vulncheck (pull_request) Successful in 56s
1. Kuma import now maps push monitor tokens (generates crypto/rand
   token) and paused state (Active=false → Paused=true). Previously
   push monitors imported with empty token sat DOWN forever, and
   paused Kuma monitors came in unpaused and started alerting.

2. Dockerfile adds HEALTHCHECK against /api/health on port 8080.
   Container orchestrators can now detect unhealthy instances.

3. migrate-secrets sets the encryptor before loading alerts, so
   already-encrypted settings are decrypted correctly on second run
   instead of failing with a JSON unmarshal error.

4. docker-compose.yml adds container hardening: read_only filesystem,
   cap_drop ALL, no-new-privileges, tmpfs for /tmp.
2026-06-12 08:39:30 -04:00
lerko 13637ec216 chore(tui): delete dead braille code, hoist sparkWidth, stop resize flash
CI / test (pull_request) Successful in 1m58s
CI / lint (pull_request) Successful in 1m22s
CI / vulncheck (pull_request) Successful in 1m1s
1. Delete braille.go + braille_test.go — dead code, only referenced
   by its own test. Can be re-added when latency charts are built.

2. Hoist duplicate `const sparkWidth = 40` (update.go + view_detail.go)
   to package-level `detailSparkWidth`. Click-index resolution and
   rendering now share one constant.

3. Remove tea.ClearScreen on every resize — caused full-screen flash
   during continuous resizes. ctrl+l manual clear kept.
2026-06-12 08:03:16 -04:00
lerko 916c963663 fix(engine): apply convergence + push/group check history
CI / test (pull_request) Successful in 1m54s
CI / lint (pull_request) Successful in 1m27s
CI / vulncheck (pull_request) Successful in 1m1s
1. Poll loop now fully converges with the DB: updated site configs
   are refreshed via UpdateSiteConfig, and sites removed from the DB
   are evicted from liveState. Previously the loop only added new
   sites — config edits via apply were ignored until restart, and
   pruned sites kept being checked and alerting.

2. Push monitors now record check history on each heartbeat via
   recordCheck. Previously RecordHeartbeat updated state but never
   wrote to check_history — push uptime % and sparklines were empty.

3. Groups record a synthetic check per evaluation tick so they get
   uptime history and sparklines instead of blank displays.
2026-06-11 20:45:30 -04:00
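The convergence in item 1 amounts to a three-way reconcile: add, refresh, evict. The sketch below shows that shape with simplified, assumed types; the real `liveState`, `SiteConfig`, and `UpdateSiteConfig` carry far more fields and locking.

```go
package main

import "fmt"

// SiteConfig and Site are cut-down stand-ins for the project's models.
type SiteConfig struct {
	ID  int
	URL string
}

type Site struct {
	SiteConfig
	Status string // runtime state, never read from the DB
}

// reconcile makes live match the DB snapshot: new sites are added, existing
// ones get their config refreshed (runtime state is kept), and sites missing
// from the snapshot are evicted.
func reconcile(live map[int]*Site, fromDB []SiteConfig) {
	seen := make(map[int]bool, len(fromDB))
	for _, cfg := range fromDB {
		seen[cfg.ID] = true
		if s, ok := live[cfg.ID]; ok {
			s.SiteConfig = cfg
		} else {
			live[cfg.ID] = &Site{SiteConfig: cfg, Status: "PENDING"}
		}
	}
	for id := range live {
		if !seen[id] {
			delete(live, id) // deleting during range is safe in Go
		}
	}
}

func main() {
	live := map[int]*Site{
		1: {SiteConfig: SiteConfig{ID: 1, URL: "http://old"}, Status: "UP"},
		2: {SiteConfig: SiteConfig{ID: 2, URL: "http://gone"}, Status: "DOWN"},
	}
	reconcile(live, []SiteConfig{{ID: 1, URL: "http://new"}, {ID: 3, URL: "http://added"}})
	fmt.Println(len(live), live[1].URL, live[1].Status)
}
```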
lerko fa56f47f96 fix(tui): track selection by site ID + q means back everywhere
CI / test (pull_request) Successful in 2m5s
CI / lint (pull_request) Successful in 1m26s
CI / vulncheck (pull_request) Successful in 1m2s
Cursor tracked by site ID instead of positional index. When the
list re-sorts every tick (sites change status), the selection stays
on the same monitor instead of silently jumping to whatever now
occupies that index position.

q now means "back" in detail, history, SLA, and alert-detail views
— consistent with muscle memory from navigating deeper views.
Only the dashboard q quits the app. ctrl+c always quits from
anywhere.
2026-06-11 19:34:21 -04:00
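Tracking the selection by ID means the cursor index is recomputed after every re-sort. A minimal sketch, with `site` and `cursorFor` as hypothetical names rather than the TUI's actual types:

```go
package main

import "fmt"

type site struct {
	ID   int
	Name string
}

// cursorFor returns the index of selectedID in the freshly sorted list.
// If that site is gone, it falls back to the previous cursor, clamped to
// the valid range.
func cursorFor(sites []site, selectedID, prevCursor int) int {
	for i, s := range sites {
		if s.ID == selectedID {
			return i
		}
	}
	if prevCursor >= len(sites) {
		prevCursor = len(sites) - 1
	}
	if prevCursor < 0 {
		prevCursor = 0
	}
	return prevCursor
}

func main() {
	before := []site{{1, "api"}, {2, "db"}, {3, "web"}}
	after := []site{{2, "db"}, {3, "web"}, {1, "api"}} // re-sorted on a status change
	fmt.Println(cursorFor(before, 2, 0), cursorFor(after, 2, 1))
}
```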
lerko f7da69f25f fix(security): SSRF guard gaps + DNS port restriction + metrics auth
CI / test (pull_request) Successful in 1m54s
CI / lint (pull_request) Successful in 1m27s
CI / vulncheck (pull_request) Successful in 1m1s
1. SSRF guard now blocks 0.0.0.0/8 (routes to localhost on Linux)
   and 100.64.0.0/10 (CGNAT). Also rejects unspecified, multicast,
   and loopback IPs via net.IP methods for defense in depth.

2. DNS monitor type no longer bypasses SSRF guard. The DNSServer
   address is resolved and validated against isPrivateIP before use.
   Port restricted to 53 — prevents arbitrary internal port probing
   via crafted DNSServer values.

3. /metrics now default-deny when MetricsPublic is false, regardless
   of whether UPTOP_CLUSTER_SECRET is set. Previously, no secret =
   no auth check = metrics exposed to everyone.
2026-06-11 18:57:37 -04:00
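The checks in item 1 could be expressed with `net/netip` roughly as below. This is a sketch, not the project's `isPrivateIP` (which the message says uses `net.IP` methods): `isBlocked` is an assumed name, and the `IsPrivate` check stands in for whatever private ranges the existing guard already covered.

```go
package main

import (
	"fmt"
	"net/netip"
)

// extraBlocked holds the two ranges the commit adds to the guard.
var extraBlocked = []netip.Prefix{
	netip.MustParsePrefix("0.0.0.0/8"),     // routes to localhost on Linux
	netip.MustParsePrefix("100.64.0.0/10"), // CGNAT
}

// isBlocked reports whether raw must not be contacted. Unparseable input is
// blocked (fail closed).
func isBlocked(raw string) bool {
	addr, err := netip.ParseAddr(raw)
	if err != nil {
		return true
	}
	addr = addr.Unmap() // treat ::ffff:a.b.c.d as the IPv4 address it wraps
	if addr.IsUnspecified() || addr.IsLoopback() || addr.IsMulticast() || addr.IsPrivate() {
		return true
	}
	for _, p := range extraBlocked {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

func main() {
	for _, ip := range []string{"0.1.2.3", "100.64.0.1", "100.128.0.1", "8.8.8.8"} {
		fmt.Println(ip, isBlocked(ip))
	}
}
```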
lerko 5d2b7a3e66 fix: seven quick-win bug fixes across engine, server, TUI, CLI
CI / test (pull_request) Successful in 1m55s
CI / lint (pull_request) Successful in 1m27s
CI / vulncheck (pull_request) Successful in 1m1s
1. Alertless monitors no longer spam error logs — triggerAlert
   returns early when alertID <= 0.

2. HTTP response body drained before close — enables connection
   reuse via keep-alive instead of fresh TCP+TLS per check.

3. /api/backup/export enforces GET — was the only endpoint
   accepting any HTTP method.

4. limitStr guards against max < 3 — prevents negative slice
   index panic on very narrow terminals.

5. Filter input accepts multibyte characters — len(msg.Runes)
   instead of len(msg.String()) for proper Unicode support.

6. Startup warning corrected — with no UPTOP_CLUSTER_SECRET,
   endpoints reject (401), not accept. Warning now says so.

7. UPTOP_KEYS file open failure logged — was silently swallowed,
   leaving operators with no admin seeded and no message.
2026-06-11 18:28:32 -04:00
lerko 341d60d2fe refactor: unify logging with log/slog
CI / test (pull_request) Successful in 1m57s
CI / lint (pull_request) Successful in 1m22s
CI / vulncheck (pull_request) Successful in 56s
Replace three uncoordinated logging systems (log.Printf, fmt.Fprintf
to stderr, fmt.Println warnings) with structured slog calls.

68 log calls migrated:
- log.Printf → slog.Error/Warn/Info (45 calls across 5 files)
- fmt.Fprintf(os.Stderr) → slog.Error (23 calls in main.go)

Kept unchanged:
- fmt.Println/Printf for CLI user output (version, banners, import results)
- engine.AddLog for TUI-visible ring buffer (monitoring events)

Store migration diagnostics demoted to slog.Debug (silent at default
info level). HTTP request logging now structured with method/path/
status/duration/ip attributes.
2026-06-11 18:00:19 -04:00
lerko 52ccd7ad91 refactor(models): split Site into SiteConfig + SiteState
CI / test (pull_request) Successful in 1m58s
CI / lint (pull_request) Successful in 1m21s
CI / vulncheck (pull_request) Successful in 1m2s
Site now embeds SiteConfig (22 persistent fields) and SiteState
(11 ephemeral runtime fields). Field access unchanged via promotion
— site.Name and site.Status still work.

Store layer deals exclusively in SiteConfig — the DB never sees
runtime state. Engine's liveState keeps full Site composites.
UpdateSiteConfig reduced from 11-line field-by-field copy to
`existing.SiteConfig = cfg`.

RunCheck takes SiteConfig (only needs config fields). Checker is
now statically prevented from reading/writing runtime state.

Backup.Sites changed to []SiteConfig — exports no longer carry
zero-valued runtime fields. Import backward-compatible (json
ignores unknown fields).
2026-06-11 17:13:09 -04:00
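The embedding and its two consequences (field promotion, and imports tolerating old exports) can be illustrated with heavily reduced types. The field sets below are invented for the example; the real structs have 22 and 11 fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SiteConfig holds persistent fields; SiteState holds runtime-only fields.
type SiteConfig struct {
	ID   int
	Name string
}

type SiteState struct {
	Status string
}

// Site embeds both, so site.Name and site.Status still work via promotion.
type Site struct {
	SiteConfig
	SiteState
}

// updateSiteConfig swaps the whole config in one assignment and leaves the
// runtime state untouched.
func updateSiteConfig(existing *Site, cfg SiteConfig) {
	existing.SiteConfig = cfg
}

// decodeConfig parses one exported site. encoding/json ignores unknown
// fields by default, so an older export that still carries "Status" decodes.
func decodeConfig(raw string) (SiteConfig, error) {
	var cfg SiteConfig
	err := json.Unmarshal([]byte(raw), &cfg)
	return cfg, err
}

func main() {
	s := &Site{SiteConfig: SiteConfig{ID: 1, Name: "api"}, SiteState: SiteState{Status: "UP"}}
	updateSiteConfig(s, SiteConfig{ID: 1, Name: "api-v2"})
	fmt.Println(s.Name, s.Status)
	fmt.Println(decodeConfig(`{"ID":2,"Name":"db","Status":"DOWN"}`))
}
```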
lerko ba4465daa2 refactor(server): extract Server type with named handler methods
CI / test (pull_request) Successful in 1m53s
CI / lint (pull_request) Successful in 1m21s
CI / vulncheck (pull_request) Successful in 1m2s
Replace the 328-line Start() god function with a Server struct +
11 named handler methods. Routes registered in routes(), middleware
applied in one place.

Start() kept as a convenience wrapper (NewServer + Start) so
existing callers don't need to change unless they want the Server
reference.

Each handler is now independently readable and testable without
parsing a 300-line closure nest.
2026-06-11 16:32:38 -04:00
lerko 54790db5c8 refactor(config): consolidate env parsing into appConfig struct
New cmd/uptop/config.go with appConfig struct + parseConfig() that
reads all 25 UPTOP_* env vars in one place with defaults. Replaces
~120 lines of scattered os.Getenv calls in runServe.

runServe now reads cfg := parseConfig() up front. ServerConfig
built via cfg.serverConfig(). Uniform flag > env > default
precedence for port/db-type/dsn via flag defaults from config.
2026-06-11 16:29:47 -04:00
lerko 2b357341c8 refactor(store): shared storetest.BaseMock replaces 5 duplicated mocks
CI / test (pull_request) Successful in 1m57s
CI / lint (pull_request) Successful in 1m16s
CI / vulncheck (pull_request) Successful in 1m1s
New internal/store/storetest/mock.go provides BaseMock implementing
the full Store interface with no-op defaults and optional Func field
overrides. Each test file embeds BaseMock and shadows only the methods
it needs.

Removes ~400 lines of duplicated stub methods across 6 test files.
Adding a Store method now requires one addition (BaseMock) instead
of editing 6 files.
2026-06-11 16:09:29 -04:00
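The BaseMock pattern, shown against a two-method interface invented for this example. The real `Store` interface is much larger, and the method names here are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a tiny stand-in for the full Store interface.
type Store interface {
	GetSiteName(id int) (string, error)
	DeleteSite(id int) error
}

// BaseMock implements every method as a no-op unless a Func override is set.
type BaseMock struct {
	GetSiteNameFunc func(id int) (string, error)
	DeleteSiteFunc  func(id int) error
}

func (m BaseMock) GetSiteName(id int) (string, error) {
	if m.GetSiteNameFunc != nil {
		return m.GetSiteNameFunc(id)
	}
	return "", nil
}

func (m BaseMock) DeleteSite(id int) error {
	if m.DeleteSiteFunc != nil {
		return m.DeleteSiteFunc(id)
	}
	return nil
}

// failingDeleteStore embeds BaseMock and shadows only the method under test.
type failingDeleteStore struct{ BaseMock }

func (failingDeleteStore) DeleteSite(id int) error {
	return errors.New("delete failed")
}

func main() {
	var s Store = failingDeleteStore{}
	name, err := s.GetSiteName(1) // inherited no-op default
	fmt.Printf("%q %v %v\n", name, err, s.DeleteSite(1))
}
```

When the interface grows, only `BaseMock` needs the new method; every embedding test double keeps compiling.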
lerko 0974ab2b4c refactor(store): schema_version migration table + DeleteAlert FK fix
Replace the error-string-matching migration runner with a proper
schema_version table. Migrations are now numbered and recorded;
only unapplied versions run. Fresh databases seed at baseline
version (CREATE TABLE already includes all columns).

CREATE TABLE statements updated to include regions (sites) and
node_id (check_history) — previously only added via ALTER.

DeleteAlert now nulls sites.alert_id before deleting, preventing
dangling references that caused every incident to hit the error
path instead of alerting.
2026-06-11 16:02:17 -04:00
lerko f00acbc280 refactor(models): typed Status constants with IsBroken() predicate
Replace ~150 bare status string comparisons with typed models.Status
constants (StatusUp, StatusDown, StatusPending, StatusLate, StatusStale,
StatusSSLExp). Single IsBroken() method replaces the duplicated
isBroken lambda in monitor.go and isDown function in sla.go.

Adding a new status value (e.g. DEGRADED) now requires one constant
definition instead of grep-and-pray across 16 files.

CheckResult.Status stays string — the checker is the boundary between
raw protocol results and typed status. Cast happens at the edge in
handleStatusChange.
2026-06-11 15:56:51 -04:00
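A sketch of the typed-status idea. The constant names come from the commit message, but their string values and the set of statuses that `IsBroken` treats as broken are guesses made for illustration.

```go
package main

import "fmt"

// Status is a typed string so comparisons go through named constants.
type Status string

const (
	StatusUp      Status = "UP"      // string values are assumed
	StatusDown    Status = "DOWN"
	StatusPending Status = "PENDING"
	StatusLate    Status = "LATE"
	StatusStale   Status = "STALE"
	StatusSSLExp  Status = "SSL_EXP"
)

// IsBroken is the single predicate for "needs attention". Which statuses
// count as broken is an assumption in this sketch.
func (s Status) IsBroken() bool {
	switch s {
	case StatusDown, StatusLate, StatusStale, StatusSSLExp:
		return true
	}
	return false
}

// CheckResult keeps a raw string; the cast to Status happens at the edge.
type CheckResult struct {
	Status string
}

func main() {
	res := CheckResult{Status: "DOWN"}
	st := Status(res.Status) // boundary between raw protocol result and typed status
	fmt.Println(st, st.IsBroken(), StatusUp.IsBroken())
}
```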
67 changed files with 2899 additions and 2004 deletions
+5 -1
@@ -3,7 +3,7 @@ name: Release Binaries
 on:
   push:
     tags:
-      - "[0-9]*"
+      - "v[0-9]*"
 jobs:
   release:
@@ -52,3 +52,7 @@ jobs:
           GORELEASER_FORCE_TOKEN: gitea
           GITEA_TOKEN: ${{ secrets.RELEASE_TOKEN }}
           GITEA_API_URL: http://gitea:3000/api/v1
+          # GitHub release relaying is handled by .github/workflows/mirror-release.yml,
+          # which runs on GitHub Actions when the push mirror delivers the tag and
+          # copies this run's Gitea release assets — no PAT needed on this side.
+31 -8
@@ -3,11 +3,11 @@ name: Release Docker
 on:
   push:
     tags:
-      - "[0-9]*"
+      - "v[0-9]*"
   workflow_dispatch:
     inputs:
       tag:
-        description: "Image tag (e.g. 2026.06.1). Defaults to latest commit SHA."
+        description: "Image tag (e.g. 1.0.0, no v prefix). Defaults to latest commit SHA."
         required: false
 jobs:
@@ -27,14 +27,20 @@ jobs:
               TAG="${{ github.sha }}"
             fi
           else
+            # Docker convention: git tag v1.2.3 -> image tag 1.2.3
             TAG="${{ github.ref_name }}"
+            TAG="${TAG#v}"
           fi
           echo "tag=$TAG" >> "$GITHUB_OUTPUT"
           TAGS="lerkolabs/uptop:${TAG}"
           TAGS="${TAGS},lerkolabs/uptop:sha-${SHORT_SHA}"
+          # :latest only for real releases — rc rehearsal tags must not move it
           if [ "${{ github.ref_type }}" = "tag" ]; then
-            TAGS="${TAGS},lerkolabs/uptop:latest"
+            case "$TAG" in
+              *-*) ;;
+              *) TAGS="${TAGS},lerkolabs/uptop:latest" ;;
+            esac
           fi
           echo "tags=$TAGS" >> "$GITHUB_OUTPUT"
@@ -50,6 +56,26 @@ jobs:
           username: ${{ secrets.DOCKERHUB_USERNAME }}
           password: ${{ secrets.DOCKERHUB_TOKEN }}
+      # Scan must gate the push: build amd64 locally, scan it, and only then run
+      # the multi-arch push (amd64 layers come from the builder cache, so the
+      # second build only adds the arm64 work).
+      - name: Build for scan (amd64, local)
+        uses: docker/build-push-action@v5
+        with:
+          context: .
+          load: true
+          platforms: linux/amd64
+          tags: uptop-scan:${{ steps.meta.outputs.tag }}
+          build-args: |
+            VERSION=${{ steps.meta.outputs.tag }}
+            COMMIT=${{ github.sha }}
+            BUILD_DATE=${{ github.event.head_commit.timestamp }}
+      - name: Scan image for CVEs
+        run: |
+          curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh -s -- -b /usr/local/bin v0.114.0
+          grype uptop-scan:${{ steps.meta.outputs.tag }} --fail-on critical --output table
       - name: Build and push
         uses: docker/build-push-action@v5
         with:
@@ -64,11 +90,6 @@ jobs:
             COMMIT=${{ github.sha }}
             BUILD_DATE=${{ github.event.head_commit.timestamp }}
-      - name: Scan image for CVEs
-        run: |
-          curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh -s -- -b /usr/local/bin v0.114.0
-          grype lerkolabs/uptop:${{ steps.meta.outputs.tag }} --fail-on critical --output table
       - name: Update Docker Hub description
         uses: peter-evans/dockerhub-description@v4
         with:
@@ -79,5 +100,7 @@ jobs:
       - name: Cleanup Docker artifacts
         if: always()
         run: |
+          # the scan image is tagged, so image prune won't catch it
+          docker image rm "uptop-scan:${{ steps.meta.outputs.tag }}" 2>/dev/null || true
           docker image prune -f
           docker builder prune -f --keep-storage=2GB
+20 -8
@@ -3,7 +3,7 @@ name: Mirror Release to GitHub
 on:
   push:
     tags:
-      - "[0-9]*"
+      - "v[0-9]*"
 permissions:
   contents: write
@@ -19,26 +19,35 @@ jobs:
         run: |
           API="https://gitea.lerkolabs.com/api/v1/repos/lerkolabs/uptop/releases/tags/${TAG}"
-          for i in $(seq 1 20); do
+          # 40 x 30s = 20 min: the Gitea release can queue behind the ~18-min
+          # Docker job on the single runner. Asset count must hold steady for
+          # two consecutive polls — GoReleaser uploads one file at a time, and
+          # mirroring mid-upload would publish a partial asset set.
+          PREV_COUNT=0
+          ASSET_COUNT=0
+          for i in $(seq 1 40); do
             if RESPONSE=$(curl -sf "$API" 2>/dev/null); then
               ASSET_COUNT=$(echo "$RESPONSE" | jq '.assets | length')
-              if [ "$ASSET_COUNT" -gt 0 ]; then
-                echo "Found release with $ASSET_COUNT assets"
+              if [ "$ASSET_COUNT" -gt 0 ] && [ "$ASSET_COUNT" -eq "$PREV_COUNT" ]; then
+                echo "Found release with $ASSET_COUNT assets (stable)"
                 break
               fi
-              echo "Release exists but no assets yet... attempt $i/20"
+              echo "Release has $ASSET_COUNT assets (was $PREV_COUNT)... attempt $i/40"
+              PREV_COUNT="$ASSET_COUNT"
             else
-              echo "Waiting for Gitea release... attempt $i/20"
+              echo "Waiting for Gitea release... attempt $i/40"
             fi
             sleep 30
           done
           if [ -z "$RESPONSE" ] || [ "$ASSET_COUNT" -eq 0 ]; then
-            echo "::error::Gitea release for ${TAG} not found or has no assets after 10 minutes"
+            echo "::error::Gitea release for ${TAG} not found or has no assets after 20 minutes"
             exit 1
           fi
-          echo "$RESPONSE" | jq -r '.body // empty' > /tmp/release-notes.md
+          # select() so an empty-string body produces an empty file — `// empty`
+          # treats "" as truthy and wrote a blank line, defeating this fallback.
+          echo "$RESPONSE" | jq -r '.body | select(. != null and . != "")' > /tmp/release-notes.md
           if [ ! -s /tmp/release-notes.md ]; then
             echo "Release ${TAG} from [Gitea](https://gitea.lerkolabs.com/lerkolabs/uptop/releases/tag/${TAG})" > /tmp/release-notes.md
@@ -62,8 +71,11 @@ jobs:
           GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
           TAG: ${{ github.ref_name }}
         run: |
+          PRERELEASE=""
+          case "$TAG" in *-*) PRERELEASE="--prerelease" ;; esac
           gh release create "$TAG" \
             --repo "$GITHUB_REPOSITORY" \
             --title "$TAG" \
             --notes-file /tmp/release-notes.md \
+            $PRERELEASE \
             /tmp/assets/*
+5 -2
@@ -8,6 +8,7 @@ release:
gitea: gitea:
owner: lerkolabs owner: lerkolabs
name: uptop name: uptop
prerelease: auto
builds: builds:
- main: ./cmd/uptop - main: ./cmd/uptop
@@ -58,5 +59,7 @@ nfpms:
dst: /usr/share/doc/uptop/LICENSE dst: /usr/share/doc/uptop/LICENSE
type: doc type: doc
changelog: # Changelog generation must stay enabled: the --release-notes flag is consumed
disable: true # by the changelog pipe, so disabling it silently drops the git-cliff notes
# (empty release body on v0.1.0-rc.1). With --release-notes set, GoReleaser
# skips its own generation and uses the file.
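Both files touched so far treat hyphenated tags as pre-releases: the mirror workflow passes `--prerelease` when the tag matches the shell pattern `*-*`, and this config sets `prerelease: auto`. The sketch below models only the shell rule. `isPrerelease` is a hypothetical name, and GoReleaser's own detection is not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// isPrerelease mirrors the shell pattern *-* from the mirror workflow:
// a tag counts as a pre-release if it contains a hyphen anywhere.
func isPrerelease(tag string) bool {
	return strings.Contains(tag, "-")
}

func main() {
	for _, tag := range []string{"v0.1.0", "v0.1.0-rc.1"} {
		fmt.Println(tag, isPrerelease(tag))
	}
}
```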
+8 -3
@@ -1,6 +1,11 @@
ignore: ignore:
# CVE-2026-41589: SCP path traversal in charmbracelet/wish. # SCP path traversal in charmbracelet/wish — same flaw, two ids: grype has
# matched it as CVE-2026-41589 and as GHSA-xjvp-7243-rg9h depending on db
# version, and ignore matching is exact-id, so both stay listed.
# We only import wish/bubbletea for the SSH TUI server — the vulnerable # We only import wish/bubbletea for the SSH TUI server — the vulnerable
# scp.Middleware / scp.NewFileSystemHandler symbols are never compiled in. # scp.Middleware / scp.NewFileSystemHandler symbols are never compiled in
# No fix available for wish v1; v2 (charm.land/wish/v2) patched in 2.0.1. # (govulncheck reachability agrees). No fix for wish v1; v2
# (charm.land/wish/v2 >= 2.0.1) requires the bubbletea-v2 stack migration,
# tracked in issue #126. Remove both entries when that lands.
- vulnerability: CVE-2026-41589 - vulnerability: CVE-2026-41589
- vulnerability: GHSA-xjvp-7243-rg9h
+175 -121
@@ -1,129 +1,183 @@
# Changelog # Changelog
## [2026.06.2] — 2026-06-02 (infrastructure) ## [Unreleased]
### Added
- initial commit — uptime monitor (forked from go-upkeep)
- enhanced dashboard with lipgloss tables, huh forms, mouse support, and animations
- upgrade users tab with lipgloss table, edit support, role select
- upgrade alerts tab with lipgloss table, click zones, colored types
- widen Site struct and DB schema for ping, port, dns, group monitor types
- add ping, port, and DNS check routines
- add ntfy notification provider with TUI support
- add Uptime Kuma backup converter with CLI and API
- add mouse wheel scrolling for all tabs
- add per-site pause, fix viewport, polish status page
- add monitor groups with collapse/expand and tree view
- add Telegram, PagerDuty, Pushover, Gotify providers
- add Prometheus /metrics endpoint
- expose HTTP method and accepted status codes in monitor form
- add config-as-code YAML import/export
- add distributed probing foundation — schema, models, and probe APIs
- add probe execution mode, check extraction, and result aggregation
- add region affinity, Nodes TUI tab, and probe metrics
- add status bar, tab badges, and detail panel
- bordered modals, welcome state, and dynamic name width
- DOWN-first sort, health pulse, and site filter
- split available width evenly between NAME and HISTORY columns
- add type icons to sites table
- persist logs to DB, load on startup
- add incident management and maintenance windows
- zebra striping, detail breadcrumb, sparkline stats, collapse persistence
- add --version flag with build metadata injection
- add theme system with 4 curated palettes
- swap light theme for Tokyo Night and Gruvbox
- seed SSH users from env var and authorized_keys file (#31)
- show error reason when monitors go DOWN
- proper push monitor lifecycle — PENDING, LATE, DOWN states
- logs tab overhaul — severity tags, filtering, recovery durations
- alert channel health indicator + test alerts
- add GitHub release relay workflow
- classify error reasons on DOWN monitors
- add state change history view with outage duration
- add Opsgenie provider
- add STALE state for push monitors
- add SLA reporting view
- overhaul latency sparkline scaling, color, and layout
- auto-prune expired maintenance windows
- click-to-inspect sparkline tooltips in detail view
### Changed ### Changed
- Split release pipeline into separate binary and Docker workflows (#45)
- Pin Docker base images by digest (#45) - replace database ID column with row counter
- Add GitHub release relay — mirrors Gitea releases to GitHub (#49) - unify SQLite and Postgres into dialect-based SQLStore
- Add Grype CVE scanning to Docker pipeline (#45) - add error returns to all Store interface methods
- Make CVE scan non-blocking for non-exploitable wish SCP vulnerability (#48) - remove store global singleton, thread store explicitly
- extract shared HTTPProvider for webhook-based alerts
- extract shared table rendering, fix cursor bounds
- encapsulate engine state, add graceful shutdown and tests
- split release pipeline, add nfpm/homebrew/git-cliff
- decompose god files into single-concern modules
- consistent chrome across all views
- status icons, clean STATUS column, relative time
- extract magic numbers into named constants
- check all discarded errors in sqlstore_test.go
- overhaul tab bar — consistent counts, active highlight, colored alerts
- responsive column hiding — 3-tier priority-based layout
- swap mattn/go-sqlite3 for modernc.org/sqlite
- propagate context.Context through all Store methods
- typed Status constants with IsBroken() predicate
- schema_version migration table + DeleteAlert FK fix
- shared storetest.BaseMock replaces 5 duplicated mocks
- consolidate env parsing into appConfig struct
- extract Server type with named handler methods
- split Site into SiteConfig + SiteState
- unify logging with log/slog
- restructure site form to 2 type-aware pages
### Fixed ### Fixed
- git-cliff install in CI — resolve download URL dynamically, extract to /tmp (#46, #47)
## [2026.06.1] — 2026-06-01 - forward all msg types to huh forms, improve row selection UX
- harden TLS, timeouts, validation, logging, and token generation
- add delete confirm, input validation, XSS fix, history persistence
- correctness and robustness fixes across all subsystems
- make status bar and tab badges visible
- use stable sort to prevent site list shuffling each tick
- sort children by ID before status to prevent map-order shuffling
- sparkline now spans full column width
- sparkline right-aligned — current time at right edge, dots fill left
- increase history buffer to 60 so sparkline fills completely
- compute uptime from windowed statuses, not running counters
- seed status and latency from DB history on startup
- strip push tokens from /status/json response
- correct viewport sizing and dynamic chrome calculation
- constrain form height to terminal and forward resize events
- skip children in maintenance when computing group status
- exclude maintenance'd monitors from down count and pulse
- group selection highlight, layout constants, group history graphs
- stable monitor count and universal group icons
- replace panic with error return, handle unmarshal errors
- add context to Provider.Send, log alert failures
- constant-time secret comparison, request size limits
- graceful shutdown for HTTP, SSH servers and database
- add jitter to check intervals and stagger startup
- use sh instead of bash for runner compatibility
- enable CGO for race detector, use lint-action v7
- install gcc for race detector support
- skip irrelevant field validation by monitor type
- guard max retries validator for group type
- tighten zebra row contrast for Tokyo Night and Gruvbox
- phase 1 critical fixes for public release
- phase 2 high-severity hardening
- phase 3 medium reliability and hardening
- phase 4 code quality and low-severity fixes
- rename GITEA_TOKEN to RELEASE_TOKEN
- remove explicit container, use sh shell
- bump golang.org/x/crypto v0.47.0 → v0.52.0
- install git and gcc for GoReleaser in release pipeline
- use internal Gitea URL for GoReleaser API calls
- use docker-builder runner for Docker image builds
- patch Docker Scout CVEs and remove unused openssh-client (#41)
- non-root user, supply chain attestations, build cleanup
- move SSH host key path into /data for non-root user
- create .ssh dir explicitly, ensure entrypoint is executable
- resolve git-cliff download URL dynamically
- extract git-cliff to /tmp to avoid dirty worktree
- make Grype CVE scan non-blocking for known wish vuln
- bump Go 1.26.3 → 1.26.4
- remove error truncation from detail panel
- classify safedial "failed to connect" as TCP
- resolve staticcheck lint errors in history view
- trigger immediate recheck after site config edit
- broken tick chain after form/dialog + retries off-by-one
- wire up [e] edit key in detail panel
- show push token and URL in detail panel
- show correct push heartbeat curl command in detail panel
- propagate STALE/LATE child status to group
- quick wins batch — version footer, column widths, zebra, sparkline
- logs tab use viewport for scrollable content
- pin footer to bottom of terminal
- normalize content whitespace for consistent footer position
- clip overflowing content to keep footer pinned
- remove extra blank lines above footer
- expand log viewport to fill content area
- log STALE recovery in push heartbeat handler
- check fmt.Sscanf return value (errcheck lint)
- inject time into ComputeDailyBreakdown for testability
- cascade delete related rows when removing a site
- merge check results into live state, never overwrite
- serialize DB writes through a single drained writer
- close XFF bypass and three secret-leak paths
- move blocking DB IO out of Update/View into tea.Cmds
- move theme styles onto the Model to end cross-session races
- finish moving keypress DB reads into tea.Cmds
- move all store writes out of Update into tea.Cmds
- mask alert secrets in the TUI detail panel and table
- serve /status/json through a public DTO
- make SSH key revocation fail closed
- six correctness fixes for the state machine
- migrate Postgres timestamps to TIMESTAMPTZ
- seven quick-win bug fixes across engine, server, TUI, CLI
- SSRF guard gaps + DNS port restriction + metrics auth
- track selection by site ID + q means back everywhere
- apply convergence + push/group check history
- Kuma import tokens/paused, Docker hardening, migrate-secrets idempotency
- six small fixes — rate limiter leak, DST SLA, probe sort, TUI cleanup
- seven fixes — token scan, variadic cleanup, TUI layout, compose secrets
- chmod SQLite DB files to 0600 on open
- close DNS-rebind TOCTOU on ping/port checks
- API import no longer replaces user accounts
- email send respects context deadline
- rename X-Upkeep-Secret header to X-Uptop-Secret
- apply log filter to full log list, not viewport window
- repair pipeline defects found in v0.1.0-rc.1 rehearsal
- suppress wish GHSA alias in grype, fold rc tags into launch notes
- scan gates docker push, rc tags spare :latest, mirror waits for stable assets
- remove tagged scan image in cleanup step
- exclude rc tags from cliff tag_pattern so launch notes span full history
- fall back to embedded build info when ldflags absent
- drop body-grep Security grouping, map polish type in cliff
- sync selectedID on click so refreshLive doesn't revert cursor
- resolve 4 tag-blocking issues for v0.1.0
### Changed
- Container runs as non-root user `uptop` (UID/GID 1000) instead of root (#44)
- SSH host key relocated to `/data/.ssh/id_ed25519` for non-root compatibility (#44)
- Release workflow prunes dangling images and build cache after Docker push (#44)
### Added
- SBOM and provenance attestations on Docker images for supply chain compliance (#44)
- Entrypoint script with volume writability check and migration guidance (#44)
### Breaking
- Existing Docker volumes with root-owned files require migration before upgrading:
`docker run --rm -v <volume>:/data alpine chown -R 1000:1000 /data`
## [2026.05.6] — 2026-05-30 (infrastructure)
### Changed
- Sync README to Docker Hub on release (#43)
### Security
- Patch Docker Scout CVEs, remove unused openssh-client (#41)
## [2026.05.5] — 2026-05-29
### Added
- Error reason display when monitors go DOWN (#33)
- Push monitor lifecycle — PENDING, LATE, DOWN states (#34)
- Logs tab overhaul — severity tags, filtering, recovery durations (#35)
- Alert channel health indicator and test alerts (#36)
- TUI screenshots in `assets/` (#32)
- CI status badge in README
### Changed
- Visual polish — detail sections, column headers, alert detail (#37)
- README rewritten with hero image, badges, collapsible install sections (#32)
- Changelog rewritten to match actual CalVer tag history
- Migrated to `lerkolabs` org namespace (#38)
- Docker-compose files moved to `deploy/`
## [2026.05.4] — 2026-05-27
### Added
- SSH user seeding from `UPTOP_ADMIN_KEY` env var and `UPTOP_KEYS` file (#31)
- GoReleaser for binary releases
- govulncheck in CI pipeline
- Multi-arch Docker builds (amd64 + arm64)
### Changed
- CI overhaul — Go 1.26, build caching, streamlined pipeline (#30)
- Bumped golang.org/x/crypto v0.47.0 → v0.52.0
- Bumped Alpine 3.21 → 3.23
### Security
- Phase 1: SSRF protection, input validation, safe dial (#26)
- Phase 2: TLS hardening, auth bypass fixes, rate limiting (#27)
- Phase 3: Graceful degradation, connection limits, timeout enforcement (#28)
- Phase 4: Code quality, error handling, linter fixes (#29)
## [2026.05.3] — 2026-05-25
### Added
- Theme system with 5 dark palettes — Default, Dracula, Nord, Tokyo Night, Gruvbox (#24)
- `--version` flag with build metadata injection
- Gitea Actions CI pipeline — test + lint (#20)
- golangci-lint configuration
- Comprehensive test suite — 94 tests across monitor, server, cluster (#19)
- CONTRIBUTING.md and SECURITY.md
### Changed
- Renamed project from go-upkeep to uptop (#25)
- Updated LICENSE with dual copyright for independent fork
### Fixed
- Form validators scoped to relevant monitor types (#23)
- Graceful shutdown for HTTP, SSH servers and database (#19)
- Constant-time secret comparison, request size limits (#19)
- Check interval jitter to prevent thundering herd (#19)
- TUI visual polish — zebra striping, group icons, sparkline stats (#18)
## [2026.05.2] — 2026-05-22
### Added
- Incident management and maintenance windows (#17)
- Production docker-compose.yml
### Fixed
- Viewport sizing and dynamic chrome calculation (#16)
- Form height constrained to terminal with resize forwarding
- Maintenance'd monitors excluded from down count and pulse
- Group status correctly skips children in maintenance
## [2026.05.1] — 2026-05-16
### Added
- Distributed probing with leader + probe nodes
- Config-as-code — YAML apply/export with dry-run and prune
- TUI polish — status bar, tab badges, detail panel, modals
- DOWN-first sort, health pulse, site filter
- Type icons in sites table
- Sparkline history graphs
- Persistent state — uptime, status, latency, and logs survive restarts
- Push token stripping from /status/json response
## [2026.04.1] — 2026-04-01
### Added
- SSH-accessible TUI built on Bubble Tea + Wish
- 6 check types — HTTP, Push, Ping, Port, DNS, Group
- 9 alert providers — Discord, Slack, Email, Ntfy, Telegram, PagerDuty, Pushover, Gotify, Webhook
- SQLite and PostgreSQL support
- HA clustering with automatic failover
- Prometheus /metrics endpoint
- Public status page (HTML + JSON)
- Uptime Kuma backup import
+1 -1
@@ -3,7 +3,7 @@
## Development ## Development
```sh ```sh
go run cmd/uptop/main.go -demo # starts with sample data go run ./cmd/uptop -demo # starts with sample data
ssh -p 23234 localhost # connect to TUI ssh -p 23234 localhost # connect to TUI
``` ```
+3 -1
@@ -1,5 +1,5 @@
# --- Stage 1: Builder --- # --- Stage 1: Builder ---
FROM golang:1.26-alpine3.23@sha256:91eda9776261207ea25fd06b5b7fed8d397dd2c0a283e77f2ab6e91bfa71079d AS builder FROM golang:1.26.4-alpine3.23@sha256:f23e8b227fb4493eabe03bede4d5a32d04092da71962f1fb79b5f7d1e6c2a17f AS builder
WORKDIR /app WORKDIR /app
COPY go.mod go.sum ./ COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod \ RUN --mount=type=cache,target=/go/pkg/mod \
@@ -31,6 +31,8 @@ ENV UPTOP_SSH_HOST_KEY=/data/.ssh/id_ed25519
ENV UPTOP_PORT=23234 ENV UPTOP_PORT=23234
EXPOSE 23234 EXPOSE 23234
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
CMD wget -qO- http://localhost:8080/api/health || exit 1
USER uptop USER uptop
ENTRYPOINT ["docker-entrypoint.sh"] ENTRYPOINT ["docker-entrypoint.sh"]
CMD ["./uptop"] CMD ["./uptop"]
+27 -6
@@ -19,6 +19,8 @@ An uptime monitor you manage entirely from the terminal. It runs as a server, ex
Built on [RDGames/go-upkeep](https://github.com/RDGames/go-upkeep). Rewritten for clustering, config-as-code, and a proper dashboard. Built on [RDGames/go-upkeep](https://github.com/RDGames/go-upkeep). Rewritten for clustering, config-as-code, and a proper dashboard.
Canonical repo: [gitea.lerkolabs.com/lerkolabs/uptop](https://gitea.lerkolabs.com/lerkolabs/uptop) — [GitHub](https://github.com/lerkolabs/uptop) is a mirror; releases are published to both.
## Features ## Features
- **6 check types** — HTTP, Push (heartbeat), Ping, Port, DNS, Groups - **6 check types** — HTTP, Push (heartbeat), Ping, Port, DNS, Groups
@@ -30,6 +32,8 @@ Built on [RDGames/go-upkeep](https://github.com/RDGames/go-upkeep). Rewritten fo
- **SQLite or Postgres** — SQLite for single-node, Postgres for production - **SQLite or Postgres** — SQLite for single-node, Postgres for production
- **Uptime Kuma import** — migrate from Kuma with one command - **Uptime Kuma import** — migrate from Kuma with one command
> Group monitors roll up child status for display but don't fire their own alerts yet — attach alerts to the children.
## Screenshots ## Screenshots
<table> <table>
@@ -49,14 +53,14 @@ Built on [RDGames/go-upkeep](https://github.com/RDGames/go-upkeep). Rewritten fo
## Quick start ## Quick start
```bash ```bash
go run cmd/uptop/main.go UPTOP_ADMIN_KEY="$(cat ~/.ssh/id_ed25519.pub)" go run ./cmd/uptop
ssh -p 23234 localhost ssh -p 23234 localhost
``` ```
Want some data to look at first: Want some data to look at first:
```bash ```bash
go run cmd/uptop/main.go -demo UPTOP_ADMIN_KEY="$(cat ~/.ssh/id_ed25519.pub)" go run ./cmd/uptop -demo
``` ```
## Install ## Install
@@ -79,16 +83,20 @@ services:
# - UPTOP_ADMIN_KEY=ssh-ed25519 AAAA... you@host # - UPTOP_ADMIN_KEY=ssh-ed25519 AAAA... you@host
volumes: volumes:
- ./data:/data - ./data:/data
sysctls:
- net.ipv4.ping_group_range=0 2147483647
``` ```
First run: set `UPTOP_ADMIN_KEY` to your SSH public key, or attach to the container and add it in the Users tab. First run: set `UPTOP_ADMIN_KEY` to your SSH public key.
The `sysctls` line enables unprivileged ICMP inside the container — without it, ping monitors get no response and silently report DOWN.
</details> </details>
<details> <details>
<summary><strong>Binary (Linux amd64)</strong></summary> <summary><strong>Binary (Linux, macOS, Windows)</strong></summary>
Download from [Releases](https://github.com/lerkolabs/uptop/releases). Download from [Releases](https://github.com/lerkolabs/uptop/releases) — amd64 and arm64 tarballs (zip for Windows), plus `.deb`/`.rpm` packages and `checksums.txt`.
</details> </details>
@@ -162,6 +170,19 @@ Set `UPTOP_ENCRYPTION_KEY` to encrypt alert credentials (SMTP passwords, webhook
Without this, credentials are stored as plaintext in the database. uptop warns on startup if unset. To encrypt credentials on an existing install, run `uptop migrate-secrets` with the key set. Without this, credentials are stored as plaintext in the database. uptop warns on startup if unset. To encrypt credentials on an existing install, run `uptop migrate-secrets` with the key set.
### Data retention
uptop prunes its own history in the background — no external cleanup jobs needed:
| Data | Kept |
|---|---|
| Check history | newest 1,000 checks per monitor |
| State changes (UP/DOWN transitions) | newest 5,000 per monitor |
| Logs | newest 200 entries |
| Maintenance windows | 7 days after they end (configurable) |
Sparklines, uptime percentages, and SLA reports are computed from these windows, so very long-horizon stats aren't retained. Export to Prometheus via `/metrics` if you need unlimited history.
## Clustering ## Clustering
uptop supports three modes: **leader** (default single node), **follower** (HA failover — takes over if the leader goes down), and **probe** (stateless distributed checks from multiple regions). uptop supports three modes: **leader** (default single node), **follower** (HA failover — takes over if the leader goes down), and **probe** (stateless distributed checks from multiple regions).
@@ -174,7 +195,7 @@ Export your Kuma backup JSON, then:
```bash ```bash
curl -X POST http://localhost:8080/api/import/kuma \ curl -X POST http://localhost:8080/api/import/kuma \
-H "X-Upkeep-Secret: your-secret" \ -H "X-Uptop-Secret: your-secret" \
-H "Content-Type: application/json" \ -H "Content-Type: application/json" \
-d @kuma-backup.json -d @kuma-backup.json
``` ```
+8 -2
@@ -23,7 +23,13 @@ filter_unconventional = true
split_commits = false split_commits = false
protect_breaking_commits = false protect_breaking_commits = false
filter_commits = false filter_commits = false
tag_pattern = "[0-9]*" # Only final tags count as releases — rc rehearsal tags must not become
# section boundaries, or the final tag's notes would cover only
# commits-since-last-rc (v0.1.0 rendered 0 commits with ignore_tags, which
# drops rc-tagged commits instead of folding them forward). With rc tags
# outside the pattern, finals render the full span and rc tags render
# [Unreleased] with everything pending. Verified empirically on both.
tag_pattern = 'v[0-9]+\.[0-9]+\.[0-9]+$'
topo_order = false topo_order = false
sort_commits = "oldest" sort_commits = "oldest"
@@ -33,7 +39,7 @@ commit_parsers = [
{ message = "^perf", group = "Changed" }, { message = "^perf", group = "Changed" },
{ message = "^refactor", group = "Changed" }, { message = "^refactor", group = "Changed" },
{ message = "^security", group = "Security" }, { message = "^security", group = "Security" },
{ body = ".*security", group = "Security" }, { message = "^polish", group = "Changed" },
{ body = "BREAKING", group = "Breaking" }, { body = "BREAKING", group = "Breaking" },
{ footer = "BREAKING.CHANGE", group = "Breaking" }, { footer = "BREAKING.CHANGE", group = "Breaking" },
{ message = "^docs", skip = true }, { message = "^docs", skip = true },
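To see which tags the new `tag_pattern` treats as release boundaries, the pattern can be tried against sample tags. The sketch below uses Go's `regexp` package as a stand-in. git-cliff evaluates the pattern with its own regex engine, so this assumes the expression means the same in both, which should hold for a pattern this simple. `isFinalTag` is a hypothetical name.

```go
package main

import (
	"fmt"
	"regexp"
)

// finalTag copies the new tag_pattern value from the hunk above.
var finalTag = regexp.MustCompile(`v[0-9]+\.[0-9]+\.[0-9]+$`)

// isFinalTag reports whether a tag would count as a section boundary.
func isFinalTag(tag string) bool {
	return finalTag.MatchString(tag)
}

func main() {
	// A final tag, an rc rehearsal tag, and an old CalVer tag without "v".
	for _, tag := range []string{"v0.1.0", "v0.1.0-rc.1", "2026.06.2"} {
		fmt.Printf("%-12s final=%v\n", tag, isFinalTag(tag))
	}
}
```

The trailing `$` is what excludes rc tags. The pattern has no leading `^`, so any tag that merely ends in a `vX.Y.Z` suffix would also match.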
+133
@@ -0,0 +1,133 @@
package main
import (
"net"
"os"
"strconv"
"time"
"gitea.lerkolabs.com/lerkolabs/uptop/internal/server"
)
type appConfig struct {
Port int
SSHHostKey string
DBType string
DBDSN string
HTTPPort int
TLSCert string
TLSKey string
StatusEnabled bool
StatusTitle string
ClusterMode string
ClusterSecret string
PeerURL string
NodeID string
NodeName string
NodeRegion string
AggStrategy string
AllowPrivateTargets bool
InsecureSkipVerify bool
MaintRetention time.Duration
EncryptionKey string
MetricsPublic bool
CORSOrigin string
TrustedProxies []*net.IPNet
AdminKey string
KeysFile string
}
func parseConfig() appConfig {
cfg := appConfig{
Port: 23234,
SSHHostKey: ".ssh/id_ed25519",
DBType: "sqlite",
DBDSN: "uptop.db",
HTTPPort: 8080,
StatusTitle: "System Status",
ClusterMode: "leader",
MaintRetention: 7 * 24 * time.Hour,
}
if v := os.Getenv("UPTOP_PORT"); v != "" {
if n, err := strconv.Atoi(v); err == nil {
cfg.Port = n
}
}
if v := os.Getenv("UPTOP_DB_TYPE"); v != "" {
cfg.DBType = v
}
if v := os.Getenv("UPTOP_DB_DSN"); v != "" {
cfg.DBDSN = v
}
if v := os.Getenv("UPTOP_HTTP_PORT"); v != "" {
if n, err := strconv.Atoi(v); err == nil {
cfg.HTTPPort = n
}
}
if os.Getenv("UPTOP_STATUS_ENABLED") == "true" {
cfg.StatusEnabled = true
}
if v := os.Getenv("UPTOP_STATUS_TITLE"); v != "" {
cfg.StatusTitle = v
}
if v := os.Getenv("UPTOP_CLUSTER_MODE"); v != "" {
cfg.ClusterMode = v
}
if v := os.Getenv("UPTOP_PEER_URL"); v != "" {
cfg.PeerURL = v
}
if v := os.Getenv("UPTOP_CLUSTER_SECRET"); v != "" {
cfg.ClusterSecret = v
}
cfg.NodeID = os.Getenv("UPTOP_NODE_ID")
cfg.NodeName = os.Getenv("UPTOP_NODE_NAME")
cfg.NodeRegion = os.Getenv("UPTOP_NODE_REGION")
cfg.AggStrategy = os.Getenv("UPTOP_AGG_STRATEGY")
cfg.AllowPrivateTargets = os.Getenv("UPTOP_ALLOW_PRIVATE_TARGETS") == "true"
cfg.InsecureSkipVerify = os.Getenv("UPTOP_INSECURE_SKIP_VERIFY") == "true"
cfg.MetricsPublic = os.Getenv("UPTOP_METRICS_PUBLIC") == "true"
cfg.EncryptionKey = os.Getenv("UPTOP_ENCRYPTION_KEY")
cfg.TLSCert = os.Getenv("UPTOP_TLS_CERT")
cfg.TLSKey = os.Getenv("UPTOP_TLS_KEY")
cfg.CORSOrigin = os.Getenv("UPTOP_CORS_ORIGIN")
cfg.TrustedProxies = parseTrustedProxies(os.Getenv("UPTOP_TRUSTED_PROXIES"))
cfg.SSHHostKey = envOrDefault("UPTOP_SSH_HOST_KEY", cfg.SSHHostKey)
cfg.AdminKey = os.Getenv("UPTOP_ADMIN_KEY")
cfg.KeysFile = os.Getenv("UPTOP_KEYS")
if v := os.Getenv("UPTOP_MAINT_RETENTION"); v != "" {
if d, err := time.ParseDuration(v); err == nil && d > 0 {
cfg.MaintRetention = d
}
}
return cfg
}
func (c appConfig) serverConfig(quietHTTPLog bool) server.ServerConfig {
return server.ServerConfig{
Port: c.HTTPPort,
EnableStatus: c.StatusEnabled,
Title: c.StatusTitle,
ClusterKey: c.ClusterSecret,
TLSCert: c.TLSCert,
TLSKey: c.TLSKey,
ClusterMode: c.ClusterMode,
MetricsPublic: c.MetricsPublic,
CORSOrigin: c.CORSOrigin,
TrustedProxies: c.TrustedProxies,
QuietHTTPLog: quietHTTPLog,
}
}
+4 -5
@@ -9,22 +9,21 @@ import (
"time" "time"
"gitea.lerkolabs.com/lerkolabs/uptop/internal/models" "gitea.lerkolabs.com/lerkolabs/uptop/internal/models"
"gitea.lerkolabs.com/lerkolabs/uptop/internal/store" "gitea.lerkolabs.com/lerkolabs/uptop/internal/store/storetest"
"github.com/charmbracelet/ssh" "github.com/charmbracelet/ssh"
gossh "golang.org/x/crypto/ssh" gossh "golang.org/x/crypto/ssh"
) )
// kcMockStore implements only what keyCache and userInvalidatingStore touch; // kcMockStore embeds BaseMock for default no-ops; only GetAllUsers is
// any other Store method panics via the embedded nil interface. // overridden because the tests mutate users/err between calls.
type kcMockStore struct { type kcMockStore struct {
store.Store storetest.BaseMock
users []models.User users []models.User
err error err error
} }
func (m *kcMockStore) GetAllUsers(_ context.Context) ([]models.User, error) { return m.users, m.err } func (m *kcMockStore) GetAllUsers(_ context.Context) ([]models.User, error) { return m.users, m.err }
func (m *kcMockStore) DeleteUser(_ context.Context, _ int) error { return nil }
func testKey(t *testing.T) (string, ssh.PublicKey) { func testKey(t *testing.T) (string, ssh.PublicKey) {
t.Helper() t.Helper()
+113 -148
@@ -6,13 +6,13 @@ import (
"errors" "errors"
"flag" "flag"
"fmt" "fmt"
"log" "log/slog"
"net" "net"
"net/url" "net/url"
"os" "os"
"os/signal" "os/signal"
"path/filepath" "path/filepath"
"strconv" "runtime/debug"
"strings" "strings"
"sync" "sync"
"syscall" "syscall"
@@ -40,8 +40,34 @@ var (
date = "unknown" date = "unknown"
) )
// GoReleaser stamps the vars above via ldflags, but `go install module@tag`
// compiles without them and would report "dev". The module version and any
// vcs stamps are embedded in every binary, so fall back to those.
func init() {
if version != "dev" {
return
}
info, ok := debug.ReadBuildInfo()
if !ok {
return
}
if mv := info.Main.Version; mv != "" && mv != "(devel)" {
version = strings.TrimPrefix(mv, "v")
}
for _, s := range info.Settings {
switch s.Key {
case "vcs.revision":
commit = s.Value
case "vcs.time":
date = s.Value
}
}
}
func main() { func main() {
log.SetOutput(os.Stderr) slog.SetDefault(slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{
Level: slog.LevelInfo,
})))
if len(os.Args) >= 2 { if len(os.Args) >= 2 {
switch os.Args[1] { switch os.Args[1] {
@@ -63,11 +89,18 @@ func main() {
} }
func printVersion() { func printVersion() {
if version == "dev" { out := "uptop " + version
fmt.Println("uptop dev") var meta []string
} else { if commit != "none" {
fmt.Printf("uptop %s (%s, %s)\n", version, commit, date) meta = append(meta, commit)
} }
if date != "unknown" {
meta = append(meta, date)
}
if len(meta) > 0 {
out += " (" + strings.Join(meta, ", ") + ")"
}
fmt.Println(out)
} }
func envOrDefault(key, fallback string) string { func envOrDefault(key, fallback string) string {
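The version fallback in `init` and the new `printVersion` output can be restated as two pure functions, which makes the rules easier to check. This is a sketch under assumptions: `resolveVersion` and `formatVersion` are hypothetical names, and the module version is passed in as a string instead of being read from `debug.ReadBuildInfo`.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveVersion keeps an ldflags-stamped version and otherwise falls back
// to the embedded module version, skipping the "(devel)" placeholder.
func resolveVersion(stamped, moduleVersion string) string {
	if stamped != "dev" {
		return stamped
	}
	if moduleVersion != "" && moduleVersion != "(devel)" {
		return strings.TrimPrefix(moduleVersion, "v")
	}
	return stamped
}

// formatVersion builds the --version line, omitting commit and date while
// they still hold their placeholder values "none" and "unknown".
func formatVersion(version, commit, date string) string {
	out := "uptop " + version
	var meta []string
	if commit != "none" {
		meta = append(meta, commit)
	}
	if date != "unknown" {
		meta = append(meta, date)
	}
	if len(meta) > 0 {
		out += " (" + strings.Join(meta, ", ") + ")"
	}
	return out
}

func main() {
	// `go install module@tag`: no ldflags, module version embedded.
	fmt.Println(formatVersion(resolveVersion("dev", "v0.1.0"), "none", "unknown"))
	// Release build: everything stamped via ldflags.
	fmt.Println(formatVersion(resolveVersion("0.1.0", "v0.1.0"), "adf8fed", "2026-06-17"))
	// Local `go run`: nothing stamped, module version is "(devel)".
	fmt.Println(formatVersion(resolveVersion("dev", "(devel)"), "none", "unknown"))
}
```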
@@ -111,7 +144,7 @@ func parseTrustedProxies(raw string) []*net.IPNet {
} }
_, ipnet, err := net.ParseCIDR(part) _, ipnet, err := net.ParseCIDR(part)
if err != nil { if err != nil {
fmt.Fprintf(os.Stderr, "WARNING: ignoring invalid UPTOP_TRUSTED_PROXIES entry %q: %v\n", part, err) slog.Warn("ignoring invalid UPTOP_TRUSTED_PROXIES entry", "entry", part, "err", err) //nolint:gosec // structured slog, not format string
continue continue
} }
cidrs = append(cidrs, ipnet) cidrs = append(cidrs, ipnet)
@@ -128,21 +161,21 @@ func openStore(dbType, dsn string) store.Store {
ss, err = store.NewSQLiteStore(dsn) ss, err = store.NewSQLiteStore(dsn)
} }
if err != nil { if err != nil {
fmt.Fprintf(os.Stderr, "database error: %v\n", err) slog.Error("database connection failed", "err", err)
os.Exit(1) os.Exit(1)
} }
if encKey := os.Getenv("UPTOP_ENCRYPTION_KEY"); encKey != "" { if encKey := os.Getenv("UPTOP_ENCRYPTION_KEY"); encKey != "" {
enc, err := store.NewEncryptor(encKey) enc, err := store.NewEncryptor(encKey)
if err != nil { if err != nil {
fmt.Fprintf(os.Stderr, "encryption key error: %v\n", err) slog.Error("encryption key invalid", "err", err)
os.Exit(1) os.Exit(1)
} }
ss.SetEncryptor(enc) ss.SetEncryptor(enc)
} else { } else {
fmt.Println("WARNING: No UPTOP_ENCRYPTION_KEY set. Alert credentials stored unencrypted.") slog.Warn("no UPTOP_ENCRYPTION_KEY set, alert credentials stored unencrypted")
} }
if err := ss.Init(context.Background()); err != nil { if err := ss.Init(context.Background()); err != nil {
fmt.Fprintf(os.Stderr, "database init error: %v\n", err) slog.Error("database init failed", "err", err)
os.Exit(1) os.Exit(1)
} }
return ss return ss
@@ -167,7 +200,7 @@ func runApply(args []string) {
f, err := config.LoadFile(*filePath) f, err := config.LoadFile(*filePath)
if err != nil { if err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err) slog.Error("config load failed", "err", err)
os.Exit(1) os.Exit(1)
} }
@@ -176,7 +209,7 @@ func runApply(args []string) {
Prune: *prune, Prune: *prune,
}) })
if err != nil { if err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err) slog.Error("config apply failed", "err", err)
os.Exit(1) os.Exit(1)
} }
@@ -194,12 +227,12 @@ func runExport(args []string) {
f, err := config.Export(context.Background(), s) f, err := config.Export(context.Background(), s)
if err != nil { if err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err) slog.Error("export failed", "err", err)
os.Exit(1) os.Exit(1)
} }
if err := config.WriteFile(f, *outPath); err != nil { if err := config.WriteFile(f, *outPath); err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err) slog.Error("export write failed", "err", err)
os.Exit(1) os.Exit(1)
} }
} }
@@ -217,7 +250,7 @@ func runMigrateSecrets(args []string) {
} }
enc, err := store.NewEncryptor(encKey) enc, err := store.NewEncryptor(encKey)
if err != nil { if err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err) slog.Error("encryption key invalid", "err", err)
os.Exit(1) os.Exit(1)
} }
@@ -228,25 +261,25 @@ func runMigrateSecrets(args []string) {
ss, err = store.NewSQLiteStore(*dsn) ss, err = store.NewSQLiteStore(*dsn)
} }
if err != nil { if err != nil {
fmt.Fprintf(os.Stderr, "database error: %v\n", err) slog.Error("database connection failed", "err", err)
os.Exit(1) os.Exit(1)
} }
if err := ss.Init(context.Background()); err != nil { if err := ss.Init(context.Background()); err != nil {
fmt.Fprintf(os.Stderr, "database init error: %v\n", err) slog.Error("database init failed", "err", err)
os.Exit(1)
}
alerts, err := ss.GetAllAlerts(context.Background())
if err != nil {
fmt.Fprintf(os.Stderr, "error loading alerts: %v\n", err)
os.Exit(1) os.Exit(1)
} }
ss.SetEncryptor(enc) ss.SetEncryptor(enc)
alerts, err := ss.GetAllAlerts(context.Background())
if err != nil {
slog.Error("failed to load alerts", "err", err)
os.Exit(1)
}
migrated := 0 migrated := 0
for _, a := range alerts { for _, a := range alerts {
if err := ss.UpdateAlert(context.Background(), a.ID, a.Name, a.Type, a.Settings); err != nil { if err := ss.UpdateAlert(context.Background(), a.ID, a.Name, a.Type, a.Settings); err != nil {
fmt.Fprintf(os.Stderr, "error migrating alert %q: %v\n", a.Name, err) slog.Error("alert migration failed", "alert", a.Name, "err", err)
os.Exit(1) os.Exit(1)
} }
migrated++ migrated++
@@ -255,64 +288,19 @@ func runMigrateSecrets(args []string) {
} }
func runServe(args []string) { func runServe(args []string) {
portVal := 23234 cfg := parseConfig()
dbType := "sqlite"
dbDSN := "uptop.db"
httpPort := 8080
enableStatus := false
statusTitle := "System Status"
clusterMode := "leader"
clusterPeer := ""
clusterKey := ""
if v := os.Getenv("UPTOP_PORT"); v != "" { if cfg.ClusterMode == "probe" {
if p, err := strconv.Atoi(v); err == nil { if cfg.NodeID == "" {
portVal = p
}
}
if v := os.Getenv("UPTOP_DB_TYPE"); v != "" {
dbType = v
}
if v := os.Getenv("UPTOP_DB_DSN"); v != "" {
dbDSN = v
}
if v := os.Getenv("UPTOP_HTTP_PORT"); v != "" {
if p, err := strconv.Atoi(v); err == nil {
httpPort = p
}
}
if v := os.Getenv("UPTOP_STATUS_ENABLED"); v == "true" {
enableStatus = true
}
if v := os.Getenv("UPTOP_STATUS_TITLE"); v != "" {
statusTitle = v
}
if v := os.Getenv("UPTOP_CLUSTER_MODE"); v != "" {
clusterMode = v
}
if v := os.Getenv("UPTOP_PEER_URL"); v != "" {
clusterPeer = v
}
if v := os.Getenv("UPTOP_CLUSTER_SECRET"); v != "" {
clusterKey = v
}
nodeID := os.Getenv("UPTOP_NODE_ID")
nodeName := os.Getenv("UPTOP_NODE_NAME")
nodeRegion := os.Getenv("UPTOP_NODE_REGION")
aggStrategy := os.Getenv("UPTOP_AGG_STRATEGY")
if clusterMode == "probe" {
if nodeID == "" {
fmt.Fprintln(os.Stderr, "UPTOP_NODE_ID is required for probe mode") fmt.Fprintln(os.Stderr, "UPTOP_NODE_ID is required for probe mode")
os.Exit(1) os.Exit(1)
} }
if clusterPeer == "" { if cfg.PeerURL == "" {
fmt.Fprintln(os.Stderr, "UPTOP_PEER_URL is required for probe mode") fmt.Fprintln(os.Stderr, "UPTOP_PEER_URL is required for probe mode")
os.Exit(1) os.Exit(1)
} }
fmt.Printf("Cluster: Running as PROBE (node=%s, region=%s)\n", nodeID, nodeRegion) fmt.Printf("Cluster: Running as PROBE (node=%s, region=%s)\n", cfg.NodeID, cfg.NodeRegion)
ctx, cancel := context.WithCancel(context.Background()) ctx, cancel := context.WithCancel(context.Background())
defer cancel() defer cancel()
@@ -323,29 +311,28 @@ func runServe(args []string) {
cancel() cancel()
}() }()
probeAllowPrivate := os.Getenv("UPTOP_ALLOW_PRIVATE_TARGETS") == "true" if cfg.AllowPrivateTargets {
if probeAllowPrivate { slog.Warn("private target blocking disabled, monitor URLs can reach internal networks")
fmt.Println("WARNING: Private target blocking disabled. Monitor URLs can reach internal networks.")
} }
if err := cluster.RunProbe(ctx, cluster.ProbeConfig{ if err := cluster.RunProbe(ctx, cluster.ProbeConfig{
NodeID: nodeID, NodeID: cfg.NodeID,
NodeName: nodeName, NodeName: cfg.NodeName,
Region: nodeRegion, Region: cfg.NodeRegion,
LeaderURL: clusterPeer, LeaderURL: cfg.PeerURL,
SharedKey: clusterKey, SharedKey: cfg.ClusterSecret,
Interval: 30, Interval: 30,
AllowPrivateTargets: probeAllowPrivate, AllowPrivateTargets: cfg.AllowPrivateTargets,
}); err != nil { }); err != nil {
fmt.Fprintf(os.Stderr, "Probe error: %v\n", err) slog.Error("probe failed", "err", err)
} }
return return
} }
fs := flag.NewFlagSet("serve", flag.ExitOnError) fs := flag.NewFlagSet("serve", flag.ExitOnError)
port := fs.Int("port", portVal, "SSH Port") port := fs.Int("port", cfg.Port, "SSH Port")
flagDBType := fs.String("db-type", dbType, "Database type") flagDBType := fs.String("db-type", cfg.DBType, "Database type")
flagDSN := fs.String("dsn", dbDSN, "Database DSN") flagDSN := fs.String("dsn", cfg.DBDSN, "Database DSN")
demo := fs.Bool("demo", false, "Seed demo data") demo := fs.Bool("demo", false, "Seed demo data")
importKuma := fs.String("import-kuma", "", "Import Uptime Kuma backup JSON file") importKuma := fs.String("import-kuma", "", "Import Uptime Kuma backup JSON file")
_ = fs.Parse(args) // ExitOnError: parse errors exit before returning _ = fs.Parse(args) // ExitOnError: parse errors exit before returning
@@ -354,32 +341,32 @@ func runServe(args []string) {
var dbErr error var dbErr error
if *flagDBType == "postgres" { if *flagDBType == "postgres" {
ss, dbErr = store.NewPostgresStore(*flagDSN) ss, dbErr = store.NewPostgresStore(*flagDSN)
fmt.Printf("Using PostgreSQL: %s\n", redactDSN(*flagDSN)) slog.Info("database connected", "type", "postgres", "dsn", redactDSN(*flagDSN))
} else { } else {
ss, dbErr = store.NewSQLiteStore(*flagDSN) ss, dbErr = store.NewSQLiteStore(*flagDSN)
fmt.Printf("Using SQLite: %s\n", *flagDSN) slog.Info("database connected", "type", "sqlite", "dsn", *flagDSN)
} }
if dbErr != nil { if dbErr != nil {
fmt.Fprintf(os.Stderr, "database connection error: %v\n", dbErr) slog.Error("database connection failed", "err", dbErr)
os.Exit(1) os.Exit(1)
} }
defer ss.Close() defer ss.Close()
if encKey := os.Getenv("UPTOP_ENCRYPTION_KEY"); encKey != "" { if cfg.EncryptionKey != "" {
enc, err := store.NewEncryptor(encKey) enc, err := store.NewEncryptor(cfg.EncryptionKey)
if err != nil { if err != nil {
fmt.Fprintf(os.Stderr, "encryption key error: %v\n", err) slog.Error("encryption key invalid", "err", err)
os.Exit(1) os.Exit(1)
} }
ss.SetEncryptor(enc) ss.SetEncryptor(enc)
} else { } else {
fmt.Println("WARNING: No UPTOP_ENCRYPTION_KEY set. Alert credentials stored unencrypted.") slog.Warn("no UPTOP_ENCRYPTION_KEY set, alert credentials stored unencrypted")
} }
kc := newKeyCache(ss) kc := newKeyCache(ss)
var s store.Store = &userInvalidatingStore{Store: ss, kc: kc} var s store.Store = &userInvalidatingStore{Store: ss, kc: kc}
if err := s.Init(context.Background()); err != nil { if err := s.Init(context.Background()); err != nil {
fmt.Fprintf(os.Stderr, "database init error: %v\n", err) slog.Error("database init failed", "err", err)
os.Exit(1) os.Exit(1)
} }
if *demo { if *demo {
@@ -391,34 +378,29 @@ func runServe(args []string) {
if *importKuma != "" { if *importKuma != "" {
kb, err := importer.LoadKumaFile(*importKuma) kb, err := importer.LoadKumaFile(*importKuma)
if err != nil { if err != nil {
fmt.Fprintf(os.Stderr, "kuma import error: %v\n", err) slog.Error("kuma import failed", "err", err)
os.Exit(1) os.Exit(1)
} }
backup := importer.ConvertKuma(kb) backup := importer.ConvertKuma(kb)
if err := s.ImportData(context.Background(), backup); err != nil { if err := s.ImportData(context.Background(), backup); err != nil {
fmt.Fprintf(os.Stderr, "import failed: %v\n", err) slog.Error("import failed", "err", err)
os.Exit(1) os.Exit(1)
} }
fmt.Printf("Imported %d monitors and %d alerts from Uptime Kuma v%s\n", len(backup.Sites), len(backup.Alerts), kb.Version) fmt.Printf("Imported %d monitors and %d alerts from Uptime Kuma v%s\n", len(backup.Sites), len(backup.Alerts), kb.Version)
} }
allowPrivate := os.Getenv("UPTOP_ALLOW_PRIVATE_TARGETS") == "true" if cfg.AllowPrivateTargets {
if allowPrivate { slog.Warn("private target blocking disabled, monitor URLs can reach internal networks")
fmt.Println("WARNING: Private target blocking disabled. Monitor URLs can reach internal networks.")
} }
eng := monitor.NewEngineWithOpts(s, allowPrivate) eng := monitor.NewEngineWithOpts(s, cfg.AllowPrivateTargets)
if os.Getenv("UPTOP_INSECURE_SKIP_VERIFY") == "true" { if cfg.InsecureSkipVerify {
eng.SetInsecureSkipVerify(true) eng.SetInsecureSkipVerify(true)
} }
if aggStrategy != "" { if cfg.AggStrategy != "" {
eng.SetAggStrategy(monitor.AggregationStrategy(aggStrategy)) eng.SetAggStrategy(monitor.AggregationStrategy(cfg.AggStrategy))
}
if v := os.Getenv("UPTOP_MAINT_RETENTION"); v != "" {
if d, err := time.ParseDuration(v); err == nil && d > 0 {
eng.SetMaintRetention(d)
}
} }
eng.SetMaintRetention(cfg.MaintRetention)
ctx, cancel := context.WithCancel(context.Background()) ctx, cancel := context.WithCancel(context.Background())
defer cancel() defer cancel()
@@ -428,31 +410,14 @@ func runServe(args []string) {
eng.InitAlertHealth() eng.InitAlertHealth()
eng.Start(ctx) eng.Start(ctx)
tlsCert := os.Getenv("UPTOP_TLS_CERT")
tlsKey := os.Getenv("UPTOP_TLS_KEY")
// When the local TUI owns the terminal, per-request HTTP logs to stderr
// would scribble over the alt screen.
localTUI := isatty.IsTerminal(os.Stdout.Fd()) || isatty.IsCygwinTerminal(os.Stdout.Fd()) localTUI := isatty.IsTerminal(os.Stdout.Fd()) || isatty.IsCygwinTerminal(os.Stdout.Fd())
httpSrv := server.Start(server.ServerConfig{ httpSrv := server.Start(cfg.serverConfig(localTUI), s, eng)
Port: httpPort,
EnableStatus: enableStatus,
Title: statusTitle,
ClusterKey: clusterKey,
TLSCert: tlsCert,
TLSKey: tlsKey,
ClusterMode: clusterMode,
MetricsPublic: os.Getenv("UPTOP_METRICS_PUBLIC") == "true",
CORSOrigin: os.Getenv("UPTOP_CORS_ORIGIN"),
TrustedProxies: parseTrustedProxies(os.Getenv("UPTOP_TRUSTED_PROXIES")),
QuietHTTPLog: localTUI,
}, s, eng)
cluster.Start(ctx, cluster.Config{ cluster.Start(ctx, cluster.Config{
Mode: clusterMode, Mode: cfg.ClusterMode,
PeerURL: clusterPeer, PeerURL: cfg.PeerURL,
SharedKey: clusterKey, SharedKey: cfg.ClusterSecret,
}, eng) }, eng)
sshSrv := startSSHServer(*port, s, eng, kc) sshSrv := startSSHServer(*port, s, eng, kc)
@@ -460,7 +425,7 @@ func runServe(args []string) {
if localTUI { if localTUI {
p := tea.NewProgram(tui.InitialModel(true, s, eng, version), tea.WithAltScreen(), tea.WithMouseCellMotion()) p := tea.NewProgram(tui.InitialModel(true, s, eng, version), tea.WithAltScreen(), tea.WithMouseCellMotion())
if _, err := p.Run(); err != nil { if _, err := p.Run(); err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err) slog.Error("TUI failed", "err", err)
} }
} else { } else {
fmt.Println("uptop running in HEADLESS mode") fmt.Println("uptop running in HEADLESS mode")
@@ -471,20 +436,18 @@ func runServe(args []string) {
} }
cancel() cancel()
// Drain pending DB writes before the deferred ss.Close() runs, so no
// write races a closed database.
eng.Stop() eng.Stop()
shutdownCtx, shutdownCancel := context.WithTimeout(context.Background(), 30*time.Second) shutdownCtx, shutdownCancel := context.WithTimeout(context.Background(), 30*time.Second)
defer shutdownCancel() defer shutdownCancel()
if httpSrv != nil { if httpSrv != nil {
if err := httpSrv.Shutdown(shutdownCtx); err != nil { if err := httpSrv.Shutdown(shutdownCtx); err != nil {
log.Printf("HTTP shutdown error: %v", err) slog.Error("HTTP shutdown failed", "err", err)
} }
} }
if sshSrv != nil { if sshSrv != nil {
if err := sshSrv.Shutdown(shutdownCtx); err != nil { if err := sshSrv.Shutdown(shutdownCtx); err != nil {
log.Printf("SSH shutdown error: %v", err) slog.Error("SSH shutdown failed", "err", err)
} }
} }
} }
@@ -503,12 +466,12 @@ func startSSHServer(port int, db store.Store, eng *monitor.Engine, kc *keyCache)
), ),
) )
if err != nil { if err != nil {
fmt.Fprintf(os.Stderr, "SSH server error: %v\n", err) slog.Error("SSH server failed", "err", err)
return nil return nil
} }
go func() { go func() {
if err := s.ListenAndServe(); err != nil && !errors.Is(err, ssh.ErrServerClosed) { if err := s.ListenAndServe(); err != nil && !errors.Is(err, ssh.ErrServerClosed) {
log.Printf("SSH server error: %v", err) slog.Error("SSH server failed", "err", err)
} }
}() }()
return s return s
@@ -523,11 +486,11 @@ func seedDemoData(s store.Store) {
fmt.Println("Seeding demo data...") fmt.Println("Seeding demo data...")
if err := s.AddAlert(ctx, "Discord Ops", "discord", map[string]string{"url": "https://discord.com/api/webhooks/demo/token"}); err != nil { if err := s.AddAlert(ctx, "Discord Ops", "discord", map[string]string{"url": "https://discord.com/api/webhooks/demo/token"}); err != nil {
log.Printf("demo seed: add alert: %v", err) slog.Error("demo seed failed", "step", "add alert", "err", err)
return return
} }
if err := s.AddAlert(ctx, "Slack Infra", "slack", map[string]string{"url": "https://hooks.slack.com/services/DEMO/WEBHOOK"}); err != nil { if err := s.AddAlert(ctx, "Slack Infra", "slack", map[string]string{"url": "https://hooks.slack.com/services/DEMO/WEBHOOK"}); err != nil {
log.Printf("demo seed: add alert: %v", err) slog.Error("demo seed failed", "step", "add alert", "err", err)
return return
} }
if err := s.AddAlert(ctx, "Email Oncall", "email", map[string]string{ if err := s.AddAlert(ctx, "Email Oncall", "email", map[string]string{
@@ -535,7 +498,7 @@ func seedDemoData(s store.Store) {
"user": "oncall@example.com", "pass": "replace-me", "user": "oncall@example.com", "pass": "replace-me",
"from": "oncall@example.com", "to": "team@example.com", "from": "oncall@example.com", "to": "team@example.com",
}); err != nil { }); err != nil {
log.Printf("demo seed: add alert: %v", err) slog.Error("demo seed failed", "step", "add alert", "err", err)
return return
} }
@@ -545,7 +508,7 @@ func seedDemoData(s store.Store) {
alertID = alerts[0].ID alertID = alerts[0].ID
} }
demoSites := []models.Site{ demoSites := []models.SiteConfig{
{Name: "Google", URL: "https://www.google.com", Type: "http", Interval: 30, AlertID: alertID, CheckSSL: true, ExpiryThreshold: 14, MaxRetries: 2}, {Name: "Google", URL: "https://www.google.com", Type: "http", Interval: 30, AlertID: alertID, CheckSSL: true, ExpiryThreshold: 14, MaxRetries: 2},
{Name: "GitHub", URL: "https://github.com", Type: "http", Interval: 30, AlertID: alertID, CheckSSL: true, ExpiryThreshold: 7, MaxRetries: 3}, {Name: "GitHub", URL: "https://github.com", Type: "http", Interval: 30, AlertID: alertID, CheckSSL: true, ExpiryThreshold: 7, MaxRetries: 3},
{Name: "Cloudflare DNS", URL: "https://1.1.1.1", Type: "http", Interval: 60, AlertID: alertID, ExpiryThreshold: 7, MaxRetries: 1}, {Name: "Cloudflare DNS", URL: "https://1.1.1.1", Type: "http", Interval: 60, AlertID: alertID, ExpiryThreshold: 7, MaxRetries: 1},
@@ -559,7 +522,7 @@ func seedDemoData(s store.Store) {
} }
for _, site := range demoSites { for _, site := range demoSites {
if err := s.AddSite(ctx, site); err != nil { if err := s.AddSite(ctx, site); err != nil {
log.Printf("demo seed: add site %q: %v", site.Name, err) slog.Error("demo seed failed", "step", "add site", "site", site.Name, "err", err)
} }
} }
} }
@@ -582,7 +545,7 @@ func (c *keyCache) refresh() {
// Keep the previous key set: a transient DB error must not lock every // Keep the previous key set: a transient DB error must not lock every
// admin out. Revocation still fails closed because Invalidate clears // admin out. Revocation still fails closed because Invalidate clears
// the set immediately. // the set immediately.
log.Printf("SSH key cache refresh failed: %v", err) slog.Error("SSH key cache refresh failed", "err", err)
return return
} }
keys := make([]ssh.PublicKey, 0, len(users)) keys := make([]ssh.PublicKey, 0, len(users))
@@ -672,7 +635,9 @@ func seedKeysFromEnv(s store.Store) {
if path := os.Getenv("UPTOP_KEYS"); path != "" { if path := os.Getenv("UPTOP_KEYS"); path != "" {
f, err := os.Open(filepath.Clean(path)) f, err := os.Open(filepath.Clean(path))
if err == nil { if err != nil {
slog.Warn("failed to open UPTOP_KEYS file", "path", path, "err", err) //nolint:gosec // structured slog, not format string
} else {
scanner := bufio.NewScanner(f) scanner := bufio.NewScanner(f)
for scanner.Scan() { for scanner.Scan() {
line := strings.TrimSpace(scanner.Text()) line := strings.TrimSpace(scanner.Text())
@@ -691,7 +656,7 @@ func seedKeysFromEnv(s store.Store) {
existing, err := s.GetAllUsers(ctx) existing, err := s.GetAllUsers(ctx)
if err != nil { if err != nil {
fmt.Fprintf(os.Stderr, "warning: could not check existing users: %v\n", err) slog.Warn("could not check existing users", "err", err)
return return
} }
@@ -708,7 +673,7 @@ func seedKeysFromEnv(s store.Store) {
username := usernameFromKey(key, i, len(existing)+added) username := usernameFromKey(key, i, len(existing)+added)
if err := s.AddUser(ctx, username, key, "admin"); err != nil { if err := s.AddUser(ctx, username, key, "admin"); err != nil {
fmt.Fprintf(os.Stderr, "warning: failed to seed user %q: %v\n", username, err) slog.Warn("failed to seed user", "user", username, "err", err) //nolint:gosec // structured slog, not format string
continue continue
} }
fmt.Printf("Seeded admin user %q from %s\n", username, seedSource(i, len(keys), os.Getenv("UPTOP_ADMIN_KEY") != "")) fmt.Printf("Seeded admin user %q from %s\n", username, seedSource(i, len(keys), os.Getenv("UPTOP_ADMIN_KEY") != ""))
+4 -4
@@ -3,7 +3,7 @@ services:
# LEADER NODE # LEADER NODE
# ------------------------- # -------------------------
leader: leader:
build: . image: lerkolabs/uptop:latest
container_name: uptop-leader container_name: uptop-leader
ports: ports:
- "23234:23234" # SSH - "23234:23234" # SSH
@@ -18,7 +18,7 @@ services:
# Cluster Config # Cluster Config
- UPTOP_CLUSTER_MODE=leader - UPTOP_CLUSTER_MODE=leader
- UPTOP_CLUSTER_SECRET=mysecret - UPTOP_CLUSTER_SECRET=mysecret # EXAMPLE ONLY — rotate before use
depends_on: depends_on:
- leader-db - leader-db
stdin_open: true stdin_open: true
@@ -38,7 +38,7 @@ services:
# FOLLOWER NODE # FOLLOWER NODE
# ------------------------- # -------------------------
follower: follower:
build: . image: lerkolabs/uptop:latest
container_name: uptop-follower container_name: uptop-follower
ports: ports:
- "23233:23234" # SSH (Mapped to different host port) - "23233:23234" # SSH (Mapped to different host port)
@@ -53,7 +53,7 @@ services:
# Cluster Config # Cluster Config
- UPTOP_CLUSTER_MODE=follower - UPTOP_CLUSTER_MODE=follower
- UPTOP_CLUSTER_SECRET=mysecret - UPTOP_CLUSTER_SECRET=mysecret # EXAMPLE ONLY — rotate before use
# IMPORTANT: Uses the Service Name "leader" to connect internally # IMPORTANT: Uses the Service Name "leader" to connect internally
- UPTOP_PEER_URL=http://leader:8080 - UPTOP_PEER_URL=http://leader:8080
depends_on: depends_on:
+2 -2
@@ -1,8 +1,8 @@
services: services:
# The Application # The Application
app: app:
build: build:
context: . context: ..
dockerfile: Dockerfile dockerfile: Dockerfile
container_name: uptop-dev container_name: uptop-dev
ports: ports:
+6 -6
@@ -1,9 +1,9 @@
services: services:
leader: leader:
build: . image: lerkolabs/uptop:latest
environment: environment:
- UPTOP_CLUSTER_MODE=leader - UPTOP_CLUSTER_MODE=leader
- UPTOP_CLUSTER_SECRET=changeme - UPTOP_CLUSTER_SECRET=changeme # EXAMPLE ONLY — rotate before use
- UPTOP_AGG_STRATEGY=any-down - UPTOP_AGG_STRATEGY=any-down
- UPTOP_STATUS_ENABLED=true - UPTOP_STATUS_ENABLED=true
ports: ports:
@@ -11,25 +11,25 @@ services:
- "23234:23234" - "23234:23234"
probe-us-east: probe-us-east:
build: . image: lerkolabs/uptop:latest
environment: environment:
- UPTOP_CLUSTER_MODE=probe - UPTOP_CLUSTER_MODE=probe
- UPTOP_NODE_ID=us-east-1 - UPTOP_NODE_ID=us-east-1
- UPTOP_NODE_NAME=US East Probe - UPTOP_NODE_NAME=US East Probe
- UPTOP_NODE_REGION=us-east - UPTOP_NODE_REGION=us-east
- UPTOP_PEER_URL=http://leader:8080 - UPTOP_PEER_URL=http://leader:8080
- UPTOP_CLUSTER_SECRET=changeme - UPTOP_CLUSTER_SECRET=changeme # EXAMPLE ONLY — rotate before use
depends_on: depends_on:
- leader - leader
probe-eu-west: probe-eu-west:
build: . image: lerkolabs/uptop:latest
environment: environment:
- UPTOP_CLUSTER_MODE=probe - UPTOP_CLUSTER_MODE=probe
- UPTOP_NODE_ID=eu-west-1 - UPTOP_NODE_ID=eu-west-1
- UPTOP_NODE_NAME=EU West Probe - UPTOP_NODE_NAME=EU West Probe
- UPTOP_NODE_REGION=eu-west - UPTOP_NODE_REGION=eu-west
- UPTOP_PEER_URL=http://leader:8080 - UPTOP_PEER_URL=http://leader:8080
- UPTOP_CLUSTER_SECRET=changeme - UPTOP_CLUSTER_SECRET=changeme # EXAMPLE ONLY — rotate before use
depends_on: depends_on:
- leader - leader
+8 -3
@@ -1,10 +1,15 @@
services: services:
app: app:
build: image: lerkolabs/uptop:latest
context: .
dockerfile: Dockerfile
container_name: uptop container_name: uptop
restart: unless-stopped restart: unless-stopped
read_only: true
cap_drop:
- ALL
security_opt:
- no-new-privileges:true
tmpfs:
- /tmp
ports: ports:
- "23234:23234" - "23234:23234"
- "8080:8080" - "8080:8080"
+6 -1
@@ -16,6 +16,11 @@ A follower is a standby replica that takes over if the leader goes down.
- When the leader recovers, the follower detects it and goes back to standby - When the leader recovers, the follower detects it and goes back to standby
- Both nodes have their own database — they do not share state - Both nodes have their own database — they do not share state
**Limitations:**
- During a network partition where both nodes are healthy, both will run checks and fire alerts independently. There is no leader fencing — the follower has no way to confirm the leader is actually down vs. unreachable from its perspective. This window lasts until the partition heals, at which point the follower detects the leader and steps down.
- Expect duplicate alerts and doubled check history entries during a split-brain event. For most providers a duplicate alert is noisy but harmless (for example, a second "site is down" notification).
- Failover takeover time is ~15 seconds (3 missed polls × 5 second interval). This is not configurable.
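
The takeover timing above can be sketched as a counter of consecutive missed polls. This is a minimal illustration under stated assumptions, not uptop's actual follower loop: `failoverTracker`, `missThreshold`, and `pollInterval` are hypothetical names, and only the values (3 polls, 5 seconds) come from the text.

```go
package main

import (
	"fmt"
	"time"
)

const (
	pollInterval  = 5 * time.Second // poll interval stated in the docs
	missThreshold = 3               // consecutive missed polls before takeover
)

// failoverTracker counts consecutive failed health polls (hypothetical type).
type failoverTracker struct {
	misses int
	active bool // true while the follower runs checks itself
}

// observe records one poll result and reports whether the follower
// should currently be active.
func (t *failoverTracker) observe(leaderHealthy bool) bool {
	if leaderHealthy {
		// Leader answered: reset the counter and return to standby.
		t.misses = 0
		t.active = false
		return t.active
	}
	t.misses++
	if t.misses >= missThreshold {
		t.active = true
	}
	return t.active
}

func main() {
	t := &failoverTracker{}
	for i, healthy := range []bool{true, false, false, false, true} {
		fmt.Printf("poll %d healthy=%v active=%v\n", i, healthy, t.observe(healthy))
	}
	fmt.Println("takeover window:", time.Duration(missThreshold)*pollInterval)
}
```

Note that `observe` only sees whether the poll succeeded. An unreachable leader and a crashed leader look the same to it, which is why a partition produces the split-brain behavior described above.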
**Required env vars:** **Required env vars:**
| Node | Variable | Value | | Node | Variable | Value |
@@ -76,5 +81,5 @@ Set via `UPTOP_AGG_STRATEGY` on the leader.
## Security ## Security
- Set `UPTOP_CLUSTER_SECRET` on all nodes. Without it, cluster API endpoints are unauthenticated. - Set `UPTOP_CLUSTER_SECRET` on all nodes. Without it, cluster API endpoints are unauthenticated.
- Secrets are sent in HTTP headers (`X-Upkeep-Secret`). Use TLS or a reverse proxy for production. - Secrets are sent in HTTP headers (`X-Uptop-Secret`). Use TLS or a reverse proxy for production.
- uptop warns on startup if the cluster secret is missing or if cluster mode is active without TLS. - uptop warns on startup if the cluster secret is missing or if cluster mode is active without TLS.
+31 -2
@@ -122,7 +122,7 @@ Groups can't nest inside other groups. A group is healthy when all its children
## Alert types ## Alert types
All 9 providers work in the YAML. The `settings` map is different per type. All 10 providers work in the YAML. The `settings` map is different per type.
```yaml ```yaml
# Discord / Slack / Generic Webhook — just a URL # Discord / Slack / Generic Webhook — just a URL
@@ -149,6 +149,9 @@ All 9 providers work in the YAML. The `settings` map is different per type.
url: https://ntfy.sh url: https://ntfy.sh
topic: my-alerts topic: my-alerts
priority: "4" priority: "4"
# for protected topics:
# username: user
# password: pass
# Telegram # Telegram
- name: Telegram Ops - name: Telegram Ops
@@ -178,6 +181,14 @@ All 9 providers work in the YAML. The `settings` map is different per type.
url: https://gotify.example.com url: https://gotify.example.com
token: app-token token: app-token
priority: "8" priority: "8"
# Opsgenie
- name: Opsgenie
type: opsgenie
settings:
api_key: your-api-key
priority: P2 # P1-P5, default P3
# eu: "true" # use the EU API endpoint
``` ```
## Commands ## Commands
@@ -224,7 +235,25 @@ Monitors and alerts are matched by **name**. Names must be unique across the ent
Apply is idempotent. Run it twice with the same file, second run changes nothing. Apply is idempotent. Run it twice with the same file, second run changes nothing.
If something fails mid-apply, just fix the issue and run it again. It picks up where it left off. Apply is **not atomic** — items are written one at a time, so an error mid-apply (bad value, lost DB connection, ctrl-C) leaves the items already written in place. That's safe to recover from: apply diffs against the database by name, so fix the issue and run it again — it converges the rest. Just don't run two applies against the same database at once.
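
Why a re-run converges can be sketched as a diff keyed by name. This is a simplified illustration, not uptop's apply code: `plan` and `diffByName` are hypothetical, and a plain string stands in for each item's full configuration.

```go
package main

import (
	"fmt"
	"sort"
)

// plan describes what an apply run would do (hypothetical type).
type plan struct {
	Create, Update, Unchanged []string
}

// diffByName compares desired items against what is already stored,
// keyed by name. Values stand in for the full item config.
func diffByName(desired, stored map[string]string) plan {
	var p plan
	for name, want := range desired {
		have, ok := stored[name]
		switch {
		case !ok:
			p.Create = append(p.Create, name)
		case have != want:
			p.Update = append(p.Update, name)
		default:
			p.Unchanged = append(p.Unchanged, name)
		}
	}
	// Map iteration order is random; sort for stable output.
	sort.Strings(p.Create)
	sort.Strings(p.Update)
	sort.Strings(p.Unchanged)
	return p
}

func main() {
	desired := map[string]string{"api": "30s", "web": "60s", "db": "30s"}
	// State after a run that failed part-way: only "api" was written.
	stored := map[string]string{"api": "30s"}
	fmt.Printf("%+v\n", diffByName(desired, stored))
}
```

Items written before the failure land in `Unchanged` on the second run, so only the remainder is created. Two concurrent applies would each compute a plan from the same stale snapshot, which is one reason to avoid running them in parallel.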
## Backups and secrets
`uptop export` writes alert credentials (SMTP passwords, API tokens, webhook URLs) into the YAML in clear text — that's what makes the file restorable. Treat it like a secrets file.
The HTTP export endpoint redacts those same fields **by default**:
```bash
# secrets show as ***REDACTED*** — fine for sharing or review
curl -H "X-Uptop-Secret: your-secret" \
"http://localhost:8080/api/backup/export"
# full backup you can actually restore from
curl -H "X-Uptop-Secret: your-secret" \
"http://localhost:8080/api/backup/export?redact_secrets=false"
```
Restoring a redacted export imports the literal string `***REDACTED***` as your credentials. For real backups, pass `redact_secrets=false` or run `uptop export` on the host.
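
A cheap guard against restoring a redacted file is to scan it for the placeholder first. The sketch below is an assumption-laden illustration, not part of uptop: `isRestorable` and `redactedMarker` are hypothetical names, and only the marker string itself comes from the text above.

```go
package main

import (
	"bytes"
	"fmt"
)

// redactedMarker is the placeholder string quoted in the docs.
const redactedMarker = "***REDACTED***"

// isRestorable reports whether an export is free of redaction placeholders.
// It is a plain substring scan and does not parse the YAML or JSON.
func isRestorable(export []byte) bool {
	return !bytes.Contains(export, []byte(redactedMarker))
}

func main() {
	redacted := []byte("settings:\n  pass: \"***REDACTED***\"\n")
	full := []byte("settings:\n  pass: \"replace-me\"\n")
	fmt.Println("redacted export restorable:", isRestorable(redacted))
	fmt.Println("full export restorable:", isRestorable(full))
}
```

A substring scan is deliberately crude: it would also reject a file where a real credential happens to contain the marker, which is an acceptable false positive for a pre-restore check.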
## Typical workflow ## Typical workflow
+63 -2
@@ -3,9 +3,11 @@ package alert
import ( import (
"bytes" "bytes"
"context" "context"
"crypto/tls"
"encoding/json" "encoding/json"
"errors" "errors"
"fmt" "fmt"
"net"
"net/http" "net/http"
"net/smtp" "net/smtp"
"net/url" "net/url"
@@ -244,7 +246,6 @@ func (e *EmailProvider) Send(ctx context.Context, title, message string) error {
return ctx.Err() return ctx.Err()
default: default:
} }
auth := smtp.PlainAuth("", e.User, e.Pass, e.Host)
to := sanitizeHeader(e.To) to := sanitizeHeader(e.To)
from := sanitizeHeader(e.From) from := sanitizeHeader(e.From)
subject := sanitizeHeader(title) subject := sanitizeHeader(title)
@@ -256,7 +257,67 @@ func (e *EmailProvider) Send(ctx context.Context, title, message string) error {
"Content-Type: text/plain; charset=utf-8\r\n" + "Content-Type: text/plain; charset=utf-8\r\n" +
"\r\n" + "\r\n" +
body + "\r\n") body + "\r\n")
return smtp.SendMail(e.Host+":"+e.Port, auth, from, []string{to}, msg) return sendMailContext(ctx, e.Host, e.Port, e.User, e.Pass, from, []string{to}, msg)
}
// sendMailContext is a ctx-aware replacement for smtp.SendMail.
// smtp.SendMail ignores context entirely — a blackholed SMTP server hangs for
// the OS TCP timeout (minutes). This dials with the context deadline and sets
// connection deadlines so cancellation is respected throughout.
func sendMailContext(ctx context.Context, host, port, user, pass, from string, rcpt []string, msg []byte) error {
addr := host + ":" + port
dialer := net.Dialer{}
conn, err := dialer.DialContext(ctx, "tcp", addr)
if err != nil {
return fmt.Errorf("smtp dial: %w", err)
}
if deadline, ok := ctx.Deadline(); ok {
_ = conn.SetDeadline(deadline)
}
c, err := smtp.NewClient(conn, host)
if err != nil {
_ = conn.Close()
return fmt.Errorf("smtp client: %w", err)
}
defer c.Close()
if ok, _ := c.Extension("STARTTLS"); ok {
if err := c.StartTLS(&tls.Config{ServerName: host}); err != nil {
return fmt.Errorf("smtp starttls: %w", err)
}
}
if user != "" || pass != "" {
auth := smtp.PlainAuth("", user, pass, host)
if err := c.Auth(auth); err != nil {
return fmt.Errorf("smtp auth: %w", err)
}
}
if err := c.Mail(from); err != nil {
return fmt.Errorf("smtp mail: %w", err)
}
for _, r := range rcpt {
if err := c.Rcpt(r); err != nil {
return fmt.Errorf("smtp rcpt: %w", err)
}
}
w, err := c.Data()
if err != nil {
return fmt.Errorf("smtp data: %w", err)
}
if _, err := w.Write(msg); err != nil {
return fmt.Errorf("smtp write: %w", err)
}
if err := w.Close(); err != nil {
return fmt.Errorf("smtp data close: %w", err)
}
return c.Quit()
} }
type NtfyProvider struct { type NtfyProvider struct {
+117
@@ -1,14 +1,18 @@
package alert package alert
import ( import (
"bufio"
"context" "context"
"encoding/json" "encoding/json"
"errors" "errors"
"fmt"
"net"
"net/http" "net/http"
"net/http/httptest" "net/http/httptest"
"net/url" "net/url"
"strings" "strings"
"testing" "testing"
"time"
"gitea.lerkolabs.com/lerkolabs/uptop/internal/models" "gitea.lerkolabs.com/lerkolabs/uptop/internal/models"
) )
@@ -330,3 +334,116 @@ func TestSanitizeError(t *testing.T) {
t.Error("nil should stay nil") t.Error("nil should stay nil")
} }
} }
func TestEmailProvider_ContextTimeout(t *testing.T) {
// Listener that accepts but never speaks — simulates a blackholed SMTP server.
ln, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatal(err)
}
defer ln.Close()
go func() {
for {
conn, err := ln.Accept()
if err != nil {
return
}
// Hold connection open, never send banner.
go func(c net.Conn) {
time.Sleep(30 * time.Second)
c.Close()
}(conn)
}
}()
_, portStr, _ := net.SplitHostPort(ln.Addr().String())
provider := &EmailProvider{
Host: "127.0.0.1", Port: portStr,
From: "test@test.com", To: "dest@test.com",
}
ctx, cancel := context.WithTimeout(context.Background(), 200*time.Millisecond)
defer cancel()
start := time.Now()
err = provider.Send(ctx, "test", "body")
elapsed := time.Since(start)
if err == nil {
t.Fatal("expected error from stalled SMTP")
}
if elapsed > 2*time.Second {
t.Errorf("Send took %v — context deadline not respected", elapsed)
}
}
func TestSendMailContext_HappyPath(t *testing.T) {
// Minimal fake SMTP server that accepts one message.
ln, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatal(err)
}
defer ln.Close()
received := make(chan string, 1)
go func() {
conn, err := ln.Accept()
if err != nil {
return
}
defer conn.Close()
fmt.Fprintf(conn, "220 localhost ESMTP\r\n")
scanner := bufio.NewScanner(conn)
var dataMode bool
var body strings.Builder
for scanner.Scan() {
line := scanner.Text()
if dataMode {
if line == "." {
dataMode = false
fmt.Fprintf(conn, "250 OK\r\n")
continue
}
body.WriteString(line + "\n")
continue
}
switch {
case strings.HasPrefix(line, "EHLO"):
fmt.Fprintf(conn, "250-localhost\r\n250 OK\r\n")
case strings.HasPrefix(line, "MAIL FROM"):
fmt.Fprintf(conn, "250 OK\r\n")
case strings.HasPrefix(line, "RCPT TO"):
fmt.Fprintf(conn, "250 OK\r\n")
case line == "DATA":
fmt.Fprintf(conn, "354 Go ahead\r\n")
dataMode = true
case line == "QUIT":
fmt.Fprintf(conn, "221 Bye\r\n")
received <- body.String()
return
default:
fmt.Fprintf(conn, "250 OK\r\n")
}
}
}()
_, portStr, _ := net.SplitHostPort(ln.Addr().String())
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
err = sendMailContext(ctx, "127.0.0.1", portStr, "", "", "from@test.com", []string{"to@test.com"}, []byte("Subject: test\r\n\r\nhello"))
if err != nil {
t.Fatalf("sendMailContext: %v", err)
}
select {
case body := <-received:
if !strings.Contains(body, "hello") {
t.Errorf("expected body to contain 'hello', got: %s", body)
}
case <-time.After(5 * time.Second):
t.Fatal("timed out waiting for fake SMTP to receive message")
}
}
+1 -1
@@ -52,7 +52,7 @@ func runFollowerLoop(ctx context.Context, cfg Config, eng *monitor.Engine) {
req, _ := http.NewRequest("GET", cfg.PeerURL+"/api/health", nil) req, _ := http.NewRequest("GET", cfg.PeerURL+"/api/health", nil)
if cfg.SharedKey != "" { if cfg.SharedKey != "" {
req.Header.Set("X-Upkeep-Secret", cfg.SharedKey) req.Header.Set("X-Uptop-Secret", cfg.SharedKey)
} }
resp, err := client.Do(req) resp, err := client.Do(req)
+7 -94
@@ -12,100 +12,13 @@ import (
  "gitea.lerkolabs.com/lerkolabs/uptop/internal/models"
  "gitea.lerkolabs.com/lerkolabs/uptop/internal/monitor"
+ "gitea.lerkolabs.com/lerkolabs/uptop/internal/store/storetest"
  )
- // --- Mock Store (minimal, for monitor.NewEngine) ---
  type mockStore struct {
- sites []models.Site
+ storetest.BaseMock
  }
- func (m *mockStore) Init(_ context.Context) error { return nil }
- func (m *mockStore) GetSites(_ context.Context) ([]models.Site, error) { return m.sites, nil }
- func (m *mockStore) AddSite(_ context.Context, _ models.Site) error { return nil }
- func (m *mockStore) UpdateSite(_ context.Context, _ models.Site) error { return nil }
- func (m *mockStore) UpdateSitePaused(_ context.Context, _ int, _ bool) error { return nil }
- func (m *mockStore) DeleteSite(_ context.Context, _ int) error { return nil }
- func (m *mockStore) GetAllAlerts(_ context.Context) ([]models.AlertConfig, error) { return nil, nil }
- func (m *mockStore) GetAlert(_ context.Context, _ int) (models.AlertConfig, error) {
- return models.AlertConfig{}, nil
- }
- func (m *mockStore) AddAlert(_ context.Context, _ string, _ string, _ map[string]string) error {
- return nil
- }
- func (m *mockStore) UpdateAlert(_ context.Context, _ int, _ string, _ string, _ map[string]string) error {
- return nil
- }
- func (m *mockStore) DeleteAlert(_ context.Context, _ int) error { return nil }
- func (m *mockStore) GetAllUsers(_ context.Context) ([]models.User, error) { return nil, nil }
- func (m *mockStore) AddUser(_ context.Context, _ string, _ string, _ string) error { return nil }
- func (m *mockStore) UpdateUser(_ context.Context, _ int, _ string, _ string, _ string) error {
- return nil
- }
- func (m *mockStore) DeleteUser(_ context.Context, _ int) error { return nil }
- func (m *mockStore) SaveCheck(_ context.Context, _ int, _ int64, _ bool) error { return nil }
- func (m *mockStore) SaveCheckFromNode(_ context.Context, _ int, _ string, _ int64, _ bool) error {
- return nil
- }
- func (m *mockStore) LoadAllHistory(_ context.Context, _ int) (map[int][]models.CheckRecord, error) {
- return nil, nil
- }
- func (m *mockStore) ExportData(_ context.Context) (models.Backup, error) { return models.Backup{}, nil }
- func (m *mockStore) ImportData(_ context.Context, _ models.Backup) error { return nil }
- func (m *mockStore) GetSiteByName(_ context.Context, _ string) (models.Site, error) {
- return models.Site{}, nil
- }
- func (m *mockStore) GetAlertByName(_ context.Context, _ string) (models.AlertConfig, error) {
- return models.AlertConfig{}, nil
- }
- func (m *mockStore) AddSiteReturningID(_ context.Context, _ models.Site) (int, error) { return 0, nil }
- func (m *mockStore) AddAlertReturningID(_ context.Context, _ string, _ string, _ map[string]string) (int, error) {
- return 0, nil
- }
- func (m *mockStore) RegisterNode(_ context.Context, _ models.ProbeNode) error { return nil }
- func (m *mockStore) GetNode(_ context.Context, _ string) (models.ProbeNode, error) {
- return models.ProbeNode{}, nil
- }
- func (m *mockStore) GetAllNodes(_ context.Context) ([]models.ProbeNode, error) { return nil, nil }
- func (m *mockStore) UpdateNodeLastSeen(_ context.Context, _ string) error { return nil }
- func (m *mockStore) DeleteNode(_ context.Context, _ string) error { return nil }
- func (m *mockStore) LoadAlertHealth(_ context.Context) (map[int]models.AlertHealthRecord, error) {
- return nil, nil
- }
- func (m *mockStore) SaveAlertHealth(_ context.Context, _ models.AlertHealthRecord) error { return nil }
- func (m *mockStore) SaveLog(_ context.Context, _ string) error { return nil }
- func (m *mockStore) PruneLogs(_ context.Context) error { return nil }
- func (m *mockStore) PruneCheckHistory(_ context.Context) error { return nil }
- func (m *mockStore) PruneStateChanges(_ context.Context) error { return nil }
- func (m *mockStore) LoadLogs(_ context.Context, _ int) ([]string, error) { return nil, nil }
- func (m *mockStore) GetActiveMaintenanceWindows(_ context.Context) ([]models.MaintenanceWindow, error) {
- return nil, nil
- }
- func (m *mockStore) GetAllMaintenanceWindows(_ context.Context, _ int) ([]models.MaintenanceWindow, error) {
- return nil, nil
- }
- func (m *mockStore) AddMaintenanceWindow(_ context.Context, _ models.MaintenanceWindow) error {
- return nil
- }
- func (m *mockStore) EndMaintenanceWindow(_ context.Context, _ int) error { return nil }
- func (m *mockStore) DeleteMaintenanceWindow(_ context.Context, _ int) error { return nil }
- func (m *mockStore) PruneExpiredMaintenanceWindows(_ context.Context, _ time.Duration) (int64, error) {
- return 0, nil
- }
- func (m *mockStore) IsMonitorInMaintenance(_ context.Context, _ int) (bool, error) { return false, nil }
- func (m *mockStore) GetPreference(_ context.Context, _ string) (string, error) { return "", nil }
- func (m *mockStore) SetPreference(_ context.Context, _ string, _ string) error { return nil }
- func (m *mockStore) SaveStateChange(_ context.Context, _ int, _ string, _ string, _ string) error {
- return nil
- }
- func (m *mockStore) GetStateChanges(_ context.Context, _ int, _ int) ([]models.StateChange, error) {
- return nil, nil
- }
- func (m *mockStore) GetStateChangesSince(_ context.Context, _ int, _ time.Time) ([]models.StateChange, error) {
- return nil, nil
- }
- func (m *mockStore) Close() error { return nil }
  // --- Cluster Start Tests ---
  func TestStart_LeaderMode(t *testing.T) {
@@ -200,7 +113,7 @@ func TestFollowerLoop_SendsSecret(t *testing.T) {
  var receivedSecret string
  srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
  mu.Lock()
- receivedSecret = r.Header.Get("X-Upkeep-Secret")
+ receivedSecret = r.Header.Get("X-Uptop-Secret")
  mu.Unlock()
  w.WriteHeader(200)
  w.Write([]byte("OK"))
@@ -290,7 +203,7 @@ func TestProbeRegister_Failure(t *testing.T) {
  func TestProbeFetchAssignments_Success(t *testing.T) {
  srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
  json.NewEncoder(w).Encode(map[string][]models.Site{
- "sites": {{ID: 1, Name: "s1", Type: "http", URL: "http://example.com"}},
+ "sites": {{SiteConfig: models.SiteConfig{ID: 1, Name: "s1", Type: "http", URL: "http://example.com"}}},
  })
  }))
  defer srv.Close()
@@ -327,8 +240,8 @@ func TestProbeExecuteChecks(t *testing.T) {
  defer srv.Close()
  sites := []models.Site{
- {ID: 1, Type: "http", URL: srv.URL},
- {ID: 2, Type: "http", URL: srv.URL},
+ {SiteConfig: models.SiteConfig{ID: 1, Type: "http", URL: srv.URL}},
+ {SiteConfig: models.SiteConfig{ID: 2, Type: "http", URL: srv.URL}},
  }
  strict := &http.Client{}
@@ -364,7 +277,7 @@ func TestProbeExecuteChecks_Concurrency(t *testing.T) {
  var sites []models.Site
  for i := 0; i < 20; i++ {
- sites = append(sites, models.Site{ID: i + 1, Type: "http", URL: srv.URL})
+ sites = append(sites, models.Site{SiteConfig: models.SiteConfig{ID: i + 1, Type: "http", URL: srv.URL}})
  }
  results := probeExecuteChecks(context.Background(), sites, &http.Client{}, &http.Client{}, true)
+10 -10
@@ -6,7 +6,7 @@ import (
  "crypto/tls"
  "encoding/json"
  "fmt"
- "log"
+ "log/slog"
  "net/http"
  "net/url"
  "sync"
@@ -47,7 +47,7 @@ func RunProbe(ctx context.Context, cfg ProbeConfig) error {
  }
  if err := probeRegister(ctx, apiClient, cfg); err != nil {
- log.Printf("Probe: initial registration failed: %v (will retry)", err)
+ slog.Error("probe initial registration failed", "err", err)
  }
  for {
@@ -59,7 +59,7 @@ func RunProbe(ctx context.Context, cfg ProbeConfig) error {
  sites, err := probeFetchAssignments(ctx, apiClient, cfg)
  if err != nil {
- log.Printf("Probe: failed to fetch assignments: %v", err)
+ slog.Error("probe failed to fetch assignments", "err", err)
  sleepCtx(ctx, 10*time.Second)
  continue
  }
@@ -73,7 +73,7 @@ func RunProbe(ctx context.Context, cfg ProbeConfig) error {
  if len(results) > 0 {
  if err := probeReportResults(ctx, apiClient, cfg, results); err != nil {
- log.Printf("Probe: failed to report results: %v", err)
+ slog.Error("probe failed to report results", "err", err)
  }
  }
@@ -90,7 +90,7 @@ func probeRegister(ctx context.Context, client *http.Client, cfg ProbeConfig) er
  return err
  }
  req.Header.Set("Content-Type", "application/json")
- req.Header.Set("X-Upkeep-Secret", cfg.SharedKey)
+ req.Header.Set("X-Uptop-Secret", cfg.SharedKey)
  resp, err := client.Do(req)
  if err != nil {
  return err
@@ -108,7 +108,7 @@ func probeFetchAssignments(ctx context.Context, client *http.Client, cfg ProbeCo
  if err != nil {
  return nil, err
  }
- req.Header.Set("X-Upkeep-Secret", cfg.SharedKey)
+ req.Header.Set("X-Uptop-Secret", cfg.SharedKey)
  resp, err := client.Do(req)
  if err != nil {
  return nil, err
@@ -152,12 +152,12 @@ loop:
  defer wg.Done()
  defer func() { <-sem }()
- cr := monitor.RunCheck(ctx, s, strict, insecure, false, allowPrivate)
+ cr := monitor.RunCheck(ctx, s.SiteConfig, strict, insecure, false, allowPrivate)
  mu.Lock()
  results = append(results, probeResultItem{
  SiteID: s.ID,
  LatencyNs: cr.LatencyNs,
- IsUp: cr.Status == "UP",
+ IsUp: cr.Status == string(models.StatusUp),
  ErrorReason: cr.ErrorReason,
  })
  mu.Unlock()
@@ -180,7 +180,7 @@ func probeReportResults(ctx context.Context, client *http.Client, cfg ProbeConfi
  return err
  }
  req.Header.Set("Content-Type", "application/json")
- req.Header.Set("X-Upkeep-Secret", cfg.SharedKey)
+ req.Header.Set("X-Uptop-Secret", cfg.SharedKey)
  resp, err := client.Do(req)
  if err != nil {
  return err
@@ -189,7 +189,7 @@ func probeReportResults(ctx context.Context, client *http.Client, cfg ProbeConfi
  if resp.StatusCode != 200 {
  return fmt.Errorf("results returned %d", resp.StatusCode)
  }
- fmt.Printf("Probe: reported %d check results\n", len(results))
+ slog.Info("probe reported check results", "count", len(results))
  return nil
  }
+13 -6
@@ -42,7 +42,7 @@ func Apply(ctx context.Context, s store.Store, f *File, opts ApplyOpts) ([]Chang
  existingAlertsByName[a.Name] = a
  }
- existingSitesByName := make(map[string]models.Site, len(existingSites))
+ existingSitesByName := make(map[string]models.SiteConfig, len(existingSites))
  for _, s := range existingSites {
  existingSitesByName[s.Name] = s
  }
@@ -54,6 +54,7 @@ func Apply(ctx context.Context, s store.Store, f *File, opts ApplyOpts) ([]Chang
  alertMap[ea.Name] = ea.ID
  }
+ nextPlaceholderID := -1
  desiredAlertNames := make(map[string]bool, len(f.Alerts))
  for _, a := range f.Alerts {
  desiredAlertNames[a.Name] = true
@@ -66,6 +67,9 @@ func Apply(ctx context.Context, s store.Store, f *File, opts ApplyOpts) ([]Chang
  return changes, fmt.Errorf("create alert %q: %w", a.Name, err)
  }
  alertMap[a.Name] = id
+ } else {
+ alertMap[a.Name] = nextPlaceholderID
+ nextPlaceholderID--
  }
  } else {
  alertMap[a.Name] = existing.ID
@@ -109,6 +113,9 @@ func Apply(ctx context.Context, s store.Store, f *File, opts ApplyOpts) ([]Chang
  return changes, fmt.Errorf("create group %q: %w", g.Name, err)
  }
  groupMap[g.Name] = id
+ } else {
+ groupMap[g.Name] = nextPlaceholderID
+ nextPlaceholderID--
  }
  } else {
  groupMap[g.Name] = existing.ID
@@ -181,7 +188,7 @@ func Apply(ctx context.Context, s store.Store, f *File, opts ApplyOpts) ([]Chang
  return changes, nil
  }
- func applyMonitor(ctx context.Context, s store.Store, m Monitor, alertMap map[string]int, existing map[string]models.Site, parentID int, dryRun bool) ([]Change, error) {
+ func applyMonitor(ctx context.Context, s store.Store, m Monitor, alertMap map[string]int, existing map[string]models.SiteConfig, parentID int, dryRun bool) ([]Change, error) {
  alertID, err := resolveAlertID(alertMap, m.Alert)
  if err != nil {
  return nil, fmt.Errorf("monitor %q: %w", m.Name, err)
@@ -222,8 +229,8 @@ func resolveAlertID(alertMap map[string]int, name string) (int, error) {
  return id, nil
  }
- func monitorToSite(m Monitor, alertID, parentID int) models.Site {
- s := models.Site{
+ func monitorToSite(m Monitor, alertID, parentID int) models.SiteConfig {
+ s := models.SiteConfig{
  Name: m.Name,
  Type: m.Type,
  URL: m.URL,
@@ -269,7 +276,7 @@ func collectMonitorNames(monitors []Monitor, names map[string]bool) {
  }
  }
- func normalizeSite(s models.Site) models.Site {
+ func normalizeSite(s models.SiteConfig) models.SiteConfig {
  if s.Method == "" {
  s.Method = "GET"
  }
@@ -293,7 +300,7 @@ func diffAlert(existing models.AlertConfig, desired Alert) string {
  return strings.Join(diffs, ", ")
  }
- func diffSite(existing, desired models.Site) string {
+ func diffSite(existing, desired models.SiteConfig) string {
  var diffs []string
  if existing.URL != desired.URL {
  diffs = append(diffs, fmt.Sprintf("url: %s -> %s", existing.URL, desired.URL))
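Reviewer note: the hunks above give not-yet-created alerts and groups a placeholder ID during a dry run, so that later name lookups still succeed. A minimal standalone sketch of that idea follows. `resolveAlertIDs` and its `create` callback are hypothetical names for illustration, not functions from this repository, and the sketch assumes that real database IDs are always positive, so negative placeholders cannot collide with them.

```go
package main

import "fmt"

// resolveAlertIDs is a hypothetical, simplified stand-in for the alert pass.
// existing maps alert names to stored IDs; desired lists the names in the file.
// In dry-run mode nothing is created, so unknown names receive negative
// placeholder IDs (-1, -2, ...) in the order they appear in desired.
func resolveAlertIDs(existing map[string]int, desired []string, dryRun bool, create func(string) int) map[string]int {
	ids := make(map[string]int, len(existing)+len(desired))
	for name, id := range existing {
		ids[name] = id
	}
	next := -1
	for _, name := range desired {
		if _, ok := ids[name]; ok {
			continue // already known, keep the real ID
		}
		if dryRun {
			ids[name] = next
			next--
			continue
		}
		ids[name] = create(name) // real run: persist and use the returned ID
	}
	return ids
}

func main() {
	existing := map[string]int{"Email": 7}
	ids := resolveAlertIDs(existing, []string{"Email", "Discord", "Slack"}, true, nil)
	fmt.Println(ids["Email"], ids["Discord"], ids["Slack"]) // prints "7 -1 -2"
}
```

Any code that later consumes these IDs in dry-run mode should treat negative values as "would be created" rather than as database keys.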
+71 -3
@@ -114,8 +114,8 @@ func TestApplyUpdate(t *testing.T) {
  func TestApplyPrune(t *testing.T) {
  s := newTestStore(t)
- s.AddSite(context.Background(), models.Site{Name: "Keep", URL: "https://keep.com", Type: "http", Interval: 30, ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
- s.AddSite(context.Background(), models.Site{Name: "Remove", URL: "https://remove.com", Type: "http", Interval: 30, ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
+ s.AddSite(context.Background(), models.SiteConfig{Name: "Keep", URL: "https://keep.com", Type: "http", Interval: 30, ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
+ s.AddSite(context.Background(), models.SiteConfig{Name: "Remove", URL: "https://remove.com", Type: "http", Interval: 30, ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
  f := &File{
  Monitors: []Monitor{
@@ -191,7 +191,7 @@ func TestApplyGroupHierarchy(t *testing.T) {
  }
  sites, _ := s.GetSites(context.Background())
- var group models.Site
+ var group models.SiteConfig
  for _, s := range sites {
  if s.Type == "group" {
  group = s
@@ -266,6 +266,74 @@ func TestApplyDuplicateNames(t *testing.T) {
  }
  }
+ func TestApplyDryRunNewAlertAndMonitor(t *testing.T) {
+ s := newTestStore(t)
+ f := &File{
+ Alerts: []Alert{
+ {Name: "Discord", Type: "discord", Settings: map[string]string{"url": "https://example.com"}},
+ },
+ Monitors: []Monitor{
+ {Name: "Web", Type: "http", URL: "https://example.com", Interval: 30, Alert: "Discord"},
+ },
+ }
+ changes, err := Apply(context.Background(), s, f, ApplyOpts{DryRun: true})
+ if err != nil {
+ t.Fatalf("dry-run with new alert+monitor should not error: %v", err)
+ }
+ creates := 0
+ for _, c := range changes {
+ if c.Action == "create" {
+ creates++
+ }
+ }
+ if creates != 2 {
+ t.Fatalf("expected 2 creates (alert+monitor), got %d: %+v", creates, changes)
+ }
+ sites, _ := s.GetSites(context.Background())
+ alerts, _ := s.GetAllAlerts(context.Background())
+ if len(sites) != 0 {
+ t.Fatalf("dry-run should not persist sites, got %d", len(sites))
+ }
+ if len(alerts) != 0 {
+ t.Fatalf("dry-run should not persist alerts, got %d", len(alerts))
+ }
+ }
+ func TestApplyDryRunNewGroupWithChildren(t *testing.T) {
+ s := newTestStore(t)
+ f := &File{
+ Alerts: []Alert{
+ {Name: "Slack", Type: "slack", Settings: map[string]string{"url": "https://hooks.example.com"}},
+ },
+ Monitors: []Monitor{
+ {
+ Name: "Prod", Type: "group", Alert: "Slack",
+ Monitors: []Monitor{
+ {Name: "API", Type: "http", URL: "https://api.example.com", Interval: 15, Alert: "Slack"},
+ },
+ },
+ },
+ }
+ changes, err := Apply(context.Background(), s, f, ApplyOpts{DryRun: true})
+ if err != nil {
+ t.Fatalf("dry-run with new group+alert should not error: %v", err)
+ }
+ creates := 0
+ for _, c := range changes {
+ if c.Action == "create" {
+ creates++
+ }
+ }
+ if creates != 3 {
+ t.Fatalf("expected 3 creates (alert+group+child), got %d: %+v", creates, changes)
+ }
+ }
  func TestApplyExistingAlertReference(t *testing.T) {
  s := newTestStore(t)
  s.AddAlert(context.Background(), "Existing", "webhook", map[string]string{"url": "https://example.com"})
+4 -4
@@ -34,9 +34,9 @@ func Export(ctx context.Context, s store.Store) (*File, error) {
  })
  }
- groups := make(map[int]models.Site)
- children := make(map[int][]models.Site)
- var topLevel []models.Site
+ groups := make(map[int]models.SiteConfig)
+ children := make(map[int][]models.SiteConfig)
+ var topLevel []models.SiteConfig
  for _, s := range dbSites {
  switch {
@@ -76,7 +76,7 @@ func Export(ctx context.Context, s store.Store) (*File, error) {
  return &File{Alerts: yamlAlerts, Monitors: yamlMonitors}, nil
  }
- func siteToMonitor(s models.Site, alertIDToName map[int]string) Monitor {
+ func siteToMonitor(s models.SiteConfig, alertIDToName map[int]string) Monitor {
  m := Monitor{
  Name: s.Name,
  Type: s.Type,
+7 -7
@@ -22,7 +22,7 @@ func TestExportAlertNames(t *testing.T) {
  s := newTestStore(t)
  s.AddAlert(context.Background(), "Discord", "discord", map[string]string{"url": "https://example.com"})
  alerts, _ := s.GetAllAlerts(context.Background())
- s.AddSite(context.Background(), models.Site{Name: "Web", URL: "https://example.com", Type: "http", Interval: 30, AlertID: alerts[0].ID, ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
+ s.AddSite(context.Background(), models.SiteConfig{Name: "Web", URL: "https://example.com", Type: "http", Interval: 30, AlertID: alerts[0].ID, ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
  f, err := Export(context.Background(), s)
  if err != nil {
@@ -39,9 +39,9 @@ func TestExportAlertNames(t *testing.T) {
  func TestExportGroupHierarchy(t *testing.T) {
  s := newTestStore(t)
- groupID, _ := s.AddSiteReturningID(context.Background(), models.Site{Name: "Prod", Type: "group", ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
- s.AddSite(context.Background(), models.Site{Name: "Prod Web", URL: "https://prod.example.com", Type: "http", Interval: 15, ParentID: groupID, ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
- s.AddSite(context.Background(), models.Site{Name: "Top Level", URL: "https://example.com", Type: "http", Interval: 30, ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
+ groupID, _ := s.AddSiteReturningID(context.Background(), models.SiteConfig{Name: "Prod", Type: "group", ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
+ s.AddSite(context.Background(), models.SiteConfig{Name: "Prod Web", URL: "https://prod.example.com", Type: "http", Interval: 15, ParentID: groupID, ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
+ s.AddSite(context.Background(), models.SiteConfig{Name: "Top Level", URL: "https://example.com", Type: "http", Interval: 30, ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
  f, err := Export(context.Background(), s)
  if err != nil {
@@ -72,7 +72,7 @@ func TestExportGroupHierarchy(t *testing.T) {
  func TestExportOmitsDefaults(t *testing.T) {
  s := newTestStore(t)
- s.AddSite(context.Background(), models.Site{
+ s.AddSite(context.Background(), models.SiteConfig{
  Name: "Web", URL: "https://example.com", Type: "http", Interval: 30,
  Method: "GET", AcceptedCodes: "200-299", ExpiryThreshold: 7,
  })
@@ -98,8 +98,8 @@ func TestExportRoundTrip(t *testing.T) {
  s1 := newTestStore(t)
  s1.AddAlert(context.Background(), "Discord", "discord", map[string]string{"url": "https://example.com"})
  alerts, _ := s1.GetAllAlerts(context.Background())
- s1.AddSite(context.Background(), models.Site{Name: "Web", URL: "https://example.com", Type: "http", Interval: 30, AlertID: alerts[0].ID, ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
- s1.AddSite(context.Background(), models.Site{Name: "Ping", Type: "ping", Hostname: "10.0.0.1", Interval: 60, ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
+ s1.AddSite(context.Background(), models.SiteConfig{Name: "Web", URL: "https://example.com", Type: "http", Interval: 30, AlertID: alerts[0].ID, ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
+ s1.AddSite(context.Background(), models.SiteConfig{Name: "Ping", Type: "ping", Hostname: "10.0.0.1", Interval: 60, ExpiryThreshold: 7, Method: "GET", AcceptedCodes: "200-299"})
  exported, err := Export(context.Background(), s1)
  if err != nil {
+15 -4
@@ -1,11 +1,14 @@
  package importer
  import (
+ "crypto/rand"
+ "encoding/hex"
  "encoding/json"
  "fmt"
- "gitea.lerkolabs.com/lerkolabs/uptop/internal/models"
  "os"
  "strings"
+ "gitea.lerkolabs.com/lerkolabs/uptop/internal/models"
  )
  type KumaBackup struct {
@@ -80,7 +83,7 @@ func ConvertKuma(kb *KumaBackup) models.Backup {
  }
  }
- var sites []models.Site
+ var sites []models.SiteConfig
  for _, m := range kb.MonitorList {
  site := convertKumaMonitor(m, kumaToUpkeepAlert)
  sites = append(sites, site)
@@ -132,8 +135,8 @@ func convertKumaNotifications(entries []KumaNotifEntry) map[int]models.AlertConf
  return result
  }
- func convertKumaMonitor(m KumaMonitor, alertMap map[int]int) models.Site {
- site := models.Site{
+ func convertKumaMonitor(m KumaMonitor, alertMap map[int]int) models.SiteConfig {
+ site := models.SiteConfig{
  ID: m.ID,
  Name: m.Name,
  Description: m.Description,
@@ -155,10 +158,18 @@ func convertKumaMonitor(m KumaMonitor, alertMap map[int]int) models.Site {
  site.DNSResolveType = m.DNSResolveType
  site.DNSServer = m.DNSResolveServer
+ site.Paused = !m.Active
  switch m.Type {
  case "http":
  site.URL = m.URL
  site.CheckSSL = m.ExpiryNotif
+ case "push":
+ site.Type = "push"
+ b := make([]byte, 16)
+ if _, err := rand.Read(b); err == nil {
+ site.Token = hex.EncodeToString(b)
+ }
  case "ping":
  if m.Hostname != "" {
  site.Hostname = m.Hostname
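Reviewer note: the new `push` case derives a token from 16 random bytes, which hex-encodes to 32 characters. The sketch below isolates that step. `newPushToken` is a hypothetical helper name, not part of this change set, and unlike the hunk above it returns the `rand.Read` error instead of leaving the token empty. That is a design choice for illustration only.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newPushToken is a hypothetical helper: it reads n bytes from the
// cryptographically secure source and returns them hex-encoded, so the
// resulting string has 2*n characters.
func newPushToken(n int) (string, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

func main() {
	tok, err := newPushToken(16)
	if err != nil {
		fmt.Println("token generation failed:", err)
		return
	}
	fmt.Println(len(tok)) // prints "32"
}
```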
+210
@@ -0,0 +1,210 @@
package importer
import (
"os"
"path/filepath"
"strings"
"testing"
)
func writeTemp(t *testing.T, content string) string {
t.Helper()
path := filepath.Join(t.TempDir(), "backup.json")
if err := os.WriteFile(path, []byte(content), 0o600); err != nil {
t.Fatal(err)
}
return path
}
func TestLoadKumaFileMissingFile(t *testing.T) {
_, err := LoadKumaFile(filepath.Join(t.TempDir(), "nope.json"))
if err == nil {
t.Fatal("expected error for missing file")
}
}
func TestLoadKumaFileMalformedInput(t *testing.T) {
cases := []struct {
name string
body string
}{
{"empty file", ""},
{"truncated JSON", `{"version": "1.23", "monitorList": [`},
{"not JSON", "definitely not json"},
{"wrong root type", `[1, 2, 3]`},
{"monitorList wrong type", `{"monitorList": {"a": 1}}`},
{"monitor field wrong type", `{"monitorList": [{"id": "not-an-int"}]}`},
{"notificationList wrong type", `{"notificationList": "oops"}`},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
_, err := LoadKumaFile(writeTemp(t, tc.body))
if err == nil {
t.Fatalf("expected parse error for %s", tc.name)
}
if !strings.Contains(err.Error(), "parse JSON") {
t.Fatalf("expected wrapped parse error, got: %v", err)
}
})
}
}
func TestLoadKumaFileNullLists(t *testing.T) {
kb, err := LoadKumaFile(writeTemp(t, `{"version": "1.23", "monitorList": null, "notificationList": null}`))
if err != nil {
t.Fatal(err)
}
backup := ConvertKuma(kb)
if len(backup.Sites) != 0 || len(backup.Alerts) != 0 {
t.Fatalf("expected empty backup, got %d sites %d alerts", len(backup.Sites), len(backup.Alerts))
}
}
func TestConvertKumaSkipsMalformedNotificationConfig(t *testing.T) {
kb := &KumaBackup{
NotificationList: []KumaNotifEntry{
{ID: 1, Name: "broken", Config: "{not json"},
{ID: 2, Name: "good", Config: `{"type": "discord", "ntfyserverurl": "https://example.com/hook"}`},
},
MonitorList: []KumaMonitor{
{ID: 10, Name: "site", Type: "http", URL: "https://example.com", NotificationIDs: map[string]bool{"1": true}},
},
}
backup := ConvertKuma(kb)
if len(backup.Alerts) != 1 {
t.Fatalf("expected broken notification skipped, got %d alerts", len(backup.Alerts))
}
if backup.Alerts[0].Type != "discord" {
t.Fatalf("expected discord alert, got %q", backup.Alerts[0].Type)
}
if backup.Sites[0].AlertID != 0 {
t.Fatalf("site referencing skipped notification should keep AlertID 0, got %d", backup.Sites[0].AlertID)
}
}
func TestConvertKumaNtfyNotification(t *testing.T) {
kb := &KumaBackup{
NotificationList: []KumaNotifEntry{
{ID: 3, Name: "ntfy", Config: `{
"type": "ntfy",
"ntfyserverurl": "https://ntfy.example.com/",
"ntfytopic": "uptime",
"ntfyPriority": 4,
"ntfyAuthenticationMethod": "usernamePassword",
"ntfyusername": "u",
"ntfypassword": "p"
}`},
},
}
backup := ConvertKuma(kb)
if len(backup.Alerts) != 1 {
t.Fatalf("expected 1 alert, got %d", len(backup.Alerts))
}
a := backup.Alerts[0]
if a.Type != "ntfy" {
t.Fatalf("expected ntfy, got %q", a.Type)
}
if a.Settings["url"] != "https://ntfy.example.com" {
t.Fatalf("expected trailing slash trimmed, got %q", a.Settings["url"])
}
if a.Settings["topic"] != "uptime" || a.Settings["priority"] != "4" {
t.Fatalf("unexpected settings: %v", a.Settings)
}
if a.Settings["username"] != "u" || a.Settings["password"] != "p" {
t.Fatalf("expected credentials mapped, got %v", a.Settings)
}
}
func TestConvertKumaUnknownNotificationFallsBackToWebhook(t *testing.T) {
kb := &KumaBackup{
NotificationList: []KumaNotifEntry{
{ID: 4, Name: "matrix", Config: `{"type": "matrix", "ntfyserverurl": "https://example.com/hook"}`},
},
}
backup := ConvertKuma(kb)
if len(backup.Alerts) != 1 || backup.Alerts[0].Type != "webhook" {
t.Fatalf("expected webhook fallback, got %+v", backup.Alerts)
}
}
func TestConvertKumaHTTPMonitor(t *testing.T) {
kb := &KumaBackup{
NotificationList: []KumaNotifEntry{
{ID: 1, Name: "hook", Config: `{"type": "slack", "ntfyserverurl": "https://example.com/hook"}`},
},
MonitorList: []KumaMonitor{{
ID: 7,
Name: "web",
Type: "http",
URL: "https://example.com",
Interval: 60,
Timeout: 30,
MaxRetries: 2,
Method: "GET",
AcceptedCodes: []string{"200", "301"},
IgnoreTLS: true,
ExpiryNotif: true,
Active: false,
NotificationIDs: map[string]bool{"1": true},
}},
}
backup := ConvertKuma(kb)
if len(backup.Sites) != 1 {
t.Fatalf("expected 1 site, got %d", len(backup.Sites))
}
s := backup.Sites[0]
if s.URL != "https://example.com" || !s.CheckSSL || !s.IgnoreTLS {
t.Fatalf("http fields not mapped: %+v", s)
}
if !s.Paused {
t.Fatal("inactive monitor should import paused")
}
if s.AcceptedCodes != "200,301" {
t.Fatalf("expected joined accepted codes, got %q", s.AcceptedCodes)
}
if s.AlertID != 1 {
t.Fatalf("expected alert mapped, got %d", s.AlertID)
}
}
func TestConvertKumaPushMonitorGetsToken(t *testing.T) {
kb := &KumaBackup{
MonitorList: []KumaMonitor{{ID: 1, Name: "push", Type: "push", Active: true}},
}
backup := ConvertKuma(kb)
token := backup.Sites[0].Token
if len(token) != 32 {
t.Fatalf("expected 32-char hex token, got %q", token)
}
}
func TestConvertKumaNonNumericNotificationID(t *testing.T) {
kb := &KumaBackup{
MonitorList: []KumaMonitor{{
ID: 1,
Name: "site",
Type: "http",
NotificationIDs: map[string]bool{"abc": true},
}},
}
backup := ConvertKuma(kb)
if backup.Sites[0].AlertID != 0 {
t.Fatalf("non-numeric notification ID should not map, got %d", backup.Sites[0].AlertID)
}
}
func TestConvertKumaGroupAndChildren(t *testing.T) {
kb := &KumaBackup{
MonitorList: []KumaMonitor{
{ID: 1, Name: "grp", Type: "group", Active: true},
{ID: 2, Name: "ping", Type: "ping", Hostname: "10.0.0.1", Parent: 1, Active: true},
},
}
backup := ConvertKuma(kb)
if backup.Sites[0].Type != "group" {
t.Fatalf("expected group type, got %q", backup.Sites[0].Type)
}
if backup.Sites[1].ParentID != 1 || backup.Sites[1].Hostname != "10.0.0.1" {
t.Fatalf("child not mapped: %+v", backup.Sites[1])
}
}
+4 -3
@@ -2,11 +2,12 @@ package metrics
  import (
  "fmt"
- "gitea.lerkolabs.com/lerkolabs/uptop/internal/models"
- "gitea.lerkolabs.com/lerkolabs/uptop/internal/monitor"
  "net/http"
  "sort"
  "strings"
+ "gitea.lerkolabs.com/lerkolabs/uptop/internal/models"
+ "gitea.lerkolabs.com/lerkolabs/uptop/internal/monitor"
  )
  func Handler(eng *monitor.Engine) http.HandlerFunc {
@@ -19,7 +20,7 @@ func Handler(eng *monitor.Engine) http.HandlerFunc {
  writeHelp(&b, "uptop_monitor_up", "gauge", "Whether the monitor is up (1) or down (0).")
  for _, s := range sites {
  val := 0
- if s.Status == "UP" {
+ if s.Status == models.StatusUp {
  val = 1
  }
  writeGauge(&b, "uptop_monitor_up", labels(s), float64(val))
+6 -86
@@ -10,101 +10,21 @@ import (
  "gitea.lerkolabs.com/lerkolabs/uptop/internal/models"
  "gitea.lerkolabs.com/lerkolabs/uptop/internal/monitor"
+ "gitea.lerkolabs.com/lerkolabs/uptop/internal/store/storetest"
  )
  type mockStore struct {
- sites []models.Site
+ storetest.BaseMock
+ sites []models.SiteConfig
  }
- func (m *mockStore) Init(_ context.Context) error { return nil }
- func (m *mockStore) GetSites(_ context.Context) ([]models.Site, error) { return m.sites, nil }
- func (m *mockStore) AddSite(_ context.Context, _ models.Site) error { return nil }
- func (m *mockStore) UpdateSite(_ context.Context, _ models.Site) error { return nil }
- func (m *mockStore) UpdateSitePaused(_ context.Context, _ int, _ bool) error { return nil }
- func (m *mockStore) DeleteSite(_ context.Context, _ int) error { return nil }
- func (m *mockStore) GetAllAlerts(_ context.Context) ([]models.AlertConfig, error) { return nil, nil }
- func (m *mockStore) GetAlert(_ context.Context, _ int) (models.AlertConfig, error) {
- return models.AlertConfig{}, nil
+ func (m *mockStore) GetSites(_ context.Context) ([]models.SiteConfig, error) {
+ return m.sites, nil
  }
- func (m *mockStore) AddAlert(_ context.Context, _ string, _ string, _ map[string]string) error {
- return nil
- }
- func (m *mockStore) UpdateAlert(_ context.Context, _ int, _ string, _ string, _ map[string]string) error {
- return nil
- }
- func (m *mockStore) DeleteAlert(_ context.Context, _ int) error { return nil }
- func (m *mockStore) GetAllUsers(_ context.Context) ([]models.User, error) { return nil, nil }
- func (m *mockStore) AddUser(_ context.Context, _ string, _ string, _ string) error { return nil }
- func (m *mockStore) UpdateUser(_ context.Context, _ int, _ string, _ string, _ string) error {
- return nil
- }
- func (m *mockStore) DeleteUser(_ context.Context, _ int) error { return nil }
- func (m *mockStore) SaveCheck(_ context.Context, _ int, _ int64, _ bool) error { return nil }
- func (m *mockStore) LoadAllHistory(_ context.Context, _ int) (map[int][]models.CheckRecord, error) {
- return nil, nil
- }
- func (m *mockStore) ExportData(_ context.Context) (models.Backup, error) { return models.Backup{}, nil }
- func (m *mockStore) ImportData(_ context.Context, _ models.Backup) error { return nil }
func (m *mockStore) GetSiteByName(_ context.Context, _ string) (models.Site, error) {
return models.Site{}, nil
}
func (m *mockStore) GetAlertByName(_ context.Context, _ string) (models.AlertConfig, error) {
return models.AlertConfig{}, nil
}
func (m *mockStore) AddSiteReturningID(_ context.Context, _ models.Site) (int, error) { return 0, nil }
func (m *mockStore) AddAlertReturningID(_ context.Context, _ string, _ string, _ map[string]string) (int, error) {
return 0, nil
}
func (m *mockStore) SaveCheckFromNode(_ context.Context, _ int, _ string, _ int64, _ bool) error {
return nil
}
func (m *mockStore) RegisterNode(_ context.Context, _ models.ProbeNode) error { return nil }
func (m *mockStore) GetNode(_ context.Context, _ string) (models.ProbeNode, error) {
return models.ProbeNode{}, nil
}
func (m *mockStore) GetAllNodes(_ context.Context) ([]models.ProbeNode, error) { return nil, nil }
func (m *mockStore) UpdateNodeLastSeen(_ context.Context, _ string) error { return nil }
func (m *mockStore) DeleteNode(_ context.Context, _ string) error { return nil }
func (m *mockStore) LoadAlertHealth(_ context.Context) (map[int]models.AlertHealthRecord, error) {
return nil, nil
}
func (m *mockStore) SaveAlertHealth(_ context.Context, _ models.AlertHealthRecord) error { return nil }
func (m *mockStore) SaveLog(_ context.Context, _ string) error { return nil }
func (m *mockStore) PruneLogs(_ context.Context) error { return nil }
func (m *mockStore) PruneCheckHistory(_ context.Context) error { return nil }
func (m *mockStore) PruneStateChanges(_ context.Context) error { return nil }
func (m *mockStore) LoadLogs(_ context.Context, _ int) ([]string, error) { return nil, nil }
func (m *mockStore) GetActiveMaintenanceWindows(_ context.Context) ([]models.MaintenanceWindow, error) {
return nil, nil
}
func (m *mockStore) GetAllMaintenanceWindows(_ context.Context, _ int) ([]models.MaintenanceWindow, error) {
return nil, nil
}
func (m *mockStore) AddMaintenanceWindow(_ context.Context, _ models.MaintenanceWindow) error {
return nil
}
func (m *mockStore) EndMaintenanceWindow(_ context.Context, _ int) error { return nil }
func (m *mockStore) DeleteMaintenanceWindow(_ context.Context, _ int) error { return nil }
func (m *mockStore) PruneExpiredMaintenanceWindows(_ context.Context, _ time.Duration) (int64, error) {
return 0, nil
}
func (m *mockStore) IsMonitorInMaintenance(_ context.Context, _ int) (bool, error) { return false, nil }
func (m *mockStore) GetPreference(_ context.Context, _ string) (string, error) { return "", nil }
func (m *mockStore) SetPreference(_ context.Context, _ string, _ string) error { return nil }
func (m *mockStore) SaveStateChange(_ context.Context, _ int, _ string, _ string, _ string) error {
return nil
}
func (m *mockStore) GetStateChanges(_ context.Context, _ int, _ int) ([]models.StateChange, error) {
return nil, nil
}
func (m *mockStore) GetStateChangesSince(_ context.Context, _ int, _ time.Time) ([]models.StateChange, error) {
return nil, nil
}
func (m *mockStore) Close() error { return nil }
func TestMetricsHandler(t *testing.T) { func TestMetricsHandler(t *testing.T) {
ms := &mockStore{ ms := &mockStore{
sites: []models.Site{ sites: []models.SiteConfig{
{ID: 1, Name: "Example", URL: "https://example.com", Type: "http", Interval: 30}, {ID: 1, Name: "Example", URL: "https://example.com", Type: "http", Interval: 30},
{ID: 2, Name: "DNS Check", Type: "dns", Interval: 60}, {ID: 2, Name: "DNS Check", Type: "dns", Interval: 60},
}, },
+10 -3
@@ -2,7 +2,7 @@ package models
 import "time"
-type Site struct {
+type SiteConfig struct {
 	ID int
 	Name string
 	URL string
@@ -26,9 +26,11 @@ type Site struct {
 	IgnoreTLS bool
 	Paused bool
 	Regions string
+}
+type SiteState struct {
 	FailureCount int
-	Status string
+	Status Status
 	StatusCode int
 	Latency time.Duration
 	CertExpiry time.Time
@@ -40,6 +42,11 @@ type Site struct {
 	LastSuccessAt time.Time
 }
+type Site struct {
+	SiteConfig
+	SiteState
+}
 type StateChange struct {
 	ID int
 	SiteID int
@@ -103,7 +110,7 @@ type MaintenanceWindow struct {
 }
 type Backup struct {
-	Sites []Site `json:"sites"`
+	Sites []SiteConfig `json:"sites"`
 	Alerts []AlertConfig `json:"alerts"`
 	Users []User `json:"users"`
 	MaintenanceWindows []MaintenanceWindow `json:"maintenance_windows,omitempty"`
+18
@@ -0,0 +1,18 @@
package models
type Status string
const (
StatusUp Status = "UP"
StatusDown Status = "DOWN"
StatusPending Status = "PENDING"
StatusLate Status = "LATE"
StatusStale Status = "STALE"
StatusSSLExp Status = "SSL EXP"
)
func (s Status) IsBroken() bool {
return s == StatusDown || s == StatusSSLExp
}
func (s Status) String() string { return string(s) }
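Note on the Status type: a minimal sketch of how callers might use it. The type, three of its constants, and IsBroken are copied from the new file; countBroken and main are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// Status mirrors the string-backed type added in this diff.
type Status string

const (
	StatusUp     Status = "UP"
	StatusDown   Status = "DOWN"
	StatusSSLExp Status = "SSL EXP"
)

// IsBroken reports whether the status counts as a failure.
func (s Status) IsBroken() bool {
	return s == StatusDown || s == StatusSSLExp
}

// countBroken (hypothetical) counts the failing entries in a slice of statuses.
func countBroken(statuses []Status) int {
	n := 0
	for _, s := range statuses {
		if s.IsBroken() {
			n++
		}
	}
	return n
}

func main() {
	statuses := []Status{StatusUp, StatusDown, StatusSSLExp}
	fmt.Println(countBroken(statuses))

	// An untyped string constant converts implicitly, so this still compiles.
	fmt.Println(StatusUp == "UP")

	// A plain string variable needs an explicit conversion.
	raw := "DOWN"
	fmt.Println(Status(raw).IsBroken())
}
```

The explicit conversion is the reason the check functions in this diff write string(models.StatusDown) when filling the string-typed CheckResult.Status field.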
+65 -35
@@ -3,6 +3,7 @@ package monitor
import ( import (
"context" "context"
"fmt" "fmt"
"io"
"net" "net"
"net/http" "net/http"
"strconv" "strconv"
@@ -35,22 +36,27 @@ type CheckResult struct {
ErrorReason string ErrorReason string
} }
func RunCheck(ctx context.Context, site models.Site, strict, insecure *http.Client, globalInsecure bool, allowPrivate ...bool) CheckResult { func RunCheck(ctx context.Context, site models.SiteConfig, strict, insecure *http.Client, globalInsecure, allowPrivate bool) CheckResult {
private := len(allowPrivate) > 0 && allowPrivate[0] // Resolve + validate once for non-HTTP types to prevent DNS-rebind TOCTOU:
// a second resolve in the check function could return a different (private) IP.
if site.Type != "http" && site.Type != "dns" && !private { // HTTP is safe — SafeDialContext resolves and validates at dial time.
var pinnedIP net.IP
if site.Type != "http" && site.Type != "dns" && !allowPrivate {
host := site.Hostname host := site.Hostname
if host == "" { if host == "" {
host = site.URL host = site.URL
} }
if host != "" { if host != "" {
if ips, err := net.LookupIP(host); err == nil { ips, err := net.LookupIP(host)
for _, ip := range ips { if err != nil {
if isPrivateIP(ip) { return CheckResult{SiteID: site.ID, Status: string(models.StatusDown), ErrorReason: "resolve failed: " + err.Error()}
return CheckResult{SiteID: site.ID, Status: "DOWN", ErrorReason: "target resolves to private IP"} }
} for _, ip := range ips {
if isPrivateIP(ip) {
return CheckResult{SiteID: site.ID, Status: string(models.StatusDown), ErrorReason: "target resolves to private IP"}
} }
} }
pinnedIP = ips[0]
} }
} }
@@ -58,17 +64,17 @@ func RunCheck(ctx context.Context, site models.Site, strict, insecure *http.Clie
case "http": case "http":
return runHTTPCheck(ctx, site, strict, insecure, globalInsecure) return runHTTPCheck(ctx, site, strict, insecure, globalInsecure)
case "ping": case "ping":
return runPingCheck(ctx, site) return runPingCheck(ctx, site, pinnedIP)
case "port": case "port":
return runPortCheck(ctx, site) return runPortCheck(ctx, site, pinnedIP)
case "dns": case "dns":
return runDNSCheck(ctx, site) return runDNSCheck(ctx, site, allowPrivate)
default: default:
return CheckResult{SiteID: site.ID, Status: "DOWN", ErrorReason: "unsupported monitor type: " + site.Type} return CheckResult{SiteID: site.ID, Status: string(models.StatusDown), ErrorReason: "unsupported monitor type: " + site.Type}
} }
} }
func runHTTPCheck(ctx context.Context, site models.Site, strict, insecure *http.Client, globalInsecure bool) CheckResult { func runHTTPCheck(ctx context.Context, site models.SiteConfig, strict, insecure *http.Client, globalInsecure bool) CheckResult {
method := site.Method method := site.Method
if method == "" { if method == "" {
method = "GET" method = "GET"
@@ -80,7 +86,7 @@ func runHTTPCheck(ctx context.Context, site models.Site, strict, insecure *http.
req, err := http.NewRequestWithContext(ctx, method, site.URL, nil) req, err := http.NewRequestWithContext(ctx, method, site.URL, nil)
if err != nil { if err != nil {
return CheckResult{SiteID: site.ID, Status: "DOWN", ErrorReason: "invalid request: " + err.Error()} return CheckResult{SiteID: site.ID, Status: string(models.StatusDown), ErrorReason: "invalid request: " + err.Error()}
} }
client := strict client := strict
@@ -94,20 +100,23 @@ func runHTTPCheck(ctx context.Context, site models.Site, strict, insecure *http.
result := CheckResult{ result := CheckResult{
SiteID: site.ID, SiteID: site.ID,
Status: "UP", Status: string(models.StatusUp),
LatencyNs: latency.Nanoseconds(), LatencyNs: latency.Nanoseconds(),
} }
if err != nil { if err != nil {
result.Status = "DOWN" result.Status = string(models.StatusDown)
result.ErrorReason = truncateError(err.Error(), maxErrorLength) result.ErrorReason = truncateError(err.Error(), maxErrorLength)
return result return result
} }
defer resp.Body.Close() defer func() {
_, _ = io.Copy(io.Discard, resp.Body)
_ = resp.Body.Close()
}()
result.StatusCode = resp.StatusCode result.StatusCode = resp.StatusCode
if !isCodeAccepted(resp.StatusCode, site.AcceptedCodes) { if !isCodeAccepted(resp.StatusCode, site.AcceptedCodes) {
result.Status = "DOWN" result.Status = string(models.StatusDown)
expected := site.AcceptedCodes expected := site.AcceptedCodes
if expected == "" { if expected == "" {
expected = defaultAcceptedCodes expected = defaultAcceptedCodes
@@ -120,7 +129,7 @@ func runHTTPCheck(ctx context.Context, site models.Site, strict, insecure *http.
cert := resp.TLS.PeerCertificates[0] cert := resp.TLS.PeerCertificates[0]
result.CertExpiry = cert.NotAfter result.CertExpiry = cert.NotAfter
if time.Now().After(cert.NotAfter) { if time.Now().After(cert.NotAfter) {
result.Status = "SSL EXP" result.Status = string(models.StatusSSLExp)
result.ErrorReason = "SSL certificate expired" result.ErrorReason = "SSL certificate expired"
} }
} }
@@ -128,7 +137,7 @@ func runHTTPCheck(ctx context.Context, site models.Site, strict, insecure *http.
return result return result
} }
func runPingCheck(_ context.Context, site models.Site) CheckResult { func runPingCheck(_ context.Context, site models.SiteConfig, pinnedIP net.IP) CheckResult {
host := site.Hostname host := site.Hostname
if host == "" { if host == "" {
host = site.URL host = site.URL
@@ -136,7 +145,10 @@ func runPingCheck(_ context.Context, site models.Site) CheckResult {
pinger, err := probing.NewPinger(host) pinger, err := probing.NewPinger(host)
if err != nil { if err != nil {
return CheckResult{SiteID: site.ID, Status: "DOWN", ErrorReason: "ping setup: " + err.Error()} return CheckResult{SiteID: site.ID, Status: string(models.StatusDown), ErrorReason: "ping setup: " + err.Error()}
}
if pinnedIP != nil {
pinger.SetIPAddr(&net.IPAddr{IP: pinnedIP})
} }
pinger.Count = 1 pinger.Count = 1
pinger.Timeout = siteTimeout(site) pinger.Timeout = siteTimeout(site)
@@ -147,21 +159,24 @@ func runPingCheck(_ context.Context, site models.Site) CheckResult {
latency := time.Since(start) latency := time.Since(start)
if err != nil { if err != nil {
return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds(), ErrorReason: "ping failed: " + err.Error()} return CheckResult{SiteID: site.ID, Status: string(models.StatusDown), LatencyNs: latency.Nanoseconds(), ErrorReason: "ping failed: " + err.Error()}
} }
if pinger.Statistics().PacketsRecv == 0 { if pinger.Statistics().PacketsRecv == 0 {
return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds(), ErrorReason: "no ICMP response"} return CheckResult{SiteID: site.ID, Status: string(models.StatusDown), LatencyNs: latency.Nanoseconds(), ErrorReason: "no ICMP response"}
} }
stats := pinger.Statistics() stats := pinger.Statistics()
return CheckResult{SiteID: site.ID, Status: "UP", LatencyNs: stats.AvgRtt.Nanoseconds()} return CheckResult{SiteID: site.ID, Status: string(models.StatusUp), LatencyNs: stats.AvgRtt.Nanoseconds()}
} }
func runPortCheck(_ context.Context, site models.Site) CheckResult { func runPortCheck(_ context.Context, site models.SiteConfig, pinnedIP net.IP) CheckResult {
host := site.Hostname host := site.Hostname
if host == "" { if host == "" {
host = site.URL host = site.URL
} }
if pinnedIP != nil {
host = pinnedIP.String()
}
addr := net.JoinHostPort(host, strconv.Itoa(site.Port)) addr := net.JoinHostPort(host, strconv.Itoa(site.Port))
timeout := siteTimeout(site) timeout := siteTimeout(site)
@@ -170,13 +185,13 @@ func runPortCheck(_ context.Context, site models.Site) CheckResult {
latency := time.Since(start) latency := time.Since(start)
if err != nil { if err != nil {
return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds(), ErrorReason: truncateError(err.Error(), maxErrorLength)} return CheckResult{SiteID: site.ID, Status: string(models.StatusDown), LatencyNs: latency.Nanoseconds(), ErrorReason: truncateError(err.Error(), maxErrorLength)}
} }
_ = conn.Close() _ = conn.Close()
return CheckResult{SiteID: site.ID, Status: "UP", LatencyNs: latency.Nanoseconds()} return CheckResult{SiteID: site.ID, Status: string(models.StatusUp), LatencyNs: latency.Nanoseconds()}
} }
func runDNSCheck(_ context.Context, site models.Site) CheckResult { func runDNSCheck(_ context.Context, site models.SiteConfig, allowPrivate bool) CheckResult {
host := site.Hostname host := site.Hostname
if host == "" { if host == "" {
host = site.URL host = site.URL
@@ -186,9 +201,24 @@ func runDNSCheck(_ context.Context, site models.Site) CheckResult {
if server == "" { if server == "" {
server = defaultDNSServer server = defaultDNSServer
} }
if _, _, err := net.SplitHostPort(server); err != nil { serverHost, serverPort, err := net.SplitHostPort(server)
server = net.JoinHostPort(server, defaultDNSPort) if err != nil {
serverHost = server
serverPort = defaultDNSPort
} }
if !allowPrivate {
if serverPort != defaultDNSPort {
return CheckResult{SiteID: site.ID, Status: string(models.StatusDown), ErrorReason: "DNS server port must be 53"}
}
if ips, err := net.LookupIP(serverHost); err == nil {
for _, ip := range ips {
if isPrivateIP(ip) {
return CheckResult{SiteID: site.ID, Status: string(models.StatusDown), ErrorReason: "DNS server resolves to private address"}
}
}
}
}
server = net.JoinHostPort(serverHost, serverPort)
qtype := dns.TypeA qtype := dns.TypeA
switch site.DNSResolveType { switch site.DNSResolveType {
@@ -221,15 +251,15 @@ func runDNSCheck(_ context.Context, site models.Site) CheckResult {
latency := time.Since(start) latency := time.Since(start)
if err != nil { if err != nil {
return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds(), ErrorReason: "DNS query failed: " + err.Error()} return CheckResult{SiteID: site.ID, Status: string(models.StatusDown), LatencyNs: latency.Nanoseconds(), ErrorReason: "DNS query failed: " + err.Error()}
} }
if r.Rcode != dns.RcodeSuccess { if r.Rcode != dns.RcodeSuccess {
return CheckResult{SiteID: site.ID, Status: "DOWN", StatusCode: r.Rcode, LatencyNs: latency.Nanoseconds(), ErrorReason: "DNS RCODE: " + dns.RcodeToString[r.Rcode]} return CheckResult{SiteID: site.ID, Status: string(models.StatusDown), StatusCode: r.Rcode, LatencyNs: latency.Nanoseconds(), ErrorReason: "DNS RCODE: " + dns.RcodeToString[r.Rcode]}
} }
return CheckResult{SiteID: site.ID, Status: "UP", LatencyNs: latency.Nanoseconds()} return CheckResult{SiteID: site.ID, Status: string(models.StatusUp), LatencyNs: latency.Nanoseconds()}
} }
func siteTimeout(site models.Site) time.Duration { func siteTimeout(site models.SiteConfig) time.Duration {
if site.Timeout > 0 { if site.Timeout > 0 {
return time.Duration(site.Timeout) * time.Second return time.Duration(site.Timeout) * time.Second
} }
+57 -20
@@ -19,8 +19,8 @@ func TestRunCheck_HTTP_Success(t *testing.T) {
})) }))
defer srv.Close() defer srv.Close()
site := models.Site{ID: 1, Type: "http", URL: srv.URL} site := models.SiteConfig{ID: 1, Type: "http", URL: srv.URL}
result := RunCheck(context.Background(), site, http.DefaultClient, http.DefaultClient, false) result := RunCheck(context.Background(), site, http.DefaultClient, http.DefaultClient, false, false)
if result.Status != "UP" { if result.Status != "UP" {
t.Errorf("expected UP, got %s", result.Status) t.Errorf("expected UP, got %s", result.Status)
@@ -39,8 +39,8 @@ func TestRunCheck_HTTP_ServerError(t *testing.T) {
})) }))
defer srv.Close() defer srv.Close()
site := models.Site{ID: 1, Type: "http", URL: srv.URL} site := models.SiteConfig{ID: 1, Type: "http", URL: srv.URL}
result := RunCheck(context.Background(), site, http.DefaultClient, http.DefaultClient, false) result := RunCheck(context.Background(), site, http.DefaultClient, http.DefaultClient, false, false)
if result.Status != "DOWN" { if result.Status != "DOWN" {
t.Errorf("expected DOWN, got %s", result.Status) t.Errorf("expected DOWN, got %s", result.Status)
@@ -60,8 +60,8 @@ func TestRunCheck_HTTP_CustomAcceptedCodes(t *testing.T) {
return http.ErrUseLastResponse return http.ErrUseLastResponse
}} }}
site := models.Site{ID: 1, Type: "http", URL: srv.URL, AcceptedCodes: "200-399"} site := models.SiteConfig{ID: 1, Type: "http", URL: srv.URL, AcceptedCodes: "200-399"}
result := RunCheck(context.Background(), site, client, client, false) result := RunCheck(context.Background(), site, client, client, false, false)
if result.Status != "UP" { if result.Status != "UP" {
t.Errorf("expected UP with accepted 200-399, got %s", result.Status) t.Errorf("expected UP with accepted 200-399, got %s", result.Status)
@@ -76,8 +76,8 @@ func TestRunCheck_HTTP_MethodRespected(t *testing.T) {
})) }))
defer srv.Close() defer srv.Close()
site := models.Site{ID: 1, Type: "http", URL: srv.URL, Method: "HEAD"} site := models.SiteConfig{ID: 1, Type: "http", URL: srv.URL, Method: "HEAD"}
RunCheck(context.Background(), site, http.DefaultClient, http.DefaultClient, false) RunCheck(context.Background(), site, http.DefaultClient, http.DefaultClient, false, false)
if receivedMethod != "HEAD" { if receivedMethod != "HEAD" {
t.Errorf("expected HEAD, got %s", receivedMethod) t.Errorf("expected HEAD, got %s", receivedMethod)
@@ -91,8 +91,8 @@ func TestRunCheck_HTTP_Timeout(t *testing.T) {
})) }))
defer srv.Close() defer srv.Close()
site := models.Site{ID: 1, Type: "http", URL: srv.URL, Timeout: 1} site := models.SiteConfig{ID: 1, Type: "http", URL: srv.URL, Timeout: 1}
result := RunCheck(context.Background(), site, http.DefaultClient, http.DefaultClient, false) result := RunCheck(context.Background(), site, http.DefaultClient, http.DefaultClient, false, false)
if result.Status != "DOWN" { if result.Status != "DOWN" {
t.Errorf("expected DOWN on timeout, got %s", result.Status) t.Errorf("expected DOWN on timeout, got %s", result.Status)
@@ -109,8 +109,8 @@ func TestRunCheck_HTTP_SSLFields(t *testing.T) {
Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: true}}, Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: true}},
} }
site := models.Site{ID: 1, Type: "http", URL: srv.URL, CheckSSL: true, IgnoreTLS: true} site := models.SiteConfig{ID: 1, Type: "http", URL: srv.URL, CheckSSL: true, IgnoreTLS: true}
result := RunCheck(context.Background(), site, http.DefaultClient, insecureClient, false) result := RunCheck(context.Background(), site, http.DefaultClient, insecureClient, false, false)
if result.Status != "UP" { if result.Status != "UP" {
t.Errorf("expected UP, got %s", result.Status) t.Errorf("expected UP, got %s", result.Status)
@@ -133,7 +133,7 @@ func TestRunCheck_Port_Open(t *testing.T) {
_, portStr, _ := net.SplitHostPort(ln.Addr().String()) _, portStr, _ := net.SplitHostPort(ln.Addr().String())
port, _ := strconv.Atoi(portStr) port, _ := strconv.Atoi(portStr)
site := models.Site{ID: 1, Type: "port", Hostname: "127.0.0.1", Port: port, Timeout: 2} site := models.SiteConfig{ID: 1, Type: "port", Hostname: "127.0.0.1", Port: port, Timeout: 2}
result := RunCheck(context.Background(), site, nil, nil, false, true) result := RunCheck(context.Background(), site, nil, nil, false, true)
if result.Status != "UP" { if result.Status != "UP" {
@@ -153,7 +153,7 @@ func TestRunCheck_Port_Closed(t *testing.T) {
port, _ := strconv.Atoi(portStr) port, _ := strconv.Atoi(portStr)
ln.Close() ln.Close()
site := models.Site{ID: 1, Type: "port", Hostname: "127.0.0.1", Port: port, Timeout: 1} site := models.SiteConfig{ID: 1, Type: "port", Hostname: "127.0.0.1", Port: port, Timeout: 1}
result := RunCheck(context.Background(), site, nil, nil, false, true) result := RunCheck(context.Background(), site, nil, nil, false, true)
if result.Status != "DOWN" { if result.Status != "DOWN" {
@@ -161,6 +161,43 @@ func TestRunCheck_Port_Closed(t *testing.T) {
} }
} }
func TestRunPortCheck_UsesPinnedIP(t *testing.T) {
ln, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatal(err)
}
defer ln.Close()
_, portStr, _ := net.SplitHostPort(ln.Addr().String())
port, _ := strconv.Atoi(portStr)
// Pass a pinned IP — runPortCheck should dial it instead of resolving Hostname.
site := models.SiteConfig{ID: 1, Type: "port", Hostname: "will-not-resolve.invalid", Port: port, Timeout: 2}
result := runPortCheck(context.Background(), site, net.ParseIP("127.0.0.1"))
if result.Status != "UP" {
t.Errorf("expected UP when pinned IP used, got %s: %s", result.Status, result.ErrorReason)
}
}
func TestRunPortCheck_NilPinnedIP_UsesHostname(t *testing.T) {
ln, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatal(err)
}
defer ln.Close()
_, portStr, _ := net.SplitHostPort(ln.Addr().String())
port, _ := strconv.Atoi(portStr)
site := models.SiteConfig{ID: 1, Type: "port", Hostname: "127.0.0.1", Port: port, Timeout: 2}
result := runPortCheck(context.Background(), site, nil)
if result.Status != "UP" {
t.Errorf("expected UP with nil pinnedIP fallback, got %s: %s", result.Status, result.ErrorReason)
}
}
func TestRunCheck_Port_BlocksPrivateByDefault(t *testing.T) { func TestRunCheck_Port_BlocksPrivateByDefault(t *testing.T) {
ln, err := net.Listen("tcp", "127.0.0.1:0") ln, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil { if err != nil {
@@ -171,8 +208,8 @@ func TestRunCheck_Port_BlocksPrivateByDefault(t *testing.T) {
_, portStr, _ := net.SplitHostPort(ln.Addr().String()) _, portStr, _ := net.SplitHostPort(ln.Addr().String())
port, _ := strconv.Atoi(portStr) port, _ := strconv.Atoi(portStr)
site := models.Site{ID: 1, Type: "port", Hostname: "127.0.0.1", Port: port, Timeout: 2} site := models.SiteConfig{ID: 1, Type: "port", Hostname: "127.0.0.1", Port: port, Timeout: 2}
result := RunCheck(context.Background(), site, nil, nil, false) result := RunCheck(context.Background(), site, nil, nil, false, false)
if result.Status != "DOWN" { if result.Status != "DOWN" {
t.Errorf("expected DOWN when private targets blocked, got %s", result.Status) t.Errorf("expected DOWN when private targets blocked, got %s", result.Status)
@@ -180,8 +217,8 @@ func TestRunCheck_Port_BlocksPrivateByDefault(t *testing.T) {
} }
func TestRunCheck_UnknownType(t *testing.T) { func TestRunCheck_UnknownType(t *testing.T) {
site := models.Site{ID: 1, Type: "invalid"} site := models.SiteConfig{ID: 1, Type: "invalid"}
result := RunCheck(context.Background(), site, nil, nil, false) result := RunCheck(context.Background(), site, nil, nil, false, false)
if result.Status != "DOWN" { if result.Status != "DOWN" {
t.Errorf("expected DOWN for unknown type, got %s", result.Status) t.Errorf("expected DOWN for unknown type, got %s", result.Status)
@@ -214,10 +251,10 @@ func TestIsCodeAccepted(t *testing.T) {
} }
func TestSiteTimeout(t *testing.T) { func TestSiteTimeout(t *testing.T) {
if got := siteTimeout(models.Site{Timeout: 0}); got != 5*time.Second { if got := siteTimeout(models.SiteConfig{Timeout: 0}); got != 5*time.Second {
t.Errorf("expected 5s default, got %v", got) t.Errorf("expected 5s default, got %v", got)
} }
if got := siteTimeout(models.Site{Timeout: 10}); got != 10*time.Second { if got := siteTimeout(models.SiteConfig{Timeout: 10}); got != 10*time.Second {
t.Errorf("expected 10s, got %v", got) t.Errorf("expected 10s, got %v", got)
} }
} }
+96 -78
@@ -115,10 +115,14 @@ func newEngine(s store.Store, allowPrivateTargets bool) *Engine {
} }
} }
// SetInsecureSkipVerify must be called before Start: the field is read by
// checker goroutines without synchronization.
func (e *Engine) SetInsecureSkipVerify(skip bool) { func (e *Engine) SetInsecureSkipVerify(skip bool) {
e.insecureSkipVerify = skip e.insecureSkipVerify = skip
} }
// SetMaintRetention must be called before Start: the field is read by the
// maintenance prune goroutine without synchronization.
func (e *Engine) SetMaintRetention(d time.Duration) { func (e *Engine) SetMaintRetention(d time.Duration) {
e.maintRetention = d e.maintRetention = d
} }
@@ -334,7 +338,7 @@ func (e *Engine) RecordHeartbeat(token string) bool {
} }
var ( var (
prevStatus string prevStatus models.Status
name string name string
alertID int alertID int
downSince time.Time downSince time.Time
@@ -346,12 +350,12 @@ func (e *Engine) RecordHeartbeat(token string) bool {
downSince = s.StatusChangedAt // captured before mutation = when it went down downSince = s.StatusChangedAt // captured before mutation = when it went down
s.LastCheck = time.Now() s.LastCheck = time.Now()
s.Status = "UP" s.Status = models.StatusUp
s.FailureCount = 0 s.FailureCount = 0
s.Latency = 0 s.Latency = 0
s.LastError = "" s.LastError = ""
s.LastSuccessAt = time.Now() s.LastSuccessAt = time.Now()
if prevStatus != "UP" { if prevStatus != models.StatusUp {
s.StatusChangedAt = time.Now() s.StatusChangedAt = time.Now()
} }
}) })
@@ -360,13 +364,13 @@ func (e *Engine) RecordHeartbeat(token string) bool {
} }
switch prevStatus { switch prevStatus {
case "PENDING": case models.StatusPending:
e.AddLog(fmt.Sprintf("Push Monitor '%s' received first heartbeat", name)) e.AddLog(fmt.Sprintf("Push Monitor '%s' received first heartbeat", name))
case "LATE": case models.StatusLate:
e.AddLog(fmt.Sprintf("Push Monitor '%s' heartbeat arrived (was late)", name)) e.AddLog(fmt.Sprintf("Push Monitor '%s' heartbeat arrived (was late)", name))
case "STALE": case models.StatusStale:
e.AddLog(fmt.Sprintf("Push Monitor '%s' heartbeat arrived (was stale)", name)) e.AddLog(fmt.Sprintf("Push Monitor '%s' heartbeat arrived (was stale)", name))
case "DOWN": case models.StatusDown:
downDur := "" downDur := ""
if !downSince.IsZero() { if !downSince.IsZero() {
downDur = fmt.Sprintf(" (was down %s)", fmtDurationShort(time.Since(downSince))) downDur = fmt.Sprintf(" (was down %s)", fmtDurationShort(time.Since(downSince)))
@@ -375,8 +379,10 @@ func (e *Engine) RecordHeartbeat(token string) bool {
go e.triggerAlert(alertID, "✅ RECOVERY", fmt.Sprintf("Push Monitor '%s' is receiving heartbeats.%s", name, downDur)) go e.triggerAlert(alertID, "✅ RECOVERY", fmt.Sprintf("Push Monitor '%s' is receiving heartbeats.%s", name, downDur))
} }
if prevStatus != "UP" && prevStatus != "PENDING" { e.recordCheck(targetID, 0, true)
e.enqueueWrite(writeStateChange{siteID: targetID, fromStatus: prevStatus, toStatus: "UP"})
if prevStatus != models.StatusUp && prevStatus != models.StatusPending {
e.enqueueWrite(writeStateChange{siteID: targetID, fromStatus: string(prevStatus), toStatus: string(models.StatusUp)})
} }
return true return true
@@ -418,7 +424,7 @@ func (e *Engine) Start(ctx context.Context) {
e.refreshMaintenanceCache(ctx) e.refreshMaintenanceCache(ctx)
sites, err := e.db.GetSites(ctx) configs, err := e.db.GetSites(ctx)
if err != nil { if err != nil {
e.AddLog(fmt.Sprintf("Failed to load sites: %v", err)) e.AddLog(fmt.Sprintf("Failed to load sites: %v", err))
select { select {
@@ -428,34 +434,51 @@ func (e *Engine) Start(ctx context.Context) {
} }
continue continue
} }
for _, s := range sites { dbIDs := make(map[int]bool, len(configs))
for _, cfg := range configs {
dbIDs[cfg.ID] = true
e.mu.RLock() e.mu.RLock()
_, exists := e.liveState[s.ID] existing, exists := e.liveState[cfg.ID]
e.mu.RUnlock() e.mu.RUnlock()
if !exists { if !exists {
e.mu.Lock() e.mu.Lock()
s.Status = "PENDING" site := models.Site{SiteConfig: cfg, SiteState: models.SiteState{Status: models.StatusPending}}
if h, ok := e.GetHistory(s.ID); ok && len(h.Statuses) > 0 { if h, ok := e.GetHistory(cfg.ID); ok && len(h.Statuses) > 0 {
if h.Statuses[len(h.Statuses)-1] { if h.Statuses[len(h.Statuses)-1] {
s.Status = "UP" site.Status = models.StatusUp
} else { } else {
s.Status = "DOWN" site.Status = models.StatusDown
} }
if len(h.Latencies) > 0 { if len(h.Latencies) > 0 {
s.Latency = h.Latencies[len(h.Latencies)-1] site.Latency = h.Latencies[len(h.Latencies)-1]
} }
} }
e.liveState[s.ID] = s e.liveState[cfg.ID] = site
e.addToTokenIndex(s) e.addToTokenIndex(site)
e.mu.Unlock() e.mu.Unlock()
e.checkerWG.Add(1) e.checkerWG.Add(1)
go func(id int) { go func(id int) {
defer e.checkerWG.Done() defer e.checkerWG.Done()
e.monitorRoutine(ctx, id) e.monitorRoutine(ctx, id)
}(s.ID) }(cfg.ID)
} else if existing.SiteConfig != cfg {
e.UpdateSiteConfig(cfg)
} }
} }
e.mu.RLock()
var vanished []int
for id := range e.liveState {
if !dbIDs[id] {
vanished = append(vanished, id)
}
}
e.mu.RUnlock()
for _, id := range vanished {
e.RemoveSite(id)
e.AddLog(fmt.Sprintf("Monitor removed (no longer in DB): ID %d", id))
}
select { select {
case <-time.After(pollInterval): case <-time.After(pollInterval):
case <-ctx.Done(): case <-ctx.Done():
@@ -498,27 +521,17 @@ func (e *Engine) pruneMaintenanceWindows(ctx context.Context) {
} }
} }
func (e *Engine) UpdateSiteConfig(site models.Site) { func (e *Engine) UpdateSiteConfig(cfg models.SiteConfig) {
e.mu.Lock() e.mu.Lock()
if existing, ok := e.liveState[site.ID]; ok { if existing, ok := e.liveState[cfg.ID]; ok {
e.removeFromTokenIndex(site.ID) e.removeFromTokenIndex(cfg.ID)
site.Status = existing.Status existing.SiteConfig = cfg
site.StatusCode = existing.StatusCode e.liveState[cfg.ID] = existing
site.Latency = existing.Latency e.addToTokenIndex(existing)
site.CertExpiry = existing.CertExpiry
site.HasSSL = existing.HasSSL
site.LastCheck = existing.LastCheck
site.SentSSLWarning = existing.SentSSLWarning
site.FailureCount = existing.FailureCount
site.LastError = existing.LastError
site.StatusChangedAt = existing.StatusChangedAt
site.LastSuccessAt = existing.LastSuccessAt
e.liveState[site.ID] = site
e.addToTokenIndex(site)
} }
e.mu.Unlock() e.mu.Unlock()
e.signalRecheck(site.ID) e.signalRecheck(cfg.ID)
} }
func (e *Engine) getRecheckChan(id int) chan struct{} { func (e *Engine) getRecheckChan(id int) chan struct{} {
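Note on the hunk above: `UpdateSiteConfig` no longer copies eleven runtime fields one by one. It swaps the config half of the live entry and leaves the state half alone. A minimal sketch of that idea, assuming `models.Site` embeds `SiteConfig` and `SiteState` (as the composite literals in the tests suggest) and that `SiteConfig` holds only comparable fields, which the `existing.SiteConfig != cfg` check requires. The field sets and the `applyConfig` helper are illustrative, not the project's actual definitions.

```go
package main

import "fmt"

// SiteConfig holds user-editable settings. Trimmed, illustrative field set;
// every field must be comparable for the == check below to compile.
type SiteConfig struct {
	ID         int
	Name       string
	MaxRetries int
}

// SiteState holds runtime results that a config update must not reset.
type SiteState struct {
	Status       string
	FailureCount int
}

// Site embeds both halves, so fields are promoted (site.Name, site.Status).
type Site struct {
	SiteConfig
	SiteState
}

// applyConfig (hypothetical helper) swaps the config half of a live entry
// and leaves the state half untouched. It reports whether anything changed.
func applyConfig(live map[int]Site, cfg SiteConfig) bool {
	existing, ok := live[cfg.ID]
	if !ok || existing.SiteConfig == cfg {
		return false
	}
	existing.SiteConfig = cfg
	live[cfg.ID] = existing
	return true
}

func main() {
	live := map[int]Site{
		1: {
			SiteConfig: SiteConfig{ID: 1, Name: "old", MaxRetries: 3},
			SiteState:  SiteState{Status: "DOWN", FailureCount: 4},
		},
	}
	changed := applyConfig(live, SiteConfig{ID: 1, Name: "new", MaxRetries: 5})
	// prints: true new DOWN 4
	fmt.Println(changed, live[1].Name, live[1].Status, live[1].FailureCount)
}
```

If `SiteConfig` ever gained a slice or map field, the struct comparison would stop compiling, so the equality check also acts as a guard on the shape of the config type.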
@@ -675,7 +688,7 @@ func (e *Engine) checkByID(ctx context.Context, id int) {
case "group": case "group":
e.checkGroup(ctx, site) e.checkGroup(ctx, site)
default: default:
result := RunCheck(ctx, site, e.strictClient, e.insecureClient, e.insecureSkipVerify, e.allowPrivateTargets) result := RunCheck(ctx, site.SiteConfig, e.strictClient, e.insecureClient, e.insecureSkipVerify, e.allowPrivateTargets)
updatedSite := site updatedSite := site
updatedSite.HasSSL = result.HasSSL updatedSite.HasSSL = result.HasSSL
updatedSite.CertExpiry = result.CertExpiry updatedSite.CertExpiry = result.CertExpiry
@@ -686,7 +699,7 @@ func (e *Engine) checkByID(ctx context.Context, id int) {
} }
func (e *Engine) checkPush(_ context.Context, site models.Site) { func (e *Engine) checkPush(_ context.Context, site models.Site) {
if site.Status == "PENDING" { if site.Status == models.StatusPending {
return return
} }
@@ -702,16 +715,16 @@ func (e *Engine) checkPush(_ context.Context, site models.Site) {
now := time.Now() now := time.Now()
if now.After(graceEnd) { if now.After(graceEnd) {
if site.Status != "DOWN" { if site.Status != models.StatusDown {
e.handleStatusChange(site, "DOWN", 0, 0, "heartbeat missed") e.handleStatusChange(site, string(models.StatusDown), 0, 0, "heartbeat missed")
} }
} else if now.After(staleMark) { } else if now.After(staleMark) {
if site.Status != "STALE" { if site.Status != models.StatusStale {
e.handleStatusChange(site, "STALE", 0, 0, "heartbeat stale") e.handleStatusChange(site, string(models.StatusStale), 0, 0, "heartbeat stale")
} }
} else if now.After(overdue) { } else if now.After(overdue) {
if site.Status != "LATE" { if site.Status != models.StatusLate {
e.handleStatusChange(site, "LATE", 0, 0, "heartbeat overdue") e.handleStatusChange(site, string(models.StatusLate), 0, 0, "heartbeat overdue")
} }
} }
} }
@@ -727,9 +740,10 @@ func (e *Engine) handleStatusChange(snap models.Site, rawStatus string, code int
} }
inMaint := e.isInMaintenance(snap.ID) inMaint := e.isInMaintenance(snap.ID)
status := models.Status(rawStatus)
var ( var (
prev, next string prev, next models.Status
name, typ string name, typ string
alertID int alertID int
failCount, maxRetries int failCount, maxRetries int
@@ -745,7 +759,7 @@ func (e *Engine) handleStatusChange(snap models.Site, rawStatus string, code int
_, exists := e.applyState(snap.ID, func(s *models.Site) { _, exists := e.applyState(snap.ID, func(s *models.Site) {
// A non-UP result computed from a stale snapshot must not override a // A non-UP result computed from a stale snapshot must not override a
// heartbeat (or newer check) that landed while we were evaluating. // heartbeat (or newer check) that landed while we were evaluating.
if rawStatus != "UP" && s.LastCheck.After(snap.LastCheck) { if status != models.StatusUp && s.LastCheck.After(snap.LastCheck) {
skipped = true skipped = true
return return
} }
@@ -764,24 +778,24 @@ func (e *Engine) handleStatusChange(snap models.Site, rawStatus string, code int
s.HasSSL = snap.HasSSL s.HasSSL = snap.HasSSL
s.CertExpiry = snap.CertExpiry s.CertExpiry = snap.CertExpiry
s.LastError = errorReason s.LastError = errorReason
if rawStatus == "UP" { if status == models.StatusUp {
s.LastSuccessAt = time.Now() s.LastSuccessAt = time.Now()
s.LastError = "" s.LastError = ""
} }
// Status + failure-count transition, based on the CURRENT live status. // Status + failure-count transition, based on the CURRENT live status.
if rawStatus == "UP" { if status == models.StatusUp {
s.FailureCount = 0 s.FailureCount = 0
s.Status = "UP" s.Status = models.StatusUp
} else { } else {
if s.FailureCount <= s.MaxRetries { if s.FailureCount <= s.MaxRetries {
s.FailureCount++ s.FailureCount++
} }
if s.FailureCount > s.MaxRetries { if s.FailureCount > s.MaxRetries {
if s.Status != rawStatus { if s.Status != status {
confirmedDown = true confirmedDown = true
} }
s.Status = rawStatus s.Status = status
s.FailureCount = s.MaxRetries + 1 s.FailureCount = s.MaxRetries + 1
} else { } else {
failedCheck = true failedCheck = true
@@ -789,16 +803,16 @@ func (e *Engine) handleStatusChange(snap models.Site, rawStatus string, code int
} }
failCount = s.FailureCount failCount = s.FailureCount
if s.Status != prev && prev != "PENDING" { if s.Status != prev && prev != models.StatusPending {
s.StatusChangedAt = time.Now() s.StatusChangedAt = time.Now()
} else if s.StatusChangedAt.IsZero() && s.Status != "PENDING" { } else if s.StatusChangedAt.IsZero() && s.Status != models.StatusPending {
s.StatusChangedAt = time.Now() s.StatusChangedAt = time.Now()
} }
// SSL expiry warning (fresh HasSSL/CertExpiry + config threshold). // SSL expiry warning (fresh HasSSL/CertExpiry + config threshold).
if typ == "http" && s.CheckSSL && s.HasSSL { if typ == "http" && s.CheckSSL && s.HasSSL {
days := int(time.Until(s.CertExpiry).Hours() / 24) days := int(time.Until(s.CertExpiry).Hours() / 24)
if days <= s.ExpiryThreshold && !s.SentSSLWarning && rawStatus != "SSL EXP" { if days <= s.ExpiryThreshold && !s.SentSSLWarning && status != models.StatusSSLExp {
sslWarnFire = true sslWarnFire = true
sslDays = days sslDays = days
s.SentSSLWarning = true s.SentSSLWarning = true
@@ -815,7 +829,7 @@ func (e *Engine) handleStatusChange(snap models.Site, rawStatus string, code int
return return
} }
e.recordCheck(snap.ID, latency, rawStatus == "UP") e.recordCheck(snap.ID, latency, status == models.StatusUp)
if confirmedDown { if confirmedDown {
if errorReason != "" { if errorReason != "" {
@@ -827,8 +841,8 @@ func (e *Engine) handleStatusChange(snap models.Site, rawStatus string, code int
e.AddLog(fmt.Sprintf("Monitor '%s' failed check %d/%d", name, failCount, maxRetries)) e.AddLog(fmt.Sprintf("Monitor '%s' failed check %d/%d", name, failCount, maxRetries))
} }
if changed && prev != "PENDING" { if changed && prev != models.StatusPending {
e.enqueueWrite(writeStateChange{siteID: snap.ID, fromStatus: prev, toStatus: next, reason: errorReason}) e.enqueueWrite(writeStateChange{siteID: snap.ID, fromStatus: string(prev), toStatus: string(next), reason: errorReason})
} }
if sslWarnFire { if sslWarnFire {
@@ -839,13 +853,11 @@ func (e *Engine) handleStatusChange(snap models.Site, rawStatus string, code int
} }
} }
isBroken := func(s string) bool { return s == "DOWN" || s == "SSL EXP" } if prev == models.StatusUp && next == models.StatusLate {
if prev == "UP" && next == "LATE" {
e.AddLog(fmt.Sprintf("Monitor '%s' heartbeat overdue", name)) e.AddLog(fmt.Sprintf("Monitor '%s' heartbeat overdue", name))
} }
if !isBroken(prev) && isBroken(next) && next != "PENDING" { if !prev.IsBroken() && next.IsBroken() && next != models.StatusPending {
if inMaint { if inMaint {
e.AddLog(fmt.Sprintf("Monitor '%s' is DOWN (alerts suppressed — maintenance)", name)) e.AddLog(fmt.Sprintf("Monitor '%s' is DOWN (alerts suppressed — maintenance)", name))
} else { } else {
@@ -859,7 +871,7 @@ func (e *Engine) handleStatusChange(snap models.Site, rawStatus string, code int
e.triggerAlert(alertID, "🚨 ALERT", msg) e.triggerAlert(alertID, "🚨 ALERT", msg)
} }
} }
if isBroken(prev) && next == "UP" { if prev.IsBroken() && next == models.StatusUp {
downDur := "" downDur := ""
if !downSince.IsZero() { if !downSince.IsZero() {
downDur = fmt.Sprintf(" (was down %s)", fmtDurationShort(time.Since(downSince))) downDur = fmt.Sprintf(" (was down %s)", fmtDurationShort(time.Since(downSince)))
@@ -869,12 +881,15 @@ func (e *Engine) handleStatusChange(snap models.Site, rawStatus string, code int
e.triggerAlert(alertID, "✅ RECOVERY", fmt.Sprintf("Monitor '%s' is UP%s", name, downDur)) e.triggerAlert(alertID, "✅ RECOVERY", fmt.Sprintf("Monitor '%s' is UP%s", name, downDur))
} }
} }
if prev == "LATE" && next == "UP" && !isBroken(prev) { if prev == models.StatusLate && next == models.StatusUp && !prev.IsBroken() {
e.AddLog(fmt.Sprintf("Monitor '%s' heartbeat arrived (was late)", name)) e.AddLog(fmt.Sprintf("Monitor '%s' heartbeat arrived (was late)", name))
} }
} }
func (e *Engine) triggerAlert(alertID int, title, message string) { func (e *Engine) triggerAlert(alertID int, title, message string) {
if alertID <= 0 {
return
}
cfg, err := e.db.GetAlert(context.Background(), alertID) cfg, err := e.db.GetAlert(context.Background(), alertID)
if err != nil { if err != nil {
e.AddLog(fmt.Sprintf("Failed to load alert config %d: %v", alertID, err)) e.AddLog(fmt.Sprintf("Failed to load alert config %d: %v", alertID, err))
@@ -991,12 +1006,12 @@ func (e *Engine) GetDisplayStatus(site models.Site) string {
if e.isInMaintenance(site.ID) { if e.isInMaintenance(site.ID) {
return "MAINT" return "MAINT"
} }
return site.Status return string(site.Status)
} }
func (e *Engine) checkGroup(_ context.Context, site models.Site) { func (e *Engine) checkGroup(_ context.Context, site models.Site) {
e.mu.RLock() e.mu.RLock()
status := "UP" status := models.StatusUp
hasChildren := false hasChildren := false
for _, child := range e.liveState { for _, child := range e.liveState {
if child.ParentID != site.ID || child.Type == "group" { if child.ParentID != site.ID || child.Type == "group" {
@@ -1006,31 +1021,34 @@ func (e *Engine) checkGroup(_ context.Context, site models.Site) {
if child.Paused || e.isInMaintenance(child.ID) { if child.Paused || e.isInMaintenance(child.ID) {
continue continue
} }
if child.Status == "DOWN" || child.Status == "SSL EXP" { if child.Status == models.StatusDown || child.Status == models.StatusSSLExp {
status = "DOWN" status = models.StatusDown
} else if child.Status == "STALE" && status != "DOWN" { } else if child.Status == models.StatusStale && status != models.StatusDown {
status = "STALE" status = models.StatusStale
} else if child.Status == "LATE" && status != "DOWN" && status != "STALE" { } else if child.Status == models.StatusLate && status != models.StatusDown && status != models.StatusStale {
status = "LATE" status = models.StatusLate
} else if child.Status == "PENDING" && status != "DOWN" && status != "STALE" && status != "LATE" { } else if child.Status == models.StatusPending && status != models.StatusDown && status != models.StatusStale && status != models.StatusLate {
status = "PENDING" status = models.StatusPending
} }
} }
e.mu.RUnlock() e.mu.RUnlock()
if !hasChildren { if !hasChildren {
status = "PENDING" status = models.StatusPending
} }
e.applyState(site.ID, func(s *models.Site) { e.applyState(site.ID, func(s *models.Site) {
s.Status = status s.Status = status
}) })
e.recordCheck(site.ID, 0, !status.IsBroken())
} }
func (e *Engine) EnqueueProbeCheck(siteID int, nodeID string, latencyNs int64, isUp bool) { func (e *Engine) EnqueueProbeCheck(siteID int, nodeID string, latencyNs int64, isUp bool) {
e.enqueueWrite(writeProbeCheck{siteID: siteID, nodeID: nodeID, latencyNs: latencyNs, isUp: isUp}) e.enqueueWrite(writeProbeCheck{siteID: siteID, nodeID: nodeID, latencyNs: latencyNs, isUp: isUp})
} }
// SetAggStrategy must be called before Start: the field is read by the probe
// aggregation path without synchronization.
func (e *Engine) SetAggStrategy(strategy AggregationStrategy) { func (e *Engine) SetAggStrategy(strategy AggregationStrategy) {
e.aggStrategy = strategy e.aggStrategy = strategy
} }
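Note on `checkGroup` in the hunk above: the if/else chain picks the worst child status, with precedence DOWN (including SSL EXP) over STALE over LATE over PENDING over UP. The same rule can be read as a severity ranking. This is a sketch only: `Status` is a local stand-in for `models.Status`, `severity` and `aggregate` are hypothetical helpers, and the paused and maintenance filtering done by the real function is left out.

```go
package main

import "fmt"

// Status is a local stand-in for models.Status.
type Status string

const (
	StatusUp      Status = "UP"
	StatusPending Status = "PENDING"
	StatusLate    Status = "LATE"
	StatusStale   Status = "STALE"
	StatusDown    Status = "DOWN"
	StatusSSLExp  Status = "SSL EXP"
)

// severity ranks statuses in the order the if/else chain in checkGroup
// gives them priority. Unknown values rank lowest, like UP.
func severity(s Status) int {
	switch s {
	case StatusDown, StatusSSLExp:
		return 4
	case StatusStale:
		return 3
	case StatusLate:
		return 2
	case StatusPending:
		return 1
	default:
		return 0
	}
}

// aggregate returns the group status for the given child statuses.
// A group with no children is PENDING; an SSL EXP child surfaces as DOWN.
func aggregate(children []Status) Status {
	if len(children) == 0 {
		return StatusPending
	}
	worst := StatusUp
	for _, c := range children {
		if c == StatusSSLExp {
			c = StatusDown
		}
		if severity(c) > severity(worst) {
			worst = c
		}
	}
	return worst
}

func main() {
	fmt.Println(aggregate([]Status{StatusUp, StatusLate, StatusStale})) // STALE
	fmt.Println(aggregate([]Status{StatusUp, StatusSSLExp}))            // DOWN
	fmt.Println(aggregate(nil))                                         // PENDING
}
```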
@@ -1072,15 +1090,15 @@ func (e *Engine) IngestProbeResult(nodeID string, siteID int, latencyNs int64, i
aggUp, avgLatency := AggregateStatus(results, e.aggStrategy) aggUp, avgLatency := AggregateStatus(results, e.aggStrategy)
rawStatus := "UP" probeStatus := models.StatusUp
if !aggUp { if !aggUp {
rawStatus = "DOWN" probeStatus = models.StatusDown
} }
updatedSite := site updatedSite := site
updatedSite.Latency = time.Duration(avgLatency) updatedSite.Latency = time.Duration(avgLatency)
updatedSite.LastCheck = time.Now() updatedSite.LastCheck = time.Now()
e.handleStatusChange(updatedSite, rawStatus, 0, time.Duration(avgLatency), errorReason) e.handleStatusChange(updatedSite, string(probeStatus), 0, time.Duration(avgLatency), errorReason)
} }
func (e *Engine) GetProbeResults(siteID int) map[string]NodeResult { func (e *Engine) GetProbeResults(siteID int) map[string]NodeResult {
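Note on the typed status used throughout this file's diff: the changes convert at the boundaries (`models.Status(rawStatus)`, `string(models.StatusDown)`) and replace the inline `isBroken` closure with a method call. The type definition itself is not part of this section, so the following is a sketch under the assumption that `models.Status` is a string-backed type. The constant values are taken from the string literals the diff replaces.

```go
package main

import "fmt"

// Status is assumed to be a string-backed type; the conversions in the diff
// (Status(rawStatus), string(StatusDown)) only compile if that holds.
type Status string

const (
	StatusPending Status = "PENDING"
	StatusUp      Status = "UP"
	StatusDown    Status = "DOWN"
	StatusLate    Status = "LATE"
	StatusStale   Status = "STALE"
	StatusSSLExp  Status = "SSL EXP"
)

// IsBroken reproduces the removed closure: only DOWN and SSL EXP count as
// broken. LATE, STALE and PENDING do not.
func (s Status) IsBroken() bool {
	return s == StatusDown || s == StatusSSLExp
}

func main() {
	raw := "SSL EXP" // e.g. a value arriving as a plain string
	st := Status(raw)
	// prints: true false DOWN
	fmt.Println(st.IsBroken(), StatusLate.IsBroken(), string(StatusDown))
}
```

Untyped string constants such as `"UP"` still assign to a `Status` field without a conversion, which is why the test literals in the next file (`Status: "UP"`) keep compiling unchanged.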
+216 -142
@@ -8,6 +8,7 @@ import (
"time" "time"
"gitea.lerkolabs.com/lerkolabs/uptop/internal/models" "gitea.lerkolabs.com/lerkolabs/uptop/internal/models"
"gitea.lerkolabs.com/lerkolabs/uptop/internal/store/storetest"
) )
// --- Mock Store --- // --- Mock Store ---
@@ -19,8 +20,9 @@ type savedCheck struct {
} }
type mockStore struct { type mockStore struct {
storetest.BaseMock
mu sync.Mutex mu sync.Mutex
sites []models.Site sites []models.SiteConfig
alerts map[int]models.AlertConfig alerts map[int]models.AlertConfig
maintenance map[int]bool maintenance map[int]bool
logs []string logs []string
@@ -38,42 +40,8 @@ func newMockStore() *mockStore {
} }
} }
func (m *mockStore) Init(context.Context) error { return nil } func (m *mockStore) GetSites(context.Context) ([]models.SiteConfig, error) { return m.sites, nil }
func (m *mockStore) GetSites(context.Context) ([]models.Site, error) { return m.sites, nil }
func (m *mockStore) AddSite(context.Context, models.Site) error { return nil }
func (m *mockStore) UpdateSite(context.Context, models.Site) error { return nil }
func (m *mockStore) UpdateSitePaused(context.Context, int, bool) error { return nil }
func (m *mockStore) DeleteSite(context.Context, int) error { return nil }
func (m *mockStore) AddAlert(context.Context, string, string, map[string]string) error { return nil }
func (m *mockStore) UpdateAlert(context.Context, int, string, string, map[string]string) error {
return nil
}
func (m *mockStore) DeleteAlert(context.Context, int) error { return nil }
func (m *mockStore) GetAllUsers(context.Context) ([]models.User, error) { return nil, nil }
func (m *mockStore) AddUser(context.Context, string, string, string) error { return nil }
func (m *mockStore) UpdateUser(context.Context, int, string, string, string) error { return nil }
func (m *mockStore) DeleteUser(context.Context, int) error { return nil }
func (m *mockStore) ExportData(context.Context) (models.Backup, error) { return models.Backup{}, nil }
func (m *mockStore) ImportData(context.Context, models.Backup) error { return nil }
func (m *mockStore) GetSiteByName(context.Context, string) (models.Site, error) {
return models.Site{}, nil
}
func (m *mockStore) AddSiteReturningID(context.Context, models.Site) (int, error) { return 0, nil }
func (m *mockStore) AddAlertReturningID(context.Context, string, string, map[string]string) (int, error) {
return 0, nil
}
func (m *mockStore) SaveCheckFromNode(context.Context, int, string, int64, bool) error { return nil }
func (m *mockStore) RegisterNode(context.Context, models.ProbeNode) error { return nil }
func (m *mockStore) GetNode(context.Context, string) (models.ProbeNode, error) {
return models.ProbeNode{}, nil
}
func (m *mockStore) GetAllNodes(context.Context) ([]models.ProbeNode, error) { return nil, nil }
func (m *mockStore) UpdateNodeLastSeen(context.Context, string) error { return nil }
func (m *mockStore) DeleteNode(context.Context, string) error { return nil }
func (m *mockStore) LoadAlertHealth(context.Context) (map[int]models.AlertHealthRecord, error) {
return nil, nil
}
func (m *mockStore) SaveAlertHealth(context.Context, models.AlertHealthRecord) error { return nil }
func (m *mockStore) GetActiveMaintenanceWindows(context.Context) ([]models.MaintenanceWindow, error) { func (m *mockStore) GetActiveMaintenanceWindows(context.Context) ([]models.MaintenanceWindow, error) {
m.mu.Lock() m.mu.Lock()
defer m.mu.Unlock() defer m.mu.Unlock()
@@ -83,25 +51,6 @@ func (m *mockStore) GetActiveMaintenanceWindows(context.Context) ([]models.Maint
} }
return windows, nil return windows, nil
} }
func (m *mockStore) GetAllMaintenanceWindows(context.Context, int) ([]models.MaintenanceWindow, error) {
return nil, nil
}
func (m *mockStore) AddMaintenanceWindow(context.Context, models.MaintenanceWindow) error { return nil }
func (m *mockStore) EndMaintenanceWindow(context.Context, int) error { return nil }
func (m *mockStore) DeleteMaintenanceWindow(context.Context, int) error { return nil }
func (m *mockStore) PruneExpiredMaintenanceWindows(context.Context, time.Duration) (int64, error) {
return 0, nil
}
func (m *mockStore) GetPreference(context.Context, string) (string, error) { return "", nil }
func (m *mockStore) SetPreference(context.Context, string, string) error { return nil }
func (m *mockStore) SaveStateChange(context.Context, int, string, string, string) error { return nil }
func (m *mockStore) GetStateChanges(context.Context, int, int) ([]models.StateChange, error) {
return nil, nil
}
func (m *mockStore) GetStateChangesSince(context.Context, int, time.Time) ([]models.StateChange, error) {
return nil, nil
}
func (m *mockStore) Close() error { return nil }
func (m *mockStore) GetAllAlerts(context.Context) ([]models.AlertConfig, error) { func (m *mockStore) GetAllAlerts(context.Context) ([]models.AlertConfig, error) {
m.mu.Lock() m.mu.Lock()
@@ -154,18 +103,14 @@ func (m *mockStore) SaveLog(_ context.Context, msg string) error {
return nil return nil
} }
func (m *mockStore) LoadLogs(_ context.Context, limit int) ([]string, error) { func (m *mockStore) LoadLogs(_ context.Context, _ int) ([]string, error) {
return m.logs, nil return m.logs, nil
} }
func (m *mockStore) LoadAllHistory(_ context.Context, limit int) (map[int][]models.CheckRecord, error) { func (m *mockStore) LoadAllHistory(_ context.Context, _ int) (map[int][]models.CheckRecord, error) {
return m.history, nil return m.history, nil
} }
func (m *mockStore) PruneLogs(context.Context) error { return nil }
func (m *mockStore) PruneCheckHistory(context.Context) error { return nil }
func (m *mockStore) PruneStateChanges(context.Context) error { return nil }
// --- Helpers --- // --- Helpers ---
func newTestEngine(ms *mockStore) *Engine { func newTestEngine(ms *mockStore) *Engine {
@@ -203,7 +148,10 @@ func (m *mockStore) getAlertCallsSnapshot() []int {
func TestHandleStatusChange_PendingToUp(t *testing.T) { func TestHandleStatusChange_PendingToUp(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Status: "PENDING", MaxRetries: 3, AlertID: 1} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", MaxRetries: 3, AlertID: 1},
SiteState: models.SiteState{Status: "PENDING"},
}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "UP", 200, 10*time.Millisecond, "") e.handleStatusChange(site, "UP", 200, 10*time.Millisecond, "")
@@ -224,7 +172,10 @@ func TestHandleStatusChange_PendingToUp(t *testing.T) {
func TestHandleStatusChange_UpIncrementFailure(t *testing.T) { func TestHandleStatusChange_UpIncrementFailure(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 3, FailureCount: 0} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", MaxRetries: 3},
SiteState: models.SiteState{Status: "UP", FailureCount: 0},
}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "DOWN", 500, 0, "test error") e.handleStatusChange(site, "DOWN", 500, 0, "test error")
@@ -242,7 +193,10 @@ func TestHandleStatusChange_UpToDown_ExceedsRetries(t *testing.T) {
ms := newMockStore() ms := newMockStore()
ms.alerts[1] = models.AlertConfig{ID: 1, Name: "discord", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}} ms.alerts[1] = models.AlertConfig{ID: 1, Name: "discord", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}}
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 2, FailureCount: 2, AlertID: 1} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", MaxRetries: 2, AlertID: 1},
SiteState: models.SiteState{Status: "UP", FailureCount: 2},
}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "DOWN", 500, 0, "test error") e.handleStatusChange(site, "DOWN", 500, 0, "test error")
@@ -265,7 +219,10 @@ func TestHandleStatusChange_UpToDown_ZeroRetries(t *testing.T) {
ms := newMockStore() ms := newMockStore()
ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}} ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}}
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 0, FailureCount: 0, AlertID: 1} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", MaxRetries: 0, AlertID: 1},
SiteState: models.SiteState{Status: "UP", FailureCount: 0},
}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "DOWN", 0, 0, "test error") e.handleStatusChange(site, "DOWN", 0, 0, "test error")
@@ -284,7 +241,10 @@ func TestHandleStatusChange_DownToUp_Recovery(t *testing.T) {
ms := newMockStore() ms := newMockStore()
ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}} ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}}
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Status: "DOWN", FailureCount: 4, AlertID: 1} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", AlertID: 1},
SiteState: models.SiteState{Status: "DOWN", FailureCount: 4},
}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "UP", 200, 5*time.Millisecond, "") e.handleStatusChange(site, "UP", 200, 5*time.Millisecond, "")
@@ -305,7 +265,10 @@ func TestHandleStatusChange_DownToUp_Recovery(t *testing.T) {
func TestHandleStatusChange_DownStaysDown(t *testing.T) { func TestHandleStatusChange_DownStaysDown(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Status: "DOWN", MaxRetries: 2, FailureCount: 3} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", MaxRetries: 2},
SiteState: models.SiteState{Status: "DOWN", FailureCount: 3},
}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "DOWN", 0, 0, "test error") e.handleStatusChange(site, "DOWN", 0, 0, "test error")
@@ -324,7 +287,10 @@ func TestHandleStatusChange_SSLExpired(t *testing.T) {
ms := newMockStore() ms := newMockStore()
ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}} ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}}
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 0, AlertID: 1} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", MaxRetries: 0, AlertID: 1},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "SSL EXP", 0, 0, "SSL certificate expired") e.handleStatusChange(site, "SSL EXP", 0, 0, "SSL certificate expired")
@@ -344,7 +310,10 @@ func TestHandleStatusChange_AlertSuppressedMaintenance(t *testing.T) {
ms.maintenance[1] = true ms.maintenance[1] = true
ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}} ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}}
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 0, AlertID: 1} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", MaxRetries: 0, AlertID: 1},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, site) injectSite(e, site)
e.refreshMaintenanceCache(context.Background()) e.refreshMaintenanceCache(context.Background())
@@ -376,7 +345,10 @@ func TestHandleStatusChange_RecoverySuppressedMaintenance(t *testing.T) {
ms.maintenance[1] = true ms.maintenance[1] = true
ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}} ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}}
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Status: "DOWN", AlertID: 1} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", AlertID: 1},
SiteState: models.SiteState{Status: "DOWN"},
}
injectSite(e, site) injectSite(e, site)
e.refreshMaintenanceCache(context.Background()) e.refreshMaintenanceCache(context.Background())
@@ -397,10 +369,8 @@ func TestHandleStatusChange_SSLWarning(t *testing.T) {
ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}} ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}}
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ site := models.Site{
ID: 1, Name: "test", Status: "UP", Type: "http", SiteConfig: models.SiteConfig{ID: 1, Name: "test", Type: "http", CheckSSL: true, ExpiryThreshold: 30, AlertID: 1},
CheckSSL: true, HasSSL: true, ExpiryThreshold: 30, SiteState: models.SiteState{Status: "UP", HasSSL: true, SentSSLWarning: false, CertExpiry: time.Now().Add(15 * 24 * time.Hour)},
SentSSLWarning: false, AlertID: 1,
CertExpiry: time.Now().Add(15 * 24 * time.Hour),
} }
injectSite(e, site) injectSite(e, site)
@@ -420,10 +390,8 @@ func TestHandleStatusChange_SSLWarningNotRepeated(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ site := models.Site{
ID: 1, Name: "test", Status: "UP", Type: "http", SiteConfig: models.SiteConfig{ID: 1, Name: "test", Type: "http", CheckSSL: true, ExpiryThreshold: 30, AlertID: 1},
CheckSSL: true, HasSSL: true, ExpiryThreshold: 30, SiteState: models.SiteState{Status: "UP", HasSSL: true, SentSSLWarning: true, CertExpiry: time.Now().Add(15 * 24 * time.Hour)},
SentSSLWarning: true, AlertID: 1,
CertExpiry: time.Now().Add(15 * 24 * time.Hour),
} }
injectSite(e, site) injectSite(e, site)
@@ -439,10 +407,8 @@ func TestHandleStatusChange_SSLWarningReset(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ site := models.Site{
ID: 1, Name: "test", Status: "UP", Type: "http", SiteConfig: models.SiteConfig{ID: 1, Name: "test", Type: "http", CheckSSL: true, ExpiryThreshold: 30},
CheckSSL: true, HasSSL: true, ExpiryThreshold: 30, SiteState: models.SiteState{Status: "UP", HasSSL: true, SentSSLWarning: true, CertExpiry: time.Now().Add(60 * 24 * time.Hour)},
SentSSLWarning: true,
CertExpiry: time.Now().Add(60 * 24 * time.Hour),
} }
injectSite(e, site) injectSite(e, site)
@@ -460,10 +426,8 @@ func TestHandleStatusChange_SSLWarningSuppressedMaint(t *testing.T) {
ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}} ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}}
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ site := models.Site{
ID: 1, Name: "test", Status: "UP", Type: "http", SiteConfig: models.SiteConfig{ID: 1, Name: "test", Type: "http", CheckSSL: true, ExpiryThreshold: 30, AlertID: 1},
CheckSSL: true, HasSSL: true, ExpiryThreshold: 30, SiteState: models.SiteState{Status: "UP", HasSSL: true, SentSSLWarning: false, CertExpiry: time.Now().Add(15 * 24 * time.Hour)},
SentSSLWarning: false, AlertID: 1,
CertExpiry: time.Now().Add(15 * 24 * time.Hour),
} }
injectSite(e, site) injectSite(e, site)
e.refreshMaintenanceCache(context.Background()) e.refreshMaintenanceCache(context.Background())
@@ -483,7 +447,10 @@ func TestHandleStatusChange_SSLWarningSuppressedMaint(t *testing.T) {
func TestHandleStatusChange_InactiveEngine(t *testing.T) { func TestHandleStatusChange_InactiveEngine(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 0} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", MaxRetries: 0},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, site) injectSite(e, site)
e.SetActive(false) e.SetActive(false)
@@ -500,7 +467,10 @@ func TestHandleStatusChange_InactiveEngine(t *testing.T) {
func TestRecordHeartbeat_ValidToken(t *testing.T) { func TestRecordHeartbeat_ValidToken(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "push-test", Type: "push", Token: "abc123", Status: "UP"} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "push-test", Type: "push", Token: "abc123"},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, site) injectSite(e, site)
if !e.RecordHeartbeat("abc123") { if !e.RecordHeartbeat("abc123") {
@@ -520,7 +490,10 @@ func TestRecordHeartbeat_RecoveryFromDown(t *testing.T) {
ms := newMockStore() ms := newMockStore()
ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}} ms.alerts[1] = models.AlertConfig{ID: 1, Name: "test", Type: "webhook", Settings: map[string]string{"url": "http://example.com"}}
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "push-test", Type: "push", Token: "abc123", Status: "DOWN", AlertID: 1, FailureCount: 3} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "push-test", Type: "push", Token: "abc123", AlertID: 1},
SiteState: models.SiteState{Status: "DOWN", FailureCount: 3},
}
injectSite(e, site) injectSite(e, site)
if !e.RecordHeartbeat("abc123") { if !e.RecordHeartbeat("abc123") {
@@ -552,7 +525,10 @@ func TestRecordHeartbeat_UnknownToken(t *testing.T) {
func TestRecordHeartbeat_InactiveEngine(t *testing.T) { func TestRecordHeartbeat_InactiveEngine(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Type: "push", Token: "abc123", Status: "UP"} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Type: "push", Token: "abc123"},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, site) injectSite(e, site)
e.SetActive(false) e.SetActive(false)
@@ -567,9 +543,8 @@ func TestCheckPush_DeadlineMissed(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ site := models.Site{
ID: 1, Name: "push", Type: "push", Status: "UP", SiteConfig: models.SiteConfig{ID: 1, Name: "push", Type: "push", Interval: 10, MaxRetries: 0},
Interval: 10, MaxRetries: 0, SiteState: models.SiteState{Status: "UP", LastCheck: time.Now().Add(-120 * time.Second)},
LastCheck: time.Now().Add(-120 * time.Second),
} }
injectSite(e, site) injectSite(e, site)
@@ -585,9 +560,8 @@ func TestCheckPush_OverdueBecomesLate(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ site := models.Site{
ID: 1, Name: "push", Type: "push", Status: "UP", SiteConfig: models.SiteConfig{ID: 1, Name: "push", Type: "push", Interval: 300},
Interval: 300, SiteState: models.SiteState{Status: "UP", LastCheck: time.Now().Add(-310 * time.Second)},
LastCheck: time.Now().Add(-310 * time.Second),
} }
injectSite(e, site) injectSite(e, site)
@@ -605,9 +579,8 @@ func TestCheckPush_OverdueBecomesStale(t *testing.T) {
// interval=300, grace=150 (300/2), staleMark=overdue+75 // interval=300, grace=150 (300/2), staleMark=overdue+75
// at 380s: past staleMark(375) but before graceEnd(450) // at 380s: past staleMark(375) but before graceEnd(450)
site := models.Site{ site := models.Site{
ID: 1, Name: "push", Type: "push", Status: "UP", SiteConfig: models.SiteConfig{ID: 1, Name: "push", Type: "push", Interval: 300},
Interval: 300, SiteState: models.SiteState{Status: "UP", LastCheck: time.Now().Add(-380 * time.Second)},
LastCheck: time.Now().Add(-380 * time.Second),
} }
injectSite(e, site) injectSite(e, site)
@@ -623,8 +596,8 @@ func TestCheckPush_WithinDeadline(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ site := models.Site{
ID: 1, Name: "push", Type: "push", Status: "UP", SiteConfig: models.SiteConfig{ID: 1, Name: "push", Type: "push", Interval: 60},
Interval: 60, LastCheck: time.Now(), SiteState: models.SiteState{Status: "UP", LastCheck: time.Now()},
} }
injectSite(e, site) injectSite(e, site)
@@ -640,8 +613,8 @@ func TestCheckPush_PendingStaysPending(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ site := models.Site{
ID: 1, Name: "push", Type: "push", Status: "PENDING", SiteConfig: models.SiteConfig{ID: 1, Name: "push", Type: "push", Interval: 60},
Interval: 60, SiteState: models.SiteState{Status: "PENDING"},
} }
injectSite(e, site) injectSite(e, site)
@@ -658,9 +631,18 @@ func TestCheckPush_PendingStaysPending(t *testing.T) {
func TestCheckGroup_AllChildrenUp(t *testing.T) { func TestCheckGroup_AllChildrenUp(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
group := models.Site{ID: 1, Name: "group", Type: "group", Status: "PENDING"} group := models.Site{
child1 := models.Site{ID: 2, Name: "child1", Type: "http", ParentID: 1, Status: "UP"} SiteConfig: models.SiteConfig{ID: 1, Name: "group", Type: "group"},
child2 := models.Site{ID: 3, Name: "child2", Type: "http", ParentID: 1, Status: "UP"} SiteState: models.SiteState{Status: "PENDING"},
}
child1 := models.Site{
SiteConfig: models.SiteConfig{ID: 2, Name: "child1", Type: "http", ParentID: 1},
SiteState: models.SiteState{Status: "UP"},
}
child2 := models.Site{
SiteConfig: models.SiteConfig{ID: 3, Name: "child2", Type: "http", ParentID: 1},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, group) injectSite(e, group)
injectSite(e, child1) injectSite(e, child1)
injectSite(e, child2) injectSite(e, child2)
@@ -676,9 +658,18 @@ func TestCheckGroup_AllChildrenUp(t *testing.T) {
func TestCheckGroup_OneChildDown(t *testing.T) { func TestCheckGroup_OneChildDown(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
group := models.Site{ID: 1, Name: "group", Type: "group", Status: "UP"} group := models.Site{
child1 := models.Site{ID: 2, Name: "child1", Type: "http", ParentID: 1, Status: "UP"} SiteConfig: models.SiteConfig{ID: 1, Name: "group", Type: "group"},
child2 := models.Site{ID: 3, Name: "child2", Type: "http", ParentID: 1, Status: "DOWN"} SiteState: models.SiteState{Status: "UP"},
}
child1 := models.Site{
SiteConfig: models.SiteConfig{ID: 2, Name: "child1", Type: "http", ParentID: 1},
SiteState: models.SiteState{Status: "UP"},
}
child2 := models.Site{
SiteConfig: models.SiteConfig{ID: 3, Name: "child2", Type: "http", ParentID: 1},
SiteState: models.SiteState{Status: "DOWN"},
}
injectSite(e, group) injectSite(e, group)
injectSite(e, child1) injectSite(e, child1)
injectSite(e, child2) injectSite(e, child2)
@@ -694,9 +685,17 @@ func TestCheckGroup_OneChildDown(t *testing.T) {
func TestCheckGroup_PausedChildIgnored(t *testing.T) { func TestCheckGroup_PausedChildIgnored(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
group := models.Site{ID: 1, Name: "group", Type: "group"} group := models.Site{
child1 := models.Site{ID: 2, Name: "child1", Type: "http", ParentID: 1, Status: "UP"} SiteConfig: models.SiteConfig{ID: 1, Name: "group", Type: "group"},
child2 := models.Site{ID: 3, Name: "child2", Type: "http", ParentID: 1, Status: "DOWN", Paused: true} }
child1 := models.Site{
SiteConfig: models.SiteConfig{ID: 2, Name: "child1", Type: "http", ParentID: 1},
SiteState: models.SiteState{Status: "UP"},
}
child2 := models.Site{
SiteConfig: models.SiteConfig{ID: 3, Name: "child2", Type: "http", ParentID: 1, Paused: true},
SiteState: models.SiteState{Status: "DOWN"},
}
injectSite(e, group) injectSite(e, group)
injectSite(e, child1) injectSite(e, child1)
injectSite(e, child2) injectSite(e, child2)
@@ -713,9 +712,17 @@ func TestCheckGroup_MaintenanceChildIgnored(t *testing.T) {
ms := newMockStore() ms := newMockStore()
ms.maintenance[3] = true ms.maintenance[3] = true
e := newTestEngine(ms) e := newTestEngine(ms)
group := models.Site{ID: 1, Name: "group", Type: "group"} group := models.Site{
child1 := models.Site{ID: 2, Name: "child1", Type: "http", ParentID: 1, Status: "UP"} SiteConfig: models.SiteConfig{ID: 1, Name: "group", Type: "group"},
child2 := models.Site{ID: 3, Name: "child2", Type: "http", ParentID: 1, Status: "DOWN"} }
child1 := models.Site{
SiteConfig: models.SiteConfig{ID: 2, Name: "child1", Type: "http", ParentID: 1},
SiteState: models.SiteState{Status: "UP"},
}
child2 := models.Site{
SiteConfig: models.SiteConfig{ID: 3, Name: "child2", Type: "http", ParentID: 1},
SiteState: models.SiteState{Status: "DOWN"},
}
injectSite(e, group) injectSite(e, group)
injectSite(e, child1) injectSite(e, child1)
injectSite(e, child2) injectSite(e, child2)
@@ -732,7 +739,10 @@ func TestCheckGroup_MaintenanceChildIgnored(t *testing.T) {
func TestCheckGroup_NoChildren(t *testing.T) { func TestCheckGroup_NoChildren(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
group := models.Site{ID: 1, Name: "group", Type: "group", Status: "UP"} group := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "group", Type: "group"},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, group) injectSite(e, group)
e.checkGroup(context.Background(), group) e.checkGroup(context.Background(), group)
@@ -827,10 +837,13 @@ func TestInitHistory_LoadsFromDB(t *testing.T) {
func TestUpdateSiteConfig_PreservesRuntime(t *testing.T) { func TestUpdateSiteConfig_PreservesRuntime(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", URL: "http://old.com", Status: "DOWN", FailureCount: 3, Latency: 100 * time.Millisecond} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", URL: "http://old.com"},
SiteState: models.SiteState{Status: "DOWN", FailureCount: 3, Latency: 100 * time.Millisecond},
}
injectSite(e, site) injectSite(e, site)
updated := models.Site{ID: 1, Name: "test", URL: "http://new.com", Interval: 60} updated := models.SiteConfig{ID: 1, Name: "test", URL: "http://new.com", Interval: 60}
e.UpdateSiteConfig(updated) e.UpdateSiteConfig(updated)
s, _ := getSite(e, 1) s, _ := getSite(e, 1)
@@ -851,7 +864,10 @@ func TestUpdateSiteConfig_PreservesRuntime(t *testing.T) {
func TestRemoveSite_CleansUp(t *testing.T) { func TestRemoveSite_CleansUp(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Type: "push", Token: "tok1", Status: "UP"} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", Type: "push", Token: "tok1"},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, site) injectSite(e, site)
e.recordCheck(1, 5*time.Millisecond, true) e.recordCheck(1, 5*time.Millisecond, true)
@@ -871,7 +887,10 @@ func TestRemoveSite_CleansUp(t *testing.T) {
func TestToggleSitePause(t *testing.T) { func TestToggleSitePause(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Status: "UP"} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test"},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, site) injectSite(e, site)
paused := e.ToggleSitePause(1) paused := e.ToggleSitePause(1)
@@ -900,8 +919,14 @@ func TestToggleSitePause_NonexistentSite(t *testing.T) {
func TestGetAllSites_ReturnsCopy(t *testing.T) { func TestGetAllSites_ReturnsCopy(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
injectSite(e, models.Site{ID: 1, Name: "s1", Status: "UP"}) injectSite(e, models.Site{
injectSite(e, models.Site{ID: 2, Name: "s2", Status: "DOWN"}) SiteConfig: models.SiteConfig{ID: 1, Name: "s1"},
SiteState: models.SiteState{Status: "UP"},
})
injectSite(e, models.Site{
SiteConfig: models.SiteConfig{ID: 2, Name: "s2"},
SiteState: models.SiteState{Status: "DOWN"},
})
sites := e.GetAllSites() sites := e.GetAllSites()
if len(sites) != 2 { if len(sites) != 2 {
@@ -920,10 +945,13 @@ func TestGetAllSites_ReturnsCopy(t *testing.T) {
func TestGetLiveState_ReturnsCopy(t *testing.T) { func TestGetLiveState_ReturnsCopy(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
injectSite(e, models.Site{ID: 1, Name: "s1", Status: "UP"}) injectSite(e, models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "s1"},
SiteState: models.SiteState{Status: "UP"},
})
state := e.GetLiveState() state := e.GetLiveState()
state[1] = models.Site{Name: "mutated"} state[1] = models.Site{SiteConfig: models.SiteConfig{Name: "mutated"}}
fresh := e.GetLiveState() fresh := e.GetLiveState()
if fresh[1].Name == "mutated" { if fresh[1].Name == "mutated" {
@@ -1039,7 +1067,8 @@ func TestConcurrent_RecordHeartbeat(t *testing.T) {
e := newTestEngine(ms) e := newTestEngine(ms)
for i := 0; i < 10; i++ { for i := 0; i < 10; i++ {
injectSite(e, models.Site{ injectSite(e, models.Site{
ID: i + 1, Type: "push", Token: fmt.Sprintf("tok-%d", i+1), Status: "UP", SiteConfig: models.SiteConfig{ID: i + 1, Type: "push", Token: fmt.Sprintf("tok-%d", i+1)},
SiteState: models.SiteState{Status: "UP"},
}) })
} }
@@ -1057,7 +1086,10 @@ func TestConcurrent_RecordHeartbeat(t *testing.T) {
func TestConcurrent_HandleStatusChangeAndGetState(t *testing.T) { func TestConcurrent_HandleStatusChangeAndGetState(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 100} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", MaxRetries: 100},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, site) injectSite(e, site)
var wg sync.WaitGroup var wg sync.WaitGroup
@@ -1110,7 +1142,10 @@ func TestConcurrent_RecordCheckAndGetHistory(t *testing.T) {
func TestHandleStatusChange_PauseDuringCheckSurvives(t *testing.T) { func TestHandleStatusChange_PauseDuringCheckSurvives(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 0} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", MaxRetries: 0},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, site) injectSite(e, site)
// `site` is the stale snapshot the check ran against (Paused=false). // `site` is the stale snapshot the check ran against (Paused=false).
@@ -1134,11 +1169,14 @@ func TestHandleStatusChange_PauseDuringCheckSurvives(t *testing.T) {
func TestHandleStatusChange_ConfigEditDuringCheckSurvives(t *testing.T) { func TestHandleStatusChange_ConfigEditDuringCheckSurvives(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", URL: "http://old.com", Type: "http", Status: "UP", MaxRetries: 0, Interval: 30} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", URL: "http://old.com", Type: "http", MaxRetries: 0, Interval: 30},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, site) injectSite(e, site)
// Config changes mid-check. // Config changes mid-check.
e.UpdateSiteConfig(models.Site{ID: 1, Name: "test", URL: "http://new.com", Type: "http", Interval: 60}) e.UpdateSiteConfig(models.SiteConfig{ID: 1, Name: "test", URL: "http://new.com", Type: "http", Interval: 60})
// Stale check (ran against http://old.com) folds its result in. // Stale check (ran against http://old.com) folds its result in.
e.handleStatusChange(site, "UP", 200, 5*time.Millisecond, "") e.handleStatusChange(site, "UP", 200, 5*time.Millisecond, "")
@@ -1160,7 +1198,10 @@ func TestHandleStatusChange_HeartbeatNotOverwrittenByStaleDown(t *testing.T) {
e := newTestEngine(ms) e := newTestEngine(ms)
// Snapshot the engine would have taken before evaluating staleness: // Snapshot the engine would have taken before evaluating staleness:
// LastCheck is old, so checkPush decided "DOWN". // LastCheck is old, so checkPush decided "DOWN".
snap := models.Site{ID: 1, Name: "push", Type: "push", Token: "tok", Status: "UP", Interval: 10, LastCheck: time.Now().Add(-120 * time.Second)} snap := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "push", Type: "push", Token: "tok", Interval: 10},
SiteState: models.SiteState{Status: "UP", LastCheck: time.Now().Add(-120 * time.Second)},
}
injectSite(e, snap) injectSite(e, snap)
// A heartbeat lands first, advancing LastCheck and confirming UP. // A heartbeat lands first, advancing LastCheck and confirming UP.
@@ -1181,7 +1222,10 @@ func TestHandleStatusChange_HeartbeatNotOverwrittenByStaleDown(t *testing.T) {
func TestHandleStatusChange_RemovedSiteDropped(t *testing.T) { func TestHandleStatusChange_RemovedSiteDropped(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 0} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", MaxRetries: 0},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, site) injectSite(e, site)
e.RemoveSite(1) e.RemoveSite(1)
@@ -1244,9 +1288,18 @@ func TestEngineStop_Idempotent(t *testing.T) {
func TestCheckGroup_AllPausedNoAutoFreeze(t *testing.T) { func TestCheckGroup_AllPausedNoAutoFreeze(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
group := models.Site{ID: 1, Name: "group", Type: "group", Status: "UP"} group := models.Site{
child1 := models.Site{ID: 2, Name: "child1", Type: "http", ParentID: 1, Status: "UP", Paused: true} SiteConfig: models.SiteConfig{ID: 1, Name: "group", Type: "group"},
child2 := models.Site{ID: 3, Name: "child2", Type: "http", ParentID: 1, Status: "UP", Paused: true} SiteState: models.SiteState{Status: "UP"},
}
child1 := models.Site{
SiteConfig: models.SiteConfig{ID: 2, Name: "child1", Type: "http", ParentID: 1, Paused: true},
SiteState: models.SiteState{Status: "UP"},
}
child2 := models.Site{
SiteConfig: models.SiteConfig{ID: 3, Name: "child2", Type: "http", ParentID: 1, Paused: true},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, group) injectSite(e, group)
injectSite(e, child1) injectSite(e, child1)
injectSite(e, child2) injectSite(e, child2)
@@ -1263,7 +1316,10 @@ func TestCheckGroup_AllPausedNoAutoFreeze(t *testing.T) {
func TestHandleStatusChange_PendingRetriesBeforeDown(t *testing.T) { func TestHandleStatusChange_PendingRetriesBeforeDown(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "new-monitor", Status: "PENDING", MaxRetries: 2} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "new-monitor", MaxRetries: 2},
SiteState: models.SiteState{Status: "PENDING"},
}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "DOWN", 0, 0, "timeout") e.handleStatusChange(site, "DOWN", 0, 0, "timeout")
@@ -1292,7 +1348,10 @@ func TestHandleStatusChange_PendingRetriesBeforeDown(t *testing.T) {
func TestHandleStatusChange_LateRetriesBeforeDown(t *testing.T) { func TestHandleStatusChange_LateRetriesBeforeDown(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "push-mon", Status: "LATE", MaxRetries: 1} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "push-mon", MaxRetries: 1},
SiteState: models.SiteState{Status: "LATE"},
}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "DOWN", 0, 0, "missed heartbeat") e.handleStatusChange(site, "DOWN", 0, 0, "missed heartbeat")
@@ -1312,7 +1371,10 @@ func TestHandleStatusChange_LateRetriesBeforeDown(t *testing.T) {
func TestIngestProbeResult_ExpiresStaleProbes(t *testing.T) { func TestIngestProbeResult_ExpiresStaleProbes(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Type: "http", Status: "UP", Interval: 30} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", Type: "http", Interval: 30},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, site) injectSite(e, site)
e.probeResultsMu.Lock() e.probeResultsMu.Lock()
@@ -1344,7 +1406,10 @@ func TestIngestProbeResult_ExpiresStaleProbes(t *testing.T) {
func TestRemoveSite_CleansProbeResults(t *testing.T) { func TestRemoveSite_CleansProbeResults(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Type: "http", Status: "UP"} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", Type: "http"},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, site) injectSite(e, site)
e.probeResultsMu.Lock() e.probeResultsMu.Lock()
@@ -1367,8 +1432,14 @@ func TestIsInMaintenance_UsesCache(t *testing.T) {
ms := newMockStore() ms := newMockStore()
ms.maintenance[10] = true // direct maintenance on group ms.maintenance[10] = true // direct maintenance on group
e := newTestEngine(ms) e := newTestEngine(ms)
group := models.Site{ID: 10, Name: "group", Type: "group", Status: "UP"} group := models.Site{
child := models.Site{ID: 20, Name: "child", Type: "http", ParentID: 10, Status: "UP"} SiteConfig: models.SiteConfig{ID: 10, Name: "group", Type: "group"},
SiteState: models.SiteState{Status: "UP"},
}
child := models.Site{
SiteConfig: models.SiteConfig{ID: 20, Name: "child", Type: "http", ParentID: 10},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, group) injectSite(e, group)
injectSite(e, child) injectSite(e, child)
e.refreshMaintenanceCache(context.Background()) e.refreshMaintenanceCache(context.Background())
@@ -1389,7 +1460,10 @@ func TestIsInMaintenance_GlobalMaintenance(t *testing.T) {
ms := newMockStore() ms := newMockStore()
ms.maintenance[0] = true ms.maintenance[0] = true
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ID: 1, Name: "test", Type: "http", Status: "UP"} site := models.Site{
SiteConfig: models.SiteConfig{ID: 1, Name: "test", Type: "http"},
SiteState: models.SiteState{Status: "UP"},
}
injectSite(e, site) injectSite(e, site)
e.refreshMaintenanceCache(context.Background()) e.refreshMaintenanceCache(context.Background())
+5
@@ -11,9 +11,11 @@ var privateRanges []*net.IPNet
func init() { func init() {
cidrs := []string{ cidrs := []string{
"0.0.0.0/8",
"127.0.0.0/8", "127.0.0.0/8",
"::1/128", "::1/128",
"10.0.0.0/8", "10.0.0.0/8",
"100.64.0.0/10",
"172.16.0.0/12", "172.16.0.0/12",
"192.168.0.0/16", "192.168.0.0/16",
"169.254.0.0/16", "169.254.0.0/16",
@@ -27,6 +29,9 @@ func init() {
} }
func isPrivateIP(ip net.IP) bool { func isPrivateIP(ip net.IP) bool {
if ip.IsUnspecified() || ip.IsMulticast() || ip.IsLoopback() {
return true
}
for _, network := range privateRanges { for _, network := range privateRanges {
if network.Contains(ip) { if network.Contains(ip) {
return true return true
+11 -18
@@ -16,14 +16,14 @@ type SLAReport struct {
MTBF time.Duration MTBF time.Duration
} }
func ComputeSLA(changes []models.StateChange, currentStatus string, window time.Duration) SLAReport { func ComputeSLA(changes []models.StateChange, currentStatus models.Status, window time.Duration) SLAReport {
now := time.Now() now := time.Now()
windowStart := now.Add(-window) windowStart := now.Add(-window)
report := SLAReport{Window: window} report := SLAReport{Window: window}
if len(changes) == 0 { if len(changes) == 0 {
if isDown(currentStatus) { if models.Status(currentStatus).IsBroken() {
report.UptimePct = 0 report.UptimePct = 0
report.Downtime = window report.Downtime = window
} else { } else {
@@ -40,7 +40,7 @@ func ComputeSLA(changes []models.StateChange, currentStatus string, window time.
} }
// Determine status at window start: last transition before or at windowStart. // Determine status at window start: last transition before or at windowStart.
statusAtStart := "UP" statusAtStart := string(models.StatusUp)
for i := len(sorted) - 1; i >= 0; i-- { for i := len(sorted) - 1; i >= 0; i-- {
if !sorted[i].ChangedAt.After(windowStart) { if !sorted[i].ChangedAt.After(windowStart) {
statusAtStart = sorted[i].ToStatus statusAtStart = sorted[i].ToStatus
@@ -51,7 +51,7 @@ func ComputeSLA(changes []models.StateChange, currentStatus string, window time.
var upTime, downTime time.Duration var upTime, downTime time.Duration
var outages []time.Duration var outages []time.Duration
cursor := windowStart cursor := windowStart
wasDown := isDown(statusAtStart) wasDown := models.Status(statusAtStart).IsBroken()
if wasDown { if wasDown {
report.OutageCount = 1 report.OutageCount = 1
@@ -77,7 +77,7 @@ func ComputeSLA(changes []models.StateChange, currentStatus string, window time.
upTime += seg upTime += seg
} }
newDown := isDown(sc.ToStatus) newDown := models.Status(sc.ToStatus).IsBroken()
if !wasDown && newDown { if !wasDown && newDown {
report.OutageCount++ report.OutageCount++
outageStart = sc.ChangedAt outageStart = sc.ChangedAt
@@ -127,18 +127,15 @@ func ComputeSLA(changes []models.StateChange, currentStatus string, window time.
return report return report
} }
func ComputeDailyBreakdown(changes []models.StateChange, currentStatus string, days int, now time.Time) []DayReport { func ComputeDailyBreakdown(changes []models.StateChange, currentStatus models.Status, days int, now time.Time) []DayReport {
reports := make([]DayReport, days) reports := make([]DayReport, days)
for i := 0; i < days; i++ { for i := 0; i < days; i++ {
dayEnd := time.Date(now.Year(), now.Month(), now.Day(), 0, 0, 0, 0, now.Location()).Add(-time.Duration(i) * 24 * time.Hour) dayStart := time.Date(now.Year(), now.Month(), now.Day()-i, 0, 0, 0, 0, now.Location())
dayEnd := time.Date(now.Year(), now.Month(), now.Day()-i+1, 0, 0, 0, 0, now.Location())
if i == 0 { if i == 0 {
dayEnd = now dayEnd = now
} }
dayStart := time.Date(now.Year(), now.Month(), now.Day(), 0, 0, 0, 0, now.Location()).Add(-time.Duration(i) * 24 * time.Hour)
if i > 0 {
dayEnd = dayStart.Add(24 * time.Hour)
}
windowChanges := filterChangesForWindow(changes, dayStart, dayEnd) windowChanges := filterChangesForWindow(changes, dayStart, dayEnd)
@@ -159,10 +156,6 @@ type DayReport struct {
UptimePct float64 UptimePct float64
} }
func isDown(status string) bool {
return status == "DOWN" || status == "SSL EXP"
}
func filterChangesForWindow(changes []models.StateChange, start, end time.Time) []models.StateChange { func filterChangesForWindow(changes []models.StateChange, start, end time.Time) []models.StateChange {
var filtered []models.StateChange var filtered []models.StateChange
for _, sc := range changes { for _, sc := range changes {
@@ -180,7 +173,7 @@ func inferStatusAt(changes []models.StateChange, at time.Time) string {
return sc.ToStatus return sc.ToStatus
} }
} }
return "UP" return string(models.StatusUp)
} }
func computeSLAForWindow(changes []models.StateChange, statusAtStart string, start, end time.Time) float64 { func computeSLAForWindow(changes []models.StateChange, statusAtStart string, start, end time.Time) float64 {
@@ -193,7 +186,7 @@ func computeSLAForWindow(changes []models.StateChange, statusAtStart string, sta
var upTime, downTime time.Duration var upTime, downTime time.Duration
cursor := start cursor := start
wasDown := isDown(statusAtStart) wasDown := models.Status(statusAtStart).IsBroken()
for _, sc := range sorted { for _, sc := range sorted {
if sc.ChangedAt.Before(start) || !sc.ChangedAt.Before(end) { if sc.ChangedAt.Before(start) || !sc.ChangedAt.Before(end) {
@@ -205,7 +198,7 @@ func computeSLAForWindow(changes []models.StateChange, statusAtStart string, sta
} else { } else {
upTime += seg upTime += seg
} }
wasDown = isDown(sc.ToStatus) wasDown = models.Status(sc.ToStatus).IsBroken()
cursor = sc.ChangedAt cursor = sc.ChangedAt
} }
+13 -13
@@ -137,24 +137,24 @@ func TestComputeDailyBreakdown(t *testing.T) {
} }
} }
func TestIsDown(t *testing.T) { func TestIsBroken(t *testing.T) {
if !isDown("DOWN") { if !models.StatusDown.IsBroken() {
t.Error("DOWN should be down") t.Error("DOWN should be broken")
} }
if !isDown("SSL EXP") { if !models.StatusSSLExp.IsBroken() {
t.Error("SSL EXP should be down") t.Error("SSL EXP should be broken")
} }
if isDown("UP") { if models.StatusUp.IsBroken() {
t.Error("UP should not be down") t.Error("UP should not be broken")
} }
if isDown("LATE") { if models.StatusLate.IsBroken() {
t.Error("LATE should not be down") t.Error("LATE should not be broken")
} }
if isDown("STALE") { if models.StatusStale.IsBroken() {
t.Error("STALE should not be down") t.Error("STALE should not be broken")
} }
if isDown("PENDING") { if models.StatusPending.IsBroken() {
t.Error("PENDING should not be down") t.Error("PENDING should not be broken")
} }
} }
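Review note: the `models` package itself is not part of this hunk, so the following is only a guess at the shape of the typed status that `TestIsBroken` exercises. The constant values are assumed from the string literals in the removed `isDown` tests. The real definitions may differ.

```go
package main

// Status is a typed monitor status (assumed shape of models.Status).
type Status string

// Values assumed from the literals used by the removed isDown tests.
const (
	StatusUp      Status = "UP"
	StatusDown    Status = "DOWN"
	StatusSSLExp  Status = "SSL EXP"
	StatusLate    Status = "LATE"
	StatusStale   Status = "STALE"
	StatusPending Status = "PENDING"
)

// IsBroken reports whether the status counts as downtime for SLA purposes,
// matching the removed helper: only DOWN and SSL EXP qualify.
func (s Status) IsBroken() bool {
	return s == StatusDown || s == StatusSSLExp
}
```

Since `StateChange.ToStatus` still appears to be a plain `string` in this diff, call sites convert with `models.Status(sc.ToStatus)` before calling the method.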
+19 -7
@@ -25,6 +25,7 @@ type RateLimiter struct {
rate float64 rate float64
burst float64 burst float64
trusted []*net.IPNet trusted []*net.IPNet
stop chan struct{}
} }
func NewRateLimiter(requestsPerMinute int, trusted []*net.IPNet) *RateLimiter { func NewRateLimiter(requestsPerMinute int, trusted []*net.IPNet) *RateLimiter {
@@ -33,11 +34,16 @@ func NewRateLimiter(requestsPerMinute int, trusted []*net.IPNet) *RateLimiter {
rate: float64(requestsPerMinute) / 60.0, rate: float64(requestsPerMinute) / 60.0,
burst: float64(requestsPerMinute), burst: float64(requestsPerMinute),
trusted: trusted, trusted: trusted,
stop: make(chan struct{}),
} }
go rl.cleanup() go rl.cleanup()
return rl return rl
} }
func (rl *RateLimiter) Stop() {
close(rl.stop)
}
func (rl *RateLimiter) Allow(ip string) bool { func (rl *RateLimiter) Allow(ip string) bool {
rl.mu.Lock() rl.mu.Lock()
defer rl.mu.Unlock() defer rl.mu.Unlock()
@@ -84,16 +90,22 @@ func (rl *RateLimiter) evictOldest() {
} }
func (rl *RateLimiter) cleanup() { func (rl *RateLimiter) cleanup() {
ticker := time.NewTicker(5 * time.Minute)
defer ticker.Stop()
for { for {
time.Sleep(5 * time.Minute) select {
rl.mu.Lock() case <-ticker.C:
cutoff := time.Now().Add(-10 * time.Minute) rl.mu.Lock()
for ip, v := range rl.visitors { cutoff := time.Now().Add(-10 * time.Minute)
if v.lastSeen.Before(cutoff) { for ip, v := range rl.visitors {
delete(rl.visitors, ip) if v.lastSeen.Before(cutoff) {
delete(rl.visitors, ip)
}
} }
rl.mu.Unlock()
case <-rl.stop:
return
} }
rl.mu.Unlock()
} }
} }
+463 -423
@@ -5,7 +5,7 @@ import (
"encoding/json" "encoding/json"
"fmt" "fmt"
"html/template" "html/template"
"log" "log/slog"
"net" "net"
"net/http" "net/http"
"sort" "sort"
@@ -21,6 +21,395 @@ import (
const maxRequestBody = 1 << 20 const maxRequestBody = 1 << 20
type ServerConfig struct {
Port int
EnableStatus bool
Title string
ClusterKey string
TLSCert string
TLSKey string
ClusterMode string
MetricsPublic bool
CORSOrigin string
TrustedProxies []*net.IPNet
QuietHTTPLog bool
}
type Server struct {
cfg ServerConfig
store store.Store
eng *monitor.Engine
pushRL *RateLimiter
probeRL *RateLimiter
backupRL *RateLimiter
statusRL *RateLimiter
}
func NewServer(cfg ServerConfig, s store.Store, eng *monitor.Engine) *Server {
return &Server{
cfg: cfg,
store: s,
eng: eng,
pushRL: NewRateLimiter(60, cfg.TrustedProxies),
probeRL: NewRateLimiter(30, cfg.TrustedProxies),
backupRL: NewRateLimiter(10, cfg.TrustedProxies),
statusRL: NewRateLimiter(120, cfg.TrustedProxies),
}
}
func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
srv := NewServer(cfg, s, eng)
return srv.Start()
}
func (s *Server) Start() *http.Server {
if s.cfg.ClusterKey == "" {
slog.Warn("no UPTOP_CLUSTER_SECRET set, cluster API endpoints will reject all requests")
}
if s.cfg.ClusterMode != "" && s.cfg.ClusterMode != "leader" && s.cfg.TLSCert == "" {
slog.Warn("cluster mode active without TLS, secrets transmitted in cleartext")
}
handler := s.routes()
addr := fmt.Sprintf(":%d", s.cfg.Port)
httpSrv := &http.Server{
Addr: addr,
Handler: handler,
ReadHeaderTimeout: 10 * time.Second,
ReadTimeout: 30 * time.Second,
WriteTimeout: 60 * time.Second,
IdleTimeout: 120 * time.Second,
}
go func() {
if s.cfg.TLSCert != "" && s.cfg.TLSKey != "" {
slog.Info("HTTPS server listening", "addr", addr)
if err := httpSrv.ListenAndServeTLS(s.cfg.TLSCert, s.cfg.TLSKey); err != nil && err != http.ErrServerClosed {
slog.Error("HTTPS server failed", "err", err)
}
} else {
slog.Info("HTTP server listening", "addr", addr)
if err := httpSrv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
slog.Error("HTTP server failed", "err", err)
}
}
}()
return httpSrv
}
func (s *Server) routes() http.Handler {
mux := http.NewServeMux()
mux.HandleFunc("/api/push", RateLimit(s.pushRL, s.handlePush))
mux.HandleFunc("/api/health", s.handleHealth)
mux.HandleFunc("/api/backup/export", RateLimit(s.backupRL, s.handleExport))
mux.HandleFunc("/api/backup/import", RateLimit(s.backupRL, s.handleImport))
mux.HandleFunc("/api/import/kuma", RateLimit(s.backupRL, s.handleKumaImport))
mux.HandleFunc("/api/probe/register", RateLimit(s.probeRL, s.handleProbeRegister))
mux.HandleFunc("/api/probe/assignments", RateLimit(s.probeRL, s.handleProbeAssignments))
mux.HandleFunc("/api/probe/results", RateLimit(s.probeRL, s.handleProbeResults))
mux.HandleFunc("/metrics", s.handleMetrics)
if s.cfg.EnableStatus {
mux.HandleFunc("/status", RateLimit(s.statusRL, s.handleStatus))
mux.HandleFunc("/status/json", RateLimit(s.statusRL, s.handleStatusJSON))
}
handler := securityHeadersMiddleware(mux)
if !s.cfg.QuietHTTPLog {
handler = loggingMiddleware(s.cfg.TrustedProxies, handler)
}
if s.cfg.TLSCert != "" {
handler = hstsMiddleware(handler)
}
return handler
}
func (s *Server) requireAuth(r *http.Request) bool {
return s.cfg.ClusterKey != "" && checkSecret(r.Header.Get("X-Uptop-Secret"), s.cfg.ClusterKey)
}
func (s *Server) handlePush(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet && r.Method != http.MethodPost {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
token := extractBearerToken(r)
if token == "" {
if qt := r.URL.Query().Get("token"); qt != "" {
token = qt
slog.Warn("push token in query string is deprecated, use Authorization: Bearer header")
}
}
if token == "" {
http.Error(w, "Missing token", http.StatusBadRequest)
return
}
if s.eng.RecordHeartbeat(token) {
w.WriteHeader(http.StatusOK)
_, _ = w.Write([]byte("OK"))
} else {
http.Error(w, "Invalid Token", http.StatusNotFound)
}
}
func (s *Server) handleHealth(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
if s.cfg.ClusterKey != "" && !checkSecret(r.Header.Get("X-Uptop-Secret"), s.cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
w.WriteHeader(http.StatusOK)
_, _ = w.Write([]byte("OK"))
}
func (s *Server) handleExport(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
if !s.requireAuth(r) {
http.Error(w, "Unauthorized: UPTOP_CLUSTER_SECRET required", http.StatusUnauthorized)
return
}
data, err := s.store.ExportData(r.Context())
if err != nil {
slog.Error("export failed", "err", err)
http.Error(w, "Export failed", http.StatusInternalServerError)
return
}
if r.URL.Query().Get("redact_secrets") != "false" {
for i := range data.Alerts {
data.Alerts[i].Settings = models.RedactAlertSettings(data.Alerts[i].Type, data.Alerts[i].Settings)
}
}
_ = json.NewEncoder(w).Encode(data) //nolint:errcheck
}
func (s *Server) handleImport(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" {
http.Error(w, "POST required", http.StatusMethodNotAllowed)
return
}
if !s.requireAuth(r) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var data models.Backup
if err := json.NewDecoder(r.Body).Decode(&data); err != nil {
http.Error(w, "Invalid JSON", http.StatusBadRequest)
return
}
// API import never modifies users — cluster-secret holder shouldn't be
// able to replace admin accounts. CLI restore still does full import.
data.Users = nil
if err := s.store.ImportData(r.Context(), data); err != nil {
slog.Error("import failed", "err", err)
http.Error(w, "Import failed", http.StatusInternalServerError)
return
}
_, _ = w.Write([]byte("Import Successful (users excluded — manage via CLI or UPTOP_KEYS)"))
}
func (s *Server) handleKumaImport(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" {
http.Error(w, "POST required", http.StatusMethodNotAllowed)
return
}
if !s.requireAuth(r) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var kb importer.KumaBackup
if err := json.NewDecoder(r.Body).Decode(&kb); err != nil {
slog.Error("invalid Kuma JSON", "err", err)
http.Error(w, "Invalid Kuma JSON", http.StatusBadRequest)
return
}
backup := importer.ConvertKuma(&kb)
if err := s.store.ImportData(r.Context(), backup); err != nil {
slog.Error("Kuma import failed", "err", err)
http.Error(w, "Import failed", http.StatusInternalServerError)
return
}
fmt.Fprintf(w, "Imported %d monitors, %d alerts from Kuma v%s", len(backup.Sites), len(backup.Alerts), kb.Version)
}
func (s *Server) handleProbeRegister(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" {
http.Error(w, "POST required", http.StatusMethodNotAllowed)
return
}
if !s.requireAuth(r) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var req struct {
ID string `json:"id"`
Name string `json:"name"`
Region string `json:"region"`
Version string `json:"version"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
http.Error(w, "Invalid JSON", http.StatusBadRequest)
return
}
if req.ID == "" {
http.Error(w, "id is required", http.StatusBadRequest)
return
}
if err := s.store.RegisterNode(r.Context(), models.ProbeNode{
ID: req.ID, Name: req.Name, Region: req.Region, Version: req.Version,
}); err != nil {
slog.Error("probe registration failed", "err", err)
http.Error(w, "Registration failed", http.StatusInternalServerError)
return
}
_ = json.NewEncoder(w).Encode(map[string]bool{"ok": true}) //nolint:errcheck
}
func (s *Server) handleProbeAssignments(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
if !s.requireAuth(r) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
nodeID := r.URL.Query().Get("node_id")
var nodeRegion string
if nodeID != "" {
if node, err := s.store.GetNode(r.Context(), nodeID); err == nil {
nodeRegion = node.Region
}
}
sites := s.eng.GetAllSites()
var assigned []models.Site
for _, site := range sites {
if site.Paused || site.Type == "push" || site.Type == "group" {
continue
}
if site.Regions != "" && nodeRegion != "" {
matched := false
for _, reg := range strings.Split(site.Regions, ",") {
if strings.TrimSpace(reg) == nodeRegion {
matched = true
break
}
}
if !matched {
continue
}
}
assigned = append(assigned, site)
}
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(map[string][]models.Site{"sites": assigned}) //nolint:errcheck
}
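The region filter in `handleProbeAssignments` can be read as one predicate. The sketch below pulls it into a hypothetical helper, `regionMatches`, which is not part of the diff. It shows the rule the loop implements: a site with no regions, or a node with no known region, is never filtered out.

```go
package main

import (
	"fmt"
	"strings"
)

// regionMatches is a hypothetical helper that restates the inline loop:
// filtering applies only when both the site and the node declare a region.
func regionMatches(siteRegions, nodeRegion string) bool {
	if siteRegions == "" || nodeRegion == "" {
		return true
	}
	for _, reg := range strings.Split(siteRegions, ",") {
		if strings.TrimSpace(reg) == nodeRegion {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(regionMatches("us-east, eu-west", "eu-west"))
	fmt.Println(regionMatches("us-east", "eu-west"))
	fmt.Println(regionMatches("", "eu-west"))
	fmt.Println(regionMatches("us-east", ""))
}
```

One consequence worth keeping in mind: a probe that registers without a region, or whose `node_id` is unknown, receives every unpaused non-push, non-group site.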
func (s *Server) handleProbeResults(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" {
http.Error(w, "POST required", http.StatusMethodNotAllowed)
return
}
if !s.requireAuth(r) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var req struct {
NodeID string `json:"node_id"`
Results []struct {
SiteID int `json:"site_id"`
LatencyNs int64 `json:"latency_ns"`
IsUp bool `json:"is_up"`
ErrorReason string `json:"error_reason"`
} `json:"results"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
http.Error(w, "Invalid JSON", http.StatusBadRequest)
return
}
if req.NodeID == "" {
http.Error(w, "node_id is required", http.StatusBadRequest)
return
}
for _, result := range req.Results {
s.eng.EnqueueProbeCheck(result.SiteID, req.NodeID, result.LatencyNs, result.IsUp)
s.eng.IngestProbeResult(req.NodeID, result.SiteID, result.LatencyNs, result.IsUp, result.ErrorReason)
}
if err := s.store.UpdateNodeLastSeen(r.Context(), req.NodeID); err != nil {
slog.Error("node last-seen update failed", "err", err)
}
_ = json.NewEncoder(w).Encode(map[string]bool{"ok": true}) //nolint:errcheck
}
func (s *Server) handleMetrics(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
if !s.cfg.MetricsPublic {
if !s.requireAuth(r) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
}
metrics.Handler(s.eng)(w, r)
}
func (s *Server) handleStatus(w http.ResponseWriter, _ *http.Request) {
renderStatusPage(w, s.cfg.Title, s.eng)
}
func (s *Server) handleStatusJSON(w http.ResponseWriter, r *http.Request) {
state := s.eng.GetLiveState()
activeWindows, _ := s.store.GetActiveMaintenanceWindows(r.Context())
maintSet := make(map[int]bool)
allInMaint := false
for _, mw := range activeWindows {
if mw.Type != "maintenance" {
continue
}
if mw.MonitorID == 0 {
allInMaint = true
} else {
maintSet[mw.MonitorID] = true
}
}
public := make(map[int]statusSite, len(state))
for id, site := range state {
displayStatus := string(site.Status)
if allInMaint || maintSet[site.ID] || (site.ParentID > 0 && maintSet[site.ParentID]) {
displayStatus = "MAINT"
}
public[id] = statusSite{
Name: site.Name,
Type: site.Type,
URL: site.URL,
Status: displayStatus,
Paused: site.Paused,
LastCheck: site.LastCheck,
Latency: site.Latency,
}
}
if s.cfg.CORSOrigin != "" {
w.Header().Set("Access-Control-Allow-Origin", s.cfg.CORSOrigin)
}
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(public) //nolint:errcheck
}
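The maintenance override in `handleStatusJSON` reduces to a small decision rule. A sketch under the assumption that statuses are plain strings (the diff converts `site.Status` with `string(...)`); `displayStatus` is a hypothetical function name used only here.

```go
package main

import "fmt"

// displayStatus restates the override: a global window (MonitorID == 0),
// a window on the site itself, or a window on its parent group all
// replace the live status with "MAINT".
func displayStatus(status string, id, parentID int, allInMaint bool, maintSet map[int]bool) string {
	if allInMaint || maintSet[id] || (parentID > 0 && maintSet[parentID]) {
		return "MAINT"
	}
	return status
}

func main() {
	maint := map[int]bool{7: true} // monitor 7 has an active window

	fmt.Println(displayStatus("DOWN", 7, 0, false, maint)) // own window
	fmt.Println(displayStatus("UP", 8, 7, false, maint))   // parent's window
	fmt.Println(displayStatus("UP", 9, 0, false, maint))   // unaffected
	fmt.Println(displayStatus("UP", 9, 0, true, nil))      // global window
}
```

Reading a missing key from a nil map is safe in Go, so the global-window case works without allocating `maintSet`.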
// --- Helpers ---
func checkSecret(got, want string) bool {
return subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1
}
@@ -33,8 +422,79 @@ func extractBearerToken(r *http.Request) string {
return ""
}
-// Alert-settings redaction policy lives in models.RedactAlertSettings so the
-// TUI detail panel and this export path share one allowlist.
+// statusSite is the public DTO for /status/json.
+type statusSite struct {
Name string
Type string
URL string
Status string
Paused bool
LastCheck time.Time
Latency time.Duration
}
// --- Middleware ---
type statusWriter struct {
http.ResponseWriter
code int
}
func (w *statusWriter) WriteHeader(code int) {
w.code = code
w.ResponseWriter.WriteHeader(code)
}
func loggingMiddleware(trusted []*net.IPNet, next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
start := time.Now()
sw := &statusWriter{ResponseWriter: w, code: 200}
next.ServeHTTP(sw, r)
path := strings.ReplaceAll(strings.ReplaceAll(r.URL.Path, "\n", ""), "\r", "")
slog.Info("http request", "method", r.Method, "path", path, "status", sw.code, "duration", time.Since(start).Round(time.Millisecond), "ip", clientIP(r, trusted)) //nolint:gosec // structured slog, not format string
})
}
func securityHeadersMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("X-Content-Type-Options", "nosniff")
w.Header().Set("X-Frame-Options", "DENY")
w.Header().Set("Referrer-Policy", "no-referrer")
w.Header().Set("Content-Security-Policy", "default-src 'self'; script-src 'unsafe-inline'; style-src 'unsafe-inline'")
next.ServeHTTP(w, r)
})
}
func hstsMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Strict-Transport-Security", "max-age=63072000; includeSubDomains")
next.ServeHTTP(w, r)
})
}
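`routes()` wraps the mux in three steps, and the last wrapper applied is the first to run. The sketch below uses a hypothetical `tag` middleware in place of the real ones to record the call order produced by the same wrapping sequence.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// tag is a hypothetical middleware that records its name when invoked.
func tag(name string, order *[]string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		*order = append(*order, name)
		next.ServeHTTP(w, r)
	})
}

// callOrder wraps a stub handler in the same sequence as routes():
// security headers first, then logging, then HSTS.
func callOrder() []string {
	var order []string
	var handler http.Handler = http.HandlerFunc(func(http.ResponseWriter, *http.Request) {
		order = append(order, "mux")
	})
	handler = tag("security", &order, handler)
	handler = tag("logging", &order, handler)
	handler = tag("hsts", &order, handler)

	req := httptest.NewRequest(http.MethodGet, "/status", nil)
	handler.ServeHTTP(httptest.NewRecorder(), req)
	return order
}

func main() {
	fmt.Println(callOrder()) // [hsts logging security mux]
}
```

So when TLS and request logging are both enabled, the HSTS header is set before the logging wrapper starts its timer, and the security headers are set inside the logged span.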
func renderStatusPage(w http.ResponseWriter, title string, eng *monitor.Engine) {
sites := eng.GetAllSites()
sort.Slice(sites, func(i, j int) bool {
if sites[i].Status != sites[j].Status {
if sites[i].Status == models.StatusDown {
return true
}
if sites[j].Status == models.StatusDown {
return false
}
}
return sites[i].Name < sites[j].Name
})
data := struct {
Title string
Sites []models.Site
}{Title: title, Sites: sites}
if err := statusTpl.Execute(w, data); err != nil {
slog.Error("status page render failed", "err", err)
}
}
var statusTpl = template.Must(template.New("status").Parse(`
<!DOCTYPE html>
@@ -167,423 +627,3 @@ var statusTpl = template.Must(template.New("status").Parse(`
</script>
</body>
</html>`))
type ServerConfig struct {
Port int
EnableStatus bool
Title string
ClusterKey string
TLSCert string
TLSKey string
ClusterMode string
MetricsPublic bool
CORSOrigin string
TrustedProxies []*net.IPNet
// QuietHTTPLog disables per-request stderr logging. Set when the local
// TUI owns the terminal — request logs would scribble over the alt screen.
QuietHTTPLog bool
}
// statusSite is the public DTO for /status/json. models.Site must never be
// serialized raw here: it carries internal fields (LastError, Hostname, Port,
// DNSServer, AlertID, Token, ...) and every field added to it would become
// public by default. Field names match what the status page JS reads.
type statusSite struct {
Name string
Type string
URL string
Status string
Paused bool
LastCheck time.Time
Latency time.Duration
}
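The comment on `statusSite` describes an allowlist projection: copy only the public fields into a separate type and serialize that. A small sketch with hypothetical types, `internalSite` and `publicSite`, which are stand-ins and not the project's `models.Site`:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// internalSite is a hypothetical stand-in for a model that carries
// fields that must never reach a public endpoint.
type internalSite struct {
	Name      string
	URL       string
	Status    string
	Token     string
	LastError string
}

// publicSite is the allowlist: a field is public only if it is added here.
type publicSite struct {
	Name   string
	URL    string
	Status string
}

func toPublic(s internalSite) publicSite {
	return publicSite{Name: s.Name, URL: s.URL, Status: s.Status}
}

// encodePublic returns the JSON a public endpoint would emit for s.
func encodePublic(s internalSite) string {
	b, err := json.Marshal(toPublic(s))
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	s := internalSite{
		Name: "api", URL: "https://example.com", Status: "UP",
		Token: "secret-token", LastError: "internal failure detail",
	}
	fmt.Println(encodePublic(s))
	// {"Name":"api","URL":"https://example.com","Status":"UP"}
}
```

The design choice is that new fields on the internal model stay private by default; exposing one requires an explicit change to the DTO and the mapping.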
func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
if cfg.ClusterKey == "" {
fmt.Println("WARNING: No UPTOP_CLUSTER_SECRET set. Cluster API endpoints are unauthenticated.")
}
pushRL := NewRateLimiter(60, cfg.TrustedProxies)
probeRL := NewRateLimiter(30, cfg.TrustedProxies)
backupRL := NewRateLimiter(10, cfg.TrustedProxies)
statusRL := NewRateLimiter(120, cfg.TrustedProxies)
mux := http.NewServeMux()
// 1. Push Heartbeat
mux.HandleFunc("/api/push", RateLimit(pushRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet && r.Method != http.MethodPost {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
token := extractBearerToken(r)
if token == "" {
if qt := r.URL.Query().Get("token"); qt != "" {
token = qt
log.Printf("DEPRECATED: push token in query string — use Authorization: Bearer header instead")
}
}
if token == "" {
http.Error(w, "Missing token", http.StatusBadRequest)
return
}
if eng.RecordHeartbeat(token) {
w.WriteHeader(http.StatusOK)
_, _ = w.Write([]byte("OK"))
} else {
http.Error(w, "Invalid Token", http.StatusNotFound)
}
}))
// 2. Health Check (For Cluster Follower)
mux.HandleFunc("/api/health", func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
if cfg.ClusterKey != "" && !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
w.WriteHeader(http.StatusOK)
_, _ = w.Write([]byte("OK"))
})
// 3. Config Export
mux.HandleFunc("/api/backup/export", RateLimit(backupRL, func(w http.ResponseWriter, r *http.Request) {
if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized: UPTOP_CLUSTER_SECRET required", http.StatusUnauthorized)
return
}
data, err := s.ExportData(r.Context())
if err != nil {
log.Printf("Export failed: %v", err)
http.Error(w, "Export failed", http.StatusInternalServerError)
return
}
if r.URL.Query().Get("redact_secrets") != "false" {
for i := range data.Alerts {
data.Alerts[i].Settings = models.RedactAlertSettings(data.Alerts[i].Type, data.Alerts[i].Settings)
}
}
_ = json.NewEncoder(w).Encode(data) //nolint:errcheck
}))
// 4. Config Import
mux.HandleFunc("/api/backup/import", RateLimit(backupRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" {
http.Error(w, "POST required", http.StatusMethodNotAllowed)
return
}
if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var data models.Backup
if err := json.NewDecoder(r.Body).Decode(&data); err != nil {
http.Error(w, "Invalid JSON", http.StatusBadRequest)
return
}
if err := s.ImportData(r.Context(), data); err != nil {
log.Printf("Import failed: %v", err)
http.Error(w, "Import failed", http.StatusInternalServerError)
return
}
_, _ = w.Write([]byte("Import Successful"))
}))
// 5. Kuma Import
mux.HandleFunc("/api/import/kuma", RateLimit(backupRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" {
http.Error(w, "POST required", http.StatusMethodNotAllowed)
return
}
if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var kb importer.KumaBackup
if err := json.NewDecoder(r.Body).Decode(&kb); err != nil {
log.Printf("Invalid Kuma JSON: %v", err)
http.Error(w, "Invalid Kuma JSON", http.StatusBadRequest)
return
}
backup := importer.ConvertKuma(&kb)
if err := s.ImportData(r.Context(), backup); err != nil {
log.Printf("Kuma import failed: %v", err)
http.Error(w, "Import failed", http.StatusInternalServerError)
return
}
fmt.Fprintf(w, "Imported %d monitors, %d alerts from Kuma v%s", len(backup.Sites), len(backup.Alerts), kb.Version)
}))
// 6. Probe Registration
mux.HandleFunc("/api/probe/register", RateLimit(probeRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" {
http.Error(w, "POST required", http.StatusMethodNotAllowed)
return
}
if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var req struct {
ID string `json:"id"`
Name string `json:"name"`
Region string `json:"region"`
Version string `json:"version"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
http.Error(w, "Invalid JSON", http.StatusBadRequest)
return
}
if req.ID == "" {
http.Error(w, "id is required", http.StatusBadRequest)
return
}
if err := s.RegisterNode(r.Context(), models.ProbeNode{
ID: req.ID, Name: req.Name, Region: req.Region, Version: req.Version,
}); err != nil {
log.Printf("Probe register failed: %v", err)
http.Error(w, "Registration failed", http.StatusInternalServerError)
return
}
_ = json.NewEncoder(w).Encode(map[string]bool{"ok": true}) //nolint:errcheck
}))
// 7. Probe Assignment Fetch
mux.HandleFunc("/api/probe/assignments", RateLimit(probeRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
nodeID := r.URL.Query().Get("node_id")
var nodeRegion string
if nodeID != "" {
if node, err := s.GetNode(r.Context(), nodeID); err == nil {
nodeRegion = node.Region
}
}
sites := eng.GetAllSites()
var assigned []models.Site
for _, site := range sites {
if site.Paused || site.Type == "push" || site.Type == "group" {
continue
}
if site.Regions != "" && nodeRegion != "" {
matched := false
for _, r := range strings.Split(site.Regions, ",") {
if strings.TrimSpace(r) == nodeRegion {
matched = true
break
}
}
if !matched {
continue
}
}
assigned = append(assigned, site)
}
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(map[string][]models.Site{"sites": assigned}) //nolint:errcheck
}))
// 8. Probe Result Submission
mux.HandleFunc("/api/probe/results", RateLimit(probeRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" {
http.Error(w, "POST required", http.StatusMethodNotAllowed)
return
}
if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var req struct {
NodeID string `json:"node_id"`
Results []struct {
SiteID int `json:"site_id"`
LatencyNs int64 `json:"latency_ns"`
IsUp bool `json:"is_up"`
ErrorReason string `json:"error_reason"`
} `json:"results"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
http.Error(w, "Invalid JSON", http.StatusBadRequest)
return
}
if req.NodeID == "" {
http.Error(w, "node_id is required", http.StatusBadRequest)
return
}
for _, result := range req.Results {
eng.EnqueueProbeCheck(result.SiteID, req.NodeID, result.LatencyNs, result.IsUp)
eng.IngestProbeResult(req.NodeID, result.SiteID, result.LatencyNs, result.IsUp, result.ErrorReason)
}
if err := s.UpdateNodeLastSeen(r.Context(), req.NodeID); err != nil {
log.Printf("Failed to update node last seen: %v", err)
}
_ = json.NewEncoder(w).Encode(map[string]bool{"ok": true}) //nolint:errcheck
}))
// 9. Prometheus Metrics
mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
if !cfg.MetricsPublic && cfg.ClusterKey != "" {
if !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
}
metrics.Handler(eng)(w, r)
})
// 10. Status Page
if cfg.EnableStatus {
mux.HandleFunc("/status", RateLimit(statusRL, func(w http.ResponseWriter, r *http.Request) { renderStatusPage(w, cfg.Title, eng) }))
mux.HandleFunc("/status/json", RateLimit(statusRL, func(w http.ResponseWriter, r *http.Request) {
state := eng.GetLiveState()
activeWindows, _ := s.GetActiveMaintenanceWindows(r.Context())
maintSet := make(map[int]bool)
allInMaint := false
for _, mw := range activeWindows {
if mw.Type != "maintenance" {
continue
}
if mw.MonitorID == 0 {
allInMaint = true
} else {
maintSet[mw.MonitorID] = true
}
}
public := make(map[int]statusSite, len(state))
for id, site := range state {
status := site.Status
if allInMaint || maintSet[site.ID] || (site.ParentID > 0 && maintSet[site.ParentID]) {
status = "MAINT"
}
public[id] = statusSite{
Name: site.Name,
Type: site.Type,
URL: site.URL,
Status: status,
Paused: site.Paused,
LastCheck: site.LastCheck,
Latency: site.Latency,
}
}
if cfg.CORSOrigin != "" {
w.Header().Set("Access-Control-Allow-Origin", cfg.CORSOrigin)
}
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(public) //nolint:errcheck
}))
}
if cfg.ClusterMode != "" && cfg.ClusterMode != "leader" && cfg.TLSCert == "" {
fmt.Println("WARNING: Cluster mode active without TLS. Secrets transmitted in cleartext.")
}
handler := securityHeadersMiddleware(mux)
if !cfg.QuietHTTPLog {
handler = loggingMiddleware(cfg.TrustedProxies, handler)
}
if cfg.TLSCert != "" {
handler = hstsMiddleware(handler)
}
addr := fmt.Sprintf(":%d", cfg.Port)
srv := &http.Server{
Addr: addr,
Handler: handler,
ReadHeaderTimeout: 10 * time.Second,
ReadTimeout: 30 * time.Second,
WriteTimeout: 60 * time.Second,
IdleTimeout: 120 * time.Second,
}
go func() {
if cfg.TLSCert != "" && cfg.TLSKey != "" {
fmt.Printf("HTTPS Server listening on %s\n", addr)
if err := srv.ListenAndServeTLS(cfg.TLSCert, cfg.TLSKey); err != nil && err != http.ErrServerClosed {
log.Printf("HTTPS server error: %v", err)
}
} else {
fmt.Printf("HTTP Server listening on %s\n", addr)
if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
log.Printf("HTTP server error: %v", err)
}
}
}()
return srv
}
type statusWriter struct {
http.ResponseWriter
code int
}
func (w *statusWriter) WriteHeader(code int) {
w.code = code
w.ResponseWriter.WriteHeader(code)
}
func loggingMiddleware(trusted []*net.IPNet, next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
start := time.Now()
sw := &statusWriter{ResponseWriter: w, code: 200}
next.ServeHTTP(sw, r)
path := strings.ReplaceAll(strings.ReplaceAll(r.URL.Path, "\n", ""), "\r", "")
log.Printf("%s %s %d %s %s", r.Method, path, sw.code, time.Since(start).Round(time.Millisecond), clientIP(r, trusted)) //nolint:gosec // path sanitized above
})
}
func securityHeadersMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("X-Content-Type-Options", "nosniff")
w.Header().Set("X-Frame-Options", "DENY")
w.Header().Set("Referrer-Policy", "no-referrer")
w.Header().Set("Content-Security-Policy", "default-src 'self'; script-src 'unsafe-inline'; style-src 'unsafe-inline'")
next.ServeHTTP(w, r)
})
}
func hstsMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Strict-Transport-Security", "max-age=63072000; includeSubDomains")
next.ServeHTTP(w, r)
})
}
func renderStatusPage(w http.ResponseWriter, title string, eng *monitor.Engine) {
sites := eng.GetAllSites()
sort.Slice(sites, func(i, j int) bool {
if sites[i].Status != sites[j].Status {
if sites[i].Status == "DOWN" {
return true
}
if sites[j].Status == "DOWN" {
return false
}
}
return sites[i].Name < sites[j].Name
})
data := struct {
Title string
Sites []models.Site
}{Title: title, Sites: sites}
if err := statusTpl.Execute(w, data); err != nil {
log.Printf("Failed to render status page: %v", err)
}
}
+9 -81
@@ -13,13 +13,15 @@ import (
"gitea.lerkolabs.com/lerkolabs/uptop/internal/models"
"gitea.lerkolabs.com/lerkolabs/uptop/internal/monitor"
+"gitea.lerkolabs.com/lerkolabs/uptop/internal/store/storetest"
)
// --- Mock Store ---
type mockStore struct {
+storetest.BaseMock
mu sync.Mutex
-sites []models.Site
+sites []models.SiteConfig
alerts []models.AlertConfig
nodes map[string]models.ProbeNode
importedData *models.Backup
@@ -33,84 +35,10 @@ func newMockStore() *mockStore {
}
}
-func (m *mockStore) Init(_ context.Context) error { return nil }
-func (m *mockStore) GetSites(_ context.Context) ([]models.Site, error) { return m.sites, nil }
-func (m *mockStore) AddSite(_ context.Context, _ models.Site) error { return nil }
-func (m *mockStore) UpdateSite(_ context.Context, _ models.Site) error { return nil }
-func (m *mockStore) UpdateSitePaused(_ context.Context, _ int, _ bool) error { return nil }
-func (m *mockStore) DeleteSite(_ context.Context, _ int) error { return nil }
+func (m *mockStore) GetSites(_ context.Context) ([]models.SiteConfig, error) { return m.sites, nil }
func (m *mockStore) GetAllAlerts(_ context.Context) ([]models.AlertConfig, error) {
return m.alerts, nil
}
-func (m *mockStore) GetAlert(_ context.Context, _ int) (models.AlertConfig, error) {
-return models.AlertConfig{}, nil
-}
-func (m *mockStore) AddAlert(_ context.Context, _ string, _ string, _ map[string]string) error {
-return nil
-}
-func (m *mockStore) UpdateAlert(_ context.Context, _ int, _ string, _ string, _ map[string]string) error {
-return nil
-}
-func (m *mockStore) DeleteAlert(_ context.Context, _ int) error { return nil }
-func (m *mockStore) GetAllUsers(_ context.Context) ([]models.User, error) { return nil, nil }
-func (m *mockStore) AddUser(_ context.Context, _ string, _ string, _ string) error { return nil }
-func (m *mockStore) UpdateUser(_ context.Context, _ int, _ string, _ string, _ string) error {
-return nil
-}
-func (m *mockStore) DeleteUser(_ context.Context, _ int) error { return nil }
-func (m *mockStore) SaveCheck(_ context.Context, _ int, _ int64, _ bool) error { return nil }
-func (m *mockStore) SaveCheckFromNode(_ context.Context, siteID int, nodeID string, latencyNs int64, isUp bool) error {
-return nil
-}
-func (m *mockStore) LoadAllHistory(_ context.Context, _ int) (map[int][]models.CheckRecord, error) {
-return nil, nil
-}
-func (m *mockStore) GetSiteByName(_ context.Context, _ string) (models.Site, error) {
-return models.Site{}, nil
-}
-func (m *mockStore) GetAlertByName(_ context.Context, _ string) (models.AlertConfig, error) {
-return models.AlertConfig{}, nil
-}
-func (m *mockStore) AddSiteReturningID(_ context.Context, _ models.Site) (int, error) { return 0, nil }
-func (m *mockStore) AddAlertReturningID(_ context.Context, _ string, _ string, _ map[string]string) (int, error) {
-return 0, nil
-}
-func (m *mockStore) GetAllNodes(_ context.Context) ([]models.ProbeNode, error) { return nil, nil }
-func (m *mockStore) UpdateNodeLastSeen(_ context.Context, _ string) error { return nil }
-func (m *mockStore) DeleteNode(_ context.Context, _ string) error { return nil }
-func (m *mockStore) LoadAlertHealth(_ context.Context) (map[int]models.AlertHealthRecord, error) {
-return nil, nil
-}
-func (m *mockStore) SaveAlertHealth(_ context.Context, _ models.AlertHealthRecord) error { return nil }
-func (m *mockStore) SaveLog(_ context.Context, _ string) error { return nil }
-func (m *mockStore) PruneLogs(_ context.Context) error { return nil }
-func (m *mockStore) PruneCheckHistory(_ context.Context) error { return nil }
-func (m *mockStore) PruneStateChanges(_ context.Context) error { return nil }
-func (m *mockStore) LoadLogs(_ context.Context, _ int) ([]string, error) { return nil, nil }
-func (m *mockStore) GetAllMaintenanceWindows(_ context.Context, _ int) ([]models.MaintenanceWindow, error) {
-return nil, nil
-}
-func (m *mockStore) AddMaintenanceWindow(_ context.Context, _ models.MaintenanceWindow) error {
-return nil
-}
-func (m *mockStore) EndMaintenanceWindow(_ context.Context, _ int) error { return nil }
-func (m *mockStore) DeleteMaintenanceWindow(_ context.Context, _ int) error { return nil }
-func (m *mockStore) PruneExpiredMaintenanceWindows(_ context.Context, _ time.Duration) (int64, error) {
-return 0, nil
-}
-func (m *mockStore) IsMonitorInMaintenance(_ context.Context, _ int) (bool, error) { return false, nil }
-func (m *mockStore) GetPreference(_ context.Context, _ string) (string, error) { return "", nil }
-func (m *mockStore) SetPreference(_ context.Context, _ string, _ string) error { return nil }
-func (m *mockStore) SaveStateChange(_ context.Context, _ int, _ string, _ string, _ string) error {
-return nil
-}
-func (m *mockStore) GetStateChanges(_ context.Context, _ int, _ int) ([]models.StateChange, error) {
-return nil, nil
-}
-func (m *mockStore) GetStateChangesSince(_ context.Context, _ int, _ time.Time) ([]models.StateChange, error) {
-return nil, nil
-}
-func (m *mockStore) Close() error { return nil }
func (m *mockStore) ExportData(_ context.Context) (models.Backup, error) {
return models.Backup{
@@ -213,7 +141,7 @@ func authReq(method, url, secret string, body []byte) (*http.Response, error) {
return nil, err
}
if secret != "" {
-req.Header.Set("X-Upkeep-Secret", secret)
+req.Header.Set("X-Uptop-Secret", secret)
}
return http.DefaultClient.Do(req)
}
@@ -324,7 +252,7 @@ func TestExport_Unauthorized_WrongKey(t *testing.T) {
func TestExport_Success(t *testing.T) {
ts := newTestServer(t, "secret", false)
-ts.store.sites = []models.Site{{ID: 1, Name: "example", URL: "http://example.com"}}
+ts.store.sites = []models.SiteConfig{{ID: 1, Name: "example", URL: "http://example.com"}}
resp, err := authReq("GET", ts.baseURL+"/api/backup/export", "secret", nil)
if err != nil {
@@ -371,7 +299,7 @@ func TestImport_Unauthorized(t *testing.T) {
func TestImport_Success(t *testing.T) {
ts := newTestServer(t, "secret", false)
backup := models.Backup{
-Sites: []models.Site{{Name: "imported", URL: "http://example.com"}},
+Sites: []models.SiteConfig{{Name: "imported", URL: "http://example.com"}},
}
body, _ := json.Marshal(backup)
resp, err := authReq("POST", ts.baseURL+"/api/backup/import", "secret", body)
@@ -509,9 +437,9 @@ func TestStatusJSON_PublicDTOOnly(t *testing.T) {
// take. The old version of this test injected via UpdateSiteConfig, which
// no-ops for unknown IDs, so it asserted over zero sites and passed
// against a server that leaked tokens.
-ts.store.sites = []models.Site{{
+ts.store.sites = []models.SiteConfig{{
ID: 1, Name: "test", Type: "push", Token: "secret-token",
-Hostname: "internal-host", LastError: "internal failure detail", AlertID: 3,
+Hostname: "internal-host", AlertID: 3,
}}
ctx, cancel := context.WithCancel(context.Background())
ts.engine.Start(ctx)
+8 -1
@@ -5,13 +5,20 @@ import (
"strconv"
)
+type Migration struct {
+Version int
+SQL string
+}
type Dialect interface {
DriverName() string
CreateTablesSQL() []string
-MigrationsSQL() []string
+Migrations() []Migration
+BaselineVersion() int
BoolFalse() string
ResetSequenceOnEmpty(db *sql.DB, table string)
ImportWipe(tx *sql.Tx)
+ImportWipeUsers(tx *sql.Tx)
ImportResetSequences(tx *sql.Tx)
UpsertNodeSQL() string
UpsertAlertHealthSQL() string
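The move from `MigrationsSQL() []string` to versioned `Migration` values suggests that the store can skip migrations it has already applied. How uptop actually consumes `Migrations()` and `BaselineVersion()` is not shown in this hunk, so the following is only a sketch of one common approach: `pendingMigrations` is a hypothetical helper, and the rule "apply everything with a version above the recorded one" is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// Migration mirrors the struct added in the diff.
type Migration struct {
	Version int
	SQL     string
}

// pendingMigrations returns the migrations newer than current, sorted by
// Version. The selection rule is an assumption made for illustration.
func pendingMigrations(all []Migration, current int) []Migration {
	var out []Migration
	for _, m := range all {
		if m.Version > current {
			out = append(out, m)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Version < out[j].Version })
	return out
}

func main() {
	all := []Migration{
		{1, "ALTER TABLE sites ADD COLUMN IF NOT EXISTS hostname TEXT DEFAULT ''"},
		{2, "-- hypothetical second migration"},
		{3, "-- hypothetical third migration"},
	}
	// A database recorded at version 1 would only need versions 2 and 3.
	for _, m := range pendingMigrations(all, 1) {
		fmt.Println(m.Version, m.SQL)
	}
}
```

Under the same assumption, a database freshly created from `CreateTablesSQL()` could be recorded at `BaselineVersion()` so that none of the historical migrations run against it.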
+47 -41
@@ -2,7 +2,7 @@ package store
import (
"database/sql"
-"log"
+"log/slog"
_ "github.com/lib/pq"
)
@@ -13,8 +13,9 @@ func NewPostgresStore(connStr string) (*SQLStore, error) {
return NewSQLStore("postgres", connStr, &PostgresDialect{})
}
func (d *PostgresDialect) DriverName() string { return "postgres" }
func (d *PostgresDialect) BoolFalse() string { return "FALSE" }
+func (d *PostgresDialect) BaselineVersion() int { return 21 }
func (d *PostgresDialect) CreateTablesSQL() []string {
return []string{
@@ -32,7 +33,8 @@ func (d *PostgresDialect) CreateTablesSQL() []string {
method TEXT DEFAULT 'GET', description TEXT DEFAULT '',
parent_id INTEGER DEFAULT 0, accepted_codes TEXT DEFAULT '200-299',
dns_resolve_type TEXT DEFAULT '', dns_server TEXT DEFAULT '',
-ignore_tls BOOLEAN DEFAULT FALSE, paused BOOLEAN DEFAULT FALSE
+ignore_tls BOOLEAN DEFAULT FALSE, paused BOOLEAN DEFAULT FALSE,
+regions TEXT DEFAULT ''
)`,
`CREATE TABLE IF NOT EXISTS users (
id SERIAL PRIMARY KEY,
@@ -42,7 +44,8 @@ func (d *PostgresDialect) CreateTablesSQL() []string {
`CREATE TABLE IF NOT EXISTS check_history (
id SERIAL PRIMARY KEY,
site_id INTEGER NOT NULL, latency_ns BIGINT,
-is_up BOOLEAN, checked_at TIMESTAMPTZ DEFAULT NOW()
+is_up BOOLEAN, checked_at TIMESTAMPTZ DEFAULT NOW(),
+node_id TEXT DEFAULT ''
)`,
`CREATE INDEX IF NOT EXISTS idx_check_history_site ON check_history(site_id, checked_at DESC)`,
`CREATE TABLE IF NOT EXISTS nodes (
@@ -92,29 +95,29 @@ func (d *PostgresDialect) CreateTablesSQL() []string {
}
}
-func (d *PostgresDialect) MigrationsSQL() []string {
+func (d *PostgresDialect) Migrations() []Migration {
-return []string{
+return []Migration{
-"ALTER TABLE sites ADD COLUMN IF NOT EXISTS hostname TEXT DEFAULT ''",
+{1, "ALTER TABLE sites ADD COLUMN IF NOT EXISTS hostname TEXT DEFAULT ''"},
"ALTER TABLE sites ADD COLUMN IF NOT EXISTS port INTEGER DEFAULT 0", {2, "ALTER TABLE sites ADD COLUMN IF NOT EXISTS port INTEGER DEFAULT 0"},
"ALTER TABLE sites ADD COLUMN IF NOT EXISTS timeout INTEGER DEFAULT 0", {3, "ALTER TABLE sites ADD COLUMN IF NOT EXISTS timeout INTEGER DEFAULT 0"},
"ALTER TABLE sites ADD COLUMN IF NOT EXISTS method TEXT DEFAULT 'GET'", {4, "ALTER TABLE sites ADD COLUMN IF NOT EXISTS method TEXT DEFAULT 'GET'"},
"ALTER TABLE sites ADD COLUMN IF NOT EXISTS description TEXT DEFAULT ''", {5, "ALTER TABLE sites ADD COLUMN IF NOT EXISTS description TEXT DEFAULT ''"},
"ALTER TABLE sites ADD COLUMN IF NOT EXISTS parent_id INTEGER DEFAULT 0", {6, "ALTER TABLE sites ADD COLUMN IF NOT EXISTS parent_id INTEGER DEFAULT 0"},
"ALTER TABLE sites ADD COLUMN IF NOT EXISTS accepted_codes TEXT DEFAULT '200-299'", {7, "ALTER TABLE sites ADD COLUMN IF NOT EXISTS accepted_codes TEXT DEFAULT '200-299'"},
"ALTER TABLE sites ADD COLUMN IF NOT EXISTS dns_resolve_type TEXT DEFAULT ''", {8, "ALTER TABLE sites ADD COLUMN IF NOT EXISTS dns_resolve_type TEXT DEFAULT ''"},
"ALTER TABLE sites ADD COLUMN IF NOT EXISTS dns_server TEXT DEFAULT ''", {9, "ALTER TABLE sites ADD COLUMN IF NOT EXISTS dns_server TEXT DEFAULT ''"},
"ALTER TABLE sites ADD COLUMN IF NOT EXISTS ignore_tls BOOLEAN DEFAULT FALSE", {10, "ALTER TABLE sites ADD COLUMN IF NOT EXISTS ignore_tls BOOLEAN DEFAULT FALSE"},
"ALTER TABLE sites ADD COLUMN IF NOT EXISTS paused BOOLEAN DEFAULT FALSE", {11, "ALTER TABLE sites ADD COLUMN IF NOT EXISTS paused BOOLEAN DEFAULT FALSE"},
"ALTER TABLE check_history ADD COLUMN IF NOT EXISTS node_id TEXT DEFAULT ''", {12, "ALTER TABLE check_history ADD COLUMN IF NOT EXISTS node_id TEXT DEFAULT ''"},
"ALTER TABLE sites ADD COLUMN IF NOT EXISTS regions TEXT DEFAULT ''", {13, "ALTER TABLE sites ADD COLUMN IF NOT EXISTS regions TEXT DEFAULT ''"},
"ALTER TABLE check_history ALTER COLUMN checked_at TYPE TIMESTAMPTZ USING checked_at AT TIME ZONE 'UTC'", {14, "ALTER TABLE check_history ALTER COLUMN checked_at TYPE TIMESTAMPTZ USING checked_at AT TIME ZONE 'UTC'"},
"ALTER TABLE nodes ALTER COLUMN last_seen TYPE TIMESTAMPTZ USING last_seen AT TIME ZONE 'UTC'", {15, "ALTER TABLE nodes ALTER COLUMN last_seen TYPE TIMESTAMPTZ USING last_seen AT TIME ZONE 'UTC'"},
"ALTER TABLE logs ALTER COLUMN created_at TYPE TIMESTAMPTZ USING created_at AT TIME ZONE 'UTC'", {16, "ALTER TABLE logs ALTER COLUMN created_at TYPE TIMESTAMPTZ USING created_at AT TIME ZONE 'UTC'"},
"ALTER TABLE maintenance_windows ALTER COLUMN start_time TYPE TIMESTAMPTZ USING start_time AT TIME ZONE 'UTC'", {17, "ALTER TABLE maintenance_windows ALTER COLUMN start_time TYPE TIMESTAMPTZ USING start_time AT TIME ZONE 'UTC'"},
"ALTER TABLE maintenance_windows ALTER COLUMN end_time TYPE TIMESTAMPTZ USING end_time AT TIME ZONE 'UTC'", {18, "ALTER TABLE maintenance_windows ALTER COLUMN end_time TYPE TIMESTAMPTZ USING end_time AT TIME ZONE 'UTC'"},
"ALTER TABLE maintenance_windows ALTER COLUMN created_at TYPE TIMESTAMPTZ USING created_at AT TIME ZONE 'UTC'", {19, "ALTER TABLE maintenance_windows ALTER COLUMN created_at TYPE TIMESTAMPTZ USING created_at AT TIME ZONE 'UTC'"},
"ALTER TABLE state_changes ALTER COLUMN changed_at TYPE TIMESTAMPTZ USING changed_at AT TIME ZONE 'UTC'", {20, "ALTER TABLE state_changes ALTER COLUMN changed_at TYPE TIMESTAMPTZ USING changed_at AT TIME ZONE 'UTC'"},
"ALTER TABLE alert_health ALTER COLUMN last_send_at TYPE TIMESTAMPTZ USING last_send_at AT TIME ZONE 'UTC'", {21, "ALTER TABLE alert_health ALTER COLUMN last_send_at TYPE TIMESTAMPTZ USING last_send_at AT TIME ZONE 'UTC'"},
} }
} }
@@ -130,39 +133,42 @@ func (d *PostgresDialect) ResetSequenceOnEmpty(db *sql.DB, table string) {}
func (d *PostgresDialect) ImportWipe(tx *sql.Tx) { func (d *PostgresDialect) ImportWipe(tx *sql.Tx) {
if _, err := tx.Exec("TRUNCATE TABLE sites RESTART IDENTITY CASCADE"); err != nil { if _, err := tx.Exec("TRUNCATE TABLE sites RESTART IDENTITY CASCADE"); err != nil {
log.Printf("import wipe error: %v", err) slog.Debug("import wipe failed", "table", "sites", "err", err)
} }
if _, err := tx.Exec("TRUNCATE TABLE alerts RESTART IDENTITY CASCADE"); err != nil { if _, err := tx.Exec("TRUNCATE TABLE alerts RESTART IDENTITY CASCADE"); err != nil {
log.Printf("import wipe error: %v", err) slog.Debug("import wipe failed", "table", "alerts", "err", err)
}
if _, err := tx.Exec("TRUNCATE TABLE users RESTART IDENTITY CASCADE"); err != nil {
log.Printf("import wipe error: %v", err)
} }
if _, err := tx.Exec("TRUNCATE TABLE maintenance_windows RESTART IDENTITY CASCADE"); err != nil { if _, err := tx.Exec("TRUNCATE TABLE maintenance_windows RESTART IDENTITY CASCADE"); err != nil {
log.Printf("import wipe error: %v", err) slog.Debug("import wipe failed", "table", "maintenance_windows", "err", err)
} }
if _, err := tx.Exec("TRUNCATE TABLE check_history RESTART IDENTITY CASCADE"); err != nil { if _, err := tx.Exec("TRUNCATE TABLE check_history RESTART IDENTITY CASCADE"); err != nil {
log.Printf("import wipe error: %v", err) slog.Debug("import wipe failed", "table", "check_history", "err", err)
} }
if _, err := tx.Exec("TRUNCATE TABLE state_changes RESTART IDENTITY CASCADE"); err != nil { if _, err := tx.Exec("TRUNCATE TABLE state_changes RESTART IDENTITY CASCADE"); err != nil {
log.Printf("import wipe error: %v", err) slog.Debug("import wipe failed", "table", "state_changes", "err", err)
} }
if _, err := tx.Exec("TRUNCATE TABLE alert_health RESTART IDENTITY CASCADE"); err != nil { if _, err := tx.Exec("TRUNCATE TABLE alert_health RESTART IDENTITY CASCADE"); err != nil {
log.Printf("import wipe error: %v", err) slog.Debug("import wipe failed", "table", "alert_health", "err", err)
}
}
func (d *PostgresDialect) ImportWipeUsers(tx *sql.Tx) {
if _, err := tx.Exec("TRUNCATE TABLE users RESTART IDENTITY CASCADE"); err != nil {
slog.Debug("import wipe failed", "table", "users", "err", err)
} }
} }
func (d *PostgresDialect) ImportResetSequences(tx *sql.Tx) { func (d *PostgresDialect) ImportResetSequences(tx *sql.Tx) {
if _, err := tx.Exec("SELECT setval('sites_id_seq', (SELECT COALESCE(MAX(id), 1) FROM sites))"); err != nil { if _, err := tx.Exec("SELECT setval('sites_id_seq', (SELECT COALESCE(MAX(id), 1) FROM sites))"); err != nil {
log.Printf("sequence reset error: %v", err) slog.Debug("sequence reset failed", "table", "sites", "err", err)
} }
if _, err := tx.Exec("SELECT setval('alerts_id_seq', (SELECT COALESCE(MAX(id), 1) FROM alerts))"); err != nil { if _, err := tx.Exec("SELECT setval('alerts_id_seq', (SELECT COALESCE(MAX(id), 1) FROM alerts))"); err != nil {
log.Printf("sequence reset error: %v", err) slog.Debug("sequence reset failed", "table", "alerts", "err", err)
} }
if _, err := tx.Exec("SELECT setval('users_id_seq', (SELECT COALESCE(MAX(id), 1) FROM users))"); err != nil { if _, err := tx.Exec("SELECT setval('users_id_seq', (SELECT COALESCE(MAX(id), 1) FROM users))"); err != nil {
log.Printf("sequence reset error: %v", err) slog.Debug("sequence reset failed", "table", "users", "err", err)
} }
if _, err := tx.Exec("SELECT setval('maintenance_windows_id_seq', (SELECT COALESCE(MAX(id), 1) FROM maintenance_windows))"); err != nil { if _, err := tx.Exec("SELECT setval('maintenance_windows_id_seq', (SELECT COALESCE(MAX(id), 1) FROM maintenance_windows))"); err != nil {
log.Printf("sequence reset error: %v", err) slog.Debug("sequence reset failed", "table", "maintenance_windows", "err", err)
} }
} }
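One consequence of moving from `log.Printf` to `slog.Debug` in the hunks above: the default `log/slog` handler level is Info, so these failures are only emitted when the logger is configured for Debug. A small sketch of that behaviour, assuming a text handler writing to a buffer; `wipeLogOutput` is a hypothetical helper and the error value is invented.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"log/slog"
)

// wipeLogOutput is a hypothetical helper. It logs one record in the same
// shape as the slog.Debug calls in the diff and returns what the handler
// wrote. When verbose is false the handler keeps its default Info level.
func wipeLogOutput(verbose bool) string {
	var buf bytes.Buffer
	opts := &slog.HandlerOptions{}
	if verbose {
		opts.Level = slog.LevelDebug
	}
	logger := slog.New(slog.NewTextHandler(&buf, opts))
	logger.Debug("import wipe failed", "table", "sites", "err", errors.New("boom"))
	return buf.String()
}

func main() {
	fmt.Printf("default level: %q\n", wipeLogOutput(false))
	fmt.Printf("debug level:   %q\n", wipeLogOutput(true))
}
```

The key/value pairs (`table`, `err`) become structured attributes on the record, which is what makes the per-table messages in the diff filterable.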
+50 -36
@@ -3,7 +3,8 @@ package store
import ( import (
"database/sql" "database/sql"
"fmt" "fmt"
"log" "log/slog"
"os"
_ "modernc.org/sqlite" _ "modernc.org/sqlite"
) )
@@ -25,11 +26,19 @@ func NewSQLiteStore(path string) (*SQLStore, error) {
if err != nil { if err != nil {
return nil, err return nil, err
} }
if path != ":memory:" {
for _, suffix := range []string{"", "-wal", "-shm"} {
if err := os.Chmod(path+suffix, 0600); err != nil && !os.IsNotExist(err) {
slog.Warn("failed to chmod database file", "path", path+suffix, "err", err)
}
}
}
return s, nil return s, nil
} }
func (d *SQLiteDialect) DriverName() string { return "sqlite" } func (d *SQLiteDialect) DriverName() string { return "sqlite" }
func (d *SQLiteDialect) BoolFalse() string { return "0" } func (d *SQLiteDialect) BoolFalse() string { return "0" }
func (d *SQLiteDialect) BaselineVersion() int { return 13 }
func (d *SQLiteDialect) CreateTablesSQL() []string { func (d *SQLiteDialect) CreateTablesSQL() []string {
return []string{ return []string{
@@ -47,7 +56,8 @@ func (d *SQLiteDialect) CreateTablesSQL() []string {
method TEXT DEFAULT 'GET', description TEXT DEFAULT '', method TEXT DEFAULT 'GET', description TEXT DEFAULT '',
parent_id INTEGER DEFAULT 0, accepted_codes TEXT DEFAULT '200-299', parent_id INTEGER DEFAULT 0, accepted_codes TEXT DEFAULT '200-299',
dns_resolve_type TEXT DEFAULT '', dns_server TEXT DEFAULT '', dns_resolve_type TEXT DEFAULT '', dns_server TEXT DEFAULT '',
ignore_tls BOOLEAN DEFAULT 0, paused BOOLEAN DEFAULT 0 ignore_tls BOOLEAN DEFAULT 0, paused BOOLEAN DEFAULT 0,
regions TEXT DEFAULT ''
)`, )`,
`CREATE TABLE IF NOT EXISTS users ( `CREATE TABLE IF NOT EXISTS users (
id INTEGER PRIMARY KEY AUTOINCREMENT, id INTEGER PRIMARY KEY AUTOINCREMENT,
@@ -57,7 +67,8 @@ func (d *SQLiteDialect) CreateTablesSQL() []string {
`CREATE TABLE IF NOT EXISTS check_history ( `CREATE TABLE IF NOT EXISTS check_history (
id INTEGER PRIMARY KEY AUTOINCREMENT, id INTEGER PRIMARY KEY AUTOINCREMENT,
site_id INTEGER NOT NULL, latency_ns INTEGER, site_id INTEGER NOT NULL, latency_ns INTEGER,
is_up BOOLEAN, checked_at DATETIME DEFAULT CURRENT_TIMESTAMP is_up BOOLEAN, checked_at DATETIME DEFAULT CURRENT_TIMESTAMP,
node_id TEXT DEFAULT ''
)`, )`,
`CREATE INDEX IF NOT EXISTS idx_check_history_site ON check_history(site_id, checked_at DESC)`, `CREATE INDEX IF NOT EXISTS idx_check_history_site ON check_history(site_id, checked_at DESC)`,
`CREATE TABLE IF NOT EXISTS nodes ( `CREATE TABLE IF NOT EXISTS nodes (
@@ -107,21 +118,21 @@ func (d *SQLiteDialect) CreateTablesSQL() []string {
} }
} }
func (d *SQLiteDialect) MigrationsSQL() []string { func (d *SQLiteDialect) Migrations() []Migration {
return []string{ return []Migration{
"ALTER TABLE sites ADD COLUMN hostname TEXT DEFAULT ''", {1, "ALTER TABLE sites ADD COLUMN hostname TEXT DEFAULT ''"},
"ALTER TABLE sites ADD COLUMN port INTEGER DEFAULT 0", {2, "ALTER TABLE sites ADD COLUMN port INTEGER DEFAULT 0"},
"ALTER TABLE sites ADD COLUMN timeout INTEGER DEFAULT 0", {3, "ALTER TABLE sites ADD COLUMN timeout INTEGER DEFAULT 0"},
"ALTER TABLE sites ADD COLUMN method TEXT DEFAULT 'GET'", {4, "ALTER TABLE sites ADD COLUMN method TEXT DEFAULT 'GET'"},
"ALTER TABLE sites ADD COLUMN description TEXT DEFAULT ''", {5, "ALTER TABLE sites ADD COLUMN description TEXT DEFAULT ''"},
"ALTER TABLE sites ADD COLUMN parent_id INTEGER DEFAULT 0", {6, "ALTER TABLE sites ADD COLUMN parent_id INTEGER DEFAULT 0"},
"ALTER TABLE sites ADD COLUMN accepted_codes TEXT DEFAULT '200-299'", {7, "ALTER TABLE sites ADD COLUMN accepted_codes TEXT DEFAULT '200-299'"},
"ALTER TABLE sites ADD COLUMN dns_resolve_type TEXT DEFAULT ''", {8, "ALTER TABLE sites ADD COLUMN dns_resolve_type TEXT DEFAULT ''"},
"ALTER TABLE sites ADD COLUMN dns_server TEXT DEFAULT ''", {9, "ALTER TABLE sites ADD COLUMN dns_server TEXT DEFAULT ''"},
"ALTER TABLE sites ADD COLUMN ignore_tls BOOLEAN DEFAULT 0", {10, "ALTER TABLE sites ADD COLUMN ignore_tls BOOLEAN DEFAULT 0"},
"ALTER TABLE sites ADD COLUMN paused BOOLEAN DEFAULT 0", {11, "ALTER TABLE sites ADD COLUMN paused BOOLEAN DEFAULT 0"},
"ALTER TABLE check_history ADD COLUMN node_id TEXT DEFAULT ''", {12, "ALTER TABLE check_history ADD COLUMN node_id TEXT DEFAULT ''"},
"ALTER TABLE sites ADD COLUMN regions TEXT DEFAULT ''", {13, "ALTER TABLE sites ADD COLUMN regions TEXT DEFAULT ''"},
} }
} }
@@ -138,44 +149,47 @@ func (d *SQLiteDialect) ResetSequenceOnEmpty(db *sql.DB, table string) {
_ = db.QueryRow("SELECT COUNT(*) FROM " + table).Scan(&count) //nolint:errcheck _ = db.QueryRow("SELECT COUNT(*) FROM " + table).Scan(&count) //nolint:errcheck
if count == 0 { if count == 0 {
if _, err := db.Exec("DELETE FROM sqlite_sequence WHERE name=?", table); err != nil { if _, err := db.Exec("DELETE FROM sqlite_sequence WHERE name=?", table); err != nil {
log.Printf("sequence cleanup error: %v", err) slog.Debug("sequence cleanup failed", "table", table, "err", err)
} }
} }
} }
func (d *SQLiteDialect) ImportWipe(tx *sql.Tx) { func (d *SQLiteDialect) ImportWipe(tx *sql.Tx) {
if _, err := tx.Exec("DELETE FROM sites"); err != nil { if _, err := tx.Exec("DELETE FROM sites"); err != nil {
log.Printf("import wipe error: %v", err) slog.Debug("import wipe failed", "table", "sites", "err", err)
} }
if _, err := tx.Exec("DELETE FROM sqlite_sequence WHERE name='sites'"); err != nil { if _, err := tx.Exec("DELETE FROM sqlite_sequence WHERE name='sites'"); err != nil {
log.Printf("import wipe error: %v", err) slog.Debug("import wipe failed", "table", "sqlite_sequence(sites)", "err", err)
} }
if _, err := tx.Exec("DELETE FROM alerts"); err != nil { if _, err := tx.Exec("DELETE FROM alerts"); err != nil {
log.Printf("import wipe error: %v", err) slog.Debug("import wipe failed", "table", "alerts", "err", err)
} }
if _, err := tx.Exec("DELETE FROM sqlite_sequence WHERE name='alerts'"); err != nil { if _, err := tx.Exec("DELETE FROM sqlite_sequence WHERE name='alerts'"); err != nil {
log.Printf("import wipe error: %v", err) slog.Debug("import wipe failed", "table", "sqlite_sequence(alerts)", "err", err)
}
if _, err := tx.Exec("DELETE FROM users"); err != nil {
log.Printf("import wipe error: %v", err)
}
if _, err := tx.Exec("DELETE FROM sqlite_sequence WHERE name='users'"); err != nil {
log.Printf("import wipe error: %v", err)
} }
if _, err := tx.Exec("DELETE FROM maintenance_windows"); err != nil { if _, err := tx.Exec("DELETE FROM maintenance_windows"); err != nil {
log.Printf("import wipe error: %v", err) slog.Debug("import wipe failed", "table", "maintenance_windows", "err", err)
} }
if _, err := tx.Exec("DELETE FROM sqlite_sequence WHERE name='maintenance_windows'"); err != nil { if _, err := tx.Exec("DELETE FROM sqlite_sequence WHERE name='maintenance_windows'"); err != nil {
log.Printf("import wipe error: %v", err) slog.Debug("import wipe failed", "table", "sqlite_sequence(maintenance_windows)", "err", err)
} }
if _, err := tx.Exec("DELETE FROM check_history"); err != nil { if _, err := tx.Exec("DELETE FROM check_history"); err != nil {
log.Printf("import wipe error: %v", err) slog.Debug("import wipe failed", "table", "check_history", "err", err)
} }
if _, err := tx.Exec("DELETE FROM state_changes"); err != nil { if _, err := tx.Exec("DELETE FROM state_changes"); err != nil {
log.Printf("import wipe error: %v", err) slog.Debug("import wipe failed", "table", "state_changes", "err", err)
} }
if _, err := tx.Exec("DELETE FROM alert_health"); err != nil { if _, err := tx.Exec("DELETE FROM alert_health"); err != nil {
log.Printf("import wipe error: %v", err) slog.Debug("import wipe failed", "table", "alert_health", "err", err)
}
}
func (d *SQLiteDialect) ImportWipeUsers(tx *sql.Tx) {
if _, err := tx.Exec("DELETE FROM users"); err != nil {
slog.Debug("import wipe failed", "table", "users", "err", err)
}
if _, err := tx.Exec("DELETE FROM sqlite_sequence WHERE name='users'"); err != nil {
slog.Debug("import wipe failed", "table", "sqlite_sequence(users)", "err", err)
} }
} }
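The permission tightening added to `NewSQLiteStore` earlier in this file can be exercised in isolation. The sketch below follows the same loop over the database file and its `-wal` and `-shm` siblings, but returns errors instead of logging a warning. It assumes a Unix-like system, where `os.Chmod` applies the full mode bits; `tightenPerms` and `demo` are hypothetical names and the file path is a throwaway temp file.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// tightenPerms is a hypothetical helper following the loop in the diff: it
// restricts the database file and its -wal/-shm siblings to owner read/write,
// skipping any that do not exist. It returns how many files were changed.
func tightenPerms(path string) (int, error) {
	changed := 0
	for _, suffix := range []string{"", "-wal", "-shm"} {
		err := os.Chmod(path+suffix, 0o600)
		if os.IsNotExist(err) {
			continue
		}
		if err != nil {
			return changed, err
		}
		changed++
	}
	return changed, nil
}

// demo creates a throwaway file with loose permissions, tightens it, and
// reports the resulting permission bits and the number of files touched.
func demo() (os.FileMode, int, error) {
	dir, err := os.MkdirTemp("", "uptop-sketch")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "example.db")
	if err := os.WriteFile(path, nil, 0o644); err != nil {
		return 0, 0, err
	}
	n, err := tightenPerms(path)
	if err != nil {
		return 0, n, err
	}
	info, err := os.Stat(path)
	if err != nil {
		return 0, n, err
	}
	return info.Mode().Perm(), n, nil
}

func main() {
	mode, n, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("changed %d file(s), mode now %v\n", n, mode)
}
```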
+51 -23
@@ -7,7 +7,6 @@ import (
"encoding/hex" "encoding/hex"
"encoding/json" "encoding/json"
"fmt" "fmt"
"strings"
"time" "time"
"gitea.lerkolabs.com/lerkolabs/uptop/internal/models" "gitea.lerkolabs.com/lerkolabs/uptop/internal/models"
@@ -18,7 +17,6 @@ const (
maxLogRows = 200 maxLogRows = 200
maxStateChangesPerSite = 5000 maxStateChangesPerSite = 5000
maxMaintenanceExport = 1000 maxMaintenanceExport = 1000
maxRequestBody = 1 << 20
) )
type SQLStore struct { type SQLStore struct {
@@ -80,19 +78,40 @@ func (s *SQLStore) Init(ctx context.Context) error {
return err return err
} }
} }
for _, m := range s.dialect.MigrationsSQL() {
if _, err := s.db.ExecContext(ctx, m); err != nil { if _, err := s.db.ExecContext(ctx, `CREATE TABLE IF NOT EXISTS schema_version (
errMsg := err.Error() version INTEGER PRIMARY KEY,
if strings.Contains(errMsg, "already exists") || strings.Contains(errMsg, "duplicate column") { applied_at DATETIME DEFAULT CURRENT_TIMESTAMP
continue )`); err != nil {
} return fmt.Errorf("create schema_version: %w", err)
return fmt.Errorf("migration failed: %w", err) }
var current int
_ = s.db.QueryRowContext(ctx, "SELECT COALESCE(MAX(version), 0) FROM schema_version").Scan(&current) //nolint:errcheck
if current == 0 {
baseline := s.dialect.BaselineVersion()
if _, err := s.db.ExecContext(ctx, s.q("INSERT INTO schema_version (version) VALUES (?)"), baseline); err != nil {
return fmt.Errorf("seed baseline version: %w", err)
}
current = baseline
}
for _, m := range s.dialect.Migrations() {
if m.Version <= current {
continue
}
if _, err := s.db.ExecContext(ctx, m.SQL); err != nil {
return fmt.Errorf("migration %d failed: %w", m.Version, err)
}
if _, err := s.db.ExecContext(ctx, s.q("INSERT INTO schema_version (version) VALUES (?)"), m.Version); err != nil {
return fmt.Errorf("record migration %d: %w", m.Version, err)
} }
} }
return nil return nil
} }
func (s *SQLStore) GetSites(ctx context.Context) ([]models.Site, error) { func (s *SQLStore) GetSites(ctx context.Context) ([]models.SiteConfig, error) {
bf := s.dialect.BoolFalse() bf := s.dialect.BoolFalse()
query := fmt.Sprintf( //nolint:gosec // bf is a dialect boolean literal, not user input query := fmt.Sprintf( //nolint:gosec // bf is a dialect boolean literal, not user input
"SELECT id, COALESCE(name, url), url, COALESCE(type, 'http'), COALESCE(token, ''), interval, alert_id, check_ssl, threshold, max_retries, COALESCE(hostname, ''), COALESCE(port, 0), COALESCE(timeout, 0), COALESCE(method, 'GET'), COALESCE(description, ''), COALESCE(parent_id, 0), COALESCE(accepted_codes, '200-299'), COALESCE(dns_resolve_type, ''), COALESCE(dns_server, ''), COALESCE(ignore_tls, %s), COALESCE(paused, %s), COALESCE(regions, '') FROM sites", "SELECT id, COALESCE(name, url), url, COALESCE(type, 'http'), COALESCE(token, ''), interval, alert_id, check_ssl, threshold, max_retries, COALESCE(hostname, ''), COALESCE(port, 0), COALESCE(timeout, 0), COALESCE(method, 'GET'), COALESCE(description, ''), COALESCE(parent_id, 0), COALESCE(accepted_codes, '200-299'), COALESCE(dns_resolve_type, ''), COALESCE(dns_server, ''), COALESCE(ignore_tls, %s), COALESCE(paused, %s), COALESCE(regions, '') FROM sites",
@@ -103,9 +122,9 @@ func (s *SQLStore) GetSites(ctx context.Context) ([]models.Site, error) {
return nil, err return nil, err
} }
defer rows.Close() defer rows.Close()
var sites []models.Site var sites []models.SiteConfig
for rows.Next() { for rows.Next() {
var st models.Site var st models.SiteConfig
if err := rows.Scan(&st.ID, &st.Name, &st.URL, &st.Type, &st.Token, &st.Interval, &st.AlertID, if err := rows.Scan(&st.ID, &st.Name, &st.URL, &st.Type, &st.Token, &st.Interval, &st.AlertID,
&st.CheckSSL, &st.ExpiryThreshold, &st.MaxRetries, &st.Hostname, &st.Port, &st.Timeout, &st.CheckSSL, &st.ExpiryThreshold, &st.MaxRetries, &st.Hostname, &st.Port, &st.Timeout,
&st.Method, &st.Description, &st.ParentID, &st.AcceptedCodes, &st.DNSResolveType, &st.Method, &st.Description, &st.ParentID, &st.AcceptedCodes, &st.DNSResolveType,
@@ -117,7 +136,7 @@ func (s *SQLStore) GetSites(ctx context.Context) ([]models.Site, error) {
return sites, rows.Err() return sites, rows.Err()
} }
func (s *SQLStore) AddSite(ctx context.Context, site models.Site) error { func (s *SQLStore) AddSite(ctx context.Context, site models.SiteConfig) error {
token := "" token := ""
if site.Type == "push" { if site.Type == "push" {
var err error var err error
@@ -132,9 +151,11 @@ func (s *SQLStore) AddSite(ctx context.Context, site models.Site) error {
return err return err
} }
func (s *SQLStore) UpdateSite(ctx context.Context, site models.Site) error { func (s *SQLStore) UpdateSite(ctx context.Context, site models.SiteConfig) error {
var existingToken string var existingToken string
_ = s.db.QueryRowContext(ctx, s.q("SELECT token FROM sites WHERE id=?"), site.ID).Scan(&existingToken) //nolint:errcheck if err := s.db.QueryRowContext(ctx, s.q("SELECT token FROM sites WHERE id=?"), site.ID).Scan(&existingToken); err != nil && err != sql.ErrNoRows {
return fmt.Errorf("read existing token: %w", err)
}
if site.Type == "push" && existingToken == "" { if site.Type == "push" && existingToken == "" {
var err error var err error
existingToken, err = generateToken() existingToken, err = generateToken()
@@ -178,13 +199,13 @@ func (s *SQLStore) DeleteSite(ctx context.Context, id int) error {
return nil return nil
} }
func (s *SQLStore) GetSiteByName(ctx context.Context, name string) (models.Site, error) { func (s *SQLStore) GetSiteByName(ctx context.Context, name string) (models.SiteConfig, error) {
bf := s.dialect.BoolFalse() bf := s.dialect.BoolFalse()
query := fmt.Sprintf( //nolint:gosec // bf is a dialect boolean literal, not user input query := fmt.Sprintf( //nolint:gosec // bf is a dialect boolean literal, not user input
"SELECT id, COALESCE(name, url), url, COALESCE(type, 'http'), COALESCE(token, ''), interval, alert_id, check_ssl, threshold, max_retries, COALESCE(hostname, ''), COALESCE(port, 0), COALESCE(timeout, 0), COALESCE(method, 'GET'), COALESCE(description, ''), COALESCE(parent_id, 0), COALESCE(accepted_codes, '200-299'), COALESCE(dns_resolve_type, ''), COALESCE(dns_server, ''), COALESCE(ignore_tls, %s), COALESCE(paused, %s), COALESCE(regions, '') FROM sites WHERE name = %s", "SELECT id, COALESCE(name, url), url, COALESCE(type, 'http'), COALESCE(token, ''), interval, alert_id, check_ssl, threshold, max_retries, COALESCE(hostname, ''), COALESCE(port, 0), COALESCE(timeout, 0), COALESCE(method, 'GET'), COALESCE(description, ''), COALESCE(parent_id, 0), COALESCE(accepted_codes, '200-299'), COALESCE(dns_resolve_type, ''), COALESCE(dns_server, ''), COALESCE(ignore_tls, %s), COALESCE(paused, %s), COALESCE(regions, '') FROM sites WHERE name = %s",
bf, bf, s.q("?"), bf, bf, s.q("?"),
) )
var st models.Site var st models.SiteConfig
err := s.db.QueryRowContext(ctx, query, name).Scan(&st.ID, &st.Name, &st.URL, &st.Type, &st.Token, &st.Interval, &st.AlertID, err := s.db.QueryRowContext(ctx, query, name).Scan(&st.ID, &st.Name, &st.URL, &st.Type, &st.Token, &st.Interval, &st.AlertID,
&st.CheckSSL, &st.ExpiryThreshold, &st.MaxRetries, &st.Hostname, &st.Port, &st.Timeout, &st.CheckSSL, &st.ExpiryThreshold, &st.MaxRetries, &st.Hostname, &st.Port, &st.Timeout,
&st.Method, &st.Description, &st.ParentID, &st.AcceptedCodes, &st.DNSResolveType, &st.Method, &st.Description, &st.ParentID, &st.AcceptedCodes, &st.DNSResolveType,
@@ -226,7 +247,7 @@ func (s *SQLStore) GetAlertByName(ctx context.Context, name string) (models.Aler
return a, nil return a, nil
} }
func (s *SQLStore) AddSiteReturningID(ctx context.Context, site models.Site) (int, error) { func (s *SQLStore) AddSiteReturningID(ctx context.Context, site models.SiteConfig) (int, error) {
token := "" token := ""
if site.Type == "push" { if site.Type == "push" {
var err error var err error
@@ -325,8 +346,10 @@ func (s *SQLStore) UpdateAlert(ctx context.Context, id int, name, aType string,
} }
func (s *SQLStore) DeleteAlert(ctx context.Context, id int) error { func (s *SQLStore) DeleteAlert(ctx context.Context, id int) error {
_, err := s.db.ExecContext(ctx, s.q("DELETE FROM alerts WHERE id=?"), id) if _, err := s.db.ExecContext(ctx, s.q("UPDATE sites SET alert_id = 0 WHERE alert_id = ?"), id); err != nil {
if err != nil { return err
}
if _, err := s.db.ExecContext(ctx, s.q("DELETE FROM alerts WHERE id=?"), id); err != nil {
return err return err
} }
s.dialect.ResetSequenceOnEmpty(s.db, "alerts") s.dialect.ResetSequenceOnEmpty(s.db, "alerts")
@@ -719,9 +742,14 @@ func (s *SQLStore) ImportData(ctx context.Context, data models.Backup) error {
s.dialect.ImportWipe(tx) s.dialect.ImportWipe(tx)
for _, u := range data.Users { // Only wipe+replace users when callers explicitly provide them (CLI
if _, err := tx.ExecContext(ctx, s.q("INSERT INTO users (username, public_key, role) VALUES (?, ?, ?)"), u.Username, u.PublicKey, u.Role); err != nil { // full restore). API/Kuma imports pass nil — existing users preserved.
return err if data.Users != nil {
s.dialect.ImportWipeUsers(tx)
for _, u := range data.Users {
if _, err := tx.ExecContext(ctx, s.q("INSERT INTO users (username, public_key, role) VALUES (?, ?, ?)"), u.Username, u.PublicKey, u.Role); err != nil {
return err
}
} }
} }
for _, a := range data.Alerts { for _, a := range data.Alerts {
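The `data.Users != nil` check above distinguishes a nil slice from an empty one, which matters if a backup is decoded from JSON. With the standard `encoding/json` decoder, a missing or `null` field leaves the slice nil, while an explicit `[]` produces a non-nil empty slice and would therefore take the wipe-and-replace branch with nothing to insert. A sketch of that distinction, assuming no custom unmarshalling; the `backup` struct and its `users` JSON key are hypothetical stand-ins for `models.Backup`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// backup is a hypothetical stand-in for models.Backup with only the field
// that matters here; the JSON key is an assumption.
type backup struct {
	Users []string `json:"users"`
}

// replacesUsers reports whether a payload would take the wipe-and-replace
// branch, i.e. whether the decoded Users slice is non-nil.
func replacesUsers(payload string) (bool, error) {
	var b backup
	if err := json.Unmarshal([]byte(payload), &b); err != nil {
		return false, err
	}
	return b.Users != nil, nil
}

func main() {
	payloads := []string{`{}`, `{"users":null}`, `{"users":[]}`, `{"users":["admin"]}`}
	for _, p := range payloads {
		ok, err := replacesUsers(p)
		fmt.Println(p, ok, err)
	}
}
```

Callers that want to preserve existing users should leave the field out (or nil) rather than sending an empty list.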
+31 -6
@@ -33,7 +33,7 @@ func TestSiteCRUD(t *testing.T) {
t.Fatalf("expected 0 sites, got %d", len(sites)) t.Fatalf("expected 0 sites, got %d", len(sites))
} }
if err := s.AddSite(context.Background(), models.Site{Name: "Test", URL: "https://example.com", Type: "http", Interval: 30}); err != nil { if err := s.AddSite(context.Background(), models.SiteConfig{Name: "Test", URL: "https://example.com", Type: "http", Interval: 30}); err != nil {
t.Fatalf("AddSite: %v", err) t.Fatalf("AddSite: %v", err)
} }
@@ -174,7 +174,7 @@ func TestUserCRUD(t *testing.T) {
func TestPushTokenGeneration(t *testing.T) { func TestPushTokenGeneration(t *testing.T) {
s := newTestStore(t) s := newTestStore(t)
if err := s.AddSite(context.Background(), models.Site{Name: "Push Monitor", Type: "push", Interval: 60}); err != nil { if err := s.AddSite(context.Background(), models.SiteConfig{Name: "Push Monitor", Type: "push", Interval: 60}); err != nil {
t.Fatalf("AddSite: %v", err) t.Fatalf("AddSite: %v", err)
} }
@@ -199,7 +199,7 @@ func TestImportExport(t *testing.T) {
if err := s.AddAlert(context.Background(), "Test Alert", "webhook", map[string]string{"url": "https://example.com"}); err != nil { if err := s.AddAlert(context.Background(), "Test Alert", "webhook", map[string]string{"url": "https://example.com"}); err != nil {
t.Fatalf("AddAlert: %v", err) t.Fatalf("AddAlert: %v", err)
} }
if err := s.AddSite(context.Background(), models.Site{Name: "Site1", URL: "https://example.com", Type: "http", Interval: 30}); err != nil { if err := s.AddSite(context.Background(), models.SiteConfig{Name: "Site1", URL: "https://example.com", Type: "http", Interval: 30}); err != nil {
t.Fatalf("AddSite: %v", err) t.Fatalf("AddSite: %v", err)
} }
if err := s.AddUser(context.Background(), "user1", "ssh-ed25519 KEY", "user"); err != nil { if err := s.AddUser(context.Background(), "user1", "ssh-ed25519 KEY", "user"); err != nil {
@@ -239,7 +239,7 @@ func TestImportExport(t *testing.T) {
func TestImportData_WipesHistory(t *testing.T) { func TestImportData_WipesHistory(t *testing.T) {
s := newTestStore(t) s := newTestStore(t)
if err := s.AddSite(context.Background(), models.Site{Name: "OldSite", URL: "https://old.com", Type: "http", Interval: 30}); err != nil { if err := s.AddSite(context.Background(), models.SiteConfig{Name: "OldSite", URL: "https://old.com", Type: "http", Interval: 30}); err != nil {
t.Fatalf("AddSite: %v", err) t.Fatalf("AddSite: %v", err)
} }
if err := s.SaveCheck(context.Background(), 1, 5000, true); err != nil { if err := s.SaveCheck(context.Background(), 1, 5000, true); err != nil {
@@ -253,7 +253,7 @@ func TestImportData_WipesHistory(t *testing.T) {
} }
backup := models.Backup{ backup := models.Backup{
Sites: []models.Site{{ID: 1, Name: "NewSite", URL: "https://new.com", Type: "http", Interval: 60}}, Sites: []models.SiteConfig{{ID: 1, Name: "NewSite", URL: "https://new.com", Type: "http", Interval: 60}},
} }
if err := s.ImportData(context.Background(), backup); err != nil { if err := s.ImportData(context.Background(), backup); err != nil {
t.Fatalf("ImportData: %v", err) t.Fatalf("ImportData: %v", err)
@@ -276,6 +276,31 @@ func TestImportData_WipesHistory(t *testing.T) {
} }
} }
func TestImportData_NilUsersPreservesExisting(t *testing.T) {
s := newTestStore(t)
if err := s.AddUser(context.Background(), "admin", "ssh-ed25519 ADMINKEY", "admin"); err != nil {
t.Fatalf("AddUser: %v", err)
}
backup := models.Backup{
Sites: []models.SiteConfig{{ID: 1, Name: "New", URL: "https://new.com", Type: "http", Interval: 30}},
Alerts: []models.AlertConfig{{ID: 1, Name: "a", Type: "webhook", Settings: map[string]string{"url": "https://h.com"}}},
Users: nil,
}
if err := s.ImportData(context.Background(), backup); err != nil {
t.Fatalf("ImportData: %v", err)
}
users, err := s.GetAllUsers(context.Background())
if err != nil {
t.Fatalf("GetAllUsers: %v", err)
}
if len(users) != 1 || users[0].Username != "admin" {
t.Errorf("expected existing admin user preserved, got %d users", len(users))
}
}
func TestCheckHistory(t *testing.T) { func TestCheckHistory(t *testing.T) {
s := newTestStore(t) s := newTestStore(t)
@@ -314,7 +339,7 @@ func TestCheckHistory(t *testing.T) {
func TestDeleteSiteCascade(t *testing.T) { func TestDeleteSiteCascade(t *testing.T) {
s := newTestStore(t) s := newTestStore(t)
site := models.Site{Name: "Cascade Test", URL: "https://example.com", Interval: 30} site := models.SiteConfig{Name: "Cascade Test", URL: "https://example.com", Interval: 30}
if err := s.AddSite(context.Background(), site); err != nil { if err := s.AddSite(context.Background(), site); err != nil {
t.Fatalf("AddSite: %v", err) t.Fatalf("AddSite: %v", err)
} }
+5 -5
@@ -11,9 +11,9 @@ type Store interface {
Init(ctx context.Context) error
// Sites
-GetSites(ctx context.Context) ([]models.Site, error)
-AddSite(ctx context.Context, site models.Site) error
-UpdateSite(ctx context.Context, site models.Site) error
+GetSites(ctx context.Context) ([]models.SiteConfig, error)
+AddSite(ctx context.Context, site models.SiteConfig) error
+UpdateSite(ctx context.Context, site models.SiteConfig) error
UpdateSitePaused(ctx context.Context, id int, paused bool) error
DeleteSite(ctx context.Context, id int) error
@@ -25,9 +25,9 @@ type Store interface {
DeleteAlert(ctx context.Context, id int) error
// Declarative config support
-GetSiteByName(ctx context.Context, name string) (models.Site, error)
+GetSiteByName(ctx context.Context, name string) (models.SiteConfig, error)
GetAlertByName(ctx context.Context, name string) (models.AlertConfig, error)
-AddSiteReturningID(ctx context.Context, site models.Site) (int, error)
+AddSiteReturningID(ctx context.Context, site models.SiteConfig) (int, error)
AddAlertReturningID(ctx context.Context, name, aType string, settings map[string]string) (int, error)
// Users
+276
@@ -0,0 +1,276 @@
package storetest
import (
"context"
"time"
"gitea.lerkolabs.com/lerkolabs/uptop/internal/models"
)
// BaseMock implements store.Store with no-op defaults. Embed it in test-specific
// mocks and override only the methods you need via the exported Func fields or
// by shadowing the method on the embedding struct.
type BaseMock struct {
GetSitesFunc func(ctx context.Context) ([]models.SiteConfig, error)
AddSiteFunc func(ctx context.Context, site models.SiteConfig) error
UpdateSiteFunc func(ctx context.Context, site models.SiteConfig) error
GetAllAlertsFunc func(ctx context.Context) ([]models.AlertConfig, error)
GetAlertFunc func(ctx context.Context, id int) (models.AlertConfig, error)
GetAllUsersFunc func(ctx context.Context) ([]models.User, error)
GetAllNodesFunc func(ctx context.Context) ([]models.ProbeNode, error)
GetActiveMaintenanceWindowsFunc func(ctx context.Context) ([]models.MaintenanceWindow, error)
GetAllMaintenanceWindowsFunc func(ctx context.Context, limit int) ([]models.MaintenanceWindow, error)
IsMonitorInMaintenanceFunc func(ctx context.Context, id int) (bool, error)
LoadAlertHealthFunc func(ctx context.Context) (map[int]models.AlertHealthRecord, error)
LoadAllHistoryFunc func(ctx context.Context, limit int) (map[int][]models.CheckRecord, error)
SaveCheckFunc func(ctx context.Context, siteID int, latencyNs int64, isUp bool) error
SaveCheckFromNodeFunc func(ctx context.Context, siteID int, nodeID string, latencyNs int64, isUp bool) error
SaveLogFunc func(ctx context.Context, message string) error
SaveStateChangeFunc func(ctx context.Context, siteID int, from, to, reason string) error
SaveAlertHealthFunc func(ctx context.Context, h models.AlertHealthRecord) error
GetStateChangesFunc func(ctx context.Context, siteID, limit int) ([]models.StateChange, error)
GetStateChangesSinceFunc func(ctx context.Context, siteID int, since time.Time) ([]models.StateChange, error)
ExportDataFunc func(ctx context.Context) (models.Backup, error)
ImportDataFunc func(ctx context.Context, data models.Backup) error
RegisterNodeFunc func(ctx context.Context, node models.ProbeNode) error
GetNodeFunc func(ctx context.Context, id string) (models.ProbeNode, error)
GetPreferenceFunc func(ctx context.Context, key string) (string, error)
SetPreferenceFunc func(ctx context.Context, key, value string) error
}
func (m *BaseMock) Init(_ context.Context) error { return nil }
func (m *BaseMock) Close() error { return nil }
func (m *BaseMock) GetSites(ctx context.Context) ([]models.SiteConfig, error) {
if m.GetSitesFunc != nil {
return m.GetSitesFunc(ctx)
}
return nil, nil
}
func (m *BaseMock) AddSite(ctx context.Context, site models.SiteConfig) error {
if m.AddSiteFunc != nil {
return m.AddSiteFunc(ctx, site)
}
return nil
}
func (m *BaseMock) UpdateSite(ctx context.Context, site models.SiteConfig) error {
if m.UpdateSiteFunc != nil {
return m.UpdateSiteFunc(ctx, site)
}
return nil
}
func (m *BaseMock) UpdateSitePaused(_ context.Context, _ int, _ bool) error { return nil }
func (m *BaseMock) DeleteSite(_ context.Context, _ int) error { return nil }
func (m *BaseMock) GetAllAlerts(ctx context.Context) ([]models.AlertConfig, error) {
if m.GetAllAlertsFunc != nil {
return m.GetAllAlertsFunc(ctx)
}
return nil, nil
}
func (m *BaseMock) GetAlert(ctx context.Context, id int) (models.AlertConfig, error) {
if m.GetAlertFunc != nil {
return m.GetAlertFunc(ctx, id)
}
return models.AlertConfig{}, nil
}
func (m *BaseMock) AddAlert(_ context.Context, _ string, _ string, _ map[string]string) error {
return nil
}
func (m *BaseMock) UpdateAlert(_ context.Context, _ int, _ string, _ string, _ map[string]string) error {
return nil
}
func (m *BaseMock) DeleteAlert(_ context.Context, _ int) error { return nil }
func (m *BaseMock) GetSiteByName(_ context.Context, _ string) (models.SiteConfig, error) {
return models.SiteConfig{}, nil
}
func (m *BaseMock) GetAlertByName(_ context.Context, _ string) (models.AlertConfig, error) {
return models.AlertConfig{}, nil
}
func (m *BaseMock) AddSiteReturningID(_ context.Context, _ models.SiteConfig) (int, error) {
return 0, nil
}
func (m *BaseMock) AddAlertReturningID(_ context.Context, _ string, _ string, _ map[string]string) (int, error) {
return 0, nil
}
func (m *BaseMock) GetAllUsers(ctx context.Context) ([]models.User, error) {
if m.GetAllUsersFunc != nil {
return m.GetAllUsersFunc(ctx)
}
return nil, nil
}
func (m *BaseMock) AddUser(_ context.Context, _ string, _ string, _ string) error { return nil }
func (m *BaseMock) UpdateUser(_ context.Context, _ int, _ string, _ string, _ string) error {
return nil
}
func (m *BaseMock) DeleteUser(_ context.Context, _ int) error { return nil }
func (m *BaseMock) SaveCheck(ctx context.Context, siteID int, latencyNs int64, isUp bool) error {
if m.SaveCheckFunc != nil {
return m.SaveCheckFunc(ctx, siteID, latencyNs, isUp)
}
return nil
}
func (m *BaseMock) SaveCheckFromNode(ctx context.Context, siteID int, nodeID string, latencyNs int64, isUp bool) error {
if m.SaveCheckFromNodeFunc != nil {
return m.SaveCheckFromNodeFunc(ctx, siteID, nodeID, latencyNs, isUp)
}
return nil
}
func (m *BaseMock) LoadAllHistory(ctx context.Context, limit int) (map[int][]models.CheckRecord, error) {
if m.LoadAllHistoryFunc != nil {
return m.LoadAllHistoryFunc(ctx, limit)
}
return nil, nil
}
func (m *BaseMock) PruneCheckHistory(_ context.Context) error { return nil }
func (m *BaseMock) SaveStateChange(ctx context.Context, siteID int, from, to, reason string) error {
if m.SaveStateChangeFunc != nil {
return m.SaveStateChangeFunc(ctx, siteID, from, to, reason)
}
return nil
}
func (m *BaseMock) GetStateChanges(ctx context.Context, siteID, limit int) ([]models.StateChange, error) {
if m.GetStateChangesFunc != nil {
return m.GetStateChangesFunc(ctx, siteID, limit)
}
return nil, nil
}
func (m *BaseMock) GetStateChangesSince(ctx context.Context, siteID int, since time.Time) ([]models.StateChange, error) {
if m.GetStateChangesSinceFunc != nil {
return m.GetStateChangesSinceFunc(ctx, siteID, since)
}
return nil, nil
}
func (m *BaseMock) PruneStateChanges(_ context.Context) error { return nil }
func (m *BaseMock) RegisterNode(ctx context.Context, node models.ProbeNode) error {
if m.RegisterNodeFunc != nil {
return m.RegisterNodeFunc(ctx, node)
}
return nil
}
func (m *BaseMock) GetNode(ctx context.Context, id string) (models.ProbeNode, error) {
if m.GetNodeFunc != nil {
return m.GetNodeFunc(ctx, id)
}
return models.ProbeNode{}, nil
}
func (m *BaseMock) GetAllNodes(ctx context.Context) ([]models.ProbeNode, error) {
if m.GetAllNodesFunc != nil {
return m.GetAllNodesFunc(ctx)
}
return nil, nil
}
func (m *BaseMock) UpdateNodeLastSeen(_ context.Context, _ string) error { return nil }
func (m *BaseMock) DeleteNode(_ context.Context, _ string) error { return nil }
func (m *BaseMock) LoadAlertHealth(ctx context.Context) (map[int]models.AlertHealthRecord, error) {
if m.LoadAlertHealthFunc != nil {
return m.LoadAlertHealthFunc(ctx)
}
return nil, nil
}
func (m *BaseMock) SaveAlertHealth(ctx context.Context, h models.AlertHealthRecord) error {
if m.SaveAlertHealthFunc != nil {
return m.SaveAlertHealthFunc(ctx, h)
}
return nil
}
func (m *BaseMock) SaveLog(ctx context.Context, message string) error {
if m.SaveLogFunc != nil {
return m.SaveLogFunc(ctx, message)
}
return nil
}
func (m *BaseMock) LoadLogs(_ context.Context, _ int) ([]string, error) { return nil, nil }
func (m *BaseMock) PruneLogs(_ context.Context) error { return nil }
func (m *BaseMock) GetActiveMaintenanceWindows(ctx context.Context) ([]models.MaintenanceWindow, error) {
if m.GetActiveMaintenanceWindowsFunc != nil {
return m.GetActiveMaintenanceWindowsFunc(ctx)
}
return nil, nil
}
func (m *BaseMock) GetAllMaintenanceWindows(ctx context.Context, limit int) ([]models.MaintenanceWindow, error) {
if m.GetAllMaintenanceWindowsFunc != nil {
return m.GetAllMaintenanceWindowsFunc(ctx, limit)
}
return nil, nil
}
func (m *BaseMock) AddMaintenanceWindow(_ context.Context, _ models.MaintenanceWindow) error {
return nil
}
func (m *BaseMock) EndMaintenanceWindow(_ context.Context, _ int) error { return nil }
func (m *BaseMock) DeleteMaintenanceWindow(_ context.Context, _ int) error { return nil }
func (m *BaseMock) PruneExpiredMaintenanceWindows(_ context.Context, _ time.Duration) (int64, error) {
return 0, nil
}
func (m *BaseMock) IsMonitorInMaintenance(ctx context.Context, id int) (bool, error) {
if m.IsMonitorInMaintenanceFunc != nil {
return m.IsMonitorInMaintenanceFunc(ctx, id)
}
return false, nil
}
func (m *BaseMock) GetPreference(ctx context.Context, key string) (string, error) {
if m.GetPreferenceFunc != nil {
return m.GetPreferenceFunc(ctx, key)
}
return "", nil
}
func (m *BaseMock) SetPreference(ctx context.Context, key, value string) error {
if m.SetPreferenceFunc != nil {
return m.SetPreferenceFunc(ctx, key, value)
}
return nil
}
func (m *BaseMock) ExportData(ctx context.Context) (models.Backup, error) {
if m.ExportDataFunc != nil {
return m.ExportDataFunc(ctx)
}
return models.Backup{}, nil
}
func (m *BaseMock) ImportData(ctx context.Context, data models.Backup) error {
if m.ImportDataFunc != nil {
return m.ImportDataFunc(ctx, data)
}
return nil
}
-103
@@ -1,103 +0,0 @@
package tui
// braillePlane is a subpixel canvas where each terminal cell maps to a 2×4
// dot grid, rendered via Unicode braille (U+2800..U+28FF).
type braillePlane struct {
wCells, hCells int
wDots, hDots int
dots []bool
}
func newBraillePlane(wCells, hCells int) *braillePlane {
wd, hd := wCells*2, hCells*4
return &braillePlane{
wCells: wCells, hCells: hCells,
wDots: wd, hDots: hd,
dots: make([]bool, wd*hd),
}
}
func (p *braillePlane) set(dx, dy int) {
if dx < 0 || dy < 0 || dx >= p.wDots || dy >= p.hDots {
return
}
p.dots[dy*p.wDots+dx] = true
}
// line draws a Bresenham line between two dot coordinates.
func (p *braillePlane) line(x0, y0, x1, y1 int) {
dx := intAbs(x1 - x0)
sx := 1
if x0 >= x1 {
sx = -1
}
dy := -intAbs(y1 - y0)
sy := 1
if y0 >= y1 {
sy = -1
}
err := dx + dy
for {
p.set(x0, y0)
if x0 == x1 && y0 == y1 {
return
}
e2 := 2 * err
if e2 >= dy {
err += dy
x0 += sx
}
if e2 <= dx {
err += dx
y0 += sy
}
}
}
// fillBelow fills all dots below the topmost lit dot in each column,
// producing an area-chart effect.
func (p *braillePlane) fillBelow() {
for x := 0; x < p.wDots; x++ {
topY := -1
for y := 0; y < p.hDots; y++ {
if p.dots[y*p.wDots+x] {
topY = y
break
}
}
if topY >= 0 {
for y := topY + 1; y < p.hDots; y++ {
p.dots[y*p.wDots+x] = true
}
}
}
}
// cellMask builds the U+2800-relative bitmask for one terminal cell.
func (p *braillePlane) cellMask(cx, cy int) byte {
type bit struct {
dx, dy int
m byte
}
bits := [...]bit{
{0, 0, 0x01}, {0, 1, 0x02}, {0, 2, 0x04},
{1, 0, 0x08}, {1, 1, 0x10}, {1, 2, 0x20},
{0, 3, 0x40}, {1, 3, 0x80},
}
var mask byte
for _, b := range bits {
dx := cx*2 + b.dx
dy := cy*4 + b.dy
if dx >= 0 && dx < p.wDots && dy >= 0 && dy < p.hDots && p.dots[dy*p.wDots+dx] {
mask |= b.m
}
}
return mask
}
func intAbs(n int) int {
if n < 0 {
return -n
}
return n
}
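For readers unfamiliar with the braille encoding that `cellMask` targets, here is a small sketch of the bit layout and of turning a mask into a printable rune. `dotBit` and `brailleRune` are hypothetical helpers written for this note and do not appear in the removed file; the bit values are taken from the `bits` table above.

```go
package main

// dotBit returns the mask bit for the dot at column dx (0..1), row dy (0..3)
// of one terminal cell, following the table used by cellMask.
// Out-of-range coordinates yield 0.
func dotBit(dx, dy int) byte {
	if dx < 0 || dx > 1 || dy < 0 || dy > 3 {
		return 0
	}
	if dy == 3 {
		// The bottom row uses the two high bits: 0x40 (left), 0x80 (right).
		return byte(0x40) << uint(dx)
	}
	// Rows 0..2 use bits 0..2 for the left column and 3..5 for the right.
	return byte(1) << uint(dx*3+dy)
}

// brailleRune maps a U+2800-relative mask to its Unicode braille character.
func brailleRune(mask byte) rune {
	return rune(0x2800) + rune(mask)
}
```

With this layout an empty cell is U+2800 and a fully lit cell is U+28FF, which matches the 0x40 and 0xFF expectations in the removed `TestBraillePlane_CellMask`.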
-64
@@ -1,64 +0,0 @@
package tui
import "testing"
func TestBraillePlane_Set(t *testing.T) {
p := newBraillePlane(2, 1)
if p.wDots != 4 || p.hDots != 4 {
t.Fatalf("expected 4x4 dots, got %dx%d", p.wDots, p.hDots)
}
p.set(0, 0)
if !p.dots[0] {
t.Error("dot at (0,0) should be set")
}
p.set(-1, 0) // out of bounds, should not panic
p.set(0, 99) // out of bounds, should not panic
}
func TestBraillePlane_CellMask(t *testing.T) {
p := newBraillePlane(1, 1)
// Set bottom-left dot
p.set(0, 3)
mask := p.cellMask(0, 0)
if mask != 0x40 {
t.Errorf("bottom-left dot should be 0x40, got 0x%02x", mask)
}
// Set all dots
for y := 0; y < 4; y++ {
for x := 0; x < 2; x++ {
p.set(x, y)
}
}
mask = p.cellMask(0, 0)
if mask != 0xFF {
t.Errorf("all dots should be 0xFF, got 0x%02x", mask)
}
}
func TestBraillePlane_Line(t *testing.T) {
p := newBraillePlane(3, 1)
p.line(0, 2, 5, 2) // horizontal line
for x := 0; x <= 5; x++ {
if !p.dots[2*p.wDots+x] {
t.Errorf("dot at (%d, 2) should be set", x)
}
}
}
func TestBraillePlane_FillBelow(t *testing.T) {
p := newBraillePlane(1, 1)
p.set(0, 1) // set dot at row 1
p.fillBelow()
if !p.dots[1*p.wDots+0] {
t.Error("original dot should still be set")
}
if !p.dots[2*p.wDots+0] {
t.Error("row 2 should be filled")
}
if !p.dots[3*p.wDots+0] {
t.Error("row 3 should be filled")
}
if p.dots[0*p.wDots+0] {
t.Error("row 0 above the dot should not be filled")
}
}
+16 -1
@@ -104,10 +104,25 @@ func (m *Model) refreshLive() {
ordered = filterSites(ordered, m.filterText)
}
m.sites = ordered
-m.logViewport.SetContent(strings.Join(m.engine.GetLogs(), "\n"))
+m.refreshLogContent()
+if m.currentTab == 0 && m.selectedID != 0 {
+for i, s := range m.sites {
+if s.ID == m.selectedID {
+m.cursor = i
+break
+}
+}
+}
m.clampCursor()
}
+func (m *Model) syncSelectedID() {
+if m.currentTab == 0 && m.cursor < len(m.sites) {
+m.selectedID = m.sites[m.cursor].ID
+}
+}
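The idea behind this hunk is that the cursor follows a stable ID rather than a list index, so a re-sort on refresh does not move the selection to a different row. A minimal sketch of that lookup, assuming a hypothetical `row` type and a hypothetical `restoreCursor` helper that folds the ID search and the clamping into one function (the diff keeps clamping in `clampCursor`):

```go
package main

// row is a hypothetical stand-in for a displayed site.
type row struct {
	ID   int
	Name string
}

// restoreCursor returns the index of the row whose ID equals selectedID.
// If selectedID is 0 or no longer present, it falls back to the previous
// cursor, clamped to the valid range (0 for an empty list).
func restoreCursor(rows []row, selectedID, cursor int) int {
	if selectedID != 0 {
		for i, r := range rows {
			if r.ID == selectedID {
				return i
			}
		}
	}
	if cursor >= len(rows) {
		cursor = len(rows) - 1
	}
	if cursor < 0 {
		cursor = 0
	}
	return cursor
}
```

Treating 0 as "nothing selected" mirrors the `m.selectedID != 0` guard in the diff; that only works if real IDs are never 0, which is an assumption here.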
// clampCursor keeps the cursor and scroll offset within the current tab's list.
func (m *Model) clampCursor() {
listLen := m.currentListLen()
+13 -13
@@ -8,9 +8,9 @@ import (
func TestSortSitesForDisplay_GroupsFirst(t *testing.T) {
sites := []models.Site{
-{ID: 3, Name: "ungrouped", Type: "http", Status: "UP"},
-{ID: 1, Name: "group-a", Type: "group", Status: "UP"},
-{ID: 2, Name: "child", Type: "http", Status: "UP", ParentID: 1},
+{SiteConfig: models.SiteConfig{ID: 3, Name: "ungrouped", Type: "http"}, SiteState: models.SiteState{Status: "UP"}},
+{SiteConfig: models.SiteConfig{ID: 1, Name: "group-a", Type: "group"}, SiteState: models.SiteState{Status: "UP"}},
+{SiteConfig: models.SiteConfig{ID: 2, Name: "child", Type: "http", ParentID: 1}, SiteState: models.SiteState{Status: "UP"}},
}
result := sortSitesForDisplay(sites, nil)
if len(result) != 3 {
@@ -29,9 +29,9 @@ func TestSortSitesForDisplay_GroupsFirst(t *testing.T) {
func TestSortSitesForDisplay_CollapsedHidesChildren(t *testing.T) {
sites := []models.Site{
-{ID: 1, Name: "group-a", Type: "group", Status: "UP"},
-{ID: 2, Name: "child-1", Type: "http", Status: "UP", ParentID: 1},
-{ID: 3, Name: "child-2", Type: "http", Status: "UP", ParentID: 1},
+{SiteConfig: models.SiteConfig{ID: 1, Name: "group-a", Type: "group"}, SiteState: models.SiteState{Status: "UP"}},
+{SiteConfig: models.SiteConfig{ID: 2, Name: "child-1", Type: "http", ParentID: 1}, SiteState: models.SiteState{Status: "UP"}},
+{SiteConfig: models.SiteConfig{ID: 3, Name: "child-2", Type: "http", ParentID: 1}, SiteState: models.SiteState{Status: "UP"}},
}
collapsed := map[int]bool{1: true}
result := sortSitesForDisplay(sites, collapsed)
@@ -45,9 +45,9 @@ func TestSortSitesForDisplay_CollapsedHidesChildren(t *testing.T) {
func TestSortSitesForDisplay_StatusOrdering(t *testing.T) {
sites := []models.Site{
-{ID: 1, Name: "up-site", Type: "http", Status: "UP"},
-{ID: 2, Name: "down-site", Type: "http", Status: "DOWN"},
-{ID: 3, Name: "late-site", Type: "http", Status: "LATE"},
+{SiteConfig: models.SiteConfig{ID: 1, Name: "up-site", Type: "http"}, SiteState: models.SiteState{Status: "UP"}},
+{SiteConfig: models.SiteConfig{ID: 2, Name: "down-site", Type: "http"}, SiteState: models.SiteState{Status: "DOWN"}},
+{SiteConfig: models.SiteConfig{ID: 3, Name: "late-site", Type: "http"}, SiteState: models.SiteState{Status: "LATE"}},
}
result := sortSitesForDisplay(sites, nil)
if result[0].Status != "DOWN" {
@@ -63,9 +63,9 @@ func TestSortSitesForDisplay_StatusOrdering(t *testing.T) {
func TestFilterSites(t *testing.T) {
sites := []models.Site{
-{Name: "Production API"},
-{Name: "Staging API"},
-{Name: "Database"},
+{SiteConfig: models.SiteConfig{Name: "Production API"}},
+{SiteConfig: models.SiteConfig{Name: "Staging API"}},
+{SiteConfig: models.SiteConfig{Name: "Database"}},
}
tests := []struct {
@@ -87,7 +87,7 @@ func TestFilterSites(t *testing.T) {
}
func TestFilterSites_EmptyNeedle(t *testing.T) {
-sites := []models.Site{{Name: "a"}, {Name: "b"}}
+sites := []models.Site{{SiteConfig: models.SiteConfig{Name: "a"}}, {SiteConfig: models.SiteConfig{Name: "b"}}}
got := filterSites(sites, "")
if len(got) != 2 {
t.Errorf("empty needle should return all, got %d", len(got))
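The longer literals in these tests follow from Go's rules for embedded structs: promoted fields can be read directly (`result[0].Status`), but a composite literal has to name the embedded type. The sketch below uses hypothetical, reduced `SiteConfig`, `SiteState`, and `Site` types and a hypothetical `newSite` helper; the real `models` definitions are not shown in this diff and presumably carry more fields.

```go
package main

// SiteConfig holds user-editable settings (reduced, hypothetical field set).
type SiteConfig struct {
	ID       int
	Name     string
	Type     string
	ParentID int
}

// SiteState holds runtime state (reduced, hypothetical field set).
type SiteState struct {
	Status string
}

// Site embeds both, so their fields are promoted onto Site.
type Site struct {
	SiteConfig
	SiteState
}

// newSite shows the literal form required once fields live in embedded structs:
// Site{ID: id} would not compile, because ID is not a direct field of Site.
func newSite(id int, name, status string) Site {
	return Site{
		SiteConfig: SiteConfig{ID: id, Name: name, Type: "http"},
		SiteState:  SiteState{Status: status},
	}
}
```

A constructor like `newSite` is one way to keep table-driven tests short after such a split; whether the project wants one is a separate design choice.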
+12 -9
@@ -34,6 +34,9 @@ func (m Model) emptyState(message, hint string) string {
}
func limitStr(text string, max int) string {
+if max < 3 {
+return text
+}
runes := []rune(text)
if len(runes) > max {
return string(runes[:max-3]) + "..."
@@ -143,16 +146,16 @@ func (m Model) fmtRetries(site models.Site) string {
dispCount = site.MaxRetries
}
s := fmt.Sprintf("%d/%d", dispCount, site.MaxRetries)
-if site.Status == "DOWN" {
+if site.Status == models.StatusDown {
return m.st.dangerStyle.Render(s)
}
-if site.Status == "UP" && site.FailureCount > 0 {
+if site.Status == models.StatusUp && site.FailureCount > 0 {
return m.st.warnStyle.Render(s)
}
return s
}
-func (m Model) fmtStatus(status string, paused bool, inMaint bool) string {
+func (m Model) fmtStatus(status models.Status, paused bool, inMaint bool) string {
if paused {
return m.st.warnStyle.Render("◇ PAUSED")
}
@@ -160,18 +163,18 @@ func (m Model) fmtStatus(status string, paused bool, inMaint bool) string {
return m.st.maintStyle.Render("◼ MAINT")
}
switch status {
-case "DOWN":
+case models.StatusDown:
return m.st.dangerStyle.Render("▼ DOWN")
-case "SSL EXP":
+case models.StatusSSLExp:
return m.st.dangerStyle.Render("▼ SSL EXP")
-case "LATE":
+case models.StatusLate:
return m.st.warnStyle.Render("◆ LATE")
-case "STALE":
+case models.StatusStale:
return m.st.staleStyle.Render("◆ STALE")
-case "PENDING":
+case models.StatusPending:
return m.st.subtleStyle.Render("○ PENDING")
default:
-return m.st.specialStyle.Render("▲ " + status)
+return m.st.specialStyle.Render("▲ " + string(status))
}
}
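The switch from string literals to `models.Status` constants can be illustrated without the styling layer. This sketch assumes the constant values match the literals they replace in the diff ("DOWN", "SSL EXP", and so on); the `Status` type and `statusLabel` function here are hypothetical local copies, not the project's definitions.

```go
package main

// Status is a hypothetical local copy of a typed status string.
type Status string

// Values are assumed from the string literals replaced in the diff.
const (
	StatusUp      Status = "UP"
	StatusDown    Status = "DOWN"
	StatusSSLExp  Status = "SSL EXP"
	StatusLate    Status = "LATE"
	StatusStale   Status = "STALE"
	StatusPending Status = "PENDING"
)

// statusLabel reproduces the precedence in fmtStatus without colours:
// paused wins over maintenance, which wins over the check status.
func statusLabel(status Status, paused, inMaint bool) string {
	if paused {
		return "◇ PAUSED"
	}
	if inMaint {
		return "◼ MAINT"
	}
	switch status {
	case StatusDown:
		return "▼ DOWN"
	case StatusSSLExp:
		return "▼ SSL EXP"
	case StatusLate:
		return "◆ LATE"
	case StatusStale:
		return "◆ STALE"
	case StatusPending:
		return "○ PENDING"
	default:
		// A typed string needs an explicit conversion before concatenation.
		return "▲ " + string(status)
	}
}
```

The typed constant catches misspelled statuses at compile time when a named constant is used, though untyped string literals such as `Status: "UP"` in the tests still compile.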
+16 -16
@@ -38,13 +38,13 @@ func TestSiteOrder(t *testing.T) {
site models.Site
want int
}{
-{"down", models.Site{Status: "DOWN"}, 0},
-{"ssl exp", models.Site{Status: "SSL EXP"}, 0},
-{"late", models.Site{Status: "LATE"}, 1},
-{"up", models.Site{Status: "UP"}, 2},
-{"pending", models.Site{Status: "PENDING"}, 3},
-{"paused up", models.Site{Status: "UP", Paused: true}, 3},
-{"paused down", models.Site{Status: "DOWN", Paused: true}, 3},
+{"down", models.Site{SiteState: models.SiteState{Status: "DOWN"}}, 0},
+{"ssl exp", models.Site{SiteState: models.SiteState{Status: "SSL EXP"}}, 0},
+{"late", models.Site{SiteState: models.SiteState{Status: "LATE"}}, 1},
+{"up", models.Site{SiteState: models.SiteState{Status: "UP"}}, 2},
+{"pending", models.Site{SiteState: models.SiteState{Status: "PENDING"}}, 3},
+{"paused up", models.Site{SiteConfig: models.SiteConfig{Paused: true}, SiteState: models.SiteState{Status: "UP"}}, 3},
+{"paused down", models.Site{SiteConfig: models.SiteConfig{Paused: true}, SiteState: models.SiteState{Status: "DOWN"}}, 3},
}
for _, tt := range tests {
got := siteOrder(tt.site)
@@ -56,19 +56,19 @@ func TestSiteOrder(t *testing.T) {
func TestFmtStatus(t *testing.T) {
tests := []struct {
-status string
+status models.Status
paused bool
inMaint bool
wantSub string
}{
-{"DOWN", false, false, "▼ DOWN"},
-{"UP", false, false, "▲ UP"},
-{"SSL EXP", false, false, "▼ SSL EXP"},
-{"LATE", false, false, "◆ LATE"},
-{"STALE", false, false, "◆ STALE"},
-{"PENDING", false, false, "○ PENDING"},
-{"DOWN", true, false, "◇ PAUSED"},
-{"DOWN", false, true, "◼ MAINT"},
+{models.StatusDown, false, false, "▼ DOWN"},
+{models.StatusUp, false, false, "▲ UP"},
+{models.StatusSSLExp, false, false, "▼ SSL EXP"},
+{models.StatusLate, false, false, "◆ LATE"},
+{models.StatusStale, false, false, "◆ STALE"},
+{models.StatusPending, false, false, "○ PENDING"},
+{models.StatusDown, true, false, "◇ PAUSED"},
+{models.StatusDown, false, true, "◼ MAINT"},
}
for _, tt := range tests {
got := styledModel.fmtStatus(tt.status, tt.paused, tt.inMaint)
+3 -9
@@ -75,10 +75,7 @@ func fmtAlertType(t string) string {
}
}
-func (m Model) fmtAlertConfig(alert struct {
-Type string
-Settings map[string]string
-}) string {
+func (m Model) fmtAlertConfig(alert models.AlertConfig) string {
switch alert.Type {
case "email":
host := alert.Settings["host"]
@@ -201,10 +198,7 @@ func (m Model) viewAlertsTab() string {
m.fmtAlertHealth(h),
m.zones.Mark(fmt.Sprintf("alert-%d", i), limitStr(a.Name, nameW-2)),
fmtAlertType(a.Type),
-limitStr(m.fmtAlertConfig(struct {
-Type string
-Settings map[string]string
-}{a.Type, a.Settings}), cfgW-2),
+limitStr(m.fmtAlertConfig(a), cfgW-2),
m.fmtAlertLastSent(h),
})
}
@@ -271,7 +265,7 @@ func (m Model) viewAlertDetailPanel() string {
}
b.WriteString(m.divider() + "\n")
-b.WriteString(m.st.subtleStyle.Render(" [i/Esc] Back [e] Edit [t] Test [q] Quit"))
+b.WriteString(m.st.subtleStyle.Render(" [q/Esc] Back [e] Edit [t] Test"))
return lipgloss.NewStyle().Padding(1, 2).Render(b.String())
}
} }
+2 -8
@@ -44,10 +44,7 @@ func TestAlertDetailPanel_MasksSecretsStableOrder(t *testing.T) {
func TestFmtAlertConfig_MasksSecrets(t *testing.T) {
m := newTestModel(&tuiMockStore{})
-webhook := m.fmtAlertConfig(struct {
-Type string
-Settings map[string]string
-}{"discord", map[string]string{"url": "https://discord.com/api/webhooks/123456/SeCrEtToKeN"}})
+webhook := m.fmtAlertConfig(models.AlertConfig{Type: "discord", Settings: map[string]string{"url": "https://discord.com/api/webhooks/123456/SeCrEtToKeN"}})
if strings.Contains(webhook, "SeCrEtToKeN") || strings.Contains(webhook, "123456") {
t.Errorf("webhook URL path (the credential) rendered in table: %q", webhook)
}
@@ -55,10 +52,7 @@ func TestFmtAlertConfig_MasksSecrets(t *testing.T) {
t.Errorf("webhook host missing from table config: %q", webhook)
}
-pd := m.fmtAlertConfig(struct {
-Type string
-Settings map[string]string
-}{"pagerduty", map[string]string{"routing_key": "R0123456789ABCDEFGHIJ"}})
+pd := m.fmtAlertConfig(models.AlertConfig{Type: "pagerduty", Settings: map[string]string{"routing_key": "R0123456789ABCDEFGHIJ"}})
if strings.Contains(pd, "R0123456789ABCDEFGHIJ") {
t.Errorf("pagerduty routing key rendered raw in table: %q", pd)
}
+18 -12
@@ -82,18 +82,15 @@ func (m Model) renderLogLine(line string) string {
return fmt.Sprintf(" %s %s", tag, msg)
}
-func (m Model) viewLogsTab() string {
-content := m.logViewport.View()
-if strings.TrimSpace(content) == "" || content == "Waiting for logs..." {
-return m.emptyState("No log entries yet.", "Logs appear as monitors run checks")
-}
-lines := strings.Split(content, "\n")
+// refreshLogContent rebuilds the log viewport from the full engine log list,
+// filtering before windowing so the entry count and "(n hidden)" reflect all
+// logs, not just the visible viewport slice.
+func (m *Model) refreshLogContent() {
var rendered []string
total := 0
shown := 0
-for _, line := range lines {
+for _, line := range m.engine.GetLogs() {
if strings.TrimSpace(line) == "" {
continue
}
@@ -106,18 +103,27 @@ func (m Model) viewLogsTab() string {
rendered = append(rendered, m.renderLogLine(line))
}
+m.logTotal = total
+m.logShown = shown
+m.logViewport.SetContent(strings.Join(rendered, "\n"))
+}
+func (m Model) viewLogsTab() string {
+if m.logTotal == 0 {
+return m.emptyState("No log entries yet.", "Logs appear as monitors run checks")
+}
filterLabel := "All"
if m.logFilterImportant {
filterLabel = "Important"
}
header := m.st.subtleStyle.Render(fmt.Sprintf(
-" %d entries Filter: %s", shown, filterLabel))
+" %d entries Filter: %s", m.logShown, filterLabel))
-if m.logFilterImportant && shown < total {
+if m.logFilterImportant && m.logShown < m.logTotal {
-header += m.st.subtleStyle.Render(fmt.Sprintf(" (%d hidden)", total-shown))
+header += m.st.subtleStyle.Render(fmt.Sprintf(" (%d hidden)", m.logTotal-m.logShown))
}
-m.logViewport.SetContent(strings.Join(rendered, "\n"))
return "\n" + header + "\n\n" + m.logViewport.View()
}
+168 -146
@@ -240,7 +240,7 @@ func (m Model) viewSitesTab() string {
name = limitStr(name, nameW-2)
}
-if (site.Status == "DOWN" || site.Status == "SSL EXP" || site.Status == "LATE" || site.Status == "STALE") && site.LastError != "" {
+if (site.Status == models.StatusDown || site.Status == models.StatusSSLExp || site.Status == models.StatusLate || site.Status == models.StatusStale) && site.LastError != "" {
nameLen := len([]rune(name))
errSpace := nameW - nameLen - 3
if errSpace > 10 {
@@ -326,101 +326,104 @@ func (m *Model) initSiteHuhForm() tea.Cmd {
} }
} }
// m.alerts is the tab-data cache (≤5s stale) — no store IO in Update. return m.rebuildSiteForm()
alertOpts := []huh.Option[string]{huh.NewOption("None", "0")} }
func (m *Model) rebuildSiteForm() tea.Cmd {
groups := m.buildSiteFormGroups()
m.huhForm = huh.NewForm(groups...).WithTheme(m.theme.HuhTheme())
if m.termWidth > 0 {
m.huhForm.WithWidth(m.termWidth)
}
formHeight := m.termHeight - 7
if formHeight < 5 {
formHeight = 5
}
m.huhForm.WithHeight(formHeight)
m.lastSiteType = m.siteFormData.SiteType
return m.huhForm.Init()
}
func (m *Model) siteFormOptions() (alertOpts, groupOpts []huh.Option[string]) {
alertOpts = []huh.Option[string]{huh.NewOption("None", "0")}
for _, a := range m.alerts { for _, a := range m.alerts {
alertOpts = append(alertOpts, huh.NewOption( alertOpts = append(alertOpts, huh.NewOption(
fmt.Sprintf("%s (%s)", a.Name, a.Type), fmt.Sprintf("%s (%s)", a.Name, a.Type),
strconv.Itoa(a.ID), strconv.Itoa(a.ID),
)) ))
} }
groupOpts = []huh.Option[string]{huh.NewOption("None", "0")}
groupOpts := []huh.Option[string]{huh.NewOption("None", "0")}
for _, s := range m.sites { for _, s := range m.sites {
if s.Type == "group" && s.ID != m.editID { if s.Type == "group" && s.ID != m.editID {
groupOpts = append(groupOpts, huh.NewOption(s.Name, strconv.Itoa(s.ID))) groupOpts = append(groupOpts, huh.NewOption(s.Name, strconv.Itoa(s.ID)))
} }
} }
return
}
m.huhForm = huh.NewForm( func (m *Model) buildSiteFormGroups() []*huh.Group {
huh.NewGroup( d := m.siteFormData
huh.NewInput().Title("Monitor Name"). alertOpts, groupOpts := m.siteFormOptions()
Placeholder("My Service").
Value(&m.siteFormData.Name). // Page 1 — Monitor Setup: core fields + type-specific target
Validate(func(s string) error { setup := []huh.Field{
if s == "" { huh.NewInput().Title("Monitor Name").
return fmt.Errorf("name is required") Placeholder("My Service").
} Value(&d.Name).
return nil Validate(func(s string) error {
}), if s == "" {
huh.NewSelect[string]().Title("Monitor Type"). return fmt.Errorf("name is required")
Options( }
huh.NewOption("HTTP/HTTPS", "http"), return nil
huh.NewOption("Push / Heartbeat", "push"), }),
huh.NewOption("Ping (ICMP)", "ping"), huh.NewSelect[string]().Title("Monitor Type").
huh.NewOption("TCP Port", "port"), Options(
huh.NewOption("DNS", "dns"), huh.NewOption("HTTP/HTTPS", "http"),
huh.NewOption("Group", "group"), huh.NewOption("Push / Heartbeat", "push"),
).Value(&m.siteFormData.SiteType), huh.NewOption("Ping (ICMP)", "ping"),
huh.NewSelect[string]().Title("Alert Channel"). huh.NewOption("TCP Port", "port"),
Options(alertOpts...). huh.NewOption("DNS", "dns"),
Value(&m.siteFormData.AlertID), huh.NewOption("Group", "group"),
).Title("Monitor Settings"), ).Value(&d.SiteType),
huh.NewGroup( huh.NewSelect[string]().Title("Alert Channel").
huh.NewInput().Title("URL"). Options(alertOpts...).
Placeholder("https://example.com"). Value(&d.AlertID),
Description("Required for HTTP monitors"). }
Value(&m.siteFormData.URL).
Validate(func(s string) error { switch d.SiteType {
if m.siteFormData.SiteType != "http" { case "http":
return nil setup = append(setup, huh.NewInput().Title("URL").
} Placeholder("https://example.com").
if s == "" { Value(&d.URL).
return fmt.Errorf("URL is required for HTTP monitors") Validate(func(s string) error {
} if s == "" {
u, err := url.Parse(s) return fmt.Errorf("URL is required")
if err != nil { }
return fmt.Errorf("invalid URL") u, err := url.Parse(s)
} if err != nil {
if u.Scheme != "http" && u.Scheme != "https" { return fmt.Errorf("invalid URL")
return fmt.Errorf("URL must start with http:// or https://") }
} if u.Scheme != "http" && u.Scheme != "https" {
if u.Host == "" { return fmt.Errorf("URL must start with http:// or https://")
return fmt.Errorf("URL must include a host") }
} if u.Host == "" {
return nil return fmt.Errorf("URL must include a host")
}), }
huh.NewInput().Title("Check Interval (seconds)"). return nil
Placeholder("60"). }))
Value(&m.siteFormData.Interval). case "ping", "dns":
Validate(func(s string) error { setup = append(setup, huh.NewInput().Title("Hostname / IP").
if m.siteFormData.SiteType == "group" { Placeholder("10.0.0.1").
return nil Value(&d.Hostname))
} case "port":
v, err := strconv.Atoi(s) setup = append(setup,
if err != nil {
return fmt.Errorf("must be a number")
}
if v < 5 {
return fmt.Errorf("minimum interval is 5 seconds")
}
return nil
}),
huh.NewSelect[string]().Title("Parent Group").
Options(groupOpts...).
Value(&m.siteFormData.GroupID),
huh.NewInput().Title("Hostname / IP"). huh.NewInput().Title("Hostname / IP").
Placeholder("10.0.0.1"). Placeholder("10.0.0.1").
Description("Target for ping/port/DNS monitors"). Value(&d.Hostname),
Value(&m.siteFormData.Hostname),
huh.NewInput().Title("Port"). huh.NewInput().Title("Port").
Placeholder("0"). Placeholder("443").
Description("Target port for TCP port monitors"). Value(&d.Port).
Value(&m.siteFormData.Port).
Validate(func(s string) error { Validate(func(s string) error {
if m.siteFormData.SiteType != "port" {
return nil
}
v, err := strconv.Atoi(s) v, err := strconv.Atoi(s)
if err != nil { if err != nil {
return fmt.Errorf("must be a number") return fmt.Errorf("must be a number")
@@ -429,34 +432,20 @@ func (m *Model) initSiteHuhForm() tea.Cmd {
return fmt.Errorf("port must be 1-65535") return fmt.Errorf("port must be 1-65535")
} }
return nil return nil
}), }))
huh.NewInput().Title("Timeout (seconds)"). }
Placeholder("5").
Value(&m.siteFormData.Timeout). groups := []*huh.Group{huh.NewGroup(setup...).Title("Monitor Setup")}
Validate(func(s string) error {
if m.siteFormData.SiteType == "group" { if d.SiteType == "group" {
return nil return groups
} }
v, err := strconv.Atoi(s)
if err != nil { // Page 2 — Configuration: type-specific options + shared defaults
return fmt.Errorf("must be a number") var config []huh.Field
}
if v < 1 || v > 300 { if d.SiteType == "http" {
return fmt.Errorf("timeout must be 1-300 seconds") config = append(config,
}
return nil
}),
huh.NewInput().Title("Description").
Placeholder("Optional description").
Value(&m.siteFormData.Description),
huh.NewInput().Title("Probe Regions").
Placeholder("us-east, eu-west (empty = all)").
Description("Comma-separated regions for distributed probing").
Value(&m.siteFormData.Regions),
).Title("Connection").WithHideFunc(func() bool {
return m.siteFormData.SiteType == "group"
}),
huh.NewGroup(
huh.NewSelect[string]().Title("HTTP Method"). huh.NewSelect[string]().Title("HTTP Method").
Options( Options(
huh.NewOption("GET", "GET"), huh.NewOption("GET", "GET"),
@@ -466,22 +455,75 @@ func (m *Model) initSiteHuhForm() tea.Cmd {
huh.NewOption("DELETE", "DELETE"), huh.NewOption("DELETE", "DELETE"),
huh.NewOption("HEAD", "HEAD"), huh.NewOption("HEAD", "HEAD"),
huh.NewOption("OPTIONS", "OPTIONS"), huh.NewOption("OPTIONS", "OPTIONS"),
).Value(&m.siteFormData.Method), ).Value(&d.Method),
huh.NewInput().Title("Accepted Status Codes"). huh.NewInput().Title("Accepted Status Codes").
Placeholder("200-299"). Placeholder("200-299").
Description("Ranges (200-299) and singles (301) separated by commas"). Description("Ranges (200-299) and singles (301) separated by commas").
Value(&m.siteFormData.AcceptedCodes), Value(&d.AcceptedCodes),
).Title("HTTP Settings").WithHideFunc(func() bool { )
return m.siteFormData.SiteType != "http" }
}),
huh.NewGroup( config = append(config,
huh.NewInput().Title("Check Interval (seconds)").
Placeholder("60").
Value(&d.Interval).
Validate(func(s string) error {
v, err := strconv.Atoi(s)
if err != nil {
return fmt.Errorf("must be a number")
}
if v < 5 {
return fmt.Errorf("minimum interval is 5 seconds")
}
return nil
}),
huh.NewInput().Title("Timeout (seconds)").
Placeholder("5").
Value(&d.Timeout).
Validate(func(s string) error {
v, err := strconv.Atoi(s)
if err != nil {
return fmt.Errorf("must be a number")
}
if v < 1 || v > 300 {
return fmt.Errorf("timeout must be 1-300 seconds")
}
return nil
}),
huh.NewInput().Title("Max Retries Before Alert").
Placeholder("0").
Value(&d.Retries).
Validate(func(s string) error {
v, err := strconv.Atoi(s)
if err != nil {
return fmt.Errorf("must be a number")
}
if v < 0 {
return fmt.Errorf("retries cannot be negative")
}
return nil
}),
huh.NewSelect[string]().Title("Parent Group").
Options(groupOpts...).
Value(&d.GroupID),
huh.NewInput().Title("Description").
Placeholder("Optional description").
Value(&d.Description),
huh.NewInput().Title("Probe Regions").
Placeholder("us-east, eu-west (empty = all)").
Description("Comma-separated regions for distributed probing").
Value(&d.Regions),
)
if d.SiteType == "http" {
config = append(config,
huh.NewConfirm().Title("Monitor SSL Certificate?"). huh.NewConfirm().Title("Monitor SSL Certificate?").
Value(&m.siteFormData.CheckSSL), Value(&d.CheckSSL),
huh.NewInput().Title("SSL Warning Threshold (days)"). huh.NewInput().Title("SSL Warning Threshold (days)").
Placeholder("7"). Placeholder("7").
Value(&m.siteFormData.Threshold). Value(&d.Threshold).
Validate(func(s string) error { Validate(func(s string) error {
if !m.siteFormData.CheckSSL { if !d.CheckSSL {
return nil return nil
} }
v, err := strconv.Atoi(s) v, err := strconv.Atoi(s)
@@ -493,30 +535,13 @@ func (m *Model) initSiteHuhForm() tea.Cmd {
} }
return nil return nil
}), }),
huh.NewInput().Title("Max Retries Before Alert").
Placeholder("0").
Value(&m.siteFormData.Retries).
Validate(func(s string) error {
if m.siteFormData.SiteType == "group" {
return nil
}
v, err := strconv.Atoi(s)
if err != nil {
return fmt.Errorf("must be a number")
}
if v < 0 {
return fmt.Errorf("retries cannot be negative")
}
return nil
}),
huh.NewConfirm().Title("Ignore TLS Errors?"). huh.NewConfirm().Title("Ignore TLS Errors?").
Value(&m.siteFormData.IgnoreTLS), Value(&d.IgnoreTLS),
).Title("Advanced").WithHideFunc(func() bool { )
return m.siteFormData.SiteType == "group" }
}),
).WithTheme(m.theme.HuhTheme())
return m.huhForm.Init() groups = append(groups, huh.NewGroup(config...).Title("Configuration"))
return groups
} }
func (m *Model) submitSiteForm() tea.Cmd { func (m *Model) submitSiteForm() tea.Cmd {
@@ -535,7 +560,7 @@ func (m *Model) submitSiteForm() tea.Cmd {
threshold = 7 threshold = 7
} }
site := models.Site{ cfg := models.SiteConfig{
ID: m.editID, ID: m.editID,
Name: d.Name, Name: d.Name,
URL: d.URL, URL: d.URL,
@@ -559,11 +584,8 @@ func (m *Model) submitSiteForm() tea.Cmd {
st := m.store st := m.store
m.state = stateDashboard m.state = stateDashboard
if m.editID > 0 { if m.editID > 0 {
// The engine's in-memory config updates immediately; the DB write m.engine.UpdateSiteConfig(cfg)
// follows in the Cmd. New sites enter the engine via its poll loop return writeCmd("Update site", func() error { return st.UpdateSite(context.Background(), cfg) })
// once the insert lands.
m.engine.UpdateSiteConfig(site)
return writeCmd("Update site", func() error { return st.UpdateSite(context.Background(), site) })
} }
return writeCmd("Add site", func() error { return st.AddSite(context.Background(), site) }) return writeCmd("Add site", func() error { return st.AddSite(context.Background(), cfg) })
} }
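Note on the form restructure above: the new code builds the field list per monitor type instead of declaring every field and hiding whole groups. The sketch below shows that shape with plain strings standing in for `huh.Field` values. `setupFields` and `pageCount` are hypothetical names used only for illustration; they are not functions from this change.

```go
package main

import "fmt"

// setupFields returns the page-1 field titles for a monitor type.
// Hypothetical helper: it mirrors the switch in the diff but uses
// plain strings in place of huh.Field values.
func setupFields(siteType string) []string {
	fields := []string{"Monitor Name", "Monitor Type", "Alert Channel"}
	switch siteType {
	case "http":
		fields = append(fields, "URL")
	case "ping", "dns":
		fields = append(fields, "Hostname / IP")
	case "port":
		fields = append(fields, "Hostname / IP", "Port")
	}
	return fields
}

// pageCount reports how many pages the form has: groups stop after
// the setup page, every other type also gets a configuration page.
func pageCount(siteType string) int {
	if siteType == "group" {
		return 1
	}
	return 2
}

func main() {
	for _, t := range []string{"http", "port", "push", "group"} {
		fmt.Println(t, setupFields(t), pageCount(t))
	}
}
```

Because only the fields for the selected type are ever built, the per-field `SiteType` guards that the old validators carried are no longer needed.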
+6
@@ -80,6 +80,8 @@ const (
chromeFooter = 2 // footer: "\n" prefix + text line chromeFooter = 2 // footer: "\n" prefix + text line
chromeTable = 3 // renderTable "\n" prefix + top border + header + bottom border (lipgloss collapses two into three rendered lines) chromeTable = 3 // renderTable "\n" prefix + top border + header + bottom border (lipgloss collapses two into three rendered lines)
chromeBase = chromePadV + chromeHeader + chromeGaps + chromeFooter + chromeTable chromeBase = chromePadV + chromeHeader + chromeGaps + chromeFooter + chromeTable
detailSparkWidth = 40
) )
type sessionState int type sessionState int
@@ -103,6 +105,7 @@ type Model struct {
state sessionState state sessionState
currentTab int currentTab int
cursor int cursor int
selectedID int
tableOffset int tableOffset int
maxTableRows int maxTableRows int
termWidth int termWidth int
@@ -112,12 +115,15 @@ type Model struct {
huhForm *huh.Form huhForm *huh.Form
siteFormData *siteFormData siteFormData *siteFormData
lastSiteType string
alertFormData *alertFormData alertFormData *alertFormData
userFormData *userFormData userFormData *userFormData
maintFormData *maintFormData maintFormData *maintFormData
logViewport viewport.Model logViewport viewport.Model
logFilterImportant bool logFilterImportant bool
logTotal int
logShown int
historyViewport viewport.Model historyViewport viewport.Model
historyChanges []models.StateChange historyChanges []models.StateChange
+41 -31
@@ -110,10 +110,6 @@ func (m *Model) handleConfirmDelete(msg tea.Msg) (tea.Model, tea.Cmd) {
} }
func (m *Model) handleFormMsg(msg tea.Msg) (tea.Model, tea.Cmd) { func (m *Model) handleFormMsg(msg tea.Msg) (tea.Model, tea.Cmd) {
if wsm, ok := msg.(tea.WindowSizeMsg); ok {
m.termWidth = wsm.Width
m.termHeight = wsm.Height
}
if keyMsg, ok := msg.(tea.KeyMsg); ok { if keyMsg, ok := msg.(tea.KeyMsg); ok {
if keyMsg.String() == "ctrl+c" { if keyMsg.String() == "ctrl+c" {
return m, tea.Quit return m, tea.Quit
@@ -132,6 +128,13 @@ func (m *Model) handleFormMsg(msg tea.Msg) (tea.Model, tea.Cmd) {
if f, ok := form.(*huh.Form); ok { if f, ok := form.(*huh.Form); ok {
m.huhForm = f m.huhForm = f
} }
if m.state == stateFormSite && m.siteFormData != nil &&
m.siteFormData.SiteType != m.lastSiteType {
rebuildCmd := m.rebuildSiteForm()
// Advance to Type select — user just changed it.
skipName := m.huhForm.NextField()
return m, tea.Batch(rebuildCmd, skipName)
}
if m.huhForm.State == huh.StateCompleted { if m.huhForm.State == huh.StateCompleted {
// The store write runs in the returned Cmd; its writeDoneMsg // The store write runs in the returned Cmd; its writeDoneMsg
// triggers the tab-data reload once the row actually exists. // triggers the tab-data reload once the row actually exists.
@@ -145,24 +148,35 @@ func (m *Model) handleFormMsg(msg tea.Msg) (tea.Model, tea.Cmd) {
return m, nil return m, nil
} }
func (m *Model) handleResize(msg tea.WindowSizeMsg) (tea.Model, tea.Cmd) { func (m *Model) recalcLayout() {
m.termWidth = msg.Width
m.termHeight = msg.Height
chrome := chromeBase chrome := chromeBase
if m.filterMode || m.filterText != "" { if m.filterMode || m.filterText != "" {
chrome++ chrome++
} }
m.maxTableRows = msg.Height - chrome m.maxTableRows = m.termHeight - chrome
if m.maxTableRows < 1 { if m.maxTableRows < 1 {
m.maxTableRows = 1 m.maxTableRows = 1
} }
}
func (m *Model) handleResize(msg tea.WindowSizeMsg) (tea.Model, tea.Cmd) {
m.termWidth = msg.Width
m.termHeight = msg.Height
m.recalcLayout()
m.logViewport.Width = msg.Width - chromePadH m.logViewport.Width = msg.Width - chromePadH
m.logViewport.Height = msg.Height - (chromePadV + chromeHeader + chromeFooter + 2) m.logViewport.Height = msg.Height - (chromePadV + chromeHeader + chromeFooter + 2)
m.historyViewport.Width = msg.Width - chromePadH m.historyViewport.Width = msg.Width - chromePadH
m.historyViewport.Height = msg.Height - 10 m.historyViewport.Height = msg.Height - 10
m.slaViewport.Width = msg.Width - chromePadH m.slaViewport.Width = msg.Width - chromePadH
m.slaViewport.Height = msg.Height - 16 m.slaViewport.Height = msg.Height - 16
return m, tea.ClearScreen if m.huhForm != nil {
formHeight := msg.Height - 7
if formHeight < 5 {
formHeight = 5
}
m.huhForm.WithHeight(formHeight)
}
return m, nil
} }
func (m *Model) handleTick(t time.Time) (tea.Model, tea.Cmd) { func (m *Model) handleTick(t time.Time) (tea.Model, tea.Cmd) {
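Note on `recalcLayout` above: the row budget is the terminal height minus the fixed chrome, with one extra line reserved while the filter bar is visible, clamped to at least one row. A minimal standalone sketch of that arithmetic follows. `maxTableRows` is a hypothetical free function, and the chrome value passed in is an arbitrary example because the full constant set is not visible in this diff.

```go
package main

import "fmt"

// maxTableRows computes how many table rows fit on screen.
// chromeBase is the number of lines taken by header, footer, padding
// and table borders; the filter bar costs one more line when shown.
func maxTableRows(termHeight, chromeBase int, filterVisible bool) int {
	chrome := chromeBase
	if filterVisible {
		chrome++
	}
	rows := termHeight - chrome
	if rows < 1 {
		rows = 1 // never report zero or negative rows on tiny terminals
	}
	return rows
}

func main() {
	fmt.Println(maxTableRows(40, 10, false))
	fmt.Println(maxTableRows(40, 10, true))
	fmt.Println(maxTableRows(5, 10, false))
}
```

Pulling the calculation out of the resize handler lets the filter key handlers call it too, so the row count changes as soon as the filter bar appears or disappears.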
@@ -286,6 +300,7 @@ func (m *Model) handleMouse(msg tea.MouseMsg) (tea.Model, tea.Cmd) {
} }
} }
} }
m.syncSelectedID()
return m, nil return m, nil
} }
@@ -323,9 +338,11 @@ func (m *Model) handleFilterKey(msg tea.KeyMsg) (tea.Model, tea.Cmd) {
m.filterText = "" m.filterText = ""
m.cursor = 0 m.cursor = 0
m.tableOffset = 0 m.tableOffset = 0
m.recalcLayout()
m.refreshLive() m.refreshLive()
case "enter": case "enter":
m.filterMode = false m.filterMode = false
m.recalcLayout()
case "backspace": case "backspace":
if len(m.filterText) > 0 { if len(m.filterText) > 0 {
m.filterText = m.filterText[:len(m.filterText)-1] m.filterText = m.filterText[:len(m.filterText)-1]
@@ -336,8 +353,8 @@ func (m *Model) handleFilterKey(msg tea.KeyMsg) (tea.Model, tea.Cmd) {
case "ctrl+c": case "ctrl+c":
return m, tea.Quit return m, tea.Quit
default: default:
if len(msg.String()) == 1 { if len(msg.Runes) == 1 {
m.filterText += msg.String() m.filterText += string(msg.Runes)
m.cursor = 0 m.cursor = 0
m.tableOffset = 0 m.tableOffset = 0
m.refreshLive() m.refreshLive()
@@ -379,7 +396,7 @@ func (m *Model) handleDetailKey(msg tea.KeyMsg) (tea.Model, tea.Cmd) {
return m, m.openSLAView(m.sites[m.cursor]) return m, m.openSLAView(m.sites[m.cursor])
} }
case "q": case "q":
return m, tea.Quit m.state = stateDashboard
} }
return m, nil return m, nil
} }
@@ -391,16 +408,14 @@ func (m *Model) handleSparklineClick(msg tea.MouseMsg) (tea.Model, tea.Cmd) {
site := m.sites[m.cursor] site := m.sites[m.cursor]
hist, _ := m.engine.GetHistory(site.ID) hist, _ := m.engine.GetHistory(site.ID)
const sparkWidth = 40
if zi := m.zones.Get("spark-latency"); zi != nil && !zi.IsZero() && zi.InBounds(msg) { if zi := m.zones.Get("spark-latency"); zi != nil && !zi.IsZero() && zi.InBounds(msg) {
x, _ := zi.Pos(msg) x, _ := zi.Pos(msg)
m.sparkTooltipIdx = resolveSparklineIndex(x, sparkWidth, len(hist.Latencies)) m.sparkTooltipIdx = resolveSparklineIndex(x, detailSparkWidth, len(hist.Latencies))
return m, nil return m, nil
} }
if zi := m.zones.Get("spark-heartbeat"); zi != nil && !zi.IsZero() && zi.InBounds(msg) { if zi := m.zones.Get("spark-heartbeat"); zi != nil && !zi.IsZero() && zi.InBounds(msg) {
x, _ := zi.Pos(msg) x, _ := zi.Pos(msg)
m.sparkTooltipIdx = resolveSparklineIndex(x, sparkWidth, len(hist.Statuses)) m.sparkTooltipIdx = resolveSparklineIndex(x, detailSparkWidth, len(hist.Statuses))
return m, nil return m, nil
} }
@@ -455,7 +470,7 @@ func (m *Model) handleSLAData(msg slaDataMsg) (tea.Model, tea.Cmd) {
} }
period := slaPeriods[msg.periodIdx] period := slaPeriods[msg.periodIdx]
var currentStatus string var currentStatus models.Status
for _, s := range m.sites { for _, s := range m.sites {
if s.ID == msg.siteID { if s.ID == msg.siteID {
currentStatus = s.Status currentStatus = s.Status
@@ -499,10 +514,8 @@ func (m *Model) handleHistoryKey(msg tea.KeyMsg) (tea.Model, tea.Cmd) {
func (m *Model) handleAlertDetailKey(msg tea.KeyMsg) (tea.Model, tea.Cmd) { func (m *Model) handleAlertDetailKey(msg tea.KeyMsg) (tea.Model, tea.Cmd) {
switch msg.String() { switch msg.String() {
case "i", "esc": case "q", "i", "esc":
m.state = stateDashboard m.state = stateDashboard
case "q":
return m, tea.Quit
} }
return m, nil return m, nil
} }
@@ -515,11 +528,13 @@ func (m *Model) handleDashboardKey(msg tea.KeyMsg) (tea.Model, tea.Cmd) {
case "/": case "/":
if m.currentTab == 0 { if m.currentTab == 0 {
m.filterMode = true m.filterMode = true
m.recalcLayout()
return m, nil return m, nil
} }
case "f": case "f":
if m.state == stateLogs { if m.state == stateLogs {
m.logFilterImportant = !m.logFilterImportant m.logFilterImportant = !m.logFilterImportant
m.refreshLogContent()
return m, nil return m, nil
} }
case "tab": case "tab":
@@ -537,6 +552,7 @@ func (m *Model) handleDashboardKey(msg tea.KeyMsg) (tea.Model, tea.Cmd) {
if m.cursor < m.tableOffset { if m.cursor < m.tableOffset {
m.tableOffset = m.cursor m.tableOffset = m.cursor
} }
m.syncSelectedID()
} }
case "down", "j": case "down", "j":
if m.state == stateLogs { if m.state == stateLogs {
@@ -548,6 +564,7 @@ func (m *Model) handleDashboardKey(msg tea.KeyMsg) (tea.Model, tea.Cmd) {
if m.cursor >= m.tableOffset+m.maxTableRows { if m.cursor >= m.tableOffset+m.maxTableRows {
m.tableOffset++ m.tableOffset++
} }
m.syncSelectedID()
} }
} }
case "n": case "n":
@@ -610,7 +627,7 @@ func (m *Model) handleDashboardKey(msg tea.KeyMsg) (tea.Model, tea.Cmd) {
return m, writeCmd("Save theme", func() error { return m, writeCmd("Save theme", func() error {
return st.SetPreference(context.Background(), "theme", name) return st.SetPreference(context.Background(), "theme", name)
}) })
case "d", "backspace": case "d":
return m.handleDeleteItem() return m.handleDeleteItem()
} }
return m, nil return m, nil
@@ -717,6 +734,7 @@ func (m *Model) handleClick(msg tea.MouseMsg) (tea.Model, tea.Cmd) {
for i := m.tableOffset; i < end; i++ { for i := m.tableOffset; i < end; i++ {
if m.zones.Get(fmt.Sprintf("%s-%d", prefix, i)).InBounds(msg) { if m.zones.Get(fmt.Sprintf("%s-%d", prefix, i)).InBounds(msg) {
m.cursor = i m.cursor = i
m.syncSelectedID()
return m, nil return m, nil
} }
} }
@@ -745,16 +763,8 @@ func (m *Model) switchTab(idx int) {
} }
} }
func (m *Model) adjustCursor(newLen int) { func (m *Model) adjustCursor(_ int) {
if m.cursor >= newLen && m.cursor > 0 { m.clampCursor()
m.cursor--
}
if m.cursor < m.tableOffset {
m.tableOffset = m.cursor
if m.tableOffset < 0 {
m.tableOffset = 0
}
}
} }
func (m *Model) submitForm() tea.Cmd { func (m *Model) submitForm() tea.Cmd {
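Note on the filter input change above (`len(msg.String()) == 1` became `len(msg.Runes) == 1`): `len` on a Go string counts bytes, so a single non-ASCII character fails a byte-length check. The sketch below compares the two checks in isolation. `acceptByBytes` and `acceptByRunes` are hypothetical helpers, and it assumes the key's string form is simply the typed runes.

```go
package main

import "fmt"

// acceptByBytes mimics the old check: the key's string form must be one byte long.
func acceptByBytes(key string) bool { return len(key) == 1 }

// acceptByRunes mimics the new check: exactly one rune was typed.
func acceptByRunes(runes []rune) bool { return len(runes) == 1 }

func main() {
	// "\u00e9" is e with an acute accent (2 bytes in UTF-8); "\u65e5" is a CJK character (3 bytes).
	for _, key := range []string{"a", "\u00e9", "\u65e5"} {
		fmt.Printf("%q bytes=%d byteCheck=%v runeCheck=%v\n",
			key, len(key), acceptByBytes(key), acceptByRunes([]rune(key)))
	}
}
```

With the byte check, accented or CJK characters typed into the filter would be silently dropped; the rune check accepts them.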
+9 -91
@@ -8,6 +8,7 @@ import (
"gitea.lerkolabs.com/lerkolabs/uptop/internal/models" "gitea.lerkolabs.com/lerkolabs/uptop/internal/models"
"gitea.lerkolabs.com/lerkolabs/uptop/internal/monitor" "gitea.lerkolabs.com/lerkolabs/uptop/internal/monitor"
"gitea.lerkolabs.com/lerkolabs/uptop/internal/store/storetest"
tea "github.com/charmbracelet/bubbletea" tea "github.com/charmbracelet/bubbletea"
zone "github.com/lrstanley/bubblezone" zone "github.com/lrstanley/bubblezone"
) )
@@ -15,13 +16,14 @@ import (
// --- minimal Store mock for TUI data-flow tests --- // --- minimal Store mock for TUI data-flow tests ---
type tuiMockStore struct { type tuiMockStore struct {
storetest.BaseMock
alerts []models.AlertConfig alerts []models.AlertConfig
users []models.User users []models.User
nodes []models.ProbeNode nodes []models.ProbeNode
maint []models.MaintenanceWindow maint []models.MaintenanceWindow
stateChanges []models.StateChange stateChanges []models.StateChange
stateChangeCalls int // counts GetStateChanges hits (to prove View does no IO) stateChangeCalls int
deleteSiteCalls int // counts DeleteSite hits (to prove writes run in Cmds) deleteSiteCalls int
} }
func (m *tuiMockStore) GetAllAlerts(_ context.Context) ([]models.AlertConfig, error) { func (m *tuiMockStore) GetAllAlerts(_ context.Context) ([]models.AlertConfig, error) {
@@ -38,94 +40,10 @@ func (m *tuiMockStore) GetStateChanges(_ context.Context, _ int, _ int) ([]model
func (m *tuiMockStore) GetAllMaintenanceWindows(_ context.Context, _ int) ([]models.MaintenanceWindow, error) { func (m *tuiMockStore) GetAllMaintenanceWindows(_ context.Context, _ int) ([]models.MaintenanceWindow, error) {
return m.maint, nil return m.maint, nil
} }
func (m *tuiMockStore) Init(_ context.Context) error { return nil }
func (m *tuiMockStore) GetSites(_ context.Context) ([]models.Site, error) { return nil, nil }
func (m *tuiMockStore) AddSite(_ context.Context, _ models.Site) error { return nil }
func (m *tuiMockStore) UpdateSite(_ context.Context, _ models.Site) error { return nil }
func (m *tuiMockStore) UpdateSitePaused(_ context.Context, _ int, _ bool) error { return nil }
func (m *tuiMockStore) DeleteSite(_ context.Context, _ int) error { func (m *tuiMockStore) DeleteSite(_ context.Context, _ int) error {
m.deleteSiteCalls++ m.deleteSiteCalls++
return nil return nil
} }
func (m *tuiMockStore) GetAlert(_ context.Context, _ int) (models.AlertConfig, error) {
return models.AlertConfig{}, nil
}
func (m *tuiMockStore) AddAlert(_ context.Context, _ string, _ string, _ map[string]string) error {
return nil
}
func (m *tuiMockStore) UpdateAlert(_ context.Context, _ int, _ string, _ string, _ map[string]string) error {
return nil
}
func (m *tuiMockStore) DeleteAlert(_ context.Context, _ int) error { return nil }
func (m *tuiMockStore) GetSiteByName(_ context.Context, _ string) (models.Site, error) {
return models.Site{}, nil
}
func (m *tuiMockStore) GetAlertByName(_ context.Context, _ string) (models.AlertConfig, error) {
return models.AlertConfig{}, nil
}
func (m *tuiMockStore) AddSiteReturningID(_ context.Context, _ models.Site) (int, error) {
return 0, nil
}
func (m *tuiMockStore) AddAlertReturningID(_ context.Context, _ string, _ string, _ map[string]string) (int, error) {
return 0, nil
}
func (m *tuiMockStore) AddUser(_ context.Context, _ string, _ string, _ string) error { return nil }
func (m *tuiMockStore) UpdateUser(_ context.Context, _ int, _ string, _ string, _ string) error {
return nil
}
func (m *tuiMockStore) DeleteUser(_ context.Context, _ int) error { return nil }
func (m *tuiMockStore) SaveCheck(_ context.Context, _ int, _ int64, _ bool) error { return nil }
func (m *tuiMockStore) SaveCheckFromNode(_ context.Context, _ int, _ string, _ int64, _ bool) error {
return nil
}
func (m *tuiMockStore) LoadAllHistory(_ context.Context, _ int) (map[int][]models.CheckRecord, error) {
return nil, nil
}
func (m *tuiMockStore) PruneCheckHistory(_ context.Context) error { return nil }
func (m *tuiMockStore) SaveStateChange(_ context.Context, _ int, _ string, _ string, _ string) error {
return nil
}
func (m *tuiMockStore) GetStateChangesSince(_ context.Context, _ int, _ time.Time) ([]models.StateChange, error) {
return nil, nil
}
func (m *tuiMockStore) PruneStateChanges(_ context.Context) error { return nil }
func (m *tuiMockStore) RegisterNode(_ context.Context, _ models.ProbeNode) error { return nil }
func (m *tuiMockStore) GetNode(_ context.Context, _ string) (models.ProbeNode, error) {
return models.ProbeNode{}, nil
}
func (m *tuiMockStore) UpdateNodeLastSeen(_ context.Context, _ string) error { return nil }
func (m *tuiMockStore) DeleteNode(_ context.Context, _ string) error { return nil }
func (m *tuiMockStore) LoadAlertHealth(_ context.Context) (map[int]models.AlertHealthRecord, error) {
return nil, nil
}
func (m *tuiMockStore) SaveAlertHealth(_ context.Context, _ models.AlertHealthRecord) error {
return nil
}
func (m *tuiMockStore) SaveLog(_ context.Context, _ string) error { return nil }
func (m *tuiMockStore) LoadLogs(_ context.Context, _ int) ([]string, error) { return nil, nil }
func (m *tuiMockStore) PruneLogs(_ context.Context) error { return nil }
func (m *tuiMockStore) GetActiveMaintenanceWindows(_ context.Context) ([]models.MaintenanceWindow, error) {
return nil, nil
}
func (m *tuiMockStore) AddMaintenanceWindow(_ context.Context, _ models.MaintenanceWindow) error {
return nil
}
func (m *tuiMockStore) EndMaintenanceWindow(_ context.Context, _ int) error { return nil }
func (m *tuiMockStore) DeleteMaintenanceWindow(_ context.Context, _ int) error { return nil }
func (m *tuiMockStore) PruneExpiredMaintenanceWindows(_ context.Context, _ time.Duration) (int64, error) {
return 0, nil
}
func (m *tuiMockStore) IsMonitorInMaintenance(_ context.Context, _ int) (bool, error) {
return false, nil
}
func (m *tuiMockStore) GetPreference(_ context.Context, _ string) (string, error) { return "", nil }
func (m *tuiMockStore) SetPreference(_ context.Context, _ string, _ string) error { return nil }
func (m *tuiMockStore) ExportData(_ context.Context) (models.Backup, error) {
return models.Backup{}, nil
}
func (m *tuiMockStore) ImportData(_ context.Context, _ models.Backup) error { return nil }
func (m *tuiMockStore) Close() error { return nil }
func newTestModel(ms *tuiMockStore) Model { func newTestModel(ms *tuiMockStore) Model {
return Model{ return Model{
@@ -198,7 +116,7 @@ func (*stubErr) Error() string { return "boom" }
func TestDetailLoad_CachesAndViewDoesNoIO(t *testing.T) { func TestDetailLoad_CachesAndViewDoesNoIO(t *testing.T) {
ms := &tuiMockStore{stateChanges: []models.StateChange{{FromStatus: "UP", ToStatus: "DOWN"}}} ms := &tuiMockStore{stateChanges: []models.StateChange{{FromStatus: "UP", ToStatus: "DOWN"}}}
m := newTestModel(ms) m := newTestModel(ms)
m.sites = []models.Site{{ID: 1, Name: "site", Status: "DOWN"}} m.sites = []models.Site{{SiteConfig: models.SiteConfig{ID: 1, Name: "site"}, SiteState: models.SiteState{Status: "DOWN"}}}
m.cursor = 0 m.cursor = 0
m.state = stateDetail m.state = stateDetail
m.termWidth = 120 m.termWidth = 120
@@ -283,7 +201,7 @@ func TestHandleTabData_DropsStaleSeq(t *testing.T) {
func TestHistoryKey_LoadsOffUIGoroutine(t *testing.T) { func TestHistoryKey_LoadsOffUIGoroutine(t *testing.T) {
ms := &tuiMockStore{stateChanges: []models.StateChange{{FromStatus: "UP", ToStatus: "DOWN"}}} ms := &tuiMockStore{stateChanges: []models.StateChange{{FromStatus: "UP", ToStatus: "DOWN"}}}
m := newTestModel(ms) m := newTestModel(ms)
m.sites = []models.Site{{ID: 7, Name: "site"}} m.sites = []models.Site{{SiteConfig: models.SiteConfig{ID: 7, Name: "site"}}}
m.state = stateDetail m.state = stateDetail
m.termWidth, m.termHeight = 120, 40 m.termWidth, m.termHeight = 120, 40
@@ -322,7 +240,7 @@ func TestHistoryKey_LoadsOffUIGoroutine(t *testing.T) {
func TestSLAData_DropsStaleReply(t *testing.T) { func TestSLAData_DropsStaleReply(t *testing.T) {
m := newTestModel(&tuiMockStore{}) m := newTestModel(&tuiMockStore{})
m.termWidth, m.termHeight = 120, 40 m.termWidth, m.termHeight = 120, 40
m.sites = []models.Site{{ID: 3, Status: "UP"}} m.sites = []models.Site{{SiteConfig: models.SiteConfig{ID: 3}, SiteState: models.SiteState{Status: "UP"}}}
if cmd := (&m).openSLAView(m.sites[0]); cmd == nil { if cmd := (&m).openSLAView(m.sites[0]); cmd == nil {
t.Fatal("openSLAView should return a load Cmd") t.Fatal("openSLAView should return a load Cmd")
@@ -346,7 +264,7 @@ func TestSLAData_DropsStaleReply(t *testing.T) {
func TestConfirmDelete_WritesOffUIGoroutine(t *testing.T) { func TestConfirmDelete_WritesOffUIGoroutine(t *testing.T) {
ms := &tuiMockStore{} ms := &tuiMockStore{}
m := newTestModel(ms) m := newTestModel(ms)
m.sites = []models.Site{{ID: 4, Name: "s"}} m.sites = []models.Site{{SiteConfig: models.SiteConfig{ID: 4, Name: "s"}}}
m.state = stateConfirmDelete m.state = stateConfirmDelete
m.deleteTab = 0 m.deleteTab = 0
m.deleteID = 4 m.deleteID = 4
@@ -394,7 +312,7 @@ func TestWriteDoneMsg_LogsErrorAndReloads(t *testing.T) {
func TestDetailRefreshCmd_OnlyWhileDetailOpen(t *testing.T) { func TestDetailRefreshCmd_OnlyWhileDetailOpen(t *testing.T) {
ms := &tuiMockStore{stateChanges: []models.StateChange{{FromStatus: "UP", ToStatus: "DOWN"}}} ms := &tuiMockStore{stateChanges: []models.StateChange{{FromStatus: "UP", ToStatus: "DOWN"}}}
m := newTestModel(ms) m := newTestModel(ms)
m.sites = []models.Site{{ID: 5, Name: "site"}} m.sites = []models.Site{{SiteConfig: models.SiteConfig{ID: 5, Name: "site"}}}
m.state = stateDashboard m.state = stateDashboard
if (&m).detailRefreshCmd() != nil { if (&m).detailRefreshCmd() != nil {
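Note on the mock cleanup above: embedding `storetest.BaseMock` lets the test mock drop its hand-written no-op methods and keep only the overrides it observes. The sketch below shows the embedding pattern with a cut-down interface. `Store`, `BaseMock` and `countingStore` here are hypothetical stand-ins, not the real `storetest` definitions.

```go
package main

import "fmt"

// Store is a trimmed, hypothetical stand-in for the real store interface.
type Store interface {
	GetSites() ([]string, error)
	DeleteSite(id int) error
	Close() error
}

// BaseMock supplies a no-op implementation of every Store method.
type BaseMock struct{}

func (BaseMock) GetSites() ([]string, error) { return nil, nil }
func (BaseMock) DeleteSite(int) error        { return nil }
func (BaseMock) Close() error                { return nil }

// countingStore embeds BaseMock and overrides only the method the test observes.
type countingStore struct {
	BaseMock
	deleteSiteCalls int
}

// DeleteSite shadows the promoted BaseMock method and counts calls.
func (s *countingStore) DeleteSite(int) error {
	s.deleteSiteCalls++
	return nil
}

func main() {
	s := &countingStore{}
	var st Store = s // still satisfies the full interface via promoted methods
	_ = st.DeleteSite(4)
	_ = st.Close()
	fmt.Println(s.deleteSiteCalls)
}
```

A side benefit of this pattern is that adding a method to the interface only requires updating the base mock, not every test double.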
+5 -9
@@ -6,6 +6,7 @@ import (
"strings" "strings"
"time" "time"
"gitea.lerkolabs.com/lerkolabs/uptop/internal/models"
"github.com/charmbracelet/lipgloss" "github.com/charmbracelet/lipgloss"
) )
@@ -16,7 +17,7 @@ func sinApprox(x float64) float64 {
func (m Model) pulseIndicator() string { func (m Model) pulseIndicator() string {
hasDown := false hasDown := false
for _, s := range m.sites { for _, s := range m.sites {
if !s.Paused && !m.isMonitorInMaintenance(s.ID) && (s.Status == "DOWN" || s.Status == "SSL EXP") { if !s.Paused && !m.isMonitorInMaintenance(s.ID) && (s.Status == models.StatusDown || s.Status == models.StatusSSLExp) {
hasDown = true hasDown = true
break break
} }
@@ -84,11 +85,6 @@ func (m Model) View() string {
case stateFormMaint: case stateFormMaint:
title = "New Maintenance Window" title = "New Maintenance Window"
} }
formHeight := m.termHeight - 7
if formHeight < 5 {
formHeight = 5
}
m.huhForm.WithHeight(formHeight)
header := m.st.titleStyle.Render(title) header := m.st.titleStyle.Render(title)
footer := m.st.subtleStyle.Render("\n[Esc] Cancel") footer := m.st.subtleStyle.Render("\n[Esc] Cancel")
return lipgloss.NewStyle().Padding(1, 2).Render(header + "\n\n" + m.huhForm.View() + "\n" + footer) return lipgloss.NewStyle().Padding(1, 2).Render(header + "\n\n" + m.huhForm.View() + "\n" + footer)
@@ -127,9 +123,9 @@ func (m Model) computeStats() dashboardStats {
continue continue
} }
switch site.Status { switch site.Status {
case "DOWN", "SSL EXP": case models.StatusDown, models.StatusSSLExp:
s.downCount++ s.downCount++
case "LATE": case models.StatusLate:
s.lateCount++ s.lateCount++
} }
} }
@@ -269,7 +265,7 @@ func (m Model) renderFooter(stats dashboardStats) string {
var keys string var keys string
switch m.currentTab { switch m.currentTab {
case 0: case 0:
keys = "[/]Filter [n]New [e]Edit [i]Info [d]Del [p]Pause [T]Theme [Tab]Switch [q]Quit" keys = "[/]Filter [n]New [e]Edit [i]Info [d]Del [p]Pause [Space]Collapse [T]Theme [Tab]Switch [q]Quit"
case 1: case 1:
keys = "[n]New [e]Edit [i]Info [d]Del [t]Test [T]Theme [Tab]Switch [q]Quit" keys = "[n]New [e]Edit [i]Info [d]Del [t]Test [T]Theme [Tab]Switch [q]Quit"
case 2: case 2:
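Note on the status comparisons above: the string literals are replaced by `models.Status*` constants, and plain string fields are compared through an explicit `string(...)` conversion. The sketch below shows how a string-backed type supports both. The `Status` type and its values are assumptions inferred from the literals in this diff, not the actual `models` package.

```go
package main

import "fmt"

// Status is a string-backed type; a hypothetical stand-in for models.Status.
type Status string

// Values assumed from the literals the diff replaces.
const (
	StatusUp     Status = "UP"
	StatusDown   Status = "DOWN"
	StatusSSLExp Status = "SSL EXP"
	StatusLate   Status = "LATE"
)

// isFailing reports whether a status counts toward the "down" total.
func isFailing(s Status) bool {
	return s == StatusDown || s == StatusSSLExp
}

func main() {
	toStatus := "UP" // a plain string field, like the ToStatus values in the diff
	fmt.Println(isFailing(StatusDown), isFailing(StatusLate))
	// Comparing a string field against a typed constant needs a conversion.
	fmt.Println(toStatus == string(StatusUp))
}
```

With a named type, a misspelled status becomes a compile error on the constant name instead of a comparison that is silently never true.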
+16 -10
@@ -2,6 +2,7 @@ package tui
import ( import (
"fmt" "fmt"
"sort"
"strconv" "strconv"
"strings" "strings"
"time" "time"
@@ -45,7 +46,7 @@ func (m Model) viewDetailPanel() string {
row("Status", m.fmtStatus(site.Status, site.Paused, m.isMonitorInMaintenance(site.ID))) row("Status", m.fmtStatus(site.Status, site.Paused, m.isMonitorInMaintenance(site.ID)))
if (site.Status == "DOWN" || site.Status == "SSL EXP" || site.Status == "LATE" || site.Status == "STALE") && site.LastError != "" { if (site.Status == models.StatusDown || site.Status == models.StatusSSLExp || site.Status == models.StatusLate || site.Status == models.StatusStale) && site.LastError != "" {
errWidth := m.termWidth - chromePadH - 19 errWidth := m.termWidth - chromePadH - 19
if errWidth < 30 { if errWidth < 30 {
errWidth = 30 errWidth = 30
@@ -58,7 +59,7 @@ func (m Model) viewDetailPanel() string {
row("HTTP Code", strconv.Itoa(site.StatusCode)) row("HTTP Code", strconv.Itoa(site.StatusCode))
} }
if (site.Status == "DOWN" || site.Status == "SSL EXP") && site.LastError != "" { if (site.Status == models.StatusDown || site.Status == models.StatusSSLExp) && site.LastError != "" {
chain := connectionChain(site.LastError, site.Type, site.StatusCode, strings.HasPrefix(site.URL, "https")) chain := connectionChain(site.LastError, site.Type, site.StatusCode, strings.HasPrefix(site.URL, "https"))
if len(chain) > 0 { if len(chain) > 0 {
b.WriteString("\n") b.WriteString("\n")
@@ -163,8 +164,14 @@ func (m Model) viewDetailPanel() string {
probeResults := m.engine.GetProbeResults(site.ID) probeResults := m.engine.GetProbeResults(site.ID)
if len(probeResults) > 0 { if len(probeResults) > 0 {
nodeIDs := make([]string, 0, len(probeResults))
for id := range probeResults {
nodeIDs = append(nodeIDs, id)
}
sort.Strings(nodeIDs)
b.WriteString("\n" + m.st.subtleStyle.Render(" PROBE RESULTS") + "\n") b.WriteString("\n" + m.st.subtleStyle.Render(" PROBE RESULTS") + "\n")
for nodeID, result := range probeResults { for _, nodeID := range nodeIDs {
result := probeResults[nodeID]
status := m.st.specialStyle.Render("UP") status := m.st.specialStyle.Render("UP")
if !result.IsUp { if !result.IsUp {
status = m.st.dangerStyle.Render("DN") status = m.st.dangerStyle.Render("DN")
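Note on the probe results hunk above: Go does not define map iteration order, so ranging over the map directly can reorder the rendered rows from one frame to the next. Collecting and sorting the keys first gives a stable order. A minimal standalone sketch follows; `sortedKeys` is a hypothetical helper and the `bool` value type is a simplification of the real probe result.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the map's keys in ascending order so that
// rendering does not depend on Go's unspecified map iteration order.
func sortedKeys(results map[string]bool) []string {
	keys := make([]string, 0, len(results))
	for k := range results {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	probe := map[string]bool{"us-east": true, "ap-south": false, "eu-west": true}
	for _, id := range sortedKeys(probe) {
		fmt.Println(id, probe[id])
	}
}
```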
@@ -189,7 +196,7 @@ func (m Model) viewDetailPanel() string {
for i, sc := range stateChanges { for i, sc := range stateChanges {
ago := fmtDuration(time.Since(sc.ChangedAt)) ago := fmtDuration(time.Since(sc.ChangedAt))
arrow := m.st.subtleStyle.Render(sc.FromStatus) + " → " arrow := m.st.subtleStyle.Render(sc.FromStatus) + " → "
if sc.ToStatus == "UP" { if sc.ToStatus == string(models.StatusUp) {
arrow += m.st.specialStyle.Render(sc.ToStatus) arrow += m.st.specialStyle.Render(sc.ToStatus)
} else { } else {
arrow += m.st.dangerStyle.Render(sc.ToStatus) arrow += m.st.dangerStyle.Render(sc.ToStatus)
@@ -198,7 +205,7 @@ func (m Model) viewDetailPanel() string {
if dur := computeOutageDuration(stateChanges, i); dur > 0 { if dur := computeOutageDuration(stateChanges, i); dur > 0 {
line += " " + m.st.warnStyle.Render("outage "+fmtDuration(dur)) line += " " + m.st.warnStyle.Render("outage "+fmtDuration(dur))
} }
if sc.ErrorReason != "" && sc.ToStatus != "UP" { if sc.ErrorReason != "" && sc.ToStatus != string(models.StatusUp) {
line += " " + m.st.dangerStyle.Render(sc.ErrorReason) line += " " + m.st.dangerStyle.Render(sc.ErrorReason)
} }
b.WriteString(line + "\n") b.WriteString(line + "\n")
@@ -207,9 +214,8 @@ func (m Model) viewDetailPanel() string {
} }
b.WriteString(m.divider() + "\n") b.WriteString(m.divider() + "\n")
const sparkWidth = 40
if site.Type == "push" { if site.Type == "push" {
b.WriteString(" " + m.zones.Mark("spark-heartbeat", m.heartbeatSparkline(hist.Statuses, sparkWidth, ""))) b.WriteString(" " + m.zones.Mark("spark-heartbeat", m.heartbeatSparkline(hist.Statuses, detailSparkWidth, "")))
if len(hist.Statuses) > 0 { if len(hist.Statuses) > 0 {
up := 0 up := 0
for _, s := range hist.Statuses { for _, s := range hist.Statuses {
@@ -222,7 +228,7 @@ func (m Model) viewDetailPanel() string {
up, len(hist.Statuses)) up, len(hist.Statuses))
} }
} else { } else {
b.WriteString(" " + m.zones.Mark("spark-latency", m.latencySparkline(hist.Latencies, hist.Statuses, sparkWidth, ""))) b.WriteString(" " + m.zones.Mark("spark-latency", m.latencySparkline(hist.Latencies, hist.Statuses, detailSparkWidth, "")))
var minL, maxL, total time.Duration var minL, maxL, total time.Duration
count := 0 count := 0
for i, l := range hist.Latencies { for i, l := range hist.Latencies {
@@ -249,12 +255,12 @@ func (m Model) viewDetailPanel() string {
} }
if m.sparkTooltipIdx >= 0 { if m.sparkTooltipIdx >= 0 {
b.WriteString("\n" + m.renderSparkTooltip(site, hist, sparkWidth)) b.WriteString("\n" + m.renderSparkTooltip(site, hist, detailSparkWidth))
} }
b.WriteString("\n") b.WriteString("\n")
b.WriteString(m.divider() + "\n") b.WriteString(m.divider() + "\n")
b.WriteString(m.st.subtleStyle.Render(" [i/Esc] Back [e] Edit [h] History [s] SLA [click] Inspect [q] Quit")) b.WriteString(m.st.subtleStyle.Render(" [q/Esc] Back [e] Edit [h] History [s] SLA [click] Inspect"))
return lipgloss.NewStyle().Padding(1, 2).Render(b.String()) return lipgloss.NewStyle().Padding(1, 2).Render(b.String())
} }
+6 -6
@@ -17,14 +17,14 @@ type historyStats struct {
func computeOutageDuration(changes []models.StateChange, idx int) time.Duration { func computeOutageDuration(changes []models.StateChange, idx int) time.Duration {
sc := changes[idx] sc := changes[idx]
if sc.ToStatus != "UP" { if sc.ToStatus != string(models.StatusUp) {
return 0 return 0
} }
if idx+1 >= len(changes) { if idx+1 >= len(changes) {
return 0 return 0
} }
prev := changes[idx+1] prev := changes[idx+1]
if prev.ToStatus == "UP" { if prev.ToStatus == string(models.StatusUp) {
return 0 return 0
} }
dur := sc.ChangedAt.Sub(prev.ChangedAt) dur := sc.ChangedAt.Sub(prev.ChangedAt)
@@ -122,11 +122,11 @@ func (m Model) buildHistoryContent() string {
arrow := m.st.subtleStyle.Render(sc.FromStatus) + " → " arrow := m.st.subtleStyle.Render(sc.FromStatus) + " → "
switch sc.ToStatus { switch sc.ToStatus {
case "UP": case string(models.StatusUp):
arrow += m.st.specialStyle.Render(sc.ToStatus) arrow += m.st.specialStyle.Render(sc.ToStatus)
case "LATE": case string(models.StatusLate):
arrow += m.st.warnStyle.Render(sc.ToStatus) arrow += m.st.warnStyle.Render(sc.ToStatus)
case "STALE": case string(models.StatusStale):
arrow += m.st.staleStyle.Render(sc.ToStatus) arrow += m.st.staleStyle.Render(sc.ToStatus)
default: default:
arrow += m.st.dangerStyle.Render(sc.ToStatus) arrow += m.st.dangerStyle.Render(sc.ToStatus)
@@ -138,7 +138,7 @@ func (m Model) buildHistoryContent() string {
} }
reason := "" reason := ""
if sc.ErrorReason != "" && sc.ToStatus != "UP" { if sc.ErrorReason != "" && sc.ToStatus != string(models.StatusUp) {
reason = m.st.dangerStyle.Render(limitStr(sc.ErrorReason, reasonWidth)) reason = m.st.dangerStyle.Render(limitStr(sc.ErrorReason, reasonWidth))
} }