70 Commits

Author SHA1 Message Date
lerko 10f249a2ae chore: add TUI screenshots via VHS
CI / test (pull_request) Successful in 2m50s
CI / lint (pull_request) Failing after 1m12s
CI / vulncheck (pull_request) Successful in 56s
Screenshots capture 4 views: monitors dashboard (hero), detail panel,
alerts tab, and logs tab. Includes VHS tape, demo seed config, and
setup script for reproducible captures.

Also fixes latencySparkline to color DOWN checks red instead of green
— previously failed checks with 0ms latency rendered as green bars.
2026-05-28 18:30:39 -04:00
lerko cfbf01274d chore(tui): visual polish — detail sections, column headers, alert detail (#37)
CI / test (push) Successful in 2m40s
CI / lint (push) Successful in 1m2s
CI / vulncheck (push) Successful in 51s
Release / docker (push) Has been cancelled
Release / release (push) Has been cancelled
## Summary

Bundled remaining UX polish items from the screenshot review.

### Changes

**Detail panel sections (#5)**
- Fields grouped into ENDPOINT, TIMING, HTTP, CONFIG sections with subtle headers
- Matches existing PROBE RESULTS and STATE CHANGES section pattern
- Cleaner visual hierarchy without box-drawing clutter

**Omit unconfigured fields (#6)**
- Timeout hidden when 0 (unconfigured)
- Method hidden when default GET
- AcceptedCodes shows "200-299" explicitly when empty

**Column header (#7)**
- `LATENCY` → `LAT` (design short, never truncate — htop/btop pattern)

**Alert detail view (#8)**
- `i` key on Alerts tab opens full detail panel
- Shows: type, health status, last sent time, send/fail counts, last error
- Full config key:value pairs (untruncated)
- Keybinding: `[i/Esc] Back  [e] Edit  [t] Test  [q] Quit`

### Files (3)
- `internal/tui/tab_sites.go` — section headers, field omission, LAT header
- `internal/tui/tab_alerts.go` — viewAlertDetailPanel()
- `internal/tui/tui.go` — stateAlertDetail, key handler, render routing

Reviewed-on: lerko/uptop#37
2026-05-28 20:40:29 +00:00
lerko 26e297cbae Merge pull request 'feat: alert channel health indicator + test alerts' (#36) from feat/alert-health into main
CI / test (push) Successful in 2m48s
CI / lint (push) Successful in 1m17s
CI / vulncheck (push) Successful in 1m6s
Reviewed-on: lerko/uptop#36
2026-05-28 01:33:00 +00:00
lerko 0aa2f9cd8a feat: alert channel health indicator + test alerts
CI / test (pull_request) Successful in 2m46s
CI / lint (pull_request) Successful in 1m1s
CI / vulncheck (pull_request) Successful in 51s
Track alert delivery health at runtime:
- AlertHealth struct: LastSendAt, LastSendOK, LastError, SendCount, FailCount
- triggerAlert records success/failure after each Send()
- Health data exposed via GetAlertHealth() for TUI

Alerts tab enriched:
- Health dot column: green (OK), red (failed), gray (never sent)
- LAST SENT column: relative time ("2m ago", "never")
- [t] key sends test notification through selected channel

Inspired by Grafana's contact point health columns.
2026-05-27 21:23:06 -04:00
lerko f17f06a1c6 Merge pull request 'feat: logs tab overhaul — severity tags, filtering, recovery durations' (#35) from feat/logs-overhaul into main
CI / test (push) Successful in 2m47s
CI / lint (push) Successful in 1m16s
CI / vulncheck (push) Successful in 56s
Reviewed-on: lerko/uptop#35
2026-05-28 00:35:24 +00:00
lerko b14d5e19db feat: logs tab overhaul — severity tags, filtering, recovery durations
CI / test (pull_request) Successful in 2m36s
CI / lint (pull_request) Successful in 1m1s
CI / vulncheck (pull_request) Successful in 51s
Logs tab visual overhaul:
- Severity-classified entries: DOWN (red), UP (green), WARN (amber),
  SYS (cyan), info (gray) — rendered as inline tags, not whole-line color
- Column-aligned format: [timestamp] [severity tag] [message]
- Filter toggle (f key): All vs Important only (hides retry noise)
- Header shows entry count, filter state, hidden count

Engine log improvements:
- Recovery messages include downtime duration ("was down 14m")
- LATE transition logged ("heartbeat overdue")
- Push monitor recovery includes downtime duration
2026-05-27 20:14:43 -04:00
lerko a2b38ddc60 Merge pull request 'feat: proper push monitor lifecycle — PENDING, LATE, DOWN' (#34) from feat/push-monitor-states into main
CI / test (push) Successful in 2m48s
CI / lint (push) Successful in 1m17s
CI / vulncheck (push) Successful in 56s
Reviewed-on: lerko/uptop#34
2026-05-28 00:01:56 +00:00
lerko 5dc31108f8 feat: proper push monitor lifecycle — PENDING, LATE, DOWN states
CI / test (pull_request) Successful in 2m41s
CI / lint (pull_request) Successful in 1m7s
CI / vulncheck (pull_request) Successful in 46s
Push monitors no longer lie about status:

- PENDING stays until first heartbeat (no auto-promote to UP)
- LATE state (amber) when overdue but within grace period
- DOWN only after grace period expires
- Grace period = interval/2, minimum 60s

RecordHeartbeat now handles all transitions:
- PENDING → UP (first heartbeat, logged)
- LATE → UP (late arrival, logged)
- DOWN → UP (recovery, alert + state change persisted)

TUI updates:
- LATE rendered in amber/warning color
- Status bar shows LATE count separately
- Tab badge shows ⚠ for late monitors
- Sort order: DOWN > LATE > UP > PENDING > PAUSED
- Detail panel shows error for LATE monitors

Inspired by Healthchecks.io state machine (new/up/grace/down).
2026-05-27 19:56:50 -04:00
lerko 63773b13d0 Merge pull request 'feat: show error reason when monitors go DOWN' (#33) from feat/error-reason into main
CI / test (push) Successful in 2m51s
CI / lint (push) Successful in 1m1s
CI / vulncheck (push) Successful in 56s
Reviewed-on: lerko/uptop#33
2026-05-27 23:38:26 +00:00
lerko bc3a44beac feat: show error reason when monitors go DOWN
CI / test (pull_request) Successful in 2m42s
CI / lint (pull_request) Successful in 1m11s
CI / vulncheck (pull_request) Successful in 51s
Propagate check failure reasons through the entire stack:
- Checker captures specific errors (DNS, timeout, HTTP status, SSL, etc.)
- Engine tracks LastError, StatusChangedAt, LastSuccessAt per monitor
- State transitions persisted to new state_changes table
- Detail panel shows error reason, HTTP code, state duration, last
  success time, and last 5 state change events
- Monitor table shows inline error preview for DOWN services
- Alert messages include error reason
- Probe nodes forward error reasons to leader

15 files changed across models, checker, engine, store, TUI, and probes.
2026-05-27 19:32:30 -04:00
lerko d8a2cab90f feat: seed SSH users from env var and authorized_keys file (#31)
CI / test (push) Successful in 2m36s
CI / lint (push) Successful in 1m12s
CI / vulncheck (push) Successful in 56s
Release / release (push) Has been cancelled
Release / docker (push) Has been cancelled
## Summary

Docker onboarding was broken — no way to add first SSH user without `docker attach` to TUI.

Now reads SSH public keys from two sources on startup:
- `UPTOP_ADMIN_KEY` env var — single key for quick single-user setup
- `UPTOP_KEYS` file path — authorized_keys format for team setup

Dockerfile already sets `UPTOP_KEYS=/data/authorized_keys` and compose mounts `./data:/data`, so the flow is:

```
echo "ssh-ed25519 AAAA... me@host" > ./data/authorized_keys
docker compose up -d
ssh -p 23234 localhost
```

### Behavior
- Skips keys already in DB (idempotent across restarts)
- All seeded users get admin role
- Username parsed from key comment (e.g. `tyler@macbook` → `tyler`)
- Comments and blank lines in keys file are ignored

### Tested
- UPTOP_ADMIN_KEY seeds single admin user
- UPTOP_KEYS file seeds multiple users with correct usernames
- Second startup skips existing keys (no duplicates)
- Build and all tests pass

Reviewed-on: lerko/uptop#31
2026-05-27 21:15:00 +00:00
lerko ea721601ab Merge pull request 'ci: overhaul pipeline — caching, GoReleaser, govulncheck' (#30) from ci/pipeline-overhaul into main
CI / test (push) Successful in 2m50s
CI / lint (push) Successful in 1m11s
CI / vulncheck (push) Successful in 56s
Reviewed-on: lerko/uptop#30
2026-05-27 00:37:32 +00:00
lerko b1935aa682 fix(deps): bump golang.org/x/crypto v0.47.0 → v0.52.0
CI / test (pull_request) Successful in 2m46s
CI / lint (pull_request) Successful in 1m12s
CI / vulncheck (pull_request) Successful in 56s
Fixes 7 vulns (GO-2026-5014 through GO-2026-5023) found by govulncheck.
Also bumps x/net, x/sys, x/text, x/sync, x/mod, x/tools to latest.
2026-05-26 20:20:23 -04:00
lerko 2cd3dcddb4 chore: bump Go 1.24.4 → 1.26.3, Alpine 3.21 → 3.23
CI / test (pull_request) Successful in 2m57s
CI / lint (pull_request) Successful in 1m11s
CI / vulncheck (pull_request) Failing after 1m1s
Go 1.24 EOL since Feb 2026. Fixes 33 stdlib vulns found by
govulncheck (database/sql, os/exec, net/http). Gets Green Tea GC.
2026-05-26 20:12:43 -04:00
lerko 7d4ef1f594 fix(ci): remove explicit container, use sh shell
CI / test (pull_request) Successful in 2m44s
CI / lint (pull_request) Successful in 1m11s
CI / vulncheck (pull_request) Failing after 1m7s
Act runner is Alpine-based — container: directive breaks node-based
actions (checkout, setup-go). Runner already has apk natively.
Added shell: sh to all jobs since runner lacks bash.
2026-05-26 18:44:08 -04:00
lerko f0ff87c0d0 fix(ci): rename GITEA_TOKEN to RELEASE_TOKEN
CI / test (pull_request) Failing after 31s
CI / lint (pull_request) Successful in 1m9s
CI / vulncheck (pull_request) Failing after 15s
Gitea reserves the GITEA_ prefix for repo action secrets.
2026-05-26 18:36:11 -04:00
lerko 5aab391b74 ci: overhaul pipeline — caching, GoReleaser, govulncheck
- Add module + build cache to CI (was only caching go-build, not go/pkg/mod)
- Declare explicit Alpine container instead of relying on runner image
- Drop redundant go vet (already in golangci-lint)
- Add govulncheck job for dependency CVE scanning
- Add GoReleaser config for Gitea-native binary releases + checksums
- Replace .github/workflows/docker.yml with .gitea/workflows/release.yml
- Docker multiarch (amd64+arm64) via buildx in release workflow
- Dockerfile: add --mount=type=cache for mod/build, add -trimpath
2026-05-26 18:24:19 -04:00
lerko 8ad213c96c Merge pull request 'fix(security): phase 4 code quality and low-severity fixes' (#29) from security/phase-4-quality into main
CI / test (push) Successful in 4m35s
CI / lint (push) Successful in 1m7s
Reviewed-on: lerko/uptop#29
2026-05-26 21:31:40 +00:00
lerko 986f9f1d55 fix(security): phase 4 code quality and low-severity fixes
CI / test (pull_request) Successful in 4m24s
CI / lint (pull_request) Successful in 1m1s
- Fix limitStr to handle multi-byte UTF-8 characters correctly
- Sanitize log messages: strip ANSI escape sequences and newlines
- URL-encode probe node_id instead of string concatenation
- Fix follower resp.Body leak on non-200 responses
- Make SSH host key path configurable via UPTOP_SSH_HOST_KEY env var
- Add HTTP method checks on GET-only endpoints (405 for wrong methods)
- Extract magic numbers into named constants across monitor/store/server
- Standardize error output to stderr for all startup errors
2026-05-26 17:25:47 -04:00
lerko c50ec82dcb Merge pull request 'fix(security): phase 3 medium reliability and hardening' (#28) from security/phase-3-reliability into main
CI / test (push) Successful in 4m25s
CI / lint (push) Successful in 1m6s
Reviewed-on: lerko/uptop#28
2026-05-26 21:07:30 +00:00
lerko bd561d9a5e fix(security): phase 3 medium reliability and hardening
CI / test (pull_request) Successful in 4m23s
CI / lint (pull_request) Successful in 1m11s
- Fail hard on critical migration errors (ignore only "already exists")
- Cache SSH user keys with 30s TTL (avoid DB query per auth attempt)
- Configure DB connection pooling (25 open, 5 idle, 5m lifetime)
- Enable SQLite WAL mode for concurrent read/write
- Optimize check history pruning (only prune above 1100 rows)
- Add security headers: X-Content-Type-Options, X-Frame-Options, CSP, Referrer-Policy
- Add CORS policy on /status/json via UPTOP_CORS_ORIGIN env var
- Add HTTP request logging middleware (method, path, status, duration, IP)
- Fix config file permissions from 0644 to 0600
- Pin Docker images: golang:1.24-alpine3.21, alpine:3.21
- Fix Docker CI tag pattern for CalVer (was semver)
- Pass build args (VERSION, COMMIT, BUILD_DATE) to Docker build
2026-05-26 16:57:03 -04:00
lerko 7a8f2ad15b Merge pull request 'fix(security): phase 2 high-severity hardening' (#27) from security/phase-2-hardening into main
CI / test (push) Successful in 4m33s
CI / lint (push) Successful in 1m6s
Reviewed-on: lerko/uptop#27
2026-05-26 15:31:18 +00:00
lerko d30d1460bd fix(security): phase 2 high-severity hardening
CI / test (pull_request) Successful in 4m31s
CI / lint (pull_request) Successful in 56s
- Push heartbeat accepts Authorization: Bearer header (query string deprecated)
- Gotify alerts use X-Gotify-Key header instead of token in URL
- Per-IP rate limiting on all API endpoints (token-bucket)
- /metrics gated behind cluster secret (UPTOP_METRICS_PUBLIC=true to opt out)
- Config export redacts passwords/tokens by default (redact_secrets=false to override)
- Fix rewritePlaceholders for 100+ SQL parameters
- Fix AddSiteReturningID/AddAlertReturningID race with LastInsertId/RETURNING
- HTTP server timeouts: read 30s, write 60s, idle 120s
2026-05-25 21:15:33 -04:00
lerko b43dfae98f Merge pull request 'fix(security): phase 1 critical fixes for public release' (#26) from security/phase-1-critical into main
CI / test (push) Successful in 4m19s
CI / lint (push) Successful in 1m6s
Reviewed-on: lerko/uptop#26
2026-05-26 00:43:52 +00:00
lerko 60b30935b3 fix(security): phase 1 critical fixes for public release
CI / test (pull_request) Successful in 4m40s
CI / lint (pull_request) Successful in 1m2s
- Redact PostgreSQL DSN password from stdout/logs
- Harden .dockerignore to exclude .ssh/, .claude/, *.db, *.local files
- SSRF protection: block private/loopback/link-local IPs by default
  (UPTOP_ALLOW_PRIVATE_TARGETS=true to override for homelab use)
- Fix email header injection via CRLF in monitor names
- AES-256-GCM encryption for alert credentials at rest
  (UPTOP_ENCRYPTION_KEY env var, migrate-secrets subcommand)
- TLS support for HTTP server (UPTOP_TLS_CERT/UPTOP_TLS_KEY)
  with HSTS header when TLS enabled
2026-05-25 11:26:47 -04:00
lerko b70edaace5 Merge pull request 'chore: rename project from go-upkeep to uptop' (#25) from chore/rename-uptop into main
CI / test (push) Successful in 4m26s
CI / lint (push) Successful in 1m1s
Reviewed-on: lerko/uptime#25
2026-05-25 01:02:30 +00:00
lerko 9d12e3ecf1 chore: complete rename from go-upkeep to uptop
CI / test (pull_request) Successful in 4m26s
CI / lint (pull_request) Successful in 1m11s
- Module path: gitea.lerkolabs.com/lerko/uptop
- Binary: cmd/uptop/
- All imports updated to full module path
- Env vars: UPKEEP_* → UPTOP_*
- Prometheus metrics: upkeep_* → uptop_*
- Default DB: uptop.db
- Docker image: lerko/uptop
- All docs, compose files, CI updated

Only remaining "go-upkeep" reference is the fork attribution in README.
2026-05-24 20:20:35 -04:00
lerko 36a4b69837 Merge pull request 'feat(tui): theme system with 5 curated dark palettes' (#24) from feat/themes into main
CI / test (push) Successful in 4m51s
CI / lint (push) Successful in 1m12s
Reviewed-on: lerko/uptime#24
2026-05-24 23:30:25 +00:00
lerko fee84c9363 fix(tui): tighten zebra row contrast for Tokyo Night and Gruvbox
CI / test (pull_request) Successful in 4m48s
CI / lint (pull_request) Successful in 1m11s
Previous ZebraBg was too far from Bg, washing out text on those
themes. Reduced to a 2-step shift for subtle row alternation.
2026-05-24 19:19:51 -04:00
lerko 87edd4aa40 feat(tui): swap light theme for Tokyo Night and Gruvbox
Light theme doesn't work well on dark terminals. Replace with
two proven dark palettes. Now 5 themes: Flexoki Dark, Tokyo Night,
Catppuccin Mocha, Nord, Gruvbox.
2026-05-24 19:10:29 -04:00
lerko 602f1b2c52 feat(tui): add theme system with 4 curated palettes
Flexoki Dark (default), Flexoki Light, Catppuccin Mocha, Nord.
Press T to cycle themes; selection persists in preferences.

All hardcoded colors replaced with theme-driven values.
Dedicated ZebraBg per theme for subtle row striping.
2026-05-24 19:05:40 -04:00
lerko 6e659cf6ee Merge pull request 'fix(tui): scope form validators to relevant monitor types' (#23) from fix/ssl-threshold-validation into main
CI / test (push) Successful in 4m48s
CI / lint (push) Successful in 56s
Reviewed-on: lerko/uptime#23
2026-05-24 22:03:33 +00:00
lerko 0a56f01929 fix(tui): guard max retries validator for group type
CI / test (pull_request) Successful in 4m40s
CI / lint (pull_request) Successful in 1m1s
Consistent with interval/timeout validators that already skip for
group monitors. Prevents potential validation block if field is
cleared while editing.
2026-05-24 17:45:19 -04:00
lerko b5b9cc81a5 fix(tui): skip irrelevant field validation by monitor type
URL, SSL threshold, and port validators blocked form progression
when editing monitors that don't use those fields (e.g. ping monitors
failing URL validation, non-SSL sites failing threshold check).

Scope each validator to fire only for its relevant monitor type.
2026-05-24 17:38:40 -04:00
lerko f64b46f055 Merge pull request 'ci: cache Go build artifacts between runs' (#22) from chore/ci-cache into main
CI / test (push) Successful in 4m40s
CI / lint (push) Successful in 1m12s
Reviewed-on: lerko/uptime#22
2026-05-24 20:01:05 +00:00
lerko d038361320 ci: cache Go build artifacts between runs
CI / test (pull_request) Successful in 6m18s
CI / lint (pull_request) Successful in 56s
2026-05-24 15:52:21 -04:00
lerko d03dc0c1ea Merge pull request 'docs: community polish for public readiness' (#21) from chore/community-polish into main
CI / test (push) Successful in 4m44s
CI / lint (push) Successful in 1m7s
Reviewed-on: lerko/uptime#21
2026-05-24 19:40:00 +00:00
lerko 1fa2b1d98c docs: add install instructions and Kuma migration guide to README
CI / test (pull_request) Successful in 4m36s
CI / lint (pull_request) Successful in 1m2s
2026-05-24 15:33:13 -04:00
lerko 09e1bec9a3 docs: add SECURITY.md with disclosure policy 2026-05-24 14:15:25 -04:00
lerko deb7d017af docs: add CONTRIBUTING.md 2026-05-24 14:15:11 -04:00
lerko 1e0ae22447 docs: add CHANGELOG.md with release history 2026-05-24 14:14:57 -04:00
lerko 611f26846c chore: update LICENSE with dual copyright for independent fork 2026-05-24 14:14:35 -04:00
lerko 8f9210b451 feat: add --version flag with build metadata injection
Supports `goupkeep version`, `--version`, and `-v`. Prints version,
commit hash, and build date when injected via ldflags. Shows "dev"
for local builds. Dockerfile updated with ARGs for version injection.
2026-05-24 14:14:13 -04:00
lerko cc8d76fdbc Merge pull request 'chore: add linter config and CI pipeline' (#20) from chore/linter-ci-pipeline into main
CI / test (push) Successful in 4m52s
CI / lint (push) Successful in 1m11s
Reviewed-on: lerko/uptime#20
2026-05-24 17:56:05 +00:00
lerko 26268bb6ef fix(ci): install gcc for race detector support
CI / test (pull_request) Successful in 4m41s
CI / lint (pull_request) Successful in 1m17s
2026-05-24 12:49:21 -04:00
lerko 5915e0ebe3 fix(ci): enable CGO for race detector, use lint-action v7
CI / test (pull_request) Failing after 1m27s
CI / lint (pull_request) Successful in 1m37s
2026-05-24 12:45:28 -04:00
lerko 6d7ecc46eb fix(ci): use sh instead of bash for runner compatibility
CI / test (pull_request) Failing after 46s
CI / lint (pull_request) Failing after 20s
2026-05-24 12:42:49 -04:00
lerko fb3f96f608 ci: add Gitea Actions pipeline for test and lint
CI / test (pull_request) Failing after 9s
CI / lint (pull_request) Failing after 56s
Runs on push to main and on pull requests. Two parallel jobs:
- test: go vet + go test -race
- lint: golangci-lint via official action
2026-05-23 22:02:26 -04:00
lerko 359cff7292 chore: add golangci-lint config and fix all lint issues
Add .golangci.yml enabling errcheck, staticcheck, govet, gosec,
ineffassign, and unused linters. Fix 66 issues across 16 files:
- Check all unchecked errors (errcheck)
- Use HTTP status constants instead of numeric literals (staticcheck)
- Replace deprecated LineUp/LineDown with ScrollUp/ScrollDown (staticcheck)
- Convert sprintf+write patterns to fmt.Fprintf (staticcheck)
- Add ReadHeaderTimeout to http.Server (gosec)
- Remove unused types and functions (unused)
- Add nolint comments for intentional patterns (InsecureSkipVerify,
  math/rand for jitter, dialect-only SQL formatting)
2026-05-23 22:02:06 -04:00
lerko da61ce0f88 Merge pull request 'fix: critical bugs and security hardening' (#19) from fix/critical-bugs-security-hardening into main 2026-05-24 01:45:11 +00:00
lerko 7398f520f0 test(cluster): add tests for follower failover and probe operations
15 tests covering leader/follower mode selection, follower failover
after 3 consecutive health check failures, recovery when leader returns,
secret header propagation, context cancellation, probe registration,
assignment fetching, concurrent check execution (verifies 10-semaphore
cap), and result reporting.
2026-05-23 21:23:26 -04:00
lerko c6d120d7a4 test(server): add HTTP handler tests for all API endpoints
24 tests covering push heartbeat, health check, backup export/import,
probe registration/assignments/results, and status page endpoints.
Tests verify auth enforcement (constant-time secret), method validation,
input validation, token stripping on status JSON, and maintenance
window overrides.
2026-05-23 21:10:32 -04:00
lerko 94296e8286 test(monitor): add comprehensive test suite for engine and checkers
55 tests covering state machine transitions, heartbeat handling, push
deadline checks, group aggregation, history recording, probe aggregation,
log management, state management, and concurrency safety.

Checker tests cover HTTP (via httptest), port (via net.Listen),
isCodeAccepted ranges, and siteTimeout defaults. Ping and DNS
checkers skipped (need ICMP privileges and DNS server).

Coverage: 64.2% overall, 100% on handleStatusChange, triggerAlert,
checkPush, recordCheck, and AggregateStatus.
2026-05-23 21:06:28 -04:00
lerko 4b5495fb49 fix(monitor): add jitter to check intervals and stagger startup
Monitors with the same interval no longer fire simultaneously.
Each tick adds up to 10% random jitter. Initial checks stagger
over 0-3s to avoid thundering herd on startup.
2026-05-23 20:05:30 -04:00
lerko 4891843c94 fix: graceful shutdown for HTTP, SSH servers and database
HTTP and SSH servers now shut down cleanly on SIGINT/SIGTERM with a
30s timeout. Database connection closed via defer. Replaced log.Fatalf
in SSH goroutine with log.Printf + ErrServerClosed check to prevent
unclean process exits.
2026-05-23 13:23:27 -04:00
lerko 93c5b638cf fix(server): constant-time secret comparison, request size limits
Replace string equality checks on cluster secret with
crypto/subtle.ConstantTimeCompare to prevent timing attacks.
Add http.MaxBytesReader (1MB) to all POST endpoints that decode
JSON bodies. Change Start() to return *http.Server for graceful
shutdown support. Replace log.Fatalf with log.Printf in HTTP
server goroutine.
2026-05-23 13:20:28 -04:00
lerko 8e6d97710b fix(alert): add context to Provider.Send, log alert failures
Provider.Send now accepts context.Context for timeout/cancellation.
HTTPProvider and NtfyProvider use NewRequestWithContext so HTTP alerts
respect the 30s deadline. triggerAlert logs send failures and config
load errors instead of silently swallowing them.
2026-05-23 13:18:04 -04:00
lerko ae141c62ba fix(store): replace panic with error return, handle unmarshal errors
generateToken() now returns (string, error) instead of panicking on
crypto/rand failure. All json.Unmarshal calls for alert settings now
check and propagate errors instead of silently ignoring them.

Adds Close() to Store interface for graceful shutdown support.
Skips malformed notification entries during Kuma import.
2026-05-23 13:15:39 -04:00
lerko ba53845193 Merge pull request 'fix(tui): visual polish and layout improvements' (#18) from fix/tui-visual-polish into main
Reviewed-on: lerko/uptime#18
2026-05-23 16:12:57 +00:00
lerko fb11e9ba85 fix(tui): stable monitor count and universal group icons
Site count in tab label and footer now reflects total monitors
(excluding groups) regardless of collapse state. Down count also
computed from all sites so collapsed groups with down children
still surface in the badge. Replaced Nerd Font folder glyphs
with standard Unicode triangles for cross-font compatibility.
2026-05-23 11:01:34 -04:00
lerko e84b64f8ed feat(tui): zebra striping, detail breadcrumb, sparkline stats, collapse persistence
Add alternating row backgrounds for easier table scanning. Detail panel
now shows breadcrumb path (Sites > Group > Name) and min/avg/max latency
stats below the sparkline. Group collapse state persists across restarts
via new preferences table in both SQLite and Postgres.
2026-05-22 20:53:23 -04:00
lerko 88e4f0ed69 fix(tui): group selection highlight, layout constants, group history graphs
Group rows now show selection background when navigated to. Layout
chrome extracted to named constants to prevent viewport drift. Groups
display aggregate history as dot sparkline (●) distinct from site
bar sparklines, with uptime computed from active children only.
Paused and maintenance children excluded from group aggregates.
2026-05-22 20:26:49 -04:00
lerko 8e948bf187 Merge pull request 'feat: incident management and maintenance windows' (#17) from feat/incident-management into main
Reviewed-on: lerko/uptime#17
2026-05-22 23:34:16 +00:00
lerko dc672d6cba fix(tui): exclude maintenance'd monitors from down count and pulse
Sites badge, status line, and pulse indicator now skip monitors under
maintenance when counting DOWN — consistent with group behavior.
2026-05-22 19:25:27 -04:00
lerko a89584dac1 fix(engine): skip children in maintenance when computing group status
Group status now treats maintenance'd children like paused ones —
they're excluded from the UP/DOWN calculation. Prevents group from
showing DOWN when its only failing child is under maintenance.
2026-05-22 19:19:08 -04:00
lerko d437f54797 fix(tui): constrain form height to terminal and forward resize events
Forms overflowed past terminal because huh didn't know about the
surrounding chrome (header, footer, padding). Now sets WithHeight()
on every render and forwards WindowSizeMsg during form state.
2026-05-22 19:06:27 -04:00
lerko b146f34d19 feat: add incident management and maintenance windows
Maintenance windows suppress alerts during planned downtime while checks
continue running. Incidents provide informational tracking. Supports
targeting all monitors, single monitor, or group (applies to children).

New Maint tab in TUI with create/end/delete. Status page, JSON API, and
Prometheus metrics all reflect maintenance state.
2026-05-22 18:45:02 -04:00
lerko 5de834465f Merge pull request 'fix(tui): correct viewport sizing and dynamic chrome calculation' (#16) from fix/tui-viewport-sizing into main
Reviewed-on: lerko/uptime#16
2026-05-22 22:22:10 +00:00
lerko ea401136a9 fix(tui): correct viewport sizing and dynamic chrome calculation
Replace hardcoded row offset with counted chrome lines, account for
filter bar, and fix log viewport dimensions.
2026-05-22 18:19:08 -04:00
lerko 5a9b19b3e8 chore: add production docker-compose.yml
Single-container SQLite deploy for `docker compose up -d`.
2026-05-22 15:00:09 -04:00
70 changed files with 6836 additions and 1086 deletions
+12
View File
@@ -1,3 +1,15 @@
.git
tmp/
vendor/
# Security: keep sensitive/local files out of Docker build context
.ssh/
.claude/
.github/
.gitea/
CLAUDE.md
*.local.json
*.local.md
*.local
*.db
*.db-journal
+73
View File
@@ -0,0 +1,73 @@
name: CI
on:
push:
branches: [main]
pull_request:
env:
GO_VERSION: "1.26"
jobs:
test:
runs-on: ubuntu-latest
defaults:
run:
shell: sh
steps:
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
with:
go-version: "1.26"
- uses: actions/cache@v4
with:
path: |
~/go/pkg/mod
~/.cache/go-build
key: go-${{ hashFiles('go.sum') }}
restore-keys: go-
- name: Install build tools
run: apk add --no-cache gcc musl-dev
- name: Download modules
run: go mod download
- name: Test
run: CGO_ENABLED=1 go test -race -timeout 120s ./...
lint:
runs-on: ubuntu-latest
defaults:
run:
shell: sh
steps:
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
with:
go-version: "1.26"
- uses: golangci/golangci-lint-action@v7
with:
version: v2.11.2
vulncheck:
runs-on: ubuntu-latest
defaults:
run:
shell: sh
steps:
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
with:
go-version: "1.26"
- name: Install govulncheck
run: go install golang.org/x/vuln/cmd/govulncheck@latest
- name: Run govulncheck
run: govulncheck ./...
+68
View File
@@ -0,0 +1,68 @@
name: Release
on:
push:
tags:
- "[0-9]*"
jobs:
release:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: actions/setup-go@v5
with:
go-version: "1.26"
- uses: actions/cache@v4
with:
path: |
~/go/pkg/mod
~/.cache/go-build
key: release-go-${{ hashFiles('go.sum') }}
restore-keys: release-go-
- name: Run GoReleaser
uses: goreleaser/goreleaser-action@v7
with:
distribution: goreleaser
version: "~> v2"
args: release --clean
env:
GORELEASER_FORCE_TOKEN: gitea
GITEA_TOKEN: ${{ secrets.RELEASE_TOKEN }}
docker:
runs-on: ubuntu-latest
needs: [release]
steps:
- uses: actions/checkout@v4
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Log in to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Build and push
uses: docker/build-push-action@v5
with:
context: .
push: true
platforms: linux/amd64,linux/arm64
tags: |
lerkolabs/uptop:${{ github.ref_name }}
lerkolabs/uptop:latest
build-args: |
VERSION=${{ github.ref_name }}
COMMIT=${{ github.sha }}
BUILD_DATE=${{ github.event.head_commit.timestamp }}
-45
View File
@@ -1,45 +0,0 @@
name: Publish Release
on:
push:
tags:
- 'v*'
jobs:
push_to_registry:
name: Build and Push Docker Image
runs-on: ubuntu-latest
steps:
- name: Check out the repo
uses: actions/checkout@v4
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Log in to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Extract metadata (tags, labels)
id: meta
uses: docker/metadata-action@v5
with:
images: ${{ secrets.DOCKERHUB_USERNAME }}/go-upkeep
tags: |
# This turns git tag "v1.0.0" into docker tag "1.0.0"
type=semver,pattern={{version}}
# This updates the "latest" tag to this version
type=raw,value=latest
- name: Build and push
uses: docker/build-push-action@v5
with:
context: .
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
+2 -5
View File
@@ -26,8 +26,8 @@ go.work
# End of https://www.toptal.com/developers/gitignore/api/go
/goupkeep
upkeep.db
/uptop
uptop.db
.ssh
@@ -35,8 +35,5 @@ authorized_keys
tmp
# Old repo
/go-upkeep/
*.local.json
*.local.md
+29
View File
@@ -0,0 +1,29 @@
version: "2"
linters:
default: none
enable:
- errcheck
- staticcheck
- govet
- gosec
- ineffassign
- unused
settings:
errcheck:
check-type-assertions: false
check-blank: false
exclusions:
presets:
- std-error-handling
- common-false-positives
rules:
- path: _test\.go
linters:
- errcheck
- gosec
run:
timeout: 5m
+42
View File
@@ -0,0 +1,42 @@
version: 2
gitea_urls:
api: https://gitea.lerkolabs.com/api/v1
download: https://gitea.lerkolabs.com
release:
gitea:
owner: lerko
name: uptop
builds:
- main: ./cmd/uptop/main.go
binary: uptop
env:
- CGO_ENABLED=1
goos:
- linux
goarch:
- amd64
ldflags:
- -s -w
- -X main.version={{ .Version }}
- -X main.commit={{ .Commit }}
- -X main.date={{ .Date }}
flags:
- -trimpath
archives:
- formats: [tar.gz]
name_template: "{{ .ProjectName }}_{{ .Os }}_{{ .Arch }}"
checksum:
name_template: checksums.txt
changelog:
sort: asc
filters:
exclude:
- "^docs:"
- "^chore:"
- "^style:"
+46
View File
@@ -0,0 +1,46 @@
# Changelog
## [2026.05.2] — 2026-05-23
### Added
- Comprehensive test suite (94 tests across monitor, server, cluster)
- golangci-lint config with CI enforcement
- Gitea Actions CI pipeline (test + lint)
- Graceful shutdown for HTTP and SSH servers
- Context-aware alert delivery with timeout
- Request size limits on all POST endpoints
- Constant-time secret comparison
- Check interval jitter to prevent thundering herd
- `--version` flag with build metadata injection
### Fixed
- Silent JSON unmarshal failures in alert settings
- Panic on crypto/rand failure replaced with error return
- Alert delivery errors now logged instead of swallowed
- log.Fatalf in goroutines replaced with log.Printf
- Deprecated LineUp/LineDown API calls
### Security
- Cluster secret compared with crypto/subtle (timing-safe)
- http.MaxBytesReader on all JSON endpoints
- ReadHeaderTimeout added to HTTP server
## [2026.05.1] — 2026-05-14
### Added
- Distributed probing with leader + probe nodes
- Config-as-code (YAML apply/export with dry-run, prune)
- TUI visual polish (zebra striping, sparklines, breadcrumbs)
- Incident management and maintenance windows
- 9 alert providers (Discord, Slack, Email, Ntfy, Telegram, PagerDuty, Pushover, Gotify, Webhook)
## [2026.04.1] — Initial independent fork
### Added
- SSH-accessible TUI (Bubble Tea + Wish)
- 6 check types (HTTP, Push, Ping, Port, DNS, Group)
- SQLite and PostgreSQL support
- HA clustering with automatic failover
- Prometheus metrics endpoint
- Public status page
- Uptime Kuma import
+23
View File
@@ -0,0 +1,23 @@
# Contributing
## Development
```sh
go run cmd/uptop/main.go -demo # starts with sample data
ssh -p 23234 localhost # connect to TUI
```
## Tests
```sh
go test ./... # unit tests
go test -race ./... # race detector
golangci-lint run ./... # linting
```
## Pull Requests
- Branch from `main`, PR back to `main`
- Conventional Commits for messages (`feat:`, `fix:`, `chore:`)
- Tests must pass, linter must be clean
- One logical change per PR
+16 -10
View File
@@ -1,28 +1,34 @@
# --- Stage 1: Builder ---
FROM golang:alpine AS builder
FROM golang:1.26-alpine3.23 AS builder
RUN apk add --no-cache gcc musl-dev
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
RUN --mount=type=cache,target=/go/pkg/mod \
go mod download
COPY . .
ENV CGO_ENABLED=1
RUN go build -ldflags="-s -w" -o go-upkeep ./cmd/goupkeep/main.go
ARG VERSION=dev
ARG COMMIT=none
ARG BUILD_DATE=unknown
RUN --mount=type=cache,target=/go/pkg/mod \
--mount=type=cache,target=/root/.cache/go-build \
go build -trimpath -ldflags="-s -w -X main.version=${VERSION} -X main.commit=${COMMIT} -X main.date=${BUILD_DATE}" -o uptop ./cmd/uptop/main.go
# --- Stage 2: Runner ---
FROM alpine:latest
FROM alpine:3.23
WORKDIR /app
RUN apk add --no-cache ca-certificates openssh-client
RUN mkdir /data
COPY --from=builder /app/go-upkeep .
COPY --from=builder /app/uptop .
# Set Default Configuration via ENV
# Docker users can override these in docker-compose.yml
ENV LIPGLOSS_RENDERER_HAS_DARK_BACKGROUND=true
ENV UPKEEP_DB_TYPE=sqlite
ENV UPKEEP_DB_DSN=/data/upkeep.db
ENV UPKEEP_KEYS=/data/authorized_keys
ENV UPKEEP_PORT=23234
ENV UPTOP_DB_TYPE=sqlite
ENV UPTOP_DB_DSN=/data/uptop.db
ENV UPTOP_KEYS=/data/authorized_keys
ENV UPTOP_PORT=23234
EXPOSE 23234
CMD ["./go-upkeep"]
CMD ["./uptop"]
+1
View File
@@ -1,6 +1,7 @@
MIT License
Copyright (c) 2026 Roman Dvořák
Copyright (c) 2026 Tyler Koenig
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
+53 -23
View File
@@ -1,8 +1,8 @@
# Go-Upkeep
# uptop
Self-hosted uptime monitor with a TUI you can access over SSH. No browser, no install on the client — just `ssh -p 23234 your-server`.
Originally forked from [RDGames/go-upkeep](https://github.com/RDGames/go-upkeep). This is an independent fork with significant additions.
Built on the foundation of [RDGames/go-upkeep](https://github.com/RDGames/go-upkeep).
## What it does
@@ -18,30 +18,49 @@ Originally forked from [RDGames/go-upkeep](https://github.com/RDGames/go-upkeep)
## Quick start
```bash
go run cmd/goupkeep/main.go
go run cmd/uptop/main.go
ssh -p 23234 localhost
```
Seed some demo data to see it in action:
```bash
go run cmd/goupkeep/main.go -demo
go run cmd/uptop/main.go -demo
```
## Install
### From source
```bash
go install gitea.lerkolabs.com/lerko/uptop/cmd/uptop@latest
```
### Docker
```bash
docker pull lerko/uptop:latest
docker run -p 23234:23234 -p 8080:8080 -v ./data:/data lerko/uptop
```
### Binary
Download from [Releases](https://gitea.lerkolabs.com/lerko/uptop/releases).
## Config as code
Export your current monitors:
```bash
goupkeep export -o monitors.yaml
uptop export -o monitors.yaml
```
Apply a config file:
```bash
goupkeep apply -f monitors.yaml
goupkeep apply -f monitors.yaml --dry-run # see what would change
goupkeep apply -f monitors.yaml --prune # delete anything not in the YAML
uptop apply -f monitors.yaml
uptop apply -f monitors.yaml --dry-run # see what would change
uptop apply -f monitors.yaml --prune # delete anything not in the YAML
```
See [docs/config-as-code.md](docs/config-as-code.md) for the full reference.
@@ -62,28 +81,39 @@ services:
- ./data:/data
- ./ssh_keys:/app/.ssh
environment:
- UPKEEP_DB_TYPE=sqlite
- UPKEEP_DB_DSN=/data/upkeep.db
- UPKEEP_STATUS_ENABLED=true
- UPKEEP_CLUSTER_SECRET=change-me
- UPTOP_DB_TYPE=sqlite
- UPTOP_DB_DSN=/data/uptop.db
- UPTOP_STATUS_ENABLED=true
- UPTOP_CLUSTER_SECRET=change-me
```
First run: attach to the container (`docker attach go-upkeep`), go to the Users tab, add your SSH public key. Then detach with `Ctrl+P, Ctrl+Q` and connect normally over SSH.
First run: attach to the container (`docker attach uptop`), go to the Users tab, add your SSH public key. Then detach with `Ctrl+P, Ctrl+Q` and connect normally over SSH.
## Environment variables
| Variable | Default | What it does |
|---|---|---|
| `UPKEEP_PORT` | `23234` | SSH server port |
| `UPKEEP_HTTP_PORT` | `8080` | HTTP server port (status page, push, metrics) |
| `UPKEEP_DB_TYPE` | `sqlite` | `sqlite` or `postgres` |
| `UPKEEP_DB_DSN` | `upkeep.db` | Database path or connection string |
| `UPKEEP_STATUS_ENABLED` | `false` | Enable public status page |
| `UPKEEP_STATUS_TITLE` | `System Status` | Status page title |
| `UPKEEP_CLUSTER_MODE` | `leader` | `leader` or `follower` |
| `UPKEEP_PEER_URL` | | Leader URL for follower nodes |
| `UPKEEP_CLUSTER_SECRET` | | Shared key for cluster + API auth |
| `UPKEEP_INSECURE_SKIP_VERIFY` | `false` | Skip TLS verification for checks |
| `UPTOP_PORT` | `23234` | SSH server port |
| `UPTOP_HTTP_PORT` | `8080` | HTTP server port (status page, push, metrics) |
| `UPTOP_DB_TYPE` | `sqlite` | `sqlite` or `postgres` |
| `UPTOP_DB_DSN` | `uptop.db` | Database path or connection string |
| `UPTOP_STATUS_ENABLED` | `false` | Enable public status page |
| `UPTOP_STATUS_TITLE` | `System Status` | Status page title |
| `UPTOP_CLUSTER_MODE` | `leader` | `leader` or `follower` |
| `UPTOP_PEER_URL` | | Leader URL for follower nodes |
| `UPTOP_CLUSTER_SECRET` | | Shared key for cluster + API auth |
| `UPTOP_INSECURE_SKIP_VERIFY` | `false` | Skip TLS verification for checks |
## Migrating from Uptime Kuma
Export your Kuma backup JSON, then:
```bash
curl -X POST http://localhost:8080/api/import/kuma \
-H "X-Upkeep-Secret: your-secret" \
-H "Content-Type: application/json" \
-d @kuma-backup.json
```
## License
+19
View File
@@ -0,0 +1,19 @@
# Security Policy
## Reporting a Vulnerability
If you find a security issue, please email security@lerkolabs.com rather than opening a public issue.
Include:
- Description of the vulnerability
- Steps to reproduce
- Potential impact
We'll acknowledge within 48 hours and aim to patch within 7 days for critical issues.
## Scope
- SSH server authentication
- Cluster API authentication
- Stored credentials (alert provider tokens)
- Status page information leakage
-368
View File
@@ -1,368 +0,0 @@
package main
import (
"context"
"flag"
"fmt"
"go-upkeep/internal/cluster"
"go-upkeep/internal/config"
"go-upkeep/internal/importer"
"go-upkeep/internal/models"
"go-upkeep/internal/monitor"
"go-upkeep/internal/server"
"go-upkeep/internal/store"
"go-upkeep/internal/tui"
"log"
"os"
"os/signal"
"strconv"
"syscall"
tea "github.com/charmbracelet/bubbletea"
"github.com/charmbracelet/ssh"
"github.com/charmbracelet/wish"
bm "github.com/charmbracelet/wish/bubbletea"
"github.com/mattn/go-isatty"
)
func main() {
log.SetOutput(os.Stderr)
if len(os.Args) >= 2 {
switch os.Args[1] {
case "apply":
runApply(os.Args[2:])
return
case "export":
runExport(os.Args[2:])
return
}
}
runServe(os.Args[1:])
}
func envOrDefault(key, fallback string) string {
if v := os.Getenv(key); v != "" {
return v
}
return fallback
}
func openStore(dbType, dsn string) store.Store {
var s store.Store
var err error
if dbType == "postgres" {
s, err = store.NewPostgresStore(dsn)
} else {
s, err = store.NewSQLiteStore(dsn)
}
if err != nil {
fmt.Fprintf(os.Stderr, "database error: %v\n", err)
os.Exit(1)
}
if err := s.Init(); err != nil {
fmt.Fprintf(os.Stderr, "database init error: %v\n", err)
os.Exit(1)
}
return s
}
func runApply(args []string) {
fs := flag.NewFlagSet("apply", flag.ExitOnError)
filePath := fs.String("f", "", "Path to YAML config file (required)")
dryRun := fs.Bool("dry-run", false, "Show planned changes without applying")
prune := fs.Bool("prune", false, "Delete monitors/alerts not in YAML")
dbType := fs.String("db-type", envOrDefault("UPKEEP_DB_TYPE", "sqlite"), "Database type")
dsn := fs.String("dsn", envOrDefault("UPKEEP_DB_DSN", "upkeep.db"), "Database DSN")
fs.Parse(args)
if *filePath == "" {
fmt.Fprintln(os.Stderr, "error: -f flag is required")
fs.Usage()
os.Exit(1)
}
s := openStore(*dbType, *dsn)
f, err := config.LoadFile(*filePath)
if err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err)
os.Exit(1)
}
changes, err := config.Apply(s, f, config.ApplyOpts{
DryRun: *dryRun,
Prune: *prune,
})
if err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err)
os.Exit(1)
}
fmt.Print(config.FormatChanges(changes, *dryRun))
}
func runExport(args []string) {
fs := flag.NewFlagSet("export", flag.ExitOnError)
outPath := fs.String("o", "-", "Output file path (- for stdout)")
dbType := fs.String("db-type", envOrDefault("UPKEEP_DB_TYPE", "sqlite"), "Database type")
dsn := fs.String("dsn", envOrDefault("UPKEEP_DB_DSN", "upkeep.db"), "Database DSN")
fs.Parse(args)
s := openStore(*dbType, *dsn)
f, err := config.Export(s)
if err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err)
os.Exit(1)
}
if err := config.WriteFile(f, *outPath); err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err)
os.Exit(1)
}
}
func runServe(args []string) {
portVal := 23234
dbType := "sqlite"
dbDSN := "upkeep.db"
httpPort := 8080
enableStatus := false
statusTitle := "System Status"
clusterMode := "leader"
clusterPeer := ""
clusterKey := ""
if v := os.Getenv("UPKEEP_PORT"); v != "" {
if p, err := strconv.Atoi(v); err == nil {
portVal = p
}
}
if v := os.Getenv("UPKEEP_DB_TYPE"); v != "" {
dbType = v
}
if v := os.Getenv("UPKEEP_DB_DSN"); v != "" {
dbDSN = v
}
if v := os.Getenv("UPKEEP_HTTP_PORT"); v != "" {
if p, err := strconv.Atoi(v); err == nil {
httpPort = p
}
}
if v := os.Getenv("UPKEEP_STATUS_ENABLED"); v == "true" {
enableStatus = true
}
if v := os.Getenv("UPKEEP_STATUS_TITLE"); v != "" {
statusTitle = v
}
if v := os.Getenv("UPKEEP_CLUSTER_MODE"); v != "" {
clusterMode = v
}
if v := os.Getenv("UPKEEP_PEER_URL"); v != "" {
clusterPeer = v
}
if v := os.Getenv("UPKEEP_CLUSTER_SECRET"); v != "" {
clusterKey = v
}
nodeID := os.Getenv("UPKEEP_NODE_ID")
nodeName := os.Getenv("UPKEEP_NODE_NAME")
nodeRegion := os.Getenv("UPKEEP_NODE_REGION")
aggStrategy := os.Getenv("UPKEEP_AGG_STRATEGY")
if clusterMode == "probe" {
if nodeID == "" {
fmt.Fprintln(os.Stderr, "UPKEEP_NODE_ID is required for probe mode")
os.Exit(1)
}
if clusterPeer == "" {
fmt.Fprintln(os.Stderr, "UPKEEP_PEER_URL is required for probe mode")
os.Exit(1)
}
fmt.Printf("Cluster: Running as PROBE (node=%s, region=%s)\n", nodeID, nodeRegion)
ctx, cancel := context.WithCancel(context.Background())
done := make(chan os.Signal, 1)
signal.Notify(done, os.Interrupt, syscall.SIGINT, syscall.SIGTERM)
go func() {
<-done
cancel()
}()
if err := cluster.RunProbe(ctx, cluster.ProbeConfig{
NodeID: nodeID,
NodeName: nodeName,
Region: nodeRegion,
LeaderURL: clusterPeer,
SharedKey: clusterKey,
Interval: 30,
}); err != nil {
fmt.Fprintf(os.Stderr, "Probe error: %v\n", err)
}
return
}
fs := flag.NewFlagSet("serve", flag.ExitOnError)
port := fs.Int("port", portVal, "SSH Port")
flagDBType := fs.String("db-type", dbType, "Database type")
flagDSN := fs.String("dsn", dbDSN, "Database DSN")
demo := fs.Bool("demo", false, "Seed demo data")
importKuma := fs.String("import-kuma", "", "Import Uptime Kuma backup JSON file")
fs.Parse(args)
var s store.Store
var dbErr error
if *flagDBType == "postgres" {
s, dbErr = store.NewPostgresStore(*flagDSN)
fmt.Printf("Using PostgreSQL: %s\n", *flagDSN)
} else {
s, dbErr = store.NewSQLiteStore(*flagDSN)
fmt.Printf("Using SQLite: %s\n", *flagDSN)
}
if dbErr != nil {
fmt.Printf("Database connection error: %v\n", dbErr)
os.Exit(1)
}
if err := s.Init(); err != nil {
fmt.Printf("Database init error: %v\n", err)
os.Exit(1)
}
if *demo {
seedDemoData(s)
}
if *importKuma != "" {
kb, err := importer.LoadKumaFile(*importKuma)
if err != nil {
fmt.Printf("Kuma import error: %v\n", err)
os.Exit(1)
}
backup := importer.ConvertKuma(kb)
if err := s.ImportData(backup); err != nil {
fmt.Printf("Import failed: %v\n", err)
os.Exit(1)
}
fmt.Printf("Imported %d monitors and %d alerts from Uptime Kuma v%s\n", len(backup.Sites), len(backup.Alerts), kb.Version)
}
eng := monitor.NewEngine(s)
if os.Getenv("UPKEEP_INSECURE_SKIP_VERIFY") == "true" {
eng.SetInsecureSkipVerify(true)
}
if aggStrategy != "" {
eng.SetAggStrategy(monitor.AggregationStrategy(aggStrategy))
}
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
eng.InitHistory()
eng.InitLogs()
eng.Start(ctx)
server.Start(server.ServerConfig{
Port: httpPort,
EnableStatus: enableStatus,
Title: statusTitle,
ClusterKey: clusterKey,
}, s, eng)
cluster.Start(ctx, cluster.Config{
Mode: clusterMode,
PeerURL: clusterPeer,
SharedKey: clusterKey,
}, eng)
startSSHServer(*port, s, eng)
if isatty.IsTerminal(os.Stdout.Fd()) || isatty.IsCygwinTerminal(os.Stdout.Fd()) {
p := tea.NewProgram(tui.InitialModel(true, s, eng), tea.WithAltScreen(), tea.WithMouseCellMotion())
if _, err := p.Run(); err != nil {
fmt.Printf("Error: %v\n", err)
}
} else {
fmt.Println("Go-Upkeep running in HEADLESS mode")
done := make(chan os.Signal, 1)
signal.Notify(done, os.Interrupt, syscall.SIGINT, syscall.SIGTERM)
<-done
fmt.Println("Shutting down...")
}
cancel()
}
func startSSHServer(port int, db store.Store, eng *monitor.Engine) {
s, err := wish.NewServer(
wish.WithAddress(fmt.Sprintf(":%d", port)),
wish.WithHostKeyPath(".ssh/id_ed25519"),
wish.WithPublicKeyAuth(func(ctx ssh.Context, key ssh.PublicKey) bool {
return isKeyAllowed(db, key)
}),
wish.WithMiddleware(
bm.Middleware(func(s ssh.Session) (tea.Model, []tea.ProgramOption) {
return tui.InitialModel(false, db, eng), []tea.ProgramOption{tea.WithAltScreen(), tea.WithMouseCellMotion()}
}),
),
)
if err != nil {
fmt.Printf("SSH server error: %v\n", err)
return
}
go func() {
if err := s.ListenAndServe(); err != nil {
log.Fatalf("SSH server failed: %v", err)
}
}()
}
func seedDemoData(s store.Store) {
existing, _ := s.GetSites()
if len(existing) > 0 {
return
}
fmt.Println("Seeding demo data...")
s.AddAlert("Discord Ops", "discord", map[string]string{"url": "https://discord.com/api/webhooks/demo/token"})
s.AddAlert("Slack Infra", "slack", map[string]string{"url": "https://hooks.slack.com/services/DEMO/WEBHOOK"})
s.AddAlert("Email Oncall", "email", map[string]string{
"host": "smtp.example.com", "port": "587",
"user": "oncall@example.com", "pass": "replace-me",
"from": "oncall@example.com", "to": "team@example.com",
})
alerts, _ := s.GetAllAlerts()
alertID := 0
if len(alerts) > 0 {
alertID = alerts[0].ID
}
s.AddSite(models.Site{Name: "Google", URL: "https://www.google.com", Type: "http", Interval: 30, AlertID: alertID, CheckSSL: true, ExpiryThreshold: 14, MaxRetries: 2})
s.AddSite(models.Site{Name: "GitHub", URL: "https://github.com", Type: "http", Interval: 30, AlertID: alertID, CheckSSL: true, ExpiryThreshold: 7, MaxRetries: 3})
s.AddSite(models.Site{Name: "Cloudflare DNS", URL: "https://1.1.1.1", Type: "http", Interval: 60, AlertID: alertID, ExpiryThreshold: 7, MaxRetries: 1})
s.AddSite(models.Site{Name: "JSON Placeholder", URL: "https://jsonplaceholder.typicode.com/posts/1", Type: "http", Interval: 45, AlertID: alertID, ExpiryThreshold: 7, MaxRetries: 2})
s.AddSite(models.Site{Name: "Nonexistent Site", URL: "https://this-domain-does-not-exist-12345.com", Type: "http", Interval: 30, AlertID: alertID, ExpiryThreshold: 7, MaxRetries: 3})
s.AddSite(models.Site{Name: "Bad Port", URL: "https://localhost:19999", Type: "http", Interval: 30, ExpiryThreshold: 7, MaxRetries: 1})
s.AddSite(models.Site{Name: "Backup Cron", Type: "push", Interval: 300, AlertID: alertID, ExpiryThreshold: 7})
s.AddSite(models.Site{Name: "DB Healthcheck", Type: "push", Interval: 120, AlertID: alertID, ExpiryThreshold: 7})
s.AddSite(models.Site{Name: "Gateway", Type: "ping", Interval: 30, AlertID: alertID, Hostname: "10.0.0.1", Timeout: 5, ExpiryThreshold: 7})
s.AddSite(models.Site{Name: "SSH Server", Type: "port", Interval: 60, AlertID: alertID, Hostname: "10.0.0.1", Port: 22, Timeout: 5, ExpiryThreshold: 7})
}
func isKeyAllowed(db store.Store, incomingKey ssh.PublicKey) bool {
users, err := db.GetAllUsers()
if err != nil {
return false
}
for _, u := range users {
allowedKey, _, _, _, err := ssh.ParseAuthorizedKey([]byte(u.PublicKey))
if err != nil {
continue
}
if ssh.KeysEqual(allowedKey, incomingKey) {
return true
}
}
return false
}
+645
View File
@@ -0,0 +1,645 @@
package main
import (
"bufio"
"context"
"errors"
"flag"
"fmt"
"log"
"net/url"
"os"
"os/signal"
"path/filepath"
"strconv"
"strings"
"sync"
"syscall"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/cluster"
"gitea.lerkolabs.com/lerko/uptop/internal/config"
"gitea.lerkolabs.com/lerko/uptop/internal/importer"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
"gitea.lerkolabs.com/lerko/uptop/internal/server"
"gitea.lerkolabs.com/lerko/uptop/internal/store"
"gitea.lerkolabs.com/lerko/uptop/internal/tui"
tea "github.com/charmbracelet/bubbletea"
"github.com/charmbracelet/ssh"
"github.com/charmbracelet/wish"
bm "github.com/charmbracelet/wish/bubbletea"
"github.com/mattn/go-isatty"
)
var (
version = "dev"
commit = "none"
date = "unknown"
)
func main() {
log.SetOutput(os.Stderr)
if len(os.Args) >= 2 {
switch os.Args[1] {
case "apply":
runApply(os.Args[2:])
return
case "export":
runExport(os.Args[2:])
return
case "version", "--version", "-v":
printVersion()
return
case "migrate-secrets":
runMigrateSecrets(os.Args[2:])
return
}
}
runServe(os.Args[1:])
}
func printVersion() {
if version == "dev" {
fmt.Println("uptop dev")
} else {
fmt.Printf("uptop %s (%s, %s)\n", version, commit, date)
}
}
func envOrDefault(key, fallback string) string {
if v := os.Getenv(key); v != "" {
return v
}
return fallback
}
func redactDSN(dsn string) string {
u, err := url.Parse(dsn)
if err != nil {
return "***"
}
u.User = nil
return u.String()
}
func openStore(dbType, dsn string) store.Store {
var ss *store.SQLStore
var err error
if dbType == "postgres" {
ss, err = store.NewPostgresStore(dsn)
} else {
ss, err = store.NewSQLiteStore(dsn)
}
if err != nil {
fmt.Fprintf(os.Stderr, "database error: %v\n", err)
os.Exit(1)
}
if encKey := os.Getenv("UPTOP_ENCRYPTION_KEY"); encKey != "" {
enc, err := store.NewEncryptor(encKey)
if err != nil {
fmt.Fprintf(os.Stderr, "encryption key error: %v\n", err)
os.Exit(1)
}
ss.SetEncryptor(enc)
} else {
fmt.Println("WARNING: No UPTOP_ENCRYPTION_KEY set. Alert credentials stored unencrypted.")
}
if err := ss.Init(); err != nil {
fmt.Fprintf(os.Stderr, "database init error: %v\n", err)
os.Exit(1)
}
return ss
}
func runApply(args []string) {
fs := flag.NewFlagSet("apply", flag.ExitOnError)
filePath := fs.String("f", "", "Path to YAML config file (required)")
dryRun := fs.Bool("dry-run", false, "Show planned changes without applying")
prune := fs.Bool("prune", false, "Delete monitors/alerts not in YAML")
dbType := fs.String("db-type", envOrDefault("UPTOP_DB_TYPE", "sqlite"), "Database type")
dsn := fs.String("dsn", envOrDefault("UPTOP_DB_DSN", "uptop.db"), "Database DSN")
_ = fs.Parse(args) // ExitOnError: parse errors exit before returning
if *filePath == "" {
fmt.Fprintln(os.Stderr, "error: -f flag is required")
fs.Usage()
os.Exit(1)
}
s := openStore(*dbType, *dsn)
f, err := config.LoadFile(*filePath)
if err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err)
os.Exit(1)
}
changes, err := config.Apply(s, f, config.ApplyOpts{
DryRun: *dryRun,
Prune: *prune,
})
if err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err)
os.Exit(1)
}
fmt.Print(config.FormatChanges(changes, *dryRun))
}
func runExport(args []string) {
fs := flag.NewFlagSet("export", flag.ExitOnError)
outPath := fs.String("o", "-", "Output file path (- for stdout)")
dbType := fs.String("db-type", envOrDefault("UPTOP_DB_TYPE", "sqlite"), "Database type")
dsn := fs.String("dsn", envOrDefault("UPTOP_DB_DSN", "uptop.db"), "Database DSN")
_ = fs.Parse(args) // ExitOnError: parse errors exit before returning
s := openStore(*dbType, *dsn)
f, err := config.Export(s)
if err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err)
os.Exit(1)
}
if err := config.WriteFile(f, *outPath); err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err)
os.Exit(1)
}
}
func runMigrateSecrets(args []string) {
fs := flag.NewFlagSet("migrate-secrets", flag.ExitOnError)
dbType := fs.String("db-type", envOrDefault("UPTOP_DB_TYPE", "sqlite"), "Database type")
dsn := fs.String("dsn", envOrDefault("UPTOP_DB_DSN", "uptop.db"), "Database DSN")
_ = fs.Parse(args)
encKey := os.Getenv("UPTOP_ENCRYPTION_KEY")
if encKey == "" {
fmt.Fprintln(os.Stderr, "error: UPTOP_ENCRYPTION_KEY must be set")
os.Exit(1)
}
enc, err := store.NewEncryptor(encKey)
if err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err)
os.Exit(1)
}
var ss *store.SQLStore
if *dbType == "postgres" {
ss, err = store.NewPostgresStore(*dsn)
} else {
ss, err = store.NewSQLiteStore(*dsn)
}
if err != nil {
fmt.Fprintf(os.Stderr, "database error: %v\n", err)
os.Exit(1)
}
if err := ss.Init(); err != nil {
fmt.Fprintf(os.Stderr, "database init error: %v\n", err)
os.Exit(1)
}
alerts, err := ss.GetAllAlerts()
if err != nil {
fmt.Fprintf(os.Stderr, "error loading alerts: %v\n", err)
os.Exit(1)
}
ss.SetEncryptor(enc)
migrated := 0
for _, a := range alerts {
if err := ss.UpdateAlert(a.ID, a.Name, a.Type, a.Settings); err != nil {
fmt.Fprintf(os.Stderr, "error migrating alert %q: %v\n", a.Name, err)
os.Exit(1)
}
migrated++
}
fmt.Printf("Migrated %d alert(s) to encrypted storage.\n", migrated)
}
func runServe(args []string) {
portVal := 23234
dbType := "sqlite"
dbDSN := "uptop.db"
httpPort := 8080
enableStatus := false
statusTitle := "System Status"
clusterMode := "leader"
clusterPeer := ""
clusterKey := ""
if v := os.Getenv("UPTOP_PORT"); v != "" {
if p, err := strconv.Atoi(v); err == nil {
portVal = p
}
}
if v := os.Getenv("UPTOP_DB_TYPE"); v != "" {
dbType = v
}
if v := os.Getenv("UPTOP_DB_DSN"); v != "" {
dbDSN = v
}
if v := os.Getenv("UPTOP_HTTP_PORT"); v != "" {
if p, err := strconv.Atoi(v); err == nil {
httpPort = p
}
}
if v := os.Getenv("UPTOP_STATUS_ENABLED"); v == "true" {
enableStatus = true
}
if v := os.Getenv("UPTOP_STATUS_TITLE"); v != "" {
statusTitle = v
}
if v := os.Getenv("UPTOP_CLUSTER_MODE"); v != "" {
clusterMode = v
}
if v := os.Getenv("UPTOP_PEER_URL"); v != "" {
clusterPeer = v
}
if v := os.Getenv("UPTOP_CLUSTER_SECRET"); v != "" {
clusterKey = v
}
nodeID := os.Getenv("UPTOP_NODE_ID")
nodeName := os.Getenv("UPTOP_NODE_NAME")
nodeRegion := os.Getenv("UPTOP_NODE_REGION")
aggStrategy := os.Getenv("UPTOP_AGG_STRATEGY")
if clusterMode == "probe" {
if nodeID == "" {
fmt.Fprintln(os.Stderr, "UPTOP_NODE_ID is required for probe mode")
os.Exit(1)
}
if clusterPeer == "" {
fmt.Fprintln(os.Stderr, "UPTOP_PEER_URL is required for probe mode")
os.Exit(1)
}
fmt.Printf("Cluster: Running as PROBE (node=%s, region=%s)\n", nodeID, nodeRegion)
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
done := make(chan os.Signal, 1)
signal.Notify(done, os.Interrupt, syscall.SIGINT, syscall.SIGTERM)
go func() {
<-done
cancel()
}()
probeAllowPrivate := os.Getenv("UPTOP_ALLOW_PRIVATE_TARGETS") == "true"
if probeAllowPrivate {
fmt.Println("WARNING: Private target blocking disabled. Monitor URLs can reach internal networks.")
}
if err := cluster.RunProbe(ctx, cluster.ProbeConfig{
NodeID: nodeID,
NodeName: nodeName,
Region: nodeRegion,
LeaderURL: clusterPeer,
SharedKey: clusterKey,
Interval: 30,
AllowPrivateTargets: probeAllowPrivate,
}); err != nil {
fmt.Fprintf(os.Stderr, "Probe error: %v\n", err)
}
return
}
fs := flag.NewFlagSet("serve", flag.ExitOnError)
port := fs.Int("port", portVal, "SSH Port")
flagDBType := fs.String("db-type", dbType, "Database type")
flagDSN := fs.String("dsn", dbDSN, "Database DSN")
demo := fs.Bool("demo", false, "Seed demo data")
importKuma := fs.String("import-kuma", "", "Import Uptime Kuma backup JSON file")
_ = fs.Parse(args) // ExitOnError: parse errors exit before returning
var ss *store.SQLStore
var dbErr error
if *flagDBType == "postgres" {
ss, dbErr = store.NewPostgresStore(*flagDSN)
fmt.Printf("Using PostgreSQL: %s\n", redactDSN(*flagDSN))
} else {
ss, dbErr = store.NewSQLiteStore(*flagDSN)
fmt.Printf("Using SQLite: %s\n", *flagDSN)
}
if dbErr != nil {
fmt.Fprintf(os.Stderr, "database connection error: %v\n", dbErr)
os.Exit(1)
}
defer ss.Close()
if encKey := os.Getenv("UPTOP_ENCRYPTION_KEY"); encKey != "" {
enc, err := store.NewEncryptor(encKey)
if err != nil {
fmt.Fprintf(os.Stderr, "encryption key error: %v\n", err)
os.Exit(1)
}
ss.SetEncryptor(enc)
} else {
fmt.Println("WARNING: No UPTOP_ENCRYPTION_KEY set. Alert credentials stored unencrypted.")
}
var s store.Store = ss
if err := s.Init(); err != nil {
fmt.Fprintf(os.Stderr, "database init error: %v\n", err)
os.Exit(1)
}
if *demo {
seedDemoData(s)
}
seedKeysFromEnv(s)
if *importKuma != "" {
kb, err := importer.LoadKumaFile(*importKuma)
if err != nil {
fmt.Fprintf(os.Stderr, "kuma import error: %v\n", err)
os.Exit(1)
}
backup := importer.ConvertKuma(kb)
if err := s.ImportData(backup); err != nil {
fmt.Fprintf(os.Stderr, "import failed: %v\n", err)
os.Exit(1)
}
fmt.Printf("Imported %d monitors and %d alerts from Uptime Kuma v%s\n", len(backup.Sites), len(backup.Alerts), kb.Version)
}
allowPrivate := os.Getenv("UPTOP_ALLOW_PRIVATE_TARGETS") == "true"
if allowPrivate {
fmt.Println("WARNING: Private target blocking disabled. Monitor URLs can reach internal networks.")
}
eng := monitor.NewEngineWithOpts(s, allowPrivate)
if os.Getenv("UPTOP_INSECURE_SKIP_VERIFY") == "true" {
eng.SetInsecureSkipVerify(true)
}
if aggStrategy != "" {
eng.SetAggStrategy(monitor.AggregationStrategy(aggStrategy))
}
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
eng.InitHistory()
eng.InitLogs()
eng.Start(ctx)
tlsCert := os.Getenv("UPTOP_TLS_CERT")
tlsKey := os.Getenv("UPTOP_TLS_KEY")
httpSrv := server.Start(server.ServerConfig{
Port: httpPort,
EnableStatus: enableStatus,
Title: statusTitle,
ClusterKey: clusterKey,
TLSCert: tlsCert,
TLSKey: tlsKey,
ClusterMode: clusterMode,
MetricsPublic: os.Getenv("UPTOP_METRICS_PUBLIC") == "true",
CORSOrigin: os.Getenv("UPTOP_CORS_ORIGIN"),
}, s, eng)
cluster.Start(ctx, cluster.Config{
Mode: clusterMode,
PeerURL: clusterPeer,
SharedKey: clusterKey,
}, eng)
kc := newKeyCache(s)
sshSrv := startSSHServer(*port, s, eng, kc)
if isatty.IsTerminal(os.Stdout.Fd()) || isatty.IsCygwinTerminal(os.Stdout.Fd()) {
p := tea.NewProgram(tui.InitialModel(true, s, eng), tea.WithAltScreen(), tea.WithMouseCellMotion())
if _, err := p.Run(); err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err)
}
} else {
fmt.Println("uptop running in HEADLESS mode")
done := make(chan os.Signal, 1)
signal.Notify(done, os.Interrupt, syscall.SIGINT, syscall.SIGTERM)
<-done
fmt.Println("Shutting down...")
}
cancel()
shutdownCtx, shutdownCancel := context.WithTimeout(context.Background(), 30*time.Second)
defer shutdownCancel()
if httpSrv != nil {
if err := httpSrv.Shutdown(shutdownCtx); err != nil {
log.Printf("HTTP shutdown error: %v", err)
}
}
if sshSrv != nil {
if err := sshSrv.Shutdown(shutdownCtx); err != nil {
log.Printf("SSH shutdown error: %v", err)
}
}
}
func startSSHServer(port int, db store.Store, eng *monitor.Engine, kc *keyCache) *ssh.Server {
s, err := wish.NewServer(
wish.WithAddress(fmt.Sprintf(":%d", port)),
wish.WithHostKeyPath(envOrDefault("UPTOP_SSH_HOST_KEY", ".ssh/id_ed25519")),
wish.WithPublicKeyAuth(func(ctx ssh.Context, key ssh.PublicKey) bool {
return kc.IsAllowed(key)
}),
wish.WithMiddleware(
bm.Middleware(func(s ssh.Session) (tea.Model, []tea.ProgramOption) {
return tui.InitialModel(false, db, eng), []tea.ProgramOption{tea.WithAltScreen(), tea.WithMouseCellMotion()}
}),
),
)
if err != nil {
fmt.Fprintf(os.Stderr, "SSH server error: %v\n", err)
return nil
}
go func() {
if err := s.ListenAndServe(); err != nil && !errors.Is(err, ssh.ErrServerClosed) {
log.Printf("SSH server error: %v", err)
}
}()
return s
}
func seedDemoData(s store.Store) {
existing, _ := s.GetSites()
if len(existing) > 0 {
return
}
fmt.Println("Seeding demo data...")
if err := s.AddAlert("Discord Ops", "discord", map[string]string{"url": "https://discord.com/api/webhooks/demo/token"}); err != nil {
log.Printf("demo seed: add alert: %v", err)
return
}
if err := s.AddAlert("Slack Infra", "slack", map[string]string{"url": "https://hooks.slack.com/services/DEMO/WEBHOOK"}); err != nil {
log.Printf("demo seed: add alert: %v", err)
return
}
if err := s.AddAlert("Email Oncall", "email", map[string]string{
"host": "smtp.example.com", "port": "587",
"user": "oncall@example.com", "pass": "replace-me",
"from": "oncall@example.com", "to": "team@example.com",
}); err != nil {
log.Printf("demo seed: add alert: %v", err)
return
}
alerts, _ := s.GetAllAlerts()
alertID := 0
if len(alerts) > 0 {
alertID = alerts[0].ID
}
demoSites := []models.Site{
{Name: "Google", URL: "https://www.google.com", Type: "http", Interval: 30, AlertID: alertID, CheckSSL: true, ExpiryThreshold: 14, MaxRetries: 2},
{Name: "GitHub", URL: "https://github.com", Type: "http", Interval: 30, AlertID: alertID, CheckSSL: true, ExpiryThreshold: 7, MaxRetries: 3},
{Name: "Cloudflare DNS", URL: "https://1.1.1.1", Type: "http", Interval: 60, AlertID: alertID, ExpiryThreshold: 7, MaxRetries: 1},
{Name: "JSON Placeholder", URL: "https://jsonplaceholder.typicode.com/posts/1", Type: "http", Interval: 45, AlertID: alertID, ExpiryThreshold: 7, MaxRetries: 2},
{Name: "Nonexistent Site", URL: "https://this-domain-does-not-exist-12345.com", Type: "http", Interval: 30, AlertID: alertID, ExpiryThreshold: 7, MaxRetries: 3},
{Name: "Bad Port", URL: "https://localhost:19999", Type: "http", Interval: 30, ExpiryThreshold: 7, MaxRetries: 1},
{Name: "Backup Cron", Type: "push", Interval: 300, AlertID: alertID, ExpiryThreshold: 7},
{Name: "DB Healthcheck", Type: "push", Interval: 120, AlertID: alertID, ExpiryThreshold: 7},
{Name: "Gateway", Type: "ping", Interval: 30, AlertID: alertID, Hostname: "10.0.0.1", Timeout: 5, ExpiryThreshold: 7},
{Name: "SSH Server", Type: "port", Interval: 60, AlertID: alertID, Hostname: "10.0.0.1", Port: 22, Timeout: 5, ExpiryThreshold: 7},
}
for _, site := range demoSites {
if err := s.AddSite(site); err != nil {
log.Printf("demo seed: add site %q: %v", site.Name, err)
}
}
}
type keyCache struct {
mu sync.RWMutex
keys []ssh.PublicKey
updated time.Time
ttl time.Duration
db store.Store
}
func newKeyCache(db store.Store) *keyCache {
return &keyCache{db: db, ttl: 30 * time.Second}
}
func (c *keyCache) refresh() {
users, err := c.db.GetAllUsers()
if err != nil {
return
}
keys := make([]ssh.PublicKey, 0, len(users))
for _, u := range users {
k, _, _, _, err := ssh.ParseAuthorizedKey([]byte(u.PublicKey))
if err != nil {
continue
}
keys = append(keys, k)
}
c.mu.Lock()
c.keys = keys
c.updated = time.Now()
c.mu.Unlock()
}
func (c *keyCache) Invalidate() {
c.mu.Lock()
c.updated = time.Time{}
c.mu.Unlock()
}
func (c *keyCache) IsAllowed(incomingKey ssh.PublicKey) bool {
c.mu.RLock()
stale := time.Since(c.updated) > c.ttl
c.mu.RUnlock()
if stale {
c.refresh()
}
c.mu.RLock()
defer c.mu.RUnlock()
for _, k := range c.keys {
if ssh.KeysEqual(k, incomingKey) {
return true
}
}
return false
}
func seedKeysFromEnv(s store.Store) {
var keys []string
if v := os.Getenv("UPTOP_ADMIN_KEY"); v != "" {
keys = append(keys, strings.TrimSpace(v))
}
if path := os.Getenv("UPTOP_KEYS"); path != "" {
f, err := os.Open(filepath.Clean(path))
if err == nil {
scanner := bufio.NewScanner(f)
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
if line == "" || strings.HasPrefix(line, "#") {
continue
}
keys = append(keys, line)
}
_ = f.Close()
}
}
if len(keys) == 0 {
return
}
existing, err := s.GetAllUsers()
if err != nil {
fmt.Fprintf(os.Stderr, "warning: could not check existing users: %v\n", err)
return
}
existingKeys := make(map[string]bool)
for _, u := range existing {
existingKeys[u.PublicKey] = true
}
added := 0
for i, key := range keys {
if existingKeys[key] {
continue
}
username := usernameFromKey(key, i, len(existing)+added)
if err := s.AddUser(username, key, "admin"); err != nil {
fmt.Fprintf(os.Stderr, "warning: failed to seed user %q: %v\n", username, err)
continue
}
fmt.Printf("Seeded admin user %q from %s\n", username, seedSource(i, len(keys), os.Getenv("UPTOP_ADMIN_KEY") != ""))
added++
}
}
func usernameFromKey(key string, index, totalExisting int) string {
parts := strings.Fields(key)
if len(parts) >= 3 {
comment := parts[2]
if at := strings.Index(comment, "@"); at > 0 {
return comment[:at]
}
return comment
}
if index == 0 && totalExisting == 0 {
return "admin"
}
return fmt.Sprintf("user-%d", totalExisting+1)
}
func seedSource(index, total int, hasEnvKey bool) string {
if hasEnvKey && index == 0 {
return "UPTOP_ADMIN_KEY"
}
return "UPTOP_KEYS"
}
+21 -21
View File
@@ -4,21 +4,21 @@ services:
# -------------------------
leader:
build: .
container_name: upkeep-leader
container_name: uptop-leader
ports:
- "23234:23234" # SSH
- "8080:8080" # HTTP
environment:
- UPKEEP_DB_TYPE=postgres
- UPTOP_DB_TYPE=postgres
# Note: Port 5432 is correct here because we are talking INSIDE the network
- UPKEEP_DB_DSN=postgres://devuser:devpass@leader-db:5432/upkeep_dev?sslmode=disable
- UPKEEP_HTTP_PORT=8080
- UPKEEP_STATUS_ENABLED=true
- UPKEEP_STATUS_TITLE=Leader Node
- UPTOP_DB_DSN=postgres://devuser:devpass@leader-db:5432/uptop_dev?sslmode=disable
- UPTOP_HTTP_PORT=8080
- UPTOP_STATUS_ENABLED=true
- UPTOP_STATUS_TITLE=Leader Node
# Cluster Config
- UPKEEP_CLUSTER_MODE=leader
- UPKEEP_CLUSTER_SECRET=mysecret
- UPTOP_CLUSTER_MODE=leader
- UPTOP_CLUSTER_SECRET=mysecret
depends_on:
- leader-db
stdin_open: true
@@ -26,11 +26,11 @@ services:
leader-db:
image: postgres:15-alpine
container_name: upkeep-leader-db
container_name: uptop-leader-db
environment:
POSTGRES_USER: devuser
POSTGRES_PASSWORD: devpass
POSTGRES_DB: upkeep_dev
POSTGRES_DB: uptop_dev
volumes:
- ./tmp/leader-data:/var/lib/postgresql/data
@@ -39,23 +39,23 @@ services:
# -------------------------
follower:
build: .
container_name: upkeep-follower
container_name: uptop-follower
ports:
- "23233:23234" # SSH (Mapped to different host port)
- "8081:8080" # HTTP (Mapped to different host port)
environment:
- UPKEEP_DB_TYPE=postgres
- UPTOP_DB_TYPE=postgres
# Connects to its OWN database
- UPKEEP_DB_DSN=postgres://devuser:devpass@follower-db:5432/upkeep_dev?sslmode=disable
- UPKEEP_HTTP_PORT=8080
- UPKEEP_STATUS_ENABLED=true
- UPKEEP_STATUS_TITLE=Follower Node
- UPTOP_DB_DSN=postgres://devuser:devpass@follower-db:5432/uptop_dev?sslmode=disable
- UPTOP_HTTP_PORT=8080
- UPTOP_STATUS_ENABLED=true
- UPTOP_STATUS_TITLE=Follower Node
# Cluster Config
- UPKEEP_CLUSTER_MODE=follower
- UPKEEP_CLUSTER_SECRET=mysecret
- UPTOP_CLUSTER_MODE=follower
- UPTOP_CLUSTER_SECRET=mysecret
# IMPORTANT: Uses the Service Name "leader" to connect internally
- UPKEEP_PEER_URL=http://leader:8080
- UPTOP_PEER_URL=http://leader:8080
depends_on:
- follower-db
stdin_open: true
@@ -63,10 +63,10 @@ services:
follower-db:
image: postgres:15-alpine
container_name: upkeep-follower-db
container_name: uptop-follower-db
environment:
POSTGRES_USER: devuser
POSTGRES_PASSWORD: devpass
POSTGRES_DB: upkeep_dev
POSTGRES_DB: uptop_dev
volumes:
- ./tmp/follower-data:/var/lib/postgresql/data
+8 -8
View File
@@ -4,19 +4,19 @@ services:
build:
context: .
dockerfile: Dockerfile
container_name: upkeep-dev
container_name: uptop-dev
ports:
- "23234:23234" # SSH Access
- "8080:8080" # HTTP (Push Monitors + Status Page)
environment:
# --- Database Configuration (Postgres) ---
- UPKEEP_DB_TYPE=postgres
- UPKEEP_DB_DSN=postgres://devuser:devpass@postgres:5432/upkeep_dev?sslmode=disable
- UPTOP_DB_TYPE=postgres
- UPTOP_DB_DSN=postgres://devuser:devpass@postgres:5432/uptop_dev?sslmode=disable
# --- Web Server Configuration (Phase 4) ---
- UPKEEP_HTTP_PORT=8080
- UPKEEP_STATUS_ENABLED=true
- UPKEEP_STATUS_TITLE=Dev Infrastructure Status
- UPTOP_HTTP_PORT=8080
- UPTOP_STATUS_ENABLED=true
- UPTOP_STATUS_TITLE=Dev Infrastructure Status
depends_on:
- postgres
stdin_open: true # Required for 'docker attach' (Local Admin Console)
@@ -25,11 +25,11 @@ services:
# The Database
postgres:
image: postgres:15-alpine
container_name: upkeep-postgres
container_name: uptop-postgres
environment:
POSTGRES_USER: devuser
POSTGRES_PASSWORD: devpass
POSTGRES_DB: upkeep_dev
POSTGRES_DB: uptop_dev
ports:
- "5432:5432" # Expose for external DB tools (DBeaver, etc.)
volumes:
+16 -16
View File
@@ -2,10 +2,10 @@ services:
leader:
build: .
environment:
- UPKEEP_CLUSTER_MODE=leader
- UPKEEP_CLUSTER_SECRET=changeme
- UPKEEP_AGG_STRATEGY=any-down
- UPKEEP_STATUS_ENABLED=true
- UPTOP_CLUSTER_MODE=leader
- UPTOP_CLUSTER_SECRET=changeme
- UPTOP_AGG_STRATEGY=any-down
- UPTOP_STATUS_ENABLED=true
ports:
- "8080:8080"
- "23234:23234"
@@ -13,23 +13,23 @@ services:
probe-us-east:
build: .
environment:
- UPKEEP_CLUSTER_MODE=probe
- UPKEEP_NODE_ID=us-east-1
- UPKEEP_NODE_NAME=US East Probe
- UPKEEP_NODE_REGION=us-east
- UPKEEP_PEER_URL=http://leader:8080
- UPKEEP_CLUSTER_SECRET=changeme
- UPTOP_CLUSTER_MODE=probe
- UPTOP_NODE_ID=us-east-1
- UPTOP_NODE_NAME=US East Probe
- UPTOP_NODE_REGION=us-east
- UPTOP_PEER_URL=http://leader:8080
- UPTOP_CLUSTER_SECRET=changeme
depends_on:
- leader
probe-eu-west:
build: .
environment:
- UPKEEP_CLUSTER_MODE=probe
- UPKEEP_NODE_ID=eu-west-1
- UPKEEP_NODE_NAME=EU West Probe
- UPKEEP_NODE_REGION=eu-west
- UPKEEP_PEER_URL=http://leader:8080
- UPKEEP_CLUSTER_SECRET=changeme
- UPTOP_CLUSTER_MODE=probe
- UPTOP_NODE_ID=eu-west-1
- UPTOP_NODE_NAME=EU West Probe
- UPTOP_NODE_REGION=eu-west
- UPTOP_PEER_URL=http://leader:8080
- UPTOP_CLUSTER_SECRET=changeme
depends_on:
- leader
+20
View File
@@ -0,0 +1,20 @@
services:
app:
build:
context: .
dockerfile: Dockerfile
container_name: uptop
restart: unless-stopped
ports:
- "23234:23234"
- "8080:8080"
environment:
- UPTOP_DB_TYPE=sqlite
- UPTOP_DB_DSN=/data/uptop.db
- UPTOP_HTTP_PORT=8080
- UPTOP_STATUS_ENABLED=true
- UPTOP_STATUS_TITLE=System Status
# SSH access: add your public key via env var or authorized_keys file
# - UPTOP_ADMIN_KEY=ssh-ed25519 AAAA... you@host
volumes:
- ./data:/data
+13 -13
View File
@@ -7,13 +7,13 @@ Define your monitors and alerts in a YAML file. Version control them, copy them
Export what you already have:
```bash
goupkeep export -o monitors.yaml
uptop export -o monitors.yaml
```
That gives you a working file you can edit and re-apply:
```bash
goupkeep apply -f monitors.yaml
uptop apply -f monitors.yaml
```
That's it. Apply only creates or updates — it won't delete anything unless you tell it to.
@@ -184,34 +184,34 @@ All 9 providers work in the YAML. The `settings` map is different per type.
**Export current state:**
```bash
goupkeep export -o monitors.yaml # to a file
goupkeep export # to stdout
uptop export -o monitors.yaml # to a file
uptop export # to stdout
```
**Apply a config:**
```bash
goupkeep apply -f monitors.yaml
uptop apply -f monitors.yaml
```
**See what would change first:**
```bash
goupkeep apply -f monitors.yaml --dry-run
uptop apply -f monitors.yaml --dry-run
```
**Delete monitors not in the YAML:**
```bash
goupkeep apply -f monitors.yaml --prune
uptop apply -f monitors.yaml --prune
```
Without `--prune`, apply never deletes anything. It only creates and updates.
**Pointing at a different database:**
```bash
goupkeep export -db-type postgres -dsn "host=localhost dbname=upkeep sslmode=disable"
goupkeep apply -f monitors.yaml -db-type postgres -dsn "..."
uptop export -db-type postgres -dsn "host=localhost dbname=uptop sslmode=disable"
uptop apply -f monitors.yaml -db-type postgres -dsn "..."
```
Both commands respect the `UPKEEP_DB_TYPE` and `UPKEEP_DB_DSN` environment variables too.
Both commands respect the `UPTOP_DB_TYPE` and `UPTOP_DB_DSN` environment variables too.
## How apply works
@@ -230,15 +230,15 @@ If something fails mid-apply, just fix the issue and run it again. It picks up w
```bash
# set up your monitors in the TUI first, then export
goupkeep export -o monitors.yaml
uptop export -o monitors.yaml
# commit it
git add monitors.yaml && git commit -m "add monitor config"
# deploy to another instance
scp monitors.yaml prod-server:
ssh prod-server goupkeep apply -f monitors.yaml
ssh prod-server uptop apply -f monitors.yaml
# or just keep it as a backup you can restore from
goupkeep apply -f monitors.yaml
uptop apply -f monitors.yaml
```
+10 -10
View File
@@ -1,6 +1,6 @@
module go-upkeep
module gitea.lerkolabs.com/lerko/uptop
go 1.24.4
go 1.26.3
require (
github.com/charmbracelet/bubbles v0.21.1-0.20250623103423-23b8fd6302d7
@@ -16,6 +16,7 @@ require (
github.com/mattn/go-sqlite3 v1.14.33
github.com/miekg/dns v1.1.72
github.com/prometheus-community/pro-bing v0.8.0
gopkg.in/yaml.v3 v3.0.1
)
require (
@@ -49,13 +50,12 @@ require (
github.com/muesli/termenv v0.16.0 // indirect
github.com/rivo/uniseg v0.4.7 // indirect
github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e // indirect
golang.org/x/crypto v0.47.0 // indirect
golang.org/x/crypto v0.52.0 // indirect
golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56 // indirect
golang.org/x/mod v0.31.0 // indirect
golang.org/x/net v0.49.0 // indirect
golang.org/x/sync v0.19.0 // indirect
golang.org/x/sys v0.40.0 // indirect
golang.org/x/text v0.33.0 // indirect
golang.org/x/tools v0.40.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
golang.org/x/mod v0.35.0 // indirect
golang.org/x/net v0.54.0 // indirect
golang.org/x/sync v0.20.0 // indirect
golang.org/x/sys v0.45.0 // indirect
golang.org/x/text v0.37.0 // indirect
golang.org/x/tools v0.44.0 // indirect
)
+17 -16
View File
@@ -101,26 +101,27 @@ github.com/stretchr/testify v1.10.0 h1:Xv5erBjTwe/5IxqUQTdXv5kgmIvbHo3QQyRwhJsOf
github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e h1:JVG44RsyaB9T2KIHavMF/ppJZNG9ZpyihvCd0w101no=
github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e/go.mod h1:RbqR21r5mrJuqunuUZ/Dhy/avygyECGrLceyNeo4LiM=
golang.org/x/crypto v0.47.0 h1:V6e3FRj+n4dbpw86FJ8Fv7XVOql7TEwpHapKoMJ/GO8=
golang.org/x/crypto v0.47.0/go.mod h1:ff3Y9VzzKbwSSEzWqJsJVBnWmRwRSHt/6Op5n9bQc4A=
golang.org/x/crypto v0.52.0 h1:RMs7fP2rXdep0CftQlK8Uf+kibLm7qkCcradZWYz988=
golang.org/x/crypto v0.52.0/go.mod h1:1QgfPxDqh0T2M/elOJtp9RvuR95kVjir0e6/BvEmGbc=
golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56 h1:2dVuKD2vS7b0QIHQbpyTISPd0LeHDbnYEryqj5Q1ug8=
golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56/go.mod h1:M4RDyNAINzryxdtnbRXRL/OHtkFuWGRjvuhBJpk2IlY=
golang.org/x/mod v0.31.0 h1:HaW9xtz0+kOcWKwli0ZXy79Ix+UW/vOfmWI5QVd2tgI=
golang.org/x/mod v0.31.0/go.mod h1:43JraMp9cGx1Rx3AqioxrbrhNsLl2l/iNAvuBkrezpg=
golang.org/x/net v0.49.0 h1:eeHFmOGUTtaaPSGNmjBKpbng9MulQsJURQUAfUwY++o=
golang.org/x/net v0.49.0/go.mod h1:/ysNB2EvaqvesRkuLAyjI1ycPZlQHM3q01F02UY/MV8=
golang.org/x/sync v0.19.0 h1:vV+1eWNmZ5geRlYjzm2adRgW2/mcpevXNg50YZtPCE4=
golang.org/x/sync v0.19.0/go.mod h1:9KTHXmSnoGruLpwFjVSX0lNNA75CykiMECbovNTZqGI=
golang.org/x/mod v0.35.0 h1:Ww1D637e6Pg+Zb2KrWfHQUnH2dQRLBQyAtpr/haaJeM=
golang.org/x/mod v0.35.0/go.mod h1:+GwiRhIInF8wPm+4AoT6L0FA1QWAad3OMdTRx4tFYlU=
golang.org/x/net v0.54.0 h1:2zJIZAxAHV/OHCDTCOHAYehQzLfSXuf/5SoL/Dv6w/w=
golang.org/x/net v0.54.0/go.mod h1:Sj4oj8jK6XmHpBZU/zWHw3BV3abl4Kvi+Ut7cQcY+cQ=
golang.org/x/sync v0.20.0 h1:e0PTpb7pjO8GAtTs2dQ6jYa5BWYlMuX047Dco/pItO4=
golang.org/x/sync v0.20.0/go.mod h1:9xrNwdLfx4jkKbNva9FpL6vEN7evnE43NNNJQ2LF3+0=
golang.org/x/sys v0.0.0-20210809222454-d867a43fc93e/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.40.0 h1:DBZZqJ2Rkml6QMQsZywtnjnnGvHza6BTfYFWY9kjEWQ=
golang.org/x/sys v0.40.0/go.mod h1:OgkHotnGiDImocRcuBABYBEXf8A9a87e/uXjp9XT3ks=
golang.org/x/term v0.39.0 h1:RclSuaJf32jOqZz74CkPA9qFuVTX7vhLlpfj/IGWlqY=
golang.org/x/term v0.39.0/go.mod h1:yxzUCTP/U+FzoxfdKmLaA0RV1WgE0VY7hXBwKtY/4ww=
golang.org/x/text v0.33.0 h1:B3njUFyqtHDUI5jMn1YIr5B0IE2U0qck04r6d4KPAxE=
golang.org/x/text v0.33.0/go.mod h1:LuMebE6+rBincTi9+xWTY8TztLzKHc/9C1uBCG27+q8=
golang.org/x/tools v0.40.0 h1:yLkxfA+Qnul4cs9QA3KnlFu0lVmd8JJfoq+E41uSutA=
golang.org/x/tools v0.40.0/go.mod h1:Ik/tzLRlbscWpqqMRjyWYDisX8bG13FrdXp3o4Sr9lc=
golang.org/x/sys v0.45.0 h1:dO4czNzziLiiXplLQgBCEpCvXQ3dnkn0SdaZSYdQ+FY=
golang.org/x/sys v0.45.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw=
golang.org/x/term v0.43.0 h1:S4RLU2sB31O/NCl+zFN9Aru9A/Cq2aqKpTZJ6B+DwT4=
golang.org/x/term v0.43.0/go.mod h1:lrhlHNdQJHO+1qVYiHfFKVuVioJIheAc3fBSMFYEIsk=
golang.org/x/text v0.37.0 h1:Cqjiwd9eSg8e0QAkyCaQTNHFIIzWtidPahFWR83rTrc=
golang.org/x/text v0.37.0/go.mod h1:a5sjxXGs9hsn/AJVwuElvCAo9v8QYLzvavO5z2PiM38=
golang.org/x/tools v0.44.0 h1:UP4ajHPIcuMjT1GqzDWRlalUEoY+uzoZKnhOjbIPD2c=
golang.org/x/tools v0.44.0/go.mod h1:KA0AfVErSdxRZIsOVipbv3rQhVXTnlU6UhKxHd1seDI=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
+43 -13
View File
@@ -2,20 +2,22 @@ package alert
import (
"bytes"
"context"
"encoding/json"
"fmt"
"go-upkeep/internal/models"
"net/http"
"net/smtp"
"strconv"
"strings"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
)
var alertClient = &http.Client{Timeout: 10 * time.Second}
type Provider interface {
Send(title, message string) error
Send(ctx context.Context, title, message string) error
}
type PayloadFunc func(title, message string) ([]byte, error)
@@ -23,14 +25,23 @@ type PayloadFunc func(title, message string) ([]byte, error)
type HTTPProvider struct {
URL string
Payload PayloadFunc
Headers map[string]string
}
func (h *HTTPProvider) Send(title, message string) error {
func (h *HTTPProvider) Send(ctx context.Context, title, message string) error {
body, err := h.Payload(title, message)
if err != nil {
return err
}
resp, err := alertClient.Post(h.URL, "application/json", bytes.NewBuffer(body))
req, err := http.NewRequestWithContext(ctx, "POST", h.URL, bytes.NewBuffer(body))
if err != nil {
return err
}
req.Header.Set("Content-Type", "application/json")
for k, v := range h.Headers {
req.Header.Set(k, v)
}
resp, err := alertClient.Do(req)
if err != nil {
return err
}
@@ -70,7 +81,7 @@ func pagerdutyPayload(routingKey, severity string) PayloadFunc {
"event_action": "trigger",
"payload": map[string]string{
"summary": fmt.Sprintf("%s: %s", title, message),
"source": "go-upkeep",
"source": "uptop",
"severity": severity,
},
})
@@ -158,8 +169,9 @@ func GetProvider(cfg models.AlertConfig) Provider {
}
serverURL := strings.TrimRight(cfg.Settings["url"], "/")
return &HTTPProvider{
URL: fmt.Sprintf("%s/message?token=%s", serverURL, cfg.Settings["token"]),
URL: serverURL + "/message",
Payload: gotifyPayload(priority),
Headers: map[string]string{"X-Gotify-Key": cfg.Settings["token"]},
}
default:
return nil
@@ -170,13 +182,31 @@ type EmailProvider struct {
Host, Port, User, Pass, To, From string
}
func (e *EmailProvider) Send(title, message string) error {
func sanitizeHeader(s string) string {
s = strings.ReplaceAll(s, "\r", "")
s = strings.ReplaceAll(s, "\n", "")
return s
}
func (e *EmailProvider) Send(ctx context.Context, title, message string) error {
select {
case <-ctx.Done():
return ctx.Err()
default:
}
auth := smtp.PlainAuth("", e.User, e.Pass, e.Host)
msg := []byte("To: " + e.To + "\r\n" +
"Subject: Go-Upkeep: " + title + "\r\n" +
to := sanitizeHeader(e.To)
from := sanitizeHeader(e.From)
subject := sanitizeHeader(title)
body := strings.ReplaceAll(message, "\r", "")
msg := []byte("From: " + from + "\r\n" +
"To: " + to + "\r\n" +
"Subject: uptop: " + subject + "\r\n" +
"MIME-Version: 1.0\r\n" +
"Content-Type: text/plain; charset=utf-8\r\n" +
"\r\n" +
message + "\r\n")
return smtp.SendMail(e.Host+":"+e.Port, auth, e.From, []string{e.To}, msg)
body + "\r\n")
return smtp.SendMail(e.Host+":"+e.Port, auth, from, []string{to}, msg)
}
type NtfyProvider struct {
@@ -187,9 +217,9 @@ type NtfyProvider struct {
Password string
}
func (n *NtfyProvider) Send(title, message string) error {
func (n *NtfyProvider) Send(ctx context.Context, title, message string) error {
url := strings.TrimRight(n.ServerURL, "/") + "/" + n.Topic
req, err := http.NewRequest("POST", url, strings.NewReader(message))
req, err := http.NewRequestWithContext(ctx, "POST", url, strings.NewReader(message))
if err != nil {
return err
}
+29 -10
View File
@@ -1,11 +1,13 @@
package alert
import (
"context"
"encoding/json"
"go-upkeep/internal/models"
"net/http"
"net/http/httptest"
"testing"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
)
func TestHTTPProviderDiscord(t *testing.T) {
@@ -17,7 +19,7 @@ func TestHTTPProviderDiscord(t *testing.T) {
defer srv.Close()
p := GetProvider(models.AlertConfig{Type: "discord", Settings: map[string]string{"url": srv.URL}})
if err := p.Send("Test Title", "Test Body"); err != nil {
if err := p.Send(context.Background(), "Test Title", "Test Body"); err != nil {
t.Fatalf("Send: %v", err)
}
@@ -35,7 +37,7 @@ func TestHTTPProviderSlack(t *testing.T) {
defer srv.Close()
p := GetProvider(models.AlertConfig{Type: "slack", Settings: map[string]string{"url": srv.URL}})
if err := p.Send("Alert", "Message"); err != nil {
if err := p.Send(context.Background(), "Alert", "Message"); err != nil {
t.Fatalf("Send: %v", err)
}
@@ -53,7 +55,7 @@ func TestHTTPProviderWebhook(t *testing.T) {
defer srv.Close()
p := GetProvider(models.AlertConfig{Type: "webhook", Settings: map[string]string{"url": srv.URL}})
if err := p.Send("Title", "Body"); err != nil {
if err := p.Send(context.Background(), "Title", "Body"); err != nil {
t.Fatalf("Send: %v", err)
}
@@ -69,7 +71,7 @@ func TestHTTPProviderErrorOnHTTP4xx(t *testing.T) {
defer srv.Close()
p := GetProvider(models.AlertConfig{Type: "discord", Settings: map[string]string{"url": srv.URL}})
if err := p.Send("Test", "Test"); err == nil {
if err := p.Send(context.Background(), "Test", "Test"); err == nil {
t.Fatal("expected error on 403 response")
}
}
@@ -89,7 +91,7 @@ func TestNtfyProvider(t *testing.T) {
"url": srv.URL,
"topic": "test",
}})
if err := p.Send("Alert Title", "Alert Body"); err != nil {
if err := p.Send(context.Background(), "Alert Title", "Alert Body"); err != nil {
t.Fatalf("Send: %v", err)
}
@@ -110,7 +112,7 @@ func TestHTTPProviderTelegram(t *testing.T) {
defer srv.Close()
p := &HTTPProvider{URL: srv.URL, Payload: telegramPayload("12345")}
if err := p.Send("Alert", "Down"); err != nil {
if err := p.Send(context.Background(), "Alert", "Down"); err != nil {
t.Fatalf("Send: %v", err)
}
if received["chat_id"] != "12345" {
@@ -133,7 +135,7 @@ func TestHTTPProviderPagerDuty(t *testing.T) {
defer srv.Close()
p := &HTTPProvider{URL: srv.URL, Payload: pagerdutyPayload("test-key", "critical")}
if err := p.Send("Alert", "Down"); err != nil {
if err := p.Send(context.Background(), "Alert", "Down"); err != nil {
t.Fatalf("Send: %v", err)
}
if received["routing_key"] != "test-key" {
@@ -160,7 +162,7 @@ func TestHTTPProviderPushover(t *testing.T) {
defer srv.Close()
p := &HTTPProvider{URL: srv.URL, Payload: pushoverPayload("app-tok", "user-key")}
if err := p.Send("Alert", "Down"); err != nil {
if err := p.Send(context.Background(), "Alert", "Down"); err != nil {
t.Fatalf("Send: %v", err)
}
if received["token"] != "app-tok" {
@@ -183,7 +185,7 @@ func TestHTTPProviderGotify(t *testing.T) {
defer srv.Close()
p := &HTTPProvider{URL: srv.URL, Payload: gotifyPayload("8")}
if err := p.Send("Alert", "Down"); err != nil {
if err := p.Send(context.Background(), "Alert", "Down"); err != nil {
t.Fatalf("Send: %v", err)
}
if received["title"] != "Alert" || received["message"] != "Down" {
@@ -211,3 +213,20 @@ func TestGetProviderUnknown(t *testing.T) {
t.Error("expected nil for unknown provider type")
}
}
func TestSanitizeHeader(t *testing.T) {
tests := []struct {
input, want string
}{
{"normal subject", "normal subject"},
{"inject\r\nBcc: evil@bad.com", "injectBcc: evil@bad.com"},
{"has\nnewline", "hasnewline"},
{"has\rcarriage", "hascarriage"},
}
for _, tt := range tests {
got := sanitizeHeader(tt.input)
if got != tt.want {
t.Errorf("sanitizeHeader(%q) = %q, want %q", tt.input, got, tt.want)
}
}
}
+5 -4
View File
@@ -3,10 +3,11 @@ package cluster
import (
"context"
"fmt"
"go-upkeep/internal/monitor"
"net/http"
"strings"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
)
type Config struct {
@@ -57,9 +58,9 @@ func runFollowerLoop(ctx context.Context, cfg Config, eng *monitor.Engine) {
resp, err := client.Do(req)
isLeaderHealthy := false
if err == nil && resp.StatusCode == 200 {
isLeaderHealthy = true
resp.Body.Close()
if err == nil {
isLeaderHealthy = resp.StatusCode == 200
_ = resp.Body.Close()
}
if isLeaderHealthy {
+397
View File
@@ -0,0 +1,397 @@
package cluster
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"sync"
"sync/atomic"
"testing"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
)
// --- Mock Store (minimal, for monitor.NewEngine) ---
type mockStore struct {
sites []models.Site
}
func (m *mockStore) Init() error { return nil }
func (m *mockStore) GetSites() ([]models.Site, error) { return m.sites, nil }
func (m *mockStore) AddSite(models.Site) error { return nil }
func (m *mockStore) UpdateSite(models.Site) error { return nil }
func (m *mockStore) UpdateSitePaused(int, bool) error { return nil }
func (m *mockStore) DeleteSite(int) error { return nil }
func (m *mockStore) GetAllAlerts() ([]models.AlertConfig, error) { return nil, nil }
func (m *mockStore) GetAlert(int) (models.AlertConfig, error) { return models.AlertConfig{}, nil }
func (m *mockStore) AddAlert(string, string, map[string]string) error { return nil }
func (m *mockStore) UpdateAlert(int, string, string, map[string]string) error { return nil }
func (m *mockStore) DeleteAlert(int) error { return nil }
func (m *mockStore) GetAllUsers() ([]models.User, error) { return nil, nil }
func (m *mockStore) AddUser(string, string, string) error { return nil }
func (m *mockStore) UpdateUser(int, string, string, string) error { return nil }
func (m *mockStore) DeleteUser(int) error { return nil }
func (m *mockStore) SaveCheck(int, int64, bool) error { return nil }
func (m *mockStore) SaveCheckFromNode(int, string, int64, bool) error { return nil }
func (m *mockStore) LoadAllHistory(int) (map[int][]models.CheckRecord, error) { return nil, nil }
func (m *mockStore) ExportData() (models.Backup, error) { return models.Backup{}, nil }
func (m *mockStore) ImportData(models.Backup) error { return nil }
func (m *mockStore) GetSiteByName(string) (models.Site, error) { return models.Site{}, nil }
func (m *mockStore) GetAlertByName(string) (models.AlertConfig, error) {
return models.AlertConfig{}, nil
}
func (m *mockStore) AddSiteReturningID(models.Site) (int, error) { return 0, nil }
func (m *mockStore) AddAlertReturningID(string, string, map[string]string) (int, error) {
return 0, nil
}
func (m *mockStore) RegisterNode(models.ProbeNode) error { return nil }
func (m *mockStore) GetNode(string) (models.ProbeNode, error) { return models.ProbeNode{}, nil }
func (m *mockStore) GetAllNodes() ([]models.ProbeNode, error) { return nil, nil }
func (m *mockStore) UpdateNodeLastSeen(string) error { return nil }
func (m *mockStore) DeleteNode(string) error { return nil }
func (m *mockStore) SaveLog(string) error { return nil }
func (m *mockStore) LoadLogs(int) ([]string, error) { return nil, nil }
func (m *mockStore) GetActiveMaintenanceWindows() ([]models.MaintenanceWindow, error) {
return nil, nil
}
func (m *mockStore) GetAllMaintenanceWindows(int) ([]models.MaintenanceWindow, error) {
return nil, nil
}
func (m *mockStore) AddMaintenanceWindow(models.MaintenanceWindow) error { return nil }
func (m *mockStore) EndMaintenanceWindow(int) error { return nil }
func (m *mockStore) DeleteMaintenanceWindow(int) error { return nil }
func (m *mockStore) IsMonitorInMaintenance(int) (bool, error) { return false, nil }
func (m *mockStore) GetPreference(string) (string, error) { return "", nil }
func (m *mockStore) SetPreference(string, string) error { return nil }
func (m *mockStore) SaveStateChange(int, string, string, string) error { return nil }
func (m *mockStore) GetStateChanges(int, int) ([]models.StateChange, error) { return nil, nil }
func (m *mockStore) Close() error { return nil }
// --- Cluster Start Tests ---
func TestStart_LeaderMode(t *testing.T) {
eng := monitor.NewEngine(&mockStore{})
eng.SetActive(false)
ctx := context.Background()
Start(ctx, Config{Mode: "leader"}, eng)
if !eng.IsActive() {
t.Error("leader mode should set engine active")
}
}
func TestStart_FollowerMode(t *testing.T) {
eng := monitor.NewEngine(&mockStore{})
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
Start(ctx, Config{Mode: "follower", PeerURL: "http://localhost:9999"}, eng)
time.Sleep(50 * time.Millisecond)
if eng.IsActive() {
t.Error("follower mode should set engine inactive")
}
}
// --- Follower Loop Tests ---
func TestFollowerLoop_FailoverOnLeaderDown(t *testing.T) {
eng := monitor.NewEngine(&mockStore{})
eng.SetActive(false)
// Server always returns 503
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(503)
}))
defer srv.Close()
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
go runFollowerLoop(ctx, Config{PeerURL: srv.URL, SharedKey: "key"}, eng)
// Follower checks every 5s, needs 3 failures → ~15s minimum
// But we can't wait that long in a test. The loop sleeps 5s between checks.
// We'll wait up to 20s for failover.
deadline := time.After(20 * time.Second)
for {
if eng.IsActive() {
return // success
}
select {
case <-deadline:
t.Fatal("expected failover to ACTIVE after 3 failures")
case <-time.After(500 * time.Millisecond):
}
}
}
func TestFollowerLoop_RecoveryOnLeaderReturn(t *testing.T) {
eng := monitor.NewEngine(&mockStore{})
eng.SetActive(true) // simulate already failed over
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
w.Write([]byte("OK"))
}))
defer srv.Close()
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
go runFollowerLoop(ctx, Config{PeerURL: srv.URL}, eng)
deadline := time.After(10 * time.Second)
for {
if !eng.IsActive() {
return // success — switched back to passive
}
select {
case <-deadline:
t.Fatal("expected switch back to PASSIVE when leader returns")
case <-time.After(500 * time.Millisecond):
}
}
}
func TestFollowerLoop_SendsSecret(t *testing.T) {
var mu sync.Mutex
var receivedSecret string
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
mu.Lock()
receivedSecret = r.Header.Get("X-Upkeep-Secret")
mu.Unlock()
w.WriteHeader(200)
w.Write([]byte("OK"))
}))
defer srv.Close()
eng := monitor.NewEngine(&mockStore{})
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
go runFollowerLoop(ctx, Config{PeerURL: srv.URL, SharedKey: "test-secret"}, eng)
deadline := time.After(10 * time.Second)
for {
mu.Lock()
got := receivedSecret
mu.Unlock()
if got == "test-secret" {
return
}
select {
case <-deadline:
t.Fatalf("expected secret 'test-secret', got %q", got)
case <-time.After(500 * time.Millisecond):
}
}
}
func TestFollowerLoop_CancelContext(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
}))
defer srv.Close()
eng := monitor.NewEngine(&mockStore{})
ctx, cancel := context.WithCancel(context.Background())
done := make(chan struct{})
go func() {
runFollowerLoop(ctx, Config{PeerURL: srv.URL}, eng)
close(done)
}()
cancel()
select {
case <-done:
case <-time.After(3 * time.Second):
t.Fatal("expected follower loop to exit on context cancel")
}
}
// --- Probe Tests ---
func TestProbeRegister_Success(t *testing.T) {
var received map[string]string
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
json.NewDecoder(r.Body).Decode(&received)
w.WriteHeader(200)
}))
defer srv.Close()
err := probeRegister(context.Background(), srv.Client(), ProbeConfig{
NodeID: "n1", NodeName: "US East", Region: "us-east", LeaderURL: srv.URL, SharedKey: "key",
})
if err != nil {
t.Fatalf("register: %v", err)
}
if received["id"] != "n1" {
t.Errorf("expected id n1, got %s", received["id"])
}
}
func TestProbeRegister_Failure(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(401)
}))
defer srv.Close()
err := probeRegister(context.Background(), srv.Client(), ProbeConfig{
LeaderURL: srv.URL,
})
if err == nil {
t.Error("expected error on 401")
}
}
func TestProbeFetchAssignments_Success(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
json.NewEncoder(w).Encode(map[string][]models.Site{
"sites": {{ID: 1, Name: "s1", Type: "http", URL: "http://example.com"}},
})
}))
defer srv.Close()
sites, err := probeFetchAssignments(context.Background(), srv.Client(), ProbeConfig{
NodeID: "n1", LeaderURL: srv.URL, SharedKey: "key",
})
if err != nil {
t.Fatalf("fetch: %v", err)
}
if len(sites) != 1 {
t.Errorf("expected 1 site, got %d", len(sites))
}
}
func TestProbeFetchAssignments_Unauthorized(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(401)
}))
defer srv.Close()
_, err := probeFetchAssignments(context.Background(), srv.Client(), ProbeConfig{
LeaderURL: srv.URL,
})
if err == nil {
t.Error("expected error on 401")
}
}
func TestProbeExecuteChecks(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
}))
defer srv.Close()
sites := []models.Site{
{ID: 1, Type: "http", URL: srv.URL},
{ID: 2, Type: "http", URL: srv.URL},
}
strict := &http.Client{}
insecure := &http.Client{}
results := probeExecuteChecks(context.Background(), sites, strict, insecure, true)
if len(results) != 2 {
t.Fatalf("expected 2 results, got %d", len(results))
}
for _, r := range results {
if !r.IsUp {
t.Errorf("site %d expected UP", r.SiteID)
}
}
}
func TestProbeExecuteChecks_Concurrency(t *testing.T) {
var concurrent int64
var maxConcurrent int64
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
cur := atomic.AddInt64(&concurrent, 1)
for {
old := atomic.LoadInt64(&maxConcurrent)
if cur <= old || atomic.CompareAndSwapInt64(&maxConcurrent, old, cur) {
break
}
}
time.Sleep(50 * time.Millisecond)
atomic.AddInt64(&concurrent, -1)
w.WriteHeader(200)
}))
defer srv.Close()
var sites []models.Site
for i := 0; i < 20; i++ {
sites = append(sites, models.Site{ID: i + 1, Type: "http", URL: srv.URL})
}
results := probeExecuteChecks(context.Background(), sites, &http.Client{}, &http.Client{}, true)
if len(results) != 20 {
t.Errorf("expected 20 results, got %d", len(results))
}
mc := atomic.LoadInt64(&maxConcurrent)
if mc > 10 {
t.Errorf("expected max 10 concurrent, got %d", mc)
}
}
func TestProbeReportResults_Success(t *testing.T) {
var received struct {
NodeID string `json:"node_id"`
Results []probeResultItem `json:"results"`
}
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
json.NewDecoder(r.Body).Decode(&received)
w.WriteHeader(200)
}))
defer srv.Close()
err := probeReportResults(context.Background(), srv.Client(), ProbeConfig{
NodeID: "n1", LeaderURL: srv.URL, SharedKey: "key",
}, []probeResultItem{{SiteID: 1, LatencyNs: 5000000, IsUp: true}})
if err != nil {
t.Fatalf("report: %v", err)
}
if received.NodeID != "n1" {
t.Errorf("expected n1, got %s", received.NodeID)
}
if len(received.Results) != 1 {
t.Errorf("expected 1 result, got %d", len(received.Results))
}
}
func TestProbeReportResults_Failure(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(500)
}))
defer srv.Close()
err := probeReportResults(context.Background(), srv.Client(), ProbeConfig{
LeaderURL: srv.URL,
}, []probeResultItem{{SiteID: 1}})
if err == nil {
t.Error("expected error on 500")
}
}
// --- sleepCtx ---
func TestSleepCtx_Cancel(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
cancel()
start := time.Now()
sleepCtx(ctx, 10*time.Second)
if time.Since(start) > time.Second {
t.Error("expected immediate return on canceled context")
}
}
+25 -11
View File
@@ -6,12 +6,14 @@ import (
"crypto/tls"
"encoding/json"
"fmt"
"go-upkeep/internal/models"
"go-upkeep/internal/monitor"
"log"
"net/http"
"net/url"
"sync"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
)
type ProbeConfig struct {
@@ -21,6 +23,7 @@ type ProbeConfig struct {
LeaderURL string
SharedKey string
Interval int
AllowPrivateTargets bool
}
func RunProbe(ctx context.Context, cfg ProbeConfig) error {
@@ -29,11 +32,18 @@ func RunProbe(ctx context.Context, cfg ProbeConfig) error {
}
apiClient := &http.Client{Timeout: 10 * time.Second}
dial := monitor.SafeDialContext(cfg.AllowPrivateTargets)
strictClient := &http.Client{
Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: false}},
Transport: &http.Transport{
TLSClientConfig: &tls.Config{InsecureSkipVerify: false},
DialContext: dial,
},
}
insecureClient := &http.Client{
Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: true}},
Transport: &http.Transport{
TLSClientConfig: &tls.Config{InsecureSkipVerify: true}, //nolint:gosec // intentional for IgnoreTLS sites
DialContext: dial,
},
}
if err := probeRegister(ctx, apiClient, cfg); err != nil {
@@ -59,7 +69,7 @@ func RunProbe(ctx context.Context, cfg ProbeConfig) error {
continue
}
results := probeExecuteChecks(ctx, sites, strictClient, insecureClient)
results := probeExecuteChecks(ctx, sites, strictClient, insecureClient, cfg.AllowPrivateTargets)
if len(results) > 0 {
if err := probeReportResults(ctx, apiClient, cfg, results); err != nil {
@@ -85,7 +95,7 @@ func probeRegister(ctx context.Context, client *http.Client, cfg ProbeConfig) er
if err != nil {
return err
}
resp.Body.Close()
_ = resp.Body.Close()
if resp.StatusCode != 200 {
return fmt.Errorf("register returned %d", resp.StatusCode)
}
@@ -93,7 +103,8 @@ func probeRegister(ctx context.Context, client *http.Client, cfg ProbeConfig) er
}
func probeFetchAssignments(ctx context.Context, client *http.Client, cfg ProbeConfig) ([]models.Site, error) {
req, err := http.NewRequestWithContext(ctx, "GET", cfg.LeaderURL+"/api/probe/assignments?node_id="+cfg.NodeID, nil)
assignURL := cfg.LeaderURL + "/api/probe/assignments?" + url.Values{"node_id": {cfg.NodeID}}.Encode()
req, err := http.NewRequestWithContext(ctx, "GET", assignURL, nil)
if err != nil {
return nil, err
}
@@ -119,18 +130,20 @@ type probeResultItem struct {
SiteID int `json:"site_id"`
LatencyNs int64 `json:"latency_ns"`
IsUp bool `json:"is_up"`
ErrorReason string `json:"error_reason,omitempty"`
}
func probeExecuteChecks(ctx context.Context, sites []models.Site, strict, insecure *http.Client) []probeResultItem {
func probeExecuteChecks(ctx context.Context, sites []models.Site, strict, insecure *http.Client, allowPrivate bool) []probeResultItem {
var mu sync.Mutex
var results []probeResultItem
sem := make(chan struct{}, 10)
var wg sync.WaitGroup
loop:
for _, site := range sites {
select {
case <-ctx.Done():
break
break loop
default:
}
wg.Add(1)
@@ -139,12 +152,13 @@ func probeExecuteChecks(ctx context.Context, sites []models.Site, strict, insecu
defer wg.Done()
defer func() { <-sem }()
cr := monitor.RunCheck(s, strict, insecure, false)
cr := monitor.RunCheck(s, strict, insecure, false, allowPrivate)
mu.Lock()
results = append(results, probeResultItem{
SiteID: s.ID,
LatencyNs: cr.LatencyNs,
IsUp: cr.Status == "UP",
ErrorReason: cr.ErrorReason,
})
mu.Unlock()
}(site)
@@ -171,7 +185,7 @@ func probeReportResults(ctx context.Context, client *http.Client, cfg ProbeConfi
if err != nil {
return err
}
resp.Body.Close()
_ = resp.Body.Close()
if resp.StatusCode != 200 {
return fmt.Errorf("results returned %d", resp.StatusCode)
}
+2 -2
View File
@@ -2,8 +2,8 @@ package config
import (
"fmt"
"go-upkeep/internal/models"
"go-upkeep/internal/store"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/store"
"reflect"
"strings"
)
+2 -2
View File
@@ -1,8 +1,8 @@
package config
import (
"go-upkeep/internal/models"
"go-upkeep/internal/store"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/store"
"strings"
"testing"
)
+4 -3
View File
@@ -2,11 +2,12 @@ package config
import (
"fmt"
"go-upkeep/internal/models"
"go-upkeep/internal/store"
"os"
"sort"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/store"
"gopkg.in/yaml.v3"
)
@@ -142,7 +143,7 @@ func WriteFile(f *File, path string) error {
_, err = os.Stdout.Write(data)
return err
}
return os.WriteFile(path, data, 0644)
return os.WriteFile(path, data, 0600)
}
func LoadFile(path string) (*File, error) {
+1 -1
View File
@@ -1,7 +1,7 @@
package config
import (
"go-upkeep/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"testing"
)
+5 -3
View File
@@ -3,7 +3,7 @@ package importer
import (
"encoding/json"
"fmt"
"go-upkeep/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"os"
"strings"
)
@@ -96,7 +96,9 @@ func convertKumaNotifications(entries []KumaNotifEntry) map[int]models.AlertConf
result := make(map[int]models.AlertConfig)
for _, entry := range entries {
var cfg KumaNotifConfig
json.Unmarshal([]byte(entry.Config), &cfg)
if err := json.Unmarshal([]byte(entry.Config), &cfg); err != nil {
continue
}
alert := models.AlertConfig{
ID: entry.ID,
@@ -175,7 +177,7 @@ func convertKumaMonitor(m KumaMonitor, alertMap map[int]int) models.Site {
for nidStr := range m.NotificationIDs {
var nid int
fmt.Sscanf(nidStr, "%d", &nid)
_, _ = fmt.Sscanf(nidStr, "%d", &nid) //nolint:errcheck
if upkeepID, ok := alertMap[nid]; ok {
site.AlertID = upkeepID
break
+30 -21
View File
@@ -2,8 +2,8 @@ package metrics
import (
"fmt"
"go-upkeep/internal/models"
"go-upkeep/internal/monitor"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
"net/http"
"sort"
"strings"
@@ -16,65 +16,74 @@ func Handler(eng *monitor.Engine) http.HandlerFunc {
var b strings.Builder
writeHelp(&b, "upkeep_monitor_up", "gauge", "Whether the monitor is up (1) or down (0).")
writeHelp(&b, "uptop_monitor_up", "gauge", "Whether the monitor is up (1) or down (0).")
for _, s := range sites {
val := 0
if s.Status == "UP" {
val = 1
}
writeGauge(&b, "upkeep_monitor_up", labels(s), float64(val))
writeGauge(&b, "uptop_monitor_up", labels(s), float64(val))
}
writeHelp(&b, "upkeep_monitor_latency_seconds", "gauge", "Last check latency in seconds.")
writeHelp(&b, "uptop_monitor_latency_seconds", "gauge", "Last check latency in seconds.")
for _, s := range sites {
writeGauge(&b, "upkeep_monitor_latency_seconds", labels(s), s.Latency.Seconds())
writeGauge(&b, "uptop_monitor_latency_seconds", labels(s), s.Latency.Seconds())
}
writeHelp(&b, "upkeep_monitor_status_code", "gauge", "HTTP response status code of the last check.")
writeHelp(&b, "uptop_monitor_status_code", "gauge", "HTTP response status code of the last check.")
for _, s := range sites {
if s.Type != "http" {
continue
}
writeGauge(&b, "upkeep_monitor_status_code", labels(s), float64(s.StatusCode))
writeGauge(&b, "uptop_monitor_status_code", labels(s), float64(s.StatusCode))
}
writeHelp(&b, "upkeep_monitor_check_timestamp_seconds", "gauge", "Unix timestamp of the last check.")
writeHelp(&b, "uptop_monitor_check_timestamp_seconds", "gauge", "Unix timestamp of the last check.")
for _, s := range sites {
if s.LastCheck.IsZero() {
continue
}
writeGauge(&b, "upkeep_monitor_check_timestamp_seconds", labels(s), float64(s.LastCheck.Unix()))
writeGauge(&b, "uptop_monitor_check_timestamp_seconds", labels(s), float64(s.LastCheck.Unix()))
}
writeHelp(&b, "upkeep_monitor_paused", "gauge", "Whether the monitor is paused (1) or active (0).")
writeHelp(&b, "uptop_monitor_paused", "gauge", "Whether the monitor is paused (1) or active (0).")
for _, s := range sites {
val := 0
if s.Paused {
val = 1
}
writeGauge(&b, "upkeep_monitor_paused", labels(s), float64(val))
writeGauge(&b, "uptop_monitor_paused", labels(s), float64(val))
}
writeHelp(&b, "upkeep_monitor_cert_expiry_timestamp_seconds", "gauge", "Unix timestamp when the SSL certificate expires.")
writeHelp(&b, "uptop_monitor_maintenance", "gauge", "Whether the monitor is in a maintenance window (1) or not (0).")
for _, s := range sites {
val := 0
if eng.GetDisplayStatus(s) == "MAINT" {
val = 1
}
writeGauge(&b, "uptop_monitor_maintenance", labels(s), float64(val))
}
writeHelp(&b, "uptop_monitor_cert_expiry_timestamp_seconds", "gauge", "Unix timestamp when the SSL certificate expires.")
for _, s := range sites {
if !s.HasSSL || s.CertExpiry.IsZero() {
continue
}
writeGauge(&b, "upkeep_monitor_cert_expiry_timestamp_seconds", labels(s), float64(s.CertExpiry.Unix()))
writeGauge(&b, "uptop_monitor_cert_expiry_timestamp_seconds", labels(s), float64(s.CertExpiry.Unix()))
}
writeHelp(&b, "upkeep_monitor_checks_total", "counter", "Total number of checks performed.")
writeHelp(&b, "upkeep_monitor_checks_up_total", "counter", "Total number of successful checks.")
writeHelp(&b, "uptop_monitor_checks_total", "counter", "Total number of checks performed.")
writeHelp(&b, "uptop_monitor_checks_up_total", "counter", "Total number of successful checks.")
for _, s := range sites {
h, ok := eng.GetHistory(s.ID)
if !ok {
continue
}
writeGauge(&b, "upkeep_monitor_checks_total", labels(s), float64(h.TotalChecks))
writeGauge(&b, "upkeep_monitor_checks_up_total", labels(s), float64(h.UpChecks))
writeGauge(&b, "uptop_monitor_checks_total", labels(s), float64(h.TotalChecks))
writeGauge(&b, "uptop_monitor_checks_up_total", labels(s), float64(h.UpChecks))
}
writeHelp(&b, "upkeep_probe_up", "gauge", "Whether a probe node is online (1) or offline (0) based on last-seen time.")
writeHelp(&b, "uptop_probe_up", "gauge", "Whether a probe node is online (1) or offline (0) based on last-seen time.")
for _, site := range sites {
probeResults := eng.GetProbeResults(site.ID)
for nodeID, result := range probeResults {
@@ -83,12 +92,12 @@ func Handler(eng *monitor.Engine) http.HandlerFunc {
val = 1
}
nodeLabels := fmt.Sprintf(`id="%d",name="%s",node="%s"`, site.ID, escapeLabelValue(site.Name), escapeLabelValue(nodeID))
writeGauge(&b, "upkeep_probe_up", nodeLabels, float64(val))
writeGauge(&b, "uptop_probe_up", nodeLabels, float64(val))
}
}
w.Header().Set("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
w.Write([]byte(b.String()))
_, _ = w.Write([]byte(b.String())) //nolint:errcheck
}
}
+25 -9
View File
@@ -2,13 +2,14 @@ package metrics
import (
"context"
"go-upkeep/internal/models"
"go-upkeep/internal/monitor"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
)
type mockStore struct {
@@ -52,6 +53,21 @@ func (m *mockStore) UpdateNodeLastSeen(string) error { return n
func (m *mockStore) DeleteNode(string) error { return nil }
func (m *mockStore) SaveLog(string) error { return nil }
func (m *mockStore) LoadLogs(int) ([]string, error) { return nil, nil }
func (m *mockStore) GetActiveMaintenanceWindows() ([]models.MaintenanceWindow, error) {
return nil, nil
}
func (m *mockStore) GetAllMaintenanceWindows(int) ([]models.MaintenanceWindow, error) {
return nil, nil
}
func (m *mockStore) AddMaintenanceWindow(models.MaintenanceWindow) error { return nil }
func (m *mockStore) EndMaintenanceWindow(int) error { return nil }
func (m *mockStore) DeleteMaintenanceWindow(int) error { return nil }
func (m *mockStore) IsMonitorInMaintenance(int) (bool, error) { return false, nil }
func (m *mockStore) GetPreference(string) (string, error) { return "", nil }
func (m *mockStore) SetPreference(string, string) error { return nil }
func (m *mockStore) SaveStateChange(int, string, string, string) error { return nil }
func (m *mockStore) GetStateChanges(int, int) ([]models.StateChange, error) { return nil, nil }
func (m *mockStore) Close() error { return nil }
func TestMetricsHandler(t *testing.T) {
ms := &mockStore{
@@ -81,13 +97,13 @@ func TestMetricsHandler(t *testing.T) {
}
expected := []string{
"# HELP upkeep_monitor_up",
"# TYPE upkeep_monitor_up gauge",
`upkeep_monitor_up{id="1",name="Example",type="http"}`,
`upkeep_monitor_up{id="2",name="DNS Check",type="dns"}`,
"# HELP upkeep_monitor_latency_seconds",
"# HELP upkeep_monitor_paused",
"# HELP upkeep_monitor_checks_total",
"# HELP uptop_monitor_up",
"# TYPE uptop_monitor_up gauge",
`uptop_monitor_up{id="1",name="Example",type="http"}`,
`uptop_monitor_up{id="2",name="DNS Check",type="dns"}`,
"# HELP uptop_monitor_latency_seconds",
"# HELP uptop_monitor_paused",
"# HELP uptop_monitor_checks_total",
}
for _, s := range expected {
if !strings.Contains(body, s) {
+25
View File
@@ -35,6 +35,18 @@ type Site struct {
HasSSL bool
LastCheck time.Time
SentSSLWarning bool
LastError string
StatusChangedAt time.Time
LastSuccessAt time.Time
}
type StateChange struct {
ID int
SiteID int
FromStatus string
ToStatus string
ErrorReason string
ChangedAt time.Time
}
type AlertConfig struct {
@@ -67,8 +79,21 @@ type ProbeNode struct {
Version string
}
type MaintenanceWindow struct {
ID int
MonitorID int
Title string
Description string
Type string // "maintenance" or "incident"
StartTime time.Time
EndTime time.Time // zero = ongoing
CreatedBy string
CreatedAt time.Time
}
type Backup struct {
Sites []Site `json:"sites"`
Alerts []AlertConfig `json:"alerts"`
Users []User `json:"users"`
MaintenanceWindows []MaintenanceWindow `json:"maintenance_windows,omitempty"`
}
+1
View File
@@ -15,6 +15,7 @@ type NodeResult struct {
IsUp bool
LatencyNs int64
CheckedAt time.Time
ErrorReason string
}
func AggregateStatus(results []NodeResult, strategy AggregationStrategy) (isUp bool, avgLatencyNs int64) {
+49 -11
View File
@@ -2,13 +2,15 @@ package monitor
import (
"context"
"go-upkeep/internal/models"
"fmt"
"net"
"net/http"
"strconv"
"strings"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"github.com/miekg/dns"
probing "github.com/prometheus-community/pro-bing"
)
@@ -20,9 +22,28 @@ type CheckResult struct {
LatencyNs int64
HasSSL bool
CertExpiry time.Time
ErrorReason string
}
func RunCheck(site models.Site, strict, insecure *http.Client, globalInsecure bool) CheckResult {
func RunCheck(site models.Site, strict, insecure *http.Client, globalInsecure bool, allowPrivate ...bool) CheckResult {
private := len(allowPrivate) > 0 && allowPrivate[0]
if site.Type != "http" && site.Type != "dns" && !private {
host := site.Hostname
if host == "" {
host = site.URL
}
if host != "" {
if ips, err := net.LookupIP(host); err == nil {
for _, ip := range ips {
if isPrivateIP(ip) {
return CheckResult{SiteID: site.ID, Status: "DOWN", ErrorReason: "target resolves to private IP"}
}
}
}
}
}
switch site.Type {
case "http":
return runHTTPCheck(site, strict, insecure, globalInsecure)
@@ -33,7 +54,7 @@ func RunCheck(site models.Site, strict, insecure *http.Client, globalInsecure bo
case "dns":
return runDNSCheck(site)
default:
return CheckResult{SiteID: site.ID, Status: "DOWN"}
return CheckResult{SiteID: site.ID, Status: "DOWN", ErrorReason: "unsupported monitor type: " + site.Type}
}
}
@@ -49,7 +70,7 @@ func runHTTPCheck(site models.Site, strict, insecure *http.Client, globalInsecur
req, err := http.NewRequestWithContext(ctx, method, site.URL, nil)
if err != nil {
return CheckResult{SiteID: site.ID, Status: "DOWN"}
return CheckResult{SiteID: site.ID, Status: "DOWN", ErrorReason: "invalid request: " + err.Error()}
}
client := strict
@@ -69,6 +90,7 @@ func runHTTPCheck(site models.Site, strict, insecure *http.Client, globalInsecur
if err != nil {
result.Status = "DOWN"
result.ErrorReason = truncateError(err.Error(), 256)
return result
}
defer resp.Body.Close()
@@ -76,6 +98,11 @@ func runHTTPCheck(site models.Site, strict, insecure *http.Client, globalInsecur
result.StatusCode = resp.StatusCode
if !isCodeAccepted(resp.StatusCode, site.AcceptedCodes) {
result.Status = "DOWN"
expected := site.AcceptedCodes
if expected == "" {
expected = "200-299"
}
result.ErrorReason = fmt.Sprintf("HTTP %d (expected %s)", resp.StatusCode, expected)
}
if site.CheckSSL && resp.TLS != nil && len(resp.TLS.PeerCertificates) > 0 {
@@ -84,6 +111,7 @@ func runHTTPCheck(site models.Site, strict, insecure *http.Client, globalInsecur
result.CertExpiry = cert.NotAfter
if time.Now().After(cert.NotAfter) {
result.Status = "SSL EXP"
result.ErrorReason = "SSL certificate expired"
}
}
@@ -98,7 +126,7 @@ func runPingCheck(site models.Site) CheckResult {
pinger, err := probing.NewPinger(host)
if err != nil {
return CheckResult{SiteID: site.ID, Status: "DOWN"}
return CheckResult{SiteID: site.ID, Status: "DOWN", ErrorReason: "ping setup: " + err.Error()}
}
pinger.Count = 1
pinger.Timeout = siteTimeout(site)
@@ -108,8 +136,11 @@ func runPingCheck(site models.Site) CheckResult {
err = pinger.Run()
latency := time.Since(start)
if err != nil || pinger.Statistics().PacketsRecv == 0 {
return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds()}
if err != nil {
return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds(), ErrorReason: "ping failed: " + err.Error()}
}
if pinger.Statistics().PacketsRecv == 0 {
return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds(), ErrorReason: "no ICMP response"}
}
stats := pinger.Statistics()
@@ -129,9 +160,9 @@ func runPortCheck(site models.Site) CheckResult {
latency := time.Since(start)
if err != nil {
return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds()}
return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds(), ErrorReason: truncateError(err.Error(), 256)}
}
conn.Close()
_ = conn.Close()
return CheckResult{SiteID: site.ID, Status: "UP", LatencyNs: latency.Nanoseconds()}
}
@@ -180,10 +211,10 @@ func runDNSCheck(site models.Site) CheckResult {
latency := time.Since(start)
if err != nil {
return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds()}
return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds(), ErrorReason: "DNS query failed: " + err.Error()}
}
if r.Rcode != dns.RcodeSuccess {
return CheckResult{SiteID: site.ID, Status: "DOWN", StatusCode: r.Rcode, LatencyNs: latency.Nanoseconds()}
return CheckResult{SiteID: site.ID, Status: "DOWN", StatusCode: r.Rcode, LatencyNs: latency.Nanoseconds(), ErrorReason: "DNS RCODE: " + dns.RcodeToString[r.Rcode]}
}
return CheckResult{SiteID: site.ID, Status: "UP", LatencyNs: latency.Nanoseconds()}
}
@@ -216,3 +247,10 @@ func isCodeAccepted(code int, accepted string) bool {
}
return false
}
func truncateError(s string, max int) string {
if len(s) <= max {
return s
}
return s[:max-3] + "..."
}
+222
View File
@@ -0,0 +1,222 @@
package monitor
import (
"crypto/tls"
"net"
"net/http"
"net/http/httptest"
"strconv"
"testing"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
)
func TestRunCheck_HTTP_Success(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
}))
defer srv.Close()
site := models.Site{ID: 1, Type: "http", URL: srv.URL}
result := RunCheck(site, http.DefaultClient, http.DefaultClient, false)
if result.Status != "UP" {
t.Errorf("expected UP, got %s", result.Status)
}
if result.StatusCode != 200 {
t.Errorf("expected 200, got %d", result.StatusCode)
}
if result.LatencyNs <= 0 {
t.Error("expected positive latency")
}
}
func TestRunCheck_HTTP_ServerError(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(500)
}))
defer srv.Close()
site := models.Site{ID: 1, Type: "http", URL: srv.URL}
result := RunCheck(site, http.DefaultClient, http.DefaultClient, false)
if result.Status != "DOWN" {
t.Errorf("expected DOWN, got %s", result.Status)
}
if result.StatusCode != 500 {
t.Errorf("expected 500, got %d", result.StatusCode)
}
}
func TestRunCheck_HTTP_CustomAcceptedCodes(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(302)
}))
defer srv.Close()
client := &http.Client{CheckRedirect: func(req *http.Request, via []*http.Request) error {
return http.ErrUseLastResponse
}}
site := models.Site{ID: 1, Type: "http", URL: srv.URL, AcceptedCodes: "200-399"}
result := RunCheck(site, client, client, false)
if result.Status != "UP" {
t.Errorf("expected UP with accepted 200-399, got %s", result.Status)
}
}
func TestRunCheck_HTTP_MethodRespected(t *testing.T) {
var receivedMethod string
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
receivedMethod = r.Method
w.WriteHeader(200)
}))
defer srv.Close()
site := models.Site{ID: 1, Type: "http", URL: srv.URL, Method: "HEAD"}
RunCheck(site, http.DefaultClient, http.DefaultClient, false)
if receivedMethod != "HEAD" {
t.Errorf("expected HEAD, got %s", receivedMethod)
}
}
func TestRunCheck_HTTP_Timeout(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
time.Sleep(2 * time.Second)
w.WriteHeader(200)
}))
defer srv.Close()
site := models.Site{ID: 1, Type: "http", URL: srv.URL, Timeout: 1}
result := RunCheck(site, http.DefaultClient, http.DefaultClient, false)
if result.Status != "DOWN" {
t.Errorf("expected DOWN on timeout, got %s", result.Status)
}
}
func TestRunCheck_HTTP_SSLFields(t *testing.T) {
srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
}))
defer srv.Close()
insecureClient := &http.Client{
Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: true}},
}
site := models.Site{ID: 1, Type: "http", URL: srv.URL, CheckSSL: true, IgnoreTLS: true}
result := RunCheck(site, http.DefaultClient, insecureClient, false)
if result.Status != "UP" {
t.Errorf("expected UP, got %s", result.Status)
}
if !result.HasSSL {
t.Error("expected HasSSL=true")
}
if result.CertExpiry.IsZero() {
t.Error("expected CertExpiry populated")
}
}
func TestRunCheck_Port_Open(t *testing.T) {
ln, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatal(err)
}
defer ln.Close()
_, portStr, _ := net.SplitHostPort(ln.Addr().String())
port, _ := strconv.Atoi(portStr)
site := models.Site{ID: 1, Type: "port", Hostname: "127.0.0.1", Port: port, Timeout: 2}
result := RunCheck(site, nil, nil, false, true)
if result.Status != "UP" {
t.Errorf("expected UP, got %s", result.Status)
}
if result.LatencyNs <= 0 {
t.Error("expected positive latency")
}
}
func TestRunCheck_Port_Closed(t *testing.T) {
ln, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatal(err)
}
_, portStr, _ := net.SplitHostPort(ln.Addr().String())
port, _ := strconv.Atoi(portStr)
ln.Close()
site := models.Site{ID: 1, Type: "port", Hostname: "127.0.0.1", Port: port, Timeout: 1}
result := RunCheck(site, nil, nil, false, true)
if result.Status != "DOWN" {
t.Errorf("expected DOWN, got %s", result.Status)
}
}
func TestRunCheck_Port_BlocksPrivateByDefault(t *testing.T) {
ln, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatal(err)
}
defer ln.Close()
_, portStr, _ := net.SplitHostPort(ln.Addr().String())
port, _ := strconv.Atoi(portStr)
site := models.Site{ID: 1, Type: "port", Hostname: "127.0.0.1", Port: port, Timeout: 2}
result := RunCheck(site, nil, nil, false)
if result.Status != "DOWN" {
t.Errorf("expected DOWN when private targets blocked, got %s", result.Status)
}
}
func TestRunCheck_UnknownType(t *testing.T) {
site := models.Site{ID: 1, Type: "invalid"}
result := RunCheck(site, nil, nil, false)
if result.Status != "DOWN" {
t.Errorf("expected DOWN for unknown type, got %s", result.Status)
}
}
func TestIsCodeAccepted(t *testing.T) {
tests := []struct {
code int
accepted string
want bool
}{
{200, "", true},
{299, "", true},
{300, "", false},
{302, "200-399", true},
{400, "200-399", false},
{301, "200,301,404", true},
{500, "200,301,404", false},
{404, "200-299,400-499", true},
{500, "200-299,400-499", false},
}
for _, tt := range tests {
got := isCodeAccepted(tt.code, tt.accepted)
if got != tt.want {
t.Errorf("isCodeAccepted(%d, %q) = %v, want %v", tt.code, tt.accepted, got, tt.want)
}
}
}
func TestSiteTimeout(t *testing.T) {
if got := siteTimeout(models.Site{Timeout: 0}); got != 5*time.Second {
t.Errorf("expected 5s default, got %v", got)
}
if got := siteTimeout(models.Site{Timeout: 10}); got != 10*time.Second {
t.Errorf("expected 10s, got %v", got)
}
}
+273 -37
View File
@@ -4,14 +4,33 @@ import (
"context"
"crypto/tls"
"fmt"
"go-upkeep/internal/alert"
"go-upkeep/internal/models"
"go-upkeep/internal/store"
"math/rand/v2"
"net/http"
"regexp"
"strings"
"sync"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/alert"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/store"
)
const (
maxLogEntries = 100
pollInterval = 5 * time.Second
minCheckInterval = 5
minPushGrace = 60 * time.Second
)
type AlertHealth struct {
LastSendAt time.Time
LastSendOK bool
LastError string
SendCount int
FailCount int
}
type Engine struct {
mu sync.RWMutex
liveState map[int]models.Site
@@ -25,32 +44,53 @@ type Engine struct {
histMu sync.RWMutex
histories map[int]*SiteHistory
tokenIndex map[string]int
tokenIndex map[string]int // protected by mu
probeResultsMu sync.RWMutex
probeResults map[int]map[string]NodeResult
aggStrategy AggregationStrategy
alertHealthMu sync.RWMutex
alertHealth map[int]AlertHealth
db store.Store
insecureSkipVerify bool
allowPrivateTargets bool
strictClient *http.Client
insecureClient *http.Client
}
func NewEngine(s store.Store) *Engine {
return newEngine(s, false)
}
func NewEngineWithOpts(s store.Store, allowPrivateTargets bool) *Engine {
return newEngine(s, allowPrivateTargets)
}
func newEngine(s store.Store, allowPrivateTargets bool) *Engine {
dial := SafeDialContext(allowPrivateTargets)
return &Engine{
liveState: make(map[int]models.Site),
histories: make(map[int]*SiteHistory),
tokenIndex: make(map[string]int),
probeResults: make(map[int]map[string]NodeResult),
alertHealth: make(map[int]AlertHealth),
aggStrategy: AggAnyDown,
isActive: true,
allowPrivateTargets: allowPrivateTargets,
db: s,
strictClient: &http.Client{
Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: false}},
Transport: &http.Transport{
TLSClientConfig: &tls.Config{InsecureSkipVerify: false},
DialContext: dial,
},
},
insecureClient: &http.Client{
Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: true}},
Transport: &http.Transport{
TLSClientConfig: &tls.Config{InsecureSkipVerify: true}, //nolint:gosec // intentional for IgnoreTLS sites
DialContext: dial,
},
},
}
}
@@ -59,14 +99,36 @@ func (e *Engine) SetInsecureSkipVerify(skip bool) {
e.insecureSkipVerify = skip
}
var ansiRe = regexp.MustCompile(`\x1b\[[0-9;]*[a-zA-Z]`)
func sanitizeLog(s string) string {
s = ansiRe.ReplaceAllString(s, "")
s = strings.ReplaceAll(s, "\n", "\\n")
s = strings.ReplaceAll(s, "\r", "")
return s
}
func fmtDurationShort(d time.Duration) string {
if d < time.Minute {
return fmt.Sprintf("%ds", int(d.Seconds()))
}
if d < time.Hour {
return fmt.Sprintf("%dm", int(d.Minutes()))
}
if d < 24*time.Hour {
return fmt.Sprintf("%dh %dm", int(d.Hours()), int(d.Minutes())%60)
}
return fmt.Sprintf("%dd %dh", int(d.Hours())/24, int(d.Hours())%24)
}
func (e *Engine) AddLog(msg string) {
e.logMu.Lock()
defer e.logMu.Unlock()
ts := time.Now().Format("15:04:05")
entry := fmt.Sprintf("[%s] %s", ts, msg)
entry := fmt.Sprintf("[%s] %s", ts, sanitizeLog(msg))
e.logStore = append([]string{entry}, e.logStore...)
if len(e.logStore) > 100 {
e.logStore = e.logStore[:100]
if len(e.logStore) > maxLogEntries {
e.logStore = e.logStore[:maxLogEntries]
}
go func() { _ = e.db.SaveLog(entry) }()
}
@@ -149,17 +211,38 @@ func (e *Engine) RecordHeartbeat(token string) bool {
return false
}
prevStatus := site.Status
site.LastCheck = time.Now()
wasDown := site.Status == "DOWN"
site.Status = "UP"
site.FailureCount = 0
site.Latency = 0
site.LastError = ""
site.LastSuccessAt = time.Now()
if prevStatus != "UP" {
site.StatusChangedAt = time.Now()
}
e.liveState[targetID] = site
if wasDown {
e.AddLog(fmt.Sprintf("Push Monitor '%s' recovered", site.Name))
e.triggerAlert(site.AlertID, "✅ RECOVERY", fmt.Sprintf("Push Monitor '%s' is receiving heartbeats.", site.Name))
switch prevStatus {
case "PENDING":
e.AddLog(fmt.Sprintf("Push Monitor '%s' received first heartbeat", site.Name))
case "LATE":
e.AddLog(fmt.Sprintf("Push Monitor '%s' heartbeat arrived (was late)", site.Name))
case "DOWN":
downDur := ""
if !site.StatusChangedAt.IsZero() {
downDur = fmt.Sprintf(" (was down %s)", fmtDurationShort(time.Since(site.StatusChangedAt)))
}
e.AddLog(fmt.Sprintf("Push Monitor '%s' recovered%s", site.Name, downDur))
go e.triggerAlert(site.AlertID, "✅ RECOVERY", fmt.Sprintf("Push Monitor '%s' is receiving heartbeats.%s", site.Name, downDur))
}
if prevStatus != "UP" && prevStatus != "PENDING" {
go func() { _ = e.db.SaveStateChange(targetID, prevStatus, "UP", "") }()
}
return true
}
@@ -191,7 +274,7 @@ func (e *Engine) Start(ctx context.Context) {
if err != nil {
e.AddLog(fmt.Sprintf("Failed to load sites: %v", err))
select {
case <-time.After(5 * time.Second):
case <-time.After(pollInterval):
case <-ctx.Done():
return
}
@@ -204,9 +287,6 @@ func (e *Engine) Start(ctx context.Context) {
if !exists {
e.mu.Lock()
s.Status = "PENDING"
if s.Type == "push" {
s.LastCheck = time.Now()
}
if h, ok := e.GetHistory(s.ID); ok && len(h.Statuses) > 0 {
if h.Statuses[len(h.Statuses)-1] {
s.Status = "UP"
@@ -225,7 +305,7 @@ func (e *Engine) Start(ctx context.Context) {
}
select {
case <-time.After(5 * time.Second):
case <-time.After(pollInterval):
case <-ctx.Done():
return
}
@@ -246,6 +326,9 @@ func (e *Engine) UpdateSiteConfig(site models.Site) {
site.LastCheck = existing.LastCheck
site.SentSSLWarning = existing.SentSSLWarning
site.FailureCount = existing.FailureCount
site.LastError = existing.LastError
site.StatusChangedAt = existing.StatusChangedAt
site.LastSuccessAt = existing.LastSuccessAt
e.liveState[site.ID] = site
e.addToTokenIndex(site)
}
@@ -277,6 +360,14 @@ func (e *Engine) ToggleSitePause(id int) bool {
}
func (e *Engine) monitorRoutine(ctx context.Context, id int) {
// Stagger initial check to avoid thundering herd on startup
stagger := time.Duration(rand.IntN(3000)) * time.Millisecond //nolint:gosec // non-security jitter
select {
case <-time.After(stagger):
case <-ctx.Done():
return
}
e.checkByID(id)
for {
select {
@@ -287,7 +378,7 @@ func (e *Engine) monitorRoutine(ctx context.Context, id int) {
if !e.IsActive() {
select {
case <-time.After(5 * time.Second):
case <-time.After(pollInterval):
case <-ctx.Done():
return
}
@@ -303,7 +394,7 @@ func (e *Engine) monitorRoutine(ctx context.Context, id int) {
if site.Paused {
select {
case <-time.After(5 * time.Second):
case <-time.After(pollInterval):
case <-ctx.Done():
return
}
@@ -311,11 +402,12 @@ func (e *Engine) monitorRoutine(ctx context.Context, id int) {
}
interval := site.Interval
if interval < 5 {
interval = 5
if interval < minCheckInterval {
interval = minCheckInterval
}
jitter := time.Duration(rand.IntN(interval*100)) * time.Millisecond //nolint:gosec // non-security jitter
select {
case <-time.After(time.Duration(interval) * time.Second):
case <-time.After(time.Duration(interval)*time.Second + jitter):
case <-ctx.Done():
return
}
@@ -341,39 +433,68 @@ func (e *Engine) checkByID(id int) {
case "group":
e.checkGroup(site)
default:
result := RunCheck(site, e.strictClient, e.insecureClient, e.insecureSkipVerify)
result := RunCheck(site, e.strictClient, e.insecureClient, e.insecureSkipVerify, e.allowPrivateTargets)
updatedSite := site
updatedSite.HasSSL = result.HasSSL
updatedSite.CertExpiry = result.CertExpiry
updatedSite.Latency = time.Duration(result.LatencyNs)
updatedSite.LastCheck = time.Now()
e.handleStatusChange(updatedSite, result.Status, result.StatusCode, time.Duration(result.LatencyNs))
e.handleStatusChange(updatedSite, result.Status, result.StatusCode, time.Duration(result.LatencyNs), result.ErrorReason)
}
}
func (e *Engine) checkPush(site models.Site) {
deadline := site.LastCheck.Add(time.Duration(site.Interval) * time.Second).Add(5 * time.Second)
if time.Now().After(deadline) {
e.handleStatusChange(site, "DOWN", 0, 0)
} else if site.Status != "UP" {
e.handleStatusChange(site, "UP", 200, 0)
if site.Status == "PENDING" {
return
}
interval := time.Duration(site.Interval) * time.Second
grace := interval / 2
if grace < minPushGrace {
grace = minPushGrace
}
overdue := site.LastCheck.Add(interval)
graceEnd := overdue.Add(grace)
now := time.Now()
if now.After(graceEnd) {
if site.Status != "DOWN" {
e.handleStatusChange(site, "DOWN", 0, 0, "heartbeat missed")
}
} else if now.After(overdue) {
if site.Status != "LATE" {
e.handleStatusChange(site, "LATE", 0, 0, "heartbeat overdue")
}
}
}
func (e *Engine) handleStatusChange(site models.Site, rawStatus string, code int, latency time.Duration) {
func (e *Engine) handleStatusChange(site models.Site, rawStatus string, code int, latency time.Duration, errorReason string) {
if !e.IsActive() {
return
}
newState := site
newState.StatusCode = code
newState.LastError = errorReason
if rawStatus == "UP" {
newState.LastSuccessAt = time.Now()
newState.LastError = ""
} else {
newState.LastSuccessAt = site.LastSuccessAt
}
if site.Status == "UP" && rawStatus != "UP" {
newState.FailureCount++
if newState.FailureCount > site.MaxRetries {
newState.Status = rawStatus
newState.FailureCount = site.MaxRetries + 1
if errorReason != "" {
e.AddLog(fmt.Sprintf("Monitor '%s' confirmed DOWN: %s", site.Name, errorReason))
} else {
e.AddLog(fmt.Sprintf("Monitor '%s' confirmed DOWN", site.Name))
}
} else {
e.AddLog(fmt.Sprintf("Monitor '%s' failed check %d/%d", site.Name, newState.FailureCount, site.MaxRetries))
}
@@ -385,10 +506,24 @@ func (e *Engine) handleStatusChange(site models.Site, rawStatus string, code int
newState.FailureCount = site.MaxRetries + 1
}
if newState.Status != site.Status && site.Status != "PENDING" {
newState.StatusChangedAt = time.Now()
} else if site.StatusChangedAt.IsZero() && newState.Status != "PENDING" {
newState.StatusChangedAt = time.Now()
} else {
newState.StatusChangedAt = site.StatusChangedAt
}
inMaint := e.isInMaintenance(site.ID)
if site.Type == "http" && site.CheckSSL && site.HasSSL {
daysLeft := int(time.Until(site.CertExpiry).Hours() / 24)
if daysLeft <= site.ExpiryThreshold && !site.SentSSLWarning && rawStatus != "SSL EXP" {
if !inMaint {
e.triggerAlert(site.AlertID, "SSL WARNING", fmt.Sprintf("SSL for '%s' expires in %d days", site.Name, daysLeft))
} else {
e.AddLog(fmt.Sprintf("SSL warning for '%s' suppressed (maintenance)", site.Name))
}
newState.SentSSLWarning = true
} else if daysLeft > site.ExpiryThreshold {
newState.SentSSLWarning = false
@@ -403,22 +538,49 @@ func (e *Engine) handleStatusChange(site models.Site, rawStatus string, code int
e.recordCheck(site.ID, latency, rawStatus == "UP")
if newState.Status != site.Status && site.Status != "PENDING" {
go func() { _ = e.db.SaveStateChange(site.ID, site.Status, newState.Status, errorReason) }()
}
isBroken := func(s string) bool { return s == "DOWN" || s == "SSL EXP" }
if site.Status == "UP" && newState.Status == "LATE" {
e.AddLog(fmt.Sprintf("Monitor '%s' heartbeat overdue", site.Name))
}
if !isBroken(site.Status) && isBroken(newState.Status) && newState.Status != "PENDING" {
if inMaint {
e.AddLog(fmt.Sprintf("Monitor '%s' is DOWN (alerts suppressed — maintenance)", site.Name))
} else {
msg := fmt.Sprintf("Monitor '%s' is DOWN (%s)", site.Name, rawStatus)
if errorReason != "" {
msg = fmt.Sprintf("Monitor '%s' is DOWN: %s", site.Name, errorReason)
}
if site.Type == "push" {
msg = fmt.Sprintf("Push Monitor '%s' missed heartbeat.", site.Name)
}
e.triggerAlert(site.AlertID, "🚨 ALERT", msg)
}
}
if isBroken(site.Status) && newState.Status == "UP" {
e.triggerAlert(site.AlertID, "✅ RECOVERY", fmt.Sprintf("Monitor '%s' is UP", site.Name))
downDur := ""
if !site.StatusChangedAt.IsZero() {
downDur = fmt.Sprintf(" (was down %s)", fmtDurationShort(time.Since(site.StatusChangedAt)))
}
e.AddLog(fmt.Sprintf("Monitor '%s' recovered%s", site.Name, downDur))
if !inMaint {
e.triggerAlert(site.AlertID, "✅ RECOVERY", fmt.Sprintf("Monitor '%s' is UP%s", site.Name, downDur))
}
}
if site.Status == "LATE" && newState.Status == "UP" && !isBroken(site.Status) {
e.AddLog(fmt.Sprintf("Monitor '%s' heartbeat arrived (was late)", site.Name))
}
}
func (e *Engine) triggerAlert(alertID int, title, message string) {
cfg, err := e.db.GetAlert(alertID)
if err != nil {
e.AddLog(fmt.Sprintf("Failed to load alert config %d: %v", alertID, err))
return
}
provider := alert.GetProvider(cfg)
@@ -426,12 +588,77 @@ func (e *Engine) triggerAlert(alertID int, title, message string) {
go func() {
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
_ = ctx
_ = provider.Send(title, message)
if err := provider.Send(ctx, title, message); err != nil {
e.AddLog(fmt.Sprintf("Alert send failed (%s): %v", cfg.Name, err))
e.recordAlertResult(alertID, false, err.Error())
} else {
e.recordAlertResult(alertID, true, "")
}
}()
}
}
func (e *Engine) recordAlertResult(alertID int, ok bool, errMsg string) {
e.alertHealthMu.Lock()
defer e.alertHealthMu.Unlock()
h := e.alertHealth[alertID]
h.LastSendAt = time.Now()
h.LastSendOK = ok
h.SendCount++
if ok {
h.LastError = ""
} else {
h.LastError = errMsg
h.FailCount++
}
e.alertHealth[alertID] = h
}
func (e *Engine) GetAlertHealth(alertID int) AlertHealth {
e.alertHealthMu.RLock()
defer e.alertHealthMu.RUnlock()
return e.alertHealth[alertID]
}
func (e *Engine) TestAlert(alertID int) error {
cfg, err := e.db.GetAlert(alertID)
if err != nil {
return fmt.Errorf("failed to load alert: %w", err)
}
provider := alert.GetProvider(cfg)
if provider == nil {
return fmt.Errorf("no provider for type %q", cfg.Type)
}
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
err = provider.Send(ctx, "🧪 Test Alert", fmt.Sprintf("Test notification from uptop for channel '%s'.", cfg.Name))
if err != nil {
e.recordAlertResult(alertID, false, err.Error())
return err
}
e.recordAlertResult(alertID, true, "")
e.AddLog(fmt.Sprintf("Test alert sent to '%s'", cfg.Name))
return nil
}
func (e *Engine) isInMaintenance(monitorID int) bool {
inMaint, err := e.db.IsMonitorInMaintenance(monitorID)
if err != nil {
return false
}
return inMaint
}
func (e *Engine) GetDisplayStatus(site models.Site) string {
if site.Paused {
return "PAUSED"
}
if e.isInMaintenance(site.ID) {
return "MAINT"
}
return site.Status
}
func (e *Engine) checkGroup(site models.Site) {
e.mu.RLock()
status := "UP"
@@ -445,7 +672,7 @@ func (e *Engine) checkGroup(site models.Site) {
if !child.Paused {
allPaused = false
}
if child.Paused {
if child.Paused || e.isInMaintenance(child.ID) {
continue
}
if child.Status == "DOWN" || child.Status == "SSL EXP" {
@@ -474,7 +701,7 @@ func (e *Engine) SetAggStrategy(strategy AggregationStrategy) {
e.aggStrategy = strategy
}
func (e *Engine) IngestProbeResult(nodeID string, siteID int, latencyNs int64, isUp bool) {
func (e *Engine) IngestProbeResult(nodeID string, siteID int, latencyNs int64, isUp bool, errorReason string) {
e.probeResultsMu.Lock()
if e.probeResults[siteID] == nil {
e.probeResults[siteID] = make(map[string]NodeResult)
@@ -484,6 +711,7 @@ func (e *Engine) IngestProbeResult(nodeID string, siteID int, latencyNs int64, i
IsUp: isUp,
LatencyNs: latencyNs,
CheckedAt: time.Now(),
ErrorReason: errorReason,
}
results := make([]NodeResult, 0, len(e.probeResults[siteID]))
for _, r := range e.probeResults[siteID] {
@@ -508,7 +736,7 @@ func (e *Engine) IngestProbeResult(nodeID string, siteID int, latencyNs int64, i
updatedSite := site
updatedSite.Latency = time.Duration(avgLatency)
updatedSite.LastCheck = time.Now()
e.handleStatusChange(updatedSite, rawStatus, 0, time.Duration(avgLatency))
e.handleStatusChange(updatedSite, rawStatus, 0, time.Duration(avgLatency), errorReason)
}
func (e *Engine) GetProbeResults(siteID int) map[string]NodeResult {
@@ -521,3 +749,11 @@ func (e *Engine) GetProbeResults(siteID int) map[string]NodeResult {
}
return cp
}
func (e *Engine) GetStateChanges(siteID int, limit int) []models.StateChange {
changes, err := e.db.GetStateChanges(siteID, limit)
if err != nil {
return nil
}
return changes
}
File diff suppressed because it is too large Load Diff
+68
View File
@@ -0,0 +1,68 @@
package monitor
import (
"context"
"fmt"
"net"
"time"
)
var privateRanges []*net.IPNet
func init() {
cidrs := []string{
"127.0.0.0/8",
"::1/128",
"10.0.0.0/8",
"172.16.0.0/12",
"192.168.0.0/16",
"169.254.0.0/16",
"fe80::/10",
"fc00::/7",
}
for _, cidr := range cidrs {
_, network, _ := net.ParseCIDR(cidr)
privateRanges = append(privateRanges, network)
}
}
func isPrivateIP(ip net.IP) bool {
for _, network := range privateRanges {
if network.Contains(ip) {
return true
}
}
return false
}
func SafeDialContext(allowPrivate bool) func(ctx context.Context, network, addr string) (net.Conn, error) {
return func(ctx context.Context, network, addr string) (net.Conn, error) {
host, port, err := net.SplitHostPort(addr)
if err != nil {
return nil, err
}
ips, err := net.DefaultResolver.LookupIPAddr(ctx, host)
if err != nil {
return nil, err
}
if !allowPrivate {
for _, ip := range ips {
if isPrivateIP(ip.IP) {
return nil, fmt.Errorf("blocked: %s resolves to private address %s", host, ip.IP)
}
}
}
dialer := &net.Dialer{Timeout: 10 * time.Second}
for _, ip := range ips {
target := net.JoinHostPort(ip.IP.String(), port)
conn, err := dialer.DialContext(ctx, network, target)
if err == nil {
return conn, nil
}
}
return nil, fmt.Errorf("failed to connect to %s", addr)
}
}
+47
View File
@@ -0,0 +1,47 @@
package monitor
import (
"net"
"testing"
)
func TestIsPrivateIP(t *testing.T) {
tests := []struct {
ip string
private bool
}{
{"127.0.0.1", true},
{"10.0.0.1", true},
{"172.16.0.1", true},
{"192.168.1.1", true},
{"169.254.169.254", true},
{"::1", true},
{"8.8.8.8", false},
{"1.1.1.1", false},
{"93.184.216.34", false},
}
for _, tt := range tests {
ip := net.ParseIP(tt.ip)
got := isPrivateIP(ip)
if got != tt.private {
t.Errorf("isPrivateIP(%s) = %v, want %v", tt.ip, got, tt.private)
}
}
}
func TestSafeDialContext_BlocksPrivate(t *testing.T) {
dial := SafeDialContext(false)
_, err := dial(t.Context(), "tcp", "127.0.0.1:80")
if err == nil {
t.Error("expected error dialing loopback with private blocking enabled")
}
}
func TestSafeDialContext_AllowsPrivate(t *testing.T) {
dial := SafeDialContext(true)
_, err := dial(t.Context(), "tcp", "127.0.0.1:80")
// Will fail to connect (nothing listening) but should NOT be blocked
if err != nil && err.Error() == "blocked: 127.0.0.1 resolves to private address 127.0.0.1" {
t.Error("should not block private IPs when allowPrivate is true")
}
}
+91
View File
@@ -0,0 +1,91 @@
package server
import (
"net"
"net/http"
"sync"
"time"
)
type visitor struct {
tokens float64
lastSeen time.Time
}
type RateLimiter struct {
mu sync.Mutex
visitors map[string]*visitor
rate float64
burst float64
}
func NewRateLimiter(requestsPerMinute int) *RateLimiter {
rl := &RateLimiter{
visitors: make(map[string]*visitor),
rate: float64(requestsPerMinute) / 60.0,
burst: float64(requestsPerMinute),
}
go rl.cleanup()
return rl
}
func (rl *RateLimiter) Allow(ip string) bool {
rl.mu.Lock()
defer rl.mu.Unlock()
v, exists := rl.visitors[ip]
now := time.Now()
if !exists {
rl.visitors[ip] = &visitor{tokens: rl.burst - 1, lastSeen: now}
return true
}
elapsed := now.Sub(v.lastSeen).Seconds()
v.tokens += elapsed * rl.rate
if v.tokens > rl.burst {
v.tokens = rl.burst
}
v.lastSeen = now
if v.tokens < 1 {
return false
}
v.tokens--
return true
}
func (rl *RateLimiter) cleanup() {
for {
time.Sleep(5 * time.Minute)
rl.mu.Lock()
cutoff := time.Now().Add(-10 * time.Minute)
for ip, v := range rl.visitors {
if v.lastSeen.Before(cutoff) {
delete(rl.visitors, ip)
}
}
rl.mu.Unlock()
}
}
func clientIP(r *http.Request) string {
if fwd := r.Header.Get("X-Forwarded-For"); fwd != "" {
return fwd
}
host, _, err := net.SplitHostPort(r.RemoteAddr)
if err != nil {
return r.RemoteAddr
}
return host
}
func RateLimit(limiter *RateLimiter, next http.HandlerFunc) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
if !limiter.Allow(clientIP(r)) {
http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
return
}
next(w, r)
}
}
+249 -75
View File
@@ -1,20 +1,54 @@
package server
import (
"crypto/subtle"
"encoding/json"
"fmt"
"go-upkeep/internal/importer"
"go-upkeep/internal/metrics"
"go-upkeep/internal/models"
"go-upkeep/internal/monitor"
"go-upkeep/internal/store"
"html/template"
"log"
"net/http"
"sort"
"strings"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/importer"
"gitea.lerkolabs.com/lerko/uptop/internal/metrics"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
"gitea.lerkolabs.com/lerko/uptop/internal/store"
)
const maxRequestBody = 1 << 20
func checkSecret(got, want string) bool {
return subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1
}
func extractBearerToken(r *http.Request) string {
auth := r.Header.Get("Authorization")
if strings.HasPrefix(auth, "Bearer ") {
return strings.TrimPrefix(auth, "Bearer ")
}
return ""
}
var sensitiveKeys = map[string]bool{
"pass": true, "password": true, "token": true,
"routing_key": true, "user": true, "username": true,
}
func redactSettings(settings map[string]string) map[string]string {
redacted := make(map[string]string, len(settings))
for k, v := range settings {
if sensitiveKeys[k] && v != "" {
redacted[k] = "***REDACTED***"
} else {
redacted[k] = v
}
}
return redacted
}
var statusTpl = template.Must(template.New("status").Parse(`
<!DOCTYPE html>
<html>
@@ -33,8 +67,10 @@ var statusTpl = template.Must(template.New("status").Parse(`
.UP { background: #9ece6a; color: #1a1b26; }
.DOWN { background: #f7768e; color: #1a1b26; }
.PENDING { background: #e0af68; color: #1a1b26; }
.LATE { background: #e0af68; color: #1a1b26; }
.SSL-EXP { background: #e0af68; color: #1a1b26; }
.PAUSED { background: #565f89; color: #c0caf5; }
.MAINT { background: #bb9af7; color: #1a1b26; }
.summary { display: flex; justify-content: center; gap: 16px; margin-bottom: 24px; font-size: 0.95em; font-weight: 600; }
.summary span { padding: 4px 12px; border-radius: 6px; }
.summary .s-up { color: #9ece6a; }
@@ -52,7 +88,7 @@ var statusTpl = template.Must(template.New("status").Parse(`
<div id="summary" class="summary"></div>
<div id="stale" class="stale-bar"></div>
<div id="cards"></div>
<div style="text-align: center; margin-top: 40px; color: #565f89; font-size: 0.8em;">Powered by Go-Upkeep</div>
<div style="text-align: center; margin-top: 40px; color: #565f89; font-size: 0.8em;">Powered by uptop</div>
</div>
<script>
var lastUpdate = null;
@@ -68,15 +104,17 @@ var statusTpl = template.Must(template.New("status").Parse(`
}
function renderSummary(sites) {
var up = 0, down = 0, paused = 0, total = sites.length;
var up = 0, down = 0, paused = 0, maint = 0, total = sites.length;
for (var i = 0; i < sites.length; i++) {
if (sites[i].Paused) { paused++; continue; }
if (sites[i].Status === 'MAINT') { maint++; continue; }
if (sites[i].Status === 'UP') up++;
else if (sites[i].Status === 'DOWN') down++;
}
var el = document.getElementById('summary');
var parts = ['<span class="s-total">' + up + '/' + total + ' UP</span>'];
if (down > 0) parts.push('<span class="s-down">' + down + ' DOWN</span>');
if (maint > 0) parts.push('<span style="color:#bb9af7">' + maint + ' MAINT</span>');
if (paused > 0) parts.push('<span class="s-paused">' + paused + ' PAUSED</span>');
el.innerHTML = parts.join('<span style="color:#383838">·</span>');
}
@@ -110,7 +148,7 @@ var statusTpl = template.Must(template.New("status").Parse(`
renderSummary(sites);
for (var i = 0; i < sites.length; i++) {
var s = sites[i];
var st = s.Paused ? 'PAUSED' : s.Status;
var st = s.Status === 'MAINT' ? 'MAINT' : s.Paused ? 'PAUSED' : s.Status;
var cls = cssClass(st);
var meta = esc(s.Type) + ' | ' + (s.Type === 'http' ? esc(s.URL) : 'Heartbeat Monitor');
var lc = s.LastCheck ? new Date(s.LastCheck).toLocaleTimeString('en-GB', {hour12: false}) : '—';
@@ -147,113 +185,146 @@ type ServerConfig struct {
Port int
EnableStatus bool
Title string
ClusterKey string // Shared Secret for Security
ClusterKey string
TLSCert string
TLSKey string
ClusterMode string
MetricsPublic bool
CORSOrigin string
}
func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) {
func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
if cfg.ClusterKey == "" {
fmt.Println("WARNING: No UPKEEP_CLUSTER_SECRET set. Cluster API endpoints are unauthenticated.")
fmt.Println("WARNING: No UPTOP_CLUSTER_SECRET set. Cluster API endpoints are unauthenticated.")
}
pushRL := NewRateLimiter(60)
probeRL := NewRateLimiter(30)
backupRL := NewRateLimiter(10)
statusRL := NewRateLimiter(120)
mux := http.NewServeMux()
// 1. Push Heartbeat
mux.HandleFunc("/api/push", func(w http.ResponseWriter, r *http.Request) {
token := r.URL.Query().Get("token")
mux.HandleFunc("/api/push", RateLimit(pushRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet && r.Method != http.MethodPost {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
token := extractBearerToken(r)
if token == "" {
http.Error(w, "Missing token", 400)
if qt := r.URL.Query().Get("token"); qt != "" {
token = qt
log.Printf("DEPRECATED: push token in query string — use Authorization: Bearer header instead")
}
}
if token == "" {
http.Error(w, "Missing token", http.StatusBadRequest)
return
}
if eng.RecordHeartbeat(token) {
w.WriteHeader(http.StatusOK)
w.Write([]byte("OK"))
_, _ = w.Write([]byte("OK"))
} else {
http.Error(w, "Invalid Token", 404)
http.Error(w, "Invalid Token", http.StatusNotFound)
}
})
}))
// 2. Health Check (For Cluster Follower)
mux.HandleFunc("/api/health", func(w http.ResponseWriter, r *http.Request) {
if cfg.ClusterKey != "" && r.Header.Get("X-Upkeep-Secret") != cfg.ClusterKey {
http.Error(w, "Unauthorized", 401)
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
if cfg.ClusterKey != "" && !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
w.WriteHeader(http.StatusOK)
w.Write([]byte("OK"))
_, _ = w.Write([]byte("OK"))
})
// 3. Config Export
mux.HandleFunc("/api/backup/export", func(w http.ResponseWriter, r *http.Request) {
if cfg.ClusterKey == "" || r.Header.Get("X-Upkeep-Secret") != cfg.ClusterKey {
http.Error(w, "Unauthorized: UPKEEP_CLUSTER_SECRET required", 401)
mux.HandleFunc("/api/backup/export", RateLimit(backupRL, func(w http.ResponseWriter, r *http.Request) {
if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized: UPTOP_CLUSTER_SECRET required", http.StatusUnauthorized)
return
}
data, err := s.ExportData()
if err != nil {
log.Printf("Export failed: %v", err)
http.Error(w, "Export failed", 500)
http.Error(w, "Export failed", http.StatusInternalServerError)
return
}
json.NewEncoder(w).Encode(data)
})
if r.URL.Query().Get("redact_secrets") != "false" {
for i := range data.Alerts {
data.Alerts[i].Settings = redactSettings(data.Alerts[i].Settings)
}
}
_ = json.NewEncoder(w).Encode(data) //nolint:errcheck
}))
// 4. Config Import
mux.HandleFunc("/api/backup/import", func(w http.ResponseWriter, r *http.Request) {
mux.HandleFunc("/api/backup/import", RateLimit(backupRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" {
http.Error(w, "POST required", 405)
http.Error(w, "POST required", http.StatusMethodNotAllowed)
return
}
if cfg.ClusterKey == "" || r.Header.Get("X-Upkeep-Secret") != cfg.ClusterKey {
http.Error(w, "Unauthorized", 401)
if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var data models.Backup
if err := json.NewDecoder(r.Body).Decode(&data); err != nil {
http.Error(w, "Invalid JSON", 400)
http.Error(w, "Invalid JSON", http.StatusBadRequest)
return
}
if err := s.ImportData(data); err != nil {
log.Printf("Import failed: %v", err)
http.Error(w, "Import failed", 500)
http.Error(w, "Import failed", http.StatusInternalServerError)
return
}
w.Write([]byte("Import Successful"))
})
_, _ = w.Write([]byte("Import Successful"))
}))
// 5. Kuma Import
mux.HandleFunc("/api/import/kuma", func(w http.ResponseWriter, r *http.Request) {
mux.HandleFunc("/api/import/kuma", RateLimit(backupRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" {
http.Error(w, "POST required", 405)
http.Error(w, "POST required", http.StatusMethodNotAllowed)
return
}
if cfg.ClusterKey == "" || r.Header.Get("X-Upkeep-Secret") != cfg.ClusterKey {
http.Error(w, "Unauthorized", 401)
if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var kb importer.KumaBackup
if err := json.NewDecoder(r.Body).Decode(&kb); err != nil {
log.Printf("Invalid Kuma JSON: %v", err)
http.Error(w, "Invalid Kuma JSON", 400)
http.Error(w, "Invalid Kuma JSON", http.StatusBadRequest)
return
}
backup := importer.ConvertKuma(&kb)
if err := s.ImportData(backup); err != nil {
log.Printf("Kuma import failed: %v", err)
http.Error(w, "Import failed", 500)
http.Error(w, "Import failed", http.StatusInternalServerError)
return
}
w.Write([]byte(fmt.Sprintf("Imported %d monitors, %d alerts from Kuma v%s", len(backup.Sites), len(backup.Alerts), kb.Version)))
})
fmt.Fprintf(w, "Imported %d monitors, %d alerts from Kuma v%s", len(backup.Sites), len(backup.Alerts), kb.Version)
}))
// 6. Probe Registration
mux.HandleFunc("/api/probe/register", func(w http.ResponseWriter, r *http.Request) {
mux.HandleFunc("/api/probe/register", RateLimit(probeRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" {
http.Error(w, "POST required", 405)
http.Error(w, "POST required", http.StatusMethodNotAllowed)
return
}
if cfg.ClusterKey == "" || r.Header.Get("X-Upkeep-Secret") != cfg.ClusterKey {
http.Error(w, "Unauthorized", 401)
if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var req struct {
ID string `json:"id"`
Name string `json:"name"`
@@ -261,27 +332,31 @@ func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) {
Version string `json:"version"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
http.Error(w, "Invalid JSON", 400)
http.Error(w, "Invalid JSON", http.StatusBadRequest)
return
}
if req.ID == "" {
http.Error(w, "id is required", 400)
http.Error(w, "id is required", http.StatusBadRequest)
return
}
if err := s.RegisterNode(models.ProbeNode{
ID: req.ID, Name: req.Name, Region: req.Region, Version: req.Version,
}); err != nil {
log.Printf("Probe register failed: %v", err)
http.Error(w, "Registration failed", 500)
http.Error(w, "Registration failed", http.StatusInternalServerError)
return
}
json.NewEncoder(w).Encode(map[string]bool{"ok": true})
})
_ = json.NewEncoder(w).Encode(map[string]bool{"ok": true}) //nolint:errcheck
}))
// 7. Probe Assignment Fetch
mux.HandleFunc("/api/probe/assignments", func(w http.ResponseWriter, r *http.Request) {
if cfg.ClusterKey == "" || r.Header.Get("X-Upkeep-Secret") != cfg.ClusterKey {
http.Error(w, "Unauthorized", 401)
mux.HandleFunc("/api/probe/assignments", RateLimit(probeRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
nodeID := r.URL.Query().Get("node_id")
@@ -312,69 +387,166 @@ func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) {
assigned = append(assigned, site)
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string][]models.Site{"sites": assigned})
})
_ = json.NewEncoder(w).Encode(map[string][]models.Site{"sites": assigned}) //nolint:errcheck
}))
// 8. Probe Result Submission
mux.HandleFunc("/api/probe/results", func(w http.ResponseWriter, r *http.Request) {
mux.HandleFunc("/api/probe/results", RateLimit(probeRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" {
http.Error(w, "POST required", 405)
http.Error(w, "POST required", http.StatusMethodNotAllowed)
return
}
if cfg.ClusterKey == "" || r.Header.Get("X-Upkeep-Secret") != cfg.ClusterKey {
http.Error(w, "Unauthorized", 401)
if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var req struct {
NodeID string `json:"node_id"`
Results []struct {
SiteID int `json:"site_id"`
LatencyNs int64 `json:"latency_ns"`
IsUp bool `json:"is_up"`
ErrorReason string `json:"error_reason"`
} `json:"results"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
http.Error(w, "Invalid JSON", 400)
http.Error(w, "Invalid JSON", http.StatusBadRequest)
return
}
if req.NodeID == "" {
http.Error(w, "node_id is required", 400)
http.Error(w, "node_id is required", http.StatusBadRequest)
return
}
for _, result := range req.Results {
if err := s.SaveCheckFromNode(result.SiteID, req.NodeID, result.LatencyNs, result.IsUp); err != nil {
log.Printf("Failed to save probe result: %v", err)
}
eng.IngestProbeResult(req.NodeID, result.SiteID, result.LatencyNs, result.IsUp)
eng.IngestProbeResult(req.NodeID, result.SiteID, result.LatencyNs, result.IsUp, result.ErrorReason)
}
s.UpdateNodeLastSeen(req.NodeID)
json.NewEncoder(w).Encode(map[string]bool{"ok": true})
})
if err := s.UpdateNodeLastSeen(req.NodeID); err != nil {
log.Printf("Failed to update node last seen: %v", err)
}
_ = json.NewEncoder(w).Encode(map[string]bool{"ok": true}) //nolint:errcheck
}))
// 9. Prometheus Metrics
mux.HandleFunc("/metrics", metrics.Handler(eng))
mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
if !cfg.MetricsPublic && cfg.ClusterKey != "" {
if !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized)
return
}
}
metrics.Handler(eng)(w, r)
})
// 10. Status Page
if cfg.EnableStatus {
mux.HandleFunc("/status", func(w http.ResponseWriter, r *http.Request) { renderStatusPage(w, cfg.Title, eng) })
mux.HandleFunc("/status/json", func(w http.ResponseWriter, r *http.Request) {
mux.HandleFunc("/status", RateLimit(statusRL, func(w http.ResponseWriter, r *http.Request) { renderStatusPage(w, cfg.Title, eng) }))
mux.HandleFunc("/status/json", RateLimit(statusRL, func(w http.ResponseWriter, r *http.Request) {
state := eng.GetLiveState()
activeWindows, _ := s.GetActiveMaintenanceWindows()
maintSet := make(map[int]bool)
allInMaint := false
for _, mw := range activeWindows {
if mw.Type != "maintenance" {
continue
}
if mw.MonitorID == 0 {
allInMaint = true
} else {
maintSet[mw.MonitorID] = true
}
}
for id, site := range state {
site.Token = ""
if allInMaint || maintSet[site.ID] || (site.ParentID > 0 && maintSet[site.ParentID]) {
site.Status = "MAINT"
}
state[id] = site
}
if cfg.CORSOrigin != "" {
w.Header().Set("Access-Control-Allow-Origin", cfg.CORSOrigin)
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(state)
})
_ = json.NewEncoder(w).Encode(state) //nolint:errcheck
}))
}
if cfg.ClusterMode != "" && cfg.ClusterMode != "leader" && cfg.TLSCert == "" {
fmt.Println("WARNING: Cluster mode active without TLS. Secrets transmitted in cleartext.")
}
handler := loggingMiddleware(securityHeadersMiddleware(mux))
if cfg.TLSCert != "" {
handler = hstsMiddleware(handler)
}
go func() {
addr := fmt.Sprintf(":%d", cfg.Port)
srv := &http.Server{
Addr: addr,
Handler: handler,
ReadHeaderTimeout: 10 * time.Second,
ReadTimeout: 30 * time.Second,
WriteTimeout: 60 * time.Second,
IdleTimeout: 120 * time.Second,
}
go func() {
if cfg.TLSCert != "" && cfg.TLSKey != "" {
fmt.Printf("HTTPS Server listening on %s\n", addr)
if err := srv.ListenAndServeTLS(cfg.TLSCert, cfg.TLSKey); err != nil && err != http.ErrServerClosed {
log.Printf("HTTPS server error: %v", err)
}
} else {
fmt.Printf("HTTP Server listening on %s\n", addr)
if err := http.ListenAndServe(addr, mux); err != nil {
log.Fatalf("HTTP server failed: %v", err)
if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
log.Printf("HTTP server error: %v", err)
}
}
}()
return srv
}
type statusWriter struct {
http.ResponseWriter
code int
}
func (w *statusWriter) WriteHeader(code int) {
w.code = code
w.ResponseWriter.WriteHeader(code)
}
func loggingMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
start := time.Now()
sw := &statusWriter{ResponseWriter: w, code: 200}
next.ServeHTTP(sw, r)
path := strings.ReplaceAll(strings.ReplaceAll(r.URL.Path, "\n", ""), "\r", "")
log.Printf("%s %s %d %s %s", r.Method, path, sw.code, time.Since(start).Round(time.Millisecond), clientIP(r)) //nolint:gosec // path sanitized above
})
}
func securityHeadersMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("X-Content-Type-Options", "nosniff")
w.Header().Set("X-Frame-Options", "DENY")
w.Header().Set("Referrer-Policy", "no-referrer")
w.Header().Set("Content-Security-Policy", "default-src 'self'; script-src 'unsafe-inline'; style-src 'unsafe-inline'")
next.ServeHTTP(w, r)
})
}
func hstsMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Strict-Transport-Security", "max-age=63072000; includeSubDomains")
next.ServeHTTP(w, r)
})
}
func renderStatusPage(w http.ResponseWriter, title string, eng *monitor.Engine) {
@@ -396,5 +568,7 @@ func renderStatusPage(w http.ResponseWriter, title string, eng *monitor.Engine)
Title string
Sites []models.Site
}{Title: title, Sites: sites}
statusTpl.Execute(w, data)
if err := statusTpl.Execute(w, data); err != nil {
log.Printf("Failed to render status page: %v", err)
}
}
+556
View File
@@ -0,0 +1,556 @@
package server
import (
"bytes"
"encoding/json"
"fmt"
"net"
"net/http"
"sync"
"testing"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
)
// --- Mock Store ---
type mockStore struct {
mu sync.Mutex
sites []models.Site
alerts []models.AlertConfig
nodes map[string]models.ProbeNode
importedData *models.Backup
registeredNodes []models.ProbeNode
maintWindows []models.MaintenanceWindow
}
func newMockStore() *mockStore {
return &mockStore{
nodes: make(map[string]models.ProbeNode),
}
}
func (m *mockStore) Init() error { return nil }
func (m *mockStore) GetSites() ([]models.Site, error) { return m.sites, nil }
func (m *mockStore) AddSite(models.Site) error { return nil }
func (m *mockStore) UpdateSite(models.Site) error { return nil }
func (m *mockStore) UpdateSitePaused(int, bool) error { return nil }
func (m *mockStore) DeleteSite(int) error { return nil }
func (m *mockStore) GetAllAlerts() ([]models.AlertConfig, error) { return m.alerts, nil }
func (m *mockStore) GetAlert(int) (models.AlertConfig, error) { return models.AlertConfig{}, nil }
func (m *mockStore) AddAlert(string, string, map[string]string) error { return nil }
func (m *mockStore) UpdateAlert(int, string, string, map[string]string) error { return nil }
func (m *mockStore) DeleteAlert(int) error { return nil }
func (m *mockStore) GetAllUsers() ([]models.User, error) { return nil, nil }
func (m *mockStore) AddUser(string, string, string) error { return nil }
func (m *mockStore) UpdateUser(int, string, string, string) error { return nil }
func (m *mockStore) DeleteUser(int) error { return nil }
func (m *mockStore) SaveCheck(int, int64, bool) error { return nil }
func (m *mockStore) SaveCheckFromNode(siteID int, nodeID string, latencyNs int64, isUp bool) error {
return nil
}
func (m *mockStore) LoadAllHistory(int) (map[int][]models.CheckRecord, error) {
return nil, nil
}
func (m *mockStore) GetSiteByName(string) (models.Site, error) { return models.Site{}, nil }
func (m *mockStore) GetAlertByName(string) (models.AlertConfig, error) {
return models.AlertConfig{}, nil
}
func (m *mockStore) AddSiteReturningID(models.Site) (int, error) { return 0, nil }
func (m *mockStore) AddAlertReturningID(string, string, map[string]string) (int, error) {
return 0, nil
}
func (m *mockStore) GetAllNodes() ([]models.ProbeNode, error) { return nil, nil }
func (m *mockStore) UpdateNodeLastSeen(string) error { return nil }
func (m *mockStore) DeleteNode(string) error { return nil }
func (m *mockStore) SaveLog(string) error { return nil }
func (m *mockStore) LoadLogs(int) ([]string, error) { return nil, nil }
func (m *mockStore) GetAllMaintenanceWindows(int) ([]models.MaintenanceWindow, error) {
return nil, nil
}
func (m *mockStore) AddMaintenanceWindow(models.MaintenanceWindow) error { return nil }
func (m *mockStore) EndMaintenanceWindow(int) error { return nil }
func (m *mockStore) DeleteMaintenanceWindow(int) error { return nil }
func (m *mockStore) IsMonitorInMaintenance(int) (bool, error) { return false, nil }
func (m *mockStore) GetPreference(string) (string, error) { return "", nil }
func (m *mockStore) SetPreference(string, string) error { return nil }
func (m *mockStore) SaveStateChange(int, string, string, string) error { return nil }
func (m *mockStore) GetStateChanges(int, int) ([]models.StateChange, error) { return nil, nil }
func (m *mockStore) Close() error { return nil }
func (m *mockStore) ExportData() (models.Backup, error) {
return models.Backup{
Sites: m.sites,
Alerts: m.alerts,
}, nil
}
func (m *mockStore) ImportData(data models.Backup) error {
m.mu.Lock()
defer m.mu.Unlock()
m.importedData = &data
return nil
}
func (m *mockStore) RegisterNode(node models.ProbeNode) error {
m.mu.Lock()
defer m.mu.Unlock()
m.registeredNodes = append(m.registeredNodes, node)
m.nodes[node.ID] = node
return nil
}
func (m *mockStore) GetNode(id string) (models.ProbeNode, error) {
m.mu.Lock()
defer m.mu.Unlock()
if n, ok := m.nodes[id]; ok {
return n, nil
}
return models.ProbeNode{}, fmt.Errorf("not found")
}
func (m *mockStore) GetActiveMaintenanceWindows() ([]models.MaintenanceWindow, error) {
return m.maintWindows, nil
}
// --- Helpers ---
func freePort() int {
ln, _ := net.Listen("tcp", "127.0.0.1:0")
port := ln.Addr().(*net.TCPAddr).Port
ln.Close()
return port
}
type testServer struct {
baseURL string
srv *http.Server
store *mockStore
engine *monitor.Engine
}
func newTestServer(t *testing.T, clusterKey string, enableStatus bool) *testServer {
t.Helper()
ms := newMockStore()
eng := monitor.NewEngine(ms)
port := freePort()
srv := Start(ServerConfig{
Port: port,
EnableStatus: enableStatus,
Title: "Test Status",
ClusterKey: clusterKey,
}, ms, eng)
ts := &testServer{
baseURL: fmt.Sprintf("http://127.0.0.1:%d", port),
srv: srv,
store: ms,
engine: eng,
}
// Wait for server to be ready
deadline := time.Now().Add(2 * time.Second)
for time.Now().Before(deadline) {
resp, err := http.Get(ts.baseURL + "/api/health")
if err == nil {
resp.Body.Close()
break
}
time.Sleep(10 * time.Millisecond)
}
t.Cleanup(func() {
srv.Close()
})
return ts
}
func authReq(method, url, secret string, body []byte) (*http.Response, error) {
var req *http.Request
var err error
if body != nil {
req, err = http.NewRequest(method, url, bytes.NewReader(body))
} else {
req, err = http.NewRequest(method, url, nil)
}
if err != nil {
return nil, err
}
if secret != "" {
req.Header.Set("X-Upkeep-Secret", secret)
}
return http.DefaultClient.Do(req)
}
// --- Tests ---
func TestCheckSecret(t *testing.T) {
if !checkSecret("mykey", "mykey") {
t.Error("expected match")
}
if checkSecret("mykey", "wrong") {
t.Error("expected no match")
}
if checkSecret("", "key") {
t.Error("expected no match for empty got")
}
}
// --- Push Heartbeat ---
func TestPush_MissingToken(t *testing.T) {
ts := newTestServer(t, "secret", false)
resp, err := http.Get(ts.baseURL + "/api/push")
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 400 {
t.Errorf("expected 400, got %d", resp.StatusCode)
}
}
func TestPush_InvalidToken(t *testing.T) {
ts := newTestServer(t, "secret", false)
resp, err := http.Get(ts.baseURL + "/api/push?token=bad")
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 404 {
t.Errorf("expected 404, got %d", resp.StatusCode)
}
}
// --- Health ---
func TestHealth_NoSecret(t *testing.T) {
ts := newTestServer(t, "", false)
resp, err := http.Get(ts.baseURL + "/api/health")
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 200 {
t.Errorf("expected 200 with no cluster key, got %d", resp.StatusCode)
}
}
func TestHealth_ValidSecret(t *testing.T) {
ts := newTestServer(t, "secret", false)
resp, err := authReq("GET", ts.baseURL+"/api/health", "secret", nil)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 200 {
t.Errorf("expected 200, got %d", resp.StatusCode)
}
}
func TestHealth_WrongSecret(t *testing.T) {
ts := newTestServer(t, "secret", false)
resp, err := authReq("GET", ts.baseURL+"/api/health", "wrong", nil)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 401 {
t.Errorf("expected 401, got %d", resp.StatusCode)
}
}
// --- Backup Export ---
func TestExport_Unauthorized_NoKey(t *testing.T) {
ts := newTestServer(t, "", false)
resp, err := http.Get(ts.baseURL + "/api/backup/export")
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 401 {
t.Errorf("expected 401 when no cluster key configured, got %d", resp.StatusCode)
}
}
func TestExport_Unauthorized_WrongKey(t *testing.T) {
ts := newTestServer(t, "secret", false)
resp, err := authReq("GET", ts.baseURL+"/api/backup/export", "wrong", nil)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 401 {
t.Errorf("expected 401, got %d", resp.StatusCode)
}
}
func TestExport_Success(t *testing.T) {
ts := newTestServer(t, "secret", false)
ts.store.sites = []models.Site{{ID: 1, Name: "example", URL: "http://example.com"}}
resp, err := authReq("GET", ts.baseURL+"/api/backup/export", "secret", nil)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 200 {
t.Errorf("expected 200, got %d", resp.StatusCode)
}
var backup models.Backup
json.NewDecoder(resp.Body).Decode(&backup)
if len(backup.Sites) != 1 {
t.Errorf("expected 1 site, got %d", len(backup.Sites))
}
}
// --- Backup Import ---
func TestImport_MethodNotAllowed(t *testing.T) {
ts := newTestServer(t, "secret", false)
resp, err := authReq("GET", ts.baseURL+"/api/backup/import", "secret", nil)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 405 {
t.Errorf("expected 405, got %d", resp.StatusCode)
}
}
func TestImport_Unauthorized(t *testing.T) {
ts := newTestServer(t, "secret", false)
body, _ := json.Marshal(models.Backup{})
resp, err := authReq("POST", ts.baseURL+"/api/backup/import", "wrong", body)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 401 {
t.Errorf("expected 401, got %d", resp.StatusCode)
}
}
func TestImport_Success(t *testing.T) {
ts := newTestServer(t, "secret", false)
backup := models.Backup{
Sites: []models.Site{{Name: "imported", URL: "http://example.com"}},
}
body, _ := json.Marshal(backup)
resp, err := authReq("POST", ts.baseURL+"/api/backup/import", "secret", body)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 200 {
t.Errorf("expected 200, got %d", resp.StatusCode)
}
ts.store.mu.Lock()
defer ts.store.mu.Unlock()
if ts.store.importedData == nil {
t.Error("expected import data to be stored")
}
}
func TestImport_InvalidJSON(t *testing.T) {
ts := newTestServer(t, "secret", false)
resp, err := authReq("POST", ts.baseURL+"/api/backup/import", "secret", []byte("not json"))
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 400 {
t.Errorf("expected 400, got %d", resp.StatusCode)
}
}
// --- Probe Registration ---
func TestProbeRegister_Success(t *testing.T) {
ts := newTestServer(t, "secret", false)
body, _ := json.Marshal(map[string]string{
"id": "node-1", "name": "US East", "region": "us-east",
})
resp, err := authReq("POST", ts.baseURL+"/api/probe/register", "secret", body)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 200 {
t.Errorf("expected 200, got %d", resp.StatusCode)
}
ts.store.mu.Lock()
defer ts.store.mu.Unlock()
if len(ts.store.registeredNodes) != 1 {
t.Errorf("expected 1 registered node, got %d", len(ts.store.registeredNodes))
}
if ts.store.registeredNodes[0].ID != "node-1" {
t.Errorf("expected node-1, got %s", ts.store.registeredNodes[0].ID)
}
}
func TestProbeRegister_MissingID(t *testing.T) {
ts := newTestServer(t, "secret", false)
body, _ := json.Marshal(map[string]string{"name": "test"})
resp, err := authReq("POST", ts.baseURL+"/api/probe/register", "secret", body)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 400 {
t.Errorf("expected 400, got %d", resp.StatusCode)
}
}
func TestProbeRegister_Unauthorized(t *testing.T) {
ts := newTestServer(t, "secret", false)
body, _ := json.Marshal(map[string]string{"id": "node-1"})
resp, err := authReq("POST", ts.baseURL+"/api/probe/register", "wrong", body)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 401 {
t.Errorf("expected 401, got %d", resp.StatusCode)
}
}
// --- Probe Results ---
func TestProbeResults_Success(t *testing.T) {
ts := newTestServer(t, "secret", false)
body, _ := json.Marshal(map[string]any{
"node_id": "node-1",
"results": []map[string]any{
{"site_id": 1, "latency_ns": 5000000, "is_up": true},
},
})
resp, err := authReq("POST", ts.baseURL+"/api/probe/results", "secret", body)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 200 {
t.Errorf("expected 200, got %d", resp.StatusCode)
}
}
func TestProbeResults_MissingNodeID(t *testing.T) {
ts := newTestServer(t, "secret", false)
body, _ := json.Marshal(map[string]any{
"results": []map[string]any{},
})
resp, err := authReq("POST", ts.baseURL+"/api/probe/results", "secret", body)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 400 {
t.Errorf("expected 400, got %d", resp.StatusCode)
}
}
// --- Status Page ---
func TestStatusPage_Enabled(t *testing.T) {
ts := newTestServer(t, "secret", true)
resp, err := http.Get(ts.baseURL + "/status")
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 200 {
t.Errorf("expected 200, got %d", resp.StatusCode)
}
}
func TestStatusJSON_TokensStripped(t *testing.T) {
ts := newTestServer(t, "secret", true)
// Inject a site with a token into engine state
ts.engine.UpdateSiteConfig(models.Site{ID: 1, Name: "test", Type: "push", Token: "secret-token", Status: "UP"})
// Need to inject directly since UpdateSiteConfig only updates existing
func() {
ts.engine.RecordHeartbeat("unused") // just to exercise, won't match
}()
resp, err := http.Get(ts.baseURL + "/status/json")
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 200 {
t.Errorf("expected 200, got %d", resp.StatusCode)
}
var state map[string]models.Site
json.NewDecoder(resp.Body).Decode(&state)
for _, site := range state {
if site.Token != "" {
t.Error("expected token stripped from status JSON response")
}
}
}
func TestStatusJSON_MaintenanceOverride(t *testing.T) {
ts := newTestServer(t, "secret", true)
ts.store.maintWindows = []models.MaintenanceWindow{
{ID: 1, MonitorID: 0, Type: "maintenance", StartTime: time.Now().Add(-1 * time.Hour)},
}
resp, err := http.Get(ts.baseURL + "/status/json")
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 200 {
t.Errorf("expected 200, got %d", resp.StatusCode)
}
}
func TestStatusPage_Disabled(t *testing.T) {
ts := newTestServer(t, "secret", false)
resp, err := http.Get(ts.baseURL + "/status")
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 404 {
t.Errorf("expected 404 when status disabled, got %d", resp.StatusCode)
}
}
// --- Probe Assignments ---
func TestProbeAssignments_Success(t *testing.T) {
ts := newTestServer(t, "secret", false)
resp, err := authReq("GET", ts.baseURL+"/api/probe/assignments", "secret", nil)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 200 {
t.Errorf("expected 200, got %d", resp.StatusCode)
}
var result map[string][]models.Site
json.NewDecoder(resp.Body).Decode(&result)
if _, ok := result["sites"]; !ok {
t.Error("expected 'sites' key in response")
}
}
func TestProbeAssignments_Unauthorized(t *testing.T) {
ts := newTestServer(t, "secret", false)
resp, err := authReq("GET", ts.baseURL+"/api/probe/assignments", "wrong", nil)
if err != nil {
t.Fatal(err)
}
defer resp.Body.Close()
if resp.StatusCode != 401 {
t.Errorf("expected 401, got %d", resp.StatusCode)
}
}
+70
View File
@@ -0,0 +1,70 @@
package store
import (
"crypto/aes"
"crypto/cipher"
"crypto/rand"
"encoding/base64"
"encoding/hex"
"fmt"
"io"
"strings"
)
const encryptedPrefix = "enc:"
type Encryptor struct {
gcm cipher.AEAD
}
func NewEncryptor(hexKey string) (*Encryptor, error) {
key, err := hex.DecodeString(hexKey)
if err != nil {
return nil, fmt.Errorf("invalid encryption key: must be hex-encoded: %w", err)
}
if len(key) != 32 {
return nil, fmt.Errorf("invalid encryption key: must be 32 bytes (64 hex chars), got %d bytes", len(key))
}
block, err := aes.NewCipher(key)
if err != nil {
return nil, fmt.Errorf("create cipher: %w", err)
}
gcm, err := cipher.NewGCM(block)
if err != nil {
return nil, fmt.Errorf("create GCM: %w", err)
}
return &Encryptor{gcm: gcm}, nil
}
func (e *Encryptor) Encrypt(plaintext string) (string, error) {
nonce := make([]byte, e.gcm.NonceSize())
if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
return "", fmt.Errorf("generate nonce: %w", err)
}
ciphertext := e.gcm.Seal(nonce, nonce, []byte(plaintext), nil)
return encryptedPrefix + base64.StdEncoding.EncodeToString(ciphertext), nil
}
func (e *Encryptor) Decrypt(data string) (string, error) {
if !strings.HasPrefix(data, encryptedPrefix) {
return data, nil
}
raw, err := base64.StdEncoding.DecodeString(strings.TrimPrefix(data, encryptedPrefix))
if err != nil {
return "", fmt.Errorf("decode base64: %w", err)
}
nonceSize := e.gcm.NonceSize()
if len(raw) < nonceSize {
return "", fmt.Errorf("ciphertext too short")
}
nonce, ciphertext := raw[:nonceSize], raw[nonceSize:]
plaintext, err := e.gcm.Open(nil, nonce, ciphertext, nil)
if err != nil {
return "", fmt.Errorf("decrypt: %w", err)
}
return string(plaintext), nil
}
func IsEncrypted(data string) bool {
return strings.HasPrefix(data, encryptedPrefix)
}
+83
View File
@@ -0,0 +1,83 @@
package store
import (
"encoding/hex"
"testing"
)
func testKey() string {
key := make([]byte, 32)
for i := range key {
key[i] = byte(i)
}
return hex.EncodeToString(key)
}
func TestEncryptorRoundTrip(t *testing.T) {
enc, err := NewEncryptor(testKey())
if err != nil {
t.Fatal(err)
}
original := `{"host":"smtp.example.com","pass":"s3cret"}`
encrypted, err := enc.Encrypt(original)
if err != nil {
t.Fatal(err)
}
if !IsEncrypted(encrypted) {
t.Error("expected encrypted prefix")
}
if encrypted == original {
t.Error("encrypted should differ from original")
}
decrypted, err := enc.Decrypt(encrypted)
if err != nil {
t.Fatal(err)
}
if decrypted != original {
t.Errorf("got %q, want %q", decrypted, original)
}
}
func TestEncryptorDecryptPlaintext(t *testing.T) {
enc, err := NewEncryptor(testKey())
if err != nil {
t.Fatal(err)
}
plain := `{"url":"https://hooks.slack.com/test"}`
result, err := enc.Decrypt(plain)
if err != nil {
t.Fatal(err)
}
if result != plain {
t.Errorf("plaintext passthrough failed: got %q", result)
}
}
func TestEncryptorBadKey(t *testing.T) {
_, err := NewEncryptor("tooshort")
if err == nil {
t.Error("expected error for short key")
}
_, err = NewEncryptor("not-hex-at-all-but-long-enough-to-be-64-chars-if-we-keep-going!!")
if err == nil {
t.Error("expected error for non-hex key")
}
}
func TestEncryptorUniqueCiphertexts(t *testing.T) {
enc, err := NewEncryptor(testKey())
if err != nil {
t.Fatal(err)
}
a, _ := enc.Encrypt("same")
b, _ := enc.Encrypt("same")
if a == b {
t.Error("two encryptions of same plaintext should produce different ciphertexts")
}
}
+5 -7
View File
@@ -1,6 +1,9 @@
package store
import "database/sql"
import (
"database/sql"
"strconv"
)
type Dialect interface {
DriverName() string
@@ -13,8 +16,6 @@ type Dialect interface {
UpsertNodeSQL() string
}
// rewritePlaceholders converts ? markers to $1, $2, etc. for Postgres.
// For SQLite (or any dialect not needing rewrite), returns the input unchanged.
func rewritePlaceholders(query string, dollarStyle bool) string {
if !dollarStyle {
return query
@@ -25,10 +26,7 @@ func rewritePlaceholders(query string, dollarStyle bool) string {
if query[i] == '?' {
n++
buf = append(buf, '$')
if n >= 10 {
buf = append(buf, byte('0'+n/10))
}
buf = append(buf, byte('0'+n%10))
buf = append(buf, []byte(strconv.Itoa(n))...)
} else {
buf = append(buf, query[i])
}
+49 -6
View File
@@ -2,6 +2,7 @@ package store
import (
"database/sql"
"log"
_ "github.com/lib/pq"
)
@@ -56,6 +57,30 @@ func (d *PostgresDialect) CreateTablesSQL() []string {
message TEXT NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
)`,
`CREATE TABLE IF NOT EXISTS maintenance_windows (
id SERIAL PRIMARY KEY,
monitor_id INTEGER DEFAULT 0,
title TEXT NOT NULL,
description TEXT DEFAULT '',
type TEXT DEFAULT 'maintenance',
start_time TIMESTAMP NOT NULL,
end_time TIMESTAMP,
created_by TEXT DEFAULT '',
created_at TIMESTAMP DEFAULT NOW()
)`,
`CREATE TABLE IF NOT EXISTS preferences (
key TEXT PRIMARY KEY,
value TEXT NOT NULL
)`,
`CREATE TABLE IF NOT EXISTS state_changes (
id SERIAL PRIMARY KEY,
site_id INTEGER NOT NULL,
from_status TEXT NOT NULL,
to_status TEXT NOT NULL,
error_reason TEXT DEFAULT '',
changed_at TIMESTAMP DEFAULT NOW()
)`,
`CREATE INDEX IF NOT EXISTS idx_state_changes_site ON state_changes(site_id, changed_at DESC)`,
}
}
@@ -84,13 +109,31 @@ func (d *PostgresDialect) UpsertNodeSQL() string {
func (d *PostgresDialect) ResetSequenceOnEmpty(db *sql.DB, table string) {}
func (d *PostgresDialect) ImportWipe(tx *sql.Tx) {
tx.Exec("TRUNCATE TABLE sites RESTART IDENTITY CASCADE")
tx.Exec("TRUNCATE TABLE alerts RESTART IDENTITY CASCADE")
tx.Exec("TRUNCATE TABLE users RESTART IDENTITY CASCADE")
if _, err := tx.Exec("TRUNCATE TABLE sites RESTART IDENTITY CASCADE"); err != nil {
log.Printf("import wipe error: %v", err)
}
if _, err := tx.Exec("TRUNCATE TABLE alerts RESTART IDENTITY CASCADE"); err != nil {
log.Printf("import wipe error: %v", err)
}
if _, err := tx.Exec("TRUNCATE TABLE users RESTART IDENTITY CASCADE"); err != nil {
log.Printf("import wipe error: %v", err)
}
if _, err := tx.Exec("TRUNCATE TABLE maintenance_windows RESTART IDENTITY CASCADE"); err != nil {
log.Printf("import wipe error: %v", err)
}
}
func (d *PostgresDialect) ImportResetSequences(tx *sql.Tx) {
tx.Exec("SELECT setval('sites_id_seq', (SELECT COALESCE(MAX(id), 1) FROM sites))")
tx.Exec("SELECT setval('alerts_id_seq', (SELECT COALESCE(MAX(id), 1) FROM alerts))")
tx.Exec("SELECT setval('users_id_seq', (SELECT COALESCE(MAX(id), 1) FROM users))")
if _, err := tx.Exec("SELECT setval('sites_id_seq', (SELECT COALESCE(MAX(id), 1) FROM sites))"); err != nil {
log.Printf("sequence reset error: %v", err)
}
if _, err := tx.Exec("SELECT setval('alerts_id_seq', (SELECT COALESCE(MAX(id), 1) FROM alerts))"); err != nil {
log.Printf("sequence reset error: %v", err)
}
if _, err := tx.Exec("SELECT setval('users_id_seq', (SELECT COALESCE(MAX(id), 1) FROM users))"); err != nil {
log.Printf("sequence reset error: %v", err)
}
if _, err := tx.Exec("SELECT setval('maintenance_windows_id_seq', (SELECT COALESCE(MAX(id), 1) FROM maintenance_windows))"); err != nil {
log.Printf("sequence reset error: %v", err)
}
}
+61 -9
View File
@@ -2,6 +2,7 @@ package store
import (
"database/sql"
"log"
_ "github.com/mattn/go-sqlite3"
)
@@ -9,7 +10,14 @@ import (
type SQLiteDialect struct{}
func NewSQLiteStore(path string) (*SQLStore, error) {
return NewSQLStore("sqlite3", path, &SQLiteDialect{})
s, err := NewSQLStore("sqlite3", path, &SQLiteDialect{})
if err != nil {
return nil, err
}
if _, err := s.db.Exec("PRAGMA journal_mode=WAL"); err != nil {
log.Printf("WAL mode failed: %v", err)
}
return s, nil
}
func (d *SQLiteDialect) DriverName() string { return "sqlite3" }
@@ -56,6 +64,30 @@ func (d *SQLiteDialect) CreateTablesSQL() []string {
message TEXT NOT NULL,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
)`,
`CREATE TABLE IF NOT EXISTS maintenance_windows (
id INTEGER PRIMARY KEY AUTOINCREMENT,
monitor_id INTEGER DEFAULT 0,
title TEXT NOT NULL,
description TEXT DEFAULT '',
type TEXT DEFAULT 'maintenance',
start_time DATETIME NOT NULL,
end_time DATETIME,
created_by TEXT DEFAULT '',
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
)`,
`CREATE TABLE IF NOT EXISTS preferences (
key TEXT PRIMARY KEY,
value TEXT NOT NULL
)`,
`CREATE TABLE IF NOT EXISTS state_changes (
id INTEGER PRIMARY KEY AUTOINCREMENT,
site_id INTEGER NOT NULL,
from_status TEXT NOT NULL,
to_status TEXT NOT NULL,
error_reason TEXT DEFAULT '',
changed_at DATETIME DEFAULT CURRENT_TIMESTAMP
)`,
`CREATE INDEX IF NOT EXISTS idx_state_changes_site ON state_changes(site_id, changed_at DESC)`,
}
}
@@ -83,19 +115,39 @@ func (d *SQLiteDialect) UpsertNodeSQL() string {
func (d *SQLiteDialect) ResetSequenceOnEmpty(db *sql.DB, table string) {
var count int
db.QueryRow("SELECT COUNT(*) FROM " + table).Scan(&count)
_ = db.QueryRow("SELECT COUNT(*) FROM " + table).Scan(&count) //nolint:errcheck
if count == 0 {
db.Exec("DELETE FROM sqlite_sequence WHERE name=?", table)
if _, err := db.Exec("DELETE FROM sqlite_sequence WHERE name=?", table); err != nil {
log.Printf("sequence cleanup error: %v", err)
}
}
}
func (d *SQLiteDialect) ImportWipe(tx *sql.Tx) {
tx.Exec("DELETE FROM sites")
tx.Exec("DELETE FROM sqlite_sequence WHERE name='sites'")
tx.Exec("DELETE FROM alerts")
tx.Exec("DELETE FROM sqlite_sequence WHERE name='alerts'")
tx.Exec("DELETE FROM users")
tx.Exec("DELETE FROM sqlite_sequence WHERE name='users'")
if _, err := tx.Exec("DELETE FROM sites"); err != nil {
log.Printf("import wipe error: %v", err)
}
if _, err := tx.Exec("DELETE FROM sqlite_sequence WHERE name='sites'"); err != nil {
log.Printf("import wipe error: %v", err)
}
if _, err := tx.Exec("DELETE FROM alerts"); err != nil {
log.Printf("import wipe error: %v", err)
}
if _, err := tx.Exec("DELETE FROM sqlite_sequence WHERE name='alerts'"); err != nil {
log.Printf("import wipe error: %v", err)
}
if _, err := tx.Exec("DELETE FROM users"); err != nil {
log.Printf("import wipe error: %v", err)
}
if _, err := tx.Exec("DELETE FROM sqlite_sequence WHERE name='users'"); err != nil {
log.Printf("import wipe error: %v", err)
}
if _, err := tx.Exec("DELETE FROM maintenance_windows"); err != nil {
log.Printf("import wipe error: %v", err)
}
if _, err := tx.Exec("DELETE FROM sqlite_sequence WHERE name='maintenance_windows'"); err != nil {
log.Printf("import wipe error: %v", err)
}
}
func (d *SQLiteDialect) ImportResetSequences(tx *sql.Tx) {}
+280 -37
View File
@@ -6,13 +6,24 @@ import (
"encoding/hex"
"encoding/json"
"fmt"
"go-upkeep/internal/models"
"strings"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
)
const (
maxCheckHistory = 1000
checkHistoryPruneAt = 1100
maxMaintenanceExport = 1000
maxRequestBody = 1 << 20
)
type SQLStore struct {
db *sql.DB
dialect Dialect
dollar bool
encryptor *Encryptor
}
func NewSQLStore(driverName, dsn string, dialect Dialect) (*SQLStore, error) {
@@ -20,20 +31,45 @@ func NewSQLStore(driverName, dsn string, dialect Dialect) (*SQLStore, error) {
if err != nil {
return nil, err
}
db.SetMaxOpenConns(25)
db.SetMaxIdleConns(5)
db.SetConnMaxLifetime(5 * time.Minute)
_, isDollar := dialect.(*PostgresDialect)
return &SQLStore{db: db, dialect: dialect, dollar: isDollar}, nil
}
func (s *SQLStore) SetEncryptor(enc *Encryptor) {
s.encryptor = enc
}
func (s *SQLStore) encryptSettings(jsonStr string) (string, error) {
if s.encryptor == nil {
return jsonStr, nil
}
return s.encryptor.Encrypt(jsonStr)
}
func (s *SQLStore) decryptSettings(data string) (string, error) {
if s.encryptor == nil {
return data, nil
}
return s.encryptor.Decrypt(data)
}
func (s *SQLStore) q(query string) string {
return rewritePlaceholders(query, s.dollar)
}
func generateToken() string {
func generateToken() (string, error) {
b := make([]byte, 16)
if _, err := rand.Read(b); err != nil {
panic("crypto/rand failed: " + err.Error())
return "", fmt.Errorf("crypto/rand failed: %w", err)
}
return hex.EncodeToString(b)
return hex.EncodeToString(b), nil
}
func (s *SQLStore) Close() error {
return s.db.Close()
}
func (s *SQLStore) Init() error {
@@ -43,14 +79,20 @@ func (s *SQLStore) Init() error {
}
}
for _, m := range s.dialect.MigrationsSQL() {
s.db.Exec(m)
if _, err := s.db.Exec(m); err != nil {
errMsg := err.Error()
if strings.Contains(errMsg, "already exists") || strings.Contains(errMsg, "duplicate column") {
continue
}
return fmt.Errorf("migration failed: %w", err)
}
}
return nil
}
func (s *SQLStore) GetSites() ([]models.Site, error) {
bf := s.dialect.BoolFalse()
query := fmt.Sprintf(
query := fmt.Sprintf( //nolint:gosec // bf is a dialect boolean literal, not user input
"SELECT id, COALESCE(name, url), url, COALESCE(type, 'http'), COALESCE(token, ''), interval, alert_id, check_ssl, threshold, max_retries, COALESCE(hostname, ''), COALESCE(port, 0), COALESCE(timeout, 0), COALESCE(method, 'GET'), COALESCE(description, ''), COALESCE(parent_id, 0), COALESCE(accepted_codes, '200-299'), COALESCE(dns_resolve_type, ''), COALESCE(dns_server, ''), COALESCE(ignore_tls, %s), COALESCE(paused, %s), COALESCE(regions, '') FROM sites",
bf, bf,
)
@@ -76,7 +118,11 @@ func (s *SQLStore) GetSites() ([]models.Site, error) {
func (s *SQLStore) AddSite(site models.Site) error {
token := ""
if site.Type == "push" {
token = generateToken()
var err error
token, err = generateToken()
if err != nil {
return fmt.Errorf("generate push token: %w", err)
}
}
_, err := s.db.Exec(s.q("INSERT INTO sites (name, url, type, token, interval, alert_id, check_ssl, threshold, max_retries, hostname, port, timeout, method, description, parent_id, accepted_codes, dns_resolve_type, dns_server, ignore_tls, paused, regions) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"),
site.Name, site.URL, site.Type, token, site.Interval, site.AlertID, site.CheckSSL, site.ExpiryThreshold, site.MaxRetries,
@@ -86,9 +132,13 @@ func (s *SQLStore) AddSite(site models.Site) error {
func (s *SQLStore) UpdateSite(site models.Site) error {
var existingToken string
s.db.QueryRow(s.q("SELECT token FROM sites WHERE id=?"), site.ID).Scan(&existingToken)
_ = s.db.QueryRow(s.q("SELECT token FROM sites WHERE id=?"), site.ID).Scan(&existingToken) //nolint:errcheck
if site.Type == "push" && existingToken == "" {
existingToken = generateToken()
var err error
existingToken, err = generateToken()
if err != nil {
return fmt.Errorf("generate push token: %w", err)
}
}
_, err := s.db.Exec(s.q("UPDATE sites SET name=?, url=?, type=?, token=?, interval=?, alert_id=?, check_ssl=?, threshold=?, max_retries=?, hostname=?, port=?, timeout=?, method=?, description=?, parent_id=?, accepted_codes=?, dns_resolve_type=?, dns_server=?, ignore_tls=?, paused=?, regions=? WHERE id=?"),
site.Name, site.URL, site.Type, existingToken, site.Interval, site.AlertID, site.CheckSSL, site.ExpiryThreshold, site.MaxRetries,
@@ -112,7 +162,7 @@ func (s *SQLStore) DeleteSite(id int) error {
func (s *SQLStore) GetSiteByName(name string) (models.Site, error) {
bf := s.dialect.BoolFalse()
query := fmt.Sprintf(
query := fmt.Sprintf( //nolint:gosec // bf is a dialect boolean literal, not user input
"SELECT id, COALESCE(name, url), url, COALESCE(type, 'http'), COALESCE(token, ''), interval, alert_id, check_ssl, threshold, max_retries, COALESCE(hostname, ''), COALESCE(port, 0), COALESCE(timeout, 0), COALESCE(method, 'GET'), COALESCE(description, ''), COALESCE(parent_id, 0), COALESCE(accepted_codes, '200-299'), COALESCE(dns_resolve_type, ''), COALESCE(dns_server, ''), COALESCE(ignore_tls, %s), COALESCE(paused, %s), COALESCE(regions, '') FROM sites WHERE name = %s",
bf, bf, s.q("?"),
)
@@ -124,37 +174,82 @@ func (s *SQLStore) GetSiteByName(name string) (models.Site, error) {
return st, err
}
func (s *SQLStore) unmarshalSettings(raw string) (map[string]string, error) {
decrypted, err := s.decryptSettings(raw)
if err != nil {
return nil, fmt.Errorf("decrypt settings: %w", err)
}
var m map[string]string
if err := json.Unmarshal([]byte(decrypted), &m); err != nil {
return nil, fmt.Errorf("unmarshal settings: %w", err)
}
return m, nil
}
func (s *SQLStore) marshalSettings(settings map[string]string) (string, error) {
jsonBytes, err := json.Marshal(settings)
if err != nil {
return "", err
}
return s.encryptSettings(string(jsonBytes))
}
func (s *SQLStore) GetAlertByName(name string) (models.AlertConfig, error) {
var a models.AlertConfig
var settingsJSON string
err := s.db.QueryRow(s.q("SELECT id, name, type, settings FROM alerts WHERE name = ?"), name).Scan(&a.ID, &a.Name, &a.Type, &settingsJSON)
var settingsRaw string
err := s.db.QueryRow(s.q("SELECT id, name, type, settings FROM alerts WHERE name = ?"), name).Scan(&a.ID, &a.Name, &a.Type, &settingsRaw)
if err != nil {
return a, err
}
json.Unmarshal([]byte(settingsJSON), &a.Settings)
a.Settings, err = s.unmarshalSettings(settingsRaw)
if err != nil {
return a, fmt.Errorf("alert %q: %w", name, err)
}
return a, nil
}
func (s *SQLStore) AddSiteReturningID(site models.Site) (int, error) {
if err := s.AddSite(site); err != nil {
return 0, err
token := ""
if site.Type == "push" {
var err error
token, err = generateToken()
if err != nil {
return 0, fmt.Errorf("generate push token: %w", err)
}
created, err := s.GetSiteByName(site.Name)
}
if s.dollar {
var id int
err := s.db.QueryRow(s.q("INSERT INTO sites (name, url, type, token, interval, alert_id, check_ssl, threshold, max_retries, hostname, port, timeout, method, description, parent_id, accepted_codes, dns_resolve_type, dns_server, ignore_tls, paused, regions) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) RETURNING id"),
site.Name, site.URL, site.Type, token, site.Interval, site.AlertID, site.CheckSSL, site.ExpiryThreshold, site.MaxRetries,
site.Hostname, site.Port, site.Timeout, site.Method, site.Description, site.ParentID, site.AcceptedCodes, site.DNSResolveType, site.DNSServer, site.IgnoreTLS, site.Paused, site.Regions).Scan(&id)
return id, err
}
result, err := s.db.Exec(s.q("INSERT INTO sites (name, url, type, token, interval, alert_id, check_ssl, threshold, max_retries, hostname, port, timeout, method, description, parent_id, accepted_codes, dns_resolve_type, dns_server, ignore_tls, paused, regions) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"),
site.Name, site.URL, site.Type, token, site.Interval, site.AlertID, site.CheckSSL, site.ExpiryThreshold, site.MaxRetries,
site.Hostname, site.Port, site.Timeout, site.Method, site.Description, site.ParentID, site.AcceptedCodes, site.DNSResolveType, site.DNSServer, site.IgnoreTLS, site.Paused, site.Regions)
if err != nil {
return 0, err
}
return created.ID, nil
id, err := result.LastInsertId()
return int(id), err
}
func (s *SQLStore) AddAlertReturningID(name, aType string, settings map[string]string) (int, error) {
if err := s.AddAlert(name, aType, settings); err != nil {
return 0, err
}
created, err := s.GetAlertByName(name)
stored, err := s.marshalSettings(settings)
if err != nil {
return 0, err
}
return created.ID, nil
if s.dollar {
var id int
err := s.db.QueryRow(s.q("INSERT INTO alerts (name, type, settings) VALUES (?, ?, ?) RETURNING id"), name, aType, stored).Scan(&id)
return id, err
}
result, err := s.db.Exec(s.q("INSERT INTO alerts (name, type, settings) VALUES (?, ?, ?)"), name, aType, stored)
if err != nil {
return 0, err
}
id, err := result.LastInsertId()
return int(id), err
}
func (s *SQLStore) GetAllAlerts() ([]models.AlertConfig, error) {
@@ -166,11 +261,14 @@ func (s *SQLStore) GetAllAlerts() ([]models.AlertConfig, error) {
var alerts []models.AlertConfig
for rows.Next() {
var a models.AlertConfig
var settingsJSON string
if err := rows.Scan(&a.ID, &a.Name, &a.Type, &settingsJSON); err != nil {
var settingsRaw string
if err := rows.Scan(&a.ID, &a.Name, &a.Type, &settingsRaw); err != nil {
return alerts, err
}
json.Unmarshal([]byte(settingsJSON), &a.Settings)
a.Settings, err = s.unmarshalSettings(settingsRaw)
if err != nil {
return alerts, fmt.Errorf("alert %q: %w", a.Name, err)
}
alerts = append(alerts, a)
}
return alerts, rows.Err()
@@ -178,30 +276,33 @@ func (s *SQLStore) GetAllAlerts() ([]models.AlertConfig, error) {
func (s *SQLStore) GetAlert(id int) (models.AlertConfig, error) {
var a models.AlertConfig
var settingsJSON string
err := s.db.QueryRow(s.q("SELECT id, name, type, settings FROM alerts WHERE id = ?"), id).Scan(&a.ID, &a.Name, &a.Type, &settingsJSON)
var settingsRaw string
err := s.db.QueryRow(s.q("SELECT id, name, type, settings FROM alerts WHERE id = ?"), id).Scan(&a.ID, &a.Name, &a.Type, &settingsRaw)
if err != nil {
return a, err
}
json.Unmarshal([]byte(settingsJSON), &a.Settings)
a.Settings, err = s.unmarshalSettings(settingsRaw)
if err != nil {
return a, fmt.Errorf("alert %d: %w", id, err)
}
return a, nil
}
func (s *SQLStore) AddAlert(name, aType string, settings map[string]string) error {
jsonBytes, err := json.Marshal(settings)
stored, err := s.marshalSettings(settings)
if err != nil {
return err
}
_, err = s.db.Exec(s.q("INSERT INTO alerts (name, type, settings) VALUES (?, ?, ?)"), name, aType, string(jsonBytes))
_, err = s.db.Exec(s.q("INSERT INTO alerts (name, type, settings) VALUES (?, ?, ?)"), name, aType, stored)
return err
}
func (s *SQLStore) UpdateAlert(id int, name, aType string, settings map[string]string) error {
jsonBytes, err := json.Marshal(settings)
stored, err := s.marshalSettings(settings)
if err != nil {
return err
}
_, err = s.db.Exec(s.q("UPDATE alerts SET name=?, type=?, settings=? WHERE id=?"), name, aType, string(jsonBytes), id)
_, err = s.db.Exec(s.q("UPDATE alerts SET name=?, type=?, settings=? WHERE id=?"), name, aType, stored, id)
return err
}
@@ -246,6 +347,29 @@ func (s *SQLStore) DeleteUser(id int) error {
return err
}
func (s *SQLStore) SaveStateChange(siteID int, fromStatus, toStatus, errorReason string) error {
_, err := s.db.Exec(s.q("INSERT INTO state_changes (site_id, from_status, to_status, error_reason) VALUES (?, ?, ?, ?)"),
siteID, fromStatus, toStatus, errorReason)
return err
}
func (s *SQLStore) GetStateChanges(siteID int, limit int) ([]models.StateChange, error) {
rows, err := s.db.Query(s.q("SELECT id, site_id, from_status, to_status, error_reason, changed_at FROM state_changes WHERE site_id = ? ORDER BY changed_at DESC LIMIT ?"), siteID, limit)
if err != nil {
return nil, err
}
defer rows.Close()
var changes []models.StateChange
for rows.Next() {
var sc models.StateChange
if err := rows.Scan(&sc.ID, &sc.SiteID, &sc.FromStatus, &sc.ToStatus, &sc.ErrorReason, &sc.ChangedAt); err != nil {
return changes, err
}
changes = append(changes, sc)
}
return changes, rows.Err()
}
func (s *SQLStore) SaveCheck(siteID int, latencyNs int64, isUp bool) error {
return s.SaveCheckFromNode(siteID, "", latencyNs, isUp)
}
@@ -255,10 +379,16 @@ func (s *SQLStore) SaveCheckFromNode(siteID int, nodeID string, latencyNs int64,
if err != nil {
return err
}
_, err = s.db.Exec(s.q(`DELETE FROM check_history WHERE site_id = ? AND id NOT IN (
SELECT id FROM check_history WHERE site_id = ? ORDER BY checked_at DESC LIMIT 1000
)`), siteID, siteID)
var count int
_ = s.db.QueryRow(s.q("SELECT COUNT(*) FROM check_history WHERE site_id = ?"), siteID).Scan(&count)
if count > checkHistoryPruneAt {
pruneQuery := fmt.Sprintf(`DELETE FROM check_history WHERE site_id = ? AND id NOT IN (
SELECT id FROM check_history WHERE site_id = ? ORDER BY checked_at DESC LIMIT %d
)`, maxCheckHistory)
_, err = s.db.Exec(s.q(pruneQuery), siteID, siteID)
return err
}
return nil
}
func (s *SQLStore) RegisterNode(node models.ProbeNode) error {
@@ -356,6 +486,108 @@ func (s *SQLStore) LoadAllHistory(limit int) (map[int][]models.CheckRecord, erro
return result, rows.Err()
}
func (s *SQLStore) scanMaintenanceWindow(rows *sql.Rows) (models.MaintenanceWindow, error) {
var mw models.MaintenanceWindow
var endTime sql.NullTime
if err := rows.Scan(&mw.ID, &mw.MonitorID, &mw.Title, &mw.Description, &mw.Type, &mw.StartTime, &endTime, &mw.CreatedBy, &mw.CreatedAt); err != nil {
return mw, err
}
if endTime.Valid {
mw.EndTime = endTime.Time
}
return mw, nil
}
func (s *SQLStore) GetActiveMaintenanceWindows() ([]models.MaintenanceWindow, error) {
rows, err := s.db.Query(s.q("SELECT id, monitor_id, title, description, type, start_time, end_time, created_by, created_at FROM maintenance_windows WHERE start_time <= CURRENT_TIMESTAMP AND (end_time IS NULL OR end_time > CURRENT_TIMESTAMP) ORDER BY start_time DESC"))
if err != nil {
return nil, err
}
defer rows.Close()
var windows []models.MaintenanceWindow
for rows.Next() {
mw, err := s.scanMaintenanceWindow(rows)
if err != nil {
return windows, err
}
windows = append(windows, mw)
}
return windows, rows.Err()
}
func (s *SQLStore) GetAllMaintenanceWindows(limit int) ([]models.MaintenanceWindow, error) {
rows, err := s.db.Query(s.q("SELECT id, monitor_id, title, description, type, start_time, end_time, created_by, created_at FROM maintenance_windows ORDER BY created_at DESC LIMIT ?"), limit)
if err != nil {
return nil, err
}
defer rows.Close()
var windows []models.MaintenanceWindow
for rows.Next() {
mw, err := s.scanMaintenanceWindow(rows)
if err != nil {
return windows, err
}
windows = append(windows, mw)
}
return windows, rows.Err()
}
func (s *SQLStore) AddMaintenanceWindow(mw models.MaintenanceWindow) error {
if mw.StartTime.IsZero() {
mw.StartTime = time.Now()
}
_, err := s.db.Exec(s.q("INSERT INTO maintenance_windows (monitor_id, title, description, type, start_time, end_time, created_by) VALUES (?, ?, ?, ?, ?, ?, ?)"),
mw.MonitorID, mw.Title, mw.Description, mw.Type, mw.StartTime, sql.NullTime{Time: mw.EndTime, Valid: !mw.EndTime.IsZero()}, mw.CreatedBy)
return err
}
func (s *SQLStore) EndMaintenanceWindow(id int) error {
_, err := s.db.Exec(s.q("UPDATE maintenance_windows SET end_time = CURRENT_TIMESTAMP WHERE id = ?"), id)
return err
}
func (s *SQLStore) DeleteMaintenanceWindow(id int) error {
_, err := s.db.Exec(s.q("DELETE FROM maintenance_windows WHERE id = ?"), id)
if err != nil {
return err
}
s.dialect.ResetSequenceOnEmpty(s.db, "maintenance_windows")
return nil
}
func (s *SQLStore) IsMonitorInMaintenance(monitorID int) (bool, error) {
var count int
err := s.db.QueryRow(s.q(`SELECT COUNT(*) FROM maintenance_windows
WHERE type = 'maintenance'
AND start_time <= CURRENT_TIMESTAMP
AND (end_time IS NULL OR end_time > CURRENT_TIMESTAMP)
AND (monitor_id = 0 OR monitor_id = ?
OR monitor_id IN (SELECT parent_id FROM sites WHERE id = ? AND parent_id > 0))`),
monitorID, monitorID).Scan(&count)
if err != nil {
return false, err
}
return count > 0, nil
}
func (s *SQLStore) GetPreference(key string) (string, error) {
var value string
err := s.db.QueryRow(s.q("SELECT value FROM preferences WHERE key = ?"), key).Scan(&value)
if err != nil {
return "", err
}
return value, nil
}
func (s *SQLStore) SetPreference(key, value string) error {
if s.dollar {
_, err := s.db.Exec(s.q("INSERT INTO preferences (key, value) VALUES (?, ?) ON CONFLICT (key) DO UPDATE SET value = ?"), key, value, value)
return err
}
_, err := s.db.Exec("INSERT OR REPLACE INTO preferences (key, value) VALUES (?, ?)", key, value)
return err
}
func (s *SQLStore) ExportData() (models.Backup, error) {
sites, err := s.GetSites()
if err != nil {
@@ -369,7 +601,11 @@ func (s *SQLStore) ExportData() (models.Backup, error) {
if err != nil {
return models.Backup{}, err
}
return models.Backup{Sites: sites, Alerts: alerts, Users: users}, nil
windows, err := s.GetAllMaintenanceWindows(maxMaintenanceExport)
if err != nil {
return models.Backup{}, err
}
return models.Backup{Sites: sites, Alerts: alerts, Users: users, MaintenanceWindows: windows}, nil
}
func (s *SQLStore) ImportData(data models.Backup) error {
@@ -377,7 +613,7 @@ func (s *SQLStore) ImportData(data models.Backup) error {
if err != nil {
return err
}
defer tx.Rollback()
defer tx.Rollback() //nolint:errcheck
s.dialect.ImportWipe(tx)
@@ -403,6 +639,13 @@ func (s *SQLStore) ImportData(data models.Backup) error {
}
}
for _, mw := range data.MaintenanceWindows {
if _, err := tx.Exec(s.q("INSERT INTO maintenance_windows (id, monitor_id, title, description, type, start_time, end_time, created_by) VALUES (?, ?, ?, ?, ?, ?, ?, ?)"),
mw.ID, mw.MonitorID, mw.Title, mw.Description, mw.Type, mw.StartTime, sql.NullTime{Time: mw.EndTime, Valid: !mw.EndTime.IsZero()}, mw.CreatedBy); err != nil {
return err
}
}
s.dialect.ImportResetSequences(tx)
return tx.Commit()
+1 -1
View File
@@ -1,7 +1,7 @@
package store
import (
"go-upkeep/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"testing"
)
+20 -1
View File
@@ -1,7 +1,7 @@
package store
import (
"go-upkeep/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
)
type Store interface {
@@ -38,6 +38,10 @@ type Store interface {
SaveCheckFromNode(siteID int, nodeID string, latencyNs int64, isUp bool) error
LoadAllHistory(limit int) (map[int][]models.CheckRecord, error)
// State Changes
SaveStateChange(siteID int, fromStatus, toStatus, errorReason string) error
GetStateChanges(siteID int, limit int) ([]models.StateChange, error)
// Nodes
RegisterNode(node models.ProbeNode) error
GetNode(id string) (models.ProbeNode, error)
@@ -49,7 +53,22 @@ type Store interface {
SaveLog(message string) error
LoadLogs(limit int) ([]string, error)
// Maintenance Windows
GetActiveMaintenanceWindows() ([]models.MaintenanceWindow, error)
GetAllMaintenanceWindows(limit int) ([]models.MaintenanceWindow, error)
AddMaintenanceWindow(mw models.MaintenanceWindow) error
EndMaintenanceWindow(id int) error
DeleteMaintenanceWindow(id int) error
IsMonitorInMaintenance(monitorID int) (bool, error)
// Preferences
GetPreference(key string) (string, error)
SetPreference(key, value string) error
// Backup & Restore
ExportData() (models.Backup, error)
ImportData(data models.Backup) error
// Lifecycle
Close() error
}
+97 -6
View File
@@ -2,7 +2,10 @@ package tui
import (
"fmt"
"strings"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
tea "github.com/charmbracelet/bubbletea"
"github.com/charmbracelet/huh"
"github.com/charmbracelet/lipgloss"
@@ -113,34 +116,122 @@ func fmtAlertConfig(alert struct {
}
}
func fmtAlertHealth(h monitor.AlertHealth) string {
if h.LastSendAt.IsZero() {
return subtleStyle.Render("●")
}
if h.LastSendOK {
return specialStyle.Render("●")
}
return dangerStyle.Render("●")
}
func fmtAlertLastSent(h monitor.AlertHealth) string {
if h.LastSendAt.IsZero() {
return subtleStyle.Render("never")
}
d := time.Since(h.LastSendAt)
if d < time.Minute {
return fmt.Sprintf("%ds ago", int(d.Seconds()))
}
if d < time.Hour {
return fmt.Sprintf("%dm ago", int(d.Minutes()))
}
if d < 24*time.Hour {
return fmt.Sprintf("%dh ago", int(d.Hours()))
}
return fmt.Sprintf("%dd ago", int(d.Hours())/24)
}
func (m Model) viewAlertsTab() string {
if len(m.alerts) == 0 {
return "\n No alert channels configured. Press [n] to add one."
}
var headers []string
var widths []int
if m.isWide() {
headers = []string{"#", "", "NAME", "TYPE", "CONFIG", "LAST SENT"}
widths = []int{4, 3, 18, 12, 40, 12}
} else {
headers = []string{"#", "", "NAME", "TYPE", "CONFIG", "SENT"}
widths = []int{4, 3, 14, 10, 24, 8}
}
nameW := widths[2]
cfgW := widths[4]
return m.renderTable(
[]string{"#", "NAME", "TYPE", "CONFIG"},
headers,
len(m.alerts),
func(start, end int) [][]string {
var rows [][]string
for i := start; i < end; i++ {
a := m.alerts[i]
h := m.engine.GetAlertHealth(a.ID)
rows = append(rows, []string{
fmt.Sprintf("%d", i+1),
m.zones.Mark(fmt.Sprintf("alert-%d", i), limitStr(a.Name, 15)),
fmtAlertHealth(h),
m.zones.Mark(fmt.Sprintf("alert-%d", i), limitStr(a.Name, nameW-2)),
fmtAlertType(a.Type),
fmtAlertConfig(struct {
limitStr(fmtAlertConfig(struct {
Type string
Settings map[string]string
}{a.Type, a.Settings}),
}{a.Type, a.Settings}), cfgW-2),
fmtAlertLastSent(h),
})
}
return rows
},
nil, nil,
widths, nil,
)
}
func (m Model) viewAlertDetailPanel() string {
if m.cursor >= len(m.alerts) {
return ""
}
a := m.alerts[m.cursor]
h := m.engine.GetAlertHealth(a.ID)
var b strings.Builder
b.WriteString(subtleStyle.Render(" Alerts > ") + titleStyle.Render(a.Name) + "\n\n")
row := func(label, value string) {
fmt.Fprintf(&b, " %-16s %s\n", subtleStyle.Render(label), value)
}
row("Type", fmtAlertType(a.Type))
if h.LastSendAt.IsZero() {
row("Health", subtleStyle.Render("never sent"))
} else if h.LastSendOK {
row("Health", specialStyle.Render("OK"))
} else {
row("Health", dangerStyle.Render("FAILED"))
}
if !h.LastSendAt.IsZero() {
row("Last Sent", h.LastSendAt.Format("2006-01-02 15:04:05")+" ("+fmtAlertLastSent(h)+")")
}
if h.SendCount > 0 {
row("Sends", fmt.Sprintf("%d sent, %d failed", h.SendCount, h.FailCount))
}
if h.LastError != "" {
row("Last Error", dangerStyle.Render(limitStr(h.LastError, 60)))
}
b.WriteString("\n" + subtleStyle.Render(" CONFIGURATION") + "\n")
for k, v := range a.Settings {
row(k, v)
}
b.WriteString("\n\n")
b.WriteString(subtleStyle.Render(" [i/Esc] Back [e] Edit [t] Test [q] Quit"))
return lipgloss.NewStyle().Padding(1, 2).Render(b.String())
}
func (m *Model) initAlertHuhForm() tea.Cmd {
m.alertFormData = &alertFormData{
AlertType: "discord",
@@ -319,7 +410,7 @@ func (m *Model) initAlertHuhForm() tea.Cmd {
).Title("Gotify Settings").WithHideFunc(func() bool {
return m.alertFormData.AlertType != "gotify"
}),
).WithTheme(huh.ThemeDracula())
).WithTheme(m.theme.HuhTheme())
return m.huhForm.Init()
}
+88 -20
View File
@@ -5,27 +5,83 @@ import (
"strings"
)
func colorizeLog(line string) string {
type logSeverity int
const (
severityInfo logSeverity = iota
severityWarn
severityDown
severityUp
severitySystem
)
func classifyLog(line string) logSeverity {
lower := strings.ToLower(line)
switch {
case strings.Contains(lower, "confirmed down"),
strings.Contains(lower, "is down"),
strings.Contains(lower, "missed heartbeat"),
strings.Contains(lower, "failed check"),
strings.Contains(lower, "ssl warning"):
return dangerStyle.Render(line)
strings.Contains(lower, "alert send failed"):
return severityDown
case strings.Contains(lower, "recovered"),
strings.Contains(lower, "is up"),
strings.Contains(lower, "recovery"):
return specialStyle.Render(line)
strings.Contains(lower, "recovery"),
strings.Contains(lower, "first heartbeat"):
return severityUp
case strings.Contains(lower, "failed check"),
strings.Contains(lower, "ssl warning"),
strings.Contains(lower, "overdue"),
strings.Contains(lower, "was late"):
return severityWarn
case strings.Contains(lower, "engine"),
strings.Contains(lower, "cluster"):
return titleStyle.Render(line)
strings.Contains(lower, "cluster"),
strings.Contains(lower, "loaded"),
strings.Contains(lower, "paused"),
strings.Contains(lower, "resumed"):
return severitySystem
default:
return line
return severityInfo
}
}
func isImportantLog(sev logSeverity) bool {
return sev == severityDown || sev == severityUp || sev == severitySystem
}
func renderLogTag(sev logSeverity) string {
switch sev {
case severityDown:
return dangerStyle.Render(" DOWN ")
case severityUp:
return specialStyle.Render(" UP ")
case severityWarn:
return warnStyle.Render(" WARN ")
case severitySystem:
return titleStyle.Render(" SYS ")
default:
return subtleStyle.Render(" info ")
}
}
func renderLogLine(line string) string {
sev := classifyLog(line)
tag := renderLogTag(sev)
ts := ""
msg := line
if len(line) > 10 && line[0] == '[' {
if idx := strings.Index(line, "]"); idx > 0 && idx < 12 {
ts = subtleStyle.Render(line[1:idx])
msg = strings.TrimSpace(line[idx+1:])
}
}
if ts != "" {
return fmt.Sprintf(" %s %s %s", ts, tag, msg)
}
return fmt.Sprintf(" %s %s", tag, msg)
}
func (m Model) viewLogsTab() string {
content := m.logViewport.View()
if strings.TrimSpace(content) == "" || content == "Waiting for logs..." {
@@ -33,22 +89,34 @@ func (m Model) viewLogsTab() string {
}
lines := strings.Split(content, "\n")
var colored []string
var rendered []string
total := 0
shown := 0
for _, line := range lines {
if line == "" {
colored = append(colored, line)
if strings.TrimSpace(line) == "" {
continue
}
colored = append(colored, colorizeLog(line))
total++
sev := classifyLog(line)
if m.logFilterImportant && !isImportantLog(sev) {
continue
}
shown++
rendered = append(rendered, renderLogLine(line))
}
count := 0
for _, l := range lines {
if strings.TrimSpace(l) != "" {
count++
}
filterLabel := "All"
if m.logFilterImportant {
filterLabel = "Important"
}
header := subtleStyle.Render(fmt.Sprintf(" %d entries [↑/↓] Scroll [PgUp/PgDn] Page", count))
return "\n" + header + "\n\n" + strings.Join(colored, "\n")
header := subtleStyle.Render(fmt.Sprintf(
" %d entries [↑/↓] Scroll [PgUp/PgDn] Page [f] Filter: %s", shown, filterLabel))
if m.logFilterImportant && shown < total {
header += subtleStyle.Render(fmt.Sprintf(" (%d hidden)", total-shown))
}
return "\n" + header + "\n\n" + strings.Join(rendered, "\n")
}
+247
View File
@@ -0,0 +1,247 @@
package tui
import (
"fmt"
"strconv"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
tea "github.com/charmbracelet/bubbletea"
"github.com/charmbracelet/huh"
"github.com/charmbracelet/lipgloss"
)
var maintStyle lipgloss.Style
type maintFormData struct {
Title string
Description string
Type string
MonitorID string
Duration string
CustomHours string
}
func fmtMaintStatus(mw models.MaintenanceWindow) string {
now := time.Now()
if mw.StartTime.After(now) {
return warnStyle.Render("SCHEDULED")
}
if !mw.EndTime.IsZero() && mw.EndTime.Before(now) {
return subtleStyle.Render("ENDED")
}
return specialStyle.Render("ACTIVE")
}
func fmtMaintType(t string) string {
if t == "incident" {
return dangerStyle.Render("incident")
}
return maintStyle.Render("maintenance")
}
func fmtMaintMonitorW(monitorID int, sites []models.Site, maxW int) string {
if monitorID == 0 {
return "All"
}
for _, s := range sites {
if s.ID == monitorID {
return limitStr(s.Name, maxW)
}
}
return fmt.Sprintf("#%d", monitorID)
}
func fmtMaintTime(t time.Time, colW int) string {
if t.IsZero() {
return subtleStyle.Render("—")
}
now := time.Now()
if t.Year() == now.Year() && t.YearDay() == now.YearDay() {
return t.Format("15:04")
}
if colW >= 14 {
return t.Format("15:04 Jan 02")
}
return t.Format("Jan 02")
}
func (m Model) isMonitorInMaintenance(monitorID int) bool {
for _, mw := range m.maintenanceWindows {
if mw.Type != "maintenance" {
continue
}
now := time.Now()
if mw.StartTime.After(now) {
continue
}
if !mw.EndTime.IsZero() && mw.EndTime.Before(now) {
continue
}
if mw.MonitorID == 0 || mw.MonitorID == monitorID {
return true
}
for _, s := range m.sites {
if s.ID == monitorID && s.ParentID > 0 && mw.MonitorID == s.ParentID {
return true
}
}
}
return false
}
func (m Model) viewMaintTab() string {
if len(m.maintenanceWindows) == 0 {
return "\n No maintenance windows or incidents. Press [n] to create one."
}
var headers []string
var widths []int
if m.isWide() {
headers = []string{"#", "TITLE", "TYPE", "MONITORS", "STATUS", "STARTED", "ENDS"}
widths = []int{4, 24, 14, 22, 12, 16, 16}
} else {
headers = []string{"#", "TITLE", "TYPE", "MON", "ST", "START", "ENDS"}
widths = []int{4, 14, 13, 14, 11, 14, 14}
}
titleW := widths[1]
monW := widths[3]
timeW := widths[5]
return m.renderTable(
headers,
len(m.maintenanceWindows),
func(start, end int) [][]string {
var rows [][]string
allSites := m.engine.GetAllSites()
for i := start; i < end; i++ {
mw := m.maintenanceWindows[i]
rows = append(rows, []string{
strconv.Itoa(i + 1),
m.zones.Mark(fmt.Sprintf("maint-%d", i), limitStr(mw.Title, titleW-2)),
fmtMaintType(mw.Type),
fmtMaintMonitorW(mw.MonitorID, allSites, monW-2),
fmtMaintStatus(mw),
fmtMaintTime(mw.StartTime, timeW),
fmtMaintTime(mw.EndTime, timeW),
})
}
return rows
},
widths,
nil,
)
}
func (m *Model) initMaintHuhForm() tea.Cmd {
m.maintFormData = &maintFormData{
Type: "maintenance",
MonitorID: "0",
Duration: "1h",
CustomHours: "12",
}
monitorOpts := []huh.Option[string]{huh.NewOption("All Monitors", "0")}
allSites := m.engine.GetAllSites()
for _, s := range allSites {
label := s.Name
if s.Type == "group" {
label = s.Name + " (group)"
}
monitorOpts = append(monitorOpts, huh.NewOption(label, strconv.Itoa(s.ID)))
}
m.huhForm = huh.NewForm(
huh.NewGroup(
huh.NewInput().Title("Title").
Placeholder("DB Migration").
Value(&m.maintFormData.Title).
Validate(func(s string) error {
if s == "" {
return fmt.Errorf("title is required")
}
return nil
}),
huh.NewSelect[string]().Title("Type").
Options(
huh.NewOption("Maintenance (suppress alerts)", "maintenance"),
huh.NewOption("Incident (informational)", "incident"),
).Value(&m.maintFormData.Type),
huh.NewSelect[string]().Title("Affected Monitors").
Options(monitorOpts...).
Value(&m.maintFormData.MonitorID),
huh.NewInput().Title("Description").
Placeholder("Optional notes").
Value(&m.maintFormData.Description),
).Title("Maintenance Window"),
huh.NewGroup(
huh.NewSelect[string]().Title("Duration").
Options(
huh.NewOption("1 hour", "1h"),
huh.NewOption("2 hours", "2h"),
huh.NewOption("4 hours", "4h"),
huh.NewOption("8 hours", "8h"),
huh.NewOption("Indefinite (end manually)", "indefinite"),
huh.NewOption("Custom", "custom"),
).Value(&m.maintFormData.Duration),
huh.NewInput().Title("Custom Duration (hours)").
Placeholder("12").
Value(&m.maintFormData.CustomHours).
Validate(func(s string) error {
if m.maintFormData.Duration != "custom" {
return nil
}
v, err := strconv.Atoi(s)
if err != nil {
return fmt.Errorf("must be a number")
}
if v < 1 {
return fmt.Errorf("must be at least 1 hour")
}
return nil
}),
).Title("Duration").WithHideFunc(func() bool {
return m.maintFormData.Type == "incident"
}),
).WithTheme(m.theme.HuhTheme())
return m.huhForm.Init()
}
func (m *Model) submitMaintForm() {
d := m.maintFormData
monitorID, _ := strconv.Atoi(d.MonitorID)
mw := models.MaintenanceWindow{
MonitorID: monitorID,
Title: d.Title,
Description: d.Description,
Type: d.Type,
StartTime: time.Now(),
}
if d.Type == "maintenance" {
switch d.Duration {
case "1h":
mw.EndTime = mw.StartTime.Add(1 * time.Hour)
case "2h":
mw.EndTime = mw.StartTime.Add(2 * time.Hour)
case "4h":
mw.EndTime = mw.StartTime.Add(4 * time.Hour)
case "8h":
mw.EndTime = mw.StartTime.Add(8 * time.Hour)
case "custom":
hours, _ := strconv.Atoi(d.CustomHours)
if hours < 1 {
hours = 1
}
mw.EndTime = mw.StartTime.Add(time.Duration(hours) * time.Hour)
}
}
if err := m.store.AddMaintenanceWindow(mw); err != nil {
m.engine.AddLog("Add maintenance window failed: " + err.Error())
}
m.state = stateDashboard
}
+13 -29
View File
@@ -2,8 +2,6 @@ package tui
import (
"fmt"
"go-upkeep/internal/models"
"strings"
"time"
)
@@ -12,16 +10,25 @@ func (m Model) viewNodesTab() string {
return "\n No probe nodes connected."
}
colWidths := []int{0, 12, 20, 10, 8}
var headers []string
var widths []int
if m.isWide() {
headers = []string{"NAME", "REGION", "LAST SEEN", "VERSION", "STATUS"}
widths = []int{24, 14, 16, 12, 10}
} else {
headers = []string{"NAME", "REGION", "SEEN", "VER", "STATUS"}
widths = []int{16, 10, 10, 8, 8}
}
nameW := widths[0]
return m.renderTable(
[]string{"NAME", "REGION", "LAST SEEN", "VERSION", "STATUS"},
headers,
len(m.nodes),
func(start, end int) [][]string {
var rows [][]string
for i := start; i < end; i++ {
node := m.nodes[i]
name := limitStr(node.Name, 20)
name := limitStr(node.Name, nameW-2)
if name == "" {
name = node.ID
}
@@ -39,7 +46,7 @@ func (m Model) viewNodesTab() string {
}
return rows
},
colWidths,
widths,
nil,
)
}
@@ -71,26 +78,3 @@ func fmtNodeLastSeen(t time.Time) string {
}
return fmt.Sprintf("%dh ago", int(ago.Hours()))
}
func fmtProbeRegions(site models.Site, probeResults map[string]probeStatus) string {
if len(probeResults) == 0 {
return subtleStyle.Render("—")
}
var parts []string
for region, status := range probeResults {
short := region
if len(short) > 6 {
short = short[:6]
}
if status.isUp {
parts = append(parts, specialStyle.Render(short+":UP"))
} else {
parts = append(parts, dangerStyle.Render(short+":DN"))
}
}
return strings.Join(parts, " ")
}
type probeStatus struct {
isUp bool
}
+347 -51
View File
@@ -2,12 +2,13 @@ package tui
import (
"fmt"
"go-upkeep/internal/models"
"net/url"
"strconv"
"strings"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
tea "github.com/charmbracelet/bubbletea"
"github.com/charmbracelet/huh"
"github.com/charmbracelet/lipgloss"
@@ -29,18 +30,15 @@ func typeIcon(siteType string, collapsed bool) string {
return "◆"
case "group":
if collapsed {
return ""
return ""
}
return ""
return ""
default:
return "·"
}
}
var siteGroupStyle = lipgloss.NewStyle().
Padding(0, 1).
Bold(true).
Foreground(lipgloss.Color("#7D56F4"))
var siteGroupStyle lipgloss.Style
type siteFormData struct {
Name string
@@ -62,14 +60,18 @@ type siteFormData struct {
Regions string
}
func latencySparkline(latencies []time.Duration, width int) string {
func latencySparkline(latencies []time.Duration, statuses []bool, width int) string {
if len(latencies) == 0 {
return subtleStyle.Render(strings.Repeat("·", width))
}
samples := latencies
sampledStatuses := statuses
if len(samples) > width {
samples = samples[len(samples)-width:]
if len(sampledStatuses) > width {
sampledStatuses = sampledStatuses[len(sampledStatuses)-width:]
}
}
minL, maxL := samples[0], samples[0]
@@ -87,7 +89,7 @@ func latencySparkline(latencies []time.Duration, width int) string {
sb.WriteString(subtleStyle.Render(strings.Repeat("·", remaining)))
}
spread := maxL - minL
for _, l := range samples {
for i, l := range samples {
idx := 0
if spread > 0 {
idx = int(float64(l-minL) / float64(spread) * 7)
@@ -96,6 +98,10 @@ func latencySparkline(latencies []time.Duration, width int) string {
}
}
ch := string(sparkChars[idx])
isDown := i < len(sampledStatuses) && !sampledStatuses[i]
if isDown {
sb.WriteString(dangerStyle.Render(ch))
} else {
ms := l.Milliseconds()
if ms < 200 {
sb.WriteString(specialStyle.Render(ch))
@@ -105,6 +111,7 @@ func latencySparkline(latencies []time.Duration, width int) string {
sb.WriteString(dangerStyle.Render(ch))
}
}
}
return sb.String()
}
@@ -132,6 +139,93 @@ func heartbeatSparkline(statuses []bool, width int) string {
return sb.String()
}
func (m Model) groupSparkline(groupID int, width int) string {
allSites := m.engine.GetAllSites()
var childStatuses [][]bool
for _, s := range allSites {
if s.ParentID == groupID && !s.Paused && !m.isMonitorInMaintenance(s.ID) {
hist, _ := m.engine.GetHistory(s.ID)
if len(hist.Statuses) > 0 {
childStatuses = append(childStatuses, hist.Statuses)
}
}
}
if len(childStatuses) == 0 {
return subtleStyle.Render(strings.Repeat("·", width))
}
maxLen := 0
for _, s := range childStatuses {
if len(s) > maxLen {
maxLen = len(s)
}
}
if maxLen > width {
maxLen = width
}
aggregated := make([]bool, maxLen)
for i := 0; i < maxLen; i++ {
allUp := true
for _, statuses := range childStatuses {
idx := len(statuses) - maxLen + i
if idx >= 0 && !statuses[idx] {
allUp = false
break
}
}
aggregated[i] = allUp
}
var sb strings.Builder
if remaining := width - len(aggregated); remaining > 0 {
sb.WriteString(subtleStyle.Render(strings.Repeat("·", remaining)))
}
for _, up := range aggregated {
if up {
sb.WriteString(specialStyle.Render("●"))
} else {
sb.WriteString(dangerStyle.Render("●"))
}
}
return sb.String()
}
func (m Model) groupUptime(groupID int) string {
allSites := m.engine.GetAllSites()
var allStatuses [][]bool
for _, s := range allSites {
if s.ParentID == groupID && !s.Paused && !m.isMonitorInMaintenance(s.ID) {
hist, _ := m.engine.GetHistory(s.ID)
if len(hist.Statuses) > 0 {
allStatuses = append(allStatuses, hist.Statuses)
}
}
}
if len(allStatuses) == 0 {
return subtleStyle.Render("—")
}
total, up := 0, 0
for _, statuses := range allStatuses {
for _, s := range statuses {
total++
if s {
up++
}
}
}
return fmtUptime(func() []bool {
out := make([]bool, total)
idx := 0
for _, statuses := range allStatuses {
copy(out[idx:], statuses)
idx += len(statuses)
}
return out
}())
}
func fmtLatency(d time.Duration) string {
ms := d.Milliseconds()
if ms == 0 {
@@ -207,42 +301,113 @@ func fmtRetries(site models.Site) string {
return s
}
func fmtStatus(status string, paused bool) string {
func fmtStatus(status string, paused bool, inMaint bool) string {
if paused {
return warnStyle.Render("PAUSED")
}
switch {
case status == "DOWN" || status == "SSL EXP":
if inMaint {
return maintStyle.Render("MAINT")
}
switch status {
case "DOWN", "SSL EXP":
return dangerStyle.Render(status)
case status == "PENDING":
case "LATE":
return warnStyle.Render(status)
case "PENDING":
return subtleStyle.Render(status)
default:
return specialStyle.Render(status)
}
}
func (m Model) dynamicWidths() (nameW, sparkW int) {
fixed := 6 + 10 + 10 + 8 + 8 + 7 + 9 // #, TYPE, STATUS, LATENCY, UPTIME, SSL, RETRY
overhead := 30 // cell padding + borders
avail := m.termWidth - 6 - fixed - overhead
if avail < 30 {
avail = 30
func fmtDuration(d time.Duration) string {
if d < time.Minute {
return fmt.Sprintf("%ds", int(d.Seconds()))
}
if d < time.Hour {
return fmt.Sprintf("%dm", int(d.Minutes()))
}
if d < 24*time.Hour {
h := int(d.Hours())
m := int(d.Minutes()) % 60
if m > 0 {
return fmt.Sprintf("%dh %dm", h, m)
}
return fmt.Sprintf("%dh", h)
}
days := int(d.Hours()) / 24
hours := int(d.Hours()) % 24
if hours > 0 {
return fmt.Sprintf("%dd %dh", days, hours)
}
return fmt.Sprintf("%dd", days)
}
type tableLayout struct {
nameW, sparkW int
headers []string
colWidths []int
}
func (m Model) computeLayout() tableLayout {
wide := m.isWide()
var fixed int
var headers []string
var widths []int
if wide {
// # NAME TYPE STATUS LATENCY UPTIME HISTORY SSL RETRIES
headers = []string{"#", "NAME", "TYPE", "STATUS", "LATENCY", "UPTIME", "HISTORY", "SSL", "RETRIES"}
widths = []int{4, 0, 10, 10, 10, 8, 0, 7, 9}
fixed = 4 + 10 + 10 + 10 + 8 + 7 + 9
} else {
// # NAME TYPE STATUS LAT UP% HISTORY SSL RT
headers = []string{"#", "NAME", "TYPE", "STATUS", "LAT", "UP%", "HISTORY", "SSL", "RT"}
widths = []int{4, 0, 8, 8, 7, 8, 0, 5, 5}
fixed = 4 + 8 + 8 + 7 + 8 + 5 + 5
}
numCols := len(headers)
borderOverhead := 2 + (numCols - 1)
avail := m.termWidth - chromePadH - 2 - borderOverhead - fixed
if avail < 20 {
avail = 20
}
maxName := 0
for _, s := range m.sites {
if n := len([]rune(s.Name)); n > maxName {
maxName = n
}
}
maxName += 4
nameW := avail / 2
if nameW > maxName {
nameW = maxName
}
nameW = avail / 2
sparkW = avail - nameW - 2 // -2 for spark column padding
if nameW < 13 {
nameW = 13
}
if nameW > 40 {
nameW = 40
}
sparkW := avail - nameW
if sparkW < 10 {
sparkW = 10
}
if sparkW > 60 {
sparkW = 60
widths[1] = nameW
widths[6] = sparkW
return tableLayout{
nameW: nameW,
sparkW: sparkW,
headers: headers,
colWidths: widths,
}
return
}
func (m Model) viewSitesTab() string {
@@ -250,22 +415,26 @@ func (m Model) viewSitesTab() string {
if len(m.sites) == 0 {
welcome := lipgloss.NewStyle().
Border(lipgloss.RoundedBorder()).
BorderForeground(lipgloss.Color("#7D56F4")).
BorderForeground(m.theme.Accent).
Padding(1, 3).
Render(
titleStyle.Render("Go-Upkeep") + "\n\n" +
titleStyle.Render("uptop") + "\n\n" +
"No monitors configured yet.\n\n" +
subtleStyle.Render("[n] Add your first monitor"),
)
return "\n" + welcome
}
nameW, sparkWidth := m.dynamicWidths()
colWidths := []int{6, 0, 10, 10, 8, 8, sparkWidth + 2, 7, 9}
layout := m.computeLayout()
nameW := layout.nameW
sparkWidth := layout.sparkW - 2
if sparkWidth < 8 {
sparkWidth = 8
}
var groupRows map[int]bool
return m.renderTable(
[]string{"#", "NAME", "TYPE", "STATUS", "LATENCY", "UPTIME", "HISTORY", "SSL", "RETRY"},
layout.headers,
len(m.sites),
func(start, end int) [][]string {
groupRows = make(map[int]bool)
@@ -278,12 +447,12 @@ func (m Model) viewSitesTab() string {
icon := typeIcon("group", m.collapsed[site.ID])
rows = append(rows, []string{
strconv.Itoa(i + 1),
m.zones.Mark(fmt.Sprintf("site-%d", i), icon+" "+limitStr(site.Name, nameW-2)),
m.zones.Mark(fmt.Sprintf("site-%d", i), icon+" "+limitStr(site.Name, nameW-4)),
"group",
fmtStatus(site.Status, site.Paused),
fmtStatus(site.Status, site.Paused, m.isMonitorInMaintenance(site.ID)),
subtleStyle.Render("—"),
subtleStyle.Render("—"),
subtleStyle.Render(strings.Repeat("·", sparkWidth)),
m.groupUptime(site.ID),
m.groupSparkline(site.ID, sparkWidth),
subtleStyle.Render("-"),
subtleStyle.Render("—"),
})
@@ -296,9 +465,17 @@ func (m Model) viewSitesTab() string {
if i+1 >= len(m.sites) || m.sites[i+1].ParentID != site.ParentID {
prefix = "└"
}
name = prefix + " " + limitStr(name, nameW-2)
name = prefix + " " + limitStr(name, nameW-4)
} else {
name = limitStr(name, nameW)
name = limitStr(name, nameW-2)
}
if (site.Status == "DOWN" || site.Status == "SSL EXP" || site.Status == "LATE") && site.LastError != "" {
nameLen := len([]rune(name))
errSpace := nameW - nameLen - 3
if errSpace > 10 {
name = name + " " + subtleStyle.Render(limitStr(site.LastError, errSpace))
}
}
hist, _ := m.engine.GetHistory(site.ID)
@@ -306,14 +483,14 @@ func (m Model) viewSitesTab() string {
if site.Type == "push" {
spark = heartbeatSparkline(hist.Statuses, sparkWidth)
} else {
spark = latencySparkline(hist.Latencies, sparkWidth)
spark = latencySparkline(hist.Latencies, hist.Statuses, sparkWidth)
}
rows = append(rows, []string{
strconv.Itoa(i + 1),
m.zones.Mark(fmt.Sprintf("site-%d", i), name),
typeIcon(site.Type, false) + " " + site.Type,
fmtStatus(site.Status, site.Paused),
fmtStatus(site.Status, site.Paused, m.isMonitorInMaintenance(site.ID)),
fmtLatency(site.Latency),
fmtUptime(hist.Statuses),
spark,
@@ -323,7 +500,7 @@ func (m Model) viewSitesTab() string {
}
return rows
},
colWidths,
layout.colWidths,
func(row, col int) *lipgloss.Style {
if groupRows[row] {
s := siteGroupStyle
@@ -419,7 +596,7 @@ func (m *Model) initSiteHuhForm() tea.Cmd {
Description("Required for HTTP monitors").
Value(&m.siteFormData.URL).
Validate(func(s string) error {
if m.siteFormData.SiteType == "push" || m.siteFormData.SiteType == "group" {
if m.siteFormData.SiteType != "http" {
return nil
}
if s == "" {
@@ -465,12 +642,15 @@ func (m *Model) initSiteHuhForm() tea.Cmd {
Description("Target port for TCP port monitors").
Value(&m.siteFormData.Port).
Validate(func(s string) error {
if m.siteFormData.SiteType != "port" {
return nil
}
v, err := strconv.Atoi(s)
if err != nil {
return fmt.Errorf("must be a number")
}
if v < 0 || v > 65535 {
return fmt.Errorf("port must be 0-65535")
if v < 1 || v > 65535 {
return fmt.Errorf("port must be 1-65535")
}
return nil
}),
@@ -525,6 +705,9 @@ func (m *Model) initSiteHuhForm() tea.Cmd {
Placeholder("7").
Value(&m.siteFormData.Threshold).
Validate(func(s string) error {
if !m.siteFormData.CheckSSL {
return nil
}
v, err := strconv.Atoi(s)
if err != nil {
return fmt.Errorf("must be a number")
@@ -538,6 +721,9 @@ func (m *Model) initSiteHuhForm() tea.Cmd {
Placeholder("0").
Value(&m.siteFormData.Retries).
Validate(func(s string) error {
if m.siteFormData.SiteType == "group" {
return nil
}
v, err := strconv.Atoi(s)
if err != nil {
return fmt.Errorf("must be a number")
@@ -552,7 +738,7 @@ func (m *Model) initSiteHuhForm() tea.Cmd {
).Title("Advanced").WithHideFunc(func() bool {
return m.siteFormData.SiteType == "group"
}),
).WithTheme(huh.ThemeDracula())
).WithTheme(m.theme.HuhTheme())
return m.huhForm.Init()
}
@@ -616,14 +802,58 @@ func (m Model) viewDetailPanel() string {
var b strings.Builder
title := titleStyle.Render(fmt.Sprintf(" %s", site.Name))
b.WriteString(title + "\n\n")
var breadcrumb string
if site.ParentID > 0 {
for _, s := range m.sites {
if s.ID == site.ParentID {
breadcrumb = subtleStyle.Render(" Sites > "+s.Name+" > ") + titleStyle.Render(site.Name)
break
}
}
}
if breadcrumb == "" {
breadcrumb = subtleStyle.Render(" Sites > ") + titleStyle.Render(site.Name)
}
b.WriteString(breadcrumb + "\n\n")
row := func(label, value string) {
b.WriteString(fmt.Sprintf(" %-16s %s\n", subtleStyle.Render(label), value))
fmt.Fprintf(&b, " %-16s %s\n", subtleStyle.Render(label), value)
}
row("Status", fmtStatus(site.Status, site.Paused))
section := func(label string) {
b.WriteString("\n" + subtleStyle.Render(" "+label) + "\n")
}
row("Status", fmtStatus(site.Status, site.Paused, m.isMonitorInMaintenance(site.ID)))
if (site.Status == "DOWN" || site.Status == "SSL EXP" || site.Status == "LATE") && site.LastError != "" {
row("Error", dangerStyle.Render(limitStr(site.LastError, 60)))
}
if site.Type == "http" && site.StatusCode > 0 {
row("HTTP Code", strconv.Itoa(site.StatusCode))
}
if !site.StatusChangedAt.IsZero() {
dur := time.Since(site.StatusChangedAt)
row("State Since", site.StatusChangedAt.Format("2006-01-02 15:04:05")+" ("+fmtDuration(dur)+")")
}
if !site.LastSuccessAt.IsZero() {
ago := time.Since(site.LastSuccessAt)
row("Last Success", site.LastSuccessAt.Format("15:04:05")+" ("+fmtDuration(ago)+" ago)")
}
if m.isMonitorInMaintenance(site.ID) {
for _, mw := range m.maintenanceWindows {
if mw.Type == "maintenance" && (mw.MonitorID == 0 || mw.MonitorID == site.ID || mw.MonitorID == site.ParentID) {
row("Maintenance", maintStyle.Render(mw.Title))
break
}
}
}
section("ENDPOINT")
row("Type", site.Type)
if site.URL != "" {
row("URL", site.URL)
@@ -634,20 +864,36 @@ func (m Model) viewDetailPanel() string {
if site.Port > 0 {
row("Port", strconv.Itoa(site.Port))
}
section("TIMING")
row("Interval", fmt.Sprintf("%ds", site.Interval))
if site.Timeout > 0 {
row("Timeout", fmt.Sprintf("%ds", site.Timeout))
}
row("Latency", fmtLatency(site.Latency))
row("Uptime", fmtUptime(hist.Statuses))
if !site.LastCheck.IsZero() {
row("Last Check", site.LastCheck.Format("15:04:05"))
}
if site.Type == "http" {
section("HTTP")
if site.Method != "" && site.Method != "GET" {
row("Method", site.Method)
row("Codes", site.AcceptedCodes)
}
codes := site.AcceptedCodes
if codes == "" {
codes = "200-299"
}
row("Codes", codes)
row("SSL", fmtSSL(site))
if site.IgnoreTLS {
row("TLS Verify", dangerStyle.Render("disabled"))
}
}
if site.MaxRetries > 0 || site.Regions != "" || site.Description != "" {
section("CONFIG")
if site.MaxRetries > 0 {
row("Retries", fmtRetries(site))
}
@@ -657,8 +903,6 @@ func (m Model) viewDetailPanel() string {
if site.Description != "" {
row("Description", site.Description)
}
if !site.LastCheck.IsZero() {
row("Last Check", site.LastCheck.Format("15:04:05"))
}
probeResults := m.engine.GetProbeResults(site.ID)
@@ -671,7 +915,30 @@ func (m Model) viewDetailPanel() string {
}
latency := time.Duration(result.LatencyNs).Milliseconds()
ago := time.Since(result.CheckedAt).Truncate(time.Second)
b.WriteString(fmt.Sprintf(" %-14s %s %dms %s ago\n", nodeID, status, latency, ago))
line := fmt.Sprintf(" %-14s %s %dms %s ago", nodeID, status, latency, ago)
if !result.IsUp && result.ErrorReason != "" {
line += " " + dangerStyle.Render(limitStr(result.ErrorReason, 30))
}
b.WriteString(line + "\n")
}
}
stateChanges := m.engine.GetStateChanges(site.ID, 5)
if len(stateChanges) > 0 {
b.WriteString("\n" + subtleStyle.Render(" STATE CHANGES") + "\n")
for _, sc := range stateChanges {
ago := fmtDuration(time.Since(sc.ChangedAt))
arrow := subtleStyle.Render(sc.FromStatus) + " → "
if sc.ToStatus == "UP" {
arrow += specialStyle.Render(sc.ToStatus)
} else {
arrow += dangerStyle.Render(sc.ToStatus)
}
line := fmt.Sprintf(" %s %s", arrow, subtleStyle.Render(ago+" ago"))
if sc.ErrorReason != "" && sc.ToStatus != "UP" {
line += " " + dangerStyle.Render(limitStr(sc.ErrorReason, 40))
}
b.WriteString(line + "\n")
}
}
@@ -679,8 +946,37 @@ func (m Model) viewDetailPanel() string {
const sparkWidth = 40
if site.Type == "push" {
b.WriteString(" " + heartbeatSparkline(hist.Statuses, sparkWidth))
if len(hist.Statuses) > 0 {
up := 0
for _, s := range hist.Statuses {
if s {
up++
}
}
fmt.Fprintf(&b, "\n %s %d/%d checks up",
subtleStyle.Render("Heartbeats"),
up, len(hist.Statuses))
}
} else {
b.WriteString(" " + latencySparkline(hist.Latencies, sparkWidth))
b.WriteString(" " + latencySparkline(hist.Latencies, hist.Statuses, sparkWidth))
if len(hist.Latencies) > 0 {
minL, maxL := hist.Latencies[0], hist.Latencies[0]
var total time.Duration
for _, l := range hist.Latencies {
total += l
if l < minL {
minL = l
}
if l > maxL {
maxL = l
}
}
avg := total / time.Duration(len(hist.Latencies))
fmt.Fprintf(&b, "\n %s %dms %s %dms %s %dms",
subtleStyle.Render("Min"), minL.Milliseconds(),
subtleStyle.Render("Avg"), avg.Milliseconds(),
subtleStyle.Render("Max"), maxL.Milliseconds())
}
}
b.WriteString("\n\n")
+15 -4
View File
@@ -32,8 +32,19 @@ func (m Model) viewUsersTab() string {
return "\n No users configured. Press [n] to add one."
}
var headers []string
var widths []int
if m.isWide() {
headers = []string{"#", "USERNAME", "ROLE", "PUBLIC KEY"}
widths = []int{4, 18, 10, 50}
} else {
headers = []string{"#", "USER", "ROLE", "KEY"}
widths = []int{4, 14, 8, 30}
}
userW := widths[1]
return m.renderTable(
[]string{"#", "USERNAME", "ROLE", "PUBLIC KEY"},
headers,
len(m.users),
func(start, end int) [][]string {
var rows [][]string
@@ -41,14 +52,14 @@ func (m Model) viewUsersTab() string {
u := m.users[i]
rows = append(rows, []string{
fmt.Sprintf("%d", i+1),
m.zones.Mark(fmt.Sprintf("user-%d", i), limitStr(u.Username, 15)),
m.zones.Mark(fmt.Sprintf("user-%d", i), limitStr(u.Username, userW-2)),
fmtRole(u.Role),
fmtKey(u.PublicKey),
})
}
return rows
},
nil, nil,
widths, nil,
)
}
@@ -94,7 +105,7 @@ func (m *Model) initUserHuhForm() tea.Cmd {
huh.NewOption("Admin", "admin"),
).Value(&m.userFormData.Role),
).Title("SSH Access"),
).WithTheme(huh.ThemeDracula())
).WithTheme(m.theme.HuhTheme())
return m.huhForm.Init()
}
+39 -22
View File
@@ -6,25 +6,21 @@ import (
)
var (
tableHeaderStyle = lipgloss.NewStyle().
Foreground(lipgloss.Color("#7D56F4")).
Bold(true).
Padding(0, 1)
tableCellStyle = lipgloss.NewStyle().Padding(0, 1)
tableSelectedStyle = lipgloss.NewStyle().
Padding(0, 1).
Bold(true).
Foreground(lipgloss.Color("#ffffff")).
Background(lipgloss.Color("#3b3b5c"))
tableBorderStyle = lipgloss.NewStyle().
Foreground(lipgloss.Color("#444"))
tableHeaderStyle lipgloss.Style
tableCellStyle lipgloss.Style
tableSelectedStyle lipgloss.Style
tableBorderStyle lipgloss.Style
tableZebraStyle lipgloss.Style
)
type StyleOverride func(row, col int) *lipgloss.Style
const wideBreakpoint = 120
func (m Model) isWide() bool {
return m.termWidth >= wideBreakpoint
}
func (m Model) renderTable(headers []string, items int, buildRows func(start, end int) [][]string, colWidths []int, styleOverride StyleOverride) string {
if items == 0 {
return ""
@@ -38,7 +34,16 @@ func (m Model) renderTable(headers []string, items int, buildRows func(start, en
selectedVisual := m.cursor - m.tableOffset
rows := buildRows(m.tableOffset, end)
tableWidth := m.termWidth - 6
colTotal := 0
for _, w := range colWidths {
colTotal += w
}
borderOverhead := 2 + len(colWidths) - 1
tableWidth := colTotal + borderOverhead
maxWidth := m.termWidth - chromePadH - 2
if tableWidth > maxWidth {
tableWidth = maxWidth
}
if tableWidth < 40 {
tableWidth = 40
}
@@ -51,22 +56,34 @@ func (m Model) renderTable(headers []string, items int, buildRows func(start, en
Rows(rows...).
StyleFunc(func(row, col int) lipgloss.Style {
if row == table.HeaderRow {
return tableHeaderStyle
h := tableHeaderStyle
if col < len(colWidths) && colWidths[col] > 0 {
h = h.Width(colWidths[col]).MaxWidth(colWidths[col])
}
return h
}
isSelected := row == selectedVisual
if styleOverride != nil {
if s := styleOverride(row, col); s != nil {
if col < len(colWidths) && colWidths[col] > 0 {
return s.Width(colWidths[col])
style := *s
if isSelected {
style = tableSelectedStyle.Foreground(s.GetForeground())
}
return *s
if col < len(colWidths) && colWidths[col] > 0 {
style = style.Width(colWidths[col]).MaxWidth(colWidths[col])
}
return style
}
}
base := tableCellStyle
if row == selectedVisual {
if row%2 == 1 {
base = tableZebraStyle
}
if isSelected {
base = tableSelectedStyle
}
if col < len(colWidths) && colWidths[col] > 0 {
base = base.Width(colWidths[col])
base = base.Width(colWidths[col]).MaxWidth(colWidths[col])
}
return base
})
+191
View File
@@ -0,0 +1,191 @@
package tui
import (
"github.com/charmbracelet/huh"
"github.com/charmbracelet/lipgloss"
)
type Theme struct {
Name string
// Base layers
Bg lipgloss.Color
Surface lipgloss.Color
Panel lipgloss.Color
Border lipgloss.Color
// Text
Fg lipgloss.Color
Muted lipgloss.Color
Subtle lipgloss.Color
// Semantic
Success lipgloss.Color
Warning lipgloss.Color
Danger lipgloss.Color
Info lipgloss.Color
Accent lipgloss.Color
Purple lipgloss.Color
// Table
ZebraBg lipgloss.Color
// Selection
SelectedFg lipgloss.Color
SelectedBg lipgloss.Color
}
var themes = []Theme{
themeFlexokiDark,
themeTokyoNight,
themeCatppuccinMocha,
themeNord,
themeGruvbox,
}
var themeFlexokiDark = Theme{
Name: "Flexoki Dark",
Bg: "#1C1B1A",
Surface: "#282726",
Panel: "#343331",
Border: "#575653",
Fg: "#CECDC3",
Muted: "#878580",
Subtle: "#6F6E69",
Success: "#879A39",
Warning: "#D0A215",
Danger: "#D14D41",
Info: "#4385BE",
Accent: "#3AA99F",
Purple: "#8B7EC8",
ZebraBg: "#222120",
SelectedFg: "#FFFCF0",
SelectedBg: "#403E3C",
}
var themeTokyoNight = Theme{
Name: "Tokyo Night",
Bg: "#1a1b26",
Surface: "#24283b",
Panel: "#292e42",
Border: "#3b4261",
Fg: "#c0caf5",
Muted: "#a9b1d6",
Subtle: "#565f89",
Success: "#9ece6a",
Warning: "#e0af68",
Danger: "#f7768e",
Info: "#7aa2f7",
Accent: "#7dcfff",
Purple: "#bb9af7",
ZebraBg: "#1c1d28",
SelectedFg: "#c0caf5",
SelectedBg: "#292e42",
}
var themeGruvbox = Theme{
Name: "Gruvbox",
Bg: "#282828",
Surface: "#3c3836",
Panel: "#504945",
Border: "#665c54",
Fg: "#ebdbb2",
Muted: "#bdae93",
Subtle: "#7c6f64",
Success: "#b8bb26",
Warning: "#fabd2f",
Danger: "#fb4934",
Info: "#83a598",
Accent: "#8ec07c",
Purple: "#d3869b",
ZebraBg: "#2a2a2a",
SelectedFg: "#fbf1c7",
SelectedBg: "#504945",
}
var themeCatppuccinMocha = Theme{
Name: "Catppuccin Mocha",
Bg: "#1e1e2e",
Surface: "#313244",
Panel: "#45475a",
Border: "#585b70",
Fg: "#cdd6f4",
Muted: "#a6adc8",
Subtle: "#6c7086",
Success: "#a6e3a1",
Warning: "#f9e2af",
Danger: "#f38ba8",
Info: "#89b4fa",
Accent: "#94e2d5",
Purple: "#cba6f7",
ZebraBg: "#232334",
SelectedFg: "#cdd6f4",
SelectedBg: "#45475a",
}
var themeNord = Theme{
Name: "Nord",
Bg: "#2e3440",
Surface: "#3b4252",
Panel: "#434c5e",
Border: "#4c566a",
Fg: "#d8dee9",
Muted: "#d8dee9",
Subtle: "#4c566a",
Success: "#a3be8c",
Warning: "#ebcb8b",
Danger: "#bf616a",
Info: "#81a1c1",
Accent: "#88c0d0",
Purple: "#b48ead",
ZebraBg: "#323845",
SelectedFg: "#eceff4",
SelectedBg: "#434c5e",
}
func (t Theme) HuhTheme() *huh.Theme {
ht := huh.ThemeBase()
ht.Focused.Base = ht.Focused.Base.BorderForeground(t.Border)
ht.Focused.Card = ht.Focused.Base
ht.Focused.Title = ht.Focused.Title.Foreground(t.Accent).Bold(true)
ht.Focused.NoteTitle = ht.Focused.NoteTitle.Foreground(t.Accent).Bold(true).MarginBottom(1)
ht.Focused.Description = ht.Focused.Description.Foreground(t.Muted)
ht.Focused.ErrorIndicator = ht.Focused.ErrorIndicator.Foreground(t.Danger)
ht.Focused.ErrorMessage = ht.Focused.ErrorMessage.Foreground(t.Danger)
ht.Focused.SelectSelector = ht.Focused.SelectSelector.Foreground(t.Purple)
ht.Focused.NextIndicator = ht.Focused.NextIndicator.Foreground(t.Purple)
ht.Focused.PrevIndicator = ht.Focused.PrevIndicator.Foreground(t.Purple)
ht.Focused.Option = ht.Focused.Option.Foreground(t.Fg)
ht.Focused.MultiSelectSelector = ht.Focused.MultiSelectSelector.Foreground(t.Purple)
ht.Focused.SelectedOption = ht.Focused.SelectedOption.Foreground(t.Success)
ht.Focused.SelectedPrefix = lipgloss.NewStyle().Foreground(t.Success).SetString("✓ ")
ht.Focused.UnselectedPrefix = lipgloss.NewStyle().Foreground(t.Subtle).SetString("• ")
ht.Focused.UnselectedOption = ht.Focused.UnselectedOption.Foreground(t.Fg)
ht.Focused.FocusedButton = ht.Focused.FocusedButton.Foreground(t.Bg).Background(t.Accent)
ht.Focused.Next = ht.Focused.FocusedButton
ht.Focused.BlurredButton = ht.Focused.BlurredButton.Foreground(t.Fg).Background(t.Surface)
ht.Focused.TextInput.Cursor = ht.Focused.TextInput.Cursor.Foreground(t.Accent)
ht.Focused.TextInput.Placeholder = ht.Focused.TextInput.Placeholder.Foreground(t.Subtle)
ht.Focused.TextInput.Prompt = ht.Focused.TextInput.Prompt.Foreground(t.Purple)
ht.Blurred = ht.Focused
ht.Blurred.Base = ht.Focused.Base.BorderStyle(lipgloss.HiddenBorder())
ht.Blurred.Card = ht.Blurred.Base
ht.Blurred.NextIndicator = lipgloss.NewStyle()
ht.Blurred.PrevIndicator = lipgloss.NewStyle()
ht.Group.Title = ht.Focused.Title
ht.Group.Description = ht.Focused.Description
return ht
}
func themeByName(name string) Theme {
for _, t := range themes {
if t.Name == name {
return t
}
}
return themes[0]
}
+291 -61
View File
@@ -1,15 +1,17 @@
package tui
import (
"encoding/json"
"fmt"
"go-upkeep/internal/models"
"go-upkeep/internal/monitor"
"go-upkeep/internal/store"
"math"
"sort"
"strings"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
"gitea.lerkolabs.com/lerko/uptop/internal/store"
"github.com/charmbracelet/bubbles/viewport"
tea "github.com/charmbracelet/bubbletea"
"github.com/charmbracelet/harmonica"
@@ -19,18 +21,46 @@ import (
)
var (
subtleStyle = lipgloss.NewStyle().Foreground(lipgloss.AdaptiveColor{Light: "#9ca0b0", Dark: "#565f89"})
specialStyle = lipgloss.NewStyle().Foreground(lipgloss.AdaptiveColor{Light: "#43BF6D", Dark: "#73F59F"})
warnStyle = lipgloss.NewStyle().Foreground(lipgloss.AdaptiveColor{Light: "#F0E442", Dark: "#F0E442"})
dangerStyle = lipgloss.NewStyle().Foreground(lipgloss.AdaptiveColor{Light: "#F25D94", Dark: "#F25D94"})
titleStyle = lipgloss.NewStyle().Foreground(lipgloss.Color("#7D56F4")).Bold(true)
activeTab = lipgloss.NewStyle().Border(lipgloss.NormalBorder(), false, false, true, false).BorderForeground(lipgloss.Color("#7D56F4")).Foreground(lipgloss.Color("#7D56F4")).Bold(true).Padding(0, 1)
inactiveTab = lipgloss.NewStyle().Padding(0, 1).Foreground(lipgloss.AdaptiveColor{Light: "#AAA", Dark: "#555"})
subtleStyle lipgloss.Style
specialStyle lipgloss.Style
warnStyle lipgloss.Style
dangerStyle lipgloss.Style
titleStyle lipgloss.Style
activeTab lipgloss.Style
inactiveTab lipgloss.Style
)
func applyTheme(t Theme) {
subtleStyle = lipgloss.NewStyle().Foreground(t.Subtle)
specialStyle = lipgloss.NewStyle().Foreground(t.Success)
warnStyle = lipgloss.NewStyle().Foreground(t.Warning)
dangerStyle = lipgloss.NewStyle().Foreground(t.Danger)
titleStyle = lipgloss.NewStyle().Foreground(t.Accent).Bold(true)
activeTab = lipgloss.NewStyle().Border(lipgloss.NormalBorder(), false, false, true, false).BorderForeground(t.Accent).Foreground(t.Accent).Bold(true).Padding(0, 1)
inactiveTab = lipgloss.NewStyle().Padding(0, 1).Foreground(t.Muted)
tableHeaderStyle = lipgloss.NewStyle().Foreground(t.Accent).Bold(true).Padding(0, 1)
tableCellStyle = lipgloss.NewStyle().Padding(0, 1)
tableSelectedStyle = lipgloss.NewStyle().Padding(0, 1).Bold(true).Foreground(t.SelectedFg).Background(t.SelectedBg)
tableBorderStyle = lipgloss.NewStyle().Foreground(t.Border)
tableZebraStyle = lipgloss.NewStyle().Padding(0, 1).Background(t.ZebraBg)
siteGroupStyle = lipgloss.NewStyle().Padding(0, 1).Bold(true).Foreground(t.Accent)
maintStyle = lipgloss.NewStyle().Foreground(t.Purple)
}
var pulseFrames = []string{"⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"}
const (
chromePadV = 2 // outer Padding(1,2): 1 top + 1 bottom
chromePadH = 4 // outer Padding(1,2): 2 left + 2 right
chromeHeader = 1 // tab bar line
chromeGaps = 2 // "\n" separators: before content + before footer
chromeFooter = 2 // footer: "\n" prefix + text line
chromeTable = 3 // renderTable "\n" prefix + top border + header + bottom border (lipgloss collapses two into three rendered lines)
chromeBase = chromePadV + chromeHeader + chromeGaps + chromeFooter + chromeTable
)
type sessionState int
const (
@@ -38,10 +68,12 @@ const (
stateLogs
stateUsers
stateDetail
stateAlertDetail
stateFormSite
stateFormAlert
stateFormUser
stateConfirmDelete
stateFormMaint
)
type Model struct {
@@ -59,8 +91,10 @@ type Model struct {
siteFormData *siteFormData
alertFormData *alertFormData
userFormData *userFormData
maintFormData *maintFormData
logViewport viewport.Model
logFilterImportant bool
isAdmin bool
zones *zone.Manager
@@ -71,6 +105,8 @@ type Model struct {
collapsed map[int]bool
store store.Store
engine *monitor.Engine
theme Theme
themeIndex int
// harmonica animation state
pulseSpring harmonica.Spring
@@ -82,6 +118,7 @@ type Model struct {
alerts []models.AlertConfig
users []models.User
nodes []models.ProbeNode
maintenanceWindows []models.MaintenanceWindow
filterMode bool
filterText string
@@ -92,6 +129,20 @@ func InitialModel(isAdmin bool, s store.Store, eng *monitor.Engine) Model {
vpLogs.SetContent("Waiting for logs...")
z := zone.New()
spring := harmonica.NewSpring(harmonica.FPS(10), 6.0, 0.4)
collapsed := loadCollapsed(s)
themeName, _ := s.GetPreference("theme")
theme := themeByName(themeName)
themeIdx := 0
for i, t := range themes {
if t.Name == theme.Name {
themeIdx = i
break
}
}
applyTheme(theme)
return Model{
state: stateDashboard,
logViewport: vpLogs,
@@ -101,10 +152,39 @@ func InitialModel(isAdmin bool, s store.Store, eng *monitor.Engine) Model {
engine: eng,
zones: z,
pulseSpring: spring,
collapsed: make(map[int]bool),
collapsed: collapsed,
theme: theme,
themeIndex: themeIdx,
}
}
func loadCollapsed(s store.Store) map[int]bool {
m := make(map[int]bool)
raw, err := s.GetPreference("collapsed_groups")
if err != nil || raw == "" {
return m
}
var ids []int
if err := json.Unmarshal([]byte(raw), &ids); err != nil {
return m
}
for _, id := range ids {
m[id] = true
}
return m
}
func saveCollapsed(s store.Store, collapsed map[int]bool) {
var ids []int
for id, v := range collapsed {
if v {
ids = append(ids, id)
}
}
data, _ := json.Marshal(ids)
_ = s.SetPreference("collapsed_groups", string(data))
}
func (m Model) Init() tea.Cmd {
return tea.Batch(tea.ClearScreen, tea.Tick(time.Second, func(t time.Time) tea.Msg { return t }))
}
@@ -128,7 +208,12 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
m.engine.AddLog("Delete alert failed: " + err.Error())
}
m.adjustCursor(len(m.alerts) - 1)
case 3:
case 4:
if err := m.store.DeleteMaintenanceWindow(m.deleteID); err != nil {
m.engine.AddLog("Delete maintenance window failed: " + err.Error())
}
m.adjustCursor(len(m.maintenanceWindows) - 1)
case 5:
if err := m.store.DeleteUser(m.deleteID); err != nil {
m.engine.AddLog("Delete user failed: " + err.Error())
}
@@ -136,12 +221,12 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
}
m.refreshData()
m.state = stateDashboard
if m.deleteTab == 4 {
if m.deleteTab == 5 {
m.state = stateUsers
}
case "n", "N", "esc":
m.state = stateDashboard
if m.deleteTab == 4 {
if m.deleteTab == 5 {
m.state = stateUsers
}
case "ctrl+c":
@@ -152,7 +237,11 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
}
// Form state: forward ALL messages to huh (keys, timers, resize, etc.)
if m.state == stateFormSite || m.state == stateFormAlert || m.state == stateFormUser {
if m.state == stateFormSite || m.state == stateFormAlert || m.state == stateFormUser || m.state == stateFormMaint {
if wsm, ok := msg.(tea.WindowSizeMsg); ok {
m.termWidth = wsm.Width
m.termHeight = wsm.Height
}
if keyMsg, ok := msg.(tea.KeyMsg); ok {
if keyMsg.String() == "ctrl+c" {
return m, tea.Quit
@@ -160,7 +249,7 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
if keyMsg.String() == "esc" {
m.huhForm = nil
m.state = stateDashboard
if m.currentTab == 4 {
if m.currentTab == 5 {
m.state = stateUsers
}
return m, nil
@@ -186,12 +275,16 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
case tea.WindowSizeMsg:
m.termWidth = msg.Width
m.termHeight = msg.Height
m.maxTableRows = msg.Height - 12
chrome := chromeBase
if m.filterMode || m.filterText != "" {
chrome++
}
m.maxTableRows = msg.Height - chrome
if m.maxTableRows < 1 {
m.maxTableRows = 1
}
m.logViewport.Width = msg.Width
m.logViewport.Height = msg.Height - 6
m.logViewport.Width = msg.Width - chromePadH
m.logViewport.Height = msg.Height - (chromePadV + chromeHeader + chromeGaps + chromeFooter)
return m, tea.ClearScreen
case time.Time:
@@ -209,18 +302,21 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
if msg.Button == tea.MouseButtonWheelUp || msg.Button == tea.MouseButtonWheelDown {
if m.state == stateLogs {
if msg.Button == tea.MouseButtonWheelUp {
m.logViewport.LineUp(3)
m.logViewport.ScrollUp(3)
} else {
m.logViewport.LineDown(3)
m.logViewport.ScrollDown(3)
}
return m, nil
}
listLen := len(m.sites)
if m.currentTab == 1 {
switch m.currentTab {
case 1:
listLen = len(m.alerts)
} else if m.currentTab == 3 {
case 3:
listLen = len(m.nodes)
} else if m.currentTab == 4 {
case 4:
listLen = len(m.maintenanceWindows)
case 5:
listLen = len(m.users)
}
if msg.Button == tea.MouseButtonWheelUp {
@@ -289,6 +385,14 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
return m, tea.Quit
}
return m, nil
case stateAlertDetail:
switch msg.String() {
case "i", "esc":
m.state = stateDashboard
case "q":
return m, tea.Quit
}
return m, nil
case stateDashboard, stateLogs, stateUsers:
switch msg.String() {
case "q":
@@ -298,6 +402,11 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
m.filterMode = true
return m, nil
}
case "f":
if m.state == stateLogs {
m.logFilterImportant = !m.logFilterImportant
return m, nil
}
case "tab":
m.switchTab(m.currentTab + 1)
case "pgup", "pgdown":
@@ -307,7 +416,7 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
}
case "up", "k":
if m.state == stateLogs {
m.logViewport.LineUp(1)
m.logViewport.ScrollUp(1)
} else if m.cursor > 0 {
m.cursor--
if m.cursor < m.tableOffset {
@@ -316,7 +425,7 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
}
case "down", "j":
if m.state == stateLogs {
m.logViewport.LineDown(1)
m.logViewport.ScrollDown(1)
} else {
max := len(m.sites) - 1
if m.currentTab == 1 {
@@ -326,6 +435,9 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
max = len(m.nodes) - 1
}
if m.currentTab == 4 {
max = len(m.maintenanceWindows) - 1
}
if m.currentTab == 5 {
max = len(m.users) - 1
}
if m.cursor < max {
@@ -344,7 +456,10 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
} else if m.currentTab == 1 {
m.state = stateFormAlert
return m, m.initAlertHuhForm()
} else if m.currentTab == 4 && m.isAdmin {
} else if m.currentTab == 4 {
m.state = stateFormMaint
return m, m.initMaintHuhForm()
} else if m.currentTab == 5 && m.isAdmin {
m.state = stateFormUser
return m, m.initUserHuhForm()
}
@@ -358,15 +473,26 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
m.editID = m.alerts[m.cursor].ID
m.state = stateFormAlert
return m, m.initAlertHuhForm()
} else if m.currentTab == 4 && m.isAdmin && len(m.users) > 0 {
} else if m.currentTab == 5 && m.isAdmin && len(m.users) > 0 {
m.editID = m.users[m.cursor].ID
m.state = stateFormUser
return m, m.initUserHuhForm()
}
case "t":
if m.currentTab == 1 && len(m.alerts) > 0 {
a := m.alerts[m.cursor]
go func() {
if err := m.engine.TestAlert(a.ID); err != nil {
m.engine.AddLog(fmt.Sprintf("Test alert failed (%s): %v", a.Name, err))
}
}()
return m, nil
}
case " ":
if m.currentTab == 0 && len(m.sites) > 0 && m.sites[m.cursor].Type == "group" {
gid := m.sites[m.cursor].ID
m.collapsed[gid] = !m.collapsed[gid]
saveCollapsed(m.store, m.collapsed)
m.refreshData()
}
case "p":
@@ -380,7 +506,26 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
case "i":
if m.currentTab == 0 && len(m.sites) > 0 {
m.state = stateDetail
} else if m.currentTab == 1 && len(m.alerts) > 0 {
m.state = stateAlertDetail
}
case "x":
if m.currentTab == 4 && len(m.maintenanceWindows) > 0 {
mw := m.maintenanceWindows[m.cursor]
now := time.Now()
isActive := !mw.StartTime.After(now) && (mw.EndTime.IsZero() || mw.EndTime.After(now))
if isActive {
if err := m.store.EndMaintenanceWindow(mw.ID); err != nil {
m.engine.AddLog("End maintenance failed: " + err.Error())
}
m.refreshData()
}
}
case "T":
m.themeIndex = (m.themeIndex + 1) % len(themes)
m.theme = themes[m.themeIndex]
applyTheme(m.theme)
_ = m.store.SetPreference("theme", m.theme.Name)
case "d", "backspace":
if m.currentTab == 0 && len(m.sites) > 0 {
m.deleteID = m.sites[m.cursor].ID
@@ -392,10 +537,15 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
m.deleteName = m.alerts[m.cursor].Name
m.deleteTab = 1
m.state = stateConfirmDelete
} else if m.currentTab == 4 && m.isAdmin && len(m.users) > 0 {
} else if m.currentTab == 4 && len(m.maintenanceWindows) > 0 {
m.deleteID = m.maintenanceWindows[m.cursor].ID
m.deleteName = m.maintenanceWindows[m.cursor].Title
m.deleteTab = 4
m.state = stateConfirmDelete
} else if m.currentTab == 5 && m.isAdmin && len(m.users) > 0 {
m.deleteID = m.users[m.cursor].ID
m.deleteName = m.users[m.cursor].Username
m.deleteTab = 4
m.deleteTab = 5
m.state = stateConfirmDelete
}
}
@@ -405,9 +555,9 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
}
func (m *Model) handleClick(msg tea.MouseMsg) (tea.Model, tea.Cmd) {
tabCount := 4
tabCount := 5
if m.isAdmin {
tabCount = 5
tabCount = 6
}
for i := 0; i < tabCount; i++ {
if m.zones.Get(fmt.Sprintf("tab-%d", i)).InBounds(msg) {
@@ -443,6 +593,19 @@ func (m *Model) handleClick(msg tea.MouseMsg) (tea.Model, tea.Cmd) {
}
if m.currentTab == 4 {
end := m.tableOffset + m.maxTableRows
if end > len(m.maintenanceWindows) {
end = len(m.maintenanceWindows)
}
for i := m.tableOffset; i < end; i++ {
if m.zones.Get(fmt.Sprintf("maint-%d", i)).InBounds(msg) {
m.cursor = i
return m, nil
}
}
}
if m.currentTab == 5 {
end := m.tableOffset + m.maxTableRows
if end > len(m.users) {
end = len(m.users)
@@ -459,9 +622,9 @@ func (m *Model) handleClick(msg tea.MouseMsg) (tea.Model, tea.Cmd) {
}
func (m *Model) switchTab(idx int) {
maxTabs := 3
maxTabs := 4
if m.isAdmin {
maxTabs = 4
maxTabs = 5
}
if idx > maxTabs {
idx = 0
@@ -472,7 +635,7 @@ func (m *Model) switchTab(idx int) {
switch idx {
case 2:
m.state = stateLogs
case 4:
case 5:
m.state = stateUsers
default:
m.state = stateDashboard
@@ -545,14 +708,20 @@ func (m *Model) refreshData() {
if nodes, err := m.store.GetAllNodes(); err == nil {
m.nodes = nodes
}
if windows, err := m.store.GetAllMaintenanceWindows(100); err == nil {
m.maintenanceWindows = windows
}
m.logViewport.SetContent(strings.Join(m.engine.GetLogs(), "\n"))
listLen := len(m.sites)
if m.currentTab == 1 {
switch m.currentTab {
case 1:
listLen = len(m.alerts)
} else if m.currentTab == 3 {
case 3:
listLen = len(m.nodes)
} else if m.currentTab == 4 {
case 4:
listLen = len(m.maintenanceWindows)
case 5:
listLen = len(m.users)
}
if listLen > 0 && m.cursor >= listLen {
@@ -577,6 +746,10 @@ func (m *Model) submitForm() {
if m.userFormData != nil {
m.submitUserForm()
}
case stateFormMaint:
if m.maintFormData != nil {
m.submitMaintForm()
}
}
}
@@ -588,7 +761,7 @@ func (m Model) pulseIndicator() string {
}
hasDown := false
for _, s := range m.sites {
if !s.Paused && (s.Status == "DOWN" || s.Status == "SSL EXP") {
if !s.Paused && !m.isMonitorInMaintenance(s.ID) && (s.Status == "DOWN" || s.Status == "SSL EXP") {
hasDown = true
break
}
@@ -606,20 +779,23 @@ func (m Model) View() string {
switch m.state {
case stateConfirmDelete:
kind := "monitor"
if m.deleteTab == 1 {
switch m.deleteTab {
case 1:
kind = "alert"
} else if m.deleteTab == 4 {
case 4:
kind = "maintenance window"
case 5:
kind = "user"
}
msg := dangerStyle.Render(fmt.Sprintf("Delete %s \"%s\"?", kind, m.deleteName))
hint := subtleStyle.Render("[y] Confirm [n] Cancel")
box := lipgloss.NewStyle().
Border(lipgloss.RoundedBorder()).
BorderForeground(lipgloss.Color("#F25D94")).
BorderForeground(m.theme.Danger).
Padding(1, 3).
Render(msg + "\n\n" + hint)
return lipgloss.NewStyle().Padding(2, 4).Render(box)
case stateFormSite, stateFormAlert, stateFormUser:
case stateFormSite, stateFormAlert, stateFormUser, stateFormMaint:
if m.huhForm != nil {
title := ""
switch m.state {
@@ -638,7 +814,14 @@ func (m Model) View() string {
if m.editID > 0 {
title = fmt.Sprintf("Edit User #%d", m.editID)
}
case stateFormMaint:
title = "New Maintenance Window"
}
formHeight := m.termHeight - 7
if formHeight < 5 {
formHeight = 5
}
m.huhForm.WithHeight(formHeight)
header := titleStyle.Render(title)
footer := subtleStyle.Render("\n[Esc] Cancel")
return lipgloss.NewStyle().Padding(1, 2).Render(header + "\n\n" + m.huhForm.View() + "\n" + footer)
@@ -646,16 +829,31 @@ func (m Model) View() string {
return ""
case stateDetail:
return m.viewDetailPanel()
case stateAlertDetail:
return m.viewAlertDetailPanel()
default:
return m.zones.Scan(m.viewDashboard())
}
}
func (m Model) viewDashboard() string {
allSites := m.engine.GetAllSites()
totalMonitors := 0
downCount := 0
for _, s := range m.sites {
if !s.Paused && (s.Status == "DOWN" || s.Status == "SSL EXP") {
lateCount := 0
for _, s := range allSites {
if s.Type == "group" {
continue
}
totalMonitors++
if s.Paused || m.isMonitorInMaintenance(s.ID) {
continue
}
switch s.Status {
case "DOWN", "SSL EXP":
downCount++
case "LATE":
lateCount++
}
}
offlineNodes := 0
@@ -668,8 +866,10 @@ func (m Model) viewDashboard() string {
var sitesLabel string
if downCount > 0 {
sitesLabel = fmt.Sprintf("Sites (%d↓)", downCount)
} else if len(m.sites) > 0 {
sitesLabel = fmt.Sprintf("Sites (%d)", len(m.sites))
} else if lateCount > 0 {
sitesLabel = fmt.Sprintf("Sites (%d)", lateCount)
} else if totalMonitors > 0 {
sitesLabel = fmt.Sprintf("Sites (%d)", totalMonitors)
} else {
sitesLabel = "Sites"
}
@@ -682,7 +882,21 @@ func (m Model) viewDashboard() string {
nodesLabel = "Nodes"
}
tabs := []string{sitesLabel, "Alerts", "Logs", nodesLabel}
activeMaint := 0
for _, mw := range m.maintenanceWindows {
now := time.Now()
if !mw.StartTime.After(now) && (mw.EndTime.IsZero() || mw.EndTime.After(now)) {
activeMaint++
}
}
var maintLabel string
if activeMaint > 0 {
maintLabel = fmt.Sprintf("Maint (%d)", activeMaint)
} else {
maintLabel = "Maint"
}
tabs := []string{sitesLabel, "Alerts", "Logs", nodesLabel, maintLabel}
if m.isAdmin {
tabs = append(tabs, "Users")
}
@@ -712,19 +926,26 @@ func (m Model) viewDashboard() string {
case 3:
content = m.viewNodesTab()
case 4:
content = m.viewMaintTab()
case 5:
if m.isAdmin {
content = m.viewUsersTab()
}
}
upCount := len(m.sites) - downCount
upCount := totalMonitors - downCount - lateCount
var upStr string
if downCount > 0 {
upStr = dangerStyle.Render(fmt.Sprintf("%d/%d UP", upCount, len(m.sites)))
upStr = dangerStyle.Render(fmt.Sprintf("%d/%d UP", upCount, totalMonitors))
} else if lateCount > 0 {
upStr = warnStyle.Render(fmt.Sprintf("%d/%d UP", upCount, totalMonitors))
} else {
upStr = specialStyle.Render(fmt.Sprintf("%d/%d UP", upCount, len(m.sites)))
upStr = specialStyle.Render(fmt.Sprintf("%d/%d UP", upCount, totalMonitors))
}
statusParts := []string{upStr}
if lateCount > 0 {
statusParts = append(statusParts, warnStyle.Render(fmt.Sprintf("%d LATE", lateCount)))
}
if len(m.nodes) > 0 {
online := 0
for _, n := range m.nodes {
@@ -738,17 +959,23 @@ func (m Model) viewDashboard() string {
var footer string
if m.filterMode {
cursor := lipgloss.NewStyle().Foreground(lipgloss.Color("#7D56F4")).Render("│")
cursor := lipgloss.NewStyle().Foreground(m.theme.Accent).Render("│")
footer = "\n" + titleStyle.Render("/") + " " + m.filterText + cursor + " " + subtleStyle.Render("[Enter]Apply [Esc]Clear")
} else {
var keys string
switch m.currentTab {
case 0:
keys = "[/]Filter [n]New [e]Edit [i]Info [d]Del [p]Pause [Tab]Switch [q]Quit"
keys = "[/]Filter [n]New [e]Edit [i]Info [d]Del [p]Pause [T]Theme [Tab]Switch [q]Quit"
case 1:
keys = "[n]New [e]Edit [i]Info [d]Del [t]Test [T]Theme [Tab]Switch [q]Quit"
case 2:
keys = "[f]Filter [T]Theme [Tab]Switch [q]Quit"
case 4:
keys = "[n]Add [d]Revoke [Tab]Switch [q]Quit"
keys = "[n]New [x]End [d]Del [T]Theme [Tab]Switch [q]Quit"
case 5:
keys = "[n]Add [d]Revoke [T]Theme [Tab]Switch [q]Quit"
default:
keys = "[Tab]Switch [q]Quit"
keys = "[T]Theme [Tab]Switch [q]Quit"
}
footer = "\n" + statusLine + " " + subtleStyle.Render(keys)
if m.filterText != "" && m.currentTab == 0 {
@@ -769,16 +996,19 @@ func siteOrder(s models.Site) int {
switch s.Status {
case "DOWN", "SSL EXP":
return 0
case "PENDING":
return 2
default:
case "LATE":
return 1
case "PENDING":
return 3
default:
return 2
}
}
func limitStr(text string, max int) string {
if len(text) > max {
return text[:max-3] + "..."
runes := []rune(text)
if len(runes) > max {
return string(runes[:max-3]) + "..."
}
return text
}
+274
View File
@@ -0,0 +1,274 @@
package main
import (
"database/sql"
"fmt"
"math/rand"
"os"
"time"
_ "github.com/mattn/go-sqlite3"
)
func main() {
if len(os.Args) < 2 {
fmt.Fprintln(os.Stderr, "usage: backfill <db-path>")
os.Exit(1)
}
db, err := sql.Open("sqlite3", os.Args[1])
if err != nil {
fmt.Fprintf(os.Stderr, "open: %v\n", err)
os.Exit(1)
}
defer db.Close()
ids, err := loadSiteIDs(db)
if err != nil {
fmt.Fprintf(os.Stderr, "load site IDs: %v\n", err)
os.Exit(1)
}
rng := rand.New(rand.NewSource(42))
now := time.Now().UTC()
if err := backfillHistory(db, rng, now, ids); err != nil {
fmt.Fprintf(os.Stderr, "history: %v\n", err)
os.Exit(1)
}
if err := backfillStateChanges(db, now, ids); err != nil {
fmt.Fprintf(os.Stderr, "state changes: %v\n", err)
os.Exit(1)
}
if err := backfillLogs(db, now); err != nil {
fmt.Fprintf(os.Stderr, "logs: %v\n", err)
os.Exit(1)
}
if err := backfillNodes(db, now); err != nil {
fmt.Fprintf(os.Stderr, "nodes: %v\n", err)
os.Exit(1)
}
if err := backfillMaintenance(db, now, ids); err != nil {
fmt.Fprintf(os.Stderr, "maintenance: %v\n", err)
os.Exit(1)
}
var count int
db.QueryRow("SELECT COUNT(*) FROM check_history").Scan(&count)
fmt.Printf("Backfill complete: %d check records\n", count)
var token string
if err := db.QueryRow("SELECT token FROM sites WHERE name='Nightly Backup'").Scan(&token); err == nil {
fmt.Printf("PUSH_TOKEN=%s\n", token)
}
}
func loadSiteIDs(db *sql.DB) (map[string]int, error) {
rows, err := db.Query("SELECT id, name FROM sites")
if err != nil {
return nil, err
}
defer rows.Close()
ids := make(map[string]int)
for rows.Next() {
var id int
var name string
if err := rows.Scan(&id, &name); err != nil {
return nil, err
}
ids[name] = id
}
return ids, rows.Err()
}
type monitorProfile struct {
name string
minMs int
maxMs int
downFrom int // check index where DOWN starts (-1 = never)
}
func backfillHistory(db *sql.DB, rng *rand.Rand, now time.Time, ids map[string]int) error {
profiles := []monitorProfile{
{"Nextcloud", 40, 80, -1},
{"Jellyfin", 80, 200, -1},
{"Home Assistant", 15, 45, -1},
{"Gitea", 40, 90, -1},
{"Traefik Dashboard", 5, 25, -1},
{"Vaultwarden", 50, 130, -1},
{"Personal Blog", 25, 65, -1},
{"Immich", 100, 280, -1}, // spikes handled below
{"Auth Portal", 30, 70, 40}, // DOWN after check 40
{"Edge Router", 5, 15, -1}, // ping
{"Postgres", 1, 5, -1}, // port
{"DNS Primary", 10, 30, -1},
}
tx, err := db.Begin()
if err != nil {
return err
}
defer tx.Rollback()
stmt, err := tx.Prepare("INSERT INTO check_history (site_id, latency_ns, is_up, checked_at) VALUES (?, ?, ?, ?)")
if err != nil {
return err
}
defer stmt.Close()
const total = 60
for _, p := range profiles {
siteID, ok := ids[p.name]
if !ok {
continue
}
for i := 0; i < total; i++ {
minutesAgo := (total - i) * 24
checkedAt := now.Add(-time.Duration(minutesAgo) * time.Minute)
var latencyNs int64
isUp := true
if p.downFrom >= 0 && i >= p.downFrom {
latencyNs = 0
isUp = false
} else {
ms := p.minMs + rng.Intn(p.maxMs-p.minMs)
if p.name == "Immich" && i%17 == 0 {
ms = 250 + rng.Intn(100)
}
latencyNs = int64(ms) * 1_000_000
}
if _, err := stmt.Exec(siteID, latencyNs, isUp, checkedAt.Format("2006-01-02 15:04:05")); err != nil {
return err
}
}
}
return tx.Commit()
}
func backfillStateChanges(db *sql.DB, now time.Time, ids map[string]int) error {
type sc struct {
name string
from string
to string
reason string
at time.Time
}
changes := []sc{
{"Nextcloud", "UP", "DOWN", "read timeout", now.Add(-3 * 24 * time.Hour).Add(-5 * time.Minute)},
{"Nextcloud", "DOWN", "UP", "", now.Add(-3 * 24 * time.Hour)},
{"Jellyfin", "UP", "DOWN", "connection reset", now.Add(-18 * time.Hour).Add(-3 * time.Minute)},
{"Jellyfin", "DOWN", "UP", "", now.Add(-18 * time.Hour)},
{"Auth Portal", "UP", "DOWN", "connection refused", now.Add(-8 * time.Hour)},
{"Immich", "UP", "DOWN", "502 Bad Gateway", now.Add(-12 * time.Hour).Add(-8 * time.Minute)},
{"Immich", "DOWN", "UP", "", now.Add(-12 * time.Hour)},
}
tx, err := db.Begin()
if err != nil {
return err
}
defer tx.Rollback()
stmt, err := tx.Prepare("INSERT INTO state_changes (site_id, from_status, to_status, error_reason, changed_at) VALUES (?, ?, ?, ?, ?)")
if err != nil {
return err
}
defer stmt.Close()
for _, c := range changes {
siteID, ok := ids[c.name]
if !ok {
continue
}
if _, err := stmt.Exec(siteID, c.from, c.to, c.reason, c.at.Format("2006-01-02 15:04:05")); err != nil {
return err
}
}
return tx.Commit()
}
func backfillLogs(db *sql.DB, now time.Time) error {
type logEntry struct {
msg string
at time.Time
}
logs := []logEntry{
{"[06:12] Monitor 'Auth Portal' confirmed DOWN: connection refused", now.Add(-8 * time.Hour)},
{"[06:12] Monitor 'Auth Portal' failed check 2/2", now.Add(-8*time.Hour - 30*time.Second)},
{"[06:11] Monitor 'Auth Portal' failed check 1/2", now.Add(-8*time.Hour - 60*time.Second)},
{"[12:33] Monitor 'Immich' recovered (was down 8m)", now.Add(-12 * time.Hour)},
{"[12:25] Monitor 'Immich' confirmed DOWN: 502 Bad Gateway", now.Add(-12*time.Hour - 8*time.Minute)},
{"[12:25] Monitor 'Immich' failed check 3/3", now.Add(-12*time.Hour - 8*time.Minute - 30*time.Second)},
{"[12:25] Monitor 'Immich' failed check 2/3", now.Add(-12*time.Hour - 8*time.Minute - 60*time.Second)},
{"[12:24] Monitor 'Immich' failed check 1/3", now.Add(-12*time.Hour - 9*time.Minute)},
{"[06:14] Monitor 'Jellyfin' recovered (was down 3m)", now.Add(-18 * time.Hour)},
{"[06:11] Monitor 'Jellyfin' confirmed DOWN: connection reset", now.Add(-18*time.Hour - 3*time.Minute)},
{"[06:11] Monitor 'Jellyfin' failed check 2/2", now.Add(-18*time.Hour - 3*time.Minute - 30*time.Second)},
{"[06:10] Monitor 'Jellyfin' failed check 1/2", now.Add(-18*time.Hour - 4*time.Minute)},
{"[23:45] SSL certificate for 'Personal Blog' expires in 42 days", now.Add(-28 * time.Hour)},
{"[08:00] Loaded check history from database", now.Add(-32*time.Hour - 30*time.Minute)},
{"[08:00] Engine RESUMED (Active)", now.Add(-32*time.Hour - 30*time.Minute - 5*time.Second)},
}
tx, err := db.Begin()
if err != nil {
return err
}
defer tx.Rollback()
stmt, err := tx.Prepare("INSERT INTO logs (message, created_at) VALUES (?, ?)")
if err != nil {
return err
}
defer stmt.Close()
for _, l := range logs {
if _, err := stmt.Exec(l.msg, l.at.Format("2006-01-02 15:04:05")); err != nil {
return err
}
}
return tx.Commit()
}
func backfillNodes(db *sql.DB, now time.Time) error {
_, err := db.Exec(
"INSERT OR REPLACE INTO nodes (id, name, region, last_seen, version) VALUES (?, ?, ?, ?, ?)",
"node-1", "leader", "us-east", now.Format("2006-01-02 15:04:05"), "2026.05.1",
)
return err
}
func backfillMaintenance(db *sql.DB, now time.Time, ids map[string]int) error {
tx, err := db.Begin()
if err != nil {
return err
}
defer tx.Rollback()
stmt, err := tx.Prepare("INSERT INTO maintenance_windows (monitor_id, title, description, type, start_time, end_time, created_by) VALUES (?, ?, ?, ?, ?, ?, ?)")
if err != nil {
return err
}
defer stmt.Close()
jellyfinID := ids["Jellyfin"]
past := now.Add(-3 * 24 * time.Hour)
if _, err := stmt.Exec(jellyfinID, "Jellyfin upgrade", "Upgrade to v10.10 + plugin updates", "maintenance",
past.Format("2006-01-02 15:04:05"),
past.Add(2*time.Hour).Format("2006-01-02 15:04:05"),
"admin"); err != nil {
return err
}
future := now.Add(2 * 24 * time.Hour)
if _, err := stmt.Exec(0, "Network switch replacement", "Replacing core switch in rack 2", "maintenance",
future.Format("2006-01-02 15:04:05"),
future.Add(4*time.Hour).Format("2006-01-02 15:04:05"),
"admin"); err != nil {
return err
}
return tx.Commit()
}
+54
View File
@@ -0,0 +1,54 @@
Set Shell "bash"
Set Width 1400
Set Height 800
Set FontSize 14
Set Padding 20
Set Framerate 15
Set TypingSpeed 50ms
Hide
Type "bash vhs/setup.sh /tmp/uptop-vhs.db"
Enter
Sleep 45s
Show
Sleep 5s
# Sites tab — hero shot with mixed monitor states
Screenshot vhs/screenshots/monitors.png
Sleep 1s
# Navigate to Nextcloud (row 6: group + 3 children + Auth Portal)
Down
Sleep 200ms
Down
Sleep 200ms
Down
Sleep 200ms
Down
Sleep 200ms
Down
Sleep 200ms
Type "i"
Sleep 3s
Screenshot vhs/screenshots/detail.png
Sleep 1s
# Close detail
Escape
Sleep 1s
# Tab to Alerts
Tab
Sleep 2s
Screenshot vhs/screenshots/alerts.png
Sleep 1s
# Tab to Logs
Tab
Sleep 2s
Screenshot vhs/screenshots/logs.png
Sleep 1s
# Quit
Type "q"
Sleep 1s
Binary file not shown.

After

Width:  |  Height:  |  Size: 84 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 80 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 160 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 219 KiB

+141
View File
@@ -0,0 +1,141 @@
alerts:
- name: Discord Homelab
type: discord
settings:
url: https://discord.com/api/webhooks/1234567890/demo-token
- name: Ntfy Alerts
type: webhook
settings:
url: https://ntfy.example.com/homelab-alerts
- name: Email Oncall
type: email
settings:
host: smtp.example.com
port: "587"
user: alerts@example.com
pass: "••••••••"
from: alerts@example.com
to: oncall@example.com
- name: Slack Ops
type: slack
settings:
url: https://hooks.slack.com/services/T00000/B00000/demo-token
monitors:
# HTTP — homelab services
- name: Nextcloud
type: http
url: https://example.com
interval: 30
alert: Discord Homelab
check_ssl: true
expiry_threshold: 14
max_retries: 2
- name: Jellyfin
type: http
url: https://example.com
interval: 30
alert: Discord Homelab
max_retries: 2
- name: Home Assistant
type: http
url: https://example.com
interval: 30
alert: Discord Homelab
max_retries: 3
- name: Gitea
type: http
url: https://example.com
interval: 60
alert: Discord Homelab
check_ssl: true
expiry_threshold: 14
max_retries: 2
- name: Traefik Dashboard
type: http
url: https://example.com
interval: 60
alert: Discord Homelab
max_retries: 1
- name: Vaultwarden
type: http
url: https://example.com
interval: 30
alert: Discord Homelab
check_ssl: true
expiry_threshold: 14
max_retries: 3
- name: Personal Blog
type: http
url: https://example.com
interval: 120
alert: Discord Homelab
check_ssl: true
expiry_threshold: 14
max_retries: 2
- name: Immich
type: http
url: https://example.com
interval: 60
alert: Discord Homelab
check_ssl: true
expiry_threshold: 7
max_retries: 3
# HTTP — deliberate failure
- name: Auth Portal
type: http
url: http://localhost:1
interval: 30
alert: Discord Homelab
max_retries: 2
# Push — cron jobs
- name: Nightly Backup
type: push
interval: 300
alert: Discord Homelab
- name: Cert Renewal
type: push
interval: 300
alert: Discord Homelab
# Infrastructure group
- name: Infrastructure
type: group
alert: Discord Homelab
monitors:
- name: Edge Router
type: ping
hostname: 8.8.8.8
interval: 30
alert: Discord Homelab
timeout: 5
- name: Postgres
type: port
hostname: localhost
port: 18099
interval: 60
alert: Discord Homelab
timeout: 5
- name: DNS Primary
type: dns
hostname: google.com
dns_server: 8.8.8.8
dns_resolve_type: A
interval: 60
alert: Discord Homelab
timeout: 5
Executable
+27
View File
@@ -0,0 +1,27 @@
#!/bin/bash
# VHS screenshot setup: seed monitors, backfill history, start server.
set -e
DB="${1:?usage: setup.sh <db-path>}"
rm -f "$DB" "$DB-shm" "$DB-wal"
echo "==> Seeding monitors and alerts..."
UPTOP_DB_DSN="$DB" ./uptop apply -f vhs/seed.yaml 2>&1
echo "==> Backfilling check history..."
BACKFILL_OUT=$(go run ./vhs/backfill/ "$DB")
echo "$BACKFILL_OUT"
PUSH_TOKEN=$(echo "$BACKFILL_OUT" | grep '^PUSH_TOKEN=' | cut -d= -f2)
if [ -n "$PUSH_TOKEN" ]; then
echo "==> Sending push heartbeat in 15s (background)..."
(sleep 15 && curl -s "http://localhost:18099/api/push" -H "Authorization: Bearer $PUSH_TOKEN" > /dev/null 2>&1) &
fi
echo "==> Starting uptop server..."
exec env \
UPTOP_DB_DSN="$DB" \
UPTOP_PORT=23299 \
UPTOP_HTTP_PORT=18099 \
UPTOP_ALLOW_PRIVATE_TARGETS=true \
./uptop serve 2>/dev/null