25 Commits

Author SHA1 Message Date
lerko f2d663ea76 chore: add TUI screenshots via VHS with realistic seed data
CI / test (pull_request) Successful in 2m42s
CI / lint (pull_request) Failing after 1m11s
CI / vulncheck (pull_request) Successful in 56s
Screenshots capture 4 views: monitors dashboard (hero), detail panel,
alerts tab, and logs tab. Seed data uses homelab-themed monitors with
a SQL backfill for rich sparkline history, state changes, and log
entries.

Also fixes latencySparkline to color DOWN checks red instead of green
— previously failed checks with 0ms latency rendered as green bars.
2026-05-28 18:26:51 -04:00
lerko cfbf01274d chore(tui): visual polish — detail sections, column headers, alert detail (#37)
CI / test (push) Successful in 2m40s
CI / lint (push) Successful in 1m2s
CI / vulncheck (push) Successful in 51s
Release / docker (push) Has been cancelled
Release / release (push) Has been cancelled
## Summary

Bundled remaining UX polish items from the screenshot review.

### Changes

**Detail panel sections (#5)**
- Fields grouped into ENDPOINT, TIMING, HTTP, CONFIG sections with subtle headers
- Matches existing PROBE RESULTS and STATE CHANGES section pattern
- Cleaner visual hierarchy without box-drawing clutter

**Omit unconfigured fields (#6)**
- Timeout hidden when 0 (unconfigured)
- Method hidden when default GET
- AcceptedCodes shows "200-299" explicitly when empty

**Column header (#7)**
- `LATENCY` → `LAT` (design short, never truncate — htop/btop pattern)

**Alert detail view (#8)**
- `i` key on Alerts tab opens full detail panel
- Shows: type, health status, last sent time, send/fail counts, last error
- Full config key:value pairs (untruncated)
- Keybinding: `[i/Esc] Back  [e] Edit  [t] Test  [q] Quit`

### Files (3)
- `internal/tui/tab_sites.go` — section headers, field omission, LAT header
- `internal/tui/tab_alerts.go` — viewAlertDetailPanel()
- `internal/tui/tui.go` — stateAlertDetail, key handler, render routing

Reviewed-on: lerko/uptop#37
2026-05-28 20:40:29 +00:00
lerko 26e297cbae Merge pull request 'feat: alert channel health indicator + test alerts' (#36) from feat/alert-health into main
CI / test (push) Successful in 2m48s
CI / lint (push) Successful in 1m17s
CI / vulncheck (push) Successful in 1m6s
Reviewed-on: lerko/uptop#36
2026-05-28 01:33:00 +00:00
lerko 0aa2f9cd8a feat: alert channel health indicator + test alerts
CI / test (pull_request) Successful in 2m46s
CI / lint (pull_request) Successful in 1m1s
CI / vulncheck (pull_request) Successful in 51s
Track alert delivery health at runtime:
- AlertHealth struct: LastSendAt, LastSendOK, LastError, SendCount, FailCount
- triggerAlert records success/failure after each Send()
- Health data exposed via GetAlertHealth() for TUI

Alerts tab enriched:
- Health dot column: green (OK), red (failed), gray (never sent)
- LAST SENT column: relative time ("2m ago", "never")
- [t] key sends test notification through selected channel

Inspired by Grafana's contact point health columns.
2026-05-27 21:23:06 -04:00
lerko f17f06a1c6 Merge pull request 'feat: logs tab overhaul — severity tags, filtering, recovery durations' (#35) from feat/logs-overhaul into main
CI / test (push) Successful in 2m47s
CI / lint (push) Successful in 1m16s
CI / vulncheck (push) Successful in 56s
Reviewed-on: lerko/uptop#35
2026-05-28 00:35:24 +00:00
lerko b14d5e19db feat: logs tab overhaul — severity tags, filtering, recovery durations
CI / test (pull_request) Successful in 2m36s
CI / lint (pull_request) Successful in 1m1s
CI / vulncheck (pull_request) Successful in 51s
Logs tab visual overhaul:
- Severity-classified entries: DOWN (red), UP (green), WARN (amber),
  SYS (cyan), info (gray) — rendered as inline tags, not whole-line color
- Column-aligned format: [timestamp] [severity tag] [message]
- Filter toggle (f key): All vs Important only (hides retry noise)
- Header shows entry count, filter state, hidden count

Engine log improvements:
- Recovery messages include downtime duration ("was down 14m")
- LATE transition logged ("heartbeat overdue")
- Push monitor recovery includes downtime duration
2026-05-27 20:14:43 -04:00
lerko a2b38ddc60 Merge pull request 'feat: proper push monitor lifecycle — PENDING, LATE, DOWN' (#34) from feat/push-monitor-states into main
CI / test (push) Successful in 2m48s
CI / lint (push) Successful in 1m17s
CI / vulncheck (push) Successful in 56s
Reviewed-on: lerko/uptop#34
2026-05-28 00:01:56 +00:00
lerko 5dc31108f8 feat: proper push monitor lifecycle — PENDING, LATE, DOWN states
CI / test (pull_request) Successful in 2m41s
CI / lint (pull_request) Successful in 1m7s
CI / vulncheck (pull_request) Successful in 46s
Push monitors no longer lie about status:

- PENDING stays until first heartbeat (no auto-promote to UP)
- LATE state (amber) when overdue but within grace period
- DOWN only after grace period expires
- Grace period = interval/2, minimum 60s

RecordHeartbeat now handles all transitions:
- PENDING → UP (first heartbeat, logged)
- LATE → UP (late arrival, logged)
- DOWN → UP (recovery, alert + state change persisted)

TUI updates:
- LATE rendered in amber/warning color
- Status bar shows LATE count separately
- Tab badge shows ⚠ for late monitors
- Sort order: DOWN > LATE > UP > PENDING > PAUSED
- Detail panel shows error for LATE monitors

Inspired by Healthchecks.io state machine (new/up/grace/down).
2026-05-27 19:56:50 -04:00
lerko 63773b13d0 Merge pull request 'feat: show error reason when monitors go DOWN' (#33) from feat/error-reason into main
CI / test (push) Successful in 2m51s
CI / lint (push) Successful in 1m1s
CI / vulncheck (push) Successful in 56s
Reviewed-on: lerko/uptop#33
2026-05-27 23:38:26 +00:00
lerko bc3a44beac feat: show error reason when monitors go DOWN
CI / test (pull_request) Successful in 2m42s
CI / lint (pull_request) Successful in 1m11s
CI / vulncheck (pull_request) Successful in 51s
Propagate check failure reasons through the entire stack:
- Checker captures specific errors (DNS, timeout, HTTP status, SSL, etc.)
- Engine tracks LastError, StatusChangedAt, LastSuccessAt per monitor
- State transitions persisted to new state_changes table
- Detail panel shows error reason, HTTP code, state duration, last
  success time, and last 5 state change events
- Monitor table shows inline error preview for DOWN services
- Alert messages include error reason
- Probe nodes forward error reasons to leader

15 files changed across models, checker, engine, store, TUI, and probes.
2026-05-27 19:32:30 -04:00
lerko d8a2cab90f feat: seed SSH users from env var and authorized_keys file (#31)
CI / test (push) Successful in 2m36s
CI / lint (push) Successful in 1m12s
CI / vulncheck (push) Successful in 56s
Release / release (push) Has been cancelled
Release / docker (push) Has been cancelled
## Summary

Docker onboarding was broken — no way to add first SSH user without `docker attach` to TUI.

Now reads SSH public keys from two sources on startup:
- `UPTOP_ADMIN_KEY` env var — single key for quick single-user setup
- `UPTOP_KEYS` file path — authorized_keys format for team setup

Dockerfile already sets `UPTOP_KEYS=/data/authorized_keys` and compose mounts `./data:/data`, so the flow is:

```
echo "ssh-ed25519 AAAA... me@host" > ./data/authorized_keys
docker compose up -d
ssh -p 23234 localhost
```

### Behavior
- Skips keys already in DB (idempotent across restarts)
- All seeded users get admin role
- Username parsed from key comment (e.g. `tyler@macbook` → `tyler`)
- Comments and blank lines in keys file are ignored

### Tested
- UPTOP_ADMIN_KEY seeds single admin user
- UPTOP_KEYS file seeds multiple users with correct usernames
- Second startup skips existing keys (no duplicates)
- Build and all tests pass

Reviewed-on: lerko/uptop#31
2026-05-27 21:15:00 +00:00
lerko ea721601ab Merge pull request 'ci: overhaul pipeline — caching, GoReleaser, govulncheck' (#30) from ci/pipeline-overhaul into main
CI / test (push) Successful in 2m50s
CI / lint (push) Successful in 1m11s
CI / vulncheck (push) Successful in 56s
Reviewed-on: lerko/uptop#30
2026-05-27 00:37:32 +00:00
lerko b1935aa682 fix(deps): bump golang.org/x/crypto v0.47.0 → v0.52.0
CI / test (pull_request) Successful in 2m46s
CI / lint (pull_request) Successful in 1m12s
CI / vulncheck (pull_request) Successful in 56s
Fixes 7 vulns (GO-2026-5014 through GO-2026-5023) found by govulncheck.
Also bumps x/net, x/sys, x/text, x/sync, x/mod, x/tools to latest.
2026-05-26 20:20:23 -04:00
lerko 2cd3dcddb4 chore: bump Go 1.24.4 → 1.26.3, Alpine 3.21 → 3.23
CI / test (pull_request) Successful in 2m57s
CI / lint (pull_request) Successful in 1m11s
CI / vulncheck (pull_request) Failing after 1m1s
Go 1.24 EOL since Feb 2026. Fixes 33 stdlib vulns found by
govulncheck (database/sql, os/exec, net/http). Gets Green Tea GC.
2026-05-26 20:12:43 -04:00
lerko 7d4ef1f594 fix(ci): remove explicit container, use sh shell
CI / test (pull_request) Successful in 2m44s
CI / lint (pull_request) Successful in 1m11s
CI / vulncheck (pull_request) Failing after 1m7s
Act runner is Alpine-based — container: directive breaks node-based
actions (checkout, setup-go). Runner already has apk natively.
Added shell: sh to all jobs since runner lacks bash.
2026-05-26 18:44:08 -04:00
lerko f0ff87c0d0 fix(ci): rename GITEA_TOKEN to RELEASE_TOKEN
CI / test (pull_request) Failing after 31s
CI / lint (pull_request) Successful in 1m9s
CI / vulncheck (pull_request) Failing after 15s
Gitea reserves the GITEA_ prefix for repo action secrets.
2026-05-26 18:36:11 -04:00
lerko 5aab391b74 ci: overhaul pipeline — caching, GoReleaser, govulncheck
- Add module + build cache to CI (was only caching go-build, not go/pkg/mod)
- Declare explicit Alpine container instead of relying on runner image
- Drop redundant go vet (already in golangci-lint)
- Add govulncheck job for dependency CVE scanning
- Add GoReleaser config for Gitea-native binary releases + checksums
- Replace .github/workflows/docker.yml with .gitea/workflows/release.yml
- Docker multiarch (amd64+arm64) via buildx in release workflow
- Dockerfile: add --mount=type=cache for mod/build, add -trimpath
2026-05-26 18:24:19 -04:00
lerko 8ad213c96c Merge pull request 'fix(security): phase 4 code quality and low-severity fixes' (#29) from security/phase-4-quality into main
CI / test (push) Successful in 4m35s
CI / lint (push) Successful in 1m7s
Reviewed-on: lerko/uptop#29
2026-05-26 21:31:40 +00:00
lerko 986f9f1d55 fix(security): phase 4 code quality and low-severity fixes
CI / test (pull_request) Successful in 4m24s
CI / lint (pull_request) Successful in 1m1s
- Fix limitStr to handle multi-byte UTF-8 characters correctly
- Sanitize log messages: strip ANSI escape sequences and newlines
- URL-encode probe node_id instead of string concatenation
- Fix follower resp.Body leak on non-200 responses
- Make SSH host key path configurable via UPTOP_SSH_HOST_KEY env var
- Add HTTP method checks on GET-only endpoints (405 for wrong methods)
- Extract magic numbers into named constants across monitor/store/server
- Standardize error output to stderr for all startup errors
2026-05-26 17:25:47 -04:00
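Two of the fixes in commit 986f9f1d55, rune-safe truncation and log sanitization, can be illustrated as follows. This is a sketch under assumptions: `truncateRunes` and `sanitizeLog` are hypothetical names (the project's `limitStr` is not shown in this diff), and the regular expression covers only CSI-style ANSI sequences.

```go
package main

import (
	"regexp"
	"strings"
)

// ansiCSI matches CSI escape sequences such as "\x1b[31m".
var ansiCSI = regexp.MustCompile(`\x1b\[[0-9;?]*[ -/]*[@-~]`)

// sanitizeLog strips ANSI CSI sequences and replaces CR/LF with spaces,
// so one log message cannot forge extra lines or recolor the terminal.
func sanitizeLog(msg string) string {
	msg = ansiCSI.ReplaceAllString(msg, "")
	return strings.NewReplacer("\r", " ", "\n", " ").Replace(msg)
}

// truncateRunes cuts s to at most max runes. Slicing by rune rather than
// by byte never splits a multi-byte UTF-8 character in half.
func truncateRunes(s string, max int) string {
	if max <= 0 {
		return ""
	}
	r := []rune(s)
	if len(r) <= max {
		return s
	}
	return string(r[:max])
}
```

Slicing the string by bytes instead could cut a multi-byte character such as "é" in the middle and produce invalid UTF-8, which is the class of bug the commit message describes.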
lerko c50ec82dcb Merge pull request 'fix(security): phase 3 medium reliability and hardening' (#28) from security/phase-3-reliability into main
CI / test (push) Successful in 4m25s
CI / lint (push) Successful in 1m6s
Reviewed-on: lerko/uptop#28
2026-05-26 21:07:30 +00:00
lerko bd561d9a5e fix(security): phase 3 medium reliability and hardening
CI / test (pull_request) Successful in 4m23s
CI / lint (pull_request) Successful in 1m11s
- Fail hard on critical migration errors (ignore only "already exists")
- Cache SSH user keys with 30s TTL (avoid DB query per auth attempt)
- Configure DB connection pooling (25 open, 5 idle, 5m lifetime)
- Enable SQLite WAL mode for concurrent read/write
- Optimize check history pruning (only prune above 1100 rows)
- Add security headers: X-Content-Type-Options, X-Frame-Options, CSP, Referrer-Policy
- Add CORS policy on /status/json via UPTOP_CORS_ORIGIN env var
- Add HTTP request logging middleware (method, path, status, duration, IP)
- Fix config file permissions from 0644 to 0600
- Pin Docker images: golang:1.24-alpine3.21, alpine:3.21
- Fix Docker CI tag pattern for CalVer (was semver)
- Pass build args (VERSION, COMMIT, BUILD_DATE) to Docker build
2026-05-26 16:57:03 -04:00
lerko 7a8f2ad15b Merge pull request 'fix(security): phase 2 high-severity hardening' (#27) from security/phase-2-hardening into main
CI / test (push) Successful in 4m33s
CI / lint (push) Successful in 1m6s
Reviewed-on: lerko/uptop#27
2026-05-26 15:31:18 +00:00
lerko d30d1460bd fix(security): phase 2 high-severity hardening
CI / test (pull_request) Successful in 4m31s
CI / lint (pull_request) Successful in 56s
- Push heartbeat accepts Authorization: Bearer header (query string deprecated)
- Gotify alerts use X-Gotify-Key header instead of token in URL
- Per-IP rate limiting on all API endpoints (token-bucket)
- /metrics gated behind cluster secret (UPTOP_METRICS_PUBLIC=true to opt out)
- Config export redacts passwords/tokens by default (redact_secrets=false to override)
- Fix rewritePlaceholders for 100+ SQL parameters
- Fix AddSiteReturningID/AddAlertReturningID race with LastInsertId/RETURNING
- HTTP server timeouts: read 30s, write 60s, idle 120s
2026-05-25 21:15:33 -04:00
lerko b43dfae98f Merge pull request 'fix(security): phase 1 critical fixes for public release' (#26) from security/phase-1-critical into main
CI / test (push) Successful in 4m19s
CI / lint (push) Successful in 1m6s
Reviewed-on: lerko/uptop#26
2026-05-26 00:43:52 +00:00
lerko 60b30935b3 fix(security): phase 1 critical fixes for public release
CI / test (pull_request) Successful in 4m40s
CI / lint (pull_request) Successful in 1m2s
- Redact PostgreSQL DSN password from stdout/logs
- Harden .dockerignore to exclude .ssh/, .claude/, *.db, *.local files
- SSRF protection: block private/loopback/link-local IPs by default
  (UPTOP_ALLOW_PRIVATE_TARGETS=true to override for homelab use)
- Fix email header injection via CRLF in monitor names
- AES-256-GCM encryption for alert credentials at rest
  (UPTOP_ENCRYPTION_KEY env var, migrate-secrets subcommand)
- TLS support for HTTP server (UPTOP_TLS_CERT/UPTOP_TLS_KEY)
  with HSTS header when TLS enabled
2026-05-25 11:26:47 -04:00
51 changed files with 2709 additions and 484 deletions
+12
@@ -1,3 +1,15 @@
 .git
 tmp/
 vendor/
+# Security: keep sensitive/local files out of Docker build context
+.ssh/
+.claude/
+.github/
+.gitea/
+CLAUDE.md
+*.local.json
+*.local.md
+*.local
+*.db
+*.db-journal
+33 -7
@@ -5,6 +5,9 @@ on:
     branches: [main]
   pull_request:
+env:
+  GO_VERSION: "1.26"
 jobs:
   test:
     runs-on: ubuntu-latest
@@ -16,32 +19,55 @@ jobs:
       - uses: actions/setup-go@v5
         with:
-          go-version: "1.24"
+          go-version: "1.26"
       - uses: actions/cache@v4
         with:
-          path: ~/.cache/go-build
-          key: go-build-${{ hashFiles('**/*.go', 'go.sum') }}
-          restore-keys: go-build-
+          path: |
+            ~/go/pkg/mod
+            ~/.cache/go-build
+          key: go-${{ hashFiles('go.sum') }}
+          restore-keys: go-
       - name: Install build tools
         run: apk add --no-cache gcc musl-dev
-      - name: Vet
-        run: go vet ./...
+      - name: Download modules
+        run: go mod download
       - name: Test
         run: CGO_ENABLED=1 go test -race -timeout 120s ./...
   lint:
     runs-on: ubuntu-latest
+    defaults:
+      run:
+        shell: sh
     steps:
       - uses: actions/checkout@v4
       - uses: actions/setup-go@v5
         with:
-          go-version: "1.24"
+          go-version: "1.26"
       - uses: golangci/golangci-lint-action@v7
         with:
           version: v2.11.2
+  vulncheck:
+    runs-on: ubuntu-latest
+    defaults:
+      run:
+        shell: sh
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-go@v5
+        with:
+          go-version: "1.26"
+      - name: Install govulncheck
+        run: go install golang.org/x/vuln/cmd/govulncheck@latest
+      - name: Run govulncheck
+        run: govulncheck ./...
+68
@@ -0,0 +1,68 @@
+name: Release
+on:
+  push:
+    tags:
+      - "[0-9]*"
+jobs:
+  release:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+      - uses: actions/setup-go@v5
+        with:
+          go-version: "1.26"
+      - uses: actions/cache@v4
+        with:
+          path: |
+            ~/go/pkg/mod
+            ~/.cache/go-build
+          key: release-go-${{ hashFiles('go.sum') }}
+          restore-keys: release-go-
+      - name: Run GoReleaser
+        uses: goreleaser/goreleaser-action@v7
+        with:
+          distribution: goreleaser
+          version: "~> v2"
+          args: release --clean
+        env:
+          GORELEASER_FORCE_TOKEN: gitea
+          GITEA_TOKEN: ${{ secrets.RELEASE_TOKEN }}
+  docker:
+    runs-on: ubuntu-latest
+    needs: [release]
+    steps:
+      - uses: actions/checkout@v4
+      - name: Set up QEMU
+        uses: docker/setup-qemu-action@v3
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+      - name: Log in to Docker Hub
+        uses: docker/login-action@v3
+        with:
+          username: ${{ secrets.DOCKERHUB_USERNAME }}
+          password: ${{ secrets.DOCKERHUB_TOKEN }}
+      - name: Build and push
+        uses: docker/build-push-action@v5
+        with:
+          context: .
+          push: true
+          platforms: linux/amd64,linux/arm64
+          tags: |
+            lerkolabs/uptop:${{ github.ref_name }}
+            lerkolabs/uptop:latest
+          build-args: |
+            VERSION=${{ github.ref_name }}
+            COMMIT=${{ github.sha }}
+            BUILD_DATE=${{ github.event.head_commit.timestamp }}
-45
@@ -1,45 +0,0 @@
-name: Publish Release
-on:
-  push:
-    tags:
-      - 'v*'
-jobs:
-  push_to_registry:
-    name: Build and Push Docker Image
-    runs-on: ubuntu-latest
-    steps:
-      - name: Check out the repo
-        uses: actions/checkout@v4
-      - name: Set up QEMU
-        uses: docker/setup-qemu-action@v3
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3
-      - name: Log in to Docker Hub
-        uses: docker/login-action@v3
-        with:
-          username: ${{ secrets.DOCKERHUB_USERNAME }}
-          password: ${{ secrets.DOCKERHUB_TOKEN }}
-      - name: Extract metadata (tags, labels)
-        id: meta
-        uses: docker/metadata-action@v5
-        with:
-          images: ${{ secrets.DOCKERHUB_USERNAME }}/uptop
-          tags: |
-            # This turns git tag "v1.0.0" into docker tag "1.0.0"
-            type=semver,pattern={{version}}
-            # This updates the "latest" tag to this version
-            type=raw,value=latest
-      - name: Build and push
-        uses: docker/build-push-action@v5
-        with:
-          context: .
-          push: true
-          tags: ${{ steps.meta.outputs.tags }}
-          labels: ${{ steps.meta.outputs.labels }}
+42
@@ -0,0 +1,42 @@
+version: 2
+gitea_urls:
+  api: https://gitea.lerkolabs.com/api/v1
+  download: https://gitea.lerkolabs.com
+release:
+  gitea:
+    owner: lerko
+    name: uptop
+builds:
+  - main: ./cmd/uptop/main.go
+    binary: uptop
+    env:
+      - CGO_ENABLED=1
+    goos:
+      - linux
+    goarch:
+      - amd64
+    ldflags:
+      - -s -w
+      - -X main.version={{ .Version }}
+      - -X main.commit={{ .Commit }}
+      - -X main.date={{ .Date }}
+    flags:
+      - -trimpath
+archives:
+  - formats: [tar.gz]
+    name_template: "{{ .ProjectName }}_{{ .Os }}_{{ .Arch }}"
+checksum:
+  name_template: checksums.txt
+changelog:
+  sort: asc
+  filters:
+    exclude:
+      - "^docs:"
+      - "^chore:"
+      - "^style:"
+7 -4
@@ -1,18 +1,21 @@
 # --- Stage 1: Builder ---
-FROM golang:alpine AS builder
+FROM golang:1.26-alpine3.23 AS builder
 RUN apk add --no-cache gcc musl-dev
 WORKDIR /app
 COPY go.mod go.sum ./
-RUN go mod download
+RUN --mount=type=cache,target=/go/pkg/mod \
+    go mod download
 COPY . .
 ENV CGO_ENABLED=1
 ARG VERSION=dev
 ARG COMMIT=none
 ARG BUILD_DATE=unknown
-RUN go build -ldflags="-s -w -X main.version=${VERSION} -X main.commit=${COMMIT} -X main.date=${BUILD_DATE}" -o uptop ./cmd/uptop/main.go
+RUN --mount=type=cache,target=/go/pkg/mod \
+    --mount=type=cache,target=/root/.cache/go-build \
+    go build -trimpath -ldflags="-s -w -X main.version=${VERSION} -X main.commit=${COMMIT} -X main.date=${BUILD_DATE}" -o uptop ./cmd/uptop/main.go
 # --- Stage 2: Runner ---
-FROM alpine:latest
+FROM alpine:3.23
 WORKDIR /app
 RUN apk add --no-cache ca-certificates openssh-client
 RUN mkdir /data
+258 -32
@@ -1,10 +1,22 @@
 package main
 import (
+	"bufio"
 	"context"
 	"errors"
 	"flag"
 	"fmt"
+	"log"
+	"net/url"
+	"os"
+	"os/signal"
+	"path/filepath"
+	"strconv"
+	"strings"
+	"sync"
+	"syscall"
+	"time"
 	"gitea.lerkolabs.com/lerko/uptop/internal/cluster"
 	"gitea.lerkolabs.com/lerko/uptop/internal/config"
 	"gitea.lerkolabs.com/lerko/uptop/internal/importer"
@@ -13,12 +25,6 @@ import (
 	"gitea.lerkolabs.com/lerko/uptop/internal/server"
 	"gitea.lerkolabs.com/lerko/uptop/internal/store"
 	"gitea.lerkolabs.com/lerko/uptop/internal/tui"
-	"log"
-	"os"
-	"os/signal"
-	"strconv"
-	"syscall"
-	"time"
 	tea "github.com/charmbracelet/bubbletea"
 	"github.com/charmbracelet/ssh"
@@ -47,6 +53,9 @@ func main() {
 		case "version", "--version", "-v":
 			printVersion()
 			return
+		case "migrate-secrets":
+			runMigrateSecrets(os.Args[2:])
+			return
 		}
 	}
 	runServe(os.Args[1:])
@@ -67,23 +76,42 @@ func envOrDefault(key, fallback string) string {
 	return fallback
 }
+func redactDSN(dsn string) string {
+	u, err := url.Parse(dsn)
+	if err != nil {
+		return "***"
+	}
+	u.User = nil
+	return u.String()
+}
 func openStore(dbType, dsn string) store.Store {
-	var s store.Store
+	var ss *store.SQLStore
 	var err error
 	if dbType == "postgres" {
-		s, err = store.NewPostgresStore(dsn)
+		ss, err = store.NewPostgresStore(dsn)
 	} else {
-		s, err = store.NewSQLiteStore(dsn)
+		ss, err = store.NewSQLiteStore(dsn)
 	}
 	if err != nil {
 		fmt.Fprintf(os.Stderr, "database error: %v\n", err)
 		os.Exit(1)
 	}
-	if err := s.Init(); err != nil {
+	if encKey := os.Getenv("UPTOP_ENCRYPTION_KEY"); encKey != "" {
+		enc, err := store.NewEncryptor(encKey)
+		if err != nil {
+			fmt.Fprintf(os.Stderr, "encryption key error: %v\n", err)
+			os.Exit(1)
+		}
+		ss.SetEncryptor(enc)
+	} else {
+		fmt.Println("WARNING: No UPTOP_ENCRYPTION_KEY set. Alert credentials stored unencrypted.")
+	}
+	if err := ss.Init(); err != nil {
 		fmt.Fprintf(os.Stderr, "database init error: %v\n", err)
 		os.Exit(1)
 	}
-	return s
+	return ss
 }
 func runApply(args []string) {
func runApply(args []string) { func runApply(args []string) {
@@ -142,6 +170,56 @@ func runExport(args []string) {
 	}
 }
+func runMigrateSecrets(args []string) {
+	fs := flag.NewFlagSet("migrate-secrets", flag.ExitOnError)
+	dbType := fs.String("db-type", envOrDefault("UPTOP_DB_TYPE", "sqlite"), "Database type")
+	dsn := fs.String("dsn", envOrDefault("UPTOP_DB_DSN", "uptop.db"), "Database DSN")
+	_ = fs.Parse(args)
+	encKey := os.Getenv("UPTOP_ENCRYPTION_KEY")
+	if encKey == "" {
+		fmt.Fprintln(os.Stderr, "error: UPTOP_ENCRYPTION_KEY must be set")
+		os.Exit(1)
+	}
+	enc, err := store.NewEncryptor(encKey)
+	if err != nil {
+		fmt.Fprintf(os.Stderr, "error: %v\n", err)
+		os.Exit(1)
+	}
+	var ss *store.SQLStore
+	if *dbType == "postgres" {
+		ss, err = store.NewPostgresStore(*dsn)
+	} else {
+		ss, err = store.NewSQLiteStore(*dsn)
+	}
+	if err != nil {
+		fmt.Fprintf(os.Stderr, "database error: %v\n", err)
+		os.Exit(1)
+	}
+	if err := ss.Init(); err != nil {
+		fmt.Fprintf(os.Stderr, "database init error: %v\n", err)
+		os.Exit(1)
+	}
+	alerts, err := ss.GetAllAlerts()
+	if err != nil {
+		fmt.Fprintf(os.Stderr, "error loading alerts: %v\n", err)
+		os.Exit(1)
+	}
+	ss.SetEncryptor(enc)
+	migrated := 0
+	for _, a := range alerts {
+		if err := ss.UpdateAlert(a.ID, a.Name, a.Type, a.Settings); err != nil {
+			fmt.Fprintf(os.Stderr, "error migrating alert %q: %v\n", a.Name, err)
+			os.Exit(1)
+		}
+		migrated++
+	}
+	fmt.Printf("Migrated %d alert(s) to encrypted storage.\n", migrated)
+}
 func runServe(args []string) {
 	portVal := 23234
 	dbType := "sqlite"
@@ -211,6 +289,11 @@ func runServe(args []string) {
 			cancel()
 		}()
+		probeAllowPrivate := os.Getenv("UPTOP_ALLOW_PRIVATE_TARGETS") == "true"
+		if probeAllowPrivate {
+			fmt.Println("WARNING: Private target blocking disabled. Monitor URLs can reach internal networks.")
+		}
 		if err := cluster.RunProbe(ctx, cluster.ProbeConfig{
 			NodeID: nodeID,
 			NodeName: nodeName,
@@ -218,6 +301,7 @@ func runServe(args []string) {
 			LeaderURL: clusterPeer,
 			SharedKey: clusterKey,
 			Interval: 30,
+			AllowPrivateTargets: probeAllowPrivate,
 		}); err != nil {
 			fmt.Fprintf(os.Stderr, "Probe error: %v\n", err)
 		}
@@ -232,44 +316,63 @@ func runServe(args []string) {
 	importKuma := fs.String("import-kuma", "", "Import Uptime Kuma backup JSON file")
 	_ = fs.Parse(args) // ExitOnError: parse errors exit before returning
-	var s store.Store
+	var ss *store.SQLStore
 	var dbErr error
 	if *flagDBType == "postgres" {
-		s, dbErr = store.NewPostgresStore(*flagDSN)
-		fmt.Printf("Using PostgreSQL: %s\n", *flagDSN)
+		ss, dbErr = store.NewPostgresStore(*flagDSN)
+		fmt.Printf("Using PostgreSQL: %s\n", redactDSN(*flagDSN))
 	} else {
-		s, dbErr = store.NewSQLiteStore(*flagDSN)
+		ss, dbErr = store.NewSQLiteStore(*flagDSN)
 		fmt.Printf("Using SQLite: %s\n", *flagDSN)
 	}
 	if dbErr != nil {
-		fmt.Printf("Database connection error: %v\n", dbErr)
+		fmt.Fprintf(os.Stderr, "database connection error: %v\n", dbErr)
 		os.Exit(1)
 	}
-	defer s.Close()
+	defer ss.Close()
+	if encKey := os.Getenv("UPTOP_ENCRYPTION_KEY"); encKey != "" {
+		enc, err := store.NewEncryptor(encKey)
+		if err != nil {
+			fmt.Fprintf(os.Stderr, "encryption key error: %v\n", err)
+			os.Exit(1)
+		}
+		ss.SetEncryptor(enc)
+	} else {
+		fmt.Println("WARNING: No UPTOP_ENCRYPTION_KEY set. Alert credentials stored unencrypted.")
+	}
+	var s store.Store = ss
 	if err := s.Init(); err != nil {
-		fmt.Printf("Database init error: %v\n", err)
+		fmt.Fprintf(os.Stderr, "database init error: %v\n", err)
 		os.Exit(1)
 	}
 	if *demo {
 		seedDemoData(s)
 	}
+	seedKeysFromEnv(s)
 	if *importKuma != "" {
 		kb, err := importer.LoadKumaFile(*importKuma)
 		if err != nil {
-			fmt.Printf("Kuma import error: %v\n", err)
+			fmt.Fprintf(os.Stderr, "kuma import error: %v\n", err)
 			os.Exit(1)
 		}
 		backup := importer.ConvertKuma(kb)
 		if err := s.ImportData(backup); err != nil {
-			fmt.Printf("Import failed: %v\n", err)
+			fmt.Fprintf(os.Stderr, "import failed: %v\n", err)
 			os.Exit(1)
 		}
 		fmt.Printf("Imported %d monitors and %d alerts from Uptime Kuma v%s\n", len(backup.Sites), len(backup.Alerts), kb.Version)
 	}
-	eng := monitor.NewEngine(s)
+	allowPrivate := os.Getenv("UPTOP_ALLOW_PRIVATE_TARGETS") == "true"
+	if allowPrivate {
+		fmt.Println("WARNING: Private target blocking disabled. Monitor URLs can reach internal networks.")
+	}
+	eng := monitor.NewEngineWithOpts(s, allowPrivate)
 	if os.Getenv("UPTOP_INSECURE_SKIP_VERIFY") == "true" {
 		eng.SetInsecureSkipVerify(true)
 	}
@@ -284,11 +387,19 @@ func runServe(args []string) {
 	eng.InitLogs()
 	eng.Start(ctx)
+	tlsCert := os.Getenv("UPTOP_TLS_CERT")
+	tlsKey := os.Getenv("UPTOP_TLS_KEY")
 	httpSrv := server.Start(server.ServerConfig{
 		Port: httpPort,
 		EnableStatus: enableStatus,
 		Title: statusTitle,
 		ClusterKey: clusterKey,
+		TLSCert: tlsCert,
+		TLSKey: tlsKey,
+		ClusterMode: clusterMode,
+		MetricsPublic: os.Getenv("UPTOP_METRICS_PUBLIC") == "true",
+		CORSOrigin: os.Getenv("UPTOP_CORS_ORIGIN"),
 	}, s, eng)
 	cluster.Start(ctx, cluster.Config{
@@ -297,12 +408,13 @@ func runServe(args []string) {
 		SharedKey: clusterKey,
 	}, eng)
-	sshSrv := startSSHServer(*port, s, eng)
+	kc := newKeyCache(s)
+	sshSrv := startSSHServer(*port, s, eng, kc)
 	if isatty.IsTerminal(os.Stdout.Fd()) || isatty.IsCygwinTerminal(os.Stdout.Fd()) {
 		p := tea.NewProgram(tui.InitialModel(true, s, eng), tea.WithAltScreen(), tea.WithMouseCellMotion())
 		if _, err := p.Run(); err != nil {
-			fmt.Printf("Error: %v\n", err)
+			fmt.Fprintf(os.Stderr, "error: %v\n", err)
 		}
 	} else {
 		fmt.Println("uptop running in HEADLESS mode")
@@ -327,12 +439,12 @@ func runServe(args []string) {
 	}
 }
-func startSSHServer(port int, db store.Store, eng *monitor.Engine) *ssh.Server {
+func startSSHServer(port int, db store.Store, eng *monitor.Engine, kc *keyCache) *ssh.Server {
 	s, err := wish.NewServer(
 		wish.WithAddress(fmt.Sprintf(":%d", port)),
-		wish.WithHostKeyPath(".ssh/id_ed25519"),
+		wish.WithHostKeyPath(envOrDefault("UPTOP_SSH_HOST_KEY", ".ssh/id_ed25519")),
 		wish.WithPublicKeyAuth(func(ctx ssh.Context, key ssh.PublicKey) bool {
-			return isKeyAllowed(db, key)
+			return kc.IsAllowed(key)
 		}),
 		wish.WithMiddleware(
 			bm.Middleware(func(s ssh.Session) (tea.Model, []tea.ProgramOption) {
@@ -341,7 +453,7 @@ func startSSHServer(port int, db store.Store, eng *monitor.Engine) *ssh.Server {
 		),
 	)
 	if err != nil {
-		fmt.Printf("SSH server error: %v\n", err)
+		fmt.Fprintf(os.Stderr, "SSH server error: %v\n", err)
 		return nil
 	}
 	go func() {
@@ -401,19 +513,133 @@ func seedDemoData(s store.Store) {
} }
} }
func isKeyAllowed(db store.Store, incomingKey ssh.PublicKey) bool { type keyCache struct {
users, err := db.GetAllUsers() mu sync.RWMutex
keys []ssh.PublicKey
updated time.Time
ttl time.Duration
db store.Store
}
func newKeyCache(db store.Store) *keyCache {
return &keyCache{db: db, ttl: 30 * time.Second}
}
func (c *keyCache) refresh() {
users, err := c.db.GetAllUsers()
if err != nil { if err != nil {
return false return
} }
keys := make([]ssh.PublicKey, 0, len(users))
for _, u := range users { for _, u := range users {
allowedKey, _, _, _, err := ssh.ParseAuthorizedKey([]byte(u.PublicKey)) k, _, _, _, err := ssh.ParseAuthorizedKey([]byte(u.PublicKey))
if err != nil { if err != nil {
continue continue
} }
if ssh.KeysEqual(allowedKey, incomingKey) { keys = append(keys, k)
}
c.mu.Lock()
c.keys = keys
c.updated = time.Now()
c.mu.Unlock()
}
func (c *keyCache) Invalidate() {
c.mu.Lock()
c.updated = time.Time{}
c.mu.Unlock()
}
func (c *keyCache) IsAllowed(incomingKey ssh.PublicKey) bool {
c.mu.RLock()
stale := time.Since(c.updated) > c.ttl
c.mu.RUnlock()
if stale {
c.refresh()
}
c.mu.RLock()
defer c.mu.RUnlock()
for _, k := range c.keys {
if ssh.KeysEqual(k, incomingKey) {
return true return true
} }
} }
return false return false
} }
func seedKeysFromEnv(s store.Store) {
var keys []string
if v := os.Getenv("UPTOP_ADMIN_KEY"); v != "" {
keys = append(keys, strings.TrimSpace(v))
}
if path := os.Getenv("UPTOP_KEYS"); path != "" {
f, err := os.Open(filepath.Clean(path))
if err == nil {
scanner := bufio.NewScanner(f)
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
if line == "" || strings.HasPrefix(line, "#") {
continue
}
keys = append(keys, line)
}
_ = f.Close()
}
}
if len(keys) == 0 {
return
}
existing, err := s.GetAllUsers()
if err != nil {
fmt.Fprintf(os.Stderr, "warning: could not check existing users: %v\n", err)
return
}
existingKeys := make(map[string]bool)
for _, u := range existing {
existingKeys[u.PublicKey] = true
}
added := 0
for i, key := range keys {
if existingKeys[key] {
continue
}
username := usernameFromKey(key, i, len(existing)+added)
if err := s.AddUser(username, key, "admin"); err != nil {
fmt.Fprintf(os.Stderr, "warning: failed to seed user %q: %v\n", username, err)
continue
}
fmt.Printf("Seeded admin user %q from %s\n", username, seedSource(i, len(keys), os.Getenv("UPTOP_ADMIN_KEY") != ""))
added++
}
}
func usernameFromKey(key string, index, totalExisting int) string {
parts := strings.Fields(key)
if len(parts) >= 3 {
comment := parts[2]
if at := strings.Index(comment, "@"); at > 0 {
return comment[:at]
}
return comment
}
if index == 0 && totalExisting == 0 {
return "admin"
}
return fmt.Sprintf("user-%d", totalExisting+1)
}
func seedSource(index, total int, hasEnvKey bool) string {
if hasEnvKey && index == 0 {
return "UPTOP_ADMIN_KEY"
}
return "UPTOP_KEYS"
}
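The `keyCache` added above checks staleness under a read lock and then refreshes outside it, so several logins arriving at the same moment after expiry may each query the store. A minimal sketch of one alternative follows, assuming it is acceptable to hold a single mutex across the reload. The type `ttlCache`, its `load` field, and the string keys are hypothetical stand-ins for illustration, not code from this PR.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// ttlCache is a hypothetical, simplified stand-in for the keyCache in the
// diff: it caches a list of strings and reloads it once the TTL has elapsed.
type ttlCache struct {
	mu      sync.Mutex
	items   []string
	updated time.Time
	ttl     time.Duration
	load    func() ([]string, error) // stands in for db.GetAllUsers
}

// snapshot returns the cached items, reloading first if they are stale.
// The mutex is held across the reload, so concurrent callers trigger at
// most one load per expiry, at the cost of blocking while it runs.
func (c *ttlCache) snapshot() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.updated.IsZero() || time.Since(c.updated) > c.ttl {
		// On error the previous items are kept, like refresh() in the diff.
		if items, err := c.load(); err == nil {
			c.items = items
			c.updated = time.Now()
		}
	}
	return c.items
}

// invalidate forces the next snapshot call to reload.
func (c *ttlCache) invalidate() {
	c.mu.Lock()
	c.updated = time.Time{}
	c.mu.Unlock()
}

// contains reports whether key is present in the cached list.
func (c *ttlCache) contains(key string) bool {
	for _, k := range c.snapshot() {
		if k == key {
			return true
		}
	}
	return false
}

func main() {
	loads := 0
	c := &ttlCache{ttl: time.Minute, load: func() ([]string, error) {
		loads++
		return []string{"key-a", "key-b"}, nil
	}}
	a := c.contains("key-a")
	z := c.contains("key-z")
	fmt.Println(a, z, loads)

	c.invalidate()
	b := c.contains("key-b")
	fmt.Println(b, loads)
}
```

The trade-off is latency: every lookup waits while a reload is in flight, which is probably fine for SSH logins but may not suit a hot path.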
+2
@@ -14,5 +14,7 @@ services:
- UPTOP_HTTP_PORT=8080 - UPTOP_HTTP_PORT=8080
- UPTOP_STATUS_ENABLED=true - UPTOP_STATUS_ENABLED=true
- UPTOP_STATUS_TITLE=System Status - UPTOP_STATUS_TITLE=System Status
# SSH access: add your public key via env var or authorized_keys file
# - UPTOP_ADMIN_KEY=ssh-ed25519 AAAA... you@host
volumes: volumes:
- ./data:/data - ./data:/data
+9 -9
@@ -1,6 +1,6 @@
module gitea.lerkolabs.com/lerko/uptop module gitea.lerkolabs.com/lerko/uptop
go 1.24.4 go 1.26.3
require ( require (
github.com/charmbracelet/bubbles v0.21.1-0.20250623103423-23b8fd6302d7 github.com/charmbracelet/bubbles v0.21.1-0.20250623103423-23b8fd6302d7
@@ -16,6 +16,7 @@ require (
github.com/mattn/go-sqlite3 v1.14.33 github.com/mattn/go-sqlite3 v1.14.33
github.com/miekg/dns v1.1.72 github.com/miekg/dns v1.1.72
github.com/prometheus-community/pro-bing v0.8.0 github.com/prometheus-community/pro-bing v0.8.0
gopkg.in/yaml.v3 v3.0.1
) )
require ( require (
@@ -49,13 +50,12 @@ require (
github.com/muesli/termenv v0.16.0 // indirect github.com/muesli/termenv v0.16.0 // indirect
github.com/rivo/uniseg v0.4.7 // indirect github.com/rivo/uniseg v0.4.7 // indirect
github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e // indirect github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e // indirect
golang.org/x/crypto v0.47.0 // indirect golang.org/x/crypto v0.52.0 // indirect
golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56 // indirect golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56 // indirect
golang.org/x/mod v0.31.0 // indirect golang.org/x/mod v0.35.0 // indirect
golang.org/x/net v0.49.0 // indirect golang.org/x/net v0.54.0 // indirect
golang.org/x/sync v0.19.0 // indirect golang.org/x/sync v0.20.0 // indirect
golang.org/x/sys v0.40.0 // indirect golang.org/x/sys v0.45.0 // indirect
golang.org/x/text v0.33.0 // indirect golang.org/x/text v0.37.0 // indirect
golang.org/x/tools v0.40.0 // indirect golang.org/x/tools v0.44.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
) )
+17 -16
@@ -101,26 +101,27 @@ github.com/stretchr/testify v1.10.0 h1:Xv5erBjTwe/5IxqUQTdXv5kgmIvbHo3QQyRwhJsOf
github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY= github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e h1:JVG44RsyaB9T2KIHavMF/ppJZNG9ZpyihvCd0w101no= github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e h1:JVG44RsyaB9T2KIHavMF/ppJZNG9ZpyihvCd0w101no=
github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e/go.mod h1:RbqR21r5mrJuqunuUZ/Dhy/avygyECGrLceyNeo4LiM= github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e/go.mod h1:RbqR21r5mrJuqunuUZ/Dhy/avygyECGrLceyNeo4LiM=
golang.org/x/crypto v0.47.0 h1:V6e3FRj+n4dbpw86FJ8Fv7XVOql7TEwpHapKoMJ/GO8= golang.org/x/crypto v0.52.0 h1:RMs7fP2rXdep0CftQlK8Uf+kibLm7qkCcradZWYz988=
golang.org/x/crypto v0.47.0/go.mod h1:ff3Y9VzzKbwSSEzWqJsJVBnWmRwRSHt/6Op5n9bQc4A= golang.org/x/crypto v0.52.0/go.mod h1:1QgfPxDqh0T2M/elOJtp9RvuR95kVjir0e6/BvEmGbc=
golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56 h1:2dVuKD2vS7b0QIHQbpyTISPd0LeHDbnYEryqj5Q1ug8= golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56 h1:2dVuKD2vS7b0QIHQbpyTISPd0LeHDbnYEryqj5Q1ug8=
golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56/go.mod h1:M4RDyNAINzryxdtnbRXRL/OHtkFuWGRjvuhBJpk2IlY= golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56/go.mod h1:M4RDyNAINzryxdtnbRXRL/OHtkFuWGRjvuhBJpk2IlY=
golang.org/x/mod v0.31.0 h1:HaW9xtz0+kOcWKwli0ZXy79Ix+UW/vOfmWI5QVd2tgI= golang.org/x/mod v0.35.0 h1:Ww1D637e6Pg+Zb2KrWfHQUnH2dQRLBQyAtpr/haaJeM=
golang.org/x/mod v0.31.0/go.mod h1:43JraMp9cGx1Rx3AqioxrbrhNsLl2l/iNAvuBkrezpg= golang.org/x/mod v0.35.0/go.mod h1:+GwiRhIInF8wPm+4AoT6L0FA1QWAad3OMdTRx4tFYlU=
golang.org/x/net v0.49.0 h1:eeHFmOGUTtaaPSGNmjBKpbng9MulQsJURQUAfUwY++o= golang.org/x/net v0.54.0 h1:2zJIZAxAHV/OHCDTCOHAYehQzLfSXuf/5SoL/Dv6w/w=
golang.org/x/net v0.49.0/go.mod h1:/ysNB2EvaqvesRkuLAyjI1ycPZlQHM3q01F02UY/MV8= golang.org/x/net v0.54.0/go.mod h1:Sj4oj8jK6XmHpBZU/zWHw3BV3abl4Kvi+Ut7cQcY+cQ=
golang.org/x/sync v0.19.0 h1:vV+1eWNmZ5geRlYjzm2adRgW2/mcpevXNg50YZtPCE4= golang.org/x/sync v0.20.0 h1:e0PTpb7pjO8GAtTs2dQ6jYa5BWYlMuX047Dco/pItO4=
golang.org/x/sync v0.19.0/go.mod h1:9KTHXmSnoGruLpwFjVSX0lNNA75CykiMECbovNTZqGI= golang.org/x/sync v0.20.0/go.mod h1:9xrNwdLfx4jkKbNva9FpL6vEN7evnE43NNNJQ2LF3+0=
golang.org/x/sys v0.0.0-20210809222454-d867a43fc93e/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.0.0-20210809222454-d867a43fc93e/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.40.0 h1:DBZZqJ2Rkml6QMQsZywtnjnnGvHza6BTfYFWY9kjEWQ= golang.org/x/sys v0.45.0 h1:dO4czNzziLiiXplLQgBCEpCvXQ3dnkn0SdaZSYdQ+FY=
golang.org/x/sys v0.40.0/go.mod h1:OgkHotnGiDImocRcuBABYBEXf8A9a87e/uXjp9XT3ks= golang.org/x/sys v0.45.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw=
golang.org/x/term v0.39.0 h1:RclSuaJf32jOqZz74CkPA9qFuVTX7vhLlpfj/IGWlqY= golang.org/x/term v0.43.0 h1:S4RLU2sB31O/NCl+zFN9Aru9A/Cq2aqKpTZJ6B+DwT4=
golang.org/x/term v0.39.0/go.mod h1:yxzUCTP/U+FzoxfdKmLaA0RV1WgE0VY7hXBwKtY/4ww= golang.org/x/term v0.43.0/go.mod h1:lrhlHNdQJHO+1qVYiHfFKVuVioJIheAc3fBSMFYEIsk=
golang.org/x/text v0.33.0 h1:B3njUFyqtHDUI5jMn1YIr5B0IE2U0qck04r6d4KPAxE= golang.org/x/text v0.37.0 h1:Cqjiwd9eSg8e0QAkyCaQTNHFIIzWtidPahFWR83rTrc=
golang.org/x/text v0.33.0/go.mod h1:LuMebE6+rBincTi9+xWTY8TztLzKHc/9C1uBCG27+q8= golang.org/x/text v0.37.0/go.mod h1:a5sjxXGs9hsn/AJVwuElvCAo9v8QYLzvavO5z2PiM38=
golang.org/x/tools v0.40.0 h1:yLkxfA+Qnul4cs9QA3KnlFu0lVmd8JJfoq+E41uSutA= golang.org/x/tools v0.44.0 h1:UP4ajHPIcuMjT1GqzDWRlalUEoY+uzoZKnhOjbIPD2c=
golang.org/x/tools v0.40.0/go.mod h1:Ik/tzLRlbscWpqqMRjyWYDisX8bG13FrdXp3o4Sr9lc= golang.org/x/tools v0.44.0/go.mod h1:KA0AfVErSdxRZIsOVipbv3rQhVXTnlU6UhKxHd1seDI=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA= gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
+25 -6
@@ -5,12 +5,13 @@ import (
"context" "context"
"encoding/json" "encoding/json"
"fmt" "fmt"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"net/http" "net/http"
"net/smtp" "net/smtp"
"strconv" "strconv"
"strings" "strings"
"time" "time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
) )
var alertClient = &http.Client{Timeout: 10 * time.Second} var alertClient = &http.Client{Timeout: 10 * time.Second}
@@ -24,6 +25,7 @@ type PayloadFunc func(title, message string) ([]byte, error)
type HTTPProvider struct { type HTTPProvider struct {
URL string URL string
Payload PayloadFunc Payload PayloadFunc
Headers map[string]string
} }
func (h *HTTPProvider) Send(ctx context.Context, title, message string) error { func (h *HTTPProvider) Send(ctx context.Context, title, message string) error {
@@ -36,6 +38,9 @@ func (h *HTTPProvider) Send(ctx context.Context, title, message string) error {
return err return err
} }
req.Header.Set("Content-Type", "application/json") req.Header.Set("Content-Type", "application/json")
for k, v := range h.Headers {
req.Header.Set(k, v)
}
resp, err := alertClient.Do(req) resp, err := alertClient.Do(req)
if err != nil { if err != nil {
return err return err
@@ -164,8 +169,9 @@ func GetProvider(cfg models.AlertConfig) Provider {
} }
serverURL := strings.TrimRight(cfg.Settings["url"], "/") serverURL := strings.TrimRight(cfg.Settings["url"], "/")
return &HTTPProvider{ return &HTTPProvider{
URL: fmt.Sprintf("%s/message?token=%s", serverURL, cfg.Settings["token"]), URL: serverURL + "/message",
Payload: gotifyPayload(priority), Payload: gotifyPayload(priority),
Headers: map[string]string{"X-Gotify-Key": cfg.Settings["token"]},
} }
default: default:
return nil return nil
@@ -176,6 +182,12 @@ type EmailProvider struct {
Host, Port, User, Pass, To, From string Host, Port, User, Pass, To, From string
} }
func sanitizeHeader(s string) string {
s = strings.ReplaceAll(s, "\r", "")
s = strings.ReplaceAll(s, "\n", "")
return s
}
func (e *EmailProvider) Send(ctx context.Context, title, message string) error { func (e *EmailProvider) Send(ctx context.Context, title, message string) error {
select { select {
case <-ctx.Done(): case <-ctx.Done():
@@ -183,11 +195,18 @@ func (e *EmailProvider) Send(ctx context.Context, title, message string) error {
default: default:
} }
auth := smtp.PlainAuth("", e.User, e.Pass, e.Host) auth := smtp.PlainAuth("", e.User, e.Pass, e.Host)
msg := []byte("To: " + e.To + "\r\n" + to := sanitizeHeader(e.To)
"Subject: uptop: " + title + "\r\n" + from := sanitizeHeader(e.From)
subject := sanitizeHeader(title)
body := strings.ReplaceAll(message, "\r", "")
msg := []byte("From: " + from + "\r\n" +
"To: " + to + "\r\n" +
"Subject: uptop: " + subject + "\r\n" +
"MIME-Version: 1.0\r\n" +
"Content-Type: text/plain; charset=utf-8\r\n" +
"\r\n" + "\r\n" +
message + "\r\n") body + "\r\n")
return smtp.SendMail(e.Host+":"+e.Port, auth, e.From, []string{e.To}, msg) return smtp.SendMail(e.Host+":"+e.Port, auth, from, []string{to}, msg)
} }
type NtfyProvider struct { type NtfyProvider struct {
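The Gotify change in this file moves the token out of the URL query string and into the `X-Gotify-Key` request header, which keeps it out of URLs that tend to end up in logs. A small sketch of what that looks like on the wire, using a local `httptest` server; the helper names `sendWithKey` and `demo` and the token value are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// sendWithKey posts a JSON body and passes the token in the X-Gotify-Key
// header instead of the query string. The function name is hypothetical.
func sendWithKey(client *http.Client, baseURL, token string, body []byte) (int, error) {
	req, err := http.NewRequest(http.MethodPost, baseURL+"/message", bytes.NewReader(body))
	if err != nil {
		return 0, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("X-Gotify-Key", token)
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// demo runs sendWithKey against a throwaway local server and reports what
// the server observed: the raw query string and the header value.
func demo() (status int, query, header string, err error) {
	type seen struct{ query, header string }
	ch := make(chan seen, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ch <- seen{r.URL.RawQuery, r.Header.Get("X-Gotify-Key")}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	status, err = sendWithKey(srv.Client(), srv.URL, "s3cret", []byte(`{"message":"m"}`))
	if err != nil {
		return 0, "", "", err
	}
	s := <-ch
	return status, s.query, s.header, nil
}

func main() {
	status, query, header, err := demo()
	fmt.Printf("status=%d query=%q header=%q err=%v\n", status, query, header, err)
}
```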
+19 -1
@@ -3,10 +3,11 @@ package alert
import ( import (
"context" "context"
"encoding/json" "encoding/json"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"net/http" "net/http"
"net/http/httptest" "net/http/httptest"
"testing" "testing"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
) )
func TestHTTPProviderDiscord(t *testing.T) { func TestHTTPProviderDiscord(t *testing.T) {
@@ -212,3 +213,20 @@ func TestGetProviderUnknown(t *testing.T) {
t.Error("expected nil for unknown provider type") t.Error("expected nil for unknown provider type")
} }
} }
func TestSanitizeHeader(t *testing.T) {
tests := []struct {
input, want string
}{
{"normal subject", "normal subject"},
{"inject\r\nBcc: evil@bad.com", "injectBcc: evil@bad.com"},
{"has\nnewline", "hasnewline"},
{"has\rcarriage", "hascarriage"},
}
for _, tt := range tests {
got := sanitizeHeader(tt.input)
if got != tt.want {
t.Errorf("sanitizeHeader(%q) = %q, want %q", tt.input, got, tt.want)
}
}
}
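`TestSanitizeHeader` checks the helper in isolation. To see why stripping CR and LF matters for the assembled message, the sketch below builds a message from a subject containing an injected `Bcc:` line and counts the resulting header lines. `stripCRLF`, `buildMessage`, and `headerLines` are hypothetical names used only for this illustration; the addresses are placeholders.

```go
package main

import (
	"fmt"
	"strings"
)

// stripCRLF removes CR and LF, mirroring sanitizeHeader in the diff.
func stripCRLF(s string) string {
	return strings.NewReplacer("\r", "", "\n", "").Replace(s)
}

// buildMessage is a hypothetical helper that assembles sanitized headers
// followed by a blank line and the body.
func buildMessage(from, to, subject, body string) string {
	return "From: " + stripCRLF(from) + "\r\n" +
		"To: " + stripCRLF(to) + "\r\n" +
		"Subject: " + stripCRLF(subject) + "\r\n" +
		"\r\n" + body + "\r\n"
}

// headerLines returns the lines that precede the first blank line.
func headerLines(msg string) []string {
	head, _, _ := strings.Cut(msg, "\r\n\r\n")
	return strings.Split(head, "\r\n")
}

func main() {
	msg := buildMessage("uptop@example.com", "ops@example.com",
		"down\r\nBcc: evil@bad.com", "site is down")
	// The injected text stays inside the Subject value instead of
	// becoming a header line of its own.
	for _, h := range headerLines(msg) {
		fmt.Printf("%q\n", h)
	}
}
```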
+4 -3
@@ -3,10 +3,11 @@ package cluster
import ( import (
"context" "context"
"fmt" "fmt"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
"net/http" "net/http"
"strings" "strings"
"time" "time"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
) )
type Config struct { type Config struct {
@@ -57,8 +58,8 @@ func runFollowerLoop(ctx context.Context, cfg Config, eng *monitor.Engine) {
resp, err := client.Do(req) resp, err := client.Do(req)
isLeaderHealthy := false isLeaderHealthy := false
if err == nil && resp.StatusCode == 200 { if err == nil {
isLeaderHealthy = true isLeaderHealthy = resp.StatusCode == 200
_ = resp.Body.Close() _ = resp.Body.Close()
} }
+7 -4
@@ -3,14 +3,15 @@ package cluster
import ( import (
"context" "context"
"encoding/json" "encoding/json"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
"net/http" "net/http"
"net/http/httptest" "net/http/httptest"
"sync" "sync"
"sync/atomic" "sync/atomic"
"testing" "testing"
"time" "time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
) )
// --- Mock Store (minimal, for monitor.NewEngine) --- // --- Mock Store (minimal, for monitor.NewEngine) ---
@@ -66,6 +67,8 @@ func (m *mockStore) DeleteMaintenanceWindow(int) error { retur
func (m *mockStore) IsMonitorInMaintenance(int) (bool, error) { return false, nil } func (m *mockStore) IsMonitorInMaintenance(int) (bool, error) { return false, nil }
func (m *mockStore) GetPreference(string) (string, error) { return "", nil } func (m *mockStore) GetPreference(string) (string, error) { return "", nil }
func (m *mockStore) SetPreference(string, string) error { return nil } func (m *mockStore) SetPreference(string, string) error { return nil }
func (m *mockStore) SaveStateChange(int, string, string, string) error { return nil }
func (m *mockStore) GetStateChanges(int, int) ([]models.StateChange, error) { return nil, nil }
func (m *mockStore) Close() error { return nil } func (m *mockStore) Close() error { return nil }
// --- Cluster Start Tests --- // --- Cluster Start Tests ---
@@ -295,7 +298,7 @@ func TestProbeExecuteChecks(t *testing.T) {
strict := &http.Client{} strict := &http.Client{}
insecure := &http.Client{} insecure := &http.Client{}
results := probeExecuteChecks(context.Background(), sites, strict, insecure) results := probeExecuteChecks(context.Background(), sites, strict, insecure, true)
if len(results) != 2 { if len(results) != 2 {
t.Fatalf("expected 2 results, got %d", len(results)) t.Fatalf("expected 2 results, got %d", len(results))
@@ -329,7 +332,7 @@ func TestProbeExecuteChecks_Concurrency(t *testing.T) {
sites = append(sites, models.Site{ID: i + 1, Type: "http", URL: srv.URL}) sites = append(sites, models.Site{ID: i + 1, Type: "http", URL: srv.URL})
} }
results := probeExecuteChecks(context.Background(), sites, &http.Client{}, &http.Client{}) results := probeExecuteChecks(context.Background(), sites, &http.Client{}, &http.Client{}, true)
if len(results) != 20 { if len(results) != 20 {
t.Errorf("expected 20 results, got %d", len(results)) t.Errorf("expected 20 results, got %d", len(results))
} }
+21 -8
@@ -6,12 +6,14 @@ import (
"crypto/tls" "crypto/tls"
"encoding/json" "encoding/json"
"fmt" "fmt"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
"log" "log"
"net/http" "net/http"
"net/url"
"sync" "sync"
"time" "time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
) )
type ProbeConfig struct { type ProbeConfig struct {
@@ -21,6 +23,7 @@ type ProbeConfig struct {
LeaderURL string LeaderURL string
SharedKey string SharedKey string
Interval int Interval int
AllowPrivateTargets bool
} }
func RunProbe(ctx context.Context, cfg ProbeConfig) error { func RunProbe(ctx context.Context, cfg ProbeConfig) error {
@@ -29,11 +32,18 @@ func RunProbe(ctx context.Context, cfg ProbeConfig) error {
} }
apiClient := &http.Client{Timeout: 10 * time.Second} apiClient := &http.Client{Timeout: 10 * time.Second}
dial := monitor.SafeDialContext(cfg.AllowPrivateTargets)
strictClient := &http.Client{ strictClient := &http.Client{
Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: false}}, Transport: &http.Transport{
TLSClientConfig: &tls.Config{InsecureSkipVerify: false},
DialContext: dial,
},
} }
insecureClient := &http.Client{ insecureClient := &http.Client{
Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: true}}, //nolint:gosec // intentional for IgnoreTLS sites Transport: &http.Transport{
TLSClientConfig: &tls.Config{InsecureSkipVerify: true}, //nolint:gosec // intentional for IgnoreTLS sites
DialContext: dial,
},
} }
if err := probeRegister(ctx, apiClient, cfg); err != nil { if err := probeRegister(ctx, apiClient, cfg); err != nil {
@@ -59,7 +69,7 @@ func RunProbe(ctx context.Context, cfg ProbeConfig) error {
continue continue
} }
results := probeExecuteChecks(ctx, sites, strictClient, insecureClient) results := probeExecuteChecks(ctx, sites, strictClient, insecureClient, cfg.AllowPrivateTargets)
if len(results) > 0 { if len(results) > 0 {
if err := probeReportResults(ctx, apiClient, cfg, results); err != nil { if err := probeReportResults(ctx, apiClient, cfg, results); err != nil {
@@ -93,7 +103,8 @@ func probeRegister(ctx context.Context, client *http.Client, cfg ProbeConfig) er
} }
func probeFetchAssignments(ctx context.Context, client *http.Client, cfg ProbeConfig) ([]models.Site, error) { func probeFetchAssignments(ctx context.Context, client *http.Client, cfg ProbeConfig) ([]models.Site, error) {
req, err := http.NewRequestWithContext(ctx, "GET", cfg.LeaderURL+"/api/probe/assignments?node_id="+cfg.NodeID, nil) assignURL := cfg.LeaderURL + "/api/probe/assignments?" + url.Values{"node_id": {cfg.NodeID}}.Encode()
req, err := http.NewRequestWithContext(ctx, "GET", assignURL, nil)
if err != nil { if err != nil {
return nil, err return nil, err
} }
@@ -119,9 +130,10 @@ type probeResultItem struct {
SiteID int `json:"site_id"` SiteID int `json:"site_id"`
LatencyNs int64 `json:"latency_ns"` LatencyNs int64 `json:"latency_ns"`
IsUp bool `json:"is_up"` IsUp bool `json:"is_up"`
ErrorReason string `json:"error_reason,omitempty"`
} }
func probeExecuteChecks(ctx context.Context, sites []models.Site, strict, insecure *http.Client) []probeResultItem { func probeExecuteChecks(ctx context.Context, sites []models.Site, strict, insecure *http.Client, allowPrivate bool) []probeResultItem {
var mu sync.Mutex var mu sync.Mutex
var results []probeResultItem var results []probeResultItem
sem := make(chan struct{}, 10) sem := make(chan struct{}, 10)
@@ -140,12 +152,13 @@ loop:
defer wg.Done() defer wg.Done()
defer func() { <-sem }() defer func() { <-sem }()
cr := monitor.RunCheck(s, strict, insecure, false) cr := monitor.RunCheck(s, strict, insecure, false, allowPrivate)
mu.Lock() mu.Lock()
results = append(results, probeResultItem{ results = append(results, probeResultItem{
SiteID: s.ID, SiteID: s.ID,
LatencyNs: cr.LatencyNs, LatencyNs: cr.LatencyNs,
IsUp: cr.Status == "UP", IsUp: cr.Status == "UP",
ErrorReason: cr.ErrorReason,
}) })
mu.Unlock() mu.Unlock()
}(site) }(site)
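`probeFetchAssignments` now builds its query with `url.Values` rather than string concatenation. The sketch below shows the difference this makes when a node ID contains characters that are meaningful in a query string. The function name `assignmentsURL` and the example host are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// assignmentsURL builds the assignments endpoint URL, escaping the node ID
// via url.Values. The function name is hypothetical.
func assignmentsURL(leaderURL, nodeID string) string {
	return leaderURL + "/api/probe/assignments?" + url.Values{"node_id": {nodeID}}.Encode()
}

func main() {
	// A plain ID passes through unchanged.
	fmt.Println(assignmentsURL("http://leader:8080", "probe-1"))
	// An ID containing '&' and '=' is escaped, so it cannot smuggle in
	// an extra query parameter.
	fmt.Println(assignmentsURL("http://leader:8080", "east&role=admin"))
}
```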
+4 -3
@@ -2,11 +2,12 @@ package config
import ( import (
"fmt" "fmt"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/store"
"os" "os"
"sort" "sort"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/store"
"gopkg.in/yaml.v3" "gopkg.in/yaml.v3"
) )
@@ -142,7 +143,7 @@ func WriteFile(f *File, path string) error {
_, err = os.Stdout.Write(data) _, err = os.Stdout.Write(data)
return err return err
} }
return os.WriteFile(path, data, 0644) //nolint:gosec // config files should be group-readable return os.WriteFile(path, data, 0600)
} }
func LoadFile(path string) (*File, error) { func LoadFile(path string) (*File, error) {
+5 -2
@@ -2,13 +2,14 @@ package metrics
import ( import (
"context" "context"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
"net/http" "net/http"
"net/http/httptest" "net/http/httptest"
"strings" "strings"
"testing" "testing"
"time" "time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
) )
type mockStore struct { type mockStore struct {
@@ -64,6 +65,8 @@ func (m *mockStore) DeleteMaintenanceWindow(int) error { retur
func (m *mockStore) IsMonitorInMaintenance(int) (bool, error) { return false, nil } func (m *mockStore) IsMonitorInMaintenance(int) (bool, error) { return false, nil }
func (m *mockStore) GetPreference(string) (string, error) { return "", nil } func (m *mockStore) GetPreference(string) (string, error) { return "", nil }
func (m *mockStore) SetPreference(string, string) error { return nil } func (m *mockStore) SetPreference(string, string) error { return nil }
func (m *mockStore) SaveStateChange(int, string, string, string) error { return nil }
func (m *mockStore) GetStateChanges(int, int) ([]models.StateChange, error) { return nil, nil }
func (m *mockStore) Close() error { return nil } func (m *mockStore) Close() error { return nil }
func TestMetricsHandler(t *testing.T) { func TestMetricsHandler(t *testing.T) {
+12
@@ -35,6 +35,18 @@ type Site struct {
HasSSL bool HasSSL bool
LastCheck time.Time LastCheck time.Time
SentSSLWarning bool SentSSLWarning bool
LastError string
StatusChangedAt time.Time
LastSuccessAt time.Time
}
type StateChange struct {
ID int
SiteID int
FromStatus string
ToStatus string
ErrorReason string
ChangedAt time.Time
} }
type AlertConfig struct { type AlertConfig struct {
+1
@@ -15,6 +15,7 @@ type NodeResult struct {
IsUp bool IsUp bool
LatencyNs int64 LatencyNs int64
CheckedAt time.Time CheckedAt time.Time
ErrorReason string
} }
func AggregateStatus(results []NodeResult, strategy AggregationStrategy) (isUp bool, avgLatencyNs int64) { func AggregateStatus(results []NodeResult, strategy AggregationStrategy) (isUp bool, avgLatencyNs int64) {
+48 -10
@@ -2,13 +2,15 @@ package monitor
import ( import (
"context" "context"
"gitea.lerkolabs.com/lerko/uptop/internal/models" "fmt"
"net" "net"
"net/http" "net/http"
"strconv" "strconv"
"strings" "strings"
"time" "time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"github.com/miekg/dns" "github.com/miekg/dns"
probing "github.com/prometheus-community/pro-bing" probing "github.com/prometheus-community/pro-bing"
) )
@@ -20,9 +22,28 @@ type CheckResult struct {
LatencyNs int64 LatencyNs int64
HasSSL bool HasSSL bool
CertExpiry time.Time CertExpiry time.Time
ErrorReason string
} }
func RunCheck(site models.Site, strict, insecure *http.Client, globalInsecure bool) CheckResult { func RunCheck(site models.Site, strict, insecure *http.Client, globalInsecure bool, allowPrivate ...bool) CheckResult {
private := len(allowPrivate) > 0 && allowPrivate[0]
if site.Type != "http" && site.Type != "dns" && !private {
host := site.Hostname
if host == "" {
host = site.URL
}
if host != "" {
if ips, err := net.LookupIP(host); err == nil {
for _, ip := range ips {
if isPrivateIP(ip) {
return CheckResult{SiteID: site.ID, Status: "DOWN", ErrorReason: "target resolves to private IP"}
}
}
}
}
}
switch site.Type { switch site.Type {
case "http": case "http":
return runHTTPCheck(site, strict, insecure, globalInsecure) return runHTTPCheck(site, strict, insecure, globalInsecure)
@@ -33,7 +54,7 @@ func RunCheck(site models.Site, strict, insecure *http.Client, globalInsecure bo
case "dns": case "dns":
return runDNSCheck(site) return runDNSCheck(site)
default: default:
return CheckResult{SiteID: site.ID, Status: "DOWN"} return CheckResult{SiteID: site.ID, Status: "DOWN", ErrorReason: "unsupported monitor type: " + site.Type}
} }
} }
@@ -49,7 +70,7 @@ func runHTTPCheck(site models.Site, strict, insecure *http.Client, globalInsecur
req, err := http.NewRequestWithContext(ctx, method, site.URL, nil) req, err := http.NewRequestWithContext(ctx, method, site.URL, nil)
if err != nil { if err != nil {
return CheckResult{SiteID: site.ID, Status: "DOWN"} return CheckResult{SiteID: site.ID, Status: "DOWN", ErrorReason: "invalid request: " + err.Error()}
} }
client := strict client := strict
@@ -69,6 +90,7 @@ func runHTTPCheck(site models.Site, strict, insecure *http.Client, globalInsecur
if err != nil { if err != nil {
result.Status = "DOWN" result.Status = "DOWN"
result.ErrorReason = truncateError(err.Error(), 256)
return result return result
} }
defer resp.Body.Close() defer resp.Body.Close()
@@ -76,6 +98,11 @@ func runHTTPCheck(site models.Site, strict, insecure *http.Client, globalInsecur
result.StatusCode = resp.StatusCode result.StatusCode = resp.StatusCode
if !isCodeAccepted(resp.StatusCode, site.AcceptedCodes) { if !isCodeAccepted(resp.StatusCode, site.AcceptedCodes) {
result.Status = "DOWN" result.Status = "DOWN"
expected := site.AcceptedCodes
if expected == "" {
expected = "200-299"
}
result.ErrorReason = fmt.Sprintf("HTTP %d (expected %s)", resp.StatusCode, expected)
} }
if site.CheckSSL && resp.TLS != nil && len(resp.TLS.PeerCertificates) > 0 { if site.CheckSSL && resp.TLS != nil && len(resp.TLS.PeerCertificates) > 0 {
@@ -84,6 +111,7 @@ func runHTTPCheck(site models.Site, strict, insecure *http.Client, globalInsecur
result.CertExpiry = cert.NotAfter result.CertExpiry = cert.NotAfter
if time.Now().After(cert.NotAfter) { if time.Now().After(cert.NotAfter) {
result.Status = "SSL EXP" result.Status = "SSL EXP"
result.ErrorReason = "SSL certificate expired"
} }
} }
@@ -98,7 +126,7 @@ func runPingCheck(site models.Site) CheckResult {
pinger, err := probing.NewPinger(host) pinger, err := probing.NewPinger(host)
if err != nil { if err != nil {
return CheckResult{SiteID: site.ID, Status: "DOWN"} return CheckResult{SiteID: site.ID, Status: "DOWN", ErrorReason: "ping setup: " + err.Error()}
} }
pinger.Count = 1 pinger.Count = 1
pinger.Timeout = siteTimeout(site) pinger.Timeout = siteTimeout(site)
@@ -108,8 +136,11 @@ func runPingCheck(site models.Site) CheckResult {
err = pinger.Run() err = pinger.Run()
latency := time.Since(start) latency := time.Since(start)
if err != nil || pinger.Statistics().PacketsRecv == 0 { if err != nil {
return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds()} return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds(), ErrorReason: "ping failed: " + err.Error()}
}
if pinger.Statistics().PacketsRecv == 0 {
return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds(), ErrorReason: "no ICMP response"}
} }
stats := pinger.Statistics() stats := pinger.Statistics()
@@ -129,7 +160,7 @@ func runPortCheck(site models.Site) CheckResult {
latency := time.Since(start) latency := time.Since(start)
if err != nil { if err != nil {
return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds()} return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds(), ErrorReason: truncateError(err.Error(), 256)}
} }
_ = conn.Close() _ = conn.Close()
return CheckResult{SiteID: site.ID, Status: "UP", LatencyNs: latency.Nanoseconds()} return CheckResult{SiteID: site.ID, Status: "UP", LatencyNs: latency.Nanoseconds()}
@@ -180,10 +211,10 @@ func runDNSCheck(site models.Site) CheckResult {
latency := time.Since(start) latency := time.Since(start)
if err != nil { if err != nil {
return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds()} return CheckResult{SiteID: site.ID, Status: "DOWN", LatencyNs: latency.Nanoseconds(), ErrorReason: "DNS query failed: " + err.Error()}
} }
if r.Rcode != dns.RcodeSuccess { if r.Rcode != dns.RcodeSuccess {
return CheckResult{SiteID: site.ID, Status: "DOWN", StatusCode: r.Rcode, LatencyNs: latency.Nanoseconds()} return CheckResult{SiteID: site.ID, Status: "DOWN", StatusCode: r.Rcode, LatencyNs: latency.Nanoseconds(), ErrorReason: "DNS RCODE: " + dns.RcodeToString[r.Rcode]}
} }
return CheckResult{SiteID: site.ID, Status: "UP", LatencyNs: latency.Nanoseconds()} return CheckResult{SiteID: site.ID, Status: "UP", LatencyNs: latency.Nanoseconds()}
} }
@@ -216,3 +247,10 @@ func isCodeAccepted(code int, accepted string) bool {
} }
return false return false
} }
func truncateError(s string, max int) string {
if len(s) <= max {
return s
}
return s[:max-3] + "..."
}
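`truncateError` slices by byte index, so an error string containing multi-byte UTF-8 could be cut in the middle of a character, and a `max` below 3 would make `s[:max-3]` panic. All call sites in this diff pass 256, so the second case does not arise here. A hedged sketch of a variant that cuts on rune boundaries follows; `truncateRunes` and `cutAtRune` are hypothetical names, not part of the PR.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// cutAtRune returns the longest prefix of s that is at most n bytes long
// and ends on a UTF-8 rune boundary.
func cutAtRune(s string, n int) string {
	if n <= 0 {
		return ""
	}
	if n >= len(s) {
		return s
	}
	// Walk back until the byte at n starts a new rune.
	for n > 0 && !utf8.RuneStart(s[n]) {
		n--
	}
	return s[:n]
}

// truncateRunes limits s to max bytes, appending "..." when there is room
// for it. It is a hypothetical rune-aware variant of truncateError.
func truncateRunes(s string, max int) string {
	if len(s) <= max {
		return s
	}
	if max <= 3 {
		return cutAtRune(s, max) // no room for an ellipsis
	}
	return cutAtRune(s, max-3) + "..."
}

func main() {
	fmt.Println(truncateRunes("héllo wörld", 8))
	fmt.Println(truncateRunes("short", 10))
	fmt.Println(truncateRunes("hello", 2))
}
```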
+22 -3
@@ -2,13 +2,14 @@ package monitor
import ( import (
"crypto/tls" "crypto/tls"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"net" "net"
"net/http" "net/http"
"net/http/httptest" "net/http/httptest"
"strconv" "strconv"
"testing" "testing"
"time" "time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
) )
func TestRunCheck_HTTP_Success(t *testing.T) { func TestRunCheck_HTTP_Success(t *testing.T) {
@@ -132,7 +133,7 @@ func TestRunCheck_Port_Open(t *testing.T) {
port, _ := strconv.Atoi(portStr) port, _ := strconv.Atoi(portStr)
site := models.Site{ID: 1, Type: "port", Hostname: "127.0.0.1", Port: port, Timeout: 2} site := models.Site{ID: 1, Type: "port", Hostname: "127.0.0.1", Port: port, Timeout: 2}
result := RunCheck(site, nil, nil, false) result := RunCheck(site, nil, nil, false, true)
if result.Status != "UP" { if result.Status != "UP" {
t.Errorf("expected UP, got %s", result.Status) t.Errorf("expected UP, got %s", result.Status)
@@ -152,13 +153,31 @@ func TestRunCheck_Port_Closed(t *testing.T) {
ln.Close() ln.Close()
site := models.Site{ID: 1, Type: "port", Hostname: "127.0.0.1", Port: port, Timeout: 1} site := models.Site{ID: 1, Type: "port", Hostname: "127.0.0.1", Port: port, Timeout: 1}
result := RunCheck(site, nil, nil, false) result := RunCheck(site, nil, nil, false, true)
if result.Status != "DOWN" { if result.Status != "DOWN" {
t.Errorf("expected DOWN, got %s", result.Status) t.Errorf("expected DOWN, got %s", result.Status)
} }
} }
func TestRunCheck_Port_BlocksPrivateByDefault(t *testing.T) {
ln, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatal(err)
}
defer ln.Close()
_, portStr, _ := net.SplitHostPort(ln.Addr().String())
port, _ := strconv.Atoi(portStr)
site := models.Site{ID: 1, Type: "port", Hostname: "127.0.0.1", Port: port, Timeout: 2}
result := RunCheck(site, nil, nil, false)
if result.Status != "DOWN" {
t.Errorf("expected DOWN when private targets blocked, got %s", result.Status)
}
}
func TestRunCheck_UnknownType(t *testing.T) { func TestRunCheck_UnknownType(t *testing.T) {
site := models.Site{ID: 1, Type: "invalid"} site := models.Site{ID: 1, Type: "invalid"}
result := RunCheck(site, nil, nil, false) result := RunCheck(site, nil, nil, false)
+228 -36
@@ -4,15 +4,33 @@ import (
"context" "context"
"crypto/tls" "crypto/tls"
"fmt" "fmt"
"math/rand/v2"
"net/http"
"regexp"
"strings"
"sync"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/alert" "gitea.lerkolabs.com/lerko/uptop/internal/alert"
"gitea.lerkolabs.com/lerko/uptop/internal/models" "gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/store" "gitea.lerkolabs.com/lerko/uptop/internal/store"
"math/rand/v2"
"net/http"
"sync"
"time"
) )
const (
maxLogEntries = 100
pollInterval = 5 * time.Second
minCheckInterval = 5
minPushGrace = 60 * time.Second
)
type AlertHealth struct {
LastSendAt time.Time
LastSendOK bool
LastError string
SendCount int
FailCount int
}
type Engine struct { type Engine struct {
mu sync.RWMutex mu sync.RWMutex
liveState map[int]models.Site liveState map[int]models.Site
@@ -32,26 +50,47 @@ type Engine struct {
probeResults map[int]map[string]NodeResult probeResults map[int]map[string]NodeResult
aggStrategy AggregationStrategy aggStrategy AggregationStrategy
alertHealthMu sync.RWMutex
alertHealth map[int]AlertHealth
db store.Store db store.Store
insecureSkipVerify bool insecureSkipVerify bool
allowPrivateTargets bool
strictClient *http.Client strictClient *http.Client
insecureClient *http.Client insecureClient *http.Client
} }
func NewEngine(s store.Store) *Engine { func NewEngine(s store.Store) *Engine {
return newEngine(s, false)
}
func NewEngineWithOpts(s store.Store, allowPrivateTargets bool) *Engine {
return newEngine(s, allowPrivateTargets)
}
func newEngine(s store.Store, allowPrivateTargets bool) *Engine {
dial := SafeDialContext(allowPrivateTargets)
return &Engine{ return &Engine{
liveState: make(map[int]models.Site), liveState: make(map[int]models.Site),
histories: make(map[int]*SiteHistory), histories: make(map[int]*SiteHistory),
tokenIndex: make(map[string]int), tokenIndex: make(map[string]int),
probeResults: make(map[int]map[string]NodeResult), probeResults: make(map[int]map[string]NodeResult),
alertHealth: make(map[int]AlertHealth),
aggStrategy: AggAnyDown, aggStrategy: AggAnyDown,
isActive: true, isActive: true,
allowPrivateTargets: allowPrivateTargets,
db: s, db: s,
strictClient: &http.Client{ strictClient: &http.Client{
Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: false}}, Transport: &http.Transport{
TLSClientConfig: &tls.Config{InsecureSkipVerify: false},
DialContext: dial,
},
}, },
insecureClient: &http.Client{ insecureClient: &http.Client{
Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: true}}, //nolint:gosec // intentional for IgnoreTLS sites Transport: &http.Transport{
TLSClientConfig: &tls.Config{InsecureSkipVerify: true}, //nolint:gosec // intentional for IgnoreTLS sites
DialContext: dial,
},
}, },
} }
} }
@@ -60,14 +99,36 @@ func (e *Engine) SetInsecureSkipVerify(skip bool) {
e.insecureSkipVerify = skip e.insecureSkipVerify = skip
} }
var ansiRe = regexp.MustCompile(`\x1b\[[0-9;]*[a-zA-Z]`)
func sanitizeLog(s string) string {
s = ansiRe.ReplaceAllString(s, "")
s = strings.ReplaceAll(s, "\n", "\\n")
s = strings.ReplaceAll(s, "\r", "")
return s
}
func fmtDurationShort(d time.Duration) string {
if d < time.Minute {
return fmt.Sprintf("%ds", int(d.Seconds()))
}
if d < time.Hour {
return fmt.Sprintf("%dm", int(d.Minutes()))
}
if d < 24*time.Hour {
return fmt.Sprintf("%dh %dm", int(d.Hours()), int(d.Minutes())%60)
}
return fmt.Sprintf("%dd %dh", int(d.Hours())/24, int(d.Hours())%24)
}
func (e *Engine) AddLog(msg string) { func (e *Engine) AddLog(msg string) {
e.logMu.Lock() e.logMu.Lock()
defer e.logMu.Unlock() defer e.logMu.Unlock()
ts := time.Now().Format("15:04:05") ts := time.Now().Format("15:04:05")
entry := fmt.Sprintf("[%s] %s", ts, msg) entry := fmt.Sprintf("[%s] %s", ts, sanitizeLog(msg))
e.logStore = append([]string{entry}, e.logStore...) e.logStore = append([]string{entry}, e.logStore...)
if len(e.logStore) > 100 { if len(e.logStore) > maxLogEntries {
e.logStore = e.logStore[:100] e.logStore = e.logStore[:maxLogEntries]
} }
go func() { _ = e.db.SaveLog(entry) }() go func() { _ = e.db.SaveLog(entry) }()
} }
@@ -150,17 +211,38 @@ func (e *Engine) RecordHeartbeat(token string) bool {
return false return false
} }
prevStatus := site.Status
site.LastCheck = time.Now() site.LastCheck = time.Now()
wasDown := site.Status == "DOWN"
site.Status = "UP" site.Status = "UP"
site.FailureCount = 0 site.FailureCount = 0
site.Latency = 0 site.Latency = 0
site.LastError = ""
site.LastSuccessAt = time.Now()
prevChangedAt := site.StatusChangedAt // captured before the overwrite below so the DOWN case reports the real outage length
if prevStatus != "UP" {
site.StatusChangedAt = time.Now()
}
e.liveState[targetID] = site e.liveState[targetID] = site
if wasDown { switch prevStatus {
e.AddLog(fmt.Sprintf("Push Monitor '%s' recovered", site.Name)) case "PENDING":
e.triggerAlert(site.AlertID, "✅ RECOVERY", fmt.Sprintf("Push Monitor '%s' is receiving heartbeats.", site.Name)) e.AddLog(fmt.Sprintf("Push Monitor '%s' received first heartbeat", site.Name))
case "LATE":
e.AddLog(fmt.Sprintf("Push Monitor '%s' heartbeat arrived (was late)", site.Name))
case "DOWN":
downDur := ""
if !prevChangedAt.IsZero() {
downDur = fmt.Sprintf(" (was down %s)", fmtDurationShort(time.Since(prevChangedAt)))
} }
e.AddLog(fmt.Sprintf("Push Monitor '%s' recovered%s", site.Name, downDur))
go e.triggerAlert(site.AlertID, "✅ RECOVERY", fmt.Sprintf("Push Monitor '%s' is receiving heartbeats.%s", site.Name, downDur))
}
if prevStatus != "UP" && prevStatus != "PENDING" {
go func() { _ = e.db.SaveStateChange(targetID, prevStatus, "UP", "") }()
}
return true return true
} }
@@ -192,7 +274,7 @@ func (e *Engine) Start(ctx context.Context) {
if err != nil { if err != nil {
e.AddLog(fmt.Sprintf("Failed to load sites: %v", err)) e.AddLog(fmt.Sprintf("Failed to load sites: %v", err))
select { select {
case <-time.After(5 * time.Second): case <-time.After(pollInterval):
case <-ctx.Done(): case <-ctx.Done():
return return
} }
@@ -205,9 +287,6 @@ func (e *Engine) Start(ctx context.Context) {
if !exists { if !exists {
e.mu.Lock() e.mu.Lock()
s.Status = "PENDING" s.Status = "PENDING"
if s.Type == "push" {
s.LastCheck = time.Now()
}
if h, ok := e.GetHistory(s.ID); ok && len(h.Statuses) > 0 { if h, ok := e.GetHistory(s.ID); ok && len(h.Statuses) > 0 {
if h.Statuses[len(h.Statuses)-1] { if h.Statuses[len(h.Statuses)-1] {
s.Status = "UP" s.Status = "UP"
@@ -226,7 +305,7 @@ func (e *Engine) Start(ctx context.Context) {
} }
select { select {
case <-time.After(5 * time.Second): case <-time.After(pollInterval):
case <-ctx.Done(): case <-ctx.Done():
return return
} }
@@ -247,6 +326,9 @@ func (e *Engine) UpdateSiteConfig(site models.Site) {
site.LastCheck = existing.LastCheck site.LastCheck = existing.LastCheck
site.SentSSLWarning = existing.SentSSLWarning site.SentSSLWarning = existing.SentSSLWarning
site.FailureCount = existing.FailureCount site.FailureCount = existing.FailureCount
site.LastError = existing.LastError
site.StatusChangedAt = existing.StatusChangedAt
site.LastSuccessAt = existing.LastSuccessAt
e.liveState[site.ID] = site e.liveState[site.ID] = site
e.addToTokenIndex(site) e.addToTokenIndex(site)
} }
@@ -296,7 +378,7 @@ func (e *Engine) monitorRoutine(ctx context.Context, id int) {
if !e.IsActive() { if !e.IsActive() {
select { select {
case <-time.After(5 * time.Second): case <-time.After(pollInterval):
case <-ctx.Done(): case <-ctx.Done():
return return
} }
@@ -312,7 +394,7 @@ func (e *Engine) monitorRoutine(ctx context.Context, id int) {
if site.Paused { if site.Paused {
select { select {
case <-time.After(5 * time.Second): case <-time.After(pollInterval):
case <-ctx.Done(): case <-ctx.Done():
return return
} }
@@ -320,8 +402,8 @@ func (e *Engine) monitorRoutine(ctx context.Context, id int) {
} }
interval := site.Interval interval := site.Interval
if interval < 5 { if interval < minCheckInterval {
interval = 5 interval = minCheckInterval
} }
jitter := time.Duration(rand.IntN(interval*100)) * time.Millisecond //nolint:gosec // non-security jitter jitter := time.Duration(rand.IntN(interval*100)) * time.Millisecond //nolint:gosec // non-security jitter
select { select {
@@ -351,39 +433,68 @@ func (e *Engine) checkByID(id int) {
case "group": case "group":
e.checkGroup(site) e.checkGroup(site)
default: default:
result := RunCheck(site, e.strictClient, e.insecureClient, e.insecureSkipVerify) result := RunCheck(site, e.strictClient, e.insecureClient, e.insecureSkipVerify, e.allowPrivateTargets)
updatedSite := site updatedSite := site
updatedSite.HasSSL = result.HasSSL updatedSite.HasSSL = result.HasSSL
updatedSite.CertExpiry = result.CertExpiry updatedSite.CertExpiry = result.CertExpiry
updatedSite.Latency = time.Duration(result.LatencyNs) updatedSite.Latency = time.Duration(result.LatencyNs)
updatedSite.LastCheck = time.Now() updatedSite.LastCheck = time.Now()
e.handleStatusChange(updatedSite, result.Status, result.StatusCode, time.Duration(result.LatencyNs)) e.handleStatusChange(updatedSite, result.Status, result.StatusCode, time.Duration(result.LatencyNs), result.ErrorReason)
} }
} }
func (e *Engine) checkPush(site models.Site) { func (e *Engine) checkPush(site models.Site) {
deadline := site.LastCheck.Add(time.Duration(site.Interval) * time.Second).Add(5 * time.Second) if site.Status == "PENDING" {
if time.Now().After(deadline) { return
e.handleStatusChange(site, "DOWN", 0, 0) }
} else if site.Status != "UP" {
e.handleStatusChange(site, "UP", 200, 0) interval := time.Duration(site.Interval) * time.Second
grace := interval / 2
if grace < minPushGrace {
grace = minPushGrace
}
overdue := site.LastCheck.Add(interval)
graceEnd := overdue.Add(grace)
now := time.Now()
if now.After(graceEnd) {
if site.Status != "DOWN" {
e.handleStatusChange(site, "DOWN", 0, 0, "heartbeat missed")
}
} else if now.After(overdue) {
if site.Status != "LATE" {
e.handleStatusChange(site, "LATE", 0, 0, "heartbeat overdue")
}
} }
} }
func (e *Engine) handleStatusChange(site models.Site, rawStatus string, code int, latency time.Duration) { func (e *Engine) handleStatusChange(site models.Site, rawStatus string, code int, latency time.Duration, errorReason string) {
if !e.IsActive() { if !e.IsActive() {
return return
} }
newState := site newState := site
newState.StatusCode = code newState.StatusCode = code
newState.LastError = errorReason
if rawStatus == "UP" {
newState.LastSuccessAt = time.Now()
newState.LastError = ""
} else {
newState.LastSuccessAt = site.LastSuccessAt
}
if site.Status == "UP" && rawStatus != "UP" { if site.Status == "UP" && rawStatus != "UP" {
newState.FailureCount++ newState.FailureCount++
if newState.FailureCount > site.MaxRetries { if newState.FailureCount > site.MaxRetries {
newState.Status = rawStatus newState.Status = rawStatus
newState.FailureCount = site.MaxRetries + 1 newState.FailureCount = site.MaxRetries + 1
if errorReason != "" {
e.AddLog(fmt.Sprintf("Monitor '%s' confirmed DOWN: %s", site.Name, errorReason))
} else {
e.AddLog(fmt.Sprintf("Monitor '%s' confirmed DOWN", site.Name)) e.AddLog(fmt.Sprintf("Monitor '%s' confirmed DOWN", site.Name))
}
} else { } else {
e.AddLog(fmt.Sprintf("Monitor '%s' failed check %d/%d", site.Name, newState.FailureCount, site.MaxRetries)) e.AddLog(fmt.Sprintf("Monitor '%s' failed check %d/%d", site.Name, newState.FailureCount, site.MaxRetries))
} }
@@ -395,6 +506,14 @@ func (e *Engine) handleStatusChange(site models.Site, rawStatus string, code int
newState.FailureCount = site.MaxRetries + 1 newState.FailureCount = site.MaxRetries + 1
} }
if newState.Status != site.Status && site.Status != "PENDING" {
newState.StatusChangedAt = time.Now()
} else if site.StatusChangedAt.IsZero() && newState.Status != "PENDING" {
newState.StatusChangedAt = time.Now()
} else {
newState.StatusChangedAt = site.StatusChangedAt
}
inMaint := e.isInMaintenance(site.ID) inMaint := e.isInMaintenance(site.ID)
if site.Type == "http" && site.CheckSSL && site.HasSSL { if site.Type == "http" && site.CheckSSL && site.HasSSL {
@@ -419,12 +538,24 @@ func (e *Engine) handleStatusChange(site models.Site, rawStatus string, code int
e.recordCheck(site.ID, latency, rawStatus == "UP") e.recordCheck(site.ID, latency, rawStatus == "UP")
if newState.Status != site.Status && site.Status != "PENDING" {
go func() { _ = e.db.SaveStateChange(site.ID, site.Status, newState.Status, errorReason) }()
}
isBroken := func(s string) bool { return s == "DOWN" || s == "SSL EXP" } isBroken := func(s string) bool { return s == "DOWN" || s == "SSL EXP" }
if site.Status == "UP" && newState.Status == "LATE" {
e.AddLog(fmt.Sprintf("Monitor '%s' heartbeat overdue", site.Name))
}
if !isBroken(site.Status) && isBroken(newState.Status) && newState.Status != "PENDING" { if !isBroken(site.Status) && isBroken(newState.Status) && newState.Status != "PENDING" {
if inMaint { if inMaint {
e.AddLog(fmt.Sprintf("Monitor '%s' is DOWN (alerts suppressed — maintenance)", site.Name)) e.AddLog(fmt.Sprintf("Monitor '%s' is DOWN (alerts suppressed — maintenance)", site.Name))
} else { } else {
msg := fmt.Sprintf("Monitor '%s' is DOWN (%s)", site.Name, rawStatus) msg := fmt.Sprintf("Monitor '%s' is DOWN (%s)", site.Name, rawStatus)
if errorReason != "" {
msg = fmt.Sprintf("Monitor '%s' is DOWN: %s", site.Name, errorReason)
}
if site.Type == "push" { if site.Type == "push" {
msg = fmt.Sprintf("Push Monitor '%s' missed heartbeat.", site.Name) msg = fmt.Sprintf("Push Monitor '%s' missed heartbeat.", site.Name)
} }
@@ -432,11 +563,17 @@ func (e *Engine) handleStatusChange(site models.Site, rawStatus string, code int
} }
} }
if isBroken(site.Status) && newState.Status == "UP" { if isBroken(site.Status) && newState.Status == "UP" {
if !inMaint { downDur := ""
e.triggerAlert(site.AlertID, "✅ RECOVERY", fmt.Sprintf("Monitor '%s' is UP", site.Name)) if !site.StatusChangedAt.IsZero() {
} else { downDur = fmt.Sprintf(" (was down %s)", fmtDurationShort(time.Since(site.StatusChangedAt)))
e.AddLog(fmt.Sprintf("Monitor '%s' recovered (maintenance active, alert suppressed)", site.Name))
} }
e.AddLog(fmt.Sprintf("Monitor '%s' recovered%s", site.Name, downDur))
if !inMaint {
e.triggerAlert(site.AlertID, "✅ RECOVERY", fmt.Sprintf("Monitor '%s' is UP%s", site.Name, downDur))
}
}
if site.Status == "LATE" && newState.Status == "UP" {
e.AddLog(fmt.Sprintf("Monitor '%s' heartbeat arrived (was late)", site.Name))
} }
} }
@@ -453,11 +590,57 @@ func (e *Engine) triggerAlert(alertID int, title, message string) {
defer cancel() defer cancel()
if err := provider.Send(ctx, title, message); err != nil { if err := provider.Send(ctx, title, message); err != nil {
e.AddLog(fmt.Sprintf("Alert send failed (%s): %v", cfg.Name, err)) e.AddLog(fmt.Sprintf("Alert send failed (%s): %v", cfg.Name, err))
e.recordAlertResult(alertID, false, err.Error())
} else {
e.recordAlertResult(alertID, true, "")
} }
}() }()
} }
} }
func (e *Engine) recordAlertResult(alertID int, ok bool, errMsg string) {
e.alertHealthMu.Lock()
defer e.alertHealthMu.Unlock()
h := e.alertHealth[alertID]
h.LastSendAt = time.Now()
h.LastSendOK = ok
h.SendCount++
if ok {
h.LastError = ""
} else {
h.LastError = errMsg
h.FailCount++
}
e.alertHealth[alertID] = h
}
func (e *Engine) GetAlertHealth(alertID int) AlertHealth {
e.alertHealthMu.RLock()
defer e.alertHealthMu.RUnlock()
return e.alertHealth[alertID]
}
func (e *Engine) TestAlert(alertID int) error {
cfg, err := e.db.GetAlert(alertID)
if err != nil {
return fmt.Errorf("failed to load alert: %w", err)
}
provider := alert.GetProvider(cfg)
if provider == nil {
return fmt.Errorf("no provider for type %q", cfg.Type)
}
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
err = provider.Send(ctx, "🧪 Test Alert", fmt.Sprintf("Test notification from uptop for channel '%s'.", cfg.Name))
if err != nil {
e.recordAlertResult(alertID, false, err.Error())
return err
}
e.recordAlertResult(alertID, true, "")
e.AddLog(fmt.Sprintf("Test alert sent to '%s'", cfg.Name))
return nil
}
func (e *Engine) isInMaintenance(monitorID int) bool { func (e *Engine) isInMaintenance(monitorID int) bool {
inMaint, err := e.db.IsMonitorInMaintenance(monitorID) inMaint, err := e.db.IsMonitorInMaintenance(monitorID)
if err != nil { if err != nil {
@@ -518,7 +701,7 @@ func (e *Engine) SetAggStrategy(strategy AggregationStrategy) {
e.aggStrategy = strategy e.aggStrategy = strategy
} }
func (e *Engine) IngestProbeResult(nodeID string, siteID int, latencyNs int64, isUp bool) { func (e *Engine) IngestProbeResult(nodeID string, siteID int, latencyNs int64, isUp bool, errorReason string) {
e.probeResultsMu.Lock() e.probeResultsMu.Lock()
if e.probeResults[siteID] == nil { if e.probeResults[siteID] == nil {
e.probeResults[siteID] = make(map[string]NodeResult) e.probeResults[siteID] = make(map[string]NodeResult)
@@ -528,6 +711,7 @@ func (e *Engine) IngestProbeResult(nodeID string, siteID int, latencyNs int64, i
IsUp: isUp, IsUp: isUp,
LatencyNs: latencyNs, LatencyNs: latencyNs,
CheckedAt: time.Now(), CheckedAt: time.Now(),
ErrorReason: errorReason,
} }
results := make([]NodeResult, 0, len(e.probeResults[siteID])) results := make([]NodeResult, 0, len(e.probeResults[siteID]))
for _, r := range e.probeResults[siteID] { for _, r := range e.probeResults[siteID] {
@@ -552,7 +736,7 @@ func (e *Engine) IngestProbeResult(nodeID string, siteID int, latencyNs int64, i
updatedSite := site updatedSite := site
updatedSite.Latency = time.Duration(avgLatency) updatedSite.Latency = time.Duration(avgLatency)
updatedSite.LastCheck = time.Now() updatedSite.LastCheck = time.Now()
e.handleStatusChange(updatedSite, rawStatus, 0, time.Duration(avgLatency)) e.handleStatusChange(updatedSite, rawStatus, 0, time.Duration(avgLatency), errorReason)
} }
func (e *Engine) GetProbeResults(siteID int) map[string]NodeResult { func (e *Engine) GetProbeResults(siteID int) map[string]NodeResult {
@@ -565,3 +749,11 @@ func (e *Engine) GetProbeResults(siteID int) map[string]NodeResult {
} }
return cp return cp
} }
func (e *Engine) GetStateChanges(siteID int, limit int) []models.StateChange {
changes, err := e.db.GetStateChanges(siteID, limit)
if err != nil {
return nil
}
return changes
}
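The new `checkPush` logic above splits a missed heartbeat into two stages: `LATE` once the interval has elapsed, and `DOWN` once an extra grace window (half the interval, never below `minPushGrace`) has also passed. The following is a minimal sketch of that decision as a pure function, so the thresholds are easier to reason about. `classifyPush` and `classifyPushSeconds` are hypothetical names that do not appear in the diff, and the sketch deliberately ignores the retry counting that `handleStatusChange` applies afterwards.

```go
package main

import (
	"fmt"
	"time"
)

// minPushGrace mirrors the constant of the same name in the diff.
const minPushGrace = 60 * time.Second

// classifyPush is a hypothetical helper, not part of uptop. It returns the
// status a push monitor would move to, given its current status, its
// configured interval, and the time elapsed since the last heartbeat.
func classifyPush(status string, interval, sinceLast time.Duration) string {
	if status == "PENDING" {
		return "PENDING" // no heartbeat seen yet, so nothing can be overdue
	}
	grace := interval / 2
	if grace < minPushGrace {
		grace = minPushGrace
	}
	switch {
	case sinceLast > interval+grace:
		return "DOWN" // past the interval and the grace window
	case sinceLast > interval:
		return "LATE" // overdue, but still inside the grace window
	default:
		return status // on time: only a heartbeat changes the status
	}
}

// classifyPushSeconds is a convenience wrapper that takes whole seconds.
func classifyPushSeconds(status string, intervalSec, sinceLastSec int) string {
	return classifyPush(status,
		time.Duration(intervalSec)*time.Second,
		time.Duration(sinceLastSec)*time.Second)
}

func main() {
	fmt.Println(classifyPushSeconds("UP", 300, 310)) // LATE: grace is 150s, so DOWN only after 450s
	fmt.Println(classifyPushSeconds("UP", 10, 120))  // DOWN: grace is clamped to 60s, so DOWN after 70s
}
```

The two calls in `main` use the same numbers as `TestCheckPush_OverdueBecomesLate` and the updated `TestCheckPush_DeadlineMissed`, which is why the latter had to move `LastCheck` from 20 to 120 seconds in the past.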
+42 -21
@@ -2,10 +2,11 @@ package monitor
import ( import (
"fmt" "fmt"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"sync" "sync"
"testing" "testing"
"time" "time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
) )
// --- Mock Store --- // --- Mock Store ---
@@ -73,6 +74,8 @@ func (m *mockStore) EndMaintenanceWindow(int) error { retur
func (m *mockStore) DeleteMaintenanceWindow(int) error { return nil } func (m *mockStore) DeleteMaintenanceWindow(int) error { return nil }
func (m *mockStore) GetPreference(string) (string, error) { return "", nil } func (m *mockStore) GetPreference(string) (string, error) { return "", nil }
func (m *mockStore) SetPreference(string, string) error { return nil } func (m *mockStore) SetPreference(string, string) error { return nil }
func (m *mockStore) SaveStateChange(int, string, string, string) error { return nil }
func (m *mockStore) GetStateChanges(int, int) ([]models.StateChange, error) { return nil, nil }
func (m *mockStore) Close() error { return nil } func (m *mockStore) Close() error { return nil }
func (m *mockStore) GetAllAlerts() ([]models.AlertConfig, error) { func (m *mockStore) GetAllAlerts() ([]models.AlertConfig, error) {
@@ -174,7 +177,7 @@ func TestHandleStatusChange_PendingToUp(t *testing.T) {
site := models.Site{ID: 1, Name: "test", Status: "PENDING", MaxRetries: 3, AlertID: 1} site := models.Site{ID: 1, Name: "test", Status: "PENDING", MaxRetries: 3, AlertID: 1}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "UP", 200, 10*time.Millisecond) e.handleStatusChange(site, "UP", 200, 10*time.Millisecond, "")
s, _ := getSite(e, 1) s, _ := getSite(e, 1)
if s.Status != "UP" { if s.Status != "UP" {
@@ -195,7 +198,7 @@ func TestHandleStatusChange_UpIncrementFailure(t *testing.T) {
site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 3, FailureCount: 0} site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 3, FailureCount: 0}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "DOWN", 500, 0) e.handleStatusChange(site, "DOWN", 500, 0, "test error")
s, _ := getSite(e, 1) s, _ := getSite(e, 1)
if s.Status != "UP" { if s.Status != "UP" {
@@ -213,7 +216,7 @@ func TestHandleStatusChange_UpToDown_ExceedsRetries(t *testing.T) {
site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 2, FailureCount: 2, AlertID: 1} site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 2, FailureCount: 2, AlertID: 1}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "DOWN", 500, 0) e.handleStatusChange(site, "DOWN", 500, 0, "test error")
s, _ := getSite(e, 1) s, _ := getSite(e, 1)
if s.Status != "DOWN" { if s.Status != "DOWN" {
@@ -236,7 +239,7 @@ func TestHandleStatusChange_UpToDown_ZeroRetries(t *testing.T) {
site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 0, FailureCount: 0, AlertID: 1} site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 0, FailureCount: 0, AlertID: 1}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "DOWN", 0, 0) e.handleStatusChange(site, "DOWN", 0, 0, "test error")
s, _ := getSite(e, 1) s, _ := getSite(e, 1)
if s.Status != "DOWN" { if s.Status != "DOWN" {
@@ -255,7 +258,7 @@ func TestHandleStatusChange_DownToUp_Recovery(t *testing.T) {
site := models.Site{ID: 1, Name: "test", Status: "DOWN", FailureCount: 4, AlertID: 1} site := models.Site{ID: 1, Name: "test", Status: "DOWN", FailureCount: 4, AlertID: 1}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "UP", 200, 5*time.Millisecond) e.handleStatusChange(site, "UP", 200, 5*time.Millisecond, "")
s, _ := getSite(e, 1) s, _ := getSite(e, 1)
if s.Status != "UP" { if s.Status != "UP" {
@@ -276,7 +279,7 @@ func TestHandleStatusChange_DownStaysDown(t *testing.T) {
site := models.Site{ID: 1, Name: "test", Status: "DOWN", MaxRetries: 2, FailureCount: 3} site := models.Site{ID: 1, Name: "test", Status: "DOWN", MaxRetries: 2, FailureCount: 3}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "DOWN", 0, 0) e.handleStatusChange(site, "DOWN", 0, 0, "test error")
s, _ := getSite(e, 1) s, _ := getSite(e, 1)
if s.Status != "DOWN" { if s.Status != "DOWN" {
@@ -295,7 +298,7 @@ func TestHandleStatusChange_SSLExpired(t *testing.T) {
site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 0, AlertID: 1} site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 0, AlertID: 1}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "SSL EXP", 0, 0) e.handleStatusChange(site, "SSL EXP", 0, 0, "SSL certificate expired")
s, _ := getSite(e, 1) s, _ := getSite(e, 1)
if s.Status != "SSL EXP" { if s.Status != "SSL EXP" {
@@ -315,7 +318,7 @@ func TestHandleStatusChange_AlertSuppressedMaintenance(t *testing.T) {
site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 0, AlertID: 1} site := models.Site{ID: 1, Name: "test", Status: "UP", MaxRetries: 0, AlertID: 1}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "DOWN", 0, 0) e.handleStatusChange(site, "DOWN", 0, 0, "test error")
s, _ := getSite(e, 1) s, _ := getSite(e, 1)
if s.Status != "DOWN" { if s.Status != "DOWN" {
@@ -346,7 +349,7 @@ func TestHandleStatusChange_RecoverySuppressedMaintenance(t *testing.T) {
site := models.Site{ID: 1, Name: "test", Status: "DOWN", AlertID: 1} site := models.Site{ID: 1, Name: "test", Status: "DOWN", AlertID: 1}
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "UP", 200, 0) e.handleStatusChange(site, "UP", 200, 0, "")
s, _ := getSite(e, 1) s, _ := getSite(e, 1)
if s.Status != "UP" { if s.Status != "UP" {
@@ -370,7 +373,7 @@ func TestHandleStatusChange_SSLWarning(t *testing.T) {
} }
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "UP", 200, 0) e.handleStatusChange(site, "UP", 200, 0, "")
s, _ := getSite(e, 1) s, _ := getSite(e, 1)
if !s.SentSSLWarning { if !s.SentSSLWarning {
@@ -393,7 +396,7 @@ func TestHandleStatusChange_SSLWarningNotRepeated(t *testing.T) {
} }
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "UP", 200, 0) e.handleStatusChange(site, "UP", 200, 0, "")
waitAsync() waitAsync()
if len(ms.getAlertCallsSnapshot()) != 0 { if len(ms.getAlertCallsSnapshot()) != 0 {
@@ -412,7 +415,7 @@ func TestHandleStatusChange_SSLWarningReset(t *testing.T) {
} }
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "UP", 200, 0) e.handleStatusChange(site, "UP", 200, 0, "")
s, _ := getSite(e, 1) s, _ := getSite(e, 1)
if s.SentSSLWarning { if s.SentSSLWarning {
@@ -433,7 +436,7 @@ func TestHandleStatusChange_SSLWarningSuppressedMaint(t *testing.T) {
} }
injectSite(e, site) injectSite(e, site)
e.handleStatusChange(site, "UP", 200, 0) e.handleStatusChange(site, "UP", 200, 0, "")
s, _ := getSite(e, 1) s, _ := getSite(e, 1)
if !s.SentSSLWarning { if !s.SentSSLWarning {
@@ -452,7 +455,7 @@ func TestHandleStatusChange_InactiveEngine(t *testing.T) {
injectSite(e, site) injectSite(e, site)
e.SetActive(false) e.SetActive(false)
e.handleStatusChange(site, "DOWN", 0, 0) e.handleStatusChange(site, "DOWN", 0, 0, "test error")
s, _ := getSite(e, 1) s, _ := getSite(e, 1)
if s.Status != "UP" { if s.Status != "UP" {
@@ -534,7 +537,7 @@ func TestCheckPush_DeadlineMissed(t *testing.T) {
site := models.Site{ site := models.Site{
ID: 1, Name: "push", Type: "push", Status: "UP", ID: 1, Name: "push", Type: "push", Status: "UP",
Interval: 10, MaxRetries: 0, Interval: 10, MaxRetries: 0,
LastCheck: time.Now().Add(-20 * time.Second), LastCheck: time.Now().Add(-120 * time.Second),
} }
injectSite(e, site) injectSite(e, site)
@@ -546,6 +549,24 @@ func TestCheckPush_DeadlineMissed(t *testing.T) {
} }
} }
func TestCheckPush_OverdueBecomesLate(t *testing.T) {
ms := newMockStore()
e := newTestEngine(ms)
site := models.Site{
ID: 1, Name: "push", Type: "push", Status: "UP",
Interval: 300,
LastCheck: time.Now().Add(-310 * time.Second),
}
injectSite(e, site)
e.checkPush(site)
s, _ := getSite(e, 1)
if s.Status != "LATE" {
t.Errorf("expected LATE when overdue but within grace, got %s", s.Status)
}
}
func TestCheckPush_WithinDeadline(t *testing.T) { func TestCheckPush_WithinDeadline(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
@@ -563,20 +584,20 @@ func TestCheckPush_WithinDeadline(t *testing.T) {
} }
} }
func TestCheckPush_PendingToUp(t *testing.T) { func TestCheckPush_PendingStaysPending(t *testing.T) {
ms := newMockStore() ms := newMockStore()
e := newTestEngine(ms) e := newTestEngine(ms)
site := models.Site{ site := models.Site{
ID: 1, Name: "push", Type: "push", Status: "PENDING", ID: 1, Name: "push", Type: "push", Status: "PENDING",
Interval: 60, LastCheck: time.Now(), Interval: 60,
} }
injectSite(e, site) injectSite(e, site)
e.checkPush(site) e.checkPush(site)
s, _ := getSite(e, 1) s, _ := getSite(e, 1)
if s.Status != "UP" { if s.Status != "PENDING" {
t.Errorf("expected UP, got %s", s.Status) t.Errorf("expected PENDING to stay until first heartbeat, got %s", s.Status)
} }
} }
@@ -991,7 +1012,7 @@ func TestConcurrent_HandleStatusChangeAndGetState(t *testing.T) {
wg.Add(2) wg.Add(2)
go func() { go func() {
defer wg.Done() defer wg.Done()
e.handleStatusChange(site, "DOWN", 500, 0) e.handleStatusChange(site, "DOWN", 500, 0, "test error")
}() }()
go func() { go func() {
defer wg.Done() defer wg.Done()
+68
@@ -0,0 +1,68 @@
package monitor
import (
"context"
"fmt"
"net"
"time"
)
var privateRanges []*net.IPNet
func init() {
cidrs := []string{
"127.0.0.0/8",
"::1/128",
"10.0.0.0/8",
"172.16.0.0/12",
"192.168.0.0/16",
"169.254.0.0/16",
"fe80::/10",
"fc00::/7",
}
for _, cidr := range cidrs {
_, network, _ := net.ParseCIDR(cidr)
privateRanges = append(privateRanges, network)
}
}
func isPrivateIP(ip net.IP) bool {
for _, network := range privateRanges {
if network.Contains(ip) {
return true
}
}
return false
}
func SafeDialContext(allowPrivate bool) func(ctx context.Context, network, addr string) (net.Conn, error) {
return func(ctx context.Context, network, addr string) (net.Conn, error) {
host, port, err := net.SplitHostPort(addr)
if err != nil {
return nil, err
}
ips, err := net.DefaultResolver.LookupIPAddr(ctx, host)
if err != nil {
return nil, err
}
if !allowPrivate {
for _, ip := range ips {
if isPrivateIP(ip.IP) {
return nil, fmt.Errorf("blocked: %s resolves to private address %s", host, ip.IP)
}
}
}
dialer := &net.Dialer{Timeout: 10 * time.Second}
for _, ip := range ips {
target := net.JoinHostPort(ip.IP.String(), port)
conn, err := dialer.DialContext(ctx, network, target)
if err == nil {
return conn, nil
}
}
return nil, fmt.Errorf("failed to connect to %s", addr)
}
}
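The range check in this file is built from a hand-written CIDR list. As a point of comparison, here is a hedged sketch of the same classification using the standard library's `net/netip` predicates, which cover the same eight ranges (loopback, RFC 1918, unique local IPv6, and link-local). `isPrivateAddr` is a hypothetical name and is not proposed as a drop-in replacement for `isPrivateIP`; it takes a string so the example stays self-contained.

```go
package main

import (
	"fmt"
	"net/netip"
)

// isPrivateAddr is a hypothetical helper, not part of uptop. It reports
// whether s parses to a loopback, private (RFC 1918 or fc00::/7), or
// link-local unicast address.
func isPrivateAddr(s string) (bool, error) {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return false, err
	}
	// Unmap turns IPv4-mapped IPv6 (::ffff:a.b.c.d) into plain IPv4 so the
	// IPv4 rules below apply to it as well.
	addr = addr.Unmap()
	return addr.IsLoopback() || addr.IsPrivate() || addr.IsLinkLocalUnicast(), nil
}

func main() {
	for _, s := range []string{"127.0.0.1", "169.254.169.254", "fe80::1", "8.8.8.8"} {
		private, err := isPrivateAddr(s)
		fmt.Println(s, private, err)
	}
}
```

Either approach only classifies an address. The protection in `SafeDialContext` comes from checking the resolved IPs and then dialing those same IPs, rather than resolving the hostname a second time.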
+47
@@ -0,0 +1,47 @@
package monitor
import (
"net"
"testing"
)
func TestIsPrivateIP(t *testing.T) {
tests := []struct {
ip string
private bool
}{
{"127.0.0.1", true},
{"10.0.0.1", true},
{"172.16.0.1", true},
{"192.168.1.1", true},
{"169.254.169.254", true},
{"::1", true},
{"8.8.8.8", false},
{"1.1.1.1", false},
{"93.184.216.34", false},
}
for _, tt := range tests {
ip := net.ParseIP(tt.ip)
got := isPrivateIP(ip)
if got != tt.private {
t.Errorf("isPrivateIP(%s) = %v, want %v", tt.ip, got, tt.private)
}
}
}
func TestSafeDialContext_BlocksPrivate(t *testing.T) {
dial := SafeDialContext(false)
_, err := dial(t.Context(), "tcp", "127.0.0.1:80")
if err == nil {
t.Error("expected error dialing loopback with private blocking enabled")
}
}
func TestSafeDialContext_AllowsPrivate(t *testing.T) {
dial := SafeDialContext(true)
_, err := dial(t.Context(), "tcp", "127.0.0.1:80")
// Will fail to connect (nothing listening) but should NOT be blocked
if err != nil && err.Error() == "blocked: 127.0.0.1 resolves to private address 127.0.0.1" {
t.Error("should not block private IPs when allowPrivate is true")
}
}
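`TestSafeDialContext_AllowsPrivate` tells a blocked dial from an ordinary connection failure by comparing the full error string. One possible alternative, sketched below under the assumption that the dialer could wrap a sentinel error, is to match with `errors.Is`. `ErrPrivateTarget`, `checkTarget`, and `isBlocked` are hypothetical names and do not exist in the diff.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPrivateTarget is a hypothetical sentinel marking a dial that was
// refused because the target resolved to a private address.
var ErrPrivateTarget = errors.New("blocked: private target")

// checkTarget stands in for the blocking branch of a dialer: it wraps the
// sentinel with the offending host when the target is private.
func checkTarget(host string, private bool) error {
	if private {
		return fmt.Errorf("%w: %s", ErrPrivateTarget, host)
	}
	return nil
}

// isBlocked reports whether err, or anything it wraps, is ErrPrivateTarget.
func isBlocked(err error) bool {
	return errors.Is(err, ErrPrivateTarget)
}

func main() {
	err := checkTarget("127.0.0.1", true)
	fmt.Println(err, isBlocked(err))
}
```

With a sentinel in place, a test can assert `!isBlocked(err)` without depending on the exact wording of the message.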
+91
@@ -0,0 +1,91 @@
package server
import (
"net"
"net/http"
"sync"
"time"
)
type visitor struct {
tokens float64
lastSeen time.Time
}
type RateLimiter struct {
mu sync.Mutex
visitors map[string]*visitor
rate float64
burst float64
}
func NewRateLimiter(requestsPerMinute int) *RateLimiter {
rl := &RateLimiter{
visitors: make(map[string]*visitor),
rate: float64(requestsPerMinute) / 60.0,
burst: float64(requestsPerMinute),
}
go rl.cleanup()
return rl
}
func (rl *RateLimiter) Allow(ip string) bool {
rl.mu.Lock()
defer rl.mu.Unlock()
v, exists := rl.visitors[ip]
now := time.Now()
if !exists {
rl.visitors[ip] = &visitor{tokens: rl.burst - 1, lastSeen: now}
return true
}
elapsed := now.Sub(v.lastSeen).Seconds()
v.tokens += elapsed * rl.rate
if v.tokens > rl.burst {
v.tokens = rl.burst
}
v.lastSeen = now
if v.tokens < 1 {
return false
}
v.tokens--
return true
}
func (rl *RateLimiter) cleanup() {
for {
time.Sleep(5 * time.Minute)
rl.mu.Lock()
cutoff := time.Now().Add(-10 * time.Minute)
for ip, v := range rl.visitors {
if v.lastSeen.Before(cutoff) {
delete(rl.visitors, ip)
}
}
rl.mu.Unlock()
}
}
func clientIP(r *http.Request) string {
if fwd := r.Header.Get("X-Forwarded-For"); fwd != "" {
return fwd
}
host, _, err := net.SplitHostPort(r.RemoteAddr)
if err != nil {
return r.RemoteAddr
}
return host
}
func RateLimit(limiter *RateLimiter, next http.HandlerFunc) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
if !limiter.Allow(clientIP(r)) {
http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
return
}
next(w, r)
}
}
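`clientIP` above returns the raw `X-Forwarded-For` header as the rate-limit key. That header can hold a comma-separated chain of addresses and can be set by any client, so each distinct header value gets its own bucket. The sketch below shows one way the key could be derived more conservatively, assuming the deployment knows whether it sits behind a trusted proxy. `firstForwardedIP`, `clientKey`, and the `trustProxy` flag are hypothetical and do not appear in the diff.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// firstForwardedIP is a hypothetical helper. It returns the first entry of
// an X-Forwarded-For value if it parses as an IP address, else "".
func firstForwardedIP(header string) string {
	first, _, _ := strings.Cut(header, ",")
	first = strings.TrimSpace(first)
	if net.ParseIP(first) == nil {
		return ""
	}
	return first
}

// clientKey picks the rate-limit key for a request. The forwarded header is
// consulted only when trustProxy is true; otherwise the socket peer
// address is used, as in the fallback branch of clientIP in the diff.
func clientKey(forwardedFor, remoteAddr string, trustProxy bool) string {
	if trustProxy {
		if ip := firstForwardedIP(forwardedFor); ip != "" {
			return ip
		}
	}
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return remoteAddr
	}
	return host
}

func main() {
	fmt.Println(clientKey("203.0.113.7, 10.0.0.2", "10.0.0.2:5555", true))
	fmt.Println(clientKey("203.0.113.7", "198.51.100.4:1234", false))
}
```

Whether the first or the last entry of the chain is the right one depends on how many proxies are trusted, so the choice of the first entry here is an assumption of the sketch.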
+170 -31
@@ -4,23 +4,51 @@ import (
"crypto/subtle" "crypto/subtle"
"encoding/json" "encoding/json"
"fmt" "fmt"
"gitea.lerkolabs.com/lerko/uptop/internal/importer"
"gitea.lerkolabs.com/lerko/uptop/internal/metrics"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
"gitea.lerkolabs.com/lerko/uptop/internal/store"
"html/template" "html/template"
"log" "log"
"net/http" "net/http"
"sort" "sort"
"strings" "strings"
"time" "time"
"gitea.lerkolabs.com/lerko/uptop/internal/importer"
"gitea.lerkolabs.com/lerko/uptop/internal/metrics"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
"gitea.lerkolabs.com/lerko/uptop/internal/store"
) )
const maxRequestBody = 1 << 20 // 1 MiB
func checkSecret(got, want string) bool { func checkSecret(got, want string) bool {
return subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1 return subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1
} }
func extractBearerToken(r *http.Request) string {
auth := r.Header.Get("Authorization")
if strings.HasPrefix(auth, "Bearer ") {
return strings.TrimPrefix(auth, "Bearer ")
}
return ""
}
var sensitiveKeys = map[string]bool{
"pass": true, "password": true, "token": true,
"routing_key": true, "user": true, "username": true,
}
func redactSettings(settings map[string]string) map[string]string {
redacted := make(map[string]string, len(settings))
for k, v := range settings {
if sensitiveKeys[k] && v != "" {
redacted[k] = "***REDACTED***"
} else {
redacted[k] = v
}
}
return redacted
}
var statusTpl = template.Must(template.New("status").Parse(` var statusTpl = template.Must(template.New("status").Parse(`
<!DOCTYPE html> <!DOCTYPE html>
<html> <html>
@@ -39,6 +67,7 @@ var statusTpl = template.Must(template.New("status").Parse(`
.UP { background: #9ece6a; color: #1a1b26; } .UP { background: #9ece6a; color: #1a1b26; }
.DOWN { background: #f7768e; color: #1a1b26; } .DOWN { background: #f7768e; color: #1a1b26; }
.PENDING { background: #e0af68; color: #1a1b26; } .PENDING { background: #e0af68; color: #1a1b26; }
.LATE { background: #e0af68; color: #1a1b26; }
.SSL-EXP { background: #e0af68; color: #1a1b26; } .SSL-EXP { background: #e0af68; color: #1a1b26; }
.PAUSED { background: #565f89; color: #c0caf5; } .PAUSED { background: #565f89; color: #c0caf5; }
.MAINT { background: #bb9af7; color: #1a1b26; } .MAINT { background: #bb9af7; color: #1a1b26; }
@@ -156,18 +185,39 @@ type ServerConfig struct {
Port int Port int
EnableStatus bool EnableStatus bool
Title string Title string
- ClusterKey string // Shared Secret for Security
+ ClusterKey string
+ TLSCert string
+ TLSKey string
+ ClusterMode string
+ MetricsPublic bool
+ CORSOrigin string
} }
func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server { func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
if cfg.ClusterKey == "" { if cfg.ClusterKey == "" {
fmt.Println("WARNING: No UPTOP_CLUSTER_SECRET set. Cluster API endpoints are unauthenticated.") fmt.Println("WARNING: No UPTOP_CLUSTER_SECRET set. Cluster API endpoints are unauthenticated.")
} }
pushRL := NewRateLimiter(60)
probeRL := NewRateLimiter(30)
backupRL := NewRateLimiter(10)
statusRL := NewRateLimiter(120)
mux := http.NewServeMux() mux := http.NewServeMux()
// 1. Push Heartbeat // 1. Push Heartbeat
- mux.HandleFunc("/api/push", func(w http.ResponseWriter, r *http.Request) {
- token := r.URL.Query().Get("token")
+ mux.HandleFunc("/api/push", RateLimit(pushRL, func(w http.ResponseWriter, r *http.Request) {
+ if r.Method != http.MethodGet && r.Method != http.MethodPost {
+ http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
+ return
+ }
+ token := extractBearerToken(r)
+ if token == "" {
+ if qt := r.URL.Query().Get("token"); qt != "" {
+ token = qt
+ log.Printf("DEPRECATED: push token in query string — use Authorization: Bearer header instead")
+ }
+ }
if token == "" { if token == "" {
http.Error(w, "Missing token", http.StatusBadRequest) http.Error(w, "Missing token", http.StatusBadRequest)
return return
@@ -178,10 +228,14 @@ func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
} else { } else {
http.Error(w, "Invalid Token", http.StatusNotFound) http.Error(w, "Invalid Token", http.StatusNotFound)
} }
}) }))
// 2. Health Check (For Cluster Follower) // 2. Health Check (For Cluster Follower)
mux.HandleFunc("/api/health", func(w http.ResponseWriter, r *http.Request) { mux.HandleFunc("/api/health", func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
if cfg.ClusterKey != "" && !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) { if cfg.ClusterKey != "" && !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized) http.Error(w, "Unauthorized", http.StatusUnauthorized)
return return
@@ -191,7 +245,7 @@ func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
}) })
// 3. Config Export // 3. Config Export
mux.HandleFunc("/api/backup/export", func(w http.ResponseWriter, r *http.Request) { mux.HandleFunc("/api/backup/export", RateLimit(backupRL, func(w http.ResponseWriter, r *http.Request) {
if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) { if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized: UPTOP_CLUSTER_SECRET required", http.StatusUnauthorized) http.Error(w, "Unauthorized: UPTOP_CLUSTER_SECRET required", http.StatusUnauthorized)
return return
@@ -202,11 +256,16 @@ func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
http.Error(w, "Export failed", http.StatusInternalServerError) http.Error(w, "Export failed", http.StatusInternalServerError)
return return
} }
if r.URL.Query().Get("redact_secrets") != "false" {
for i := range data.Alerts {
data.Alerts[i].Settings = redactSettings(data.Alerts[i].Settings)
}
}
_ = json.NewEncoder(w).Encode(data) //nolint:errcheck _ = json.NewEncoder(w).Encode(data) //nolint:errcheck
}) }))
// 4. Config Import // 4. Config Import
mux.HandleFunc("/api/backup/import", func(w http.ResponseWriter, r *http.Request) { mux.HandleFunc("/api/backup/import", RateLimit(backupRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" { if r.Method != "POST" {
http.Error(w, "POST required", http.StatusMethodNotAllowed) http.Error(w, "POST required", http.StatusMethodNotAllowed)
return return
@@ -215,7 +274,7 @@ func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
http.Error(w, "Unauthorized", http.StatusUnauthorized) http.Error(w, "Unauthorized", http.StatusUnauthorized)
return return
} }
r.Body = http.MaxBytesReader(w, r.Body, 1<<20) r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var data models.Backup var data models.Backup
if err := json.NewDecoder(r.Body).Decode(&data); err != nil { if err := json.NewDecoder(r.Body).Decode(&data); err != nil {
http.Error(w, "Invalid JSON", http.StatusBadRequest) http.Error(w, "Invalid JSON", http.StatusBadRequest)
@@ -227,10 +286,10 @@ func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
return return
} }
_, _ = w.Write([]byte("Import Successful")) _, _ = w.Write([]byte("Import Successful"))
}) }))
// 5. Kuma Import // 5. Kuma Import
mux.HandleFunc("/api/import/kuma", func(w http.ResponseWriter, r *http.Request) { mux.HandleFunc("/api/import/kuma", RateLimit(backupRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" { if r.Method != "POST" {
http.Error(w, "POST required", http.StatusMethodNotAllowed) http.Error(w, "POST required", http.StatusMethodNotAllowed)
return return
@@ -239,7 +298,7 @@ func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
http.Error(w, "Unauthorized", http.StatusUnauthorized) http.Error(w, "Unauthorized", http.StatusUnauthorized)
return return
} }
r.Body = http.MaxBytesReader(w, r.Body, 1<<20) r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var kb importer.KumaBackup var kb importer.KumaBackup
if err := json.NewDecoder(r.Body).Decode(&kb); err != nil { if err := json.NewDecoder(r.Body).Decode(&kb); err != nil {
log.Printf("Invalid Kuma JSON: %v", err) log.Printf("Invalid Kuma JSON: %v", err)
@@ -253,10 +312,10 @@ func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
return return
} }
fmt.Fprintf(w, "Imported %d monitors, %d alerts from Kuma v%s", len(backup.Sites), len(backup.Alerts), kb.Version) fmt.Fprintf(w, "Imported %d monitors, %d alerts from Kuma v%s", len(backup.Sites), len(backup.Alerts), kb.Version)
}) }))
// 6. Probe Registration // 6. Probe Registration
mux.HandleFunc("/api/probe/register", func(w http.ResponseWriter, r *http.Request) { mux.HandleFunc("/api/probe/register", RateLimit(probeRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" { if r.Method != "POST" {
http.Error(w, "POST required", http.StatusMethodNotAllowed) http.Error(w, "POST required", http.StatusMethodNotAllowed)
return return
@@ -265,7 +324,7 @@ func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
http.Error(w, "Unauthorized", http.StatusUnauthorized) http.Error(w, "Unauthorized", http.StatusUnauthorized)
return return
} }
r.Body = http.MaxBytesReader(w, r.Body, 1<<20) r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var req struct { var req struct {
ID string `json:"id"` ID string `json:"id"`
Name string `json:"name"` Name string `json:"name"`
@@ -288,10 +347,14 @@ func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
return return
} }
_ = json.NewEncoder(w).Encode(map[string]bool{"ok": true}) //nolint:errcheck _ = json.NewEncoder(w).Encode(map[string]bool{"ok": true}) //nolint:errcheck
}) }))
// 7. Probe Assignment Fetch // 7. Probe Assignment Fetch
mux.HandleFunc("/api/probe/assignments", func(w http.ResponseWriter, r *http.Request) { mux.HandleFunc("/api/probe/assignments", RateLimit(probeRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) { if cfg.ClusterKey == "" || !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
http.Error(w, "Unauthorized", http.StatusUnauthorized) http.Error(w, "Unauthorized", http.StatusUnauthorized)
return return
@@ -325,10 +388,10 @@ func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
} }
w.Header().Set("Content-Type", "application/json") w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(map[string][]models.Site{"sites": assigned}) //nolint:errcheck _ = json.NewEncoder(w).Encode(map[string][]models.Site{"sites": assigned}) //nolint:errcheck
}) }))
// 8. Probe Result Submission // 8. Probe Result Submission
mux.HandleFunc("/api/probe/results", func(w http.ResponseWriter, r *http.Request) { mux.HandleFunc("/api/probe/results", RateLimit(probeRL, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" { if r.Method != "POST" {
http.Error(w, "POST required", http.StatusMethodNotAllowed) http.Error(w, "POST required", http.StatusMethodNotAllowed)
return return
@@ -337,13 +400,14 @@ func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
http.Error(w, "Unauthorized", http.StatusUnauthorized) http.Error(w, "Unauthorized", http.StatusUnauthorized)
return return
} }
r.Body = http.MaxBytesReader(w, r.Body, 1<<20) r.Body = http.MaxBytesReader(w, r.Body, maxRequestBody)
var req struct { var req struct {
NodeID string `json:"node_id"` NodeID string `json:"node_id"`
Results []struct { Results []struct {
SiteID int `json:"site_id"` SiteID int `json:"site_id"`
LatencyNs int64 `json:"latency_ns"` LatencyNs int64 `json:"latency_ns"`
IsUp bool `json:"is_up"` IsUp bool `json:"is_up"`
ErrorReason string `json:"error_reason"`
} `json:"results"` } `json:"results"`
} }
if err := json.NewDecoder(r.Body).Decode(&req); err != nil { if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
@@ -358,21 +422,33 @@ func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
if err := s.SaveCheckFromNode(result.SiteID, req.NodeID, result.LatencyNs, result.IsUp); err != nil { if err := s.SaveCheckFromNode(result.SiteID, req.NodeID, result.LatencyNs, result.IsUp); err != nil {
log.Printf("Failed to save probe result: %v", err) log.Printf("Failed to save probe result: %v", err)
} }
- eng.IngestProbeResult(req.NodeID, result.SiteID, result.LatencyNs, result.IsUp)
+ eng.IngestProbeResult(req.NodeID, result.SiteID, result.LatencyNs, result.IsUp, result.ErrorReason)
} }
if err := s.UpdateNodeLastSeen(req.NodeID); err != nil { if err := s.UpdateNodeLastSeen(req.NodeID); err != nil {
log.Printf("Failed to update node last seen: %v", err) log.Printf("Failed to update node last seen: %v", err)
} }
_ = json.NewEncoder(w).Encode(map[string]bool{"ok": true}) //nolint:errcheck _ = json.NewEncoder(w).Encode(map[string]bool{"ok": true}) //nolint:errcheck
}) }))
// 9. Prometheus Metrics // 9. Prometheus Metrics
- mux.HandleFunc("/metrics", metrics.Handler(eng))
+ mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
+ if r.Method != http.MethodGet {
+ http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
+ return
+ }
+ if !cfg.MetricsPublic && cfg.ClusterKey != "" {
+ if !checkSecret(r.Header.Get("X-Upkeep-Secret"), cfg.ClusterKey) {
+ http.Error(w, "Unauthorized", http.StatusUnauthorized)
+ return
+ }
+ }
+ metrics.Handler(eng)(w, r)
+ })
// 10. Status Page // 10. Status Page
if cfg.EnableStatus { if cfg.EnableStatus {
- mux.HandleFunc("/status", func(w http.ResponseWriter, r *http.Request) { renderStatusPage(w, cfg.Title, eng) })
- mux.HandleFunc("/status/json", func(w http.ResponseWriter, r *http.Request) {
+ mux.HandleFunc("/status", RateLimit(statusRL, func(w http.ResponseWriter, r *http.Request) { renderStatusPage(w, cfg.Title, eng) }))
+ mux.HandleFunc("/status/json", RateLimit(statusRL, func(w http.ResponseWriter, r *http.Request) {
state := eng.GetLiveState() state := eng.GetLiveState()
activeWindows, _ := s.GetActiveMaintenanceWindows() activeWindows, _ := s.GetActiveMaintenanceWindows()
maintSet := make(map[int]bool) maintSet := make(map[int]bool)
@@ -394,22 +470,85 @@ func Start(cfg ServerConfig, s store.Store, eng *monitor.Engine) *http.Server {
} }
state[id] = site state[id] = site
} }
if cfg.CORSOrigin != "" {
w.Header().Set("Access-Control-Allow-Origin", cfg.CORSOrigin)
}
w.Header().Set("Content-Type", "application/json") w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(state) //nolint:errcheck _ = json.NewEncoder(w).Encode(state) //nolint:errcheck
}) }))
}
if cfg.ClusterMode != "" && cfg.ClusterMode != "leader" && cfg.TLSCert == "" {
fmt.Println("WARNING: Cluster mode active without TLS. Secrets transmitted in cleartext.")
}
handler := loggingMiddleware(securityHeadersMiddleware(mux))
if cfg.TLSCert != "" {
handler = hstsMiddleware(handler)
} }
addr := fmt.Sprintf(":%d", cfg.Port) addr := fmt.Sprintf(":%d", cfg.Port)
- srv := &http.Server{Addr: addr, Handler: mux, ReadHeaderTimeout: 10 * time.Second}
+ srv := &http.Server{
+ Addr: addr,
+ Handler: handler,
+ ReadHeaderTimeout: 10 * time.Second,
+ ReadTimeout: 30 * time.Second,
+ WriteTimeout: 60 * time.Second,
+ IdleTimeout: 120 * time.Second,
+ }
go func() { go func() {
if cfg.TLSCert != "" && cfg.TLSKey != "" {
fmt.Printf("HTTPS Server listening on %s\n", addr)
if err := srv.ListenAndServeTLS(cfg.TLSCert, cfg.TLSKey); err != nil && err != http.ErrServerClosed {
log.Printf("HTTPS server error: %v", err)
}
} else {
fmt.Printf("HTTP Server listening on %s\n", addr) fmt.Printf("HTTP Server listening on %s\n", addr)
if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed { if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
log.Printf("HTTP server error: %v", err) log.Printf("HTTP server error: %v", err)
} }
}
}() }()
return srv return srv
} }
type statusWriter struct {
http.ResponseWriter
code int
}
func (w *statusWriter) WriteHeader(code int) {
w.code = code
w.ResponseWriter.WriteHeader(code)
}
func loggingMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
start := time.Now()
sw := &statusWriter{ResponseWriter: w, code: 200}
next.ServeHTTP(sw, r)
path := strings.ReplaceAll(strings.ReplaceAll(r.URL.Path, "\n", ""), "\r", "")
log.Printf("%s %s %d %s %s", r.Method, path, sw.code, time.Since(start).Round(time.Millisecond), clientIP(r)) //nolint:gosec // path sanitized above
})
}
func securityHeadersMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("X-Content-Type-Options", "nosniff")
w.Header().Set("X-Frame-Options", "DENY")
w.Header().Set("Referrer-Policy", "no-referrer")
w.Header().Set("Content-Security-Policy", "default-src 'self'; script-src 'unsafe-inline'; style-src 'unsafe-inline'")
next.ServeHTTP(w, r)
})
}
func hstsMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Strict-Transport-Security", "max-age=63072000; includeSubDomains")
next.ServeHTTP(w, r)
})
}
func renderStatusPage(w http.ResponseWriter, title string, eng *monitor.Engine) { func renderStatusPage(w http.ResponseWriter, title string, eng *monitor.Engine) {
sites := eng.GetAllSites() sites := eng.GetAllSites()
+5 -2
@@ -4,13 +4,14 @@ import (
"bytes" "bytes"
"encoding/json" "encoding/json"
"fmt" "fmt"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
"net" "net"
"net/http" "net/http"
"sync" "sync"
"testing" "testing"
"time" "time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
) )
// --- Mock Store --- // --- Mock Store ---
@@ -75,6 +76,8 @@ func (m *mockStore) DeleteMaintenanceWindow(int) error { retur
func (m *mockStore) IsMonitorInMaintenance(int) (bool, error) { return false, nil } func (m *mockStore) IsMonitorInMaintenance(int) (bool, error) { return false, nil }
func (m *mockStore) GetPreference(string) (string, error) { return "", nil } func (m *mockStore) GetPreference(string) (string, error) { return "", nil }
func (m *mockStore) SetPreference(string, string) error { return nil } func (m *mockStore) SetPreference(string, string) error { return nil }
func (m *mockStore) SaveStateChange(int, string, string, string) error { return nil }
func (m *mockStore) GetStateChanges(int, int) ([]models.StateChange, error) { return nil, nil }
func (m *mockStore) Close() error { return nil } func (m *mockStore) Close() error { return nil }
func (m *mockStore) ExportData() (models.Backup, error) { func (m *mockStore) ExportData() (models.Backup, error) {
+70
@@ -0,0 +1,70 @@
package store
import (
"crypto/aes"
"crypto/cipher"
"crypto/rand"
"encoding/base64"
"encoding/hex"
"fmt"
"io"
"strings"
)
const encryptedPrefix = "enc:"
type Encryptor struct {
gcm cipher.AEAD
}
func NewEncryptor(hexKey string) (*Encryptor, error) {
key, err := hex.DecodeString(hexKey)
if err != nil {
return nil, fmt.Errorf("invalid encryption key: must be hex-encoded: %w", err)
}
if len(key) != 32 {
return nil, fmt.Errorf("invalid encryption key: must be 32 bytes (64 hex chars), got %d bytes", len(key))
}
block, err := aes.NewCipher(key)
if err != nil {
return nil, fmt.Errorf("create cipher: %w", err)
}
gcm, err := cipher.NewGCM(block)
if err != nil {
return nil, fmt.Errorf("create GCM: %w", err)
}
return &Encryptor{gcm: gcm}, nil
}
func (e *Encryptor) Encrypt(plaintext string) (string, error) {
nonce := make([]byte, e.gcm.NonceSize())
if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
return "", fmt.Errorf("generate nonce: %w", err)
}
ciphertext := e.gcm.Seal(nonce, nonce, []byte(plaintext), nil)
return encryptedPrefix + base64.StdEncoding.EncodeToString(ciphertext), nil
}
func (e *Encryptor) Decrypt(data string) (string, error) {
if !strings.HasPrefix(data, encryptedPrefix) {
return data, nil
}
raw, err := base64.StdEncoding.DecodeString(strings.TrimPrefix(data, encryptedPrefix))
if err != nil {
return "", fmt.Errorf("decode base64: %w", err)
}
nonceSize := e.gcm.NonceSize()
if len(raw) < nonceSize {
return "", fmt.Errorf("ciphertext too short")
}
nonce, ciphertext := raw[:nonceSize], raw[nonceSize:]
plaintext, err := e.gcm.Open(nil, nonce, ciphertext, nil)
if err != nil {
return "", fmt.Errorf("decrypt: %w", err)
}
return string(plaintext), nil
}
func IsEncrypted(data string) bool {
return strings.HasPrefix(data, encryptedPrefix)
}
+83
@@ -0,0 +1,83 @@
package store
import (
"encoding/hex"
"testing"
)
func testKey() string {
key := make([]byte, 32)
for i := range key {
key[i] = byte(i)
}
return hex.EncodeToString(key)
}
func TestEncryptorRoundTrip(t *testing.T) {
enc, err := NewEncryptor(testKey())
if err != nil {
t.Fatal(err)
}
original := `{"host":"smtp.example.com","pass":"s3cret"}`
encrypted, err := enc.Encrypt(original)
if err != nil {
t.Fatal(err)
}
if !IsEncrypted(encrypted) {
t.Error("expected encrypted prefix")
}
if encrypted == original {
t.Error("encrypted should differ from original")
}
decrypted, err := enc.Decrypt(encrypted)
if err != nil {
t.Fatal(err)
}
if decrypted != original {
t.Errorf("got %q, want %q", decrypted, original)
}
}
func TestEncryptorDecryptPlaintext(t *testing.T) {
enc, err := NewEncryptor(testKey())
if err != nil {
t.Fatal(err)
}
plain := `{"url":"https://hooks.slack.com/test"}`
result, err := enc.Decrypt(plain)
if err != nil {
t.Fatal(err)
}
if result != plain {
t.Errorf("plaintext passthrough failed: got %q", result)
}
}
func TestEncryptorBadKey(t *testing.T) {
_, err := NewEncryptor("tooshort")
if err == nil {
t.Error("expected error for short key")
}
_, err = NewEncryptor("not-hex-at-all-but-long-enough-to-be-64-chars-if-we-keep-going!!")
if err == nil {
t.Error("expected error for non-hex key")
}
}
func TestEncryptorUniqueCiphertexts(t *testing.T) {
enc, err := NewEncryptor(testKey())
if err != nil {
t.Fatal(err)
}
a, _ := enc.Encrypt("same")
b, _ := enc.Encrypt("same")
if a == b {
t.Error("two encryptions of same plaintext should produce different ciphertexts")
}
}
+5 -7
@@ -1,6 +1,9 @@
package store package store
- import "database/sql"
+ import (
+ "database/sql"
+ "strconv"
+ )
type Dialect interface { type Dialect interface {
DriverName() string DriverName() string
@@ -13,8 +16,6 @@ type Dialect interface {
UpsertNodeSQL() string UpsertNodeSQL() string
} }
// rewritePlaceholders converts ? markers to $1, $2, etc. for Postgres.
// For SQLite (or any dialect not needing rewrite), returns the input unchanged.
func rewritePlaceholders(query string, dollarStyle bool) string { func rewritePlaceholders(query string, dollarStyle bool) string {
if !dollarStyle { if !dollarStyle {
return query return query
@@ -25,10 +26,7 @@ func rewritePlaceholders(query string, dollarStyle bool) string {
if query[i] == '?' { if query[i] == '?' {
n++ n++
buf = append(buf, '$') buf = append(buf, '$')
- if n >= 10 {
- buf = append(buf, byte('0'+n/10))
- }
- buf = append(buf, byte('0'+n%10))
+ buf = append(buf, []byte(strconv.Itoa(n))...)
} else { } else {
buf = append(buf, query[i]) buf = append(buf, query[i])
} }
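The removed digit arithmetic only handled one- and two-digit indexes: at the 100th placeholder, `'0'+n/10` is `'0'+10`, which is the byte `:` rather than a digit. A standalone sketch of the rewrite follows, assuming the same byte-by-byte scan; `rewrite` is a hypothetical name, and like the hunk shown above the sketch does not skip `?` characters inside SQL string literals.

```go
package main

import (
	"fmt"
	"strconv"
)

// rewrite is a hypothetical standalone version of the placeholder rewrite:
// every '?' becomes $1, $2, ... in order of appearance.
func rewrite(query string) string {
	buf := make([]byte, 0, len(query)+8)
	n := 0
	for i := 0; i < len(query); i++ {
		if query[i] == '?' {
			n++
			buf = append(buf, '$')
			// AppendInt writes the decimal form of n for any number of digits.
			buf = strconv.AppendInt(buf, int64(n), 10)
		} else {
			buf = append(buf, query[i])
		}
	}
	return string(buf)
}

func main() {
	fmt.Println(rewrite("SELECT id FROM sites WHERE name = ? AND type = ?"))
}
```

`strconv.AppendInt` is used here instead of `strconv.Itoa` only to avoid the intermediate string; either produces the same bytes.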
+9
@@ -72,6 +72,15 @@ func (d *PostgresDialect) CreateTablesSQL() []string {
key TEXT PRIMARY KEY, key TEXT PRIMARY KEY,
value TEXT NOT NULL value TEXT NOT NULL
)`, )`,
`CREATE TABLE IF NOT EXISTS state_changes (
id SERIAL PRIMARY KEY,
site_id INTEGER NOT NULL,
from_status TEXT NOT NULL,
to_status TEXT NOT NULL,
error_reason TEXT DEFAULT '',
changed_at TIMESTAMP DEFAULT NOW()
)`,
`CREATE INDEX IF NOT EXISTS idx_state_changes_site ON state_changes(site_id, changed_at DESC)`,
} }
} }
+17 -1
@@ -10,7 +10,14 @@ import (
type SQLiteDialect struct{} type SQLiteDialect struct{}
func NewSQLiteStore(path string) (*SQLStore, error) { func NewSQLiteStore(path string) (*SQLStore, error) {
- return NewSQLStore("sqlite3", path, &SQLiteDialect{})
+ s, err := NewSQLStore("sqlite3", path, &SQLiteDialect{})
+ if err != nil {
+ return nil, err
+ }
+ if _, err := s.db.Exec("PRAGMA journal_mode=WAL"); err != nil {
+ log.Printf("WAL mode failed: %v", err)
+ }
+ return s, nil
} }
func (d *SQLiteDialect) DriverName() string { return "sqlite3" } func (d *SQLiteDialect) DriverName() string { return "sqlite3" }
@@ -72,6 +79,15 @@ func (d *SQLiteDialect) CreateTablesSQL() []string {
key TEXT PRIMARY KEY, key TEXT PRIMARY KEY,
value TEXT NOT NULL value TEXT NOT NULL
)`, )`,
`CREATE TABLE IF NOT EXISTS state_changes (
id INTEGER PRIMARY KEY AUTOINCREMENT,
site_id INTEGER NOT NULL,
from_status TEXT NOT NULL,
to_status TEXT NOT NULL,
error_reason TEXT DEFAULT '',
changed_at DATETIME DEFAULT CURRENT_TIMESTAMP
)`,
`CREATE INDEX IF NOT EXISTS idx_state_changes_site ON state_changes(site_id, changed_at DESC)`,
} }
} }
+140 -32
@@ -6,15 +6,24 @@ import (
"encoding/hex" "encoding/hex"
"encoding/json" "encoding/json"
"fmt" "fmt"
"gitea.lerkolabs.com/lerko/uptop/internal/models" "strings"
"log"
"time" "time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
)
const (
maxCheckHistory = 1000
checkHistoryPruneAt = 1100
maxMaintenanceExport = 1000
maxRequestBody = 1 << 20
) )
type SQLStore struct { type SQLStore struct {
db *sql.DB db *sql.DB
dialect Dialect dialect Dialect
dollar bool dollar bool
encryptor *Encryptor
} }
func NewSQLStore(driverName, dsn string, dialect Dialect) (*SQLStore, error) { func NewSQLStore(driverName, dsn string, dialect Dialect) (*SQLStore, error) {
@@ -22,10 +31,31 @@ func NewSQLStore(driverName, dsn string, dialect Dialect) (*SQLStore, error) {
if err != nil { if err != nil {
return nil, err return nil, err
} }
db.SetMaxOpenConns(25)
db.SetMaxIdleConns(5)
db.SetConnMaxLifetime(5 * time.Minute)
_, isDollar := dialect.(*PostgresDialect) _, isDollar := dialect.(*PostgresDialect)
return &SQLStore{db: db, dialect: dialect, dollar: isDollar}, nil return &SQLStore{db: db, dialect: dialect, dollar: isDollar}, nil
} }
func (s *SQLStore) SetEncryptor(enc *Encryptor) {
s.encryptor = enc
}
func (s *SQLStore) encryptSettings(jsonStr string) (string, error) {
if s.encryptor == nil {
return jsonStr, nil
}
return s.encryptor.Encrypt(jsonStr)
}
func (s *SQLStore) decryptSettings(data string) (string, error) {
if s.encryptor == nil {
return data, nil
}
return s.encryptor.Decrypt(data)
}
func (s *SQLStore) q(query string) string { func (s *SQLStore) q(query string) string {
return rewritePlaceholders(query, s.dollar) return rewritePlaceholders(query, s.dollar)
} }
@@ -50,7 +80,11 @@ func (s *SQLStore) Init() error {
} }
for _, m := range s.dialect.MigrationsSQL() { for _, m := range s.dialect.MigrationsSQL() {
if _, err := s.db.Exec(m); err != nil { if _, err := s.db.Exec(m); err != nil {
- log.Printf("migration error: %v", err)
+ errMsg := err.Error()
+ if strings.Contains(errMsg, "already exists") || strings.Contains(errMsg, "duplicate column") {
+ continue
+ }
+ return fmt.Errorf("migration failed: %w", err)
} }
} }
return nil return nil
@@ -140,39 +174,82 @@ func (s *SQLStore) GetSiteByName(name string) (models.Site, error) {
return st, err return st, err
} }
func (s *SQLStore) unmarshalSettings(raw string) (map[string]string, error) {
decrypted, err := s.decryptSettings(raw)
if err != nil {
return nil, fmt.Errorf("decrypt settings: %w", err)
}
var m map[string]string
if err := json.Unmarshal([]byte(decrypted), &m); err != nil {
return nil, fmt.Errorf("unmarshal settings: %w", err)
}
return m, nil
}
func (s *SQLStore) marshalSettings(settings map[string]string) (string, error) {
jsonBytes, err := json.Marshal(settings)
if err != nil {
return "", err
}
return s.encryptSettings(string(jsonBytes))
}
func (s *SQLStore) GetAlertByName(name string) (models.AlertConfig, error) { func (s *SQLStore) GetAlertByName(name string) (models.AlertConfig, error) {
var a models.AlertConfig var a models.AlertConfig
var settingsJSON string var settingsRaw string
err := s.db.QueryRow(s.q("SELECT id, name, type, settings FROM alerts WHERE name = ?"), name).Scan(&a.ID, &a.Name, &a.Type, &settingsJSON) err := s.db.QueryRow(s.q("SELECT id, name, type, settings FROM alerts WHERE name = ?"), name).Scan(&a.ID, &a.Name, &a.Type, &settingsRaw)
if err != nil { if err != nil {
return a, err return a, err
} }
if err := json.Unmarshal([]byte(settingsJSON), &a.Settings); err != nil { a.Settings, err = s.unmarshalSettings(settingsRaw)
return a, fmt.Errorf("unmarshal alert settings: %w", err) if err != nil {
return a, fmt.Errorf("alert %q: %w", name, err)
} }
return a, nil return a, nil
} }
func (s *SQLStore) AddSiteReturningID(site models.Site) (int, error) { func (s *SQLStore) AddSiteReturningID(site models.Site) (int, error) {
if err := s.AddSite(site); err != nil { token := ""
return 0, err if site.Type == "push" {
var err error
token, err = generateToken()
if err != nil {
return 0, fmt.Errorf("generate push token: %w", err)
} }
created, err := s.GetSiteByName(site.Name) }
if s.dollar {
var id int
err := s.db.QueryRow(s.q("INSERT INTO sites (name, url, type, token, interval, alert_id, check_ssl, threshold, max_retries, hostname, port, timeout, method, description, parent_id, accepted_codes, dns_resolve_type, dns_server, ignore_tls, paused, regions) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) RETURNING id"),
site.Name, site.URL, site.Type, token, site.Interval, site.AlertID, site.CheckSSL, site.ExpiryThreshold, site.MaxRetries,
site.Hostname, site.Port, site.Timeout, site.Method, site.Description, site.ParentID, site.AcceptedCodes, site.DNSResolveType, site.DNSServer, site.IgnoreTLS, site.Paused, site.Regions).Scan(&id)
return id, err
}
result, err := s.db.Exec(s.q("INSERT INTO sites (name, url, type, token, interval, alert_id, check_ssl, threshold, max_retries, hostname, port, timeout, method, description, parent_id, accepted_codes, dns_resolve_type, dns_server, ignore_tls, paused, regions) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"),
site.Name, site.URL, site.Type, token, site.Interval, site.AlertID, site.CheckSSL, site.ExpiryThreshold, site.MaxRetries,
site.Hostname, site.Port, site.Timeout, site.Method, site.Description, site.ParentID, site.AcceptedCodes, site.DNSResolveType, site.DNSServer, site.IgnoreTLS, site.Paused, site.Regions)
if err != nil { if err != nil {
return 0, err return 0, err
} }
return created.ID, nil id, err := result.LastInsertId()
return int(id), err
} }
func (s *SQLStore) AddAlertReturningID(name, aType string, settings map[string]string) (int, error) { func (s *SQLStore) AddAlertReturningID(name, aType string, settings map[string]string) (int, error) {
if err := s.AddAlert(name, aType, settings); err != nil { stored, err := s.marshalSettings(settings)
return 0, err
}
created, err := s.GetAlertByName(name)
if err != nil { if err != nil {
return 0, err return 0, err
} }
return created.ID, nil if s.dollar {
var id int
err := s.db.QueryRow(s.q("INSERT INTO alerts (name, type, settings) VALUES (?, ?, ?) RETURNING id"), name, aType, stored).Scan(&id)
return id, err
}
result, err := s.db.Exec(s.q("INSERT INTO alerts (name, type, settings) VALUES (?, ?, ?)"), name, aType, stored)
if err != nil {
return 0, err
}
id, err := result.LastInsertId()
return int(id), err
} }
func (s *SQLStore) GetAllAlerts() ([]models.AlertConfig, error) { func (s *SQLStore) GetAllAlerts() ([]models.AlertConfig, error) {
@@ -184,12 +261,13 @@ func (s *SQLStore) GetAllAlerts() ([]models.AlertConfig, error) {
var alerts []models.AlertConfig var alerts []models.AlertConfig
for rows.Next() { for rows.Next() {
var a models.AlertConfig var a models.AlertConfig
var settingsJSON string var settingsRaw string
if err := rows.Scan(&a.ID, &a.Name, &a.Type, &settingsJSON); err != nil { if err := rows.Scan(&a.ID, &a.Name, &a.Type, &settingsRaw); err != nil {
return alerts, err return alerts, err
} }
if err := json.Unmarshal([]byte(settingsJSON), &a.Settings); err != nil { a.Settings, err = s.unmarshalSettings(settingsRaw)
return alerts, fmt.Errorf("unmarshal alert settings for %q: %w", a.Name, err) if err != nil {
return alerts, fmt.Errorf("alert %q: %w", a.Name, err)
} }
alerts = append(alerts, a) alerts = append(alerts, a)
} }
@@ -198,32 +276,33 @@ func (s *SQLStore) GetAllAlerts() ([]models.AlertConfig, error) {
 func (s *SQLStore) GetAlert(id int) (models.AlertConfig, error) {
 	var a models.AlertConfig
-	var settingsJSON string
-	err := s.db.QueryRow(s.q("SELECT id, name, type, settings FROM alerts WHERE id = ?"), id).Scan(&a.ID, &a.Name, &a.Type, &settingsJSON)
+	var settingsRaw string
+	err := s.db.QueryRow(s.q("SELECT id, name, type, settings FROM alerts WHERE id = ?"), id).Scan(&a.ID, &a.Name, &a.Type, &settingsRaw)
 	if err != nil {
 		return a, err
 	}
-	if err := json.Unmarshal([]byte(settingsJSON), &a.Settings); err != nil {
-		return a, fmt.Errorf("unmarshal alert settings: %w", err)
+	a.Settings, err = s.unmarshalSettings(settingsRaw)
+	if err != nil {
+		return a, fmt.Errorf("alert %d: %w", id, err)
 	}
 	return a, nil
 }
 func (s *SQLStore) AddAlert(name, aType string, settings map[string]string) error {
-	jsonBytes, err := json.Marshal(settings)
+	stored, err := s.marshalSettings(settings)
 	if err != nil {
 		return err
 	}
-	_, err = s.db.Exec(s.q("INSERT INTO alerts (name, type, settings) VALUES (?, ?, ?)"), name, aType, string(jsonBytes))
+	_, err = s.db.Exec(s.q("INSERT INTO alerts (name, type, settings) VALUES (?, ?, ?)"), name, aType, stored)
 	return err
 }
 func (s *SQLStore) UpdateAlert(id int, name, aType string, settings map[string]string) error {
-	jsonBytes, err := json.Marshal(settings)
+	stored, err := s.marshalSettings(settings)
 	if err != nil {
 		return err
 	}
-	_, err = s.db.Exec(s.q("UPDATE alerts SET name=?, type=?, settings=? WHERE id=?"), name, aType, string(jsonBytes), id)
+	_, err = s.db.Exec(s.q("UPDATE alerts SET name=?, type=?, settings=? WHERE id=?"), name, aType, stored, id)
 	return err
 }
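Review note: the alert accessors above now go through `marshalSettings` and `unmarshalSettings`, whose bodies are not visible in this part of the diff. The sketch below shows one way such a pair could look, assuming the settings are stored as plain JSON text. The function bodies and error messages are hypothetical; the real helpers may do more (for example, transform the payload before storing it).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalSettings is a hypothetical stand-in for the store helper of the
// same name: it serializes the settings map to a JSON string.
func marshalSettings(settings map[string]string) (string, error) {
	b, err := json.Marshal(settings)
	if err != nil {
		return "", fmt.Errorf("marshal settings: %w", err)
	}
	return string(b), nil
}

// unmarshalSettings reverses marshalSettings. The error is wrapped with %w
// so callers can add their own context (alert name or ID) on top.
func unmarshalSettings(raw string) (map[string]string, error) {
	var out map[string]string
	if err := json.Unmarshal([]byte(raw), &out); err != nil {
		return nil, fmt.Errorf("unmarshal settings: %w", err)
	}
	return out, nil
}
```

Keeping the decode step in one helper means `GetAllAlerts` and `GetAlert` only differ in the context they prepend to the wrapped error.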
@@ -268,6 +347,29 @@ func (s *SQLStore) DeleteUser(id int) error {
return err return err
} }
func (s *SQLStore) SaveStateChange(siteID int, fromStatus, toStatus, errorReason string) error {
_, err := s.db.Exec(s.q("INSERT INTO state_changes (site_id, from_status, to_status, error_reason) VALUES (?, ?, ?, ?)"),
siteID, fromStatus, toStatus, errorReason)
return err
}
func (s *SQLStore) GetStateChanges(siteID int, limit int) ([]models.StateChange, error) {
rows, err := s.db.Query(s.q("SELECT id, site_id, from_status, to_status, error_reason, changed_at FROM state_changes WHERE site_id = ? ORDER BY changed_at DESC LIMIT ?"), siteID, limit)
if err != nil {
return nil, err
}
defer rows.Close()
var changes []models.StateChange
for rows.Next() {
var sc models.StateChange
if err := rows.Scan(&sc.ID, &sc.SiteID, &sc.FromStatus, &sc.ToStatus, &sc.ErrorReason, &sc.ChangedAt); err != nil {
return changes, err
}
changes = append(changes, sc)
}
return changes, rows.Err()
}
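Review note: `GetStateChanges` returns the newest rows first and caps the result with `LIMIT`. The in-memory sketch below mirrors that contract so the ordering is easy to reason about without a database. The `memStore` type and its methods are hypothetical and exist only for illustration.

```go
package main

// stateChange is a trimmed-down, hypothetical version of models.StateChange.
type stateChange struct {
	SiteID   int
	From, To string
	Reason   string
}

// memStore keeps changes in the order they were saved (oldest first).
type memStore struct {
	changes []stateChange
}

// save appends one transition, like SaveStateChange inserts one row.
func (m *memStore) save(siteID int, from, to, reason string) {
	m.changes = append(m.changes, stateChange{siteID, from, to, reason})
}

// recent walks the log backwards so the newest change comes first, and
// stops once limit entries for siteID have been collected.
func (m *memStore) recent(siteID, limit int) []stateChange {
	var out []stateChange
	for i := len(m.changes) - 1; i >= 0 && len(out) < limit; i-- {
		if m.changes[i].SiteID == siteID {
			out = append(out, m.changes[i])
		}
	}
	return out
}
```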
func (s *SQLStore) SaveCheck(siteID int, latencyNs int64, isUp bool) error { func (s *SQLStore) SaveCheck(siteID int, latencyNs int64, isUp bool) error {
return s.SaveCheckFromNode(siteID, "", latencyNs, isUp) return s.SaveCheckFromNode(siteID, "", latencyNs, isUp)
} }
@@ -277,10 +379,16 @@ func (s *SQLStore) SaveCheckFromNode(siteID int, nodeID string, latencyNs int64,
if err != nil { if err != nil {
return err return err
} }
_, err = s.db.Exec(s.q(`DELETE FROM check_history WHERE site_id = ? AND id NOT IN ( var count int
SELECT id FROM check_history WHERE site_id = ? ORDER BY checked_at DESC LIMIT 1000 _ = s.db.QueryRow(s.q("SELECT COUNT(*) FROM check_history WHERE site_id = ?"), siteID).Scan(&count)
)`), siteID, siteID) if count > checkHistoryPruneAt {
pruneQuery := fmt.Sprintf(`DELETE FROM check_history WHERE site_id = ? AND id NOT IN (
SELECT id FROM check_history WHERE site_id = ? ORDER BY checked_at DESC LIMIT %d
)`, maxCheckHistory)
_, err = s.db.Exec(s.q(pruneQuery), siteID, siteID)
return err return err
}
return nil
} }
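Review note: the new pruning path only deletes once the row count passes `checkHistoryPruneAt`, then trims back to `maxCheckHistory`, so most inserts skip the `DELETE` entirely. The sketch below shows the same idea on a slice. The two constant values are assumptions, since their definitions are not part of this hunk.

```go
package main

// Hypothetical values: the diff references maxCheckHistory and
// checkHistoryPruneAt but does not show what they are set to.
const (
	maxHistory = 1000
	pruneAt    = 1200
)

// appendAndPrune appends one record and trims to the newest maxHistory
// entries only after the slice has grown past pruneAt. The gap between
// the two thresholds is what keeps pruning infrequent.
func appendAndPrune(history []int, rec int) (out []int, pruned bool) {
	history = append(history, rec)
	if len(history) > pruneAt {
		return history[len(history)-maxHistory:], true
	}
	return history, false
}
```

With these assumed values, a prune runs at most once per 200 inserts per site instead of on every insert.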
func (s *SQLStore) RegisterNode(node models.ProbeNode) error { func (s *SQLStore) RegisterNode(node models.ProbeNode) error {
@@ -493,7 +601,7 @@ func (s *SQLStore) ExportData() (models.Backup, error) {
if err != nil { if err != nil {
return models.Backup{}, err return models.Backup{}, err
} }
windows, err := s.GetAllMaintenanceWindows(1000) windows, err := s.GetAllMaintenanceWindows(maxMaintenanceExport)
if err != nil { if err != nil {
return models.Backup{}, err return models.Backup{}, err
} }
+4
@@ -38,6 +38,10 @@ type Store interface {
SaveCheckFromNode(siteID int, nodeID string, latencyNs int64, isUp bool) error SaveCheckFromNode(siteID int, nodeID string, latencyNs int64, isUp bool) error
LoadAllHistory(limit int) (map[int][]models.CheckRecord, error) LoadAllHistory(limit int) (map[int][]models.CheckRecord, error)
// State Changes
SaveStateChange(siteID int, fromStatus, toStatus, errorReason string) error
GetStateChanges(siteID int, limit int) ([]models.StateChange, error)
// Nodes // Nodes
RegisterNode(node models.ProbeNode) error RegisterNode(node models.ProbeNode) error
GetNode(id string) (models.ProbeNode, error) GetNode(id string) (models.ProbeNode, error)
+96 -5
@@ -2,7 +2,10 @@ package tui
import ( import (
"fmt" "fmt"
"strings"
"time"
"gitea.lerkolabs.com/lerko/uptop/internal/monitor"
tea "github.com/charmbracelet/bubbletea" tea "github.com/charmbracelet/bubbletea"
"github.com/charmbracelet/huh" "github.com/charmbracelet/huh"
"github.com/charmbracelet/lipgloss" "github.com/charmbracelet/lipgloss"
@@ -113,34 +116,122 @@ func fmtAlertConfig(alert struct {
} }
} }
func fmtAlertHealth(h monitor.AlertHealth) string {
if h.LastSendAt.IsZero() {
return subtleStyle.Render("●")
}
if h.LastSendOK {
return specialStyle.Render("●")
}
return dangerStyle.Render("●")
}
func fmtAlertLastSent(h monitor.AlertHealth) string {
if h.LastSendAt.IsZero() {
return subtleStyle.Render("never")
}
d := time.Since(h.LastSendAt)
if d < time.Minute {
return fmt.Sprintf("%ds ago", int(d.Seconds()))
}
if d < time.Hour {
return fmt.Sprintf("%dm ago", int(d.Minutes()))
}
if d < 24*time.Hour {
return fmt.Sprintf("%dh ago", int(d.Hours()))
}
return fmt.Sprintf("%dd ago", int(d.Hours())/24)
}
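Review note: `fmtAlertLastSent` picks the largest unit that fits and truncates rather than rounds. The sketch below takes the elapsed duration as a parameter instead of calling `time.Since`, which makes the unit boundaries easy to check. The name `ago` is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// ago formats an elapsed duration using a single unit, truncating toward
// zero, in the same way fmtAlertLastSent does for LastSendAt.
func ago(d time.Duration) string {
	switch {
	case d < time.Minute:
		return fmt.Sprintf("%ds ago", int(d.Seconds()))
	case d < time.Hour:
		return fmt.Sprintf("%dm ago", int(d.Minutes()))
	case d < 24*time.Hour:
		return fmt.Sprintf("%dh ago", int(d.Hours()))
	default:
		return fmt.Sprintf("%dd ago", int(d.Hours())/24)
	}
}
```

Because of the truncation, 90 minutes renders as `1h ago` and 50 hours as `2d ago`.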
func (m Model) viewAlertsTab() string { func (m Model) viewAlertsTab() string {
if len(m.alerts) == 0 { if len(m.alerts) == 0 {
return "\n No alert channels configured. Press [n] to add one." return "\n No alert channels configured. Press [n] to add one."
} }
var headers []string
var widths []int
if m.isWide() {
headers = []string{"#", "", "NAME", "TYPE", "CONFIG", "LAST SENT"}
widths = []int{4, 3, 18, 12, 40, 12}
} else {
headers = []string{"#", "", "NAME", "TYPE", "CONFIG", "SENT"}
widths = []int{4, 3, 14, 10, 24, 8}
}
nameW := widths[2]
cfgW := widths[4]
return m.renderTable( return m.renderTable(
[]string{"#", "NAME", "TYPE", "CONFIG"}, headers,
len(m.alerts), len(m.alerts),
func(start, end int) [][]string { func(start, end int) [][]string {
var rows [][]string var rows [][]string
for i := start; i < end; i++ { for i := start; i < end; i++ {
a := m.alerts[i] a := m.alerts[i]
h := m.engine.GetAlertHealth(a.ID)
rows = append(rows, []string{ rows = append(rows, []string{
fmt.Sprintf("%d", i+1), fmt.Sprintf("%d", i+1),
m.zones.Mark(fmt.Sprintf("alert-%d", i), limitStr(a.Name, 15)), fmtAlertHealth(h),
m.zones.Mark(fmt.Sprintf("alert-%d", i), limitStr(a.Name, nameW-2)),
fmtAlertType(a.Type), fmtAlertType(a.Type),
fmtAlertConfig(struct { limitStr(fmtAlertConfig(struct {
Type string Type string
Settings map[string]string Settings map[string]string
}{a.Type, a.Settings}), }{a.Type, a.Settings}), cfgW-2),
fmtAlertLastSent(h),
}) })
} }
return rows return rows
}, },
nil, nil, widths, nil,
) )
} }
func (m Model) viewAlertDetailPanel() string {
if m.cursor >= len(m.alerts) {
return ""
}
a := m.alerts[m.cursor]
h := m.engine.GetAlertHealth(a.ID)
var b strings.Builder
b.WriteString(subtleStyle.Render(" Alerts > ") + titleStyle.Render(a.Name) + "\n\n")
row := func(label, value string) {
fmt.Fprintf(&b, " %-16s %s\n", subtleStyle.Render(label), value)
}
row("Type", fmtAlertType(a.Type))
if h.LastSendAt.IsZero() {
row("Health", subtleStyle.Render("never sent"))
} else if h.LastSendOK {
row("Health", specialStyle.Render("OK"))
} else {
row("Health", dangerStyle.Render("FAILED"))
}
if !h.LastSendAt.IsZero() {
row("Last Sent", h.LastSendAt.Format("2006-01-02 15:04:05")+" ("+fmtAlertLastSent(h)+")")
}
if h.SendCount > 0 {
row("Sends", fmt.Sprintf("%d sent, %d failed", h.SendCount, h.FailCount))
}
if h.LastError != "" {
row("Last Error", dangerStyle.Render(limitStr(h.LastError, 60)))
}
b.WriteString("\n" + subtleStyle.Render(" CONFIGURATION") + "\n")
for k, v := range a.Settings {
row(k, v)
}
b.WriteString("\n\n")
b.WriteString(subtleStyle.Render(" [i/Esc] Back [e] Edit [t] Test [q] Quit"))
return lipgloss.NewStyle().Padding(1, 2).Render(b.String())
}
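Review note: `viewAlertDetailPanel` ranges over `a.Settings` directly. Go does not guarantee map iteration order, so the CONFIGURATION rows may appear in a different order from one render to the next. A minimal sketch of a fix, using a hypothetical `sortedKeys` helper:

```go
package main

import "sort"

// sortedKeys returns the map's keys in ascending order so that rows are
// rendered in the same sequence on every frame.
func sortedKeys(settings map[string]string) []string {
	keys := make([]string, 0, len(settings))
	for k := range settings {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}
```

The panel loop would then iterate `for _, k := range sortedKeys(a.Settings)` and look up `a.Settings[k]` for each row.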
func (m *Model) initAlertHuhForm() tea.Cmd { func (m *Model) initAlertHuhForm() tea.Cmd {
m.alertFormData = &alertFormData{ m.alertFormData = &alertFormData{
AlertType: "discord", AlertType: "discord",
+88 -20
@@ -5,27 +5,83 @@ import (
"strings" "strings"
) )
func colorizeLog(line string) string { type logSeverity int
const (
severityInfo logSeverity = iota
severityWarn
severityDown
severityUp
severitySystem
)
func classifyLog(line string) logSeverity {
lower := strings.ToLower(line) lower := strings.ToLower(line)
switch { switch {
case strings.Contains(lower, "confirmed down"), case strings.Contains(lower, "confirmed down"),
strings.Contains(lower, "is down"), strings.Contains(lower, "is down"),
strings.Contains(lower, "missed heartbeat"), strings.Contains(lower, "missed heartbeat"),
strings.Contains(lower, "failed check"), strings.Contains(lower, "alert send failed"):
strings.Contains(lower, "ssl warning"): return severityDown
return dangerStyle.Render(line)
case strings.Contains(lower, "recovered"), case strings.Contains(lower, "recovered"),
strings.Contains(lower, "is up"), strings.Contains(lower, "is up"),
strings.Contains(lower, "recovery"): strings.Contains(lower, "recovery"),
return specialStyle.Render(line) strings.Contains(lower, "first heartbeat"):
return severityUp
case strings.Contains(lower, "failed check"),
strings.Contains(lower, "ssl warning"),
strings.Contains(lower, "overdue"),
strings.Contains(lower, "was late"):
return severityWarn
case strings.Contains(lower, "engine"), case strings.Contains(lower, "engine"),
strings.Contains(lower, "cluster"): strings.Contains(lower, "cluster"),
return titleStyle.Render(line) strings.Contains(lower, "loaded"),
strings.Contains(lower, "paused"),
strings.Contains(lower, "resumed"):
return severitySystem
default: default:
return line return severityInfo
} }
} }
func isImportantLog(sev logSeverity) bool {
return sev == severityDown || sev == severityUp || sev == severitySystem
}
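Review note: `classifyLog` depends on case order, because the first matching `case` wins, and on plain substring matching. The sketch below restates the same idea as an ordered rule table to make both properties visible. The type and rule names are hypothetical and only a subset of the keywords is included.

```go
package main

import "strings"

type severity int

const (
	sevInfo severity = iota
	sevWarn
	sevDown
	sevUp
)

// rules are checked top to bottom; the first rule with a matching keyword
// decides the severity, just like the first matching case in a switch.
var rules = []struct {
	sev      severity
	keywords []string
}{
	{sevDown, []string{"confirmed down", "is down", "missed heartbeat"}},
	{sevUp, []string{"recovered", "is up", "recovery"}},
	{sevWarn, []string{"failed check", "ssl warning", "overdue"}},
}

// classify lowercases the line once and returns the first matching severity.
func classify(line string) severity {
	lower := strings.ToLower(line)
	for _, r := range rules {
		for _, kw := range r.keywords {
			if strings.Contains(lower, kw) {
				return r.sev
			}
		}
	}
	return sevInfo
}
```

Two consequences are worth keeping in mind: a line such as "db recovered after failed check" is classified as UP because that rule comes first, and a line containing "this update" also matches `is up`, since substring matching ignores word boundaries.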
func renderLogTag(sev logSeverity) string {
switch sev {
case severityDown:
return dangerStyle.Render(" DOWN ")
case severityUp:
return specialStyle.Render(" UP ")
case severityWarn:
return warnStyle.Render(" WARN ")
case severitySystem:
return titleStyle.Render(" SYS ")
default:
return subtleStyle.Render(" info ")
}
}
func renderLogLine(line string) string {
sev := classifyLog(line)
tag := renderLogTag(sev)
ts := ""
msg := line
if len(line) > 10 && line[0] == '[' {
if idx := strings.Index(line, "]"); idx > 0 && idx < 12 {
ts = subtleStyle.Render(line[1:idx])
msg = strings.TrimSpace(line[idx+1:])
}
}
if ts != "" {
return fmt.Sprintf(" %s %s %s", ts, tag, msg)
}
return fmt.Sprintf(" %s %s", tag, msg)
}
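Review note: `renderLogLine` treats a short bracketed prefix as the timestamp and everything after it as the message. The sketch below isolates that split with the same bounds checks. The `[HH:MM:SS]` input format used in the usage note is an assumption; the diff only shows that the closing bracket must appear before index 12.

```go
package main

import "strings"

// splitTimestamp separates a short leading "[...]" prefix from the rest of
// the line. Lines without such a prefix are returned unchanged as msg.
func splitTimestamp(line string) (ts, msg string) {
	if len(line) > 10 && line[0] == '[' {
		if idx := strings.Index(line, "]"); idx > 0 && idx < 12 {
			return line[1:idx], strings.TrimSpace(line[idx+1:])
		}
	}
	return "", line
}
```

For an input like `[15:04:05] api is down` this yields `15:04:05` and `api is down`; a longer bracketed prefix falls through and the whole line is kept as the message.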
func (m Model) viewLogsTab() string { func (m Model) viewLogsTab() string {
content := m.logViewport.View() content := m.logViewport.View()
if strings.TrimSpace(content) == "" || content == "Waiting for logs..." { if strings.TrimSpace(content) == "" || content == "Waiting for logs..." {
@@ -33,22 +89,34 @@ func (m Model) viewLogsTab() string {
} }
lines := strings.Split(content, "\n") lines := strings.Split(content, "\n")
var colored []string var rendered []string
total := 0
shown := 0
for _, line := range lines { for _, line := range lines {
if line == "" { if strings.TrimSpace(line) == "" {
colored = append(colored, line)
continue continue
} }
colored = append(colored, colorizeLog(line)) total++
sev := classifyLog(line)
if m.logFilterImportant && !isImportantLog(sev) {
continue
}
shown++
rendered = append(rendered, renderLogLine(line))
} }
count := 0 filterLabel := "All"
for _, l := range lines { if m.logFilterImportant {
if strings.TrimSpace(l) != "" { filterLabel = "Important"
count++
}
} }
header := subtleStyle.Render(fmt.Sprintf(" %d entries [↑/↓] Scroll [PgUp/PgDn] Page", count)) header := subtleStyle.Render(fmt.Sprintf(
return "\n" + header + "\n\n" + strings.Join(colored, "\n") " %d entries [↑/↓] Scroll [PgUp/PgDn] Page [f] Filter: %s", shown, filterLabel))
if m.logFilterImportant && shown < total {
header += subtleStyle.Render(fmt.Sprintf(" (%d hidden)", total-shown))
}
return "\n" + header + "\n\n" + strings.Join(rendered, "\n")
} }
+27 -10
@@ -2,10 +2,11 @@ package tui
import ( import (
"fmt" "fmt"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
"strconv" "strconv"
"time" "time"
"gitea.lerkolabs.com/lerko/uptop/internal/models"
tea "github.com/charmbracelet/bubbletea" tea "github.com/charmbracelet/bubbletea"
"github.com/charmbracelet/huh" "github.com/charmbracelet/huh"
"github.com/charmbracelet/lipgloss" "github.com/charmbracelet/lipgloss"
@@ -40,19 +41,19 @@ func fmtMaintType(t string) string {
return maintStyle.Render("maintenance") return maintStyle.Render("maintenance")
} }
func fmtMaintMonitor(monitorID int, sites []models.Site) string { func fmtMaintMonitorW(monitorID int, sites []models.Site, maxW int) string {
if monitorID == 0 { if monitorID == 0 {
return "All" return "All"
} }
for _, s := range sites { for _, s := range sites {
if s.ID == monitorID { if s.ID == monitorID {
return limitStr(s.Name, 18) return limitStr(s.Name, maxW)
} }
} }
return fmt.Sprintf("#%d", monitorID) return fmt.Sprintf("#%d", monitorID)
} }
func fmtMaintTime(t time.Time) string { func fmtMaintTime(t time.Time, colW int) string {
if t.IsZero() { if t.IsZero() {
return subtleStyle.Render("—") return subtleStyle.Render("—")
} }
@@ -60,7 +61,10 @@ func fmtMaintTime(t time.Time) string {
if t.Year() == now.Year() && t.YearDay() == now.YearDay() { if t.Year() == now.Year() && t.YearDay() == now.YearDay() {
return t.Format("15:04") return t.Format("15:04")
} }
if colW >= 14 {
return t.Format("15:04 Jan 02") return t.Format("15:04 Jan 02")
}
return t.Format("Jan 02")
} }
func (m Model) isMonitorInMaintenance(monitorID int) bool { func (m Model) isMonitorInMaintenance(monitorID int) bool {
@@ -92,8 +96,21 @@ func (m Model) viewMaintTab() string {
return "\n No maintenance windows or incidents. Press [n] to create one." return "\n No maintenance windows or incidents. Press [n] to create one."
} }
var headers []string
var widths []int
if m.isWide() {
headers = []string{"#", "TITLE", "TYPE", "MONITORS", "STATUS", "STARTED", "ENDS"}
widths = []int{4, 24, 14, 22, 12, 16, 16}
} else {
headers = []string{"#", "TITLE", "TYPE", "MON", "ST", "START", "ENDS"}
widths = []int{4, 14, 13, 14, 11, 14, 14}
}
titleW := widths[1]
monW := widths[3]
timeW := widths[5]
return m.renderTable( return m.renderTable(
[]string{"#", "TITLE", "TYPE", "MONITORS", "STATUS", "STARTED", "ENDS"}, headers,
len(m.maintenanceWindows), len(m.maintenanceWindows),
func(start, end int) [][]string { func(start, end int) [][]string {
var rows [][]string var rows [][]string
@@ -102,17 +119,17 @@ func (m Model) viewMaintTab() string {
mw := m.maintenanceWindows[i] mw := m.maintenanceWindows[i]
rows = append(rows, []string{ rows = append(rows, []string{
strconv.Itoa(i + 1), strconv.Itoa(i + 1),
m.zones.Mark(fmt.Sprintf("maint-%d", i), limitStr(mw.Title, 24)), m.zones.Mark(fmt.Sprintf("maint-%d", i), limitStr(mw.Title, titleW-2)),
fmtMaintType(mw.Type), fmtMaintType(mw.Type),
fmtMaintMonitor(mw.MonitorID, allSites), fmtMaintMonitorW(mw.MonitorID, allSites, monW-2),
fmtMaintStatus(mw), fmtMaintStatus(mw),
fmtMaintTime(mw.StartTime), fmtMaintTime(mw.StartTime, timeW),
fmtMaintTime(mw.EndTime), fmtMaintTime(mw.EndTime, timeW),
}) })
} }
return rows return rows
}, },
[]int{6, 0, 14, 20, 12, 16, 16}, widths,
nil, nil,
) )
} }
+13 -4
@@ -10,16 +10,25 @@ func (m Model) viewNodesTab() string {
return "\n No probe nodes connected." return "\n No probe nodes connected."
} }
colWidths := []int{0, 12, 20, 10, 8} var headers []string
var widths []int
if m.isWide() {
headers = []string{"NAME", "REGION", "LAST SEEN", "VERSION", "STATUS"}
widths = []int{24, 14, 16, 12, 10}
} else {
headers = []string{"NAME", "REGION", "SEEN", "VER", "STATUS"}
widths = []int{16, 10, 10, 8, 8}
}
nameW := widths[0]
return m.renderTable( return m.renderTable(
[]string{"NAME", "REGION", "LAST SEEN", "VERSION", "STATUS"}, headers,
len(m.nodes), len(m.nodes),
func(start, end int) [][]string { func(start, end int) [][]string {
var rows [][]string var rows [][]string
for i := start; i < end; i++ { for i := start; i < end; i++ {
node := m.nodes[i] node := m.nodes[i]
name := limitStr(node.Name, 20) name := limitStr(node.Name, nameW-2)
if name == "" { if name == "" {
name = node.ID name = node.ID
} }
@@ -37,7 +46,7 @@ func (m Model) viewNodesTab() string {
} }
return rows return rows
}, },
colWidths, widths,
nil, nil,
) )
} }
+177 -26
@@ -60,14 +60,18 @@ type siteFormData struct {
Regions string Regions string
} }
func latencySparkline(latencies []time.Duration, width int) string { func latencySparkline(latencies []time.Duration, statuses []bool, width int) string {
if len(latencies) == 0 { if len(latencies) == 0 {
return subtleStyle.Render(strings.Repeat("·", width)) return subtleStyle.Render(strings.Repeat("·", width))
} }
samples := latencies samples := latencies
sampledStatuses := statuses
if len(samples) > width { if len(samples) > width {
samples = samples[len(samples)-width:] samples = samples[len(samples)-width:]
if len(sampledStatuses) > width {
sampledStatuses = sampledStatuses[len(sampledStatuses)-width:]
}
} }
minL, maxL := samples[0], samples[0] minL, maxL := samples[0], samples[0]
@@ -85,7 +89,7 @@ func latencySparkline(latencies []time.Duration, width int) string {
sb.WriteString(subtleStyle.Render(strings.Repeat("·", remaining))) sb.WriteString(subtleStyle.Render(strings.Repeat("·", remaining)))
} }
spread := maxL - minL spread := maxL - minL
for _, l := range samples { for i, l := range samples {
idx := 0 idx := 0
if spread > 0 { if spread > 0 {
idx = int(float64(l-minL) / float64(spread) * 7) idx = int(float64(l-minL) / float64(spread) * 7)
@@ -94,6 +98,10 @@ func latencySparkline(latencies []time.Duration, width int) string {
} }
} }
ch := string(sparkChars[idx]) ch := string(sparkChars[idx])
isDown := i < len(sampledStatuses) && !sampledStatuses[i]
if isDown {
sb.WriteString(dangerStyle.Render(ch))
} else {
ms := l.Milliseconds() ms := l.Milliseconds()
if ms < 200 { if ms < 200 {
sb.WriteString(specialStyle.Render(ch)) sb.WriteString(specialStyle.Render(ch))
@@ -103,6 +111,7 @@ func latencySparkline(latencies []time.Duration, width int) string {
sb.WriteString(dangerStyle.Render(ch)) sb.WriteString(dangerStyle.Render(ch))
} }
} }
}
return sb.String() return sb.String()
} }
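Review note: the sparkline fix keeps the bar height derived from latency but lets the check status override the color, so a failed check with 0ms latency no longer renders green. The sketch below separates the two decisions. It uses millisecond integers instead of `time.Duration`, and the glyph set `bars` is an assumption, since the definition of `sparkChars` is not shown in this diff.

```go
package main

// bars is an assumed glyph set with eight heights (index 0..7).
var bars = []rune("▁▂▃▄▅▆▇█")

// cell is one rendered sample: its glyph and whether it should be drawn
// in the "down" color regardless of latency.
type cell struct {
	ch   rune
	down bool
}

// sparkCells scales each latency into 0..7 relative to the min and max of
// the window, and flags samples whose matching status is false.
func sparkCells(ms []int64, up []bool) []cell {
	if len(ms) == 0 {
		return nil
	}
	minL, maxL := ms[0], ms[0]
	for _, v := range ms {
		if v < minL {
			minL = v
		}
		if v > maxL {
			maxL = v
		}
	}
	spread := maxL - minL
	cells := make([]cell, 0, len(ms))
	for i, v := range ms {
		idx := 0
		if spread > 0 {
			idx = int(float64(v-minL) / float64(spread) * 7)
		}
		// A missing status (shorter slice) is treated as "up".
		cells = append(cells, cell{ch: bars[idx], down: i < len(up) && !up[i]})
	}
	return cells
}
```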
@@ -302,6 +311,8 @@ func fmtStatus(status string, paused bool, inMaint bool) string {
switch status { switch status {
case "DOWN", "SSL EXP": case "DOWN", "SSL EXP":
return dangerStyle.Render(status) return dangerStyle.Render(status)
case "LATE":
return warnStyle.Render(status)
case "PENDING": case "PENDING":
return subtleStyle.Render(status) return subtleStyle.Render(status)
default: default:
@@ -309,28 +320,94 @@ func fmtStatus(status string, paused bool, inMaint bool) string {
} }
} }
func (m Model) dynamicWidths() (nameW, sparkW int) { func fmtDuration(d time.Duration) string {
fixed := 6 + 10 + 10 + 8 + 8 + 7 + 9 // #, TYPE, STATUS, LATENCY, UPTIME, SSL, RETRY if d < time.Minute {
overhead := 30 // cell padding + borders return fmt.Sprintf("%ds", int(d.Seconds()))
avail := m.termWidth - chromePadH - 2 - fixed - overhead }
if avail < 30 { if d < time.Hour {
avail = 30 return fmt.Sprintf("%dm", int(d.Minutes()))
}
if d < 24*time.Hour {
h := int(d.Hours())
m := int(d.Minutes()) % 60
if m > 0 {
return fmt.Sprintf("%dh %dm", h, m)
}
return fmt.Sprintf("%dh", h)
}
days := int(d.Hours()) / 24
hours := int(d.Hours()) % 24
if hours > 0 {
return fmt.Sprintf("%dd %dh", days, hours)
}
return fmt.Sprintf("%dd", days)
}
type tableLayout struct {
nameW, sparkW int
headers []string
colWidths []int
}
func (m Model) computeLayout() tableLayout {
wide := m.isWide()
var fixed int
var headers []string
var widths []int
if wide {
// # NAME TYPE STATUS LATENCY UPTIME HISTORY SSL RETRIES
headers = []string{"#", "NAME", "TYPE", "STATUS", "LATENCY", "UPTIME", "HISTORY", "SSL", "RETRIES"}
widths = []int{4, 0, 10, 10, 10, 8, 0, 7, 9}
fixed = 4 + 10 + 10 + 10 + 8 + 7 + 9
} else {
// # NAME TYPE STATUS LAT UP% HISTORY SSL RT
headers = []string{"#", "NAME", "TYPE", "STATUS", "LAT", "UP%", "HISTORY", "SSL", "RT"}
widths = []int{4, 0, 8, 8, 7, 8, 0, 5, 5}
fixed = 4 + 8 + 8 + 7 + 8 + 5 + 5
}
numCols := len(headers)
borderOverhead := 2 + (numCols - 1)
avail := m.termWidth - chromePadH - 2 - borderOverhead - fixed
if avail < 20 {
avail = 20
}
maxName := 0
for _, s := range m.sites {
if n := len([]rune(s.Name)); n > maxName {
maxName = n
}
}
maxName += 4
nameW := avail / 2
if nameW > maxName {
nameW = maxName
} }
nameW = avail / 2
sparkW = avail - nameW - 2 // -2 for spark column padding
if nameW < 13 { if nameW < 13 {
nameW = 13 nameW = 13
} }
if nameW > 40 { if nameW > 40 {
nameW = 40 nameW = 40
} }
sparkW := avail - nameW
if sparkW < 10 { if sparkW < 10 {
sparkW = 10 sparkW = 10
} }
if sparkW > 60 {
sparkW = 60 widths[1] = nameW
widths[6] = sparkW
return tableLayout{
nameW: nameW,
sparkW: sparkW,
headers: headers,
colWidths: widths,
} }
return
} }
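Review note: `computeLayout` gives the name column at most half of the flexible width, caps it at the longest monitor name plus padding, clamps it to 13..40, and hands the remainder to the sparkline. The sketch below reproduces just that arithmetic as a pure function. The name `splitWidth` is hypothetical.

```go
package main

// splitWidth divides the flexible width between the NAME and HISTORY
// columns using the same clamps as computeLayout.
func splitWidth(avail, longestName int) (nameW, sparkW int) {
	if avail < 20 {
		avail = 20
	}
	nameW = avail / 2
	// Do not reserve more room than the longest name (plus padding) needs.
	if limit := longestName + 4; nameW > limit {
		nameW = limit
	}
	if nameW < 13 {
		nameW = 13
	}
	if nameW > 40 {
		nameW = 40
	}
	sparkW = avail - nameW
	if sparkW < 10 {
		sparkW = 10
	}
	return nameW, sparkW
}
```

One property of this arithmetic: on very narrow terminals the two floors (13 and 10) can add up to more than the available width, so the table relies on the later width clamp in `renderTable` in that case.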
func (m Model) viewSitesTab() string { func (m Model) viewSitesTab() string {
@@ -348,12 +425,16 @@ func (m Model) viewSitesTab() string {
return "\n" + welcome return "\n" + welcome
} }
nameW, sparkWidth := m.dynamicWidths() layout := m.computeLayout()
colWidths := []int{6, 0, 10, 10, 8, 8, sparkWidth + 2, 7, 9} nameW := layout.nameW
sparkWidth := layout.sparkW - 2
if sparkWidth < 8 {
sparkWidth = 8
}
var groupRows map[int]bool var groupRows map[int]bool
return m.renderTable( return m.renderTable(
[]string{"#", "NAME", "TYPE", "STATUS", "LATENCY", "UPTIME", "HISTORY", "SSL", "RETRY"}, layout.headers,
len(m.sites), len(m.sites),
func(start, end int) [][]string { func(start, end int) [][]string {
groupRows = make(map[int]bool) groupRows = make(map[int]bool)
@@ -366,7 +447,7 @@ func (m Model) viewSitesTab() string {
icon := typeIcon("group", m.collapsed[site.ID]) icon := typeIcon("group", m.collapsed[site.ID])
rows = append(rows, []string{ rows = append(rows, []string{
strconv.Itoa(i + 1), strconv.Itoa(i + 1),
m.zones.Mark(fmt.Sprintf("site-%d", i), icon+" "+limitStr(site.Name, nameW-2)), m.zones.Mark(fmt.Sprintf("site-%d", i), icon+" "+limitStr(site.Name, nameW-4)),
"group", "group",
fmtStatus(site.Status, site.Paused, m.isMonitorInMaintenance(site.ID)), fmtStatus(site.Status, site.Paused, m.isMonitorInMaintenance(site.ID)),
subtleStyle.Render("—"), subtleStyle.Render("—"),
@@ -384,9 +465,17 @@ func (m Model) viewSitesTab() string {
if i+1 >= len(m.sites) || m.sites[i+1].ParentID != site.ParentID { if i+1 >= len(m.sites) || m.sites[i+1].ParentID != site.ParentID {
prefix = "└" prefix = "└"
} }
name = prefix + " " + limitStr(name, nameW-2) name = prefix + " " + limitStr(name, nameW-4)
} else { } else {
name = limitStr(name, nameW) name = limitStr(name, nameW-2)
}
if (site.Status == "DOWN" || site.Status == "SSL EXP" || site.Status == "LATE") && site.LastError != "" {
nameLen := len([]rune(name))
errSpace := nameW - nameLen - 3
if errSpace > 10 {
name = name + " " + subtleStyle.Render(limitStr(site.LastError, errSpace))
}
} }
hist, _ := m.engine.GetHistory(site.ID) hist, _ := m.engine.GetHistory(site.ID)
@@ -394,7 +483,7 @@ func (m Model) viewSitesTab() string {
if site.Type == "push" { if site.Type == "push" {
spark = heartbeatSparkline(hist.Statuses, sparkWidth) spark = heartbeatSparkline(hist.Statuses, sparkWidth)
} else { } else {
spark = latencySparkline(hist.Latencies, sparkWidth) spark = latencySparkline(hist.Latencies, hist.Statuses, sparkWidth)
} }
rows = append(rows, []string{ rows = append(rows, []string{
@@ -411,7 +500,7 @@ func (m Model) viewSitesTab() string {
} }
return rows return rows
}, },
colWidths, layout.colWidths,
func(row, col int) *lipgloss.Style { func(row, col int) *lipgloss.Style {
if groupRows[row] { if groupRows[row] {
s := siteGroupStyle s := siteGroupStyle
@@ -731,7 +820,30 @@ func (m Model) viewDetailPanel() string {
fmt.Fprintf(&b, " %-16s %s\n", subtleStyle.Render(label), value) fmt.Fprintf(&b, " %-16s %s\n", subtleStyle.Render(label), value)
} }
section := func(label string) {
b.WriteString("\n" + subtleStyle.Render(" "+label) + "\n")
}
row("Status", fmtStatus(site.Status, site.Paused, m.isMonitorInMaintenance(site.ID))) row("Status", fmtStatus(site.Status, site.Paused, m.isMonitorInMaintenance(site.ID)))
if (site.Status == "DOWN" || site.Status == "SSL EXP" || site.Status == "LATE") && site.LastError != "" {
row("Error", dangerStyle.Render(limitStr(site.LastError, 60)))
}
if site.Type == "http" && site.StatusCode > 0 {
row("HTTP Code", strconv.Itoa(site.StatusCode))
}
if !site.StatusChangedAt.IsZero() {
dur := time.Since(site.StatusChangedAt)
row("State Since", site.StatusChangedAt.Format("2006-01-02 15:04:05")+" ("+fmtDuration(dur)+")")
}
if !site.LastSuccessAt.IsZero() {
ago := time.Since(site.LastSuccessAt)
row("Last Success", site.LastSuccessAt.Format("15:04:05")+" ("+fmtDuration(ago)+" ago)")
}
if m.isMonitorInMaintenance(site.ID) { if m.isMonitorInMaintenance(site.ID) {
for _, mw := range m.maintenanceWindows { for _, mw := range m.maintenanceWindows {
if mw.Type == "maintenance" && (mw.MonitorID == 0 || mw.MonitorID == site.ID || mw.MonitorID == site.ParentID) { if mw.Type == "maintenance" && (mw.MonitorID == 0 || mw.MonitorID == site.ID || mw.MonitorID == site.ParentID) {
@@ -740,6 +852,8 @@ func (m Model) viewDetailPanel() string {
} }
} }
} }
section("ENDPOINT")
row("Type", site.Type) row("Type", site.Type)
if site.URL != "" { if site.URL != "" {
row("URL", site.URL) row("URL", site.URL)
@@ -750,20 +864,36 @@ func (m Model) viewDetailPanel() string {
if site.Port > 0 { if site.Port > 0 {
row("Port", strconv.Itoa(site.Port)) row("Port", strconv.Itoa(site.Port))
} }
section("TIMING")
row("Interval", fmt.Sprintf("%ds", site.Interval)) row("Interval", fmt.Sprintf("%ds", site.Interval))
if site.Timeout > 0 {
row("Timeout", fmt.Sprintf("%ds", site.Timeout)) row("Timeout", fmt.Sprintf("%ds", site.Timeout))
}
row("Latency", fmtLatency(site.Latency)) row("Latency", fmtLatency(site.Latency))
row("Uptime", fmtUptime(hist.Statuses)) row("Uptime", fmtUptime(hist.Statuses))
if !site.LastCheck.IsZero() {
row("Last Check", site.LastCheck.Format("15:04:05"))
}
if site.Type == "http" { if site.Type == "http" {
section("HTTP")
if site.Method != "" && site.Method != "GET" {
row("Method", site.Method) row("Method", site.Method)
row("Codes", site.AcceptedCodes) }
codes := site.AcceptedCodes
if codes == "" {
codes = "200-299"
}
row("Codes", codes)
row("SSL", fmtSSL(site)) row("SSL", fmtSSL(site))
if site.IgnoreTLS { if site.IgnoreTLS {
row("TLS Verify", dangerStyle.Render("disabled")) row("TLS Verify", dangerStyle.Render("disabled"))
} }
} }
if site.MaxRetries > 0 || site.Regions != "" || site.Description != "" {
section("CONFIG")
if site.MaxRetries > 0 { if site.MaxRetries > 0 {
row("Retries", fmtRetries(site)) row("Retries", fmtRetries(site))
} }
@@ -773,8 +903,6 @@ func (m Model) viewDetailPanel() string {
if site.Description != "" { if site.Description != "" {
row("Description", site.Description) row("Description", site.Description)
} }
if !site.LastCheck.IsZero() {
row("Last Check", site.LastCheck.Format("15:04:05"))
} }
probeResults := m.engine.GetProbeResults(site.ID) probeResults := m.engine.GetProbeResults(site.ID)
@@ -787,7 +915,30 @@ func (m Model) viewDetailPanel() string {
} }
latency := time.Duration(result.LatencyNs).Milliseconds() latency := time.Duration(result.LatencyNs).Milliseconds()
ago := time.Since(result.CheckedAt).Truncate(time.Second) ago := time.Since(result.CheckedAt).Truncate(time.Second)
fmt.Fprintf(&b, " %-14s %s %dms %s ago\n", nodeID, status, latency, ago) line := fmt.Sprintf(" %-14s %s %dms %s ago", nodeID, status, latency, ago)
if !result.IsUp && result.ErrorReason != "" {
line += " " + dangerStyle.Render(limitStr(result.ErrorReason, 30))
}
b.WriteString(line + "\n")
}
}
stateChanges := m.engine.GetStateChanges(site.ID, 5)
if len(stateChanges) > 0 {
b.WriteString("\n" + subtleStyle.Render(" STATE CHANGES") + "\n")
for _, sc := range stateChanges {
ago := fmtDuration(time.Since(sc.ChangedAt))
arrow := subtleStyle.Render(sc.FromStatus) + " → "
if sc.ToStatus == "UP" {
arrow += specialStyle.Render(sc.ToStatus)
} else {
arrow += dangerStyle.Render(sc.ToStatus)
}
line := fmt.Sprintf(" %s %s", arrow, subtleStyle.Render(ago+" ago"))
if sc.ErrorReason != "" && sc.ToStatus != "UP" {
line += " " + dangerStyle.Render(limitStr(sc.ErrorReason, 40))
}
b.WriteString(line + "\n")
} }
} }
@@ -807,7 +958,7 @@ func (m Model) viewDetailPanel() string {
up, len(hist.Statuses)) up, len(hist.Statuses))
} }
} else { } else {
b.WriteString(" " + latencySparkline(hist.Latencies, sparkWidth)) b.WriteString(" " + latencySparkline(hist.Latencies, hist.Statuses, sparkWidth))
if len(hist.Latencies) > 0 { if len(hist.Latencies) > 0 {
minL, maxL := hist.Latencies[0], hist.Latencies[0] minL, maxL := hist.Latencies[0], hist.Latencies[0]
var total time.Duration var total time.Duration
+14 -3
@@ -32,8 +32,19 @@ func (m Model) viewUsersTab() string {
return "\n No users configured. Press [n] to add one." return "\n No users configured. Press [n] to add one."
} }
var headers []string
var widths []int
if m.isWide() {
headers = []string{"#", "USERNAME", "ROLE", "PUBLIC KEY"}
widths = []int{4, 18, 10, 50}
} else {
headers = []string{"#", "USER", "ROLE", "KEY"}
widths = []int{4, 14, 8, 30}
}
userW := widths[1]
return m.renderTable( return m.renderTable(
[]string{"#", "USERNAME", "ROLE", "PUBLIC KEY"}, headers,
len(m.users), len(m.users),
func(start, end int) [][]string { func(start, end int) [][]string {
var rows [][]string var rows [][]string
@@ -41,14 +52,14 @@ func (m Model) viewUsersTab() string {
u := m.users[i] u := m.users[i]
rows = append(rows, []string{ rows = append(rows, []string{
fmt.Sprintf("%d", i+1), fmt.Sprintf("%d", i+1),
m.zones.Mark(fmt.Sprintf("user-%d", i), limitStr(u.Username, 15)), m.zones.Mark(fmt.Sprintf("user-%d", i), limitStr(u.Username, userW-2)),
fmtRole(u.Role), fmtRole(u.Role),
fmtKey(u.PublicKey), fmtKey(u.PublicKey),
}) })
} }
return rows return rows
}, },
nil, nil, widths, nil,
) )
} }
+23 -4
@@ -15,6 +15,12 @@ var (
type StyleOverride func(row, col int) *lipgloss.Style type StyleOverride func(row, col int) *lipgloss.Style
const wideBreakpoint = 120
func (m Model) isWide() bool {
return m.termWidth >= wideBreakpoint
}
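Review note: every tab now selects a header set and a width set from the same `isWide()` check. The sketch below shows the pattern as a pure function of the terminal width, using the nodes tab values from this diff. The `columns` type and the function names are hypothetical.

```go
package main

const wideBreakpoint = 120

// columns pairs each header with its column width; both slices must have
// the same length.
type columns struct {
	headers []string
	widths  []int
}

// nodeColumns returns the full headers at or above the breakpoint and the
// abbreviated ones below it.
func nodeColumns(termWidth int) columns {
	if termWidth >= wideBreakpoint {
		return columns{
			headers: []string{"NAME", "REGION", "LAST SEEN", "VERSION", "STATUS"},
			widths:  []int{24, 14, 16, 12, 10},
		}
	}
	return columns{
		headers: []string{"NAME", "REGION", "SEEN", "VER", "STATUS"},
		widths:  []int{16, 10, 10, 8, 8},
	}
}

// totalWidth sums the column widths, which is useful for checking that a
// layout fits the width it was chosen for.
func totalWidth(c columns) int {
	sum := 0
	for _, w := range c.widths {
		sum += w
	}
	return sum
}
```

Keeping headers and widths in one value makes it harder for the two slices to drift apart when a column is added.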
func (m Model) renderTable(headers []string, items int, buildRows func(start, end int) [][]string, colWidths []int, styleOverride StyleOverride) string { func (m Model) renderTable(headers []string, items int, buildRows func(start, end int) [][]string, colWidths []int, styleOverride StyleOverride) string {
if items == 0 { if items == 0 {
return "" return ""
@@ -28,7 +34,16 @@ func (m Model) renderTable(headers []string, items int, buildRows func(start, en
selectedVisual := m.cursor - m.tableOffset selectedVisual := m.cursor - m.tableOffset
rows := buildRows(m.tableOffset, end) rows := buildRows(m.tableOffset, end)
tableWidth := m.termWidth - chromePadH - 2 colTotal := 0
for _, w := range colWidths {
colTotal += w
}
borderOverhead := 2 + len(colWidths) - 1
tableWidth := colTotal + borderOverhead
maxWidth := m.termWidth - chromePadH - 2
if tableWidth > maxWidth {
tableWidth = maxWidth
}
if tableWidth < 40 { if tableWidth < 40 {
tableWidth = 40 tableWidth = 40
} }
@@ -41,7 +56,11 @@ func (m Model) renderTable(headers []string, items int, buildRows func(start, en
Rows(rows...). Rows(rows...).
StyleFunc(func(row, col int) lipgloss.Style { StyleFunc(func(row, col int) lipgloss.Style {
if row == table.HeaderRow { if row == table.HeaderRow {
return tableHeaderStyle h := tableHeaderStyle
if col < len(colWidths) && colWidths[col] > 0 {
h = h.Width(colWidths[col]).MaxWidth(colWidths[col])
}
return h
} }
isSelected := row == selectedVisual isSelected := row == selectedVisual
if styleOverride != nil { if styleOverride != nil {
@@ -51,7 +70,7 @@ func (m Model) renderTable(headers []string, items int, buildRows func(start, en
style = tableSelectedStyle.Foreground(s.GetForeground())
}
if col < len(colWidths) && colWidths[col] > 0 {
- style = style.Width(colWidths[col])
+ style = style.Width(colWidths[col]).MaxWidth(colWidths[col])
}
return style
}
@@ -64,7 +83,7 @@ func (m Model) renderTable(headers []string, items int, buildRows func(start, en
base = tableSelectedStyle
}
if col < len(colWidths) && colWidths[col] > 0 {
- base = base.Width(colWidths[col])
+ base = base.Width(colWidths[col]).MaxWidth(colWidths[col])
}
return base
})
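Review note: the width change in `renderTable` sizes the table from its columns and only then clamps to the terminal. A minimal standalone sketch of that arithmetic follows. The function name `fitTableWidth` and the `chromePad` parameter are illustrative stand-ins for the model fields in the diff, not names from the repository.

```go
package main

import "fmt"

// fitTableWidth mirrors the sizing steps in the hunk above:
// sum the column widths, add border overhead, clamp to the space
// the terminal allows, then apply the 40-column floor.
// Hypothetical helper: the real code reads m.termWidth and chromePadH.
func fitTableWidth(colWidths []int, termWidth, chromePad int) int {
	colTotal := 0
	for _, w := range colWidths {
		colTotal += w
	}
	// Two outer borders plus one separator between adjacent columns.
	borderOverhead := 2 + len(colWidths) - 1
	width := colTotal + borderOverhead

	if maxWidth := termWidth - chromePad - 2; width > maxWidth {
		width = maxWidth
	}
	if width < 40 {
		width = 40
	}
	return width
}

func main() {
	cols := []int{20, 10, 8}
	fmt.Println(fitTableWidth(cols, 120, 4)) // columns fit: 38 + 4 = 42
	fmt.Println(fitTableWidth(cols, 40, 4))  // clamped to 34, then raised to the floor of 40
}
```

Note that the floor is applied after the clamp, so on a very narrow terminal the result can exceed the available width. That matches the order of the checks in the diff.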
+61 -10
@@ -3,14 +3,15 @@ package tui
import (
"encoding/json"
"fmt"
- "gitea.lerkolabs.com/lerko/uptop/internal/models"
- "gitea.lerkolabs.com/lerko/uptop/internal/monitor"
- "gitea.lerkolabs.com/lerko/uptop/internal/store"
"math"
"sort"
"strings"
"time"
+ "gitea.lerkolabs.com/lerko/uptop/internal/models"
+ "gitea.lerkolabs.com/lerko/uptop/internal/monitor"
+ "gitea.lerkolabs.com/lerko/uptop/internal/store"
"github.com/charmbracelet/bubbles/viewport"
tea "github.com/charmbracelet/bubbletea"
"github.com/charmbracelet/harmonica"
@@ -67,6 +68,7 @@ const (
stateLogs
stateUsers
stateDetail
+ stateAlertDetail
stateFormSite
stateFormAlert
stateFormUser
@@ -92,6 +94,7 @@ type Model struct {
maintFormData *maintFormData
logViewport viewport.Model
+ logFilterImportant bool
isAdmin bool
zones *zone.Manager
@@ -382,6 +385,14 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
return m, tea.Quit
}
return m, nil
+ case stateAlertDetail:
+ switch msg.String() {
+ case "i", "esc":
+ m.state = stateDashboard
+ case "q":
+ return m, tea.Quit
+ }
+ return m, nil
case stateDashboard, stateLogs, stateUsers:
switch msg.String() {
case "q":
@@ -391,6 +402,11 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
m.filterMode = true
return m, nil
}
+ case "f":
+ if m.state == stateLogs {
+ m.logFilterImportant = !m.logFilterImportant
+ return m, nil
+ }
case "tab":
m.switchTab(m.currentTab + 1)
case "pgup", "pgdown":
@@ -462,6 +478,16 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
m.state = stateFormUser
return m, m.initUserHuhForm()
}
+ case "t":
+ if m.currentTab == 1 && len(m.alerts) > 0 {
+ a := m.alerts[m.cursor]
+ go func() {
+ if err := m.engine.TestAlert(a.ID); err != nil {
+ m.engine.AddLog(fmt.Sprintf("Test alert failed (%s): %v", a.Name, err))
+ }
+ }()
+ return m, nil
+ }
case " ":
if m.currentTab == 0 && len(m.sites) > 0 && m.sites[m.cursor].Type == "group" {
gid := m.sites[m.cursor].ID
@@ -480,6 +506,8 @@ func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
case "i":
if m.currentTab == 0 && len(m.sites) > 0 {
m.state = stateDetail
+ } else if m.currentTab == 1 && len(m.alerts) > 0 {
+ m.state = stateAlertDetail
}
case "x":
if m.currentTab == 4 && len(m.maintenanceWindows) > 0 {
@@ -801,6 +829,8 @@ func (m Model) View() string {
return ""
case stateDetail:
return m.viewDetailPanel()
+ case stateAlertDetail:
+ return m.viewAlertDetailPanel()
default:
return m.zones.Scan(m.viewDashboard())
}
@@ -810,13 +840,20 @@ func (m Model) viewDashboard() string {
allSites := m.engine.GetAllSites()
totalMonitors := 0
downCount := 0
+ lateCount := 0
for _, s := range allSites {
if s.Type == "group" {
continue
}
totalMonitors++
- if !s.Paused && !m.isMonitorInMaintenance(s.ID) && (s.Status == "DOWN" || s.Status == "SSL EXP") {
+ if s.Paused || m.isMonitorInMaintenance(s.ID) {
+ continue
+ }
+ switch s.Status {
+ case "DOWN", "SSL EXP":
downCount++
+ case "LATE":
+ lateCount++
}
}
offlineNodes := 0
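Review note: the counting loop now separates hard failures from late push monitors. The sketch below reproduces that tally outside the TUI, under simplifying assumptions: `site` is a hypothetical stand-in for `models.Site`, and the maintenance-window check is folded into `Paused` for brevity.

```go
package main

import "fmt"

// site is a hypothetical stand-in for models.Site, carrying only
// the fields the counting loop reads.
type site struct {
	Type   string
	Status string
	Paused bool
}

// countStatuses tallies monitors the way the hunk above does.
// Assumption: the real code also skips monitors that are inside a
// maintenance window; that check is represented here by Paused.
func countStatuses(sites []site) (total, down, late int) {
	for _, s := range sites {
		if s.Type == "group" {
			continue // groups are containers, not monitors
		}
		total++
		if s.Paused {
			continue // still counted in total, but never as down or late
		}
		switch s.Status {
		case "DOWN", "SSL EXP":
			down++
		case "LATE":
			late++
		}
	}
	return total, down, late
}

func main() {
	sites := []site{
		{Type: "group"},
		{Type: "http", Status: "UP"},
		{Type: "http", Status: "DOWN"},
		{Type: "http", Status: "SSL EXP"},
		{Type: "push", Status: "LATE"},
		{Type: "http", Status: "DOWN", Paused: true},
	}
	total, down, late := countStatuses(sites)
	up := total - down - late
	fmt.Printf("%d/%d UP, %d down, %d late\n", up, total, down, late)
	// prints "2/5 UP, 2 down, 1 late"
}
```

Because the UP figure is derived by subtraction, a paused monitor that is down still lands in the UP count, which is the same behavior as the code in the diff.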
@@ -829,6 +866,8 @@ func (m Model) viewDashboard() string {
var sitesLabel string
if downCount > 0 {
sitesLabel = fmt.Sprintf("Sites (%d↓)", downCount)
+ } else if lateCount > 0 {
+ sitesLabel = fmt.Sprintf("Sites (%d⚠)", lateCount)
} else if totalMonitors > 0 {
sitesLabel = fmt.Sprintf("Sites (%d)", totalMonitors)
} else {
@@ -894,14 +933,19 @@ func (m Model) viewDashboard() string {
}
}
- upCount := totalMonitors - downCount
+ upCount := totalMonitors - downCount - lateCount
var upStr string
if downCount > 0 {
upStr = dangerStyle.Render(fmt.Sprintf("%d/%d UP", upCount, totalMonitors))
+ } else if lateCount > 0 {
+ upStr = warnStyle.Render(fmt.Sprintf("%d/%d UP", upCount, totalMonitors))
} else {
upStr = specialStyle.Render(fmt.Sprintf("%d/%d UP", upCount, totalMonitors))
}
statusParts := []string{upStr}
+ if lateCount > 0 {
+ statusParts = append(statusParts, warnStyle.Render(fmt.Sprintf("%d LATE", lateCount)))
+ }
if len(m.nodes) > 0 {
online := 0
for _, n := range m.nodes {
@@ -922,6 +966,10 @@ func (m Model) viewDashboard() string {
switch m.currentTab {
case 0:
keys = "[/]Filter [n]New [e]Edit [i]Info [d]Del [p]Pause [T]Theme [Tab]Switch [q]Quit"
+ case 1:
+ keys = "[n]New [e]Edit [i]Info [d]Del [t]Test [T]Theme [Tab]Switch [q]Quit"
+ case 2:
+ keys = "[f]Filter [T]Theme [Tab]Switch [q]Quit"
case 4:
keys = "[n]New [x]End [d]Del [T]Theme [Tab]Switch [q]Quit"
case 5:
@@ -948,16 +996,19 @@ func siteOrder(s models.Site) int {
switch s.Status {
case "DOWN", "SSL EXP":
return 0
- case "PENDING":
- return 2
- default:
+ case "LATE":
return 1
+ case "PENDING":
+ return 3
+ default:
+ return 2
}
}
func limitStr(text string, max int) string {
- if len(text) > max {
- return text[:max-3] + "..."
+ runes := []rune(text)
+ if len(runes) > max {
+ return string(runes[:max-3]) + "..."
}
return text
}
+274
@@ -0,0 +1,274 @@
package main
import (
"database/sql"
"fmt"
"math/rand"
"os"
"time"
_ "github.com/mattn/go-sqlite3"
)
func main() {
if len(os.Args) < 2 {
fmt.Fprintln(os.Stderr, "usage: backfill <db-path>")
os.Exit(1)
}
db, err := sql.Open("sqlite3", os.Args[1])
if err != nil {
fmt.Fprintf(os.Stderr, "open: %v\n", err)
os.Exit(1)
}
defer db.Close()
ids, err := loadSiteIDs(db)
if err != nil {
fmt.Fprintf(os.Stderr, "load site IDs: %v\n", err)
os.Exit(1)
}
rng := rand.New(rand.NewSource(42))
now := time.Now().UTC()
if err := backfillHistory(db, rng, now, ids); err != nil {
fmt.Fprintf(os.Stderr, "history: %v\n", err)
os.Exit(1)
}
if err := backfillStateChanges(db, now, ids); err != nil {
fmt.Fprintf(os.Stderr, "state changes: %v\n", err)
os.Exit(1)
}
if err := backfillLogs(db, now); err != nil {
fmt.Fprintf(os.Stderr, "logs: %v\n", err)
os.Exit(1)
}
if err := backfillNodes(db, now); err != nil {
fmt.Fprintf(os.Stderr, "nodes: %v\n", err)
os.Exit(1)
}
if err := backfillMaintenance(db, now, ids); err != nil {
fmt.Fprintf(os.Stderr, "maintenance: %v\n", err)
os.Exit(1)
}
var count int
if err := db.QueryRow("SELECT COUNT(*) FROM check_history").Scan(&count); err != nil {
fmt.Fprintf(os.Stderr, "count: %v\n", err)
os.Exit(1)
}
fmt.Printf("Backfill complete: %d check records\n", count)
var token string
if err := db.QueryRow("SELECT token FROM sites WHERE name='Nightly Backup'").Scan(&token); err == nil {
fmt.Printf("PUSH_TOKEN=%s\n", token)
}
}
func loadSiteIDs(db *sql.DB) (map[string]int, error) {
rows, err := db.Query("SELECT id, name FROM sites")
if err != nil {
return nil, err
}
defer rows.Close()
ids := make(map[string]int)
for rows.Next() {
var id int
var name string
if err := rows.Scan(&id, &name); err != nil {
return nil, err
}
ids[name] = id
}
return ids, rows.Err()
}
type monitorProfile struct {
name string
minMs int
maxMs int
downFrom int // check index where DOWN starts (-1 = never)
}
func backfillHistory(db *sql.DB, rng *rand.Rand, now time.Time, ids map[string]int) error {
profiles := []monitorProfile{
{"Nextcloud", 40, 80, -1},
{"Jellyfin", 80, 200, -1},
{"Home Assistant", 15, 45, -1},
{"Gitea", 40, 90, -1},
{"Traefik Dashboard", 5, 25, -1},
{"Vaultwarden", 50, 130, -1},
{"Personal Blog", 25, 65, -1},
{"Immich", 100, 280, -1}, // spikes handled below
{"Auth Portal", 30, 70, 40}, // DOWN from check index 40 onward
{"Edge Router", 5, 15, -1}, // ping
{"Postgres", 1, 5, -1}, // port
{"DNS Primary", 10, 30, -1},
}
tx, err := db.Begin()
if err != nil {
return err
}
defer tx.Rollback()
stmt, err := tx.Prepare("INSERT INTO check_history (site_id, latency_ns, is_up, checked_at) VALUES (?, ?, ?, ?)")
if err != nil {
return err
}
defer stmt.Close()
const total = 60
for _, p := range profiles {
siteID, ok := ids[p.name]
if !ok {
continue
}
for i := 0; i < total; i++ {
minutesAgo := (total - i) * 24
checkedAt := now.Add(-time.Duration(minutesAgo) * time.Minute)
var latencyNs int64
isUp := true
if p.downFrom >= 0 && i >= p.downFrom {
latencyNs = 0
isUp = false
} else {
ms := p.minMs + rng.Intn(p.maxMs-p.minMs)
if p.name == "Immich" && i%17 == 0 {
ms = 250 + rng.Intn(100)
}
latencyNs = int64(ms) * 1_000_000
}
if _, err := stmt.Exec(siteID, latencyNs, isUp, checkedAt.Format("2006-01-02 15:04:05")); err != nil {
return err
}
}
}
return tx.Commit()
}
func backfillStateChanges(db *sql.DB, now time.Time, ids map[string]int) error {
type sc struct {
name string
from string
to string
reason string
at time.Time
}
changes := []sc{
{"Nextcloud", "UP", "DOWN", "read timeout", now.Add(-3 * 24 * time.Hour).Add(-5 * time.Minute)},
{"Nextcloud", "DOWN", "UP", "", now.Add(-3 * 24 * time.Hour)},
{"Jellyfin", "UP", "DOWN", "connection reset", now.Add(-18 * time.Hour).Add(-3 * time.Minute)},
{"Jellyfin", "DOWN", "UP", "", now.Add(-18 * time.Hour)},
{"Auth Portal", "UP", "DOWN", "connection refused", now.Add(-8 * time.Hour)},
{"Immich", "UP", "DOWN", "502 Bad Gateway", now.Add(-12 * time.Hour).Add(-8 * time.Minute)},
{"Immich", "DOWN", "UP", "", now.Add(-12 * time.Hour)},
}
tx, err := db.Begin()
if err != nil {
return err
}
defer tx.Rollback()
stmt, err := tx.Prepare("INSERT INTO state_changes (site_id, from_status, to_status, error_reason, changed_at) VALUES (?, ?, ?, ?, ?)")
if err != nil {
return err
}
defer stmt.Close()
for _, c := range changes {
siteID, ok := ids[c.name]
if !ok {
continue
}
if _, err := stmt.Exec(siteID, c.from, c.to, c.reason, c.at.Format("2006-01-02 15:04:05")); err != nil {
return err
}
}
return tx.Commit()
}
func backfillLogs(db *sql.DB, now time.Time) error {
type logEntry struct {
msg string
at time.Time
}
logs := []logEntry{
{"[06:12] Monitor 'Auth Portal' confirmed DOWN: connection refused", now.Add(-8 * time.Hour)},
{"[06:12] Monitor 'Auth Portal' failed check 2/2", now.Add(-8*time.Hour - 30*time.Second)},
{"[06:11] Monitor 'Auth Portal' failed check 1/2", now.Add(-8*time.Hour - 60*time.Second)},
{"[12:33] Monitor 'Immich' recovered (was down 8m)", now.Add(-12 * time.Hour)},
{"[12:25] Monitor 'Immich' confirmed DOWN: 502 Bad Gateway", now.Add(-12*time.Hour - 8*time.Minute)},
{"[12:25] Monitor 'Immich' failed check 3/3", now.Add(-12*time.Hour - 8*time.Minute - 30*time.Second)},
{"[12:25] Monitor 'Immich' failed check 2/3", now.Add(-12*time.Hour - 8*time.Minute - 60*time.Second)},
{"[12:24] Monitor 'Immich' failed check 1/3", now.Add(-12*time.Hour - 9*time.Minute)},
{"[06:14] Monitor 'Jellyfin' recovered (was down 3m)", now.Add(-18 * time.Hour)},
{"[06:11] Monitor 'Jellyfin' confirmed DOWN: connection reset", now.Add(-18*time.Hour - 3*time.Minute)},
{"[06:11] Monitor 'Jellyfin' failed check 2/2", now.Add(-18*time.Hour - 3*time.Minute - 30*time.Second)},
{"[06:10] Monitor 'Jellyfin' failed check 1/2", now.Add(-18*time.Hour - 4*time.Minute)},
{"[23:45] SSL certificate for 'Personal Blog' expires in 42 days", now.Add(-28 * time.Hour)},
{"[08:00] Loaded check history from database", now.Add(-32*time.Hour - 30*time.Minute)},
{"[08:00] Engine RESUMED (Active)", now.Add(-32*time.Hour - 30*time.Minute - 5*time.Second)},
}
tx, err := db.Begin()
if err != nil {
return err
}
defer tx.Rollback()
stmt, err := tx.Prepare("INSERT INTO logs (message, created_at) VALUES (?, ?)")
if err != nil {
return err
}
defer stmt.Close()
for _, l := range logs {
if _, err := stmt.Exec(l.msg, l.at.Format("2006-01-02 15:04:05")); err != nil {
return err
}
}
return tx.Commit()
}
func backfillNodes(db *sql.DB, now time.Time) error {
_, err := db.Exec(
"INSERT OR REPLACE INTO nodes (id, name, region, last_seen, version) VALUES (?, ?, ?, ?, ?)",
"node-1", "leader", "us-east", now.Format("2006-01-02 15:04:05"), "2026.05.1",
)
return err
}
func backfillMaintenance(db *sql.DB, now time.Time, ids map[string]int) error {
tx, err := db.Begin()
if err != nil {
return err
}
defer tx.Rollback()
stmt, err := tx.Prepare("INSERT INTO maintenance_windows (monitor_id, title, description, type, start_time, end_time, created_by) VALUES (?, ?, ?, ?, ?, ?, ?)")
if err != nil {
return err
}
defer stmt.Close()
jellyfinID := ids["Jellyfin"]
past := now.Add(-3 * 24 * time.Hour)
if _, err := stmt.Exec(jellyfinID, "Jellyfin upgrade", "Upgrade to v10.10 + plugin updates", "maintenance",
past.Format("2006-01-02 15:04:05"),
past.Add(2*time.Hour).Format("2006-01-02 15:04:05"),
"admin"); err != nil {
return err
}
future := now.Add(2 * 24 * time.Hour)
if _, err := stmt.Exec(0, "Network switch replacement", "Replacing core switch in rack 2", "maintenance",
future.Format("2006-01-02 15:04:05"),
future.Add(4*time.Hour).Format("2006-01-02 15:04:05"),
"admin"); err != nil {
return err
}
return tx.Commit()
}
+54
@@ -0,0 +1,54 @@
Set Shell "bash"
Set Width 1400
Set Height 800
Set FontSize 14
Set Padding 20
Set Framerate 15
Set TypingSpeed 50ms
Hide
Type "bash vhs/setup.sh /tmp/uptop-vhs.db"
Enter
Sleep 45s
Show
Sleep 5s
# Sites tab — hero shot with mixed monitor states
Screenshot vhs/screenshots/monitors.png
Sleep 1s
# Navigate to Nextcloud (row 6: group + 3 children + Auth Portal)
Down
Sleep 200ms
Down
Sleep 200ms
Down
Sleep 200ms
Down
Sleep 200ms
Down
Sleep 200ms
Type "i"
Sleep 3s
Screenshot vhs/screenshots/detail.png
Sleep 1s
# Close detail
Escape
Sleep 1s
# Tab to Alerts
Tab
Sleep 2s
Screenshot vhs/screenshots/alerts.png
Sleep 1s
# Tab to Logs
Tab
Sleep 2s
Screenshot vhs/screenshots/logs.png
Sleep 1s
# Quit
Type "q"
Sleep 1s
Binary file not shown. Size: 84 KiB
Binary file not shown. Size: 80 KiB
Binary file not shown. Size: 160 KiB
Binary file not shown. Size: 219 KiB

+141
@@ -0,0 +1,141 @@
alerts:
- name: Discord Homelab
type: discord
settings:
url: https://discord.com/api/webhooks/1234567890/demo-token
- name: Ntfy Alerts
type: webhook
settings:
url: https://ntfy.example.com/homelab-alerts
- name: Email Oncall
type: email
settings:
host: smtp.example.com
port: "587"
user: alerts@example.com
pass: "••••••••"
from: alerts@example.com
to: oncall@example.com
- name: Slack Ops
type: slack
settings:
url: https://hooks.slack.com/services/T00000/B00000/demo-token
monitors:
# HTTP — homelab services
- name: Nextcloud
type: http
url: https://example.com
interval: 30
alert: Discord Homelab
check_ssl: true
expiry_threshold: 14
max_retries: 2
- name: Jellyfin
type: http
url: https://example.com
interval: 30
alert: Discord Homelab
max_retries: 2
- name: Home Assistant
type: http
url: https://example.com
interval: 30
alert: Discord Homelab
max_retries: 3
- name: Gitea
type: http
url: https://example.com
interval: 60
alert: Discord Homelab
check_ssl: true
expiry_threshold: 14
max_retries: 2
- name: Traefik Dashboard
type: http
url: https://example.com
interval: 60
alert: Discord Homelab
max_retries: 1
- name: Vaultwarden
type: http
url: https://example.com
interval: 30
alert: Discord Homelab
check_ssl: true
expiry_threshold: 14
max_retries: 3
- name: Personal Blog
type: http
url: https://example.com
interval: 120
alert: Discord Homelab
check_ssl: true
expiry_threshold: 14
max_retries: 2
- name: Immich
type: http
url: https://example.com
interval: 60
alert: Discord Homelab
check_ssl: true
expiry_threshold: 7
max_retries: 3
# HTTP — deliberate failure
- name: Auth Portal
type: http
url: http://localhost:1
interval: 30
alert: Discord Homelab
max_retries: 2
# Push — cron jobs
- name: Nightly Backup
type: push
interval: 300
alert: Discord Homelab
- name: Cert Renewal
type: push
interval: 300
alert: Discord Homelab
# Infrastructure group
- name: Infrastructure
type: group
alert: Discord Homelab
monitors:
- name: Edge Router
type: ping
hostname: 8.8.8.8
interval: 30
alert: Discord Homelab
timeout: 5
- name: Postgres
type: port
hostname: localhost
port: 18099
interval: 60
alert: Discord Homelab
timeout: 5
- name: DNS Primary
type: dns
hostname: google.com
dns_server: 8.8.8.8
dns_resolve_type: A
interval: 60
alert: Discord Homelab
timeout: 5
Executable
+27
@@ -0,0 +1,27 @@
#!/bin/bash
# VHS screenshot setup: seed monitors, backfill history, start server.
set -e
DB="${1:?usage: setup.sh <db-path>}"
rm -f "$DB" "$DB-shm" "$DB-wal"
echo "==> Seeding monitors and alerts..."
UPTOP_DB_DSN="$DB" ./uptop apply -f vhs/seed.yaml 2>&1
echo "==> Backfilling check history..."
BACKFILL_OUT=$(go run ./vhs/backfill/ "$DB")
echo "$BACKFILL_OUT"
PUSH_TOKEN=$(echo "$BACKFILL_OUT" | grep '^PUSH_TOKEN=' | cut -d= -f2-)
if [ -n "$PUSH_TOKEN" ]; then
echo "==> Sending push heartbeat in 15s (background)..."
(sleep 15 && curl -s "http://localhost:18099/api/push" -H "Authorization: Bearer $PUSH_TOKEN" > /dev/null 2>&1) &
fi
echo "==> Starting uptop server..."
exec env \
UPTOP_DB_DSN="$DB" \
UPTOP_PORT=23299 \
UPTOP_HTTP_PORT=18099 \
UPTOP_ALLOW_PRIVATE_TARGETS=true \
./uptop serve 2>/dev/null