fix(monitor): merge check results into live state, never overwrite #98
Problem
`checkByID` snapshotted a `Site` under RLock, ran a network check for seconds, then `handleStatusChange` wrote the entire stale struct back into `liveState`. Any concurrent mutation during the check (a user pause, a config edit, or a push heartbeat) was silently reverted.

Worst case: a heartbeat set UP and an in-flight `checkPush` overwrote it with a stale DOWN → false alert.

Fix
- `applyState(id, mutate)`: single read-modify-write helper. The mutator runs against the current live entry under the write lock, so config + `Paused` are preserved automatically and status transitions read the true current status.
- `handleStatusChange`, `RecordHeartbeat`, `ToggleSitePause`, `checkGroup` all go through it; no more whole-struct overwrite.
- A check result whose `LastCheck` predates the live `LastCheck` is dropped (a heartbeat/newer check superseded it). HTTP/probe checks stamp `LastCheck=now`, so they are unaffected, and they are serial per site anyway.
- `RecordHeartbeat` read `StatusChangedAt` after overwriting it (always "was down 0s"); `downSince` is now captured before mutation.

Tests
4 new regression tests: pause / config-edit / heartbeat during in-flight check, and removed-site-dropped. Each fails against the old code. Full suite green under `-race`; build + vet clean.

Phase 1 of the fresh-eyes first-cut backlog.