fix(monitor): serialize DB writes through a single drained writer
Every check spawned `go e.db.Save*(...)` with the error discarded: a fire-and-forget goroutine per log line, check, state change, and alert health update. SaveLog ran a full-table prune DELETE on every insert and SaveCheck a COUNT + conditional prune on every check, so the hot path amplified each write into several statements. Nothing tracked these goroutines, so at shutdown they raced the store's Close(): writes to a closing DB, silently swallowed.

Introduce a single writer goroutine that drains a buffered channel of typed dbWrite values (log/check/state-change/alert-health). Writes are enqueued non-blocking; a saturated queue drops and notes it in the in-memory log rather than blocking the check loop. Write errors are now logged instead of discarded.

Retention moves off the hot path: SaveLog and SaveCheck become plain INSERTs, and PruneLogs/PruneCheckHistory/PruneStateChanges run on a 10-minute timer inside the writer (single keep-newest-N-per-site pass via a window function). state_changes was previously never pruned and is now bounded.

Add Engine.Stop(): cancels the engine's context, then waits for the writer to drain every buffered write before returning. main wires it in before the deferred store Close() so no write races a closed DB.

SQLite gains busy_timeout=5000 and synchronous=NORMAL, applied via the DSN so every pooled connection inherits them (a post-open PRAGMA only touches one connection); WAL moves to the DSN too. :memory: test DBs are left as-is.

Tests: writer drains on Stop, Stop is idempotent, and the prune queries keep newest-N per site / N logs on real SQLite. Full suite green under -race.
This commit was merged in pull request #99.
@@ -2,6 +2,7 @@ package store
 
 import (
 	"database/sql"
+	"fmt"
 	"log"
 
 	_ "github.com/mattn/go-sqlite3"
@@ -10,13 +11,20 @@ import (
 type SQLiteDialect struct{}
 
 func NewSQLiteStore(path string) (*SQLStore, error) {
-	s, err := NewSQLStore("sqlite3", path, &SQLiteDialect{})
+	// Apply pragmas via the DSN so every pooled connection gets them — a
+	// post-open PRAGMA Exec only affects a single connection. WAL allows
+	// concurrent readers alongside the single writer goroutine; busy_timeout
+	// rides out brief lock contention; synchronous=NORMAL is durable under WAL
+	// and far faster than the FULL default. (:memory: is left untouched —
+	// these pragmas are no-ops or harmful for the in-memory test DB.)
+	dsn := path
+	if path != ":memory:" {
+		dsn = fmt.Sprintf("file:%s?_journal_mode=WAL&_busy_timeout=5000&_synchronous=NORMAL", path)
+	}
+	s, err := NewSQLStore("sqlite3", dsn, &SQLiteDialect{})
 	if err != nil {
 		return nil, err
 	}
-	if _, err := s.db.Exec("PRAGMA journal_mode=WAL"); err != nil {
-		log.Printf("WAL mode failed: %v", err)
-	}
 	return s, nil
 }
 
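To make the DSN change above easier to check in isolation, here is a small sketch that pulls the string construction into a standalone helper. The name `sqliteDSN` is hypothetical (the commit inlines this logic in `NewSQLiteStore`), and the example path is made up; the format string and the `:memory:` special case are taken from the diff.

```go
package main

import "fmt"

// sqliteDSN mirrors the DSN logic from the diff: file-backed databases get the
// pragmas as go-sqlite3 DSN parameters, while ":memory:" is passed through.
// The helper name is an assumption for illustration only.
func sqliteDSN(path string) string {
	if path == ":memory:" {
		return path
	}
	return fmt.Sprintf("file:%s?_journal_mode=WAL&_busy_timeout=5000&_synchronous=NORMAL", path)
}

func main() {
	fmt.Println(sqliteDSN(":memory:"))
	fmt.Println(sqliteDSN("monitor.db")) // hypothetical database file name
}
```

Like the code in the diff, this does not escape the path, so a file name containing `?` or `&` would need extra handling.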