feat(resilience): operational hardening (NEXT phase of the audit)

Acts on the audit's NEXT block — operational resilience.

Backups (N1):
- New `backup` compose service (postgres:16-alpine) runs scripts/backup-loop.sh:
  immediate pg_dump on start, then nightly, gzip, 14-day rotation into
  ./backups on the host. Configurable via BACKUP_RETENTION_DAYS /
  BACKUP_INTERVAL_SECONDS. (Offsite copy is the documented next step.)

Resource limits + healthchecks (N2):
- deploy.resources.limits.memory on postgres (2g), app (1500m), nginx (256m),
  backup (256m) so no container can starve the others (the Nginx outage was a
  reminder).
- Nginx now has a healthcheck hitting a new self-served `/nginx-health`
  endpoint on the default_server (no upstream dependency).

Chat resilience (N3):
- buildSystemPrompt() wraps its 4 Prisma queries in try/catch with safe
  defaults — if Postgres is down the assistant degrades instead of 500-ing.
- The result is cached for 60s (healthy builds only, so a degraded fallback is
  never cached) to avoid running 4 queries per message; CMS edits take effect
  once the cached entry expires, at most 60s later.
- POST fails fast with 503 if OPENAI_API_KEY is missing (instead of breaking
  mid-stream after headers are sent).
- streamText gets an onError handler that logs + persists an `error` AiEvent.
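
A minimal sketch of the "safe defaults + cache only healthy builds" idea, not
the actual buildSystemPrompt() code. `PromptData`, `SAFE_DEFAULTS`,
`createCachedBuilder` and the injected `load`/`now` parameters are all
hypothetical names used for illustration; `load` stands in for the 4 Prisma
queries.

```typescript
// Hypothetical shape of whatever the prompt is assembled from.
type PromptData = { services: string[]; faqs: string[] };

// Returned when the database is unreachable, so the chat degrades
// instead of failing the request.
const SAFE_DEFAULTS: PromptData = { services: [], faqs: [] };

function createCachedBuilder(
  load: () => Promise<PromptData>, // assumption: wraps the DB queries
  ttlMs = 60_000,
  now: () => number = Date.now, // injectable clock, handy for tests
) {
  let cached: { value: PromptData; expiresAt: number } | null = null;

  return async function build(): Promise<PromptData> {
    // Serve from cache while the entry is still fresh.
    if (cached && cached.expiresAt > now()) return cached.value;
    try {
      const value = await load();
      // Only a successful (healthy) build is cached.
      cached = { value, expiresAt: now() + ttlMs };
      return value;
    } catch (err) {
      // Degraded path: log and fall back, but do NOT cache the defaults,
      // so the next message retries the database.
      console.error("prompt build failed, using safe defaults", err);
      return SAFE_DEFAULTS;
    }
  };
}
```

Not caching the fallback means recovery is picked up on the very next message
once Postgres is back, at the cost of one failed attempt per message while it
is down.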

Idempotent submissions (N4):
- consultation/route.ts and operations.ts now wrap the email-tracking UPDATE
  in try/catch — the lead/signal is already saved, so a telemetry hiccup can't
  500 the request and trigger a duplicate retry. operations.ts also returns
  emailError.
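
The pattern above, sketched under assumptions: `bestEffort`,
`handleSubmission`, `Deps` and `trackingError` are hypothetical names, not
the code in consultation/route.ts or operations.ts. The point is only that
the primary write is awaited normally while the telemetry write is
non-fatal.

```typescript
// Runs a non-critical task; returns null on success or the error message
// on failure. It never throws, so the caller cannot 500 because of it.
async function bestEffort(
  label: string,
  task: () => Promise<void>,
): Promise<string | null> {
  try {
    await task();
    return null;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    console.error(`${label} failed (non-fatal):`, message);
    return message;
  }
}

// Hypothetical dependencies standing in for the real persistence calls.
type Deps = {
  saveLead: () => Promise<{ id: string }>;
  markEmailTracked: (leadId: string) => Promise<void>;
};

async function handleSubmission(deps: Deps) {
  // Primary write: if this throws, the request should fail.
  const lead = await deps.saveLead();

  // Telemetry write: the lead is already saved, so a failure here is
  // reported in the response instead of failing the request.
  const trackingError = await bestEffort("email-tracking update", () =>
    deps.markEmailTracked(lead.id),
  );

  return { status: 200, leadId: lead.id, trackingError };
}
```

Because the response is still a success, a client that retries on 5xx has no
reason to resubmit and create a duplicate lead.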

Performance (N5):
- Index GlobalNode(application, isActive) — backs the case-study join on every
  application page. Migration 20260609130000_index_globalnode_application.

Verified: next build compiles (Docker parity, SESSION_SECRET unset),
TypeScript clean, prisma schema valid, golden tests 17/17,
`docker compose config` valid.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 23:07:38 -05:00
parent 18d5ed87c8
commit a81ee50ed8
10 changed files with 208 additions and 31 deletions
@@ -64,6 +64,9 @@ model GlobalNode {
@@index([isActive])
@@index([nodeType])
@@index([nodeType, isActive])
// Case studies on an application page filter by application slug + isActive
// (src/app/[locale]/applications/[slug]/page.tsx). Back this join with an index.
@@index([application, isActive])
}
// ------------------------------------------------------