flux-srl/docs/AUDIT_2026-06_VERIFIED.md
davidherran 18d5ed87c8
fix(security+db): close the real audit findings (SEC-04/05/01, DB-01)
Acts on the verified findings from the 2026-06 audit (docs/AUDIT_2026-06_
VERIFIED.md). The audit's #1 "middleware never runs" was a false positive
(verified in prod: /hq-command redirects to login). These are the genuine
gaps:

- SEC-04 (HIGH): /api/assets (GET/POST/PUT/DELETE/PATCH) and
  /api/branding/favicon (POST) had NO auth. The middleware matcher excludes
  /api, so they were world-reachable — anyone could list/upload/rename/
  delete CMS files or regenerate the favicon. Added a new getAdminSession()
  helper (src/lib/session.ts) and a requireAdmin() guard on every handler.

- DB-01 (HIGH): the ClientUser table (B2B client portal) was defined in the
  schema but NEVER created by any migration, and OperationsSignal.clientId +
  its FK were missing too. B2B register/login failed at runtime; the
  dashboard silently showed 0 clients. New additive migration
  20260609120000_add_client_user creates the table, the unique email index,
  the clientId column (IF NOT EXISTS), and the FK (duplicate-object guarded).

- SEC-05 (MED-HIGH): operations.ts generateRichEmailHtml() interpolated
  item.title/sku/quantity, clientName/Company/Email/Phone and the free-text
  message straight into HTML — stored XSS into the team's internal inbox.
  Now escaped via escapeHtml/escapeAttr/safeMailto; file links validated to
  internal paths only.

- SEC-01 (MED): removed the hardcoded SESSION_SECRET fallback in src/proxy.ts;
  it now validates lazily and throws if the secret is missing (mirrors
  session.ts), so a runtime env failure can't fall back to a public key.

Verified: next build compiles with SESSION_SECRET unset (Docker parity),
TypeScript clean, prisma schema valid, golden tests 17/17.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 22:40:20 -05:00


FLUX SRL — Audit Verification & Corrected Priorities

Date: 2026-06-09
Method: 10-dimension multi-agent audit (59 agents, adversarial verification) + manual re-verification of every critical/high finding against the running production system.

Why this file exists: the automated audit (full report below) is thorough but produced one false-positive CRITICAL and overstated a few others. This top section is the corrected, ground-truth verdict. Trust this section over the raw report where they disagree.


Manual verification results (the corrections)

SEC-02 "Auth middleware never runs" — FALSE POSITIVE

The audit's #1 critical claimed src/proxy.ts is never executed because it exports proxy not middleware. Verified in production: curl https://www.rf-flux.com/hq-command/dashboard (no cookie) → 307 redirect to /hq-command/login. The middleware is running. Next.js 16 does recognize src/proxy.ts exporting proxy (the rename is real in Next 16; the code comment was correct). The empty middleware-manifest.json in a local build is misleading — runtime behavior disproves the claim. Consequence: the HQ admin surface is protected by the middleware. SEC-03 (below) drops from critical to low.

⚠️ SEC-03 "HQ server actions lack auth" — OVERSTATED (low, not critical)

Because the middleware does run and its matcher covers /hq-command/*, POSTs to those routes (which is how server actions are invoked) are gated — no cookie → redirect before the action executes. Adding an in-action getSession() check is still good defense-in-depth, but it is not an open door.

SEC-04 "/api/assets + /api/branding/favicon unauthenticated" — REAL (HIGH) — the actual top security issue

Verified: neither route checks a session, and the middleware matcher excludes /api, so the middleware does not protect them. Anyone can GET/POST/PUT/DELETE /api/assets — list CMS structure, upload files into public scopes, rename/delete content — and POST /api/branding/favicon. This is the real #1.
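A minimal sketch of the kind of fail-closed guard this finding calls for. The `AdminSession` shape, the `SessionLookup` type, and the plain-object results are illustrative stand-ins; they are not the actual signature of `getAdminSession()` in src/lib/session.ts, nor the Next.js response types.

```typescript
// Hypothetical shapes for illustration only.
type AdminSession = { adminId: string };
type SessionLookup = () => Promise<AdminSession | null>;

type GuardResult =
  | { ok: true; session: AdminSession }
  | { ok: false; status: 401; body: { error: string } };

// Resolve the session once at the top of a handler and fail closed.
async function requireAdmin(lookup: SessionLookup): Promise<GuardResult> {
  try {
    const session = await lookup();
    if (session) return { ok: true, session };
  } catch {
    // A lookup failure (bad token, missing secret) is treated as "no session".
  }
  return { ok: false, status: 401, body: { error: "Unauthorized" } };
}

// Example handler: nothing below the guard runs for anonymous callers.
async function deleteAsset(lookup: SessionLookup, path: string) {
  const guard = await requireAdmin(lookup);
  if (!guard.ok) return { status: guard.status, body: guard.body };
  return { status: 200, body: { deleted: path, by: guard.session.adminId } };
}
```

Treating a thrown lookup the same as a missing session keeps the route closed if token verification itself breaks.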

DB-01 "ClientUser table never migrated" — REAL (HIGH), impact nuanced

Verified: the init migration creates 10 tables; ClientUser is not among them, and no later migration adds it. Code calls prisma.clientUser in clientAuth.ts (B2B register/login) and the dashboard. Real impact: the B2B client portal (register/login) is broken at runtime. The HQ dashboard does not crash (its counts run inside a try/catch and silently show 0 clients), so the audit's "undeployable crash" framing was overstated — but the B2B portal genuinely doesn't work.

SEC-05 "Operations email not HTML-escaped" — REAL (MEDIUM-HIGH)

Verified: src/app/actions/operations.ts → generateRichEmailHtml() interpolates item.title, payload.clientName, clientCompany, etc. straight into HTML with no escapeHtml(). Stored-XSS into the team's internal operations inbox from the CartDrawer form. Real.
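To make the fix concrete, here is a small sketch of entity escaping applied before interpolation. It is written from scratch for illustration; the project's own helper in src/lib/escapeHtml.ts may differ, and `renderItemRow` is a hypothetical stand-in for one fragment of generateRichEmailHtml().

```typescript
// The five characters that can break out of HTML text or quoted attributes.
const HTML_ENTITIES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Coerce to string first so numbers and null/undefined are handled uniformly.
function escapeHtml(value: unknown): string {
  return String(value ?? "").replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch] ?? ch);
}

// Hypothetical row renderer: every user-controlled field goes through escapeHtml().
function renderItemRow(item: { title: string; sku: string; quantity: number }): string {
  return (
    `<tr><td>${escapeHtml(item.title)}</td>` +
    `<td>${escapeHtml(item.sku)}</td>` +
    `<td>${escapeHtml(item.quantity)}</td></tr>`
  );
}
```

With this in place a title such as `<img src=x onerror="alert(1)">` arrives in the inbox as inert text rather than markup.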

SEC-01 "Hardcoded secret fallback in proxy" — REAL but MITIGATED (MEDIUM)

src/proxy.ts:13 still has || "FLUX_SUPER_SECRET_KEY_2026_ARCHITECTURE". Mitigated because signing (session.ts) throws without the env var, so tokens can't be forged in normal operation. Should still be removed (the verifier asymmetry is real).
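One way to remove the fallback without breaking builds that run with the variable unset is to validate on first use rather than at import time. The sketch below assumes that approach; `createSecretGetter` and the `Env` type are hypothetical names, and the environment is passed in explicitly so the example stays self-contained.

```typescript
type Env = Record<string, string | undefined>;

// Returns a getter that validates lazily: creating it never throws,
// calling it throws if SESSION_SECRET is missing or empty.
function createSecretGetter(env: Env): () => string {
  let cached: string | null = null;
  return () => {
    if (cached !== null) return cached;
    const secret = env.SESSION_SECRET;
    if (!secret) {
      // No hardcoded fallback: a missing secret is a hard failure.
      throw new Error("SESSION_SECRET is not set");
    }
    cached = secret;
    return cached;
  };
}
```

In a real module the getter would typically be created once from `process.env` and called inside the verification path, so signer and verifier fail the same way.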

INFRA-03 "Secrets in git" — REAL (known)

OpenAI key + Gmail app password are in git history. Rotate both; untrack the env file and commit a placeholder-only .env.example in its place.


Corrected priority list — what actually matters

🔴 NOW (real, small, do first)

  1. Auth on /api/assets + /api/branding/favicon (SEC-04) — add getAdminSession() guard at the top of each handler. Effort: S
  2. Create the ClientUser migration (DB-01) — additive migration; unblocks the B2B portal. Effort: S
  3. Escape HTML in operations emails (SEC-05) — reuse escapeHtml like consultation/route already does. Effort: S
  4. Remove the proxy secret fallback (SEC-01) — throw if SESSION_SECRET missing, mirroring session.ts. Effort: S
  5. Whitelist chat context fields (AI-01) — sanitize context.section/activeTab before injecting into the system prompt (prompt-injection hygiene). Effort: S
  6. Rotate OpenAI key + SMTP password; untrack env (INFRA-03). Effort: S — needs the client for the key

🟡 NEXT (operational resilience — all real)

  • Automated nightly DB backups (none today) + offsite. M
  • Container memory/CPU limits + Nginx healthcheck in compose. S
  • buildSystemPrompt() try/catch + static fallback so chat degrades instead of 500 if DB is down (TEST-02). S
  • Validate OPENAI_API_KEY at startup + wrap streamText in try/catch (TEST-03). S
  • Idempotent consultation submission (no orphan records on SMTP fail) (TEST-04). M
  • Monitoring/uptime (UptimeRobot on /api/health) + log aggregation. M
  • Index GlobalNode.application; cache the system prompt (PERF-01/03). S

🟢 LATER (polish/scale)

  • Defense-in-depth getSession() inside HQ actions (SEC-03).
  • a11y pass (focus trap, labels, skip link, reduced-motion).
  • i18n: number formatting + translate error boundaries.
  • SEO: image/video sitemaps, FAQ/VideoObject schema.
  • Reduce any, split oversized files, adopt log.* everywhere.
  • Jsonb migration for hot JSON fields.
  • Lazy-load Three.js.

What's genuinely strong (verified)

CSRF on public endpoints · magic-byte upload validation · JWT signing that throws on missing secret · the middleware DOES protect HQ · complete 5-locale parity · the AI translation glossary · hreflang/canonical/breadcrumb SEO · Nginx edge caching + the new canonical-host guard + scanner blocking · structured logger + safe-JSON helpers + escapeHtml exist (gap is adoption consistency) · clean idempotent migrations · bcrypt password hashing · well-designed 9-tool SPIN FluxAI.


Dimension scorecard (audit, with my adjustments)

| Dimension | Audit | Adjusted | Note |
|---|---|---|---|
| Security | 3.5 | 6.5 | SEC-02 false positive removed; middleware protects HQ. Real gaps: SEC-04/05/01. |
| Performance | 6.0 | 6.0 | Solid baseline; prompt rebuilt per message. |
| Code quality | 5.5 | 5.5 | any usage, logging inconsistency. |
| Database | 4.5 | 5.5 | ClientUser migration missing (B2B), but not a dashboard crash. |
| FluxAI | 6.0 | 6.0 | Strong tools; add prompt-injection guard + history. |
| SEO | 6.5 | 6.5 | Good base; add media sitemaps + FAQ/Video schema. |
| i18n | 6.0 | 6.5 | Excellent infra; few hardcoded strings. |
| Infra | 4.0 | 4.5 | No backups/monitoring; secrets in git. The real weak spot. |
| Accessibility/UX | 4.5 | 5.0 | a11y gaps; dark mode now covered. |
| Testing/reliability | 3.0 | 3.5 | One suite; add API integration tests + degradation. |

Adjusted overall: ~6.0/10 (audit said 4.8 — the false-positive critical dragged it down). The build is solid; the genuine priorities are a handful of small auth/data fixes plus operational resilience (backups + monitoring).



Appendix — Full automated audit report (raw, uncorrected)

The section below is the unedited 10-dimension report. Where it conflicts with the verification above (notably SEC-02), the verification above is authoritative.

FLUX SRL — Consolidated Audit Report

Project: rf-flux.com (Flux SRL) — Next.js 16 · Prisma 7 · PostgreSQL · AI SDK 6 · 5 locales · Docker + Nginx on OVH
Audit scope: 10 dimensions, adversarially verified (each critical/high finding independently re-checked against source)
Date: 2026-06-09
Auditor: Lead consolidation pass


1. Executive Summary

Flux SRL is a capably-built marketing + B2B platform with genuinely good bones — multi-stage Docker, CSRF on public endpoints, magic-byte upload validation, a structured logger, a working AI translation glossary, complete 5-locale message parity, and a recently-hardened Nginx (canonical-host guard, scanner blocking). But the audit surfaced a cluster of authorization failures that, taken together, mean the entire HQ admin surface is currently unprotected, plus operational gaps (no backups, no monitoring, secrets in git) that make this risky to run in production as-is.

The single most important finding: the auth middleware does not execute at all because the file exports a function named proxy instead of middleware. Every downstream auth assumption collapses from that one fact — and because the HQ server actions and the /api/assets + /api/branding routes have no internal session checks, they are reachable by anyone.

Overall health: 4.8 / 10 (weighted toward the security + infra blockers)

Dimension scorecard

| Dimension | Score | One-line verdict |
|---|---|---|
| Security | 3.5 | Middleware never runs; HQ actions + asset APIs fully unauthenticated. |
| Performance | 6.0 | Solid caching baseline; chat rebuilds prompt + 4 DB queries every message. |
| Code quality | 5.5 | 187 any usages, inconsistent logging, empty catch blocks; good helpers exist but under-used. |
| Database | 4.5 | ClientUser table referenced in code but never migrated — runtime crash risk. |
| FluxAI | 6.0 | Strong tool architecture; prompt injection via context, no history, plaintext message storage. |
| SEO | 6.5 | Good metadata + JSON-LD; no image/video sitemaps for a visual industrial site. |
| i18n | 6.0 | Excellent infra and parity; a few components ship hardcoded English. |
| Infra | 4.0 | No resource limits, no backups, no monitoring, secrets in tracked env file. |
| Accessibility/UX | 4.5 | No skip link, no Escape-to-close, no reduced-motion, broken Tailwind class. |
| Testing/reliability | 3.0 | One unit-test file; zero integration tests; multiple silent-fail paths. |

Top 5 things to fix first

  1. Make the middleware actually run (SEC-02). Rename/export proxy as middleware in src/middleware.ts. Until this lands, nothing else in the auth layer matters — the page-level guards are dead code.
  2. Add session checks inside HQ server actions and the asset/branding APIs (SEC-03, SEC-04). Defense-in-depth: even with middleware fixed, these must self-verify. They are directly POST-able today.
  3. Rotate the leaked credentials and stop tracking env (INFRA-03). The OpenAI key and Gmail app password are in git history. Rotate now; they're compromised regardless of what you do next.
  4. Create the missing ClientUser migration (DB-01). Code calls prisma.clientUser in the dashboard and B2B auth; the table was never created. This is an active crash, not a hypothetical.
  5. Stand up backups + container resource limits (INFRA-05, INFRA-01). No automated DB backup and no memory caps on an OVH single-VPS is a "lose everything on one bad day" setup.

2. Critical & High Findings (confirmed / partially-confirmed only)

Severities below use the adjusted verdict from verification. Refuted findings are excluded (see §6).

Security

SEC-02 — Auth middleware never executes (CRITICAL) src/proxy.ts:17 — The file exports async function proxy(...) plus config, but Next.js 16 only invokes a file named middleware.ts exporting middleware. The .next/server/middleware-manifest.json is empty, confirming it was never compiled. A code comment even documents the rename intentionally. Impact: every /hq-command/* route has zero middleware protection; direct navigation and prefetch hit the dashboard with no auth gate. Fix: create src/middleware.ts with export { proxy as middleware } and re-export config; verify the matcher covers /hq-command. Then redeploy and confirm the manifest is populated.

SEC-03 — HQ server actions lack session verification (CRITICAL) src/app/hq-command/dashboard/inbox/actions.ts:11-188: getSignals(), getClients(), approveAccessRequest(), updateSignalStatus(), resolveAndCleanSignal(), deleteSignal(), resendSignalEmail(), deleteClient() run raw Prisma mutations with no getSession() call. Marked "use server" but self-unauthenticated. Impact: these are invokable directly via POST (curl/fetch) — approve/delete B2B clients, mutate signals, trigger emails, all without a token. Fix: add a requireAdminSession() helper and call it at the top of every action: const s = await getAdminSession(); if (!s) throw new Error('Unauthorized');. Don't rely on the middleware alone.

SEC-04 — /api/assets and /api/branding/favicon are unauthenticated (CRITICAL, raised from high) src/app/api/assets/route.ts, src/app/api/branding/favicon/route.ts — All methods (GET/POST/PUT/DELETE/PATCH) on /api/assets and POST on the favicon route have no session check. Path sanitization prevents traversal but not auth. Impact: anyone can enumerate CMS structure, upload SVG-with-JS into public scopes, delete branding/content, or regenerate favicons → defacement/XSS/downtime. Fix: at the top of each handler, const s = await getAdminSession(); if (!s) return NextResponse.json({error:'Unauthorized'},{status:401});.

SEC-01 — Hardcoded session-secret fallback in proxy (HIGH, lowered from critical) src/proxy.ts:13: process.env.SESSION_SECRET || "FLUX_SUPER_SECRET_KEY_2026_ARCHITECTURE", used for JWT verification at line 35. Verifier confirmed it's mitigated in practice: session.ts/clientAuth.ts throw if the secret is unset at signing time, so tokens can't be forged under normal operation. The danger is the asymmetry — if env loading ever fails at runtime, the verifier would accept tokens signed with the public string. Fix: remove the fallback; throw loudly if SESSION_SECRET is missing, mirroring session.ts.

SEC-05 — Missing HTML escaping in operations email templates (HIGH) src/app/actions/operations.ts:119-156: generateRichEmailHtml() interpolates item.title, item.sku, payload.clientName, clientCompany, clientEmail, clientPhone, message straight into HTML. escapeHtml() exists in src/lib/escapeHtml.ts but isn't imported here. Impact: HTML/script injection into internal operations emails from the CartDrawer form. Fix: escape every user field before interpolation, matching the pattern already correct in src/app/api/consultation/route.ts.

Database

DB-01 — ClientUser table referenced but never migrated (CRITICAL) prisma/schema.prisma:370-387 defines ClientUser and the OperationsSignal.clientId relation (lines 226-227), but no migration creates the table: neither the init migration nor any of the 5 later ones. Code actively calls it: src/app/hq-command/dashboard/page.tsx:67-68 (prisma.clientUser.count()), src/app/actions/clientAuth.ts:25,30,50+ (findUnique/create/update). Impact: immediate runtime failure on any B2B auth or dashboard-count path; undeployable. Fix: generate a migration creating ClientUser (id, email unique, passwordHash, fullName, companyName, phone?, isApproved default false, lastLoginAt?, createdAt, updatedAt) and the FK from OperationsSignal.clientId. Run prisma migrate dev and verify against a fresh DB.

DB-04 — JSON stored as String instead of Jsonb (HIGH) prisma/schema.prisma — 24+ JSON fields across 12 models (translationsJson, galleryJson, sectionsJson, payloadJson, etc.) are String. Impact: no DB-level JSON queries/indexing/validation; all parse/serialize is manual (and is the root cause of several CQ-03/CQ-04 parse-error findings). Fix: migrate query-hot fields first (payloadJson on AiEvent, galleryJson/sectionsJson on Application) to Jsonb via add-column → backfill → drop → rename.

DB-05 — AI telemetry grows unbounded; no retention policy (HIGH) prisma/schema.prisma:322-365: AiConversation/AiEvent persist every chat indefinitely. No TTL, cleanup job, or purge anywhere in src/. Stores hashed IP + userAgent + full message text → GDPR Art. 5(1)(e) storage-limitation exposure. Fix: add a retentionDays setting (default 90), a daily cleanup deleting rows older than the window, and document it. Consider archival before delete.
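The core of the proposed cleanup is a cutoff computation. The sketch below shows only that part, under the assumption that rows carry a `createdAt` timestamp; `retentionCutoff` and `selectExpired` are hypothetical names, and the in-memory filter stands in for the database delete a real job would issue.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Rows created strictly before this instant fall outside the retention window.
function retentionCutoff(now: Date, retentionDays = 90): Date {
  if (!Number.isInteger(retentionDays) || retentionDays <= 0) {
    throw new RangeError("retentionDays must be a positive integer");
  }
  return new Date(now.getTime() - retentionDays * DAY_MS);
}

// In-memory stand-in for the delete query, useful for testing the policy itself.
function selectExpired<T extends { createdAt: Date }>(
  rows: T[],
  now: Date,
  retentionDays = 90,
): T[] {
  const cutoff = retentionCutoff(now, retentionDays).getTime();
  return rows.filter((row) => row.createdAt.getTime() < cutoff);
}
```

In the daily job, the same cutoff would feed a bulk delete on AiEvent and AiConversation (for example a Prisma `deleteMany` with a `createdAt: { lt: cutoff }` filter), optionally after archiving the matched rows.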

DB-03 — GlobalNode.application has no referential integrity (MEDIUM, lowered from high) prisma/schema.prisma:40 — plain String, validated only as non-empty in network/actions.ts:34. Verifier downgraded: orphaned nodes persist and are simply excluded from filtered queries (deterministic, not data loss); UI dropdown prevents most bad input. Fix: validate Application.findUnique({where:{slug}}) exists before create, or migrate to an applicationId FK.

DB-02 — PageContent missing createdAt (MEDIUM, lowered from high) prisma/schema.prisma:266 — has updatedAt only; every other model has both (except SiteSetting). Impact: no creation audit trail for page content. Fix: add createdAt DateTime @default(now()).

Infrastructure

INFRA-01 — No container resource limits (CRITICAL) docker-compose.yml:41-116 — neither app (with restart: always) nor nginx nor postgres define memory/CPU limits. Impact: a runaway/leaking process consumes all VPS RAM and OOM-kills siblings → documented downtime. Fix: app mem_limit 1G (reservation 512M), nginx 256M/128M, cpus 1.0 each; load-test.

INFRA-03 — Secrets committed in tracked env file (CRITICAL) env:29-40 — real OPENAI_API_KEY (sk-proj-…), SMTP_USER, and a 16-char Gmail app password, tracked since first commit fc24313. .gitignore pattern .env* doesn't match the bare env filename. Fix (in order): (1) rotate the OpenAI key and Gmail app password now — they're already compromised; (2) git rm --cached env, add env to .gitignore; (3) commit .env.example with placeholders; (4) move to GitHub Secrets / a secrets manager. History scrub is optional but the rotation is not.

INFRA-05 — No automated database backups (CRITICAL) src/app/hq-command/dashboard/health/actions.ts:31-52 — backup is a manual HTTP export of unencrypted JSON covering only 10 of 16 tables (omits HeroSlide, SiteSetting, AiConversation, AiEvent, ClientUser, TeamMember). No cron, no offsite, no retention, no WAL/PITR. Impact: total loss if the PG volume corrupts. Fix: nightly pg_dump | gzip to offsite (S3, encrypted), 30-day retention; enable WAL archiving for PITR; document + test RTO/RPO monthly.

INFRA-02 — Nginx has no healthcheck; unconditional depends_on (HIGH, lowered from critical) docker-compose.yml:93-116 — nginx lacks a healthcheck and depends on app with no condition, while app correctly uses condition: service_healthy for postgres. Verifier softened the "deadlock" claim: it's a startup race, not a hang — nginx proxies to the unready app and returns 502/503 during the ~40s start window. Fix: add an nginx healthcheck (tcp:80) and depends_on: app: condition: service_healthy.

INFRA-04 — Deploy script has no error handling or rollback (HIGH, lowered from critical) .github/workflows/deploy.yml:28-51 — the health check at line 49 uses curl -sf … || echo, swallowing failure so script_stop is bypassed; migrations (line 42) run post-live with no error handling; no snapshot, no rollback. Impact: a failed migration or post-deploy crash leaves prod broken with manual-only recovery. Fix: make the health check fail the deploy (curl -sf … || exit 1, retried 3× with backoff); take a pre-deploy snapshot; restore on failure.

INFRA-06 — No monitoring, alerting, or log aggregation (HIGH, lowered from critical) docker-compose.yml — structured logger + /api/health exist, but nothing ships metrics/logs to any platform (no Prometheus/Grafana/Sentry/Loki). Impact: blind operations; degradation invisible until users complain. Fix: expose /metrics (prom-client), ship structured logs to Loki/CloudWatch, alert on mem >80% / error-rate >1% / p99 >500ms.

INFRA-07 — Missing OCSP stapling (HIGH) nginx/conf.d/flux.conf:68-82 — no ssl_stapling. Impact: slower TLS handshakes; visitor IPs leak to Let's Encrypt. Fix: add ssl_stapling on; ssl_stapling_verify on; ssl_trusted_certificate …/chain.pem; resolver 1.1.1.1 8.8.8.8 valid=300s;.

INFRA-11 — CSP allows unsafe-inline + unsafe-eval (MEDIUM, lowered from high) nginx/conf.d/flux.conf:87 — present in script-src for Next.js hydration. Verifier noted strong compensating controls (Zod validation, escapeHtml, magic-byte checks, minimal dangerouslySetInnerHTML), so it's real technical debt rather than an active hole. Fix: move to nonce-based CSP; run Report-Only first.

INFRA-08 — Nginx cache-poisoning surface via hidden Set-Cookie (MEDIUM, lowered from high) nginx/conf.d/flux.conf:282-286: proxy_ignore_headers/proxy_hide_header Set-Cookie on public pages. Verifier refuted the "session cookies lost" and "sensitive header disclosure" parts (authenticated requests bypass cache via $cookie_flux_session); the residual risk is poisoning public pages with a non-sensitive injected cookie. Fix: refine cache-bypass logic rather than blanket-hiding Set-Cookie.

INFRA-09 — Large client_max_body_size without explicit body timeout (MEDIUM, lowered from high) nginx/conf.d/flux.conf:71,135,148 — 500M limit; global block lacks client_body_timeout. Verifier softened: nginx's 60s default applies and upload endpoints set proxy timeouts. Fix: add client_body_timeout 60s; to the http block; reduce 500M where not genuinely needed.

INFRA-10 — No logrotate (MEDIUM, lowered from high) nginx/nginx.conf:39 — logs unbounded, but verifier noted /var/log/nginx isn't a mounted volume, so logs live in ephemeral container storage (purged on restart). Residual risk: log loss on restart and container-layer growth on long uptimes. Fix: add logrotate (daily, compress, keep 7, SIGHUP).

Performance

PERF-01 — System prompt rebuilt with 4 DB queries on every chat request (HIGH) src/app/api/chat/route.ts:42-281: buildSystemPrompt() runs application.findMany + 2× globalNode.count + sparePart.count (Promise.all) on every POST, with no caching layer (the promptCacheKey is a no-op). Impact: 4 round-trips per message; 400+/min at 100 concurrent users. Fix: cache the built prompt in-memory with a 30-60s TTL, invalidated on CMS write.
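A sketch of the in-memory TTL cache suggested in the fix, under the assumption that a single process serves the chat route. `createTtlCache` is a hypothetical helper; the injectable clock exists only so the expiry logic can be tested without waiting.

```typescript
// Caches one asynchronously built value for ttlMs and shares in-flight builds.
function createTtlCache<T>(
  build: () => Promise<T>,
  ttlMs: number,
  now: () => number = Date.now,
) {
  let value: T | undefined;
  let hasValue = false;
  let expiresAt = 0;
  let inFlight: Promise<T> | null = null;

  return {
    async get(): Promise<T> {
      if (hasValue && now() < expiresAt) return value as T;
      if (!inFlight) {
        // Concurrent callers during a rebuild await the same promise.
        inFlight = build()
          .then((built) => {
            value = built;
            hasValue = true;
            expiresAt = now() + ttlMs;
            return built;
          })
          .finally(() => {
            inFlight = null;
          });
      }
      return inFlight;
    },
    // Call from CMS write paths so edits show up before the TTL elapses.
    invalidate(): void {
      hasValue = false;
      expiresAt = 0;
    },
  };
}
```

Wrapping buildSystemPrompt() this way would turn the four queries per message into four queries per TTL window; with several app instances each process would keep its own copy.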

PERF-03 — Missing index on GlobalNode.application (HIGH) prisma/schema.prisma — queries filter on (application, isActive), (nodeType, isActive, application), and (application, nodeType, isActive) (applications page 136-142; chat 503, 621) but there's no application index. Impact: full table scans on the synchronous FluxAI search path. Fix: add @@index([application]), @@index([application, isActive]), @@index([application, nodeType]); migrate.

PERF-02 — Synchronous fs.readdirSync in SSR paths (MEDIUM, lowered from high) src/app/[locale]/page.tsx:102, src/app/[locale]/applications/[slug]/page.tsx:32 — real blocking calls, but verifier downgraded: they're fallback paths inside try/catch on small dirs, hit during 60s ISR regen, not per-request. Fix: switch to fs.promises.readdir, or remove the fallback once HeroSlide/CMS migration is complete.

FluxAI

AI-01 — Prompt injection via unvalidated context fields (HIGH) src/app/api/chat/route.ts:215-216: context.section and context.activeTab come from req.json() (TS types only, no runtime validation) and are interpolated into contextNote, concatenated to the system prompt at line 287. A manipulated frontend store can inject instructions overriding FluxAI's personality/tool limits. Fix: whitelist context.section/activeTab against the known section set; reject anything else (Zod).
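The whitelist idea can be sketched without any dependency. The version below is hand-rolled rather than Zod-based, and it drops unknown values instead of rejecting the request; both are simplifications. The section and tab names are assumed placeholders, not the site's real set.

```typescript
// Assumed values for illustration; the real lists come from the site's navigation.
const ALLOWED_SECTIONS = ["home", "applications", "parts", "heritage", "news"] as const;
const ALLOWED_TABS = ["overview", "specs", "gallery"] as const;

type Section = (typeof ALLOWED_SECTIONS)[number];
type Tab = (typeof ALLOWED_TABS)[number];

// Returns the value only if it is a string found in the allowed list.
function pick<T extends string>(allowed: readonly T[], value: unknown): T | undefined {
  return typeof value === "string" && (allowed as readonly string[]).includes(value)
    ? (value as T)
    : undefined;
}

// Runtime check for the untyped `context` object parsed from the request body.
function sanitizeChatContext(raw: unknown): {
  section: Section | undefined;
  activeTab: Tab | undefined;
} {
  if (typeof raw !== "object" || raw === null) {
    return { section: undefined, activeTab: undefined };
  }
  const ctx = raw as Record<string, unknown>;
  return {
    section: pick(ALLOWED_SECTIONS, ctx.section),
    activeTab: pick(ALLOWED_TABS, ctx.activeTab),
  };
}
```

Because only values from a fixed list can reach contextNote, free text smuggled into these fields never touches the system prompt.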

AI-03 — User message text stored unencrypted (HIGH) src/app/api/chat/route.ts:268 — full user text (≤8000 chars) persisted as plaintext in AiEvent.payloadJson (String). Customer questions about volumes/processes sit in cleartext. Impact: breach exposes competitive intel; GDPR Art. 32 (encryption at rest). Fix: encrypt sensitive payloads (pgcrypto/KMS) or store only industry label + tool names; pair with the DB-05 retention policy.

AI-02 — No conversation history / resume (HIGH) src/components/ai/SilentObserver.tsx:76-78: useChat initializes fresh with no initialMessages; backend persists everything but the UI never fetches it. Impact: multi-turn B2B sales context lost on refresh, undermining the funnel-tracking investment. Fix: add /api/chat/history?sessionId, hydrate initialMessages on mount, offer "continue previous conversation?".

AI-06 — Telemetry write errors silently swallowed (HIGH) src/app/api/chat/route.ts:272-274, 370-371 — telemetry writes are wrapped in try/catch with only log.warn(); the onFinish writes can fail after the response streamed, losing conversation records. Impact: undercounted funnel analytics. Fix: circuit-breaker + alert on >5 failures/hour; buffer-and-retry unsent events.
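The alerting half of this fix can be sketched as a sliding-window failure counter. This is not a full circuit breaker and does not buffer or retry events; `createFailureMonitor` and its parameters are hypothetical, with the threshold and window taken from the ">5 failures/hour" suggestion.

```typescript
// Fires onAlert once when failures inside the window exceed the threshold,
// and re-arms after the count drops back to the threshold or below.
function createFailureMonitor(
  threshold: number,
  windowMs: number,
  onAlert: (count: number) => void,
  now: () => number = Date.now,
) {
  let failures: number[] = [];
  let alerted = false;

  return {
    recordFailure(): void {
      const t = now();
      // Keep only timestamps still inside the window, then add this one.
      failures = failures.filter((ts) => t - ts < windowMs);
      failures.push(t);
      if (failures.length > threshold && !alerted) {
        alerted = true;
        onAlert(failures.length);
      }
      if (failures.length <= threshold) alerted = false;
    },
  };
}
```

The telemetry catch blocks would call `recordFailure()` next to the existing log.warn(), so a burst of lost writes surfaces as one alert rather than a stream of warnings.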

SEO

SEO-02 — No image or video sitemaps (HIGH) src/app/sitemap.ts:1-106 — only page URLs. Rich media (news coverImage/galleryJson, application galleries, heritage mediaUrl videos) is invisible to Google Image/Video Search; robots.ts declares only /sitemap.xml. Fix: add sitemap-images.xml (<image:image> per article cover/gallery + app hero) and sitemap-videos.xml (heritage videos with title/description/duration); declare both in robots.

i18n

I18N-01 — Hardcoded English in EnergySavingsCalculator (HIGH) src/components/ai/EnergySavingsCalculator.tsx:61,118,146-149,156,164-178 — 13+ literal strings ("Annual Savings", "CO2 Reduced", "Payback", "Request Detailed Engineering Study", …); component never imports useTranslations, and no EnergySavingsCalculator namespace exists in any of the 5 locale files. Impact: IT/ES/DE/VEC users see English in a customer-facing calculator. Fix: add the namespace to en.json, translate to all 4 locales, wire useTranslations("EnergySavingsCalculator").

I18N-02 — Hardcoded alert in CartDrawer (HIGH) src/components/layout/CartDrawer.tsx:43: alert("You must accept the privacy policy.") bypasses the otherwise-used translation system; a modal blocker shown in English to all locales. Fix: move to a translated key and prefer the Toast component over native alert().

Accessibility / UX

A11Y-02 — Broken Tailwind class breaks nav styling (HIGH) src/components/layout/NavBar.tsx:190: "text-[#86868B hover:text-[#1D1D1F]" is missing a ]. Impact: the color class doesn't apply to inactive nav links in light mode. Fix: "text-[#86868B] hover:text-[#1D1D1F]".

A11Y-03 — Icon-only buttons missing aria-label (HIGH) src/components/layout/NavBar.tsx:229,278,294 — theme toggle, cart, mobile-menu announce only "button". Fix: add descriptive aria-labels (e.g., aria-label={isDark ? 'Switch to light mode' : 'Switch to dark mode'}).

A11Y-04 — Modal can't be closed with Escape (HIGH) src/components/ui/CaseStudyModal.tsx:232-237 — no keydown handler; only a click-able close button (WCAG 2.1.1). Fix: useEffect keydown listener → if (e.key==='Escape') onClose().

A11Y-05 — No skip-to-main-content link (HIGH) src/app/[locale]/layout.tsx:145-204 — ~13 interactive elements before main; main div has no id and isn't a <main> (WCAG 2.4.1). Fix: add <a href="#main-content" class="sr-only focus:not-sr-only">Skip to main content</a> and an id/<main> target.

A11Y-08 — No prefers-reduced-motion handling (HIGH) src/app/globals.css — 260+ framer-motion animations, zero reduced-motion checks. Impact: vestibular-disorder risk. Fix: add @media (prefers-reduced-motion: reduce){*{animation:none!important;transition:none!important}} and gate heavy motion via useReducedMotion().

A11Y-09 — Hero videos lack captions (MEDIUM, lowered from high) src/components/sections/HeroReel.tsx:71-83 — no <track>; verifier noted videos are muted/decorative with overlaid text, softening impact (WCAG 1.2.2). Fix: add captions where videos carry meaning, or mark decorative explicitly.

A11Y-01 — Empty alt on cart product images (MEDIUM, lowered from high) src/components/layout/CartDrawer.tsx:148: alt=""; verifier noted adjacent visible title+SKU text mitigates. Fix: alt={item.title} for defense-in-depth.

Code quality

CQ-01 — 187 any usages across 48 files (HIGH) Concentrated in ApplicationClient.tsx (1208 lines), chat/route.ts (920 lines), SilentObserver.tsx (22 instances), despite strict: true. Fix: use existing cms.ts types (AppFull, NodeFull); add a model-viewer.d.ts for the web-component casts; chip away by file.

CQ-02 — Inconsistent logging: 50 console.error vs 9 log.error (HIGH) A structured JSON logger exists (src/lib/logger.ts) but is bypassed in sitemap.ts:75,102, heritage/page.tsx:210, mailer.ts:96, imageOptimizer.ts:117, etc. Impact: unstructured output breaks Loki/CloudWatch parsing. Fix: replace console.* with log.* in server code; add a lint rule / pre-commit hook.

CQ-04 — Empty catch blocks suppress JSON parse errors (HIGH) ApplicationClient.tsx:618,826-829, news/[slug]/page.tsx:259, parts/page.tsx:55 — silent catch(e){} on dimensions/media/specs parsing. Fix: at minimum catch(e){ log.warn('parse_failed', e) }.

CQ-03 — Unguarded JSON.parse for sections/advantages (MEDIUM, lowered from high) ApplicationClient.tsx:1026-1027: data.sectionsJson/advantagesJson parsed with no try/catch (verifier confirmed this pair; refuted the claims about lines 600/609/CaseStudyModal which do have guards). Fix: use the existing parseJsonField helper from cms.ts.

CQ-07 — any-typed Prisma access in chat (MEDIUM, lowered from high) chat/route.ts:413-414: (app: any) access with inconsistent null-handling; verifier refuted the line-450 "undefined .score" claim (it's always initialized). Real issue is type-safety, not a live crash. Fix: use Prisma select with explicit types instead of any.

Testing / reliability

TEST-01 — No integration/e2e tests for critical API routes (HIGH) Only tests/ai/golden.test.mjs (17 unit tests). Zero coverage of /api/consultation, /api/chat, /api/health, /api/public-upload. Fix: add integration tests (Node test runner) with mocked Prisma/nodemailer; aim >80% on src/app/api.

TEST-02 — buildSystemPrompt() has no DB-failure handling (HIGH) chat/route.ts:42-72 — 4 parallel Prisma queries, no try/catch; called at line 281 before streamText(). If Postgres is down, the whole chat 500s instead of degrading. Asymmetric with the telemetry try/catch. Fix: wrap in try/catch → fall back to a static DEFAULT_SYSTEM_PROMPT (keeps personality/tools, omits dynamic counts).
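The degradation path described in the fix might look like the sketch below. The prompt wording, the `PromptStats` shape, and the injected `loadStats` function are placeholders; only the DEFAULT_SYSTEM_PROMPT name comes from the report.

```typescript
// Placeholder text; the real static prompt would keep FluxAI's personality and tools.
const DEFAULT_SYSTEM_PROMPT =
  "You are FluxAI. Live catalogue counts are temporarily unavailable.";

type PromptStats = { applications: number; nodes: number; spareParts: number };

// Builds the dynamic prompt, falling back to the static one if the DB read fails.
async function buildSystemPromptSafe(
  loadStats: () => Promise<PromptStats>,
  onError: (err: unknown) => void = () => {},
): Promise<string> {
  try {
    const stats = await loadStats();
    return (
      `You are FluxAI. Catalogue: ${stats.applications} applications, ` +
      `${stats.nodes} nodes, ${stats.spareParts} spare parts.`
    );
  } catch (err) {
    // Report the failure, but let the chat continue with reduced context.
    onError(err);
    return DEFAULT_SYSTEM_PROMPT;
  }
}
```

Passing the logger in as `onError` keeps the failure visible while the route still reaches streamText() with a usable prompt.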

TEST-03 — OpenAI key never validated at startup or in health check (HIGH) chat/route.ts, health/route.ts: openai('gpt-4o') is called with no key validation; /api/health only does SELECT 1. Impact: invalid/missing key surfaces mid-stream after headers are sent. Fix: validate OPENAI_API_KEY at startup; wrap streamText() in try/catch returning a structured error (with retryAfterSec on 429).

TEST-04 — Consultation email isn't idempotent; orphaned records on SMTP failure (HIGH) api/consultation/route.ts:80-155 — signal created (line 105) before email send (line 141); on SMTP failure the client gets a 500 with no ticketId, retries duplicate, and the fallback route isn't re-attempted. Fix: add an idempotency key, return 202 with ticketId + emailError on send failure, and add a background retry with backoff.
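A rough sketch of the idempotency-key flow, assuming the client sends a stable key per submission. The in-memory Map stands in for a unique column on the signal table, `createConsultationHandler` is a hypothetical name, and returning 200 on success is an assumption; the background retry is left out.

```typescript
type Ticket = { ticketId: string; emailError?: string };
type SubmitResult = { status: 200 | 202; ticket: Ticket };

function createConsultationHandler(sendEmail: (ticketId: string) => Promise<void>) {
  const byKey = new Map<string, Ticket>(); // stand-in for a unique DB constraint
  let seq = 0;

  return async function submit(idempotencyKey: string): Promise<SubmitResult> {
    // A retry with the same key replays the stored ticket instead of creating a new one.
    const existing = byKey.get(idempotencyKey);
    if (existing) return { status: existing.emailError ? 202 : 200, ticket: existing };

    const ticket: Ticket = { ticketId: `T-${++seq}` };
    byKey.set(idempotencyKey, ticket); // recorded before the email is attempted

    try {
      await sendEmail(ticket.ticketId);
      return { status: 200, ticket };
    } catch (err) {
      // The record is kept; the client still receives its ticketId.
      ticket.emailError = err instanceof Error ? err.message : String(err);
      return { status: 202, ticket };
    }
  };
}
```

Since the client always gets a ticketId back, an SMTP outage no longer prompts a blind retry that duplicates the signal.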


3. Medium / Low / Info

Security

  • SEC-06 (med) — operations.ts:129-132: file links built from unvalidated fileUrl → validate it starts with /.
  • SEC-07 (low, downgraded) — parts/page.tsx: client-side gate not SSR redirect; data isn't leaked → optional redirect() for cleanliness.
  • SEC-08 (med) — HQ actions have no rate limiting → rate-limit by admin id + action.
  • SEC-09 (low) — rateLimit.ts:144: trusts X-Forwarded-For unvalidated → validate single-IP / trust only Nginx.
  • SEC-10 (low) — chat/route.ts:225: SESSION_SECRET reused as telemetry HMAC salt → use a separate VISITOR_HASH_SECRET.
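As one example from the list above, the SEC-06 check could look like the sketch below. `isInternalPath` is a hypothetical name; the extra rejection of `//` and `/\` goes slightly beyond "starts with /" because browsers treat both as pointing at another host.

```typescript
// Hypothetical helper: accept only same-site absolute paths for email file links.
function isInternalPath(fileUrl: string): boolean {
  // Must begin with a single "/".
  if (!fileUrl.startsWith("/")) return false;
  // "//host/..." is protocol-relative and "/\host" is normalised to it by browsers.
  if (fileUrl.startsWith("//") || fileUrl.startsWith("/\\")) return false;
  return true;
}
```

Links that fail the check can simply be omitted from the email rather than rendered as plain text.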

Database

  • DB-06 (med) — schema.prisma:230: AiConversation→OperationsSignal no onDelete → orphans; document or cascade.
  • DB-07 (med) — add @@index([type, status, createdAt(sort: Desc)]) on OperationsSignal.
  • DB-08 (low) — Application.order non-unique → validate uniqueness per isActive group.
  • DB-09 (low) — NewsArticle.publishedAt has no future-date guard → filter publishedAt <= now().

Performance

  • PERF-04 (med) — chat/route.ts:519: JS keyword filter post-query → use Prisma contains/mode:'insensitive'.
  • PERF-05 (med) — lazy-load GlobalOperations (Three.js ~300KB) via dynamic({ssr:false}).
  • PERF-06 (med) — 48 use client components; push pure-display ones back to server components.
  • PERF-07 / INFRA-12 (med) — Prisma pool max=10 hardcoded → make DB_POOL_MAX env-configurable (20-30 in prod).
  • PERF-08/10 (low) — ISR revalidate=60 × 40 renders/min → consider 300-600s or on-demand revalidation.
  • PERF-09 (med) — large Docker build context → audit with --progress=plain, exclude heavy media.
  • INFRA-13 (med) — Nginx upstream keepalive=32 → raise to 128-256.
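For PERF-07 / INFRA-12, reading the pool size from the environment might be sketched like this. `resolvePoolMax` is an illustrative name, and the fallback of 10 simply mirrors the currently hardcoded value.

```typescript
// Hypothetical helper: parse DB_POOL_MAX, falling back to the current hardcoded value.
function resolvePoolMax(raw: string | undefined, fallback = 10): number {
  const n = Number(raw);
  // Rejects unset, empty, non-numeric, fractional, zero, and negative values.
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

// Intended call site (assumes a Node environment):
//   const max = resolvePoolMax(process.env.DB_POOL_MAX);
```

Falling back rather than throwing keeps a typo in the env file from taking the app down, at the cost of silently running with the default.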

FluxAI

  • AI-04 (med) — no golden eval for tool selection → build a 20-30 query eval set, run monthly.
  • AI-05 (low) — stepCountIs(5) may truncate → bump to 7-8, monitor traces.
  • AI-07 (low) — no per-tool cost tracking → log (conversationId, toolName, tokensIn/Out).
  • AI-08 (med) — SilentObserver.tsx:232,237: tool failures silently omitted → render "no results" card.
  • AI-09 (low) — duplicate of PERF-01 (prompt caching).
  • AI-10 (med) — useChat has no onError → add error state + Retry button.

SEO

  • SEO-01 (med) — OG images lack width/height/type → add {width:1200,height:630,type:'image/jpeg'}.
  • SEO-03 (med) — no FAQPage schema → add faqPageSchema().
  • SEO-04 (med) — no VideoObject schema on heritage videos → add videoObjectSchema().
  • SEO-05/06 (low) — Product schema lacks AggregateRating/Offer; Article lacks keywords/commentCount.
  • SEO-07 (med) — markdown parser allows empty alt → fallback alt || 'Article image' + CMS validation.
  • SEO-08 (low) — add breadcrumb schema to home/team/heritage/news-hub.
  • SEO-09 (low) — LocalBusiness hours lack timezone → add Europe/Rome.
  • SEO-10 (med) — GlobalNode case studies lack structured data → add caseStudySchema().
  • SEO-11 (low) — add Twitter creator/site handles.
  • SEO-12 (info) — no Core Web Vitals monitoring → add web-vitals + dashboard.
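The `faqPageSchema()` suggested in SEO-03 could be as small as the sketch below. The `Faq` input type is an assumption; the output follows the schema.org `FAQPage` / `Question` / `Answer` vocabulary.

```typescript
// Assumed input shape; the CMS may store FAQs differently.
type Faq = { question: string; answer: string };

// Builds a schema.org FAQPage object suitable for a JSON-LD <script> tag.
function faqPageSchema(faqs: Faq[]) {
  return {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: faqs.map((f) => ({
      "@type": "Question",
      name: f.question,
      acceptedAnswer: { "@type": "Answer", text: f.answer },
    })),
  };
}
```

The result would be serialised with `JSON.stringify` in the same way as the existing Organization and Article schemas.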

i18n

  • I18N-03 (med) — error boundaries hardcoded English → translate [locale]/error.tsx at minimum.
  • I18N-04 (med) — formatNumber() calls toLocaleString() without locale → pass active locale / Intl.NumberFormat.
  • I18N-05 (med) — register EnergySavingsCalculator namespace in all 5 locale files (pairs with I18N-01).
  • I18N-08 (low) — getLocalizedData fallback is English-only → optional locale-chain fallback.
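For I18N-04, a locale-aware `formatNumber()` might look like this minimal sketch. The two-argument signature is an assumption about how the active locale would be threaded in; `Intl.NumberFormat` is the standard API.

```typescript
// Sketch: format with an explicit locale instead of the runtime default.
function formatNumber(value: number, locale: string): string {
  return new Intl.NumberFormat(locale).format(value);
}

// Example locales only; the site's five locales are not listed in this section.
const english = formatNumber(1234567.89, "en-US");
const german = formatNumber(1234567.89, "de-DE");
```

Passing the locale explicitly matters on the server, where `toLocaleString()` without arguments uses the host's default rather than the visitor's language.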

Accessibility / UX

  • A11Y-06 (med) — ConsultationScheduler.tsx:394: errors lack aria-live/role=alert.
  • A11Y-07 (med) — form inputs lack <label htmlFor> associations.
  • A11Y-10 (med) — 71 buttons default to type=submit → add type=button to non-submit buttons.
  • A11Y-11 (med) — AI chat modal has no focus trap.
  • A11Y-12 (med) — language dropdown lacks arrow-key navigation.
  • A11Y-13 (med) — opacity-reduced nav text may fail 4.5:1 contrast → test + use solid colors.
  • A11Y-14 (med) — success message lacks role=status/aria-live.
  • A11Y-15 (low) — verify dark-mode coverage on newer pages + HQ panel.

Code quality

  • CQ-05 (med) — oversized files (ApplicationClient 1208, chat/route 920, AssetBucketBrowser 874) → extract sub-components/tools.
  • CQ-06 (med) — model-viewer as any casts → proper .d.ts type.
  • CQ-08 (med) — inconsistent error-handling pattern across layers → standardize log-then-return/throw.
  • CQ-09 (low) — privacy/page.tsx:12: hardcoded privacy@rf-flux.com (TODO unconfirmed) → env/SiteSettings + confirm with FLUX legal.
  • CQ-10 (low) — repeated as unknown as casts on AI SDK responses → define interfaces.
  • CQ-11 (low) — i18nHelper.ts:27: console.error → log.error.

Infra / testing

  • INFRA-14 (med) — backup/restore lacks HMAC integrity → sign exports, verify on restore.
  • INFRA-15 (med) — deploy doesn't verify commit signatures → git verify-commit HEAD.
  • TEST-05 (med) — no i18n parity/fallback tests.
  • TEST-06 (med) — error boundaries log to console, not the structured logger.
  • TEST-07 (med) — server-component DB errors bubble ungracefully → per-query try/catch + degraded UI.
  • TEST-08 (med) — no retry/backoff for transient email/Prisma/OpenAI failures.
  • TEST-09 (med) — in-memory rate limit multiplies across replicas → require Redis for multi-instance.
  • TEST-10 (med) — health check doesn't verify migrations/SMTP/OpenAI/env.
  • TEST-11/12/13 (low) — telemetry truncation unlogged; no polyglot-file upload tests; restoreDatabase() partial-failure handling.
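The retry/backoff gap in TEST-08 could be closed with a small wrapper along these lines. `withRetry` and its default attempt count and delay are illustrative choices, not values taken from the codebase.

```typescript
// Hypothetical helper: retry an async operation with exponential backoff.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        // Wait base, then 2x base, then 4x base, ... between attempts.
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
      }
    }
  }
  // All attempts failed: surface the last error to the caller.
  throw lastError;
}
```

In practice only transient failures (timeouts, 429s, connection resets) should be retried; a filter on the error type would be a natural extension.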

4. What's Already Strong

Credit where due — several things are done well and should be preserved:

  • Public-endpoint security baseline. CSRF is correctly implemented on public endpoints, rate limiting covers chat, file uploads use magic-byte validation, and the consultation route already escapes HTML correctly (the model SEC-05 should copy). Recent commits added a canonical-host guard and scanner-probe blocking in Nginx — good hardening.
  • JWT signing discipline. session.ts/clientAuth.ts correctly throw when SESSION_SECRET is missing rather than falling back — this is what saved SEC-01 from being critical.
  • i18n infrastructure. Complete top-level key parity across all 5 locales (18 sections), correct next-intl integration, preserved {count}/{app} placeholders, and a well-built AI translation glossary that masks/restores protected brand/technical terms (FLUX, Radio Frequency, solid-state, RF, MHz/kHz/GHz, kW/kWh/MW). Pluralization (componentFound/componentsFound) is handled correctly.
  • SEO foundations. Correct hreflang alternates, canonical URLs, breadcrumb JSON-LD on detail pages, and core schemas (Organization, LocalBusiness, WebSite, Article, Product). robots.txt disallow rules are correct; sitemap covers all public routes.
  • Caching & DB pooling. Nginx edge caching, ISR, image optimization, and a real Prisma connection pool are all in place — a solid performance baseline.
  • Architecture awareness. A structured JSON logger, safe-JSON helpers (parseJsonField), escapeHtml, and proper cms.ts types all exist — the gaps are adoption consistency, not missing infrastructure. bcrypt is used for password hashing; migrations are clean and idempotent (IF NOT EXISTS guards). FluxAI's 9-tool SPIN-funnel architecture is well-designed. Semantic HTML (main/nav/header/footer) and dark-mode coverage are largely in place.

5. Prioritized Remediation Roadmap

NOW — blockers; do before any further production traffic

| Item | Finding | Effort |
|---|---|---|
| Rename/export middleware so auth actually runs | SEC-02 | S |
| Add session checks inside HQ server actions | SEC-03 | M |
| Add auth to /api/assets + /api/branding | SEC-04 | S |
| Rotate leaked OpenAI key + Gmail password; untrack env | INFRA-03 | S |
| Create the missing ClientUser migration | DB-01 | S |
| Escape HTML in operations email templates | SEC-05 | S |
| Whitelist chat context fields (prompt injection) | AI-01 | S |
| Remove the proxy session-secret fallback | SEC-01 | S |

NEXT — operational resilience + data integrity

| Item | Finding | Effort |
|---|---|---|
| Automated nightly offsite encrypted backups + WAL/PITR | INFRA-05 | M |
| Container memory/CPU limits | INFRA-01 | S |
| Nginx healthcheck + condition: service_healthy | INFRA-02 | S |
| Deploy: failing health check + snapshot/rollback | INFRA-04 | M |
| AI telemetry retention policy + cleanup job | DB-05 | M |
| Encrypt / minimize stored chat message text | AI-03 | M |
| buildSystemPrompt() try/catch + fallback prompt | TEST-02 | S |
| Validate OpenAI key at startup + wrap streamText | TEST-03 | S |
| Idempotent consultation submission | TEST-04 | M |
| Cache system prompt (TTL) | PERF-01 | S |
| Add GlobalNode.application indexes | PERF-03 | S |
| Integration tests for the 4 critical API routes | TEST-01 | L |
| Monitoring + log aggregation + alerts | INFRA-06 | M |
| Fix aria-labels, skip link, Escape-to-close, reduced-motion, broken nav class | A11Y-02/03/04/05/08 | M |
| i18n: EnergySavingsCalculator + CartDrawer alert | I18N-01/02/05 | M |

LATER — hardening, polish, scale

| Item | Finding | Effort |
|---|---|---|
| Migrate hot JSON String fields → Jsonb | DB-04 | L |
| Nonce-based CSP (drop unsafe-inline/eval) | INFRA-11 | M |
| OCSP stapling, logrotate, body timeout, keepalive tuning | INFRA-07/09/10/13 | S |
| Image + video sitemaps; VideoObject/FAQ/CaseStudy schema | SEO-02/03/04/10 | M |
| Reduce any usage; extract oversized files; standardize logging | CQ-01/02/04/05 | L |
| Lazy-load Three.js; trim use client; configurable pool | PERF-05/06/07 | M |
| Conversation history/resume; tool eval set; useChat onError | AI-02/04/10 | M |
| Remaining a11y (focus trap, labels, contrast, button types) | A11Y-06/07/10/11/12/13/14 | M |
| Locale-aware number formatting; translate error boundaries | I18N-03/04 | S |

6. False Positives Considered (refuted / materially downgraded)

These were checked and either refuted or substantially softened during verification — listed so you know they were examined, not missed:

  • SEC-07 (B2B /parts not redirecting) — refuted as a security issue. The page returns 200 but never exposes parts data (empty array), and shows a proper "Access Restricted" locked-state UI. A style/pattern preference, not a leak. Downgraded high → low.
  • CQ-07 line-450 "undefined .score" — refuted. The scored array always initializes score (line 415, returned 443); .score can't be undefined there. The surrounding any-typing concern remains valid; the crash claim does not.
  • CQ-03 (lines 600/609 + CaseStudyModal "silent failure") — partially refuted. Lines 600/609 are inside a try/catch with .w/.h/.d validation, and CaseStudyModal:189-197 has real try/catch. Only the sectionsJson/advantagesJson parse at lines 1026-1027 is genuinely unguarded.
  • INFRA-08 (cache poisoning) — partially refuted. "Session cookies lost for users" and "sensitive header disclosure" are false (authenticated requests bypass cache via $cookie_flux_session). Residual risk is limited to poisoning public pages with non-sensitive cookies.
  • INFRA-02 / INFRA-04 (nginx + deploy) — softened, not refuted. The "deadlock"/"no-recovery" framing was overstated (startup race + manual recovery, not hangs); both remain real high-severity gaps.
  • INFRA-09 / INFRA-10 (body timeout / logrotate) — softened. nginx's 60s default timeout and ephemeral (unmounted) log storage reduce the blast radius; both downgraded high → medium.
  • DB-02 / DB-03 / PERF-02 / A11Y-01 / A11Y-09 / INFRA-11 — confirmed but downgraded (see §2) where verification found mitigating context (other models' parity, dropdown-constrained input, ISR-only timing, adjacent visible text, muted decorative video, compensating input-validation controls).

Bottom line, David: the build quality is real, but the auth layer is currently a no-op and the platform has no backups, no monitoring, and live secrets in git. The "NOW" block is mostly small, surgical changes — SEC-02 alone is a one-file fix that re-activates a security layer you already wrote. Land that block first, then the operational resilience in "NEXT," and this moves from "risky to run" to "production-ready" quickly.