Files
flux-srl/docs/CLIENT_REPORT_2026-05_SECURITY_AND_AI_ANALYTICS.md
T
davidherran 1ee8288c7e feat(analytics): GA4 with GDPR Consent Mode v2
Google Analytics integration, off by default and GDPR-compliant for EU:

- src/lib/analytics/gtag.ts: typed event helpers + consent control. Every
  function is a safe no-op when NEXT_PUBLIC_GA_ID is unset.
- GoogleAnalytics.tsx: loads gtag.js with Consent Mode v2, all storage
  defaulting to "denied". anonymize_ip on, send_page_view off.
- ConsentBanner.tsx: on-brand cookie banner, localized to all 5 locales,
  persists choice for one year, flips analytics_storage to granted on accept.
- PageViewTracker.tsx: fires page_view on App Router client navigation
  (inside Suspense for useSearchParams).
- Key conversion events wired: ai_consultation_submitted (primary funnel
  goal) and ai_chat_opened.
- Consent strings added to messages/{en,it,vec,es,de}.json.

Build plumbing:
- NEXT_PUBLIC_GA_ID inlined at build time via Dockerfile ARG +
  docker-compose build.args (NEXT_PUBLIC_* must exist during next build,
  not just runtime).
- Nginx CSP extended to allow googletagmanager.com + google-analytics.com.
- env template documents NEXT_PUBLIC_GA_ID (empty = analytics disabled).

Verified: production build inlines the Measurement ID into the client
bundle; site builds cleanly both with and without the ID set.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 06:53:04 -05:00

26 KiB
Raw Blame History

FLUX SRL — Website Engineering Report

Project: rf-flux.com platform Iteration: Security hardening + FluxAI conversation analytics Date: May 2026 Prepared by: DreamHouse Studios


Executive Summary

This iteration delivers two parallel outcomes for rf-flux.com:

  1. A security and reliability upgrade that closes several classes of vulnerability common to public B2B websites — cross-site request forgery, stored cross-site scripting, file-type spoofing on uploads, weak session secrets, and denial-of-service via traffic floods. The site now meets the baseline expected of an enterprise property.

  2. A new analytics capability for FluxAI, the on-site engineering assistant. Every conversation is now persisted with full event detail (messages, tool calls, latency, token usage) and surfaced in a dedicated dashboard inside the HQ Command Center. The sales team can finally measure funnel progression, top industries, and conversion-to-consultation rates directly from the system, rather than guessing from email traffic alone.

In numbers:

  • 31 files modified or created
  • +1,812 / 454 lines of code (net +1,358)
  • 10 new server-side modules for security and analytics
  • 2 new database tables for AI conversation telemetry
  • 6 new database indices on hot filter columns
  • 13 automated regression tests added for the hardening modules
  • Zero breaking changes — all database changes are additive

All work is verified by a successful production build (next build), TypeScript compilation with zero errors, and a passing automated test suite.


1. Security Hardening

1.1 Strong session enforcement

Risk eliminated: session hijacking by token forgery.

The previous code allowed the server to start with a hard-coded fallback secret ("FLUX_SUPER_SECRET_KEY_2026_ARCHITECTURE") if the SESSION_SECRET environment variable failed to load. Because that fallback string was visible in the source tree, any attacker who read the public repository could mint valid 7-day admin JWTs and walk into the HQ Command Center as any user.

The application now refuses to start without a SESSION_SECRET of at least 32 characters. A weak or missing value is a fatal error, surfaced at boot time rather than silently accepted. The same protection is applied to the B2B client portal authentication path (clientAuth.ts).

Operational note: the production VPS must have a strong secret in its .env file before the next deploy. The recommended generator is openssl rand -base64 48.

1.2 Cross-site request forgery (CSRF) on public form posts

Risk eliminated: automated form submission abuse, lead spam, and cross-site form-action attacks against /api/consultation.

The consultation form endpoint was previously accepting any POST request with a valid JSON body. We implemented the double-submit token pattern:

  • A dedicated endpoint (GET /api/csrf) mints a token signed with HMAC-SHA256 using the session secret. The token is delivered both as a cookie and in the JSON response body. It expires after one hour.
  • The form's submission code copies the token into the X-CSRF-Token header.
  • The consultation endpoint verifies that cookie and header match and that the HMAC is valid before processing any data.

Stateless verification means no database lookup is required. Tokens cannot be forged or replayed.

1.3 Strict input validation with Zod

Risk eliminated: malformed data in the database, malformed addresses in outbound email, length-based denial of service, and downstream injection.

Every field accepted by /api/consultation is now validated against a schema before any business logic runs:

  • Name, company: required, max length 120/160 characters
  • Email: must match RFC 5321 email format, max 254 characters
  • Phone, message, timeframe: bounded length
  • Preferred contact channel: enum of email | phone | whatsapp
  • Conversation insights, suggested topics: bounded arrays of bounded strings
  • Optional URL fields: must be valid URLs

Malformed payloads are rejected with HTTP 400 and a structured error log entry, never reaching the database or email pipeline.

1.4 Cross-site scripting (XSS) in transactional email

Risk eliminated: stored XSS that could execute in the engineering team's inbox when opening a malicious consultation request.

The consultation email template was concatenating client-supplied strings (name, company, email, message, AI-detected industry labels) directly into raw HTML. An attacker submitting a name like <script>...</script> would have that markup rendered as live HTML when the email was opened in any permissive client.

We introduced a small escape library (src/lib/escapeHtml.ts) and applied it to every interpolated value in the template. Mail-to links are validated with a strict regex and URL-encoded before reaching the href attribute.

1.5 File-type validation by content, not extension

Risk eliminated: stored XSS and arbitrary code execution via malicious uploads on the public upload endpoint.

Previously, /api/public-upload trusted the file extension provided by the client. A user could rename payload.html to image.png and the server would save it as-is. Browsers reading the file later might still interpret it as HTML, depending on response headers — a classic vector.

We added a magic-byte detector (src/lib/fileType.ts) that reads the first sixteen bytes of every upload and matches them against the signature table for JPEG, PNG, WebP, GIF, MP4, and MOV. Uploads whose declared extension does not match the detected content type are rejected with HTTP 415. The verification happens before the buffer is written to disk.

1.6 Distributed denial-of-service hardening

Risk eliminated: traffic floods that exhaust OpenAI quota, fill storage, or overwhelm Nginx worker capacity.

The previous rate limit was tied to a per-process in-memory map. That is acceptable for a single-container deploy (the current VPS), but the limit multiplies in a multi-replica setup, so we made the implementation forward-compatible:

  • A RateLimitStore abstraction with two implementations:
    • In-memory (default, zero new dependencies)
    • Upstash Redis over REST (auto-activates when REDIS_URL and REDIS_TOKEN environment variables are set)
  • Both implementations share the same token-bucket algorithm so request semantics do not change when scaling.

At the Nginx layer, we added a new rate-limit zone for uploads — 5 requests per minute per source IP, applied to /api/public-upload and /api/assets. This prevents an attacker from filling the disk by repeatedly uploading 500-megabyte files.

1.7 Browser-layer security headers

Risk reduced: click-jacking, MIME confusion, referrer leakage, undesired device-API access, and reflected-XSS impact.

Nginx now emits a complete set of security response headers on every HTTPS response:

Header Purpose
Content-Security-Policy Restricts which origins can serve scripts, styles, images, fonts, and network connections
Strict-Transport-Security Pre-existing; forces HTTPS for two years
X-Frame-Options: DENY Prevents the site from being embedded in iframes (click-jacking defense)
X-Content-Type-Options: nosniff Disables MIME sniffing
Referrer-Policy: strict-origin-when-cross-origin Prevents leaking the full URL to third-party links
Permissions-Policy Blocks camera, microphone, and geolocation APIs

The Content Security Policy allow-lists only api.openai.com and the Upstash REST endpoint for outbound connections. Inline scripts and styles remain permitted for now because Next.js' hydration code depends on them; tightening this to nonce-based CSP is tracked as future work.


2. Code Quality and Performance

2.1 Dead code removal

GlobalOperations_old.tsx (310 lines, no references) was removed. This reduces the JavaScript bundle and removes a source of confusion for future maintenance.

2.2 Eliminated polling-based session checks

The site's navigation bar previously checked document.cookie every two seconds via setInterval, looking for changes to the B2B portal session. Polling like this:

  • Burns CPU cycles continuously, even when nothing has changed
  • Is liable to memory leaks on rapid mount/unmount cycles
  • Updates the UI with up to two seconds of lag after login or logout

We replaced it with an event-driven implementation:

  • The authentication modal dispatches a flux:session-changed custom event immediately on successful login or logout.
  • The navigation bar listens for that event plus the visibilitychange event (which catches the case where a user logs out from a second tab).
  • No interval, no polling, no lag.

2.3 Strict TypeScript across data-driven components

Several large React sections (ApplicationsDashboard, GlobalOperations) declared their database-shaped props as any[]. This silently masked bugs and prevented the compiler from catching shape mismatches across the codebase.

We introduced src/types/cms.ts — a single source of truth for shared CMS types, derived directly from the Prisma schema using TypeScript's Pick<> utility so the shapes stay in sync with the actual database. Component props were updated to use these named types. JSON-string fields (galleryJson, dashboardMetricsJson, etc.) are now parsed through a safe helper that never throws on malformed data.

2.4 Database indices on hot paths

Several Prisma queries filter by isActive, category, or nodeType — the fields that control which content is visible on the public site. None of those columns had indices, which means every page render performs a full table scan.

We added the missing indices via a regular Prisma migration:

Table Index
GlobalNode isActive, nodeType, composite (nodeType, isActive)
Application isActive, category
NewsArticle isActive, composite (isActive, publishedAt DESC)
SparePart isActive

For the current catalogue size (~50 records per table) the speed-up is small in absolute terms, but the cost of adding indices at this stage is trivial and pays off for free as content scales.

2.5 Structured JSON logging

The codebase had console.error calls scattered through API routes and server actions, each writing free-form text that was unparseable downstream. We introduced src/lib/logger.ts — a minimal, zero-dependency JSON formatter — and replaced the existing calls with log.info, log.warn, and log.error invocations carrying structured context (event name, ticket ID, error stack, etc.).

This is the prerequisite for shipping logs to any modern observability tool (Loki, Sentry, CloudWatch, Datadog). Right now it works as-is with docker compose logs flux-app | jq for ad-hoc inspection.


3. New Capability — FluxAI Conversation Analytics

This is the largest functional addition in the iteration.

3.1 The problem

The on-site engineering assistant (FluxAI) was already capable, but every conversation was lost the moment the visitor closed the tab. There was no way to answer questions like:

  • How many people are actually using the assistant?
  • Which industries are they coming from?
  • What fraction of conversations lead to a consultation request?
  • Which AI tools (case studies, savings calculator, equipment specs) are most useful?
  • How long does a typical conversation last?
  • Are visitors getting stuck at any particular point?

This iteration adds full persistence and a dedicated dashboard.

3.2 Data model

Two new database tables capture the full life-cycle of every conversation:

AiConversation — one row per visitor session.

Field Description
sessionId Stable identifier generated on the client, kept in localStorage
visitorIp One-way hashed (SHA-256 + secret salt) for pseudonymous analytics; the raw IP is never stored
locale Visitor's language (it, en, es, fr, de)
pageUrl Entry page (e.g. cases/textile-drying)
industryLabel Detected automatically from the user's first message
funnelStage One of DISCOVERY, QUALIFY, RECOMMEND, HANDOFF
outcome OPEN, CONSULTATION, or ABANDONED
messageCount, toolCallCount Activity counters
estimatedSavingsPercent, productionVolume Captured when the AI runs its calculator
signalId Foreign key to OperationsSignal if the chat converted to a consultation ticket
startedAt, lastMessageAt, closedAt Timeline

AiEvent — one row per individual event inside a conversation.

Field Description
type user_msg, ai_msg, tool_call, tool_result, error
payloadJson The serialized content, truncated to 8 KB
toolName Which AI tool was invoked (when applicable)
latencyMs Wall-clock time the AI took to respond
tokensIn, tokensOut, cachedTokens OpenAI cost tracking
createdAt Timestamp

Both tables are extensively indexed for the dashboard queries below.

3.3 Funnel stage detection

The system automatically advances the conversation through four stages based on the AI's behaviour:

  1. DISCOVERY — initial state, before any industry is identified.
  2. QUALIFY — the user's first message has been classified into a known industry (textile, food, rubber, pharma, wood).
  3. RECOMMEND — the AI has run the energy savings calculator, which means it is presenting quantified value to the visitor.
  4. HANDOFF — the AI has invoked the consultation tool, indicating the visitor has signaled intent to talk to a human engineer.

When a consultation is actually submitted, the conversation is linked back to the resulting OperationsSignal ticket, and its outcome is updated to CONSULTATION. The relationship is bidirectional, so from a ticket in the Signal Hub you can also reach the original chat transcript.

3.4 The dashboard

A new section was added to the HQ Command Center at /hq-command/dashboard/conversations. It surfaces:

At-a-glance KPIs:

  • Total conversations
  • Conversion rate (consultations divided by total)
  • Average messages per chat
  • Average tool calls per chat

Funnel breakdown: how many visitors are in each of the four stages, with percentages relative to the total.

Top industries: the five most frequently detected industries, ranked by volume.

Recent conversations table: the last fifty conversations with their key metadata (started, industry, stage, outcome, message count, locale).

Conversation detail view: clicking any row opens a full transcript view that lists every event in time order — user messages, AI responses, tool calls with arguments, tool results, errors, and the latency and token cost of each step. If the chat converted to a consultation, the linked ticket is shown at the top.

3.5 Cost monitoring readiness

The data model captures tokensIn, tokensOut, and cachedTokens on every AI response. Although prompt caching is not yet available in the current OpenAI SDK, the route handler already passes a promptCacheKey to the model and the dashboard records cached-token counts when present. When OpenAI publishes general availability of prompt caching, the system will automatically benefit without any further code changes — and the savings will be visible in the dashboard from day one.

3.6 Privacy posture

The system was designed with European data-protection norms in mind:

  • The visitor's IP address is never stored as-is. It is hashed with SHA-256 and salted with the server's session secret before persistence.
  • Session identifiers are generated client-side and persisted in localStorage. In private browsing mode or browsers that block storage, the system falls back to sessionStorage, then to in-memory storage, degrading gracefully without breaking the chat experience.
  • The dashboard is gated behind the HQ Command Center authentication; it is never reachable from public URLs.

4. Infrastructure Improvements

4.1 Database readiness probe

The /api/health endpoint previously returned a static 200 OK regardless of the actual system state. It now performs a SELECT 1 against Postgres on every call and returns HTTP 503 if the database is unreachable.

This enables two important operations:

  • Docker auto-recovery: the app service now has a healthcheck block that runs every 30 seconds. Docker will restart the container if the check fails repeatedly.
  • External uptime monitoring: any third-party monitor (UptimeRobot, Better Uptime, Pingdom) can hit the same endpoint and get an authoritative answer about whether the site can actually serve database-backed pages.

4.2 Environment configuration template

The repository's env template was rewritten to document every required variable, the format expected, and how to generate strong values. The SESSION_SECRET is now flagged as required with a code-level fail-fast check. Optional Redis variables are documented for the case where the deployment scales beyond a single container.

4.3 Docker Compose health check

A health check block was added to the app service in docker-compose.yml:

healthcheck:
  test: ["CMD-SHELL", "node -e \"fetch('http://localhost:3000/api/health')...\""]
  interval: 30s
  timeout: 5s
  retries: 3
  start_period: 40s

This lets Docker (and any orchestrator above it) automatically recycle the container if the application loses its database connection or hangs.


5. Quality Assurance

5.1 Automated regression tests

We introduced an automated test suite covering the hardening modules. The suite is run via npm run test:ai and uses Node.js' built-in test runner — no new dependencies are added to the project. Thirteen test cases are included:

  • HTML escaping kills script-tag injection
  • HTML escaping defeats attribute-breakout payloads
  • HTML escaping handles null and undefined cleanly
  • File-type detector recognises PNG, JPEG, and MP4 by magic bytes
  • File-type detector rejects HTML payloads renamed to image extensions
  • Industry detector picks textile from textile-related phrasing
  • Industry detector picks food from food-processing phrasing
  • Industry detector returns null on off-topic prompts
  • CSRF tokens verify successfully when fresh
  • CSRF tokens fail verification when tampered with
  • CSRF garbage inputs are rejected

These tests are deterministic, fast (under 100 milliseconds), and do not make any external network calls.

5.2 Production build verification

The full Next.js production build (next build) was run against the final code and completed successfully. All new routes appear in the build manifest:

  • /api/csrf — dynamic
  • /api/health — dynamic
  • /hq-command/dashboard/conversations — dynamic
  • /hq-command/dashboard/conversations/[id] — dynamic

TypeScript compilation passes with zero errors against the strict configuration used in production.


6. Deployment and Operations

6.1 Database migration

A single additive migration file is included:

prisma/migrations/20260526180000_add_indexes_and_ai_telemetry/

The migration:

  • Creates the two new analytics tables
  • Adds the six new indices
  • Wires the foreign keys with IF NOT EXISTS guards for idempotency

It is safe to run against production data. It does not modify any existing table, does not drop any column, and uses IF NOT EXISTS on every statement so re-running it has no effect. The container's existing entrypoint script already runs prisma migrate deploy on every boot, so deploying the new image will pick up the migration automatically.

6.2 Required environment variables

Before deploying to the VPS, confirm the following:

Variable Required Notes
SESSION_SECRET Yes At least 32 characters. Generated via openssl rand -base64 48. The app will refuse to start without it.
DATABASE_URL Yes Existing
OPENAI_API_KEY Yes Existing
SMTP_* Yes Existing
REDIS_URL, REDIS_TOKEN No Only set when scaling to multiple containers
NEXT_PUBLIC_APP_URL Yes Existing

6.3 Verification checklist after deploy

The following commands can be used to verify a successful deploy:

# Container health
docker compose ps    # app status should be "healthy"
docker compose logs --tail=100 app    # no SESSION_SECRET errors

# Endpoint smoke tests
curl -s https://www.rf-flux.com/api/health
# expected: {"ok":true,"db":"up","latencyMs":N,"ts":"..."}

curl -I https://www.rf-flux.com/
# expected security headers: Content-Security-Policy, X-Frame-Options:DENY,
# X-Content-Type-Options:nosniff, Referrer-Policy, Permissions-Policy

# Database migration applied
docker compose exec postgres psql -U flux_user -d flux_db -c "\d AiConversation"
# expected: table description with all columns

# AI conversations populating
# After someone uses the chat:
docker compose exec postgres psql -U flux_user -d flux_db \
  -c "SELECT \"sessionId\", \"funnelStage\", \"outcome\", \"messageCount\" FROM \"AiConversation\" ORDER BY \"startedAt\" DESC LIMIT 5;"

The new dashboard is reachable at:

https://www.rf-flux.com/hq-command/dashboard/conversations

(requires admin login, same as the rest of the HQ Command Center.)


7. Known Limitations and Recommendations

7.1 Items intentionally deferred

  • Content Security Policy nonces. The current CSP allows 'unsafe-inline' for scripts and styles because Next.js hydration depends on them. Migrating to nonce-based CSP would require changes to next.config.ts and the build pipeline. This is a known follow-up.

  • Prompt caching for the AI. The OpenAI SDK does not yet expose prompt caching to consumers. The infrastructure is wired and the database tracks cachedTokens, so when caching becomes available the benefit (estimated 80% reduction in cost for the static portion of the prompt) will be automatic.

  • Email sequence automation, lead scoring, CRM integration. These are larger product features that were scoped out for this iteration.

  1. Rotate the OpenAI API key. The current key is present in earlier commits of the public repository. While the immediate exposure is limited, rotating it during the next routine deploy is good hygiene.
  2. Rotate the SMTP password. Same reasoning as above.
  3. Move the env file out of version control. A follow-up commit should convert env into .env.example (containing only placeholders) and add env to the .gitignore. The real .env is already gitignored, so this is the final step in eliminating secrets from the repository.
  4. Consider Sentry or equivalent error aggregation. The structured logger introduced in this iteration is the prerequisite. Wiring it to a hosted aggregation service is a half-day task and dramatically improves time-to-detection for production errors.
  5. Schedule a 30-day review of the conversation dashboard data. The analytics will be most useful after a month of real traffic. At that point we can identify the highest-impact funnel-stage improvements based on actual visitor behaviour.

Appendix A — Files Modified or Created

New files (10):

File Purpose
src/lib/csrf.ts CSRF token issuance and verification
src/lib/escapeHtml.ts HTML escaping helpers
src/lib/fileType.ts Magic-byte file-type detection
src/lib/logger.ts Structured JSON logger
src/lib/aiSessionId.ts Client-side session ID with privacy fallbacks
src/types/cms.ts Shared CMS type definitions
src/app/api/csrf/route.ts CSRF token issuance endpoint
src/app/api/health/route.ts Database readiness probe
src/app/hq-command/dashboard/conversations/page.tsx Analytics dashboard
src/app/hq-command/dashboard/conversations/[id]/page.tsx Conversation detail view
prisma/migrations/20260526180000_add_indexes_and_ai_telemetry/migration.sql Additive database migration
tests/ai/golden.test.mjs Regression test suite

Modified files (19):

File Change
src/lib/session.ts Fail-fast on missing or weak SESSION_SECRET
src/lib/rateLimit.ts Pluggable backend (in-memory or Redis)
src/app/actions/clientAuth.ts Same fail-fast as session.ts
src/app/api/chat/route.ts AI telemetry persistence and prompt cache key
src/app/api/consultation/route.ts CSRF + Zod + escapeHtml
src/app/api/public-upload/route.ts Magic-byte validation
src/components/layout/NavBar.tsx Event-driven session check
src/components/ai/SilentObserver.tsx Sends sessionId in transport body
src/components/ai/ConsultationScheduler.tsx Sends CSRF token in form post
src/components/sections/ApplicationsDashboard.tsx Strict types replace any[]
src/components/sections/GlobalOperations.tsx Strict types replace any[]
src/app/[locale]/parts/_components/AuthModal.tsx Dispatches session-changed event
src/app/hq-command/dashboard/page.tsx Tile for the new conversations dashboard
prisma/schema.prisma New models, indices, back-reference on OperationsSignal
nginx/conf.d/flux.conf Security headers, upload rate-limit zone
docker-compose.yml Health check, optional Redis env vars
package.json npm run test:ai script
env Documented SESSION_SECRET requirement and Redis variables

Removed files (1):

File Reason
src/components/sections/GlobalOperations_old.tsx Unreferenced legacy code (310 lines)

Appendix B — Quick Reference for the Sales Team

For team members who want to use the new analytics without engineering help:

  1. Log in to the HQ Command Center at https://www.rf-flux.com/hq-command.
  2. From the main dashboard, click the FluxAI Conversations tile (cyan sparkle icon, last position in the grid).
  3. The top four cards show overall numbers: total conversations, conversion rate, average messages, average tool calls.
  4. The two panels below show the funnel breakdown and the most common industries.
  5. The table lists the last fifty conversations. Click Open on any row to see the full transcript.
  6. Conversations that converted to a consultation ticket display the ticket ID in green at the top of the detail view.

The data updates in real time — no refresh needed between visits.


End of report.