SSR Deep Dive — Hydration, State Replay, and the Cookbook
Twelfth in a series about migrating from legacy architectures to a modern Nuxt 4 stack.

The Hydration Contract

In a server-rendered Vue application, SSR establishes a strict contract: the HTML generated on the server must match exactly what the client-side Vue runtime would render. During hydration, Vue attaches to the existing DOM instead of re-rendering it from scratch. If the server HTML and the client render differ, Vue reports a hydration mismatch.

In Vue 3 (and therefore in Nuxt 4), hydration mismatches are more than harmless warnings. They can lead to:
- Silent rendering bugs (server HTML stays, but event listeners bind to the wrong elements)
- Missing interactivity (Vue skips hydrating mismatched subtrees)
- Inconsistent state (server-rendered content shows one value, client state holds another)
These issues are tricky because they only appear under SSR — the same component may work perfectly in client-only mode.

A Taxonomy of Hydration Mismatches

Across large enterprise applications, most hydration issues fall into a handful of categories. Once you recognize the category, the fix usually becomes obvious.

Category 1: Non-Deterministic Values

Any value that differs between server and client at render time will cause a mismatch:
```
Server renders: <div id="input-a7f3b2">...</div>
Client renders: <div id="input-c9e1d4">...</div>
 ↑ different random value
```
Common culprits: Math.random(), Date.now(), crypto.randomUUID() used in templates or setup().

Fix: use useId(), which generates IDs that stay identical between the server render and the client render. It is built into Vue 3.5+ and auto-imported in Nuxt.
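To see why an ID derived from render order stays stable where a random value cannot, here is a minimal TypeScript sketch. It is not Vue's implementation of useId(); createIdGenerator, renderInput, and renderForm are hypothetical helpers that only mimic the idea.

```typescript
// Hypothetical helper: IDs depend only on call order, never on randomness or time.
function createIdGenerator(prefix: string): () => string {
  let counter = 0;
  return () => `${prefix}-${counter++}`;
}

// Hypothetical render function: produces markup for one labelled input.
function renderInput(nextId: () => string, label: string): string {
  const id = nextId();
  return `<label for="${id}">${label}</label><input id="${id}">`;
}

// Each render pass (server, then client) starts with a fresh generator,
// so the same component order yields the same IDs on both sides.
function renderForm(): string {
  const nextId = createIdGenerator("input");
  return [renderInput(nextId, "Email"), renderInput(nextId, "Password")].join("");
}

const serverHtml = renderForm();
const clientHtml = renderForm();
console.log(serverHtml === clientHtml); // true
```

Swapping the counter for Math.random() would make the two passes disagree, which is exactly the mismatch shown above.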

Category 2: Timing-Dependent State

If a child component mutates parent state during setup(), the execution order can differ between server and client:
```
sequenceDiagram
 box Server
 participant SParent as Parent (server)
 participant SChild as Child (server)
 end
 box Client
 participant CParent as Parent (client)
 participant CChild as Child (client)
 end

 Note over SParent: 1. Parent setup()
 SParent->>SParent: setup()
 Note over SParent: 2. Parent renders
 SParent->>SParent: render()
 Note over SChild: 3. Child setup() → emits to parent (too late for render)
 SChild->>SParent: emit() changes parent state

 Note over CParent: 1. Parent setup()
 CParent->>CParent: setup()
 Note over CChild: 2. Child setup() → emits to parent
 CChild->>CParent: emit() changes parent state
 Note over CParent: 3. Parent renders with new state
 CParent->>CParent: render()

 Note over SParent,CParent: Different HTML on server vs client
```
Fix: move shared state into useState() so it is initialized once, independent of component execution order.
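The idea behind this fix can be sketched without Nuxt. In the snippet below, sharedState is a hypothetical stand-in for a keyed state store: the initializer runs once per key, so the value is the same no matter which component asks first. The real useState() also serializes the value into the payload, which this sketch leaves out.

```typescript
// Hypothetical stand-in for a per-request state store.
type StateStore = Map<string, unknown>;

// Returns the value stored under `key`, running `init` only if the key is new.
function sharedState<T>(store: StateStore, key: string, init: () => T): T {
  if (!store.has(key)) {
    store.set(key, init());
  }
  return store.get(key) as T;
}

interface PanelState {
  title: string;
}

const initPanel = (): PanelState => ({ title: "Checkout" });

// Order A: the parent reads first, the child second.
const storeA: StateStore = new Map();
const parentFirst = sharedState(storeA, "panel", initPanel).title;
sharedState(storeA, "panel", initPanel);

// Order B: the child reads first, the parent second.
const storeB: StateStore = new Map();
sharedState(storeB, "panel", initPanel);
const parentSecond = sharedState(storeB, "panel", initPanel).title;

console.log(parentFirst === parentSecond); // true
```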

Category 3: Teleports

<Teleport> content is rendered inline in the component tree on the server, but moved to <body> on the client. The DOM structure no longer matches.

Fix: wrap teleported content in <ClientOnly> so it is rendered exclusively on the client.
```
flowchart LR
 subgraph SSRTree["SSR Tree"]
 A["Component A (includes Teleport target)"]
 B["Teleported content (rendered inline on server)"]
 A --> B
 end

 subgraph HydratedDOM["Hydrated DOM"]
 A2["Component A (no inline teleported content)"]
 B2["Teleported content moved under body element"]
 end

 SSRTree -->|server HTML| HydratedDOM
 classDef mismatch fill:#ffe0e0,stroke:#ff5555,stroke-width:1px;
 class B,B2 mismatch;
```
Category 4: Client-Side State Initialization

If a reactive value is false during SSR but becomes true during hydration (for example, a dialog’s isOpen toggled in mounted), CSS classes and markup diverge:
```
Server: <div class="panel panel-closed"> ← isOpen = false
Client: <div class="panel panel-open"> ← isOpen = true (mounted set it)
```
Fix: ensure the initial value matches the SSR state. Use watch or nextTick to change state after hydration completes, not during.

Category 5: Async Composable Race Conditions

When multiple composables use useAsyncData and depend on each other, the resolution order can differ between server and client. Computed values built on these async results may pass through different intermediate states and yield divergent HTML.

Fix: enforce top-down data flow from useState. Avoid computed values that depend on partially resolved async state.
```
flowchart TB
 subgraph Server
 S1["useAsyncData A resolves first"]
 S2["useAsyncData B resolves second"]
 SC["Computed C based on A+B → Server HTML"]
 S1 --> SC
 S2 --> SC
 end

 subgraph Client
 C1["useAsyncData B resolves first"]
 C2["useAsyncData A resolves second"]
 CC1["Computed C (intermediate) based only on B"]
 CC2["Computed C (final) based on A+B → Client DOM"]
 C1 --> CC1
 C2 --> CC2
 end

 classDef warn fill:#fff4e5,stroke:#ff9900,stroke-width:1px;
 class SC,CC1,CC2 warn;
```
The Hydration Cookbook Pattern

Capturing hydration issues in a structured way — symptom, root cause, fix — builds a shared knowledge base that dramatically reduces debugging time in any sizeable Nuxt application. A practical approach is to keep a “Hydration Issues Cookbook” with entries like:
```
flowchart TB
 Issue["HYDRATION ISSUE: Random IDs in Templates"]

 Symptom["Symptom: #quot;Hydration node mismatch#quot; &lt;input id=#quot;...#quot;&gt; differs"]
 Cause["Root Cause: Math.random() / crypto.randomUUID() in setup() or template"]
 Fix["Fix: Use useId() for deterministic IDs"]
 Prevention["Prevention: ESLint rule — no Math.random() in setup/template"]

 Issue --> Symptom
 Issue --> Cause
 Issue --> Fix
 Issue --> Prevention

 classDef header fill:#e0f2ff,stroke:#1e88e5,stroke-width:1px;
 classDef box fill:#ffffff,stroke:#90a4ae,stroke-width:1px;
 class Issue header;
 class Symptom,Cause,Fix,Prevention box;
```
Each entry describes a pattern, not a one-off incident. Over time, teams learn to recognize categories instead of chasing isolated bugs.
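A cookbook entry maps naturally onto a small data structure. The sketch below is one possible shape, not a prescribed format: the CookbookEntry interface and the findEntries helper are hypothetical, the sample warning text is illustrative, and the entry content is taken from the example above.

```typescript
interface CookbookEntry {
  title: string;
  symptom: string; // text that typically appears in the warning or bug report
  rootCause: string;
  fix: string;
  prevention: string;
}

const cookbook: CookbookEntry[] = [
  {
    title: "Random IDs in Templates",
    symptom: "Hydration node mismatch",
    rootCause: "Math.random() / crypto.randomUUID() in setup() or template",
    fix: "Use useId() for deterministic IDs",
    prevention: "ESLint rule: no Math.random() in setup/template",
  },
];

// Returns every entry whose symptom text occurs in the given warning message.
function findEntries(warning: string, entries: CookbookEntry[]): CookbookEntry[] {
  const haystack = warning.toLowerCase();
  return entries.filter((entry) => haystack.includes(entry.symptom.toLowerCase()));
}

// Illustrative warning text, not copied from a real log.
const matches = findEntries("Hydration node mismatch on <input id=...>", cookbook);
console.log(matches.map((entry) => entry.fix));
```

Keeping entries as data rather than free text makes it easy to search them from a CLI or a debugging tool.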

SSR Event Replay

In large modular applications, events emitted during SSR still need to reach client-side listeners. The usual SSR lifecycle creates a timing gap:
```
sequenceDiagram
 box Server
 participant S as Cart module (server)
 end
 box Client
 participant C as Funnel module (client)
 end

 Note over S: Cart module loads
 S->>S: emit cart:loaded

 Note over C: Hydration begins
 C->>C: subscribe to cart:loaded

 Note over S,C: Event is lost — no client listener existed when server emitted it
```
The server fires events while rendering, but no client listeners exist yet. By the time they subscribe, those events are gone.

The Solution: useState as an Event Buffer

During SSR, events are serialized into useState, which is automatically transferred from server to client via the Nuxt payload. After hydration, the event bus reads the stored events and replays them through standard RxJS subjects.
```
sequenceDiagram
 box Server
 participant S as Cart module (server)
 participant ST as useState (SSR store)
 end
 box Client
 participant CT as useState (hydrated payload)
 participant B as Event bus (RxJS)
 participant L as Listeners
 end

 Note over S: Cart module loads
 S->>S: emit cart:loaded
 S->>ST: push cart:loaded into useState buffer

 ST-->>CT: state transfer with events

 Note over B,L: After hydration
 L->>B: subscribe to cart:loaded
 B->>CT: read buffered events
 CT-->>B: cart:loaded events
 B-->>L: replay cart:loaded → listeners fire ✓
```
Replay is automatic. Module authors do not need to care whether an event fired during SSR or on the client — subscribers receive it either way.
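The buffering and replay flow can be sketched in plain TypeScript. This is a simplified model under stated assumptions, not the implementation described here: ReplayBus is a hypothetical class, an ordinary array stands in for the useState payload, a JSON round trip stands in for the server-to-client transfer, and simple callbacks replace RxJS subjects.

```typescript
interface BufferedEvent {
  name: string;
  payload: unknown;
}

type Listener = (payload: unknown) => void;

// Minimal event bus. `buffer` plays the role of the state that travels in the payload.
class ReplayBus {
  private readonly listeners = new Map<string, Listener[]>();
  private readonly buffer: BufferedEvent[];
  private readonly isServer: boolean;

  constructor(buffer: BufferedEvent[], isServer: boolean) {
    this.buffer = buffer;
    this.isServer = isServer;
  }

  emit(name: string, payload: unknown): void {
    if (this.isServer) {
      // No client listener exists yet, so record the event for later replay.
      this.buffer.push({ name, payload });
    }
    for (const listener of this.listeners.get(name) ?? []) {
      listener(payload);
    }
  }

  subscribe(name: string, listener: Listener): void {
    this.listeners.set(name, [...(this.listeners.get(name) ?? []), listener]);
    if (!this.isServer) {
      // On the client, deliver events that fired during SSR.
      for (const event of this.buffer) {
        if (event.name === name) listener(event.payload);
      }
    }
  }
}

// Server: the cart module emits while rendering.
const serverBuffer: BufferedEvent[] = [];
new ReplayBus(serverBuffer, true).emit("cart:loaded", { items: 2 });

// Transfer: the buffer is serialized and revived on the client.
const transferred: BufferedEvent[] = JSON.parse(JSON.stringify(serverBuffer));

// Client: a listener that subscribes after hydration still receives the event.
const received: unknown[] = [];
new ReplayBus(transferred, false).subscribe("cart:loaded", (payload) => {
  received.push(payload);
});
console.log(received.length); // 1
```

Because events pass through serialization, their payloads must be plain serializable data.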

Debugging Hydration Issues

Hydration warnings identify where the DOM diverged, but rarely why. Vue points to a specific DOM node, while the real cause might be several layers up in the tree or hidden in composables.

Strategy 1: Binary Elimination

Wrap parts of the page in <ClientOnly> to localize the mismatch. If wrapping section A in <ClientOnly> makes the warning disappear, the bug is in that section. Then narrow down progressively within it.
```
flowchart TB
 Page[Page Component]

 A["Section A (suspect)"]
 B[Section B]
 C[Section C]

 Page --> A
 Page --> B
 Page --> C

 A2["Section A wrapped in &lt;ClientOnly&gt;"]
 Page -. test step .-> A2

 classDef suspect fill:#fff4e5,stroke:#ff9800;
 classDef normal fill:#ffffff,stroke:#90a4ae;
 class A suspect;
 class B,C normal;
```
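The narrowing step is a binary search. The sketch below assumes that exactly one section causes the mismatch and that the section list is not empty. The Probe callback is hypothetical: in practice it stands for "wrap these sections in <ClientOnly>, reload, and check whether the warning is still there".

```typescript
// Hypothetical probe: returns true if the hydration warning still appears while
// the given sections are wrapped in <ClientOnly> (the culprit is not among them).
type Probe = (clientOnlySections: string[]) => boolean;

// Narrows down a single mismatching section by halving the candidate list.
function findCulprit(sections: string[], warningPersists: Probe): string {
  let candidates = sections;
  while (candidates.length > 1) {
    const half = candidates.slice(0, Math.ceil(candidates.length / 2));
    // If excluding `half` from SSR silences the warning, the culprit is inside it.
    candidates = warningPersists(half) ? candidates.slice(half.length) : half;
  }
  return candidates[0];
}

// Simulated page in which "Section C" is the one that mismatches.
const culprit = "Section C";
const probe: Probe = (wrapped) => !wrapped.includes(culprit);

console.log(findCulprit(["Section A", "Section B", "Section C", "Section D"], probe));
// Section C
```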
Strategy 2: SSR-Only Rendering

Disable client-side hydration entirely (ssr: true with no client JavaScript) and compare:
- The raw server HTML
- The HTML the client would render
This isolates state differences and logic that only runs on the client.
```
flowchart LR
 SSR["SSR-only HTML (no client JS)"]
 ClientRender["Client-only render (same route, mocked data)"]

 SSR --> Diff[Diff DOM + state]
 ClientRender --> Diff

 Diff --> Cause["Identify diverging values and client-only logic"]
```
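Once both HTML strings are available, finding the first point of divergence can be automated. The sketch below uses a hypothetical firstDivergence helper with a deliberately naive regex tokenizer. It is good enough for spotting a differing tag or text run, but it is not a real HTML parser.

```typescript
interface Divergence {
  index: number;
  server: string;
  client: string;
}

// Splits markup into tags and text runs, dropping whitespace-only tokens.
function tokenize(html: string): string[] {
  return (html.match(/<[^>]+>|[^<]+/g) ?? [])
    .map((token) => token.trim())
    .filter((token) => token.length > 0);
}

// Returns the first token at which the two documents differ, or null if they match.
function firstDivergence(serverHtml: string, clientHtml: string): Divergence | null {
  const server = tokenize(serverHtml);
  const client = tokenize(clientHtml);
  const length = Math.max(server.length, client.length);
  for (let index = 0; index < length; index++) {
    if (server[index] !== client[index]) {
      return {
        index,
        server: server[index] ?? "(missing)",
        client: client[index] ?? "(missing)",
      };
    }
  }
  return null;
}

const divergence = firstDivergence(
  '<div class="panel panel-closed"><p>Cart</p></div>',
  '<div class="panel panel-open"><p>Cart</p></div>',
);
console.log(divergence);
```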
Strategy 3: AI-Assisted Debugging

An AI assistant connected to both the application’s MCP server (for server-side state) and the browser’s DevTools (for client-side state) can automatically diff the two:
```
Developer: "The checkout form shows different content
 after hydration. Help me debug this."

AI Assistant:
 1. Queries Pinia store via MCP → gets server-side cart state
 2. Inspects browser DOM via DevTools → gets client-side rendering
 3. Compares the two → identifies the diverging value
 4. Traces the value to a composable with client-only initialization
 5. Suggests fix: move initialization to useState
```
```
flowchart TB
 Dev[Developer]
 AI[AI Assistant]
 MCP["MCP Server (server-side state)"]
 DevTools["Browser DevTools (client-side state)"]
 Diff[State + DOM diff]
 Fix["Suggested fix (e.g., move init to useState)"]

 Dev -->|debug request| AI
 AI --> MCP
 AI --> DevTools
 MCP --> AI
 DevTools --> AI
 AI --> Diff
 Diff --> Fix
 Fix --> Dev
```
This pattern is already used in practice in sophisticated internal tooling — for example, a debug-chatbot-style module that provides exactly this capability (covered in detail in Article 14 of this series).

The hydrate-never Directive

Not all server-rendered content needs hydration. Static sections — text blocks, decorative images, layout wrappers — never change on the client. Hydrating them wastes CPU and inflates Total Blocking Time (TBT).

A custom directive marks elements that should be skipped during hydration:
```
With hydrate-never:
 Server renders: <div v-hydrate-never class="static-banner">
 <h2>Welcome to Our Store</h2>
 Thousands of satisfied customers...
 </div>

 Client: Vue skips this subtree during hydration
 → No patch() calls
 → No reactive tracking
 → Zero TBT contribution
```
```
flowchart LR
 subgraph Render
 S["Server render &lt;div v-hydrate-never&gt;..."]
 C["Client hydration Vue sees v-hydrate-never"]
 end

 S -->|HTML payload| C

 C -->|skip subtree| NoPatch["No patch() calls"]
 C -->|skip reactivity| NoReactive[No reactive tracking]
 C -->|perf| TBT["Zero TBT contribution for this subtree"]

 classDef static fill:#e0f7fa,stroke:#00acc1;
 class S,C,NoPatch,NoReactive,TBT static;
```
On pages with large static sections (landing pages, editorial content, catalog content), this can cut Total Blocking Time by 30–50%.

Lessons Learned

Hydration is a contract, not a feature

Treating hydration as “it works or it doesn’t” leads to brittle apps. Treating it as a contract — server and client must agree on every rendered value — leads to defensive patterns that prevent mismatches by design.

The five categories cover 95% of real-world mismatches

Random values, timing-dependent state, teleports, client-side initialization, and async race conditions. If you know these five, you can diagnose almost any hydration issue you encounter.

Event replay is essential for SSR module architectures

In modular SSR systems where modules communicate via events, event replay is non-negotiable. Without it, SSR-only events vanish on the client, creating subtle, production-only bugs.

A cookbook is more valuable than documentation

High-level advice (“avoid non-deterministic values”) is less actionable than concrete patterns (“this code causes this bug; here is the fix”). A living cookbook that evolves with new patterns is one of the most effective knowledge tools for hydration issues.

What’s Next
- Article 10: Memory, Stability, and PM2 — Running a Long-Lived Node.js Server — What happens when V8 runs for days and how to keep it stable.
- Article 11: Multi-Environment Infrastructure — Azure Container Apps and the Configuration System — Managing three environments with generated configuration.
- Article 12: Security in a Nuxt SSR App — CSRF, Azure AD, CSP, and More — The security layers that protect a server-rendered application.
Munir Husseini is a software architect specializing in full-stack TypeScript, .NET, and cloud-native architectures.
June 6, 2026

The Full Picture — What the New Concept Delivers

Twentieth and final article in a series about migrating from legacy architectures to a modern Nuxt 4 stack.

From Parts to Whole

The previous fifteen articles describe individual pieces — the GraphQL gateway, code generators, performance, infrastructure, security. This article brings them together and answers the question that matters to decision-makers: what does the complete system deliver?

The Architecture at a Glance

```
flowchart TB
    A["Azure Front Door<br/>(CDN + WAF)"] --> B[Azure Container Apps Environment]
    B --> C["Nginx Proxy<br/>TLS, Image cache, OTel spans"]
    C --> D["Nuxt 4<br/>SSR + GQL Gateway<br/>SSR, GraphQL stitching<br/>Page cache, Redis"]
    D --> E[".NET API<br/>Pricing, Orders, Users, Validation"]
    D --> F["Redis<br/>(per-env)"]
    B --> G["External Services<br/>Headless CMS · Application Insights<br/>Azure Key Vault · Azure AD"]
```

Four containers per environment. One frontend language (TypeScript). One data schema (GraphQL). One module system (Nuxt modules). One configuration generator (YAML → JSON + Bicep).

The Five Pillars

Five architectural pillars, each delivering measurable value in a large enterprise application.

Pillar 1: Unified Data Layer (GraphQL Schema Stitching)

What it delivers: One API endpoint for all data sources. The frontend never needs to know which backend produced which field.

Before	After
3–5 REST calls per page	1 GraphQL query per page
Manual data joining in frontend code	Automatic via `@delegate` directive
Per-endpoint types, manually maintained	Generated from unified schema
Custom error handling per API	One Apollo error link

Pillar 2: Total Automation (Code Generation)

What it delivers: Developers write declarations (GraphQL queries, YAML translations, GraphQL input types). The system generates everything else.

```
flowchart LR
    A[What developers write] --> B[What is generated]
    A1[.graphql files] --> B1["Typed composables<br/>(auto-imported)"]
    A2[YAML translation files] --> B2["Typed t.* proxy chain<br/>(auto-imported)"]
    A3[GraphQL input types] --> B3[Form field metadata + validation]
    A4[CMS content model] --> B4[Vue component stubs + types]
    A5[Module scaffold command] --> B5[Complete module structure]
```

Roughly 40–60% of the TypeScript code is generated — not boilerplate, but correct implementations derived from authoritative schemas.

Pillar 3: SSR Performance Stack

What it delivers: Near-instant page loads with near-perfect Lighthouse scores.

```
flowchart TB
    S[Performance Stack]
    S --> L1["SSR<br/>Content visible immediately"]
    S --> L2["Multi-Tier Cache<br/>Sub-ms data retrieval"]
    S --> L3["Deferred Hydration<br/>No render-blocking JS"]
    S --> L4["Same-Origin Proxy<br/>−594ms LCP"]
    S --> L5["Font Strategy<br/>Zero CLS"]
    S --> L6["Bot Detection<br/>Clean audit scores"]
    S --> L7["Manual Chunks<br/>Only needed JS per page"]
    S --> R["Combined Result<br/>Lighthouse 97+ (mobile)"]
```

Pillar 4: Production-Grade Operations

What it delivers: Elastic infrastructure, zero-downtime deployments, full observability, and instant rollback.

Capability	Implementation	Article
Elastic scaling	Container Apps auto-scale (5–20 replicas)	11
Zero-downtime deploy	Blue-green with traffic switching	11
Per-branch environments	Automated feature deployments	11
Full request tracing	W3C Trace Context across all services	13
Security	CSRF + Azure AD + runtime CSP	12
Instant rollback	Traffic switch to previous revision	11

Pillar 5: Developer Experience

What it delivers: Fast feedback loops, strong type safety, AI-assisted debugging, and clear modular boundaries.

Aspect	Experience
New data source	Write `.graphql` file → composable auto-imported
New translation	Add YAML key → `t.section.key` typed and available
New module	Run scaffold → complete structure created
Debugging	AI assistant with 30+ live inspection tools
Architecture understanding	`AGENTS.md` + module READMEs + consistent patterns

For Decision-Makers: The Numbers

Performance

These example metrics illustrate typical gains when moving a legacy enterprise frontend to a modern SSR + GraphQL stack:

Metric	Legacy	New	Improvement
Median response time	2,618 ms	165 ms	15.9× faster
Error rate	3.91%	0.09%	97% lower
Lighthouse Performance (mobile)	~50	97+	+47 points
LCP	> 5 s	< 2.5 s	Google “good”

Capacity and Cost

Metric	Legacy	New	Improvement
Max tested capacity	~99 RPM	494+ RPM	5× more
Infrastructure model	Fixed (always on)	Elastic (pay-per-use)	~40% cost reduction at average load
Scaling	Manual (operations team)	Automatic (config-driven)	Zero manual intervention

Development Velocity

Metric	Legacy	New	Improvement
New API integration	Write REST client + types + mapper	Write `.graphql` file	~90% less glue code
New form	Backend + frontend sync + manual testing	Schema-driven, auto-validated	No manual field wiring
New CMS content type	Manual component + data fetching + types	Generated component stub + typed query	No boilerplate
New translation key	Add key, hope it matches at runtime	YAML key → typed `t.section.key`	Compile-time checked
New module	Create folders, wire routing, exports, types	`yarn plop` → complete scaffold	Consistent structure in seconds
Type safety coverage	Partial (hand-written)	Complete (generated)	Zero type drift

For Architects: The Design Principles

Five design principles unify the decisions across all 15 articles for large-scale web applications.

1. Generate, Don’t Write

If code can be derived from an authoritative schema (GraphQL, CMS model, YAML translations), generate it. Hand-written code drifts; generated code stays correct.

2. One Source of Truth Per Concern

Data shape → GraphQL schema
Validation rules → Backend model annotations
URL structure → CMS entries
Translations → YAML files
Infrastructure configuration → YAML values files

No concern has two sources. Everything is defined once and consumed many times.

3. Eliminate Work, Don’t Optimize It

The biggest performance gains came from removing work: modulepreload hints, cross-origin connections, redundant CMS queries. The biggest productivity gains came from removing boilerplate. Subtraction outperforms optimization.

4. Modules as Boundaries

Folders suggest organization. Modules enforce it. With 35+ modules, each owning its own API surface, the codebase has real boundaries.

5. Measure Everything, Assume Nothing

Architecture claims without data are opinions. Load test results, Lighthouse scores, cache hit rates, and response time distributions turn them into evidence. The production configuration is shaped by measurement.

Lessons Learned

The whole is greater than the sum of the parts

No single technique produces a 15.9× improvement. Stitched GraphQL removes multi-source joins, caching eliminates redundant fetches, deferred hydration removes render-blocking JS, and elastic infrastructure eliminates over-provisioning. Each targets a different bottleneck; together they transform the system.

Architecture must allow change

The architecture is designed upfront — but designed for adaptability. Technology and requirements evolve, and the system must evolve with them. An architecture that requires wholesale rewrites accumulates debt until it becomes unmaintainable.

Four mechanisms make change safe:

Loose coupling via modules. Dozens of independent modules, each owning a vertical slice. Replacing authentication, tracking, or forms never cascades into unrelated code. A module can be deleted and rebuilt without touching the rest.
Code generation from schemas. Generated code follows the schema it came from. When a data source changes or a service is replaced, regeneration produces correct integration code. No hand-written adapter layer drifting from reality.
Schema-driven contracts (GraphQL). The frontend depends on a unified schema, not individual backends. A service can be rewritten, split, or replaced — as long as it serves the same fields, no frontend changes are required. New clients (mobile, internal tools) consume the same gateway without backend modifications.
Infrastructure as configuration. The same YAML generates manifests for Container Apps, AKS, and App Service. Switching platforms or adding environments is a configuration change, not a re-architecture.

In practice: adding a data source is one subgraph and a stitch directive. Changing rendering strategy is a route rule. Replacing a backend service is transparent to the frontend. Swapping a module is contained to its directory.

Things that change often (services, modules, data sources, infrastructure) are easy to change. The few core decisions that are hard to change (framework, data protocol) are chosen carefully and affect only their own layer.

Developer experience is a force multiplier

Fast feedback loops (auto-generated types, branch environments, AI debugging) do not just improve morale — they improve the system. When adding a data source takes minutes instead of hours, teams integrate more data. When debugging takes minutes instead of days, bugs get fixed faster.

Developer hardware is not optional

This architecture demands fast machines. The dev stack runs a Nuxt server watching thousands of files, a GraphQL server, code generators, and TypeScript language services analyzing the module graph — simultaneously. When hardware falls short, HMR becomes sluggish, type checking lags, and code generation feels blocking.

Windows is particularly affected. Node.js file watching and module resolution are measurably slower on Windows with NTFS than on macOS or Linux. Teams on Windows need WSL2, faster disks, or higher-spec hardware.

A trade-off worth noting: the architecture optimizes for velocity given adequate hardware. The minimum spec is higher than simpler stacks. Budget accordingly.

What’s Next

The remaining articles dive deeper into specific technical patterns:

Article 17: The @delegate Directive Deep Dive — Cross-subgraph field resolution in detail
Article 18: Building a Headless Design System — The Compose Pattern — Separating style logic from templates
Article 19: A/B Testing at SSR Level — Cookie-based variant selection during server rendering
Articles 20–27: Content preview, logging, module development, conditional rendering, image proxying, observability, reactive filters, and deferred hydration


Load Testing Results — 15× Faster, 5× More Capacity

Nineteenth in a series about migrating from legacy architectures to a modern Nuxt 4 stack.

Architecture Decisions Have Consequences — Measure Them

Architecture decisions accumulate, and their combined effect only becomes visible under real load.

Before production, a large enterprise application was load-tested with production-equivalent patterns, not synthetic traffic. k6 replayed a model derived from real production logs: 20 pages, weighted by actual traffic share.

The Headline Numbers

Metric	Legacy System	New System	Change
Median response time	2,618 ms	165 ms	15.9× faster
Error rate (1× prod load)	3.91%	0.09%	97% lower
Max tested capacity	~99 RPM	494+ RPM	5× more
Infrastructure	3× fixed VMs (24 vCPU, 96 GB)	Auto-scaled containers	Elastic
Lighthouse Performance (mobile)	~50	97+	Near-perfect

A 2.6-second median means half of all requests took 2.6 seconds or longer. A 165 ms median means half of all requests complete within 165 ms, roughly the duration of a blink.
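The derived columns in the table follow directly from the raw values. As a quick check, the arithmetic can be reproduced in a few lines (the variable names are illustrative; the numbers come from the table above):

```typescript
const legacy = { medianMs: 2618, errorRatePct: 3.91, maxRpm: 99 };
const modern = { medianMs: 165, errorRatePct: 0.09, maxRpm: 494 };

// Speedup: how many times faster the new median is.
const speedup = legacy.medianMs / modern.medianMs;

// Relative reduction in error rate, in percent.
const errorReductionPct = (1 - modern.errorRatePct / legacy.errorRatePct) * 100;

// Capacity factor: tested requests per minute, new versus legacy.
const capacityFactor = modern.maxRpm / legacy.maxRpm;

console.log(speedup.toFixed(1)); // 15.9
console.log(errorReductionPct.toFixed(1)); // 97.7
console.log(capacityFactor.toFixed(1)); // 5.0
```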

Test Methodology

Traffic Pattern

The load test replayed production-equivalent traffic using k6’s HTTP module:

```
pie showData
  title Traffic Distribution (top 10 pages)
  "Homepage (28%)" : 28
  "Product Overview (19%)" : 19
  "Product Details (14%)" : 14
  "Checkout Step 1 (9%)" : 9
  "FAQ (7%)" : 7
  "Contact (6%)" : 6
  "About (5%)" : 5
  "Legal / Imprint (4%)" : 4
  "Blog Overview (3%)" : 3
  "Other (11 pages) (5%)" : 5
```

Test Types

Two test types were run:

Replay Test — constant load at 1× production traffic (99 RPM) for 30 minutes
Ramp Test — linear ramp from 1× to 5× production traffic over 30 minutes
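The two building blocks of such a test, a weighted page mix and a linear ramp, can be sketched independently of k6. In the snippet below the URL paths are hypothetical (only the page types and their shares are given above), and `pickPage` and `rampRpm` are illustrative helpers, not part of the k6 API.

```typescript
interface PageWeight {
  path: string; // hypothetical URL path
  weight: number; // traffic share in percent
}

const trafficMix: PageWeight[] = [
  { path: "/", weight: 28 },
  { path: "/products", weight: 19 },
  { path: "/products/detail", weight: 14 },
  { path: "/checkout/step-1", weight: 9 },
  { path: "/faq", weight: 7 },
  { path: "/contact", weight: 6 },
  { path: "/about", weight: 5 },
  { path: "/legal", weight: 4 },
  { path: "/blog", weight: 3 },
  { path: "/other", weight: 5 },
];

// Picks a page for a roll in [0, 1), proportionally to its weight.
function pickPage(mix: PageWeight[], roll: number): string {
  const total = mix.reduce((sum, page) => sum + page.weight, 0);
  let threshold = roll * total;
  for (const page of mix) {
    threshold -= page.weight;
    if (threshold < 0) return page.path;
  }
  return mix[mix.length - 1].path;
}

// Target request rate for the ramp test: linear from 1x to 5x over 30 minutes.
function rampRpm(minute: number, baseRpm = 99, durationMin = 30, peakFactor = 5): number {
  const progress = Math.min(Math.max(minute / durationMin, 0), 1);
  return baseRpm * (1 + (peakFactor - 1) * progress);
}

console.log(pickPage(trafficMix, Math.random()));
console.log(rampRpm(0), rampRpm(15), rampRpm(30)); // 99 297 495
```

In a real k6 script the same weights would drive which URL each virtual user requests, and the ramp would be expressed as load stages.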

Replay Test: 1× Production Load

The replay test answers: “Can the new system handle current production traffic?”

```
flowchart TB
  title["Replay Test Results (1× production load = 99 RPM)"]

  subgraph Legacy_System["Legacy System"]
    L_Median["Median RT: 2,618 ms"]
    L_P95["P95 RT: 8,500+ ms"]
    L_Error["Error Rate: 3.91%"]
    L_RPM["Requests/min: 99"]
    L_Status["Status: Degraded"]
  end

  subgraph New_System["New System"]
    N_Median["Median RT: 168 ms"]
    N_P95["P95 RT: 450 ms"]
    N_Error["Error Rate: 0.09%"]
    N_RPM["Requests/min: 99"]
    N_Status["Status: Healthy"]
  end

  L_Median --- N_Median
  L_P95 --- N_P95
  L_Error --- N_Error
  L_RPM --- N_RPM
  L_Status --- N_Status
```

The new system handles production traffic with about 94% lower median response time and 97% fewer errors. A P95 of 450 ms means 95% of requests complete in under half a second, well below the legacy system’s 2,618 ms median.

Ramp Test: Finding the Ceiling

The ramp test answers: “How far can we push it before it breaks?”

```
xychart-beta
  title "Ramp Test Results (1× → 5× production load)"
  x-axis "Load (× production)" [1, 2, 3, 4, 5]
  y-axis "Response Time (ms)"
  line [2618, 4000, 5000, 6000, 8800]
  line [165, 165, 165, 165, 165]
```

The median stayed flat at 165 ms even at 5× load. There was no linear degradation: additional load did not increase per-request latency.

The P95 degraded to 8.8 seconds at 5×, driven by scale-out lag. New replicas needed time to start; once they were online, they matched existing replica performance.
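A flat median next to a degraded P95 is what a small share of slow requests produces. The sketch below uses illustrative data, not the measured samples: 90 requests served by warm replicas and 10 that landed on a replica still starting up. The `percentile` helper is a hypothetical nearest-rank implementation.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Illustrative data only: 90 warm responses and 10 delayed by scale-out lag.
const warm = Array.from({ length: 90 }, () => 165);
const delayed = Array.from({ length: 10 }, () => 8800);
const samples = [...warm, ...delayed];

console.log(percentile(samples, 50)); // 165
console.log(percentile(samples, 95)); // 8800
```

The median does not move until more than half of all requests are affected, which is why it stays flat while the tail grows.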

The Right-Sizing Experiment

Finding the minimum viable resource allocation is a critical part of load testing. Four configurations were tested:

Config	vCPU	RAM	PM2 Workers	V8 Heap	Result
#1	4	8 GiB	3	2048 MB	✅ Stable, over-provisioned
#2	2	4 GiB	2	1536 MB	✅ Stable, efficient
#3	1	2 GiB	2	1024 MB	❌ Cascading failures
#4	2	4 GiB	2	1536 MB	✅ Validated (6× load)

```
flowchart LR
  A["Config #1: 4 vCPU / 8 GiB / 3 workers / 2048 MB heap"] -->|Over-provisioned| B["Config #2: 2 vCPU / 4 GiB / 2 workers / 1536 MB heap"]
  B -->|Right-size further| C["Config #3: 1 vCPU / 2 GiB / 2 workers / 1024 MB heap"]
  C -->|Cascading failures| D["Config #4: 2 vCPU / 4 GiB / 2 workers / 1536 MB heap (Validated at 6× load)"]
```

The Failed Right-Sizing (Config #3)

Reducing to 1 vCPU / 2 GiB caused a cascade:

```
sequenceDiagram
  participant L as Load Generator
  participant R1 as Replica 1
  participant R2 as Replica 2
  participant R3 as Replica 3
  participant HP as Health Probe

  Note over R1,R3: Failure Cascade at 1 vCPU / 2 GiB

  L->>R1: t=0s: Traffic (99 RPM)
  Note over R1: Memory: 1,791 / 2,048 MB (87.5%)

  R1-->>R1: t=10s: V8 GC stalls<br/>Event loop blocked
  HP->>R1: t=15s: Health probe
  HP-->>HP: Timeout
  HP->>R1: Mark unhealthy → restart

  Note over R2: t=20s: Absorbs 2× traffic
  L->>R2: Increased traffic

  R2-->>R2: t=25s: Memory spike → restart
  Note over R3: t=30s: Overloaded → restart

  Note over R1,R3: t=35s: All replicas restarting
  Note over L: t=45s: Zero capacity for ~10 seconds<br/>→ 5% error rate
```

V8 needs breathing room. At 87.5% heap utilization, GC pauses block the event loop long enough for health probes to time out. The minimum viable compute here was 2 vCPU / 4 GiB, though the exact threshold depends on application complexity, page weight, and caching. The principle is general; the numbers are specific.
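One way to look at the tested configurations is to compare the combined V8 heap limits of all PM2 workers with the container’s memory. The sketch below does that arithmetic with the values from the table, treating the table’s MB heap values as MiB (an assumption, since `--max-old-space-size` is specified in MiB). It is an observation derived from the table, not a diagnosis stated by the test results, and `heapBudgetShare` is a hypothetical helper.

```typescript
interface ContainerConfig {
  name: string;
  memoryMiB: number; // container memory limit
  workers: number; // PM2 workers per container
  heapLimitMiB: number; // --max-old-space-size per worker
}

const configs: ContainerConfig[] = [
  { name: "#1", memoryMiB: 8192, workers: 3, heapLimitMiB: 2048 },
  { name: "#2", memoryMiB: 4096, workers: 2, heapLimitMiB: 1536 },
  { name: "#3", memoryMiB: 2048, workers: 2, heapLimitMiB: 1024 },
];

// Share of container memory that the workers' heap limits may claim in total.
function heapBudgetShare(config: ContainerConfig): number {
  return (config.workers * config.heapLimitMiB) / config.memoryMiB;
}

for (const config of configs) {
  console.log(config.name, `${(heapBudgetShare(config) * 100).toFixed(0)}%`);
}
// #1 75%
// #2 75%
// #3 100%
```

In the two stable configurations the heap limits leave a quarter of the container memory for everything outside the V8 heap. In config #3 they add up to the entire container.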

The Validated Production Configuration

The configuration that passed every k6 threshold (k6 exited with code 0) at 6× production load:

```
flowchart TB
  subgraph SPA["SPA Containers"]
    SPA_CPU["CPU: 2 vCPU"]
    SPA_MEM["Memory: 4 GiB"]
    SPA_PM2["PM2 Workers: 2 per container"]
    SPA_HEAP["V8 Heap: 1536 MB (--max-old-space-size=1536)"]
    SPA_MIN["Min Replicas: 5"]
    SPA_MAX["Max Replicas: 20"]
  end

  subgraph API["API Containers"]
    API_CPU["CPU: 0.5 vCPU"]
    API_MEM["Memory: 1 GiB"]
    API_MIN["Min Replicas: 3"]
    API_MAX["Max Replicas: 20"]
  end

  subgraph Results["Result at 6× load"]
    RES_MED["Median RT: 165 ms"]
    RES_ERR["Error rate: 0.82%"]
    RES_CPU["CPU peak: 12% of allocation"]
    RES_MEM["Memory peak: 60% of allocation"]
  end

  SPA --> Results
  API --> Results
```

Cost Analysis

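The validated configuration uses 50% less CPU and 50% less memory per replica than the initial over-provisioned config (2 vCPU / 4 GiB instead of 4 vCPU / 8 GiB). Compared with the fixed legacy infrastructure: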

```
flowchart TB
  subgraph Legacy["Legacy (fixed)"]
    L1["3× VM instances"]
    L2["24 vCPU, 96 GB RAM, always on"]
    L3["Cost: constant regardless of traffic"]
  end

  subgraph New["New (elastic)"]
    N1["5–20 SPA replicas (2 vCPU, 4 GiB each)"]
    N2["3–20 API replicas (0.5 vCPU, 1 GiB each)"]
    N3["Per-second billing: pay for actual usage"]
    N4["At idle: 5 SPA + 3 API"]
    N5["At peak: 15 SPA + 8 API"]
    N6["Average: ~60% of peak capacity billed"]
  end

  Legacy -->|"Migrated to"| New
```

Elastic billing lowers cost during low-traffic periods — nights, weekends, and holidays — while still scaling for spikes without permanent over-provisioning.
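The replica counts above translate into provisioned vCPU as follows. The sketch covers compute only and makes no claim about price, since no unit costs are given. `totalVcpu` is an illustrative helper, and the 60% average is taken from the figure above.

```typescript
interface Pool {
  replicas: number;
  vcpuEach: number;
}

// Sums the vCPU allocated across all replica pools.
function totalVcpu(pools: Pool[]): number {
  return pools.reduce((sum, pool) => sum + pool.replicas * pool.vcpuEach, 0);
}

const legacyVcpu = 24; // fixed VMs, always on

// Idle: 5 SPA replicas (2 vCPU) + 3 API replicas (0.5 vCPU).
const idleVcpu = totalVcpu([
  { replicas: 5, vcpuEach: 2 },
  { replicas: 3, vcpuEach: 0.5 },
]);

// Peak: 15 SPA replicas + 8 API replicas.
const peakVcpu = totalVcpu([
  { replicas: 15, vcpuEach: 2 },
  { replicas: 8, vcpuEach: 0.5 },
]);

// Average billed capacity, using the "~60% of peak" figure.
const averageVcpu = 0.6 * peakVcpu;

console.log(idleVcpu); // 11.5
console.log(peakVcpu); // 34
console.log(averageVcpu.toFixed(1)); // 20.4
console.log(legacyVcpu); // 24
```

At idle the new stack holds less than half the vCPU of the legacy VMs. At peak it exceeds them, but only for as long as the peak lasts.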

What the Numbers Mean for Architecture

Each architecture decision from earlier articles contributed to these numbers:

Decision	Contribution
SSR (Article 1)	Eliminates client-side rendering delay
GraphQL Gateway (Article 2)	Single query per page instead of 3–5 REST calls
Multi-Tier Cache (Article 6)	Sub-ms content retrieval for cached pages
Deferred Hydration (Article 6)	Eliminates render-blocking JavaScript
Same-Origin Image Proxy (Article 6)	Improves LCP by reducing cross-origin overhead
PM2 Cluster Mode (Article 10)	Zero-downtime worker restarts
Container Apps Auto-Scaling (Article 11)	Elastic capacity, no over-provisioning

```
flowchart LR
  SSR["SSR"] --> PERF["Lower TTFB & faster first paint"]
  GQL["GraphQL Gateway"] --> PERF
  CACHE["Multi-Tier Cache"] --> PERF
  HYDR["Deferred Hydration"] --> PERF
  IMG["Same-Origin Image Proxy"] --> PERF
  PM2["PM2 Cluster Mode"] --> REL["Resilience & zero-downtime deploys"]
  AS["Container Apps Auto-Scaling"] --> CAP["Elastic capacity"]

  PERF --> OUT["15.9× faster median<br/>Lighthouse 97+"]
  REL --> OUT
  CAP --> OUT
```

No single decision produces 15.9×. It is the combination — each one removing a different bottleneck — that delivers the aggregate result.

Lessons Learned

Load test with production traffic patterns, not synthetic ones

A synthetic test hitting the homepage 100 times per second says nothing about real-world performance. Real traffic has a distribution — heavy pages, light pages, API calls, form submissions. The test must match it.

```
flowchart LR
  A["Synthetic test: 100 req/s to homepage"] -->|Misleading| C["Unrealistic bottlenecks"]
  B["Production-equivalent mix:<br/>heavy pages, light pages, APIs, forms"] -->|Accurate| D["Realistic capacity & latency insights"]
```

Right-sizing failures are the most valuable test results

The cascading failure at 1 vCPU / 2 GiB taught more about system behavior than all successful tests combined. It exposed the GC pressure threshold, health probe timing sensitivity, and cold-start vulnerability. These insights shaped the production configuration.

```
flowchart TB
  F["Right-sizing attempt"] --> F1["Too small (1 vCPU / 2 GiB)"]
  F1 --> F2["GC pressure & probe timeouts"]
  F2 --> F3["Cascading restarts"]
  F3 --> F4["Error budget impact"]
  F4 --> F5["Refined production config<br/>(2 vCPU / 4 GiB, validated at 6×)"]
```

Median response time is the metric that matters most

P95 and P99 matter for tail latency, but the median determines the experience for most users. A flat median under increasing load (165 ms at both 1× and 5×) indicates that horizontal scaling absorbs the extra traffic without slowing the typical request, even while tail latency grows.

xychart-beta
  title "Median vs P95 under load"
  x-axis "Load (× production)" [1, 2, 3, 4, 5]
  y-axis "Response Time (ms)"
  line [165, 165, 165, 165, 165]
  line [450, 1200, 3000, 6000, 8800]
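
As a rough illustration of why the two lines can diverge, the sketch below computes the median and P95 with the nearest-rank method. The sample durations are invented for the example and are not measurements from the load test.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Hypothetical durations in ms: most requests are fast, two are very slow.
const durations = [
  150, 155, 160, 160, 165, 165, 170, 170, 175, 180,
  165, 165, 165, 165, 165, 165, 165, 165, 2400, 8800,
];

const median = percentile(durations, 50);
const p95 = percentile(durations, 95);
console.log({ median, p95 }); // { median: 165, p95: 2400 }
```

Two slow outliers out of twenty requests leave the median untouched but dominate the P95, which is why both numbers are worth tracking.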

15× is not an optimization — it is a different architecture

A 15.9× improvement does not come from optimizing an existing system. It comes from removing fundamental bottlenecks: dual rendering, multi-source data joining, absence of caching, fixed infrastructure. The improvement is architectural, not incremental.

What’s Next

Article 16: The Full Picture — What the New Concept Delivers — Synthesis for decision-makers and architects.
Article 17: The @delegate Directive Deep Dive — Cross-Subgraph Field Resolution — A technical deep dive into the most powerful schema stitching feature.
Article 18: Building a Headless Design System in Vue 3 — The Compose Pattern — Separating style logic from templates.

Munir Husseini is a software architect specializing in full-stack TypeScript, .NET, and cloud-native architectures.

June 6, 2026

The Nuxt Observability Stack: Tracing, Logging, and PM2 Metrics

Migrating from a legacy application to a modern Nuxt 4 stack is not just about new frameworks and better performance numbers. The real shift is moving from reactive firefighting to proactive observability — knowing what is slow, why it is slow, and how the platform behaves under real load.

This observability stack has three pillars:

End-to-end distributed tracing across Nginx, Nuxt, backend services, and Redis
Structured logging with per-module, runtime-tunable log levels
Node.js process diagnostics for GC, heap, and CPU under PM2

Together, they turn a deployment into something that can be reasoned about rather than merely hoped to work.

Flying Blind vs. Full Visibility

Without observability, slowdowns are only visible when users complain, and failures are only visible when error rates spike. The underlying cause remains unknown: which component was slow, which call failed, which cache missed.

In a system with multiple containers — for example, a frontend app, an API, a proxy, and Redis — a single request crosses several services. Without tracing, correlating what happened means manually matching timestamps across separate log streams. Most teams stop long before they get a clear picture.

The target state is one trace ID created at the edge and propagated from the browser through every service, so a single click in the observability backend reveals the full request waterfall.

Three-Layer Telemetry: Traces, Proxy Spans, and Container Metrics

The observability stack has three layers, each capturing a different dimension of the system:

flowchart TB
  subgraph L1["Layer 1: SDK Instrumentation"]
    L1a["Node.js applicationinsights<br/>+ .NET AI SDK"]
    L1b["→ Request traces, dependency calls, exceptions"]
    L1c["→ Custom events (GraphQL operations, cache metrics)"]
  end

  subgraph L2["Layer 2: Nginx OpenTelemetry Module"]
    L2a["→ Span per proxied request"]
    L2b["→ W3C Trace Context headers<br/>(traceparent, tracestate)"]
    L2c["→ Complete proxy → SPA → API waterfall"]
  end

  subgraph L3["Layer 3: Container Apps Managed OTel Agent"]
    L3a["→ Container-level metrics<br/>(CPU, memory, restarts)"]
    L3b["→ All containers, including Redis"]
    L3c["→ Zero code changes"]
  end

  L1 --- L2 --- L3

Layer 1: SDK Instrumentation

Both the frontend app and the API send request traces, dependency calls, exceptions, and custom events to the observability backend. The Node.js SDK automatically instruments incoming HTTP requests, outgoing HTTP calls, and Redis operations.

A GraphQL server module can add custom dependency telemetry for every subgraph call and every Redis cache operation:

flowchart TB
  subgraph GQL["Custom Dependency Event: GraphQL"]
    direction TB
    g1["Name: GraphQL: cms/pageByPath"]
    g2["Type: GraphQL"]
    g3["Duration: 45ms"]
    g4["Success: true"]
    g5["operationName: pageByPath"]
    g6["subgraph: cms"]
    g7["cacheHit: false"]
    g8["transactionId: abc-123-def"]
  end

  subgraph RED["Custom Dependency Event: Redis"]
    direction TB
    r1["Name: Redis: cache-check"]
    r2["Type: Redis"]
    r3["Duration: 2ms"]
    r4["Success: true"]
    r5["operation: GET"]
    r6["cacheHit: true"]
    r7["key: page-data:/products/premium"]
  end

These custom events land in the same trace as the HTTP request, so it becomes clear which operations ran, which caches hit or missed, and how long each step took.
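
A minimal sketch of how such an event could be assembled is shown below. The `DependencyEvent` shape mirrors the fields in the diagram; `SubgraphCall` and `buildGraphQLDependency` are hypothetical names chosen for this example. In a real setup the resulting object would be handed to the SDK's dependency-tracking call (for instance `trackDependency` on the `applicationinsights` client, assuming that is the SDK in use).

```typescript
// Hypothetical shape mirroring the fields shown in the diagram above.
interface DependencyEvent {
  name: string;
  dependencyTypeName: string;
  duration: number; // milliseconds
  success: boolean;
  properties: Record<string, string>;
}

// Hypothetical description of one finished subgraph call.
interface SubgraphCall {
  subgraph: string;
  operationName: string;
  durationMs: number;
  success: boolean;
  cacheHit: boolean;
  transactionId: string;
}

function buildGraphQLDependency(call: SubgraphCall): DependencyEvent {
  return {
    name: `GraphQL: ${call.subgraph}/${call.operationName}`,
    dependencyTypeName: "GraphQL",
    duration: call.durationMs,
    success: call.success,
    // Custom properties are stored as strings so they can be filtered later.
    properties: {
      operationName: call.operationName,
      subgraph: call.subgraph,
      cacheHit: String(call.cacheHit),
      transactionId: call.transactionId,
    },
  };
}

const dependencyEvent = buildGraphQLDependency({
  subgraph: "cms",
  operationName: "pageByPath",
  durationMs: 45,
  success: true,
  cacheHit: false,
  transactionId: "abc-123-def",
});
console.log(dependencyEvent.name); // "GraphQL: cms/pageByPath"
```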

Layer 2: Nginx OpenTelemetry

The reverse proxy includes the nginxinc/nginx-otel module. Every proxied request becomes a span and carries W3C Trace Context headers:

sequenceDiagram
  participant B as Browser
  participant N as Nginx Proxy
  participant S as Nuxt SPA (Node.js)
  participant A as Backend API
  participant R as Redis

  B->>N: HTTP request<br/>(no trace context yet)
  Note right of N: Creates span<br/>Generates traceparent header<br/>traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
  N->>S: Forward request<br/>+ traceparent

  Note right of S: Reads traceparent<br/>Creates child span<br/>Propagates to outgoing calls

  S->>R: Redis cache GET<br/>(child span)
  S->>A: GraphQL → CMS API<br/>(child span)
  S->>A: GraphQL → Backend API<br/>(child span)

  A->>A: Database calls,<br/>business logic (child spans)

A single trace ID stitches together every hop. The end-to-end transaction view in the observability backend renders the full waterfall:

gantt
  dateFormat  x
  axisFormat  %Lms

  section Nginx Proxy
  Nginx Proxy         :active, nginx, 0, 150

  section SPA Request
  SPA Request         :spa, 10, 140
  Redis GET           :redis, 20, 10
  GraphQL CMS         :cms, 30, 40
  GraphQL Backend     :backend, 40, 80

  section Backend API
  API Request         :api, 60, 70
  SQL Query           :sql, 80, 30
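
To make the hand-off between hops concrete, here is a minimal sketch of what a service does with an incoming `traceparent` header: validate it, keep the trace ID, and mint a new span ID for outgoing calls. In practice the SDK and the Nginx module handle this; `parseTraceparent` and `childTraceparent` are illustrative names, not part of any library.

```typescript
import { randomBytes } from "node:crypto";

interface TraceContext {
  version: string;
  traceId: string; // 32 hex chars, shared by every span in the trace
  parentId: string; // 16 hex chars, the caller's span ID
  flags: string; // 2 hex chars, "01" means sampled
}

const TRACEPARENT = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header: string): TraceContext | null {
  const match = TRACEPARENT.exec(header.trim().toLowerCase());
  if (!match) return null;
  const [, version, traceId, parentId, flags] = match;
  // All-zero IDs are invalid according to the W3C Trace Context spec.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { version, traceId, parentId, flags };
}

// Keeps the trace ID but replaces the parent ID with a fresh span ID.
function childTraceparent(parent: TraceContext): string {
  const spanId = randomBytes(8).toString("hex");
  return `${parent.version}-${parent.traceId}-${spanId}-${parent.flags}`;
}

const incoming = parseTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01");
if (incoming) console.log(childTraceparent(incoming));
```

Because only the span ID changes from hop to hop, every span can be grouped by the shared trace ID.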

Layer 3: Container-Level Metrics

The container environment runs a managed OpenTelemetry collector that gathers container metrics — CPU, memory, restart counts — for all containers, including Redis. No application changes are required.

This layer answers operational questions:

Is Redis consuming too much memory?
Are frontend replicas flapping?
What is the steady-state CPU profile for API containers?

Transaction ID Propagation

Distributed traces are useful for visualizing a single request, but day-to-day debugging often starts from logs. To bridge both worlds, the proxy generates an x-transaction-id header for every incoming request:

flowchart TB
  N["Nginx<br/>x-transaction-id: txn-abc-123"]
  FE["Frontend app"]
  API["API"]
  GQL["GraphQL custom events"]

  N -->|"Reads header<br/>adds to outgoing calls<br/>logs include txn-abc-123"| FE
  N --> API
  FE -->|"Includes txn-abc-123<br/>in request & logs"| API
  FE -->|"Tag events with<br/>txn-abc-123"| GQL
  API -->|"Logs include<br/>txn-abc-123"| GQL

The transaction ID is mapped to the W3C traceparent trace ID. Developers can start from either side — a transaction ID from logs or a trace ID from the observability backend — and still recover the complete request history.
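
The propagation itself can be sketched as two small helpers: one that copies the correlation headers onto outgoing calls and one that tags a log record. This assumes header names are already lower-cased (as Node.js does for incoming requests); `propagationHeaders` and `withTransaction` are hypothetical helper names.

```typescript
type HeaderMap = Record<string, string | undefined>;

// Headers that should travel with every outgoing call made for a request.
const PROPAGATED = ["x-transaction-id", "traceparent", "tracestate"] as const;

function propagationHeaders(incoming: HeaderMap): Record<string, string> {
  const outgoing: Record<string, string> = {};
  for (const headerName of PROPAGATED) {
    const value = incoming[headerName];
    if (value) outgoing[headerName] = value;
  }
  return outgoing;
}

// Attaches the transaction ID to a structured log record.
function withTransaction(incoming: HeaderMap, record: Record<string, unknown>) {
  return { ...record, transactionId: incoming["x-transaction-id"] ?? "unknown" };
}

const requestHeaders: HeaderMap = {
  "x-transaction-id": "txn-abc-123",
  "user-agent": "example",
};
console.log(propagationHeaders(requestHeaders));
console.log(withTransaction(requestHeaders, { message: "Page rendered" }));
```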

What Metrics Tell You

The combined telemetry stack tracks several metric categories, each answering a distinct question:

Metric Category	Examples	Question It Answers
Response times	Per-endpoint, per-container latency	“Which pages are slow?”
Error rates	HTTP 5xx, GraphQL errors, exceptions	“What is failing?”
Cache metrics	Hit/miss rates per cache tier	“Is caching effective?”
Resource usage	CPU, memory per container/worker	“Are we right-sized?”
Dependency durations	GraphQL subgraph calls, Redis ops	“Which external call is slow?”
User journeys	Page-to-page navigation funnels	“Where do users drop off?”

Alerting Strategy: Symptoms First, Causes Later

Metrics matter only when they drive action. The guiding principle is:

> Alert on symptoms, investigate with traces.

Symptom alert:

“Frontend P95 response time exceeded 2 seconds for 5 minutes.”

Investigation:

Open the traces for those slow requests → locate the slow dependency → fix the underlying issue.

Alerting directly on causes like Redis CPU > 80% creates noise and false positives, because Redis CPU can legitimately spike during cache invalidation without harming users. Symptom-based alerts keep noise low and align alerts with real user impact.

Structured Logging in Nuxt: From `console.log` to Observability

Traces tell you where the problem is. Logs tell you what happened. To make that effective, logging has to be more than printing strings.

The `console.log` Problem

Using console.log in a production SSR application causes real issues:

No severity levels — errors are indistinguishable from informational noise
No structure — freeform strings cannot be reliably queried, filtered, or aggregated
No context — you cannot tell which request, user, or component produced the log
No control — you cannot selectively enable verbose logging for one module without overwhelming the output
SSR noise — server-side logs are mixed with framework output, health checks, and PM2 logs

There is a big difference between “we have logging” and “we have useful logging.” The first gives you strings to grep. The second gives you a structured, queryable observability layer.

The Logging Architecture

The logging system has three main building blocks:

flowchart TB
  subgraph APP["Application Code"]
    A1["const log = useLogger('shopping-cart')"]
    A2["log.info('Item added', { productId, quantity })"]
  end

  subgraph UL["useLogger Composable"]
    UL1["Tagged with module name"]
    UL2["Checks if this module's level is enabled"]
    UL3["Formats structured message"]
  end

  subgraph MS["Multi-Sink Router"]
    S1["Sink 1: Console (development)<br/>Formatted, colored, human-readable"]
    S2["Sink 2: Observability Backend<br/>Structured JSON, custom properties"]
    S3["Sink 3: DevTools Log Viewer<br/>Real-time, filterable, in-browser"]
  end

  APP --> UL --> MS
  MS --> S1
  MS --> S2
  MS --> S3

The `useLogger` Composable

Each module gets its own logger instance:

const log = useLogger('shopping-cart')

log.debug('Cart state loaded', { items: cart.items.length })
log.info('Item added', { productId: 'abc', quantity: 2 })
log.warn('Price mismatch detected', { expected: 29.99, actual: 31.99 })
log.error('Checkout failed', { error: err.message, orderId })

Every logger is tagged with its module name. This enables per-module log level control — you can set shopping-cart to debug while keeping navigation at warn.
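
The following is a simplified sketch of how a composable like this could work internally, not the actual implementation. The `levels` registry, the `sinks` array, and the `info` default level are assumptions made for the example.

```typescript
type Level = "debug" | "info" | "warn" | "error";
const ORDER: Record<Level, number> = { debug: 0, info: 1, warn: 2, error: 3 };

interface LogRecord {
  module: string;
  severity: Level;
  message: string;
  data?: Record<string, unknown>;
}
type Sink = (record: LogRecord) => void;

// Hypothetical registry: per-module minimum levels plus the active sinks.
const levels = new Map<string, Level>();
const sinks: Sink[] = [];
const DEFAULT_LEVEL: Level = "info";

function useLogger(module: string) {
  const emit = (severity: Level, message: string, data?: Record<string, unknown>) => {
    const min = levels.get(module) ?? DEFAULT_LEVEL;
    if (ORDER[severity] < ORDER[min]) return; // below this module's level
    const record: LogRecord = { module, severity, message, data };
    for (const sink of sinks) sink(record); // fan out to every sink
  };
  return {
    debug: (m: string, d?: Record<string, unknown>) => emit("debug", m, d),
    info: (m: string, d?: Record<string, unknown>) => emit("info", m, d),
    warn: (m: string, d?: Record<string, unknown>) => emit("warn", m, d),
    error: (m: string, d?: Record<string, unknown>) => emit("error", m, d),
  };
}

// A console sink for development output.
sinks.push((r) => console.log(`[${r.severity}] ${r.module}: ${r.message}`, r.data ?? {}));
useLogger("shopping-cart").info("Item added", { productId: "abc", quantity: 2 });
```

The level check happens before the record is built, so a disabled `debug` call costs little more than a map lookup.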

Severity Levels

Level	When to Use	Example
`debug`	Development-only details	“Cart state loaded, 3 items”
`info`	Significant business events	“Item added to cart”
`warn`	Unexpected but recoverable	“Price mismatch, using server price”
`error`	Failures requiring attention	“Checkout failed, payment rejected”

Multi-Sink Routing

Each log message is fanned out to multiple sinks at once.

Sink 1: Console (Development)

In development, logs are written to both the browser console and Node.js stdout with:

Color coding by severity
A module name prefix
Collapsible structured payloads (objects expand on click)

Sink 2: Observability Backend (Production)

In production, logs are sent as structured events:

Observability Event:
{
  name: "shopping-cart:info",
  properties: {
    module: "shopping-cart",
    severity: "info",
    message: "Item added",
    productId: "abc-123",
    quantity: 2,
    requestId: "req-xyz",
    timestamp: "2025-06-02T12:34:56Z"
  }
}

These events can be queried with KQL (Kusto Query Language):

customEvents
| where name startswith "shopping-cart"
| where customDimensions.severity == "error"
| project timestamp, customDimensions.message, customDimensions.productId
| order by timestamp desc

Sink 3: DevTools Log Viewer

A custom DevTools tab shows logs in real time:

flowchart TB
  subgraph DT["DevTools — Logs Tab"]
    F["Filter controls:<br/>[All Modules ▼] [Info ▼] [Search...]"]
    L1["12:34:56 INFO  shopping-cart<br/>Item added {productId: 'abc', quantity: 2}"]
    L2["12:34:57 DEBUG catalog-query<br/>Cache hit for key 10115"]
    L3["12:34:58 WARN  shopping-cart<br/>Price mismatch {expected: 29.99, actual: 31.99}"]
    L4["12:35:01 ERROR checkout<br/>Payment failed {orderId: 'ord-789'}"]
  end

  F --> L1 --> L2 --> L3 --> L4

Capabilities:

Filter by severity, such as only errors or debug and above
Filter by module, such as only shopping-cart logs
Full-text search across messages
Expandable structured data payloads

Runtime Log Level Control

Log levels are adjustable at runtime without restarting the app.

flowchart TB
  subgraph CFG["Default levels (from config)"]
    C1["shopping-cart: info"]
    C2["catalog-query: warn"]
    C3["navigation: warn"]
  end

  subgraph RT["Runtime override (via API or DevTools)"]
    R1["shopping-cart: debug  ← changed"]
    R2["catalog-query: warn   ← unchanged"]
    R3["navigation: info      ← changed"]
  end

  CFG --> RT

  subgraph EFFECT["Effect"]
    E1["shopping-cart now outputs debug logs"]
    E2["No server restart"]
    E3["No redeploy"]
    E4["No impact on other modules"]
  end

  RT --> EFFECT

A typical production debugging workflow:

A user reports an issue
Enable debug logging for the relevant module via an API or DevTools
Reproduce the problem
Inspect the debug logs in the observability backend
Turn debug logging off again and restore the default level

No deployment, no restart, and no log flood from unrelated modules.
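
One way to model this is a small store that layers runtime overrides on top of the configured defaults. This is a sketch under stated assumptions: `LevelStore`, its method names, and the `warn` fallback for unknown modules are hypothetical.

```typescript
type Level = "debug" | "info" | "warn" | "error";

class LevelStore {
  private readonly defaults: Record<string, Level>;
  private readonly overrides = new Map<string, Level>();

  constructor(defaults: Record<string, Level>) {
    this.defaults = defaults;
  }

  // Runtime override wins, then the configured default, then a fallback.
  effective(module: string): Level {
    return this.overrides.get(module) ?? this.defaults[module] ?? "warn";
  }

  setOverride(module: string, level: Level): void {
    this.overrides.set(module, level);
  }

  // Removing the override restores the configured default.
  reset(module: string): void {
    this.overrides.delete(module);
  }
}

const store = new LevelStore({
  "shopping-cart": "info",
  "catalog-query": "warn",
  navigation: "warn",
});
store.setOverride("shopping-cart", "debug");
console.log(store.effective("shopping-cart")); // "debug"
console.log(store.effective("catalog-query")); // "warn"
```

Keeping overrides separate from defaults makes the final workflow step trivial: resetting a module never requires remembering what its original level was.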

SSR-Aware Logging

In an SSR app, logging must handle both server and client execution contexts:

flowchart LR
  subgraph SRV["SSR Execution (Server: Node.js)"]
    S1["log.info('Page rendered')"]
    S2["Output:<br/>stdout (PM2 logs)<br/>Observability backend"]
    S3["Context:<br/>Request URL<br/>Request ID<br/>User-Agent"]
    S1 --> S2 --> S3
  end

  subgraph CLI["Client Execution (Browser)"]
    C1["log.info('Button clicked')"]
    C2["Output:<br/>Browser console<br/>DevTools Log Viewer<br/>Observability backend telemetry"]
    C3["Context:<br/>Current route<br/>Session ID"]
    C1 --> C2 --> C3
  end

useLogger detects where it is running and routes logs to the right sinks. Server-side logs include request context such as URL, request ID, and user agent. Client-side logs include session context such as current route and user interactions.

Replacing `console.log` Safely

The migration away from console.log is incremental.

An ESLint rule flags console.log usage and suggests replacing it with useLogger. It does not auto-fix, so the developer explicitly chooses the severity and module tag.
For legacy code, a global console interceptor captures console.* calls and forwards them into the structured logging pipeline under a legacy module tag. This ensures nothing is lost during the transition.
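
A console interceptor along these lines can be sketched in a few lines. The mapping of `console.log` to the `info` severity and the `forward` callback are assumptions for this example.

```typescript
type ConsoleMethod = "log" | "info" | "warn" | "error" | "debug";
type Severity = "debug" | "info" | "warn" | "error";

interface LegacyRecord {
  module: "legacy";
  severity: Severity;
  message: string;
}

// Assumption: plain console.log calls are treated as "info".
const SEVERITY: Record<ConsoleMethod, Severity> = {
  log: "info",
  info: "info",
  warn: "warn",
  error: "error",
  debug: "debug",
};

// Wraps console.* so each call is also forwarded into the logging pipeline.
// Returns a function that restores the original methods.
function interceptConsole(forward: (record: LegacyRecord) => void): () => void {
  const originals = new Map<ConsoleMethod, (...args: unknown[]) => void>();
  for (const method of Object.keys(SEVERITY) as ConsoleMethod[]) {
    const original = console[method].bind(console);
    originals.set(method, original);
    console[method] = (...args: unknown[]) => {
      forward({
        module: "legacy",
        severity: SEVERITY[method],
        message: args.map(String).join(" "),
      });
      original(...args); // keep the existing console output
    };
  }
  return () => {
    originals.forEach((original, method) => {
      console[method] = original;
    });
  };
}
```

Returning a restore function keeps the interceptor easy to remove once the last legacy call site has been migrated.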

Over time, the codebase shifts from unstructured strings to queryable, structured events.

Node.js Observability Under PM2: Diagnostics, GC, and CPU

Application-level traces and logs tell you what is slow. To understand why the Node.js process itself degrades — heap growth, GC pauses, event loop lag — you need process-level visibility.

Three Nuxt modules provide this:

diagnostics — per-request aggregation and pattern learning
diagnostics-heap — GC and heap monitoring with leak detection
diagnostics-profiler — automatic CPU profiling for slow requests

These sit alongside the Nuxt app, PM2, and Nginx, and feed directly into the same observability backend.

Layer 1: Per-Request Aggregation (`diagnostics` module)

The diagnostics module captures seven metrics for every HTTP request:

Metric	What It Measures
Duration	Total request handling time (ms)
Input size	Request body size (bytes)
Output size	Response body size (bytes)
CPU usage	Process CPU delta during request
Memory delta	Heap memory change during request
Event loop lag	Main thread blocking time (ms)
Status code	HTTP response status

O(1) Memory Aggregation

Traditional APM tools store one record per request — 8.6 million records per day at 100 req/s. This module takes a different approach: no per-request storage. Only aggregations such as min, max, sum, and count are retained.

flowchart TB
  subgraph TRAD["Per-Request Storage (traditional APM)"]
    T1["Request 1: { duration: 150, cpu: 12, memory: 35MB, ... }"]
    T2["Request 2: { duration: 200, cpu: 15, memory: 42MB, ... }"]
    T3["Request 3: { duration: 180, cpu: 11, memory: 38MB, ... }"]
    Tn["Request N: { duration: ???, cpu: ??, memory: ???, ... }"]
    TM["Memory usage: O(N) — grows with request count"]
    T1 --> T2 --> T3 --> Tn --> TM
  end

  subgraph AGG["Aggregation-Only (diagnostics module)"]
    A1["Aggregate:"]
    A2["duration: { min, max, sum, count }"]
    A3["cpu: { min, max, sum, count }"]
    A4["memory: { min, max, sum, count }"]
    AM["Memory usage: O(1) — constant<br/>regardless of request count"]
    A1 --> A2 --> A3 --> A4 --> AM
  end

Monitoring overhead is constant, regardless of traffic volume.
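
The idea fits in a few lines. The sketch below is an illustration of aggregation-only bookkeeping, not the module's actual code; the three sample durations are taken from the diagram above.

```typescript
// Keeps only four numbers per metric, no matter how many values are added.
class RunningAggregate {
  min = Infinity;
  max = -Infinity;
  sum = 0;
  count = 0;

  add(value: number): void {
    if (value < this.min) this.min = value;
    if (value > this.max) this.max = value;
    this.sum += value;
    this.count += 1;
  }

  // The mean is derived on demand from sum and count.
  get mean(): number {
    return this.count === 0 ? 0 : this.sum / this.count;
  }
}

const durationAggregate = new RunningAggregate();
for (const ms of [150, 200, 180]) durationAggregate.add(ms);
console.log(durationAggregate.min, durationAggregate.max, durationAggregate.count); // 150 200 3
```

The trade-off is that percentiles cannot be reconstructed from min, max, sum, and count alone, so tail latency has to come from another source such as the tracing backend.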

Slow-Request Pattern Detection

Every 30 seconds, after at least 50 requests, the module detects patterns in slow requests by grouping on several features:

flowchart TB
  subgraph FB["Feature Buckets"]
    F1["URL pattern: /products/*, /checkout/*, /"]
    F2["HTTP method: GET, POST"]
    F3["Payload size: small (<1KB), medium, large"]
    F4["Path depth: 1, 2, 3, 4+"]
  end

  FB --> P["For each bucket:<br/>Compute probability(request is slow)<br/>If probability ≥ 50% and count ≥ 3 → emit pattern"]

For each feature bucket, the algorithm calculates the probability that a request in this bucket is slow (exceeds the configured threshold, such as 500ms). If a bucket has at least 50% slow probability with at least 3 samples, a pattern is emitted:

flowchart TB
  P1["Observed: /checkout/*<br/>73% of requests slow (>500ms)<br/>12 observations in last 30s"]
  P2["Emit custom event:<br/>name = 'SlowRequestPatterns'<br/>pattern = '/checkout/*'<br/>probability = 0.73"]
  P1 --> P2

Pattern detection surfaces systemic slowness that individual alerts miss. A single slow request might be a fluke. A persistent pattern for a specific URL points to a real problem with that page’s data fetching or rendering.
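
A reduced version of the bucket logic might look like this. It covers only two of the feature dimensions (URL pattern and HTTP method), uses a hypothetical first-path-segment bucketing rule, and leaves out the 50-request minimum per window.

```typescript
interface Pattern {
  bucket: string;
  probability: number;
  count: number;
}

// Hypothetical bucketing: first path segment plus wildcard, e.g. "/checkout/*".
function urlBucket(url: string): string {
  const first = url.split("?")[0].split("/").filter(Boolean)[0];
  return first ? `/${first}/*` : "/";
}

class SlowPatternDetector {
  private readonly buckets = new Map<string, { slow: number; total: number }>();
  private readonly slowThresholdMs: number;

  constructor(slowThresholdMs = 500) {
    this.slowThresholdMs = slowThresholdMs;
  }

  // Only counters are updated; the request itself is not stored.
  observe(url: string, method: string, durationMs: number): void {
    const slow = durationMs > this.slowThresholdMs;
    for (const key of [`url:${urlBucket(url)}`, `method:${method}`]) {
      const b = this.buckets.get(key) ?? { slow: 0, total: 0 };
      b.total += 1;
      if (slow) b.slow += 1;
      this.buckets.set(key, b);
    }
  }

  // Called at the end of each window: returns patterns and clears counters.
  flush(minProbability = 0.5, minCount = 3): Pattern[] {
    const patterns: Pattern[] = [];
    this.buckets.forEach((b, bucket) => {
      const probability = b.slow / b.total;
      if (probability >= minProbability && b.total >= minCount) {
        patterns.push({ bucket, probability, count: b.total });
      }
    });
    this.buckets.clear();
    return patterns;
  }
}
```

Memory use is bounded by the number of distinct buckets rather than the number of requests, which keeps this consistent with the aggregation-only approach.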

SSR GraphQL Disambiguation

During SSR, the Nuxt server makes GraphQL calls to itself — real HTTP requests that pass through the diagnostics middleware. Without disambiguation, each page request would be counted twice.

The module identifies SSR-internal requests via the CSRF bypass token from the security layer and excludes them. You get accurate per-page measurements with no double-counting.

Layer 2: Heap Memory and GC (`diagnostics-heap` module)

The diagnostics-heap module uses the PerformanceObserver API from the Node.js perf_hooks module to monitor garbage collection events in real time.

GC Event Categories

GC Type	What It Collects	Typical Duration
`scavenge`	Young generation (new objects)	1–5 ms
`mark-sweep`	Full heap (major GC)	10–50 ms
`incremental`	Incremental marking	1–10 ms
`weakcb`	Weak reference callbacks	<1 ms

Each event records duration, heap before/after, and bytes freed. Events are aggregated into time-series data and sent periodically to the observability backend.
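
For orientation, this is roughly how GC entries can be observed with the built-in `perf_hooks` module. The sketch only aggregates count and total duration per GC type; the `gcStats` map and `gcKindName` helper are illustrative names, and the name mapping follows the table above.

```typescript
import { PerformanceObserver, constants } from "node:perf_hooks";

// Maps Node's numeric GC kind to the names used in the table above.
function gcKindName(kind: number): string {
  switch (kind) {
    case constants.NODE_PERFORMANCE_GC_MINOR:
      return "scavenge";
    case constants.NODE_PERFORMANCE_GC_MAJOR:
      return "mark-sweep";
    case constants.NODE_PERFORMANCE_GC_INCREMENTAL:
      return "incremental";
    case constants.NODE_PERFORMANCE_GC_WEAKCB:
      return "weakcb";
    default:
      return "unknown";
  }
}

const gcStats = new Map<string, { count: number; totalMs: number }>();

const gcObserver = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    // Recent Node.js versions report the GC type in `detail.kind`.
    const kind = (entry as unknown as { detail?: { kind?: number } }).detail?.kind ?? -1;
    const kindName = gcKindName(kind);
    const stat = gcStats.get(kindName) ?? { count: 0, totalMs: 0 };
    stat.count += 1;
    stat.totalMs += entry.duration;
    gcStats.set(kindName, stat);
  }
});
gcObserver.observe({ entryTypes: ["gc"] });
```

A periodic timer would then flush `gcStats` to the observability backend and reset it.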

Automatic Memory Leak Detection

The module tracks consecutive heap growth over time. When heapUsed increases for N consecutive intervals without a significant GC reduction, it emits a leak detection event:

flowchart TB
  subgraph WIN["Observation Window: 10 intervals (5 min each)"]
    I1["Interval 1: heapUsed = 800 MB"]
    I2["Interval 2: heapUsed = 820 MB  ↑ +20 MB"]
    I3["Interval 3: heapUsed = 845 MB  ↑ +25 MB"]
    I4["Interval 4: heapUsed = 860 MB  ↑ +15 MB"]
    I5["Interval 5: heapUsed = 890 MB  ↑ +30 MB"]
  end

  WIN --> DET["5 consecutive growth intervals detected<br/>Growth rate ≈ 18 MB/interval = 216 MB/hour"]

  DET --> EVT["Emit event:<br/>{ event: 'PotentialMemoryLeak',<br/>confidence: 'medium',<br/>growthRateMBPerHour: 216,<br/>consecutiveGrowths: 5 }"]

  EVT --> HIGH["If growth continues to 8+ intervals:<br/>confidence → 'high'"]

The confidence level reduces false positives. Short-term growth is normal during traffic spikes. Only sustained growth triggers a leak alert.
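
The streak logic can be sketched as follows. This is a simplification: any non-growing sample resets the streak (the real module looks for a "significant" GC reduction), the streak counts every interval in the rising run including the first, and the thresholds of 5 and 8 are read off the diagram above.

```typescript
type Confidence = "none" | "medium" | "high";

class LeakDetector {
  private lastHeapMB: number | null = null;
  private streak = 0;
  private readonly mediumAfter: number;
  private readonly highAfter: number;

  constructor(mediumAfter = 5, highAfter = 8) {
    this.mediumAfter = mediumAfter;
    this.highAfter = highAfter;
  }

  // Call once per interval with the current heapUsed in MB.
  sample(heapUsedMB: number): Confidence {
    if (this.lastHeapMB !== null && heapUsedMB > this.lastHeapMB) {
      this.streak += 1; // heap grew again
    } else {
      this.streak = 1; // first sample, or heap shrank: start a new run
    }
    this.lastHeapMB = heapUsedMB;
    if (this.streak >= this.highAfter) return "high";
    if (this.streak >= this.mediumAfter) return "medium";
    return "none";
  }
}

const leakDetector = new LeakDetector();
const readings = [800, 820, 845, 860, 890].map((mb) => leakDetector.sample(mb));
console.log(readings); // [ 'none', 'none', 'none', 'none', 'medium' ]
```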

Automatic Heap Dumps

When heapUsed exceeds a configurable threshold (default 1024 MB), a .heapsnapshot file is written automatically. It can be loaded into Chrome DevTools for detailed memory analysis.
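
Node.js ships the necessary primitive as `v8.writeHeapSnapshot()`. The sketch below shows a threshold check around it; the one-snapshot-per-process guard and the helper names are assumptions added for the example.

```typescript
import { writeHeapSnapshot } from "node:v8";

const MB = 1024 * 1024;

function exceedsThreshold(heapUsedBytes: number, thresholdMB = 1024): boolean {
  return heapUsedBytes > thresholdMB * MB;
}

let dumped = false; // assumption: write at most one snapshot per process

// Returns the snapshot file name, or null when nothing was written.
function maybeDumpHeap(thresholdMB = 1024): string | null {
  if (dumped || !exceedsThreshold(process.memoryUsage().heapUsed, thresholdMB)) {
    return null;
  }
  dumped = true;
  // Synchronous: the event loop is blocked while the snapshot is written.
  return writeHeapSnapshot();
}
```

Writing a snapshot needs additional memory and blocks the process, so in a cluster setup it is worth making sure other workers can absorb traffic while the dump is in progress.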

V8 Heap Space Breakdown

Periodic sampling of v8.getHeapSpaceStatistics() provides per-space memory usage:

flowchart TB
  subgraph HS["V8 Heap Spaces"]
    N["new_space: 16 MB total, 8 MB used<br/>Purpose: New objects (GC: scavenge)"]
    O["old_space: 900 MB total, 780 MB used<br/>Purpose: Survived objects"]
    C["code_space: 12 MB total, 10 MB used<br/>Purpose: Compiled code"]
    L["large_object: 45 MB total, 40 MB used<br/>Purpose: Objects > 512 KB"]
  end

  N --> O --> C --> L

This is essential for distinguishing object leaks (old_space growing) from code cache growth (code_space growing) — different causes, different fixes.

Layer 3: CPU Profiling (`diagnostics-profiler` module)

The diagnostics-profiler module automatically captures V8 CPU profiles for requests that exceed the slow-request threshold.

flowchart TB
  RS["Request starts<br/>Timer begins"]
  TH["Duration exceeds threshold"]
  PR["Profiler activates<br/>Capture V8 CPU profile"]
  RC["Request completes<br/>Profile saved as .cpuprofile"]
  DEV["Load in Chrome DevTools<br/>Flame chart analysis"]

  RS --> TH --> PR --> RC --> DEV

  subgraph FL["Example Flame Chart Breakdown"]
    F1["SSR renderer: 45% CPU time"]
    F2["GraphQL response parsing: 30%"]
    F3["HTML serialization: 15%"]
    F4["Other: 10%"]
  end

  DEV --> FL

Profiles are in the standard V8 format, which Chrome DevTools renders as a flame chart, showing exactly which functions consumed CPU time.
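
Capturing such a profile is possible with the built-in `node:inspector` module. The sketch below profiles a whole unit of work rather than activating only after a threshold is crossed, so it is a simplification of the behavior described above; `post` and `profileWork` are hypothetical helper names.

```typescript
import * as inspector from "node:inspector";

// Promise wrapper around the callback-based session.post().
function post<T>(session: inspector.Session, method: string): Promise<T> {
  return new Promise((resolve, reject) => {
    session.post(method, (err: Error | null, result?: object) => {
      if (err) reject(err);
      else resolve(result as unknown as T);
    });
  });
}

// Runs `work` with the V8 CPU profiler enabled and returns the profile.
async function profileWork<T>(work: () => Promise<T>) {
  const session = new inspector.Session();
  session.connect();
  try {
    await post(session, "Profiler.enable");
    await post(session, "Profiler.start");
    const result = await work();
    const { profile } = await post<{ profile: inspector.Profiler.Profile }>(
      session,
      "Profiler.stop",
    );
    return { result, profile };
  } finally {
    session.disconnect();
  }
}
```

Serializing the returned `profile` with `JSON.stringify` and saving it under a `.cpuprofile` extension produces a file that Chrome DevTools can open.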

The Unified Picture: From Symptom to Root Cause

When a slow request occurs, all layers fire in concert — traces, logs, and Node diagnostics:

flowchart TB
  subgraph L1["Layer 1 (diagnostics)"]
    L1a["Records duration: 2,300 ms"]
    L1b["Emits SlowRequest event"]
    L1c["Updates pattern detection"]
  end

  subgraph L2["Layer 2 (diagnostics-heap)"]
    L2a["Records memory delta: +45 MB"]
    L2b["Checks for leak pattern"]
    L2c["If heap > threshold → auto heap dump"]
  end

  subgraph L3["Layer 3 (diagnostics-profiler)"]
    L3a["Captures .cpuprofile"]
    L3b["Shows 65% time in CMS API response parsing"]
  end

  subgraph OBS["Observability backend"]
    O1["End-to-end trace correlates:"]
    O2["Nginx span"]
    O3["Nuxt request + GraphQL dependencies"]
    O4["API calls + SQL query"]
    O5["Custom logs tagged with transaction ID"]
    O6["SlowRequestPatterns + GC + leak signals"]
  end

  L1 --> OBS
  L2 --> OBS
  L3 --> OBS

You can move from:

An alert: “P95 for /checkout is 2.3s”
To the trace: “Most time is in the CMS subgraph”
To logs: “Price mismatch warnings and retries”
To process-level data: “Major GC pauses plus heap growth”
To artifacts: .heapsnapshot and .cpuprofile for offline analysis

All within a single, correlated observability fabric.

Capacity Planning Endpoint

The diagnostics module exposes a /api/__profiler/memory-capacity endpoint that calculates the theoretical memory requirement:

flowchart TB
  IN["Inputs:<br/>Baseline = 200 MB<br/>Requests/sec = 10<br/>Avg RT = 150 ms (0.15 s)<br/>Memory/req = 35 MB"]
  CONC["Concurrent requests = 10 × 0.15 = 1.5"]
  PEAK["Peak memory = 200 + (1.5 × 35) = 252.5 MB"]
  SAFETY["With 3× safety factor = 757.5 MB"]
  CFG["Set --max-old-space-size ≥ 768 MB"]

  IN --> CONC --> PEAK --> SAFETY --> CFG

This directly informs the V8 heap cap and container memory allocation, bridging runtime diagnostics with deployment configuration.
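
The arithmetic behind the diagram is Little's law (requests in flight = arrival rate × time in system) plus a safety margin. A small sketch with the diagram's inputs follows; the `CapacityInput` shape and function name are assumptions, not the endpoint's actual contract.

```typescript
interface CapacityInput {
  baselineMB: number;
  requestsPerSecond: number;
  avgResponseTimeMs: number;
  memoryPerRequestMB: number;
  safetyFactor: number;
}

function memoryCapacity(input: CapacityInput) {
  // Little's law: concurrent requests = arrival rate × time in system.
  const concurrent = (input.requestsPerSecond * input.avgResponseTimeMs) / 1000;
  const peakMB = input.baselineMB + concurrent * input.memoryPerRequestMB;
  const recommendedMB = peakMB * input.safetyFactor;
  return { concurrent, peakMB, recommendedMB };
}

const capacity = memoryCapacity({
  baselineMB: 200,
  requestsPerSecond: 10,
  avgResponseTimeMs: 150,
  memoryPerRequestMB: 35,
  safetyFactor: 3,
});
console.log(capacity); // { concurrent: 1.5, peakMB: 252.5, recommendedMB: 757.5 }
```

Rounding 757.5 MB up to a convenient value gives the 768 MB shown in the diagram.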

Lessons Learned Across the Stack

Distributed tracing is not optional in a multi-container architecture

Without trace correlation, debugging a slow request across four or more containers means combing through isolated log streams and aligning timestamps by hand. With W3C Trace Context, one trace ID tells the whole story. Setup cost: a few hours. Debugging savings: ongoing.

Custom dependency events are worth the effort

Out-of-the-box instrumentation knows about HTTP calls and Redis commands but has no idea that a specific call is “a GraphQL query to the CMS subgraph for page-by-path.” Custom events supply that semantic meaning — you can ask for “all slow CMS page queries” instead of “all slow HTTP calls to this URL.”

Separate the telemetry environment from the application environment

Using separate observability instances for test and production stops test noise from polluting production dashboards. Feature branches can report into the test instance.

Layer 3 catches what SDK instrumentation misses

SDK instrumentation covers what happens inside application processes. Container and Node-level metrics capture everything around them — Redis memory growth, restarts, OOM kills, GC pauses. Without this layer, Redis running out of memory or Node leaks are invisible until things start failing.

Per-module log levels are essential at scale

With 35+ modules, a single global log level is useless because enabling debug generates thousands of messages per second. Per-module levels let teams zoom in on the area they care about without drowning in noise.

Runtime control changes how production issues are debugged

When enabling debug logging requires a deployment, teams either leave it on permanently or never enable it. Runtime controls turn it into a normal tool: enable, investigate, disable.

Structured data beats formatted strings

log.info('Item added', { productId: 'abc', quantity: 2 }) is queryable: “show all items with quantity > 5.”

console.log('Item abc added, quantity: 2') needs regex parsing and still breaks when the format changes. The extra effort to log structured data pays off every time it needs to be analyzed.

Pattern detection beats single-event alerts

Single slow-request alerts create noise and fatigue. A pattern like “73% of /checkout requests are slow” is actionable. It tells you exactly where to investigate.

Automatic heap dumps are worth the disk space

When a leak is detected in production, reproducing it locally is often the hardest part. Automatic heap dumps capture the heap state at the moment of detection — no reproduction required. A single snapshot can save days of debugging.

Munir Husseini is a software architect specializing in full-stack TypeScript, .NET, and cloud-native architectures.

June 6, 2026

Memory, Stability, and PM2 — Running a Long-Lived Node.js Server
Seventeenth in a series about migrating from legacy architectures to a modern Nuxt 4 stack.

The Inconvenient Truth About Node.js Servers

Node.js is optimized for event-driven I/O, not for long-lived servers that render thousands of pages per hour. Over time, the V8 heap grows and objects such as GraphQL responses, Vue server renderer allocations, cached strings, and Apollo Client instances accumulate. Without intervention, a production process will eventually consume all available memory and get killed by the container orchestrator.

That is not a bug to eliminate so much as a reality to manage. The real question is not whether memory will approach its limit, but how gracefully the system will handle it.

PM2 Cluster Mode: Zero-Downtime Worker Management

In a large enterprise application, instead of a single Node.js process, PM2 typically runs N worker processes — often 2–3 per container. Each worker handles requests independently, which provides two critical benefits:
1. Fault isolation — if one worker crashes or becomes unresponsive, the others keep serving requests
2. Rolling restarts — when a worker approaches its memory limit, PM2 restarts it while the other workers continue handling traffic
```
flowchart TB
 subgraph C["Container (2 vCPU, 4 GiB RAM)"]
 direction TB
 M[PM2 Master Process]

 subgraph W1[Worker 1]
 direction TB
 H1[V8 Heap\n~1.5 GiB\nmax-old-space-size=1536]
 R1[Handles requests\nindependently]
 end

 subgraph W2[Worker 2]
 direction TB
 H2[V8 Heap\n~1.5 GiB\nmax-old-space-size=1536]
 R2[Handles requests\nindependently]
 end
 end

 M --- W1
 M --- W2
```
When Worker 1 approaches 1,536 MB of heap usage, PM2 restarts it. Worker 2 handles traffic during the restart, which typically takes 2–3 seconds for V8 to compile the Nuxt application. For that worker, downtime lasts a few seconds. For the overall application, it is effectively zero.
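
A PM2 app definition matching the diagram could look roughly like the object below, which would normally be exported from an ecosystem config file. The app name, the restart threshold, and the kill timeout are assumptions for illustration. Note that PM2's `max_memory_restart` compares against the worker's process memory, which is somewhat larger than the V8 heap alone.

```typescript
// Hypothetical PM2 app definition; values mirror the diagram above.
const ecosystem = {
  apps: [
    {
      name: "nuxt-ssr", // hypothetical app name
      script: ".output/server/index.mjs", // default Nuxt build output
      exec_mode: "cluster", // PM2 load-balances across workers
      instances: 2, // two workers per container
      node_args: "--max-old-space-size=1536", // V8 heap cap per worker
      max_memory_restart: "1536M", // assumption: restart threshold
      kill_timeout: 5000, // assumption: ms to finish in-flight requests
    },
  ],
};

console.log(`${ecosystem.apps[0].instances} workers in ${ecosystem.apps[0].exec_mode} mode`);
```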

V8 Heap Cap: Trading Throughput for Predictability

By default, V8 uses a dynamic heap limit that grows based on available system memory. In containerized environments, that behavior is risky — V8 can grow beyond the container’s memory allocation and trigger an OOM kill.

Setting an explicit heap limit forces more aggressive garbage collection:
```
NODE_OPTIONS=--max-old-space-size=1536

Effect:
 Without cap: GC runs infrequently → heap grows to 3+ GiB → OOM kill
 With cap: GC runs at ~1.2 GiB → heap stays under 1.5 GiB → stable
```
```
flowchart LR
 A[Start] --> B[No explicit V8 heap cap]
 B --> C["Heap grows with available memory\n&gt; 3 GiB in container"]
 C --> D[Container OOM kill]

 A --> E[Set --max-old-space-size=1536]
 E --> F[GC runs around 1.2 GiB]
 F --> G["Heap stays &lt;= 1.5 GiB"]
 G --> H[Process stable\nSlightly lower peak throughput]
```
The trade-off is straightforward: more frequent GC pauses of 2–5 ms each reduce peak throughput by about 5%. But the process never gets OOM-killed, which is a far better outcome in production.
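
Whether the cap is actually in effect can be checked from inside the process. This small sketch reads the limit through the built-in `v8` module; the reported value is usually slightly above the configured old-space size because it also covers other heap spaces.

```typescript
import { getHeapStatistics } from "node:v8";

// heap_size_limit is reported in bytes.
function heapLimitMiB(): number {
  return Math.round(getHeapStatistics().heap_size_limit / (1024 * 1024));
}

console.log(`V8 heap limit: ${heapLimitMiB()} MiB`);
```

Logging this once at startup makes a missing or mistyped `NODE_OPTIONS` value visible immediately.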

Memory Is the Scaling Bottleneck

Load testing a typical Nuxt SSR frontend in a large SaaS or e-commerce platform reveals something counterintuitive: the application is I/O-bound rather than CPU-bound, so memory, not CPU, is the resource that runs out first.
```
flowchart TB
 subgraph RU[Resource Usage Under Load]
 CPU[CPU peak ~12%]
 MEM[Memory peak ~60%]
 BOT[Bottleneck: Memory, not CPU]
 end

 CPU --> BOT
 MEM --> BOT
```
SSR mostly waits for backend responses (for example, GraphQL or REST APIs) and renders HTML — I/O work that barely touches the CPU. But each in-flight request still holds response objects, VNode trees, and serialization buffers in memory. Under load, dozens of concurrent requests holding a few hundred kilobytes each add up quickly.

This means:
- Over-provisioning CPU wastes money — you pay for compute that sits idle
- Under-provisioning memory crashes the server — V8 heap exhaustion triggers cascading failures
- A good starting ratio is roughly 1 vCPU : 2 GiB RAM for SSR workloads

The Right-Sizing Experiment

In a representative production-like environment, you can right-size Node.js SSR containers by running load tests with different resource configurations:

Configuration	Result
4 vCPU / 8 GiB	Stable but over-provisioned
2 vCPU / 4 GiB	Stable and efficient ✓
1 vCPU / 2 GiB	Cascading failures

At 1 vCPU / 2 GiB, workers ran at 1,791 MB out of 2,048 MB — V8 was at its ceiling. Health probes timed out because the event loop was blocked by GC. The orchestrator restarted replicas, but cold-starting Nuxt takes several seconds because V8 must compile the application. During that window, the remaining replicas were overloaded, which caused them to fail health checks. The cascade continued until manual intervention.
```
sequenceDiagram
 participant R1 as Replica 1
 participant R2 as Replica 2
 participant R3 as Replica 3
 participant O as Orchestrator

 Note over R1: Memory 1791/2048 MB<br/>GC stalls<br/>Health probe timeout
 O->>R1: Mark unhealthy
 O->>R1: Restart replica
 Note over R1: Cold start (~3s)<br/>No traffic handling

 Note over R2: Now handling 2× traffic<br/>Memory spike
 O->>R2: Health probe timeout
 O->>R2: Restart replica

 Note over R3: Now handling 3× traffic<br/>Immediate failure
 O->>R3: Restart replica

 Note over R1,R3: All replicas restarting<br/>Zero capacity for ~10 seconds
```
In practice, the minimum viable per-replica compute for V8 startup plus Nuxt SSR in such an environment is about 2 vCPU / 4 GiB. Going below that introduces a cascading failure risk that replica count alone cannot absorb.

Minimum Replicas: Preventing Cold-Start Cascades

Even with correctly sized replicas, starting from too few creates problems under load. The orchestrator can launch new replicas, but each one needs time to start, compile, and begin accepting requests.

For example, with 2 replicas scaling to 15, the first traffic burst hits only 2 instances. They overload while new replicas spin up. By the time those are ready, the original 2 may already have failed.

The fix is to set minReplicas high enough to handle average production traffic without scaling out. In a typical large-scale web application, values might look like this:

Service	minReplicas	maxReplicas	Reasoning
SSR SPA	5	20	Handles page rendering (heaviest)
API	3	20	Handles business logic (lighter)
```
flowchart LR
 TRAF[Average production traffic] -->|First burst| R5[5 pre-warmed SSR SPA replicas]
 R5 --> CAP["Within capacity No scale-out needed"]

 TRAF -->|Genuine spike| SO[Autoscaler triggers scale-out]
 SO --> N[New replicas starting\nNuxt compile + V8 startup]
 R5 --> BUF[Existing 5 replicas buffer traffic]
 N --> READY[New replicas ready\nTraffic distributed]
```
At 5 pre-warmed SPA replicas, normal production traffic stays within capacity and does not trigger scaling. Scale-out only activates for genuine spikes, and the existing 5 replicas buffer traffic while new ones start.
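
One way to turn "high enough to handle average production traffic" into a number is simple capacity arithmetic. The sketch below is an illustration under stated assumptions: the helper name `minReplicasFor`, the headroom factor, and every figure in the example call are hypothetical and would be replaced by values measured in your own load tests.

```typescript
// Hypothetical sizing helper. All inputs are assumptions, not measurements
// from the load tests described in this article.
interface ReplicaSizingInput {
  averageRps: number;    // average production requests per second
  perReplicaRps: number; // sustainable requests per second for one warm replica
  headroom: number;      // fraction of capacity kept free, e.g. 0.25 = 25%
}

function minReplicasFor({ averageRps, perReplicaRps, headroom }: ReplicaSizingInput): number {
  if (averageRps < 0 || perReplicaRps <= 0 || headroom < 0 || headroom >= 1) {
    throw new RangeError("inputs out of range");
  }
  // Load each replica only to (1 - headroom) of what it can sustain, so the
  // warm replicas can absorb a burst while new ones are still cold-starting.
  const usablePerReplica = perReplicaRps * (1 - headroom);
  return Math.max(1, Math.ceil(averageRps / usablePerReplica));
}

// Illustrative numbers only: 140 rps average, 40 rps per replica, 25% headroom.
const ssrMin = minReplicasFor({ averageRps: 140, perReplicaRps: 40, headroom: 0.25 });
console.log(ssrMin);
```

Keeping headroom explicit makes the trade-off visible: a larger value costs idle replicas but widens the window in which a cold start is harmless.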

Health Monitoring

The application exposes a health endpoint that returns per-worker metrics, enabling the orchestrator and internal tools to see exactly what PM2 workers are doing:
```
GET /api/health/pm2
Response:
{
 "workers": [
 {
 "id": 0,
 "cpu": 8.2,
 "memory": 1234567890,
 "restarts": 3,
 "uptime": 86400000,
 "status": "online"
 },
 {
 "id": 1,
 "cpu": 5.1,
 "memory": 987654321,
 "restarts": 1,
 "uptime": 72000000,
 "status": "online"
 }
 ]
}
```
The endpoint is protected by an internal API guard — it returns 404 for any caller that is not a health probe or internal service with the correct authorization header. External callers cannot even discover that it exists.
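
On the consuming side, an internal tool could use this response to flag workers that are close to their memory ceiling. The following is a minimal sketch, assuming that `memory` is reported in bytes and that comparing it against the V8 heap cap is a good-enough early warning; the function name `findWorkersAtRisk` and the 85% threshold are hypothetical.

```typescript
// Shape of the health response shown above.
interface Pm2Worker {
  id: number;
  cpu: number;
  memory: number;   // assumed to be bytes
  restarts: number;
  uptime: number;
  status: string;
}

interface Pm2Health {
  workers: Pm2Worker[];
}

const MIB = 1024 * 1024;

// Returns the ids of workers that are not online or whose reported memory
// is at or above warnRatio of the configured heap cap.
function findWorkersAtRisk(health: Pm2Health, heapCapMiB: number, warnRatio = 0.85): number[] {
  return health.workers
    .filter((w) => w.status !== "online" || w.memory / MIB >= heapCapMiB * warnRatio)
    .map((w) => w.id);
}

// The two workers from the sample response, checked against a 1536 MiB cap.
const sample: Pm2Health = {
  workers: [
    { id: 0, cpu: 8.2, memory: 1234567890, restarts: 3, uptime: 86400000, status: "online" },
    { id: 1, cpu: 5.1, memory: 987654321, restarts: 1, uptime: 72000000, status: "online" },
  ],
};
console.log(findWorkersAtRisk(sample, 1536));
```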

The Validated Configuration

After extensive load testing in a realistic production scenario, a configuration like the following has proven to pass all thresholds:
```
Per Container:
 CPU: 2 vCPU
 Memory: 4 GiB
 PM2: 2 workers per container
 V8: --max-old-space-size=1536 per worker

Scaling:
 SPA: min 5, max 20 replicas
 API: min 3, max 20 replicas

Result at 6× production load:
 Median response time: 165 ms
 Error rate: 0.82%
 CPU peak: 12% of allocation
 Memory peak: 60% of allocation
```
```
flowchart TB
 subgraph PC[Per Container]
 CPU[CPU: 2 vCPU]
 MEM[Memory: 4 GiB]
 PM2W[PM2: 2 workers per container]
 V8[V8: --max-old-space-size=1536 per worker]
 end

 subgraph SC[Scaling]
 SPA[SPA: min 5, max 20 replicas]
 API[API: min 3, max 20 replicas]
 end

 subgraph RES[Result at 6× production load]
 RT[Median response time: 165 ms]
 ER[Error rate: 0.82%]
 CPUU[CPU peak: 12% of allocation]
 MEMU[Memory peak: 60% of allocation]
 end

 PC --> SC --> RES
```
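The memory arithmetic behind this configuration can be checked in a few lines. This is a sketch with hypothetical names (`ContainerSpec`, `heapBudget`). Note that `--max-old-space-size` caps only V8's old generation, so each process also needs room outside that cap for new space, buffers, and native memory.

```typescript
interface ContainerSpec {
  memoryMiB: number;      // container memory limit
  workers: number;        // PM2 workers per container
  maxOldSpaceMiB: number; // --max-old-space-size per worker
}

// Splits the container limit into the part reserved by the heap caps
// and the remainder left for everything that lives outside the V8 old space.
function heapBudget(spec: ContainerSpec) {
  const reservedForHeaps = spec.workers * spec.maxOldSpaceMiB;
  return {
    reservedForHeaps,
    remaining: spec.memoryMiB - reservedForHeaps,
    heapShare: reservedForHeaps / spec.memoryMiB,
  };
}

// 4 GiB container, 2 workers, 1536 MiB cap each.
const validated = heapBudget({ memoryMiB: 4096, workers: 2, maxOldSpaceMiB: 1536 });
console.log(validated); // reservedForHeaps: 3072, remaining: 1024, heapShare: 0.75
```

With these values, the two heap caps claim three quarters of the container and leave 1 GiB for everything else.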
Lessons Learned

Node.js is not a “fire and forget” runtime

Unlike compiled languages with deterministic memory management, Node.js requires active memory management for long-lived processes. V8 heap caps, PM2 restarts, and minimum replica sizing are not optimizations — they are necessities.

Size for memory, not CPU

SSR workloads are I/O-bound. The CPU spends most of its time waiting for backend responses. Provision memory generously and CPU conservatively. A 1:2 vCPU:GiB ratio is a solid starting point.

Cold starts are the hidden enemy of auto-scaling

Auto-scaling sounds effortless until you realize new replicas take several seconds to become productive. During that window, existing replicas have to absorb the load. If they cannot, cascading failures follow. Adequate minReplicas removes that risk.

Load test the validated configuration, not the ideal one

It is tempting to load test with generous resources and right-size later. But right-sizing can expose failure modes that do not exist at larger sizes. Always load test the production configuration, not a more generous version.

What’s Next
- Article 11: Multi-Environment Infrastructure — Azure Container Apps and the Configuration System — Managing three environments with generated configuration.
- Article 12: Security in a Nuxt SSR App — CSRF, Azure AD, CSP, and More — The security layers that protect a server-rendered application.
- Article 13: Observability and Distributed Tracing — Application Insights End-to-End — How every request is traced from the reverse proxy through the application to the backend.
Munir Husseini is a software architect specializing in full-stack TypeScript, .NET, and cloud-native architectures.
June 6, 2026

Multi-Environment Infrastructure — Azure Container Apps and the Configuration System

Eleventh in a series about migrating from legacy architectures to a modern Nuxt 4 stack.

The Environment Problem

Any non-trivial application needs multiple environments: development, test, production. In a large enterprise application, feature branches ideally each get an isolated environment so developers can share a live preview without blocking one another.

Configuration is where the real complexity hides. Three environments × four services × dozens of environment variables × secrets × scaling rules = hundreds of values that must be correct for every combination. Managing this by hand inevitably leads to deployment failures from miscopied connection strings or wrong environment variables.

The Architecture: Three Environments on Azure Container Apps

A typical setup runs three distinct environment types, all on Azure Container Apps (ACA):

```
flowchart LR
    subgraph ACA[Azure Container Apps Environment]
        direction LR

        subgraph FE[Feature Branches]
            direction TB
            FE_Title[Per-branch:]
            FE_SPA[SPA]
            FE_API[API]
            FE_Proxy[Proxy]
            FE_Redis[Redis]
            FE_Iso[Isolated per branch]
        end

        subgraph TEST[Test]
            direction TB
            T_Title[Shared:]
            T_SPA[SPA]
            T_API[API]
            T_Proxy[Proxy]
            T_Redis[Redis]
            T_Notes[Stable integration]
        end

        subgraph PROD[Production]
            direction TB
            P_Title[Shared:]
            P_SPA[SPA]
            P_API[API]
            P_Proxy[Proxy]
            P_Redis[Redis]
            P_Notes[Live traffic]
        end
    end
```

Feature Environments

In many teams, every feature branch gets its own fully isolated deployment: SPA, API, proxy, and Redis containers. The CI/CD pipeline provisions on git push and tears everything down when the branch is deleted.

- Developers share a live URL within minutes of pushing
- No shared test environment lock: multiple features can be tested in parallel
- Full isolation: one branch’s bugs never impact another

```
flowchart LR
    subgraph Dev[Developer Workflow]
        direction LR
        A[git push to feature branch]
        B["CI/CD: provision\nFeature Environment\n(SPA, API, Proxy, Redis)"]
        C["Share live URL\nfor review & QA"]
        D[Branch merged\nand deleted]
        E[CI/CD: teardown\nFeature Environment]

        A --> B --> C --> D --> E
    end
```

Per-Branch Redis

Each feature branch gets its own Redis container. The pipeline rewrites Redis connection strings in the manifests at deploy time, preventing any cross-branch cache pollution.

```
flowchart LR
    subgraph BranchA[Feature Branch A]
        A_API[API A]
        A_R[Redis A]
        A_API --> A_R
    end

    subgraph BranchB[Feature Branch B]
        B_API[API B]
        B_R[Redis B]
        B_API --> B_R
    end

    style BranchA fill:#e8f5e9,stroke:#2e7d32
    style BranchB fill:#e3f2fd,stroke:#1565c0
```

The Configuration Generator

Manually managing hundreds of configuration values does not scale. A YAML-based configuration system can generate all deployment artifacts from a single source of truth.

Configuration Merge Order

Configuration values are defined in layers, where later layers override earlier ones:

```
flowchart TB
    L1["Layer 1:\nvalues.yaml\n(base + test defaults)"]
    L2["Layer 2:\nenvironments/production.yaml\n(production overrides)"]
    L3["Layer 3:\nplatforms/container-apps.yaml\n(platform defaults)"]
    L4["Layer 4:\nplatforms/container-apps.production.yaml\n(platform × env)"]
    M[⭣\nMerged configuration object]
    O1[Container Apps\nJSON manifests]
    O2[Azure Bicep\nparameter files]
    O3[Pipeline\nvariable files]

    L1 --> L2 --> L3 --> L4 --> M
    M --> O1
    M --> O2
    M --> O3
```

To add a new environment variable, define it once in values.yaml with a default. If production needs a different value, override it in environments/production.yaml. The generator merges all layers and emits the final artifacts.
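
The layering described above amounts to a deep merge in which later layers win. The sketch below shows that idea on already-parsed YAML. It is not the generator's actual code: `mergeLayers` and the sample keys (`LOG_LEVEL`, `CACHE_TTL`, `minReplicas`) are hypothetical, and it assumes arrays and scalars are replaced wholesale rather than merged.

```typescript
type ConfigValue = string | number | boolean | null | ConfigValue[] | { [key: string]: ConfigValue };
type ConfigObject = { [key: string]: ConfigValue };

function isPlainObject(value: ConfigValue | undefined): value is ConfigObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Merges layers left to right. Nested objects are merged key by key;
// anything else (scalars, arrays) from a later layer replaces the earlier value.
function mergeLayers(...layers: ConfigObject[]): ConfigObject {
  const result: ConfigObject = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer)) {
      const existing = result[key];
      result[key] = isPlainObject(existing) && isPlainObject(value)
        ? mergeLayers(existing, value)
        : value;
    }
  }
  return result;
}

// Stand-ins for values.yaml and environments/production.yaml after parsing.
const baseLayer: ConfigObject = {
  spa: { env: { LOG_LEVEL: "debug", CACHE_TTL: 60 }, scale: { minReplicas: 1 } },
};
const productionLayer: ConfigObject = {
  spa: { env: { LOG_LEVEL: "warn" }, scale: { minReplicas: 5 } },
};

const merged = mergeLayers(baseLayer, productionLayer);
console.log(JSON.stringify(merged));
```

Here the production layer overrides `LOG_LEVEL` and `minReplicas`, while `CACHE_TTL` is inherited from the base layer.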

Generated Artifacts

The configuration generator produces three kinds of outputs:

Artifact	Purpose	Example
Container Apps manifests	Complete container spec (env vars, secrets, scaling)	`container-apps/spa.test.json`
Bicep parameter files	Infrastructure parameters (environment name, region)	`container-apps/my-app.test.bicepparam`
Pipeline variable files	CI/CD variables (image tags, resource names)	`variables/common.yml`

```
flowchart LR
    SRC["Single source of truth\n(YAML config)"]
    GEN[Configuration generator]

    MAN[Container Apps\nJSON manifests]
    BICEP[Bicep parameter files]
    VARS[Pipeline variable files]

    SRC --> GEN
    GEN --> MAN
    GEN --> BICEP
    GEN --> VARS
```

Separation of Infrastructure and Application Configuration

A critical design choice in large systems: infrastructure and application configuration are treated as separate concerns.

```
flowchart LR
    subgraph Infra["Infrastructure (Bicep)"]
        I1[Container Apps Environment]
        I2[Application Insights]
        I3[Other platform resources]
        I_Mgr["Managed by:\nInfrastructure pipeline\n(runs rarely)"]
    end

    subgraph AppCfg["Application (JSON Manifests)"]
        A1[Container image + tag]
        A2[Environment variables]
        A3["Secrets (Key Vault refs)"]
        A4[Scaling rules]
        A5["Resource limits (CPU/RAM)"]
        A6[Ingress configuration]
        A_Mgr["Managed by:\nBuild/deploy pipeline\n(runs every deployment)"]
    end

    Infra -->|"Provides infrastructure\nendpoints & resources"| AppCfg
```

Infrastructure — the Container Apps Environment, monitoring, and other platform resources — changes rarely and is defined with Bicep. Application configuration — environment variables, secrets, scaling rules, resource limits — changes with each deployment and lives in generated JSON manifests.

Risky, infrequent infrastructure changes are decoupled from routine application releases that run multiple times per day.

Secret Management

Sensitive values — connection strings, API keys, encryption keys — never live in Git. Secrets are stored in Azure Key Vault and referenced by name in the manifests:

```
Manifest (in Git):
  env:
    - name: NUXT_REDIS_CONNECTION_STRING
      secretRef: redis-connection-string    ← reference, not value

Key Vault:
  redis-connection-string = "redis://host:6379,password=..."
                                            ← actual value
```

```
flowchart LR
    subgraph Git[Git Repo]
        M[Manifest\nsecretRef: redis-connection-string]
    end

    subgraph KV[Azure Key Vault]
        S[Secret:\nredis-connection-string\n= actual value]
    end

    subgraph ACA[Azure Container Apps Runtime]
        R[Container\nat startup]
    end

    M -. reference name .-> R
    S -. value resolution .-> R
```

The Container Apps runtime resolves these references at startup. Secret values never show up in CI/CD logs, Git history, or committed manifests.

Runtime Placeholders

Some values are only known at deploy time — for example, the image tag (from the build) or the Application Insights connection string (from infrastructure). Placeholders handle these late-bound values:

```
Manifest template:
  image: myregistry.azurecr.io/spa:__IMAGE_TAG__
  env:
    - name: APPLICATIONINSIGHTS_CONNECTION_STRING
      value: __APPINSIGHTS_CONNECTION_STRING__

Deploy pipeline substitution:
  jq '.properties.template.containers[0].image |=
      gsub("__IMAGE_TAG__"; "20260602.3")' manifest.json
```

```
sequenceDiagram
    participant B as Build
    participant P as Deploy Pipeline
    participant M as Manifest Template
    participant ACA as Azure Container Apps

    B->>P: Produce image tag\n(e.g. 20260602.3)
    P->>M: Load manifest template\nwith __IMAGE_TAG__ / __APPINSIGHTS_CONNECTION_STRING__
    P->>P: Use jq to substitute\nplaceholders with real values
    P->>ACA: Apply concrete manifest
    ACA->>ACA: Run container with\nresolved image & settings
```

The pipeline replaces placeholders at deploy time using jq. Manifests remain deterministic — the same manifest plus different placeholder values yields different environments.
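
The same substitution can be expressed as a small function, which is handy for unit-testing the placeholder convention outside the pipeline. This is a sketch rather than the pipeline's implementation (which uses jq): `substitutePlaceholders` is a hypothetical helper, and failing on an unknown placeholder is an assumption of this example, not something the article states.

```typescript
// Replaces every __NAME__ placeholder with the matching entry from `values`.
// Placeholder names are upper-case words joined by single underscores.
function substitutePlaceholders(template: string, values: Record<string, string>): string {
  return template.replace(/__([A-Z0-9]+(?:_[A-Z0-9]+)*)__/g, (_match: string, name: string) => {
    const value = values[name];
    if (value === undefined) {
      // Assumption: an unresolved placeholder should stop the deployment
      // instead of reaching the container as a literal string.
      throw new Error(`No value for placeholder __${name}__`);
    }
    return value;
  });
}

const image = substitutePlaceholders("myregistry.azurecr.io/spa:__IMAGE_TAG__", {
  IMAGE_TAG: "20260602.3",
});
console.log(image); // myregistry.azurecr.io/spa:20260602.3
```

When substituting into JSON text rather than a single field, values that may contain quotes or backslashes would need JSON escaping first; operating on individual fields, as the jq example does, avoids that problem.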

Blue-Green Deployments

Test and production environments commonly use blue-green deployments: the new version is deployed alongside the old one, validated, and then traffic is switched.

```
flowchart TB
    subgraph Before[Before]
        BO["Old Revision\n(v20260601)\n100% traffic"]
    end

    subgraph During[During Deploy]
        DO["Old Revision\n(v20260601)\n100% traffic"]
        DN["New Revision\n(v20260602)\n0% traffic\n(warming up)"]
    end

    subgraph After[After Validation]
        AO["Old Revision\n(v20260601)\n0% traffic\n(standby)"]
        AN["New Revision\n(v20260602)\n100% traffic"]
    end

    subgraph Rollback[Rollback]
        RO["Switch traffic back\nto old revision\n(instant)"]
    end
```

The old revision remains deployed at 0% traffic. Rolling back is a single traffic flip — no new deployment, effectively sub-second rollback.

Fully Isolated Chains

Both versions run side by side during the transition. To avoid mixed-version states (for example, a new SPA calling an old API), environment variables are rewritten at deploy time to point to revision-specific hostnames:

```
Old Chain: Old Proxy → Old SPA → Old API (all on main hostnames)
New Chain: New Proxy → New SPA → New API (all on revision hostnames)
```

```
flowchart LR
    subgraph Old["Old Chain\n(main hostnames)"]
        OP[Old Proxy]
        OS[Old SPA]
        OA[Old API]
        OP --> OS --> OA
    end

    subgraph New["New Chain\n(revision hostnames)"]
        NP[New Proxy]
        NS[New SPA]
        NA[New API]
        NP --> NS --> NA
    end

    style Old fill:#fff3e0,stroke:#fb8c00
    style New fill:#e3f2fd,stroke:#1565c0
```

Traffic is switched at the proxy level — a single switch moves all requests to the new chain in one shot.

Production Migration: Front Door Traffic Switching

For an initial cutover from a legacy system to a new stack, Azure Front Door enables zero-downtime traffic switching:

```
Before Go-Live:
  Front Door → Old App Service (100% traffic)

During Migration:
  Front Door → Old App Service (100%)
  New Container Apps (0%, ready and warmed)

Go-Live:
  Front Door → New Container Apps (100%)
  Old App Service (0%, still running)

If issues:
  Front Door → Old App Service (100%)  ← instant rollback
```

```
flowchart TB
    subgraph Before[Before Go-Live]
        FD1[Azure Front Door]
        OA1[Old App Service\n100% traffic]
        FD1 --> OA1
    end

    subgraph Migration[During Migration]
        FD2[Azure Front Door]
        OA2[Old App Service\n100% traffic]
        NC2["New Container Apps\n0% traffic\n(ready & warmed)"]
        FD2 --> OA2
        FD2 -. monitoring .- NC2
    end

    subgraph GoLive[Go-Live]
        FD3[Azure Front Door]
        NC3[New Container Apps\n100% traffic]
        OA3["Old App Service\n0% traffic\n(still running)"]
        FD3 --> NC3
    end

    subgraph Issue[If issues]
        FD4[Azure Front Door]
        OA4["Old App Service\n100% traffic\n(instant rollback)"]
        FD4 --> OA4
    end
```

Both systems run in parallel. The switch is a Front Door configuration change — no DNS propagation delays, no cold starts. Rollback is likewise instant.

Lessons Learned

Generate configuration, don’t manage it

Manual configuration management across environments does not scale. Treat it as a code generation problem: define values once, override per environment, and let a generator produce the final artifacts. This removes entire classes of deployment bugs.

Separate infrastructure from application deployment

Infrastructure changes are rare, high-risk, and require planning. Application deployments are frequent and should be low-friction. Coupling the two means either infrastructure changes slow everything down or every deploy becomes risky.

Feature branch environments reshape the workflow

When every branch has its own live URL, code review turns into live review. Stakeholders can exercise features before they merge. QA can work in parallel with development. The infrastructure cost is trivial compared to the productivity gain.

Blue-green is worth the complexity

Being able to deploy a new version, validate it under real traffic conditions, and then flip traffic with instant rollback changes the risk profile of releases. Deployments become uneventful.

What’s Next

- Article 12: Security in a Nuxt SSR App — CSRF, Azure AD, CSP, and More — The security layers that protect a server-rendered application.
- Article 13: Observability and Distributed Tracing — Application Insights End-to-End — How every request is traced across all layers.
- Article 14: AI-Assisted Development — MCP, Debug Chatbot, and the Shared Language of the Codebase — Making AI assistants genuinely useful for live debugging.


Security in a Nuxt SSR App — CSRF, OAuth, CSP, and More

Twelfth in a series about migrating from legacy architectures to a modern Nuxt 4 stack.

Security in SSR Is Different

An SSR application has a very different attack surface from a client-side SPA. The server is responsible for rendering HTML with embedded state, generating tokens, setting cookies, and proxying API calls — all before the browser executes any JavaScript.

Security must be enforced at the server rendering layer. A CSRF token created during SSR has to survive hydration. Authentication must block the HTTP response before it ever reaches the browser. CSP must be sent as an HTTP header during rendering, not injected later as a meta tag.

Reusing SPA security patterns directly in SSR apps creates gaps — not because the patterns are wrong, but because they operate at the wrong layer.

```
flowchart LR
  subgraph Client["Browser"]
    HTML["SSR HTML + Embedded State"]
    JS["Hydrated JS App"]
  end

  subgraph Server["Nuxt SSR Stack"]
    Render["SSR Render Layer"]
    Tokens["Token Generation<br/>(CSRF, Auth)"]
    Cookies["Set Cookies<br/>(HTTP-only, SameSite)"]
    Proxy["API Proxy / BFF"]
  end

  Client <-- "HTTP Response" --> Server
  Render --> HTML
  Render --> Tokens
  Tokens --> Cookies
  Render --> Proxy
  Proxy -->|"Internal API Calls"| Backend["Upstream APIs / Services"]
```

CSRF Protection: Dual-Token System with User-Agent Binding

CSRF protection in an SSR app needs more than the classic double-submit cookie pattern.

The Standard Pattern (And Why It’s Not Enough)

The traditional double-submit approach: generate a token, store it in a cookie, and require it in a request header. The server verifies that the cookie and header values match. This works because a cross-site attacker cannot read the cookie in order to set the matching header.

The weakness: if an attacker somehow gets both the cookie and the token (for example, via a subdomain cookie issue or XSS on a related domain), they can replay the request from any browser.

The Enhanced Pattern: User-Agent Binding

The fix is to bind the token to the specific browser that requested it:

```
flowchart TB
  subgraph SSR["Token Generation During SSR"]
    UA["User-Agent Header"]
    Time["Current Timestamp"]
    Salt["Random UUID Salt"]
    Type["Token Type = 'client'"]
    HashUA["SHA-256(User-Agent)<br/>→ first 16 chars"]
    Payload["Token Payload<br/>d: timestamp<br/>p: type<br/>s: salt<br/>ua: UA hash"]
    Enc["Encrypt with AES-256-GCM"]

    UA --> HashUA
    Time --> Payload
    Type --> Payload
    Salt --> Payload
    HashUA --> Payload
    Payload --> Enc
  end

  Enc --> Cookie["Set HTTP-only cookie: csrf"]
  Enc --> Embed["Embed token in SSR HTML<br/>(for X-XSRF-TOKEN header)"]
```

```
flowchart TB
  subgraph Validation["Token Validation on Every API Request"]
    HeaderTok["X-XSRF-TOKEN header"]
    CookieTok["csrf cookie"]
    Decrypt["Decrypt header token"]
    Exp["Check expiration (24h TTL)"]
    ReqUA["Current User-Agent"]
    HashReqUA["SHA-256(Req UA)<br/>→ first 16 chars"]
    MatchUA["Compare ua in token<br/>with current UA hash"]
    MatchCookie["Compare cookie value<br/>with header value"]
    Ok["All checks pass"]
    Reject["Reject 403 Forbidden"]

    HeaderTok --> Decrypt
    Decrypt --> Exp
    Decrypt --> MatchUA
    ReqUA --> HashReqUA --> MatchUA
    Decrypt --> MatchCookie
    CookieTok --> MatchCookie

    Exp -->|valid| MatchUA
    Exp -->|expired| Reject
    MatchUA -->|mismatch| Reject
    MatchUA -->|match| MatchCookie
    MatchCookie -->|mismatch| Reject
    MatchCookie -->|match| Ok
    Ok -->|"Process API request"| App["Application Handler"]
  end
```

A stolen token fails validation when it is replayed from a browser that sends a different User-Agent, because the User-Agent hash will not match. Combined with AES-256-GCM encryption, random salts, and a 24-hour TTL, this creates layered defenses against replay attacks.
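
The generation and validation steps in the two diagrams can be sketched with Node's built-in `node:crypto` module. This is a minimal illustration, not the application's code. The payload fields (`d`, `p`, `s`, `ua`) follow the diagram, while the function names, the wire format (IV, auth tag, and ciphertext concatenated and base64url-encoded), and the 12-byte IV are assumptions of this example.

```typescript
import { createCipheriv, createDecipheriv, createHash, randomBytes, randomUUID, timingSafeEqual } from "node:crypto";

interface CsrfPayload {
  d: number;   // creation timestamp (ms)
  p: "client"; // token type
  s: string;   // random salt
  ua: string;  // first 16 hex chars of SHA-256(User-Agent)
}

const TTL_MS = 24 * 60 * 60 * 1000;

const hashUserAgent = (userAgent: string): string =>
  createHash("sha256").update(userAgent).digest("hex").slice(0, 16);

// Builds the payload and encrypts it with AES-256-GCM.
// Assumed wire format: base64url(iv[12] | authTag[16] | ciphertext).
function createToken(key: Buffer, userAgent: string, now: number = Date.now()): string {
  const payload: CsrfPayload = { d: now, p: "client", s: randomUUID(), ua: hashUserAgent(userAgent) };
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(JSON.stringify(payload), "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]).toString("base64url");
}

// Returns true only if the token decrypts, is not expired, matches the
// current User-Agent, and the cookie value equals the header value.
function validateToken(
  key: Buffer,
  headerToken: string,
  cookieToken: string | undefined,
  userAgent: string,
  now: number = Date.now(),
): boolean {
  try {
    const raw = Buffer.from(headerToken, "base64url");
    const decipher = createDecipheriv("aes-256-gcm", key, raw.subarray(0, 12));
    decipher.setAuthTag(raw.subarray(12, 28));
    const json = Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]).toString("utf8");
    const payload = JSON.parse(json) as CsrfPayload;

    if (typeof payload.d !== "number" || now - payload.d > TTL_MS) return false; // expired
    if (payload.ua !== hashUserAgent(userAgent)) return false;                   // other browser
    if (cookieToken === undefined) return false;                                 // no cookie sent

    const a = Buffer.from(cookieToken);
    const b = Buffer.from(headerToken);
    return a.length === b.length && timingSafeEqual(a, b);
  } catch {
    return false; // tampered or malformed tokens fail GCM authentication
  }
}
```

A caller would map a `false` result to the 403 response shown in the validation diagram. Because GCM authenticates the ciphertext, any modification of the token is rejected at the decryption step, before the payload is inspected.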

SSR Bypass Tokens

During SSR, the Nuxt server calls its own GraphQL or REST APIs — there is no browser, and therefore no CSRF cookie. A server-only bypass token allows these internal SSR requests to pass CSRF checks.

This token is generated per request, stored only in the Nitro event context (never exposed to the client), and validated using User-Agent binding but without the cookie–header comparison.

```
sequenceDiagram
  participant Browser
  participant NuxtSSR as Nuxt SSR Renderer
  participant NitroCtx as Nitro Event Context
  participant InternalAPI as Internal API

  Browser->>NuxtSSR: HTTP GET /page
  NuxtSSR->>NitroCtx: Create SSR bypass token<br/>(bound to UA, no cookie)
  Note right of NitroCtx: Token stored only in<br/>server context, not sent<br/>to the client
  NuxtSSR->>InternalAPI: Request with SSR bypass token<br/>(e.g. header X-SSR-CSRF)
  InternalAPI-->>InternalAPI: Validate token + UA<br/>(no cookie-header check)
  InternalAPI-->>NuxtSSR: Data response
  NuxtSSR-->>Browser: Rendered HTML
```

Encryption Key from Key Vault

The AES key is loaded from a cloud key vault (for example, Azure Key Vault or AWS KMS) at startup and stored on globalThis. If the key cannot be loaded, the server refuses to start — a fail-fast approach with no degraded mode. CSRF protection is never quietly turned off.

```
flowchart LR
  subgraph Startup["Nuxt Server Startup"]
    KV["Cloud Key Vault<br/>(AWS KMS / Azure Key Vault)"]
    Fetch["Fetch AES-256 Key"]
    Store["Store key on globalThis"]
    Ready["Server Ready"]
    Fail["Abort startup<br/>(process exit)"]

    KV --> Fetch
    Fetch -->|success| Store --> Ready
    Fetch -->|failure| Fail
  end
```
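
The fail-fast startup can be sketched as follows. The vault client is deliberately left out: `fetchKey` stands in for whatever SDK call retrieves the key, and the names `initCsrfKey`, `getCsrfKey`, and the `__csrfKey` property are hypothetical.

```typescript
type GlobalWithKey = typeof globalThis & { __csrfKey?: Buffer };

// Loads the key once at startup. Any failure (vault unreachable, wrong key
// size) rejects, and the caller is expected to abort the process.
async function initCsrfKey(fetchKey: () => Promise<Buffer>): Promise<void> {
  const key = await fetchKey();
  if (key.length !== 32) {
    throw new Error(`Expected a 32-byte AES-256 key, got ${key.length} bytes`);
  }
  (globalThis as GlobalWithKey).__csrfKey = key;
}

// Used by request handlers. Throws rather than falling back to "no CSRF".
function getCsrfKey(): Buffer {
  const key = (globalThis as GlobalWithKey).__csrfKey;
  if (!key) throw new Error("CSRF key not initialised");
  return key;
}

// At startup, for example in a server plugin (fetchFromVault is hypothetical):
//   initCsrfKey(fetchFromVault).catch((err) => {
//     console.error("Cannot load CSRF key, refusing to start", err);
//     process.exit(1);
//   });
```

Keeping the exit in the caller rather than inside `initCsrfKey` makes the loader testable while preserving the no-degraded-mode behaviour.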

OAuth 2.0 Authentication (Authorization Code Flow)

Test and staging environments often require authentication, even for otherwise public-facing applications. A common pattern is server-side OAuth 2.0 Authorization Code Flow — the client secret is never exposed to the browser.

```
sequenceDiagram
  participant Browser
  participant Nuxt as Nuxt Server
  participant IdP as Identity Provider

  Browser->>Nuxt: GET /protected-page
  Nuxt-->>Nuxt: Check auth cookie
  alt No valid token
    Nuxt-->>Browser: 302 Redirect to /api/auth/login
    Browser->>Nuxt: GET /api/auth/login
    Nuxt-->>Browser: 302 Redirect to IdP auth URL
    Browser->>IdP: GET /authorize?client_id=...&redirect_uri=/api/auth/callback
    IdP-->>Browser: 302 Redirect to /api/auth/callback?code=...
    Browser->>Nuxt: GET /api/auth/callback?code=...
    Nuxt->>IdP: POST /token (exchange code for token)
    IdP-->>Nuxt: Access token
    Nuxt-->>Browser: Set HTTP-only auth cookie + 302 /protected-page
    Browser->>Nuxt: GET /protected-page
    Nuxt-->>Browser: 200 Full HTML
  else Valid token
    Nuxt-->>Browser: 200 Full HTML (no redirect)
  end
```

Key security properties:

- Server-side token exchange: the client secret is used only on the server, never sent to the browser.
- HTTP-only cookies: tokens live in cookies that JavaScript cannot read (mitigating XSS).
- Middleware blocking: unauthenticated requests are stopped in server middleware before any page content is rendered. Unauthorized users cannot even download the app’s JavaScript bundle.

Environment-Based Security Tiers

Not every environment needs the full security stack:

Environment	CSRF	Auth	CSP	Rationale
Development	Off	Off	Off	Fast iteration, minimal friction
Docker	Off	On	Off	Protects shared dev environments
Test	On	On	On	Production-like security
Production	On	Off	On	Public site, no login required

Development turns security off for productivity. Test enables everything to catch issues before release. Production enables CSRF and CSP but omits authentication for a public site.

```
flowchart LR
  Dev["Development"]:::off -->|Deploy| Docker["Docker / Shared Dev"]:::partial
  Docker -->|Promote| Test["Test / Staging"]:::full
  Test -->|Promote| Prod["Production"]:::prod

  classDef off fill:#eee,stroke:#999,color:#333;
  classDef partial fill:#ffe6b3,stroke:#cc9a00,color:#333;
  classDef full fill:#c6f6d5,stroke:#2f855a,color:#000;
  classDef prod fill:#bee3f8,stroke:#2b6cb0,color:#000;

  Dev --- DevSec["CSRF: Off<br/>Auth: Off<br/>CSP: Off"]
  Docker --- DockSec["CSRF: Off<br/>Auth: On<br/>CSP: Off"]
  Test --- TestSec["CSRF: On<br/>Auth: On<br/>CSP: On"]
  Prod --- ProdSec["CSRF: On<br/>Auth: Off<br/>CSP: On"]
```
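
The tier table translates naturally into a typed lookup, so each middleware can ask one place whether it should be active. This is a sketch with hypothetical names (`SECURITY_TIERS`, `resolveTier`). Falling back to the fully protected test tier for unknown environment names is an assumption of this example, chosen so that a typo never silently disables protection.

```typescript
type Environment = "development" | "docker" | "test" | "production";

interface SecurityTier {
  csrf: boolean;
  auth: boolean;
  csp: boolean;
}

// Mirrors the table above, one row per environment.
const SECURITY_TIERS: Record<Environment, SecurityTier> = {
  development: { csrf: false, auth: false, csp: false },
  docker: { csrf: false, auth: true, csp: false },
  test: { csrf: true, auth: true, csp: true },
  production: { csrf: true, auth: false, csp: true },
};

function resolveTier(name: string | undefined): SecurityTier {
  // An own-property check avoids matching inherited keys such as "toString".
  if (name !== undefined && Object.prototype.hasOwnProperty.call(SECURITY_TIERS, name)) {
    return SECURITY_TIERS[name as Environment];
  }
  // Assumption: unknown or missing names get everything switched on.
  return SECURITY_TIERS.test;
}

console.log(resolveTier("production")); // { csrf: true, auth: false, csp: true }
```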

Runtime Content Security Policy from CMS

Hardcoding CSP in application config is an operational choke point: every new script source (analytics, chat widgets, A/B testing) forces a code change and deployment.

Treating CSP as content solves this. A server plugin fetches CSP from the CMS at runtime and applies it as an HTTP header:

```
flowchart TB
  Editor["CMS Editor<br/>updates CSP entry"] --> CMS["Headless CMS"]
  CMS --> Cache["Server Plugin<br/>fetches CSP (cache 5 min)"]
  Cache --> Header["Set Content-Security-Policy<br/>header on HTML responses"]
  Header --> Browser["Browser enforces CSP"]

  CMS -.failure.-> Fallback["Use hardcoded fallback CSP<br/>(stricter, not looser)"]
  Fallback --> Header
```

A hardcoded fallback covers CMS downtime. The fallback is deliberately stricter than the CMS-managed policy, so if the CMS fetch fails, the app runs under a tighter policy, never a looser one.
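
The caching and fallback behaviour can be sketched independently of any CMS client. In the example below, `fetchCsp` stands in for the CMS call, the fallback policy string is a placeholder, and `createCspProvider` is a hypothetical name. The 5-minute TTL comes from the diagram above.

```typescript
const FALLBACK_CSP = "default-src 'self'"; // placeholder for the hardcoded strict policy
const CACHE_TTL_MS = 5 * 60 * 1000;

// Returns a function that yields the current CSP string: the cached CMS value
// while it is fresh, a newly fetched one otherwise, and the strict fallback
// whenever the fetch fails or returns an empty policy.
function createCspProvider(fetchCsp: () => Promise<string>, now: () => number = Date.now) {
  let cached: { value: string; expiresAt: number } | undefined;

  return async function getCsp(): Promise<string> {
    if (cached && cached.expiresAt > now()) return cached.value;
    try {
      const value = (await fetchCsp()).trim();
      if (!value) throw new Error("CMS returned an empty CSP");
      cached = { value, expiresAt: now() + CACHE_TTL_MS };
      return value;
    } catch {
      // The fallback is not cached, so the next request retries the CMS.
      return FALLBACK_CSP;
    }
  };
}
```

A server plugin would call the returned function for each HTML response and set the result as the `Content-Security-Policy` header.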

Internal API Guard

Health and diagnostics endpoints expose sensitive operational data — memory usage, restart counts, worker status. The Internal API Guard keeps these endpoints invisible to the public:

```
sequenceDiagram
  participant PublicClient as External Client
  participant Probe as Kube Health Probe
  participant InternalSvc as Internal Service
  participant NuxtAPI as Nuxt API Layer

  PublicClient->>NuxtAPI: GET /api/health/pm2<br/>User-Agent: Mozilla/5.0
  NuxtAPI-->>PublicClient: 404 Not Found

  Probe->>NuxtAPI: GET /api/health<br/>User-Agent: kube-probe/1.28
  NuxtAPI-->>Probe: 200 OK { status: "healthy" }

  InternalSvc->>NuxtAPI: GET /api/health/pm2<br/>X-Internal-Secret: correct
  NuxtAPI-->>InternalSvc: 200 OK { workers: [...] }
```

The 404 is deliberate — it neither confirms nor denies the endpoint’s existence. Scanners see exactly what they would for any non-existent path.
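
The guard's decision can be sketched as a pure function over the request headers. The header names follow the diagram, but `guardStatus` and the exact matching rules are assumptions of this example. In particular, a User-Agent prefix is easy to forge, so a real deployment would presumably rely on the network path as well.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

type Headers = Record<string, string | undefined>;

// Hashing both sides first gives equal-length inputs for timingSafeEqual.
const digest = (value: string): Buffer => createHash("sha256").update(value).digest();
const safeEqual = (a: string, b: string): boolean => timingSafeEqual(digest(a), digest(b));

// Returns 200 for health probes and for callers that present the internal
// secret, and 404 for everyone else, exactly as for a path that does not exist.
function guardStatus(headers: Headers, internalSecret: string): 200 | 404 {
  const userAgent = headers["user-agent"] ?? "";
  const provided = headers["x-internal-secret"];

  const isProbe = userAgent.startsWith("kube-probe/");
  const isInternal = provided !== undefined && safeEqual(provided, internalSecret);

  return isProbe || isInternal ? 200 : 404;
}

console.log(guardStatus({ "user-agent": "Mozilla/5.0" }, "s3cret")); // 404
```

Returning 404 rather than 401 or 403 keeps the response indistinguishable from an unknown route.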

The Security Stack

All layers combine into a single request pipeline:

```
flowchart TB
  Browser["Browser"] --> HTTPS["HTTPS"]
  HTTPS --> Edge["Edge Load Balancer"]
  Edge -->|"TLS termination"| Nginx["Nginx Proxy"]

  Nginx --> Guard["Internal API Guard<br/>(hide internal endpoints)"]
  Guard --> Nuxt["Nuxt Server"]

  subgraph NuxtPipeline["Nuxt Middleware & Plugins"]
    Auth["Auth Middleware<br/>Check auth cookie<br/>Redirect if invalid"]
    CSRF["CSRF Middleware<br/>Decrypt token<br/>Validate UA hash<br/>Check cookie-header match"]
    CSP["CSP Plugin<br/>Fetch CSP from CMS<br/>Set CSP header"]
    SSR["SSR Render<br/>Generate CSRF token<br/>Set HTTP-only cookie<br/>Embed token in HTML"]
  end

  Nuxt --> Auth --> CSRF --> CSP --> SSR
```

Lessons Learned

SSR security operates at the HTTP response level, not the DOM level

In a SPA, security typically lives in JavaScript — interceptors, route guards, client-side middleware. In SSR, it must live in server middleware controlling the HTTP response. By the time browser JavaScript runs, the HTML (and any injected payloads) has already been sent.

```
flowchart LR
  subgraph SPA["SPA Model"]
    JSClient["Client-side JS<br/>(route guards, interceptors)"]
    API["APIs"]
    JSClient --> API
  end

  subgraph SSR["SSR Model"]
    Middleware["Server Middleware<br/>(auth, CSRF, CSP)"]
    Render["SSR Render"]
    API2["APIs"]
    Middleware --> Render --> API2
  end
```

User-Agent binding is cheap insurance against replay attacks

Hashing the User-Agent and keeping the first 16 characters of the digest costs almost nothing (one SHA-256 per request), yet it blocks replay from any client that sends a different User-Agent. The header is available on every request, so binding to it is practically free.

```
flowchart TB
  UA["User-Agent string"] --> Hash["SHA-256 + truncate 16 chars"]
  Hash --> Bind["Include hash in token payload"]
  Bind --> Verify["On request, recompute hash<br/>and compare before processing"]
```

Runtime security configuration reduces operational bottlenecks

Any security setting that requires a deployment to change becomes a bottleneck. CSP changes are often requested by marketing (new script providers) and security (removing outdated sources). Moving CSP to the CMS takes the development team out of this loop.

```
flowchart LR
  Marketing["Marketing / Security"] --> ChangeReq["Request CSP change"]
  ChangeReq --> CMSConfig["Update CSP in CMS"]
  CMSConfig --> AutoApply["Server auto-applies CSP<br/>on next cache refresh"]
  AutoApply --> LiveSite["Live Site with updated policy"]
```

Environment tiers prevent security theater

Full security in development forces engineers to work around it. Zero security in production is reckless. A tiered approach — each environment enabling exactly the protections it needs — balances safety with productivity.

```
flowchart TB
  Dev["Dev: Minimal security<br/>High productivity"] --> Docker["Shared Dev: Auth only"]
  Docker --> Test["Test: Full security<br/>Pre-prod hardening"]
  Test --> Prod["Prod: Public-friendly<br/>CSRF + CSP only"]
```

What’s Next

- Article 13: Observability and Distributed Tracing — Application Insights End-to-End — How every request is traced across all layers.
- Article 14: AI-Assisted Development — MCP, Debug Chatbot, and the Shared Language of the Codebase — Making AI assistants genuinely useful for live debugging.
- Article 15: Load Testing Results — 15× Faster, 5× More Capacity — The measured proof that architecture decisions produce real outcomes.


Deferred Hydration Done Right — The `requestIdleCallback` Trick and the `modulepreload` Pitfall
Fourteenth in a series about migrating from legacy architectures to a modern Nuxt 4 stack.

The Hydration Dilemma

SSR gives you a fast first paint: complete HTML that the browser can render immediately. Then the framework’s JavaScript arrives, gets parsed, compiled, and executed to hydrate the page—wiring up event listeners and making everything interactive.

During hydration, the main thread is busy. Buttons do not respond. Forms do not accept input. This gap between “looks ready” and “is ready” is what Total Blocking Time (TBT) measures.

If SSR has already rendered the page, why rush to download and execute JavaScript before the user even finishes the first paragraph?
```
flowchart LR
 A[SSR Server] --> B["HTML Response<br/>Fully rendered markup"]
 B --> C[Browser parses HTML]
 C --> D["First Paint / FCP<br/>Looks ready"]
 D --> E["Hydration<br/>JS downloads, parses & executes"]
 E --> F["Event listeners attached<br/>Interactive"]

 classDef paint fill:#e0f7fa,stroke:#006064;
 classDef js fill:#fff3e0,stroke:#e65100;
 classDef interactive fill:#e8f5e9,stroke:#1b5e20;

 class B,C,D paint
 class E js
 class F interactive
```
The Naive Approach That Does Not Work

The obvious idea: delay JavaScript modules with the media attribute, just like you can with stylesheets:
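```
<!-- Stylesheet: a non-matching media query keeps it from blocking rendering -->
<link rel="stylesheet" href="print.css" media="print">

<!-- Module preload: the media attribute is ignored, the module is fetched eagerly -->
<link rel="modulepreload" href="/_nuxt/entry.js" media="none">
```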
The spec does not define media semantics for modulepreload links, so browsers ignore the attribute. Module scripts are preloaded eagerly regardless.[7]

This is poorly documented. You only notice when performance metrics refuse to improve, even after you add media="none" to 90 modulepreload links.
```
flowchart LR
 subgraph Stylesheet preload
 S1["<link rel=#quot;stylesheet#quot; media=#quot;print#quot;>"] --> S2["Browser may delay applying until media matches"]
 end

 subgraph Modulepreload
 M1["<link rel=#quot;modulepreload#quot; media=#quot;none#quot;>"] --> M2["Browser still preloads module eagerly"]
 end
```
The Working Approach: Complete Removal

Because modulepreload hints cannot be meaningfully delayed, the solution is to remove them entirely.[7] A Nitro render:html plugin performs two transformations in a large SSR application.

Transformation 1: Remove All Modulepreload Links
```
Before (standard Nuxt SSR output):
 <head>
 <link rel="modulepreload" href="/_nuxt/entry.js">
 <link rel="modulepreload" href="/_nuxt/chunk-abc.js">
 <link rel="modulepreload" href="/_nuxt/chunk-def.js">
 <link rel="modulepreload" href="/_nuxt/chunk-ghi.js">
 ... (90+ more)
 </head>

After (deferred):
 <head>
 <!-- no modulepreload links -->
 </head>
```
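
A minimal sketch of this transformation, assuming a regex-based filter is good enough for framework-generated markup. The `stripModulePreloads` helper and the plugin file name are hypothetical. The commented wiring uses Nitro's `render:html` hook, whose `html.head` is an array of HTML strings.

```typescript
// Matches <link ... rel="modulepreload" ...> tags, with or without quotes,
// plus any whitespace that follows the tag.
const MODULEPRELOAD_LINK = /<link\b[^>]*\brel=["']?modulepreload["']?[^>]*>\s*/gi;

// Remove every modulepreload hint from a chunk of head HTML.
function stripModulePreloads(headChunk: string): string {
  return headChunk.replace(MODULEPRELOAD_LINK, "");
}

// Wiring sketch for a Nitro plugin, e.g. server/plugins/strip-modulepreload.ts
// (hypothetical file name) in a Nuxt app:
//
// export default defineNitroPlugin((nitroApp) => {
//   nitroApp.hooks.hook("render:html", (html) => {
//     html.head = html.head.map(stripModulePreloads);
//   });
// });
```

A regex is a pragmatic choice here because the input is generated by the framework rather than written by hand. For arbitrary HTML, a real parser would be safer.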
Without these hints, the browser no longer speculatively downloads chunks. It waits until it encounters actual
