
AI Security Architecture: LLM Proxy Design Guide

This is a reference architecture for securing AI agents with an inline proxy layer. The design has five layers — Middleware, AuthN/Z, Interceptor Proxy, Threat Detection and Enforcement, and Audit and Compliance — arranged in fixed order between the agent and the upstream LLM or partner agent. The architecture is intentionally drawn to be reused: it fits on a single board-deck slide, anchors a security review, and gives auditors one block diagram that maps cleanly to NIST AI RMF, ISO 42001, and SOC 2 control evidence. This guide walks through each layer with a one-line description, the data and control flows through the stack, the fail-open vs fail-closed defaults at each control point, and the SIEM / EDR integration paths that hang off the audit layer.

The Five Layers at a Glance

Use the exact layer names below — they are the names referenced in the diagram, the configuration, and the audit log.

  1. Middleware — the inline ingress point that terminates traffic from the agent or application and normalizes it into the inspection schema.
  2. AuthN/Z — authentication of the calling identity (workspace key, SAML SSO, SCIM-provisioned service account, or DID for ANP) and authorization against the policy boundary.
  3. Interceptor Proxy — the protocol-aware payload extractor that parses MCP / A2A / ACP / ANP / OpenAI Chat Completions into common inspectable fields and routes outbound to the right upstream.
  4. Threat Detection and Enforcement — 12+ detection vectors across 4 layers (pattern, semantic, ML, behavioral), the 7-factor probabilistic risk score, and the deterministic policy decision (allow / redact / block).
  5. Audit and Compliance — the SHA-256 hash-chained, content-free audit writer and the SIEM / EDR / GRC outbound connectors.
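The fixed-order traversal can be sketched as a simple chain of handlers. This is an illustrative sketch only — the names (Request, BlockedError, the layer functions) and the checks inside them are assumptions for the example, not the product's actual API:

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Request:
    api_key: str
    payload: str

class BlockedError(Exception):
    """Raised by any layer that fails closed."""

def middleware(req):           # Layer 1: normalize; fail-closed on malformed schema
    if not req.payload:
        raise BlockedError("malformed schema")
    return req

def authn_z(req):              # Layer 2: always fail-closed
    if not req.api_key:
        raise BlockedError("auth failure")
    return req

def interceptor(req):          # Layer 3: protocol extract + routing (no-op here)
    return req

def detect_and_enforce(req):   # Layer 4: allow / redact / block (pattern check only)
    if "ignore previous instructions" in req.payload.lower():
        raise BlockedError("injection pattern")
    return req

def audit(req):                # Layer 5: content-free entry (async in practice)
    entry = {"payload_sha256": hashlib.sha256(req.payload.encode()).hexdigest()}
    return req, entry

def handle(req):
    """Every request traverses all five layers in fixed order."""
    for layer in (middleware, authn_z, interceptor, detect_and_enforce):
        req = layer(req)
    req, _entry = audit(req)
    return req                 # forwarded to the upstream provider
```

A blocked request short-circuits the chain: any layer's BlockedError stops traversal before the upstream is ever reached.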

The Block Diagram (Reusable)

This is the canonical block diagram. It is laid out in plain text so it reproduces cleanly in a slide, in a markdown doc, in an architecture review, and in AI search snippets.

AI Agent / LLM Application
MCP Host • A2A Client-Agent • ACP Caller • ANP Agent • OpenAI SDK app
inbound request
Layer 1 · Middleware

TLS termination · request normalization · backpressure · rate limit

fail-open on transient parse error · fail-closed on malformed schema

Layer 2 · AuthN / AuthZ

workspace API key · SAML SSO (humans) · SCIM service accounts · DID (ANP)

fail-closed on auth failure

Layer 3 · Interceptor Proxy

protocol-aware extract: MCP / A2A / ACP / ANP / OpenAI Chat Completions

routing policy: 12 LLM providers · provider fallback · per-route presets

fail-closed for HIPAA / PCI-DSS / GDPR · fail-open for permissive

Layer 4 · Threat Detection & Enforcement

12+ detection vectors / 4 layers: pattern · semantic · ML · behavioral

7-factor probabilistic risk score · reversible PII redaction

deterministic policy decision: allow / redact / block

fail-closed on detection-engine outage for regulated presets

Layer 5 · Audit & Compliance

SHA-256 hash-chained, content: false (metadata-only)

outbound: SIEM · EDR · GRC

fail-open on audit sink unreachable · buffered + retried

redacted / approved request
Upstream LLM Provider · Partner Agent · MCP Server · Downstream
OpenAI / Anthropic / Gemini / Azure / Bedrock / Mistral / Cohere / Together / Groq / Ollama / custom vLLM

How to Read This Diagram

Read the diagram top to bottom for the request path and bottom to top for the response path. Every inbound request from an AI agent or LLM application traverses all five layers in fixed order before it is allowed to reach the upstream provider or partner agent. Every response traverses Layers 4 and 5 in reverse — Threat Detection and Enforcement runs again on tool outputs and retrieved content (this is where indirect injection from RAG content gets caught), and Audit and Compliance writes a paired response-side audit entry. The five layers run in process on the same gateway instance, so the request never crosses the inline boundary unguarded; the SIEM / EDR / GRC outbound from Layer 5 is asynchronous and does not sit on the request critical path.

Layer 1 — Middleware

The inline ingress for the entire stack. Layer 1 owns TLS termination, request normalization (parsing the inbound HTTPS into the gateway’s internal request object), connection-level rate limiting, and backpressure into Layer 4 if the detection engine is saturated. This is the layer that sees raw bytes; it is also the only layer that directly faces the network.

  • Transport: HTTPS only; mTLS optional for high-assurance deployments.
  • Rate limit: Per-key requests per minute (RPM) cap; per-workspace daily token budget.
  • Backpressure: Returns 429 to client when the Layer 4 queue exceeds the high-water mark.
  • Fail mode: Fail-open on transient parse error (the request is allowed to proceed); fail-closed on malformed schema.
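The rate-limit and backpressure bullets above can be sketched as a per-key token bucket plus a queue-depth check. The limits, names, and the 1000-entry high-water mark are illustrative assumptions, not the gateway's tuned values:

```python
import time

class TokenBucket:
    """Per-key RPM cap: refills continuously at rpm/60 tokens per second."""
    def __init__(self, rpm):
        self.capacity = rpm
        self.tokens = float(rpm)
        self.refill_per_sec = rpm / 60.0
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

def ingress_check(bucket, detection_queue_depth, high_water=1000):
    """Layer 1 admission: 429 on backpressure or RPM cap, else pass to Layer 2."""
    if detection_queue_depth >= high_water:
        return 429   # Layer 4 queue saturated: push back on the client
    if not bucket.allow():
        return 429   # per-key requests-per-minute cap exceeded
    return 200       # proceed to AuthN/Z
```

Checking the Layer 4 queue before the per-key bucket means a saturated detection engine sheds load uniformly, regardless of which key is sending.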

Layer 2 — AuthN/Z

Authentication and authorization sit in their own layer because the rest of the stack depends on knowing who is calling. Layer 2 supports four identity types: workspace API key (default for service-to-gateway calls), SAML SSO (human console access through Okta / Azure AD / Google Workspace), SCIM-provisioned service accounts (lifecycle managed centrally), and W3C Decentralized Identifiers for ANP traffic. Authorization is policy-bound — the calling identity is mapped to a workspace, and the workspace’s compliance preset determines what the rest of the stack will accept.

  • Workspace API key. Used for application → gateway service calls. Issued in the console and rotated on the cadence of upstream provider keys.
  • SAML SSO. Used for human console access (developers, security engineers, auditors). Federated via Okta, Azure AD, or Google Workspace; no local accounts.
  • SCIM service account. Used for automated workloads with lifecycle managed by the IdP. Group-to-role mapping; auto-deprovisioned when the IdP record is deactivated.
  • W3C Decentralized Identifier (DID). Used for ANP (Agent Network Protocol) traffic. Cryptographic; policy is bound to the calling DID, not to a central account.

Fail mode for Layer 2 is always fail-closed. An auth failure is never passed through; a request that cannot be authenticated is rejected with a structured error and an audit entry.

Layer 3 — Interceptor Proxy

The protocol-aware payload extractor. Layer 3 normalizes any of five inbound surfaces — MCP tool calls (JSON-RPC 2.0), A2A task messages, ACP REST payloads, ANP JSON-LD messages, and OpenAI Chat Completions — into the common inspection schema that Layer 4 operates on. Layer 3 also holds the routing policy: which of the 12 supported LLM providers to call, which fallbacks to use on rate-limit or 5xx, and which compliance preset applies to which route.

  • Inbound coverage: MCP, A2A, ACP, ANP, OpenAI Chat Completions wire format.
  • Outbound coverage: OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Mistral, Cohere, Together AI, Groq, Ollama, custom vLLM.
  • Provider fallback: Primary + 1–2 backups declared per route; policy enforcement preserved across legs.
  • Fail mode: Fail-closed for hipaa / pci-dss / gdpr presets; fail-open is acceptable only for permissive workspaces.
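The fallback and fail-mode bullets above combine into one routing loop. A minimal sketch, assuming an illustrative send callable and UpstreamError type (the provider names and error shapes are not the product's API):

```python
REGULATED = {"hipaa", "pci-dss", "gdpr"}

class UpstreamError(Exception):
    """Rate limit (429) or server error (5xx) from a provider leg."""

def call_with_fallback(route, send, preset):
    """Try the primary, then each declared backup, in order.

    The same policy enforcement applies to every leg; only the upstream
    changes. If all legs fail, regulated presets fail closed (the error
    propagates), while permissive workspaces may fail open (None here).
    """
    last_error = None
    for provider in route:          # e.g. ["openai", "azure-openai", "bedrock"]
        try:
            return send(provider)
        except UpstreamError as exc:
            last_error = exc        # try the next declared leg
    if preset in REGULATED:
        raise last_error            # fail-closed: never pass through unguarded
    return None                     # fail-open, permissive workspaces only
```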

Layer 4 — Threat Detection and Enforcement

The control point that does the AI-specific work. Layer 4 runs the four detection layers (pattern, semantic, ML, behavioral) in parallel against the normalized payload, composes the 7-factor probabilistic risk score, applies reversible PII redaction (synthetic / mask / hash with format preservation) when the policy says so, and produces the deterministic policy decision: allow, redact, or block. The right framing for this layer is deterministic policy enforcement with probabilistic risk scoring — neither alone covers the surface.

  • Pattern layer. Literal and regex matches (PII formats, known injection phrases). Example: SSN regex, valid IBAN check digit, the exact phrase “ignore previous instructions.”
  • Semantic layer. Embedding-similarity to known injection intent vectors regardless of language or encoding. Example: multilingual injection, base64-encoded instruction, paraphrased attack template.
  • ML layer. Trained classifier over 1M+ adversarial examples; catches novel injection that does not match a vector. Example: zero-day injection patterns, system-prompt leakage roleplays.
  • Behavioral layer. Session and agent-level baseline; deviation from normal sequence of actions or topic drift. Example: multi-turn manipulation, agent goal hijack, slow exfiltration.

Fail mode for Layer 4 is fail-closed for regulated presets. If the detection engine is unavailable, hipaa / pci-dss / gdpr workspaces block the request rather than pass it through unguarded. Performance budget: P50 ~ 8 ms, P95 ~ 22 ms, P99 < 50 ms; block path 18 ms; throughput 10,000+ actions per second.
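The framing "deterministic policy enforcement with probabilistic risk scoring" can be made concrete with a small sketch. The seven factor names, their weights, and the thresholds below are illustrative assumptions — the product's actual factors and tuned values are not published here:

```python
# Hypothetical 7-factor weighting; weights sum to 1.0.
WEIGHTS = {
    "pattern": 0.25, "semantic": 0.20, "ml": 0.20, "behavioral": 0.15,
    "pii": 0.10, "protocol": 0.05, "reputation": 0.05,
}

def risk_score(factors):
    """Probabilistic side: weighted composite of per-factor scores in [0, 1]."""
    return sum(WEIGHTS[k] * factors.get(k, 0.0) for k in WEIGHTS)

def decide(factors, redact_at=0.4, block_at=0.7):
    """Deterministic side: identical factors always yield the same verdict."""
    score = risk_score(factors)
    if score >= block_at:
        return "block"
    if score >= redact_at or factors.get("pii", 0.0) > 0.0:
        return "redact"     # any PII hit triggers reversible redaction
    return "allow"
```

The split matters: the score may come from probabilistic detectors, but the thresholds are policy, so the same inputs never produce different decisions.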

Layer 5 — Audit and Compliance

Every decision Layer 4 makes produces an audit entry written by Layer 5. Default mode is content: false — metadata, decision summary, redaction summary, risk score, latency, timestamp, and a SHA-256 hash of the inspected payload, but no prompt content. Each entry’s prev_hash links to the previous entry’s curr_hash, providing tamper-evidence. The audit stream is the input to the SIEM, EDR, and GRC outbound connectors.
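The prev_hash / curr_hash linkage can be sketched in a few lines. Field names beyond prev_hash and curr_hash, the genesis value, and the canonicalization choice (sorted-key JSON) are assumptions for the example, not the product's wire format:

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed sentinel for the first entry's prev_hash

def audit_entry(prev_hash, payload, decision, risk, latency_ms, ts):
    """Content-free entry: a SHA-256 of the payload, never the payload itself."""
    entry = {
        "content": False,
        "payload_sha256": hashlib.sha256(payload.encode()).hexdigest(),
        "decision": decision,
        "risk_score": risk,
        "latency_ms": latency_ms,
        "timestamp": ts,
        "prev_hash": prev_hash,
    }
    canonical = json.dumps(entry, sort_keys=True).encode()
    entry["curr_hash"] = hashlib.sha256(canonical).hexdigest()
    return entry

def verify_chain(entries):
    """Tamper-evidence: recompute every curr_hash and check the linkage."""
    prev = GENESIS
    for e in entries:
        if e["prev_hash"] != prev:
            return False
        body = {k: v for k, v in e.items() if k != "curr_hash"}
        if e["curr_hash"] != hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest():
            return False
        prev = e["curr_hash"]
    return True
```

Editing any field in any entry, or reordering entries, breaks the recomputed hash or the prev_hash linkage, so verification fails.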

  • SIEM. Microsoft Sentinel, Splunk, Datadog, AWS CloudWatch, and a custom HTTPS webhook — used for detection alerting, compliance evidence, and incident response replay.
  • EDR / cloud. SentinelOne via Microsoft Sentinel Content Hub (live); CrowdStrike Falcon (Custom IOC) and AWS Security Hub (ASFF) in development.
  • GRC. Crosswalks to NIST AI RMF, ISO/IEC 42001, SOC 2, HIPAA, GDPR, PCI DSS — used for automated evidence packages and DPIA inputs.

Fail mode for Layer 5 is fail-open. If the audit sink is unreachable, the request still proceeds — security decisions in Layer 4 have already been made — but the audit entries are buffered locally and replayed when the sink recovers. Layer 5 must never be on the request critical path.

Fail-Mode Defaults: One Table to Settle the Question

Architecture reviews often spend too much time on “what happens if X is down.” The defaults below are the answer. Override them only with explicit policy.

  • Layer 1 — Middleware. Fail-open on transient parse error; fail-closed on malformed schema. A retryable glitch should not drop traffic, but a payload that cannot be normalized must never reach inspection unparsed.
  • Layer 2 — AuthN/Z. Fail-closed, always, for every preset. Unauthenticated traffic never reaches the rest of the stack.
  • Layer 3 — Interceptor Proxy. Fail-closed for hipaa / pci-dss / gdpr presets; fail-open is acceptable only for permissive workspaces.
  • Layer 4 — Threat Detection and Enforcement. Fail-closed on detection-engine outage for regulated presets; the request is blocked rather than passed through unguarded.
  • Layer 5 — Audit and Compliance. Fail-open; entries are buffered locally and replayed when the sink recovers. Layer 5 never sits on the request critical path.

Mapping the Five Layers to Compliance Frameworks

Auditors and analysts ask which controls in this architecture map to which framework. The mapping below is not exhaustive — it covers the controls most often requested in NIST AI RMF, ISO/IEC 42001, and SOC 2 evidence packages. The full crosswalk is published as a downloadable XLSX in the companion piece, “Mapping AI Controls to NIST AI RMF and ISO 42001.”

  • Layer 1 — Middleware. NIST AI RMF MAP-1.1 (intended use); ISO/IEC 42001 A.8.2 (transparency); SOC 2 CC6.1 (logical access).
  • Layer 2 — AuthN/Z. NIST AI RMF GOVERN-1.4 (accountability); ISO/IEC 42001 A.5.3 (roles); SOC 2 CC6.1 and CC6.6 (logical access, identification).
  • Layer 3 — Interceptor Proxy. NIST AI RMF MEASURE-2.5 (data integrity); ISO/IEC 42001 A.8.4 (data quality); SOC 2 CC7.2 (system monitoring).
  • Layer 4 — Threat Detection and Enforcement. NIST AI RMF MEASURE-2.7 (security and resilience); ISO/IEC 42001 A.8.5 (information security); SOC 2 CC7.3 (system anomalies).
  • Layer 5 — Audit and Compliance. NIST AI RMF GOVERN-1.5 and MANAGE-4.1 (ongoing monitoring); ISO/IEC 42001 A.9.2 (internal audit); SOC 2 CC7.4 (incident response).

Next Steps

Walk this architecture through your next AI security review. The diagram and layer names are stable; the fail-mode defaults are the recommendation. Pair this guide with “How to Secure AI Agent Protocols: A Cross-Protocol Control Plane” for the protocol-coverage story and “Mapping AI Controls to NIST AI RMF and ISO 42001” for the framework evidence.

Your AI Agents Need a Control Layer

See how Vaikora intercepts, evaluates, and enforces policy on every AI agent action — in real time, before execution.

Frequently Asked Questions

What are the five layers of this AI security architecture?

Middleware, AuthN/Z, Interceptor Proxy, Threat Detection and Enforcement, and Audit and Compliance — in fixed order between the agent or LLM application and the upstream provider. Every inbound request traverses all five layers; every response traverses Layers 4 and 5 in reverse.

Where do AI-specific controls actually run?

In Layer 4 — Threat Detection and Enforcement. That layer hosts the four parallel detection layers (pattern, semantic, ML, behavioral), the 7-factor probabilistic risk score, the reversible PII redaction (synthetic / mask / hash with format preservation), and the deterministic allow / redact / block decision.

Why is AuthN/Z always fail-closed?

Because unauthenticated traffic should never reach the rest of the stack. Layer 2 fail-closed is non-negotiable; the other layers vary by preset, but Layer 2 does not.

How does the architecture integrate with our SIEM and EDR?

Layer 5 emits content-free audit and detection events to Microsoft Sentinel, Splunk, Datadog, and AWS CloudWatch (plus a custom HTTPS webhook) via native connectors, plus a connector path to EDR (SentinelOne live; CrowdStrike Falcon and AWS Security Hub in development) and a GRC path for framework crosswalks (NIST AI RMF, ISO 42001, SOC 2, HIPAA, GDPR, PCI DSS). The outbound is asynchronous; it never sits on the request critical path.

Does this architecture support MCP, A2A, ACP, and ANP?

Yes. Layer 3 — Interceptor Proxy — is protocol-aware and normalizes MCP tool calls, A2A task messages, ACP REST payloads, ANP JSON-LD messages, and OpenAI Chat Completions into a common inspection schema. Layer 4 then runs the same controls regardless of which protocol delivered the payload.

What is the latency cost of running all five layers inline?

P50 ~ 8 ms, P95 ~ 22 ms, P99 < 50 ms; block path 18 ms; throughput 10,000+ actions per second. Compared to a typical 1–6 second LLM round-trip, the inline cost is well under 1%. Layer 5 is asynchronous and does not contribute to the request-path latency.

Can I drop this diagram into a board deck without further work?

Yes — that is the design intent. The block diagram is laid out to fit a single slide, the layer names are stable across the configuration and the audit log, and the framework crosswalk above maps directly to the controls boards typically ask about. Pair it with the secure AI development reference architecture when the audience is engineering rather than security.