Shadow AI is unsanctioned use of LLM services inside an enterprise. This includes developers, business units, or SaaS tools calling external AI APIs such as OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Mistral, Cohere, Together AI, Groq, Ollama, or custom inference endpoints outside the organization's approved policy and audit controls.

Why can't EDR or SIEM find shadow AI?

Shadow AI traffic appears as normal HTTPS connections to external services. EDR focuses on processes and files, while SIEM rules often look for indicators of compromise. Shadow AI is not malicious by default, just unsanctioned. The reliable detection signal is at the network egress layer, including DNS queries and TLS connections to LLM provider domains.

What domains should I monitor?

Common domains include api.openai.com, api.anthropic.com, generativelanguage.googleapis.com, aiplatform.googleapis.com, *.openai.azure.com, bedrock-runtime. .amazonaws.com, api.mistral.ai, api.cohere.com, api.together.ai, api.groq.com, ollama.com, and inference platforms such as runpod.io, modal.run, replicate.com, and huggingface.co. This list should be treated as a baseline and updated regularly.

Should I capture prompt content during shadow AI discovery?

No. Capturing prompt content creates a regulated data store under frameworks such as HIPAA, GDPR, and PCI DSS. Shadow AI discovery should rely on connection metadata such as source identity, destination provider, traffic volume, timing, and application fingerprint.

How do I turn discovery into governance?

Move discovered AI traffic to a sanctioned gateway and block direct access to LLM provider domains at the network level. The gateway becomes the single approved egress point. Any new unsanctioned usage will appear as denied connections, making shadow AI visible and controllable.

Does Vaikora help with shadow AI specifically?

Yes. Vaikora acts as a controlled egress gateway that converts shadow AI into observable and governed AI traffic. It is OpenAI-compatible, allowing migration through a simple base URL change. Once traffic flows through the gateway, each request generates structured telemetry including risk score, policy decision, redaction summary, upstream provider, and a SHA-256 hash, all without storing prompt content.

How often should the canonical domain list be refreshed?

At least quarterly and whenever new AI providers are introduced. The AI ecosystem evolves quickly, so the domain list should be treated as a living artifact and updated as part of ongoing governance.

Home Blog AI Gateway vs AI Firewall vs AI Proxy: Category Definitions

AI Gateway vs AI Firewall vs AI Proxy: Category Definitions

AI Runtime Control, Real-time AI Security, Threat Intelligence, Vaikora

May 4, 2026

AI gateway, AI firewall, and AI proxy are three terms vendors use almost interchangeably for products in the AI security space — but they emphasize different jobs. An AI gateway is a routing and integration layer for LLM traffic; an AI firewall is a deny / block control plane for prompts and responses; an AI proxy is the inline transport that carries either of those jobs. The category that owns runtime policy decisions on every prompt and response is AI runtime control, and that is the term used in this guide. This is an honest category-definition piece: what each term actually means, what overlaps, what does not, and where Vaikora sits inside the AI runtime control category.

Why the Terminology Is a Mess

The AI security space is two years old as a buying category. Vendors that built network firewalls call their LLM product an “AI firewall.” Vendors that built API gateways call theirs an “AI gateway.” Vendors that built reverse proxies call theirs an “AI proxy.” The labels are mostly a function of a vendor’s previous product, not a description of what the AI product actually does. A buyer searching the category sees three vendors using three different labels for what looks like the same control surface, and reasonably gives up on category clarity.

The honest framing is that the three terms describe overlapping but distinct jobs. AI runtime control — runtime policy decisions on every prompt and response, inline at execution time — is the umbrella job. AI gateway, AI firewall, and AI proxy are how a product can be packaged.

Three Short Definitions, Side by Side

This is the single reference table for the category. Each column is a short, honest definition of one term as it is most often used in 2026.

AI Gateway
An inline routing and integration layer for LLM traffic. It uses the OpenAI Chat Completions API format on the inbound side and routes requests to one or more upstream providers (OpenAI, Anthropic, Gemini, Azure OpenAI, AWS Bedrock, Mistral, Cohere, Together AI, Groq, Ollama, or custom models). It also applies policy controls, PII redaction, audit logging, and provider fallback as part of the routing layer.
AI Firewall
A control plane focused on allow/deny decisions for prompts and responses. It inspects LLM traffic against defined policies, detects issues such as prompt injection, jailbreak attempts, or data exfiltration, and returns block or allow decisions. It also generates alerts. In many architectures, this capability is included within the AI gateway rather than deployed as a separate product.
AI Proxy
The transport layer that carries traffic between the LLM application and upstream providers. It typically operates as a reverse proxy, terminating TLS, normalizing requests, forwarding them upstream, and streaming responses back. By itself, it does not enforce AI-specific policies—it simply handles the data path, while gateway and firewall functions operate on top of it.

What Overlaps

Most production deployments do all three jobs in one component. The overlap is real:

All three sit inline. Gateway, firewall, and proxy are placed in the request path between the application and the upstream LLM. None of them are after-the-fact log analyzers.
All three speak HTTPS to the application. The application talks to the gateway / firewall / proxy at an HTTPS endpoint and gets a normal LLM-shaped response back. The component is invisible to the application beyond the base URL change.
All three usually emit telemetry to a SIEM. Whichever label the vendor uses, the operational pattern is the same: produce per-request decision metadata that flows into Splunk, Microsoft Sentinel, Elastic, or Sumo Logic.

What Does Not Overlap

The labels diverge most clearly on three axes — what the primary job is, where the policy lives, and what fails open or closed when the component is unreachable.

Primary job

AI gateway emphasizes routing and integration: provider fallback, multi-LLM strategy, model abstraction, OpenAI-compatible front door. AI firewall emphasizes deny / block: deterministic policy, threat detection, allow-list / deny-list behavior, alerting. AI proxy emphasizes transport: TLS, streaming, latency, throughput. A product that does only one well is a product that has staked out a position in the category; a product that does all three runs in one component is what most enterprises actually buy.

Where the policy lives

AI gateway products typically separate the routing policy (which provider, which model, which compliance preset) from the threat policy (block on injection, redact on PII). AI firewall products typically collapse them into a single rule engine. AI proxy products typically push policy out to whichever component sits next to them. The packaging difference matters at procurement time — a security team buying an AI firewall expects to write deny rules; an integration team buying an AI gateway expects to declare a routing policy. Both expectations are reasonable; the difference is what the buyer is signing up to operate.

Fail open vs fail closed

AI gateway products often fail open by default — if the policy engine is down, the gateway prefers to keep traffic flowing. AI firewall products often fail closed by default — if the engine is down, the firewall prefers to block. The right default for a regulated workload (HIPAA, PCI DSS, GDPR) is fail-closed; for a non-regulated workload it is sometimes fail-open. A buyer should ask explicitly.

What None of These Are

Three things the AI gateway / firewall / proxy category does not include — and that buyers sometimes confuse for it.

It is not a WAF. A WAF (Web Application Firewall) inspects HTTP traffic for OWASP-shaped attacks. WAFs do not parse LLM payloads; an OpenAI Chat Completions body is opaque to a WAF. AI runtime control is a different layer.
It is not a DLP. Traditional DLP (Data Loss Prevention) is tuned for files, email, endpoint flows, and stored data — and runs at latencies acceptable for documents (hundreds of ms), not LLM round-trips (single-digit ms). Traditional DLP tools are not designed for real-time LLM prompt / response enforcement at execution time.
It is not an observability tool. LLM observability platforms (LangSmith, Arize, Helicone) record what happened for debugging and analytics. AI runtime control enforces policy inline — different category, different evidence, often complementary.

AI Runtime Control: The Canonical Category

AI runtime control is the practice of enforcing policy on LLM prompts and responses at execution time — inline, while the request is in flight between the application and the model — rather than after the fact through logs or before the fact through static review. Runtime control is the umbrella category that AI gateway, AI firewall, and AI proxy products operate inside. It is the single term that an enterprise buyer can use to describe “the inline policy decision on every LLM call,” without having to pick which vendor’s label survives the procurement cycle.

The category is sometimes also called “AI execution control” or “AI inline policy enforcement” — both refer to the same job. Throughout the Vaikora documentation and across this blog series, the term used is AI runtime control.

Where Vaikora Sits Inside AI Runtime Control

Vaikora is an AI runtime control platform. Operationally, it does the AI gateway job (OpenAI-compatible front door at api.vaikora.com/v1, 12-provider fallback, per-route compliance preset), the AI firewall job (deterministic policy enforcement with probabilistic risk scoring, 12+ detection vectors across 4 layers, allow / redact / block decisions), and the AI proxy job (inline TLS termination, streaming, P50 ~ 8 ms / P99 < 50 ms, 10,000+ actions per second) — in one component, with a single audit chain across all three. The packaging is intentional: an enterprise that has to satisfy auditors, platform engineers, and SOC teams ends up needing all three jobs, and operating three separate components doubles or triples the operational cost.

Next Steps

Use the side-by-side definitions table when an architecture review devolves into a label argument. The companion guides — “AI Gateway vs DLP vs WAF” for the adjacent-category comparison and “Build vs Buy AI Security: What Enterprises Actually Need” for the procurement-decision view — close the buyer’s loop on the category.

AI runtime control is the practice of enforcing policy on LLM prompts and responses at execution time — inline, while the request is in flight — and is the umbrella category that AI gateway (routing + integration), AI firewall (deny / block), and AI proxy (inline transport) products operate inside. Vaikora is an AI runtime control platform that does all three jobs in one component, with a single content-free SHA-256 hash-chained audit chain.

Your AI Agents Need a Control Layer

See how Vaikora intercepts, evaluates, and enforces policy on every AI agent action — in real time, before execution.

Frequently Asked Questions

Is an AI gateway the same as an AI firewall?

They overlap, but they emphasize different jobs. An AI gateway emphasizes routing and integration (which provider, which model, fallback policy, compliance preset). An AI firewall emphasizes deny / block decisions on prompts and responses (block on injection, redact on PII, alert on detected threats). Most production deployments combine both jobs in one component.

Where does an AI proxy fit?

An AI proxy is the inline transport — the reverse proxy in the data path that terminates TLS, normalizes the request, and forwards to the upstream. By itself, an AI proxy is not opinionated about policy; it is the carrier on which the gateway and firewall jobs run. Vendors that lead with the “AI proxy” label are usually emphasizing latency, streaming behavior, and integration simplicity.

What is AI runtime control?

AI runtime control is the practice of enforcing policy on LLM prompts and responses at execution time — inline, while the request is in flight. It is the umbrella category that AI gateway, AI firewall, and AI proxy products operate inside. Same job, different vendor packaging.

Do I need a WAF, a DLP, and an AI gateway?

Most enterprises with LLM workloads, yes. WAF for HTTP-shaped attacks, DLP for files / email / endpoint flows, AI gateway / firewall / proxy (i.e. AI runtime control) for inline LLM enforcement. The three categories cover different surfaces and produce different evidence — see the companion piece, “AI Gateway vs DLP vs WAF.”

Is AI runtime control a Gartner category?

Analyst category language is still settling in 2026. Some analyst frames call the space “AI security platforms” or “runtime AI security”; others split it into observability, model security, governance, and runtime control. The Vaikora documentation uses AI runtime control consistently because it names the specific job — runtime, inline, policy enforcement — without overloading the term “security” or “governance.”

How do I evaluate an AI runtime control product?

Five questions: (1) does it speak the OpenAI Chat Completions wire format inbound; (2) does it support fallback across multiple LLM providers with policy preserved; (3) does it run deterministic policy enforcement plus probabilistic risk scoring inline at execution time; (4) does it offer content-free SHA-256 hash-chained audit; (5) does it integrate with SAML / SCIM identity and SIEM (Splunk, Microsoft Sentinel, Elastic, Sumo Logic) by default. If the answer to all five is yes, the product is in the category regardless of which label it leads with.

Does the label affect the evaluation?

Less than the buying motion suggests. The category is converging on the AI runtime control job; the labels reflect what each vendor’s previous product was. Evaluate on the five questions above. The label is for the marketing site.