Does this work with LangChain, LlamaIndex, or Vercel AI SDK?

Yes. Any library that wraps the OpenAI SDK works without modification because the wire format remains unchanged. This includes LangChain, LlamaIndex, and Vercel AI SDK, all of which support base URL overrides.

Home Blog OpenAI Proxy Integration Without Rewriting Your App

OpenAI Proxy Integration Without Rewriting Your App

Q: Do I need to swap the OpenAI SDK?

No. Vaikora is OpenAI-compatible on the inbound side. You can keep your existing OpenAI SDK in Python or Node.js and only update the base URL and API key configuration.

Q: What environment variables do I set?

You can set OPENAI_BASE_URL to https://api.vaikora.com/v1 and OPENAI_API_KEY to your Vaikora API key for a no-code environment override. Alternatively, you can pass base_url and api_key directly in the SDK constructor for explicit configuration. Both approaches are supported.

Q: What happens to streaming responses?

Streaming responses using server-sent events on the chat completions endpoint are preserved end-to-end. Policy enforcement is applied to the request before forwarding and to the response as it streams back, without introducing additional buffering beyond standard proxy overhead.

Q: Can I use providers other than OpenAI behind the same endpoint?

Yes. Vaikora supports multiple LLM providers including OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Mistral, Cohere, Together AI, Groq, Ollama, and custom vLLM deployments. Your application continues using the OpenAI interface while the gateway routes requests based on model selection or policy.

Q: Does provider fallback weaken policy enforcement?

No. Policy enforcement, including PII redaction, prompt injection detection, and SHA-256 hash-chained audit logging, is applied consistently regardless of which upstream provider handles the request. Enforcement remains intact across fallback scenarios.

Q: How long does the integration actually take?

The code change typically takes only a few minutes, often requiring just a single configuration update. Most production deployments, including staging validation, audit log review, and policy tuning, are operational within 48 hours.

AI Runtime Control, Real-time AI Security, Threat Intelligence, Vaikora

May 1, 2026

You can put an OpenAI-compatible gateway in front of an existing application by changing one line: swap the OpenAI base URL to your gateway, add an auth header, and ship. No SDK swap, no client rewrite, no application redeploy beyond the config change. This guide shows the exact one-line change in Python (sync and async) and Node/TypeScript, the env variables involved, the timeout / retry / header behavior to preserve, and how Vaikora’s drop-in proxy applies the same security policy across 12 LLM providers with provider fallback routing.

What Is an OpenAI-Compatible Gateway?

An OpenAI-compatible gateway is a proxy that speaks the OpenAI REST API on the inbound side, so any client written against the OpenAI SDK can use it without code changes, and routes outbound to one or more LLM providers. The application keeps its existing OpenAI Python or Node SDK; the only change is the base URL. The gateway terminates the inbound call, applies policy (PII redaction, prompt-injection detection, audit), and forwards the request to OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Mistral, Cohere, Together AI, Groq, Ollama, or a custom vLLM endpoint.

The One-Line Change: Swap the Base URL

The integration pattern is the same in every language. The OpenAI SDK accepts a base_url (Python) or baseURL (Node/TS) parameter. Point it at the gateway, set an auth header, and the rest of the application keeps working.

Python (sync) — OpenAI SDK ≥ 1.0

from openai import OpenAI

client = OpenAI(
    base_url=”https://api.vaikora.com/v1″,
    api_key=os.environ[“VAIKORA_API_KEY”],   # gateway key, not the upstream provider key
)

resp = client.chat.completions.create(
    model=”gpt-4o”,
    messages=[{“role”: “user”, “content”: “Summarize this contract in 3 bullets.”}],
)
print(resp.choices[0].message.content)

Python (async) — same one-line change

from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url=”https://api.vaikora.com/v1″,
    api_key=os.environ[“VAIKORA_API_KEY”],
)

async def summarize(text: str) -> str:
    resp = await client.chat.completions.create(
        model=”gpt-4o”,
        messages=[{“role”: “user”, “content”: text}],
    )
    return resp.choices[0].message.content

Node / TypeScript — OpenAI SDK v4+

import OpenAI from “openai”;

const client = new OpenAI({
baseURL: “https://api.vaikora.com/v1”,
apiKey: process.env.VAIKORA_API_KEY,
});

const resp = await client.chat.completions.create({
model: “gpt-4o”,
messages: [{ role: “user”, content: “Summarize this contract in 3 bullets.” }],
});
console.log(resp.choices[0].message.content);

cURL — proves the wire format is unchanged

curl https://api.vaikora.com/v1/chat/completions \
-H “Authorization: Bearer $VAIKORA_API_KEY” \
-H “Content-Type: application/json” \
-d ‘{
“model”: “gpt-4o”,
“messages”: [{“role”: “user”, “content”: “Hello”}]
}’

The wire format is the OpenAI Chat Completions schema. Anything that already speaks OpenAI — LangChain, LlamaIndex, Vercel AI SDK, custom Python or TypeScript code — works without modification.

Environment Variables: What to Set

There are two acceptable patterns. Either (a) keep the OpenAI SDK env names and override base_url in code, or (b) rely on the SDK’s built-in environment variables and never touch application code.

Setting	Default OpenAI SDK behavior	What to preserve through the gateway
Code-level override	VAIKORA_API_KEY plus base_url=… in the client constructor	When you want the gateway to be explicit and visible in the constructor
Pure env override	OPENAI_BASE_URL https://api.vaikora.com/v1 and OPENAI_API_KEY =$VAIKORA_API_KEY	When you cannot touch the application code at all and want to flip the gateway via deploy config
Hybrid	OPENAI_BASE_URL =https://api.vaikora.com/v1, custom secret name VAIKORA_API_KEY, code reads either	When you operate multiple environments (dev / staging / prod) with different gateways

Drop-In Proxy: Minimal Configuration Changes

No core application rewrite is required. The integration is a drop-in proxy with minimal configuration changes — only timeouts, headers, and retries need a brief look. The OpenAI SDK has reasonable defaults; preserve them in three places when adopting a gateway.

Pattern	Variables to set	When to use
Request timeout	60 seconds for chat completions	Keep client-side timeout ≥ 60s so streaming responses are not truncated by the gateway hop
Retries	Two automatic retries with backoff	Leave SDK retries on; the gateway should not double-retry on idempotent calls
Custom headers	X-Request-Id, X-Idempotency-Key	Pass-through preserved by Vaikora; correlation_id is added on the gateway side for audit
Streaming	Server-sent events on /v1/chat/completions with stream: true	Streaming is preserved end-to-end; policy decisions are made on the request and on the assembled response

12 LLM Providers Behind One OpenAI-Compatible Endpoint

Vaikora’s gateway accepts the OpenAI Chat Completions schema on the inbound side and routes outbound to any of the 12 supported LLM providers based on the model field or a routing policy. The application keeps using the OpenAI SDK; the gateway picks the upstream.

Provider category	Providers
Frontier APIs	OpenAI, Anthropic, Google Gemini
Cloud-hosted enterprise	Azure OpenAI, AWS Bedrock
Open-weights and specialty	Mistral, Cohere, Together AI, Groq
Self-hosted / private	Ollama, custom vLLM endpoint

Switching the upstream provider is a model-string change or a routing-policy update — not a client change. A health-check or rate-limit failure on one provider falls back to the next provider in the routing policy without an exception surfacing in the client.

Provider Fallback Routing Example

A typical fallback policy specifies a primary provider, one or two backups, and the trigger conditions. Below is the shape of a routing policy that prefers OpenAI, falls back to Anthropic on rate-limit, and falls back to Azure OpenAI on outage.

# vaikora-routing.yaml
model: gpt-4o
primary:
provider: openai
model:    gpt-4o
fallbacks:
– provider: anthropic
    model:    claude-sonnet-4-6
    trigger: rate_limit
– provider: azure-openai
    model:    gpt-4o
    trigger: upstream_5xx
policy_enforcement: preserved   # PII redaction + audit applied identically across providers

Critically: policy enforcement is preserved across providers. PII redaction, prompt-injection detection, and SHA-256 hash-chained audit do not weaken when the gateway falls back from OpenAI to Anthropic to Azure. The application sees one OpenAI-shaped response; the gateway records which upstream actually served it.

What You Get for Free with the One-Line Change

Reversible PII redaction (synthetic / mask / hash) on every prompt before it leaves your environment.
12+ detection vectors across 4 layers (pattern, semantic, ML, behavioral) — prompt injection, jailbreak attempts, exfiltration patterns.
Six compliance presets (standard, strict, permissive, hipaa, pci-dss, gdpr) selectable per workspace or per route.
Content-free audit (content: false) — SHA-256 hash-chained log without storing prompts, GDPR / HIPAA / PCI DSS friendly.
Provider fallback across 12 LLM providers with policy preserved across the failover.

What Latency Does the Drop-In Proxy Add?

Inline overhead is small relative to LLM round-trip times. P50 ~ 8 ms, P95 ~ 22 ms, P99 < 50 ms; block path 18 ms; throughput 10,000+ actions per second. Typical OpenAI Chat Completions round-trip times sit between 1 and 6 seconds for non-trivial prompts, so the gateway adds well under 1% overhead in normal operation.

Next Steps

Start by setting OPENAI_BASE_URL=https://api.vaikora.com/v1 in a non-production environment, run your existing test suite unchanged, and verify the audit log shows your traffic. From there, the 30-minute end-to-end setup guide covers account creation, key issuance, smoke test, first policy, and audit review.

Your AI Agents Need a Control Layer

See how Vaikora intercepts, evaluates, and enforces policy on every AI agent action — in real time, before execution.

Frequently Asked Questions

Do I need to swap the OpenAI SDK?

No. Vaikora is OpenAI-compatible on the inbound side. Keep your existing openai (Python) or openai (Node/TS) SDK; only change the base_url / baseURL and the API key.

What environment variables do I set?

Either set OPENAI_BASE_URL=https://api.vaikora.com/v1 and OPENAI_API_KEY=$VAIKORA_API_KEY for a pure-env override with no code change, or pass base_url=… and api_key=… in the constructor for an explicit code-level override. Both patterns are supported.

Does this work with LangChain / LlamaIndex / Vercel AI SDK?

Yes. Any library that wraps the OpenAI SDK works because the wire format is unchanged. LangChain’s ChatOpenAI, LlamaIndex’s OpenAI LLM class, and Vercel’s openai provider all accept a base URL override.

What happens to streaming responses?

Streaming over server-sent events on /v1/chat/completions with stream: true is preserved end-to-end. Policy is applied to the request before it is forwarded and to the assembled response stream as it flows back, with no client-visible buffering beyond normal proxy overhead.

Can I use providers other than OpenAI behind the same endpoint?

Yes. Vaikora supports 12 LLM providers — OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Mistral, Cohere, Together AI, Groq, Ollama, and custom vLLM. The application speaks OpenAI; the gateway routes to whichever upstream the model string or routing policy selects.

Does provider fallback weaken policy enforcement?

No. PII redaction, prompt-injection detection, and SHA-256 hash-chained audit are applied identically regardless of which upstream serves the request. Policy enforcement is preserved across the fallback path.

How long does the integration actually take?

The code change is one line and takes minutes. Most production rollouts (staging validation, audit log review, policy tuning) are operational within 48 hours. The companion guide “Drop-In AI Gateway: Replace Your OpenAI Endpoint in 30 Minutes” walks through the timed steps.