NEW! Data443 Acquires VaikoraReal-Time AI Runtime Control & Enforcement for AI Agent

Home | Blog | OpenAI Proxy Integration Without Rewriting Your App

OpenAI Proxy Integration Without Rewriting Your App

You can put an OpenAI-compatible gateway in front of an existing application by changing one line: swap the OpenAI base URL to your gateway, add an auth header, and ship. No SDK swap, no client rewrite, no application redeploy beyond the config change. This guide shows the exact one-line change in Python (sync and async) and Node/TypeScript, the env variables involved, the timeout / retry / header behavior to preserve, and how Vaikora’s drop-in proxy applies the same security policy across 12 LLM providers with provider fallback routing.

What Is an OpenAI-Compatible Gateway?

An OpenAI-compatible gateway is a proxy that speaks the OpenAI REST API on the inbound side, so any client written against the OpenAI SDK can use it without code changes, and routes outbound to one or more LLM providers. The application keeps its existing OpenAI Python or Node SDK; the only change is the base URL. The gateway terminates the inbound call, applies policy (PII redaction, prompt-injection detection, audit), and forwards the request to OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Mistral, Cohere, Together AI, Groq, Ollama, or a custom vLLM endpoint.

The One-Line Change: Swap the Base URL

The integration pattern is the same in every language. The OpenAI SDK accepts a base_url (Python) or baseURL (Node/TS) parameter. Point it at the gateway, set an auth header, and the rest of the application keeps working.

Python (sync) — OpenAI SDK ≥ 1.0

from openai import OpenAI

client = OpenAI(
    base_url=”https://api.vaikora.com/v1″,
    api_key=os.environ[“VAIKORA_API_KEY”],   # gateway key, not the upstream provider key
)

resp = client.chat.completions.create(
    model=”gpt-4o”,
    messages=[{“role”: “user”, “content”: “Summarize this contract in 3 bullets.”}],
)
print(resp.choices[0].message.content)

Python (async) — same one-line change

from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url=”https://api.vaikora.com/v1″,
    api_key=os.environ[“VAIKORA_API_KEY”],
)

async def summarize(text: str) -> str:
    resp = await client.chat.completions.create(
        model=”gpt-4o”,
        messages=[{“role”: “user”, “content”: text}],
    )
    return resp.choices[0].message.content

Node / TypeScript — OpenAI SDK v4+

import OpenAI from “openai”;

const client = new OpenAI({
  baseURL: “https://api.vaikora.com/v1”,
  apiKey:  process.env.VAIKORA_API_KEY,
});

const resp = await client.chat.completions.create({
  model: “gpt-4o”,
  messages: [{ role: “user”, content: “Summarize this contract in 3 bullets.” }],
});
console.log(resp.choices[0].message.content);

cURL — proves the wire format is unchanged

curl https://api.vaikora.com/v1/chat/completions \
  -H “Authorization: Bearer $VAIKORA_API_KEY” \
  -H “Content-Type: application/json” \
  -d ‘{
    “model”: “gpt-4o”,
    “messages”: [{“role”: “user”, “content”: “Hello”}]  }’

The wire format is the OpenAI Chat Completions schema. Anything that already speaks OpenAI — LangChain, LlamaIndex, Vercel AI SDK, custom Python or TypeScript code — works without modification.

Environment Variables: What to Set

There are two acceptable patterns. Either (a) keep the OpenAI SDK env names and override base_url in code, or (b) rely on the SDK’s built-in environment variables and never touch application code.

Setting Default OpenAI SDK behavior What to preserve through the gateway
Code-level override
VAIKORA_API_KEY
plus base_url=… in the client constructor
When you want the gateway to be explicit and visible in the constructor
Pure env override
OPENAI_BASE_URL
https://api.vaikora.com/v1 and OPENAI_API_KEY
=$VAIKORA_API_KEY
When you cannot touch the application code at all and want to flip the gateway via deploy config
Hybrid
OPENAI_BASE_URL
=https://api.vaikora.com/v1,
custom secret name VAIKORA_API_KEY, code reads either
When you operate multiple environments (dev / staging / prod) with different gateways

Drop-In Proxy: Minimal Configuration Changes

No core application rewrite is required. The integration is a drop-in proxy with minimal configuration changes — only timeouts, headers, and retries need a brief look. The OpenAI SDK has reasonable defaults; preserve them in three places when adopting a gateway.

Pattern Variables to set When to use
Request timeout
60 seconds for chat completions
Keep client-side timeout ≥ 60s so streaming responses are not truncated by the gateway hop
Retries
Two automatic retries with backoff
Leave SDK retries on; the gateway should not double-retry on idempotent calls
Custom headers
X-Request-Id, X-Idempotency-Key
Pass-through preserved by Vaikora; correlation_id is added on the gateway side for audit
Streaming
Server-sent events on /v1/chat/completions with stream: true
Streaming is preserved end-to-end; policy decisions are made on the request and on the assembled response

12 LLM Providers Behind One OpenAI-Compatible Endpoint

Vaikora’s gateway accepts the OpenAI Chat Completions schema on the inbound side and routes outbound to any of the 12 supported LLM providers based on the model field or a routing policy. The application keeps using the OpenAI SDK; the gateway picks the upstream.

Provider category Providers
Frontier APIs
OpenAI, Anthropic, Google Gemini
Cloud-hosted enterprise
Azure OpenAI, AWS Bedrock
Open-weights and specialty
Mistral, Cohere, Together AI, Groq
Self-hosted / private
Ollama, custom vLLM endpoint

Switching the upstream provider is a model-string change or a routing-policy update — not a client change. A health-check or rate-limit failure on one provider falls back to the next provider in the routing policy without an exception surfacing in the client.

Provider Fallback Routing Example

A typical fallback policy specifies a primary provider, one or two backups, and the trigger conditions. Below is the shape of a routing policy that prefers OpenAI, falls back to Anthropic on rate-limit, and falls back to Azure OpenAI on outage.

# vaikora-routing.yaml
model: gpt-4o
primary:
  provider: openai
  model:    gpt-4o
fallbacks:
  – provider: anthropic
    model:    claude-sonnet-4-6
    trigger:  rate_limit
  – provider: azure-openai
    model:    gpt-4o
    trigger:  upstream_5xx
policy_enforcement: preserved   # PII redaction + audit applied identically across providers

Critically: policy enforcement is preserved across providers. PII redaction, prompt-injection detection, and SHA-256 hash-chained audit do not weaken when the gateway falls back from OpenAI to Anthropic to Azure. The application sees one OpenAI-shaped response; the gateway records which upstream actually served it.

What You Get for Free with the One-Line Change

  • Reversible PII redaction (synthetic / mask / hash) on every prompt before it leaves your environment.
  • 12+ detection vectors across 4 layers (pattern, semantic, ML, behavioral) — prompt injection, jailbreak attempts, exfiltration patterns.
  • Six compliance presets (standard, strict, permissive, hipaa, pci-dss, gdpr) selectable per workspace or per route.
  • Content-free audit (content: false) — SHA-256 hash-chained log without storing prompts, GDPR / HIPAA / PCI DSS friendly.
  • Provider fallback across 12 LLM providers with policy preserved across the failover.

What Latency Does the Drop-In Proxy Add?

Inline overhead is small relative to LLM round-trip times. P50 ~ 8 ms, P95 ~ 22 ms, P99 < 50 ms; block path 18 ms; throughput 10,000+ actions per second. Typical OpenAI Chat Completions round-trip times sit between 1 and 6 seconds for non-trivial prompts, so the gateway adds well under 1% overhead in normal operation.

Next Steps

Start by setting OPENAI_BASE_URL=https://api.vaikora.com/v1 in a non-production environment, run your existing test suite unchanged, and verify the audit log shows your traffic. From there, the 30-minute end-to-end setup guide covers account creation, key issuance, smoke test, first policy, and audit review.

Your AI Agents Need a Control Layer

See how Vaikora intercepts, evaluates, and enforces policy on every AI agent action — in real time, before execution.

 Frequently Asked Questions

Do I need to swap the OpenAI SDK?

No. Vaikora is OpenAI-compatible on the inbound side. Keep your existing openai (Python) or openai (Node/TS) SDK; only change the base_url / baseURL and the API key.

What environment variables do I set?

Either set OPENAI_BASE_URL=https://api.vaikora.com/v1 and OPENAI_API_KEY=$VAIKORA_API_KEY for a pure-env override with no code change, or pass base_url=… and api_key=… in the constructor for an explicit code-level override. Both patterns are supported.

Does this work with LangChain / LlamaIndex / Vercel AI SDK?

Yes. Any library that wraps the OpenAI SDK works because the wire format is unchanged. LangChain’s ChatOpenAI, LlamaIndex’s OpenAI LLM class, and Vercel’s openai provider all accept a base URL override.

What happens to streaming responses?

Streaming over server-sent events on /v1/chat/completions with stream: true is preserved end-to-end. Policy is applied to the request before it is forwarded and to the assembled response stream as it flows back, with no client-visible buffering beyond normal proxy overhead.

Can I use providers other than OpenAI behind the same endpoint?

Yes. Vaikora supports 12 LLM providers — OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Mistral, Cohere, Together AI, Groq, Ollama, and custom vLLM. The application speaks OpenAI; the gateway routes to whichever upstream the model string or routing policy selects.

Does provider fallback weaken policy enforcement?

No. PII redaction, prompt-injection detection, and SHA-256 hash-chained audit are applied identically regardless of which upstream serves the request. Policy enforcement is preserved across the fallback path.

How long does the integration actually take?

The code change is one line and takes minutes. Most production rollouts (staging validation, audit log review, policy tuning) are operational within 48 hours. The companion guide “Drop-In AI Gateway: Replace Your OpenAI Endpoint in 30 Minutes” walks through the timed steps.