Built in public · v0.2.0

One API for every LLM.
Global, open-source, and Indian.

Drop-in OpenAI-compatible gateway with routing, caching, and observability across OpenAI, Anthropic, Together, Sarvam, and your self-hosted models. BYOK (bring your own keys). Zero markup.

Two lines. That’s the migration.

No new SDK. No refactoring. Change your base URL and key — your prompts, streaming, parameters, and parsing stay exactly the same.

# Before
from openai import OpenAI
client = OpenAI(api_key="sk-...")

# After — change these two lines
from openai import OpenAI
client = OpenAI(
    api_key="ib_your_key_here",
    base_url="https://inferbridge.dev/v1",
)
// Before
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// After — change these two lines
import OpenAI from "openai";
const client = new OpenAI({
  apiKey: process.env.INFERBRIDGE_API_KEY,
  baseURL: "https://inferbridge.dev/v1",
});
# Before
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hi"}]}'

# After — change the URL and key (switching to an ib/ tier model is optional)
curl https://inferbridge.dev/v1/chat/completions \
  -H "Authorization: Bearer ib_your_key_here" \
  -d '{"model":"ib/balanced","messages":[{"role":"user","content":"Hi"}]}'

That’s it. No code changes. No lock-in. Revert the two lines anytime.

Route smarter. Pay less. See everything.

Route by tier, not by guessing.

Pick a tier — cheap, balanced, or premium. InferBridge handles the rest: which provider, which model, what to do when one fails. Override per-request when you need surgical control.
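The override is just the model string on an individual call. A sketch with hypothetical request bodies (the ib/ tier names are InferBridge's; everything else is a standard chat-completions payload):

```python
# Default: let InferBridge pick a provider within the "balanced" tier.
default_request = {
    "model": "ib/balanced",
    "messages": [{"role": "user", "content": "Summarise this ticket."}],
}

# Per-request override: pin a bulk job to the cheap tier without
# touching any global configuration.
bulk_request = {**default_request, "model": "ib/cheap"}
```

Nothing else in your code changes; the tier string is the whole control surface.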

See every token, every rupee.

Per-request logs with tokens, latency, cost, and provider. One endpoint answers “where is my money going?” Aggregate by mode, provider, or time range.
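Answering that question is one group-by over the per-request records. The field names below are illustrative, not the gateway's actual stats schema:

```python
from collections import defaultdict

# Hypothetical per-request log records, metadata only (no content).
records = [
    {"provider": "openai", "tokens": 812,  "latency_ms": 640, "cost_inr": 0.42},
    {"provider": "sarvam", "tokens": 355,  "latency_ms": 910, "cost_inr": 0.11},
    {"provider": "openai", "tokens": 1290, "latency_ms": 720, "cost_inr": 0.66},
]

# "Where is my money going?" = sum cost per provider.
cost_by_provider = defaultdict(float)
for r in records:
    cost_by_provider[r["provider"]] += r["cost_inr"]
```

The same fold works per mode or per time window; swap the grouping key.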

Indian providers, first-class.

Sarvam and self-hosted endpoints ship on day one. Add X-InferBridge-Residency: india to route only to India-hosted infrastructure. No other gateway does this.

# Your app
client.chat.completions.create(model="ib/balanced", ...)

         ↓  InferBridge evaluates:

         model: "ib/cheap"       → Together (Llama 3.3 70B)
         model: "ib/balanced"    → OpenAI (gpt-4o-mini)
         model: "ib/premium"     → Anthropic (claude-opus-4-7)
         residency: "india"      → Sarvam (sarvam-m)
         fallback: on 5xx/429    → next candidate in tier

         ↓  Your app receives: standard OpenAI response

Real routing logic, not marketing. Override any decision per-request.
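The evaluation above can be modelled as a lookup plus a fallback walk. This is an illustrative sketch, not the gateway's actual code; the first candidate in each tier comes from the diagram, and the second candidates are made up for illustration:

```python
# Candidate lists per tier, first entry preferred. Second entries
# are hypothetical fallbacks, added here only to show the walk.
TIERS = {
    "ib/cheap":    [("together", "Llama-3.3-70B-Instruct-Turbo")],
    "ib/balanced": [("openai", "gpt-4o-mini"),
                    ("together", "Llama-3.3-70B-Instruct-Turbo")],
    "ib/premium":  [("anthropic", "claude-opus-4-7"),
                    ("openai", "gpt-4o")],
}

RETRYABLE = {429, 500, 502, 503, 504}

def route(tier: str, statuses: dict) -> tuple:
    """Walk the tier's candidates, skipping providers answering 5xx/429."""
    for provider, model in TIERS[tier]:
        if statuses.get(provider, 200) not in RETRYABLE:
            return provider, model
    raise RuntimeError(f"all candidates in {tier} are unavailable")

# OpenAI rate-limited: the balanced tier falls through to Together.
choice = route("ib/balanced", {"openai": 429})
```

Your application never sees the walk; it only sees a standard OpenAI response from whichever candidate answered.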

Four steps. Five minutes.

  1. Register

    POST to /v1/users with your email. You get an API key starting with ib_. Shown once, stored as a SHA-256 hash.

  2. Add your provider keys

    BYOK means you bring your existing OpenAI, Anthropic, Together, or Sarvam keys. We Fernet-encrypt them at rest and never touch your billing.

  3. Change your base URL

    Two lines in your existing code. Any language, any framework. If it talks to OpenAI, it talks to InferBridge.

  4. Ship

    Monitor cost and latency at /v1/stats. Fallback and caching work automatically. Revert the two lines if anything breaks.
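Step 1's “shown once, stored as a SHA-256 hash” lifecycle can be sketched in a few lines. The ib_ prefix matches the docs above; the key length and generation scheme here are illustrative:

```python
import hashlib
import secrets

# The plaintext key is shown to you exactly once at registration...
plaintext_key = "ib_" + secrets.token_urlsafe(32)

# ...and only its SHA-256 digest is stored server-side.
stored_digest = hashlib.sha256(plaintext_key.encode()).hexdigest()

# Later requests are authenticated by re-hashing the presented key
# and comparing digests, so a database leak never exposes live keys.
presented = plaintext_key
authenticated = hashlib.sha256(presented.encode()).hexdigest() == stored_digest
```

This is why a lost key can't be re-displayed: the plaintext simply isn't kept.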

Five providers. One API.

Provider      Residency       Example models
OpenAI        Global          gpt-4o-mini, gpt-4o
Anthropic     Global          claude-haiku-4-5, claude-opus-4-7
Together AI   Global          Llama-3.3-70B-Instruct-Turbo
Sarvam        India           sarvam-m
Self-hosted   User-declared   Any OpenAI-compatible endpoint

Register any combination. Route across all of them.

Honest pricing. No markup. Ever.

You pay your providers directly through your own BYOK keys. InferBridge never marks up tokens.

Available now

Free

₹0 / month

Unlimited BYOK usage. 30-day log retention. All providers, all routing modes, all observability.

Get your API key

Coming soon

Pro

₹1,499 / month

Unlimited log retention. Priority support. Custom routing rules. Team seats.

Not available yet

Coming soon

Business

₹9,999 / month

SLA. Dedicated support. Custom residency policies. GST-compliant invoices.

Not available yet

Questions developers ask.

Does InferBridge store my prompts?

No. We log metadata only — tokens, latency, cost, provider, request ID. Prompt and completion content are never written to logs. There’s a dedicated test in the codebase that fails if anyone tries to change this.
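A guard like the one described might look like this; it's our own illustrative version of such a test, not the actual one in the codebase:

```python
# Hypothetical log record containing only the metadata fields listed above.
log_record = {
    "request_id": "req_123",
    "provider": "openai",
    "tokens": 512,
    "latency_ms": 480,
    "cost_inr": 0.31,
}

# Any of these keys appearing in a log record means content leaked.
FORBIDDEN_FIELDS = {"prompt", "messages", "completion", "content"}

def assert_no_content_in_logs(record: dict) -> None:
    leaked = FORBIDDEN_FIELDS & record.keys()
    assert not leaked, f"content fields leaked into logs: {leaked}"

assert_no_content_in_logs(log_record)
```

Run against every record shape the gateway emits, a check like this turns the “metadata only” promise into something CI can enforce.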

What happens if InferBridge goes down?

Revert the two lines. Your SDK talks directly to OpenAI again in seconds. No data migration. No lock-in.

How is this different from OpenRouter or Portkey?

BYOK by default — you pay providers directly, we take no cut. Indian providers (Sarvam, self-hosted) are first-class, not an afterthought. And we’re built for developers who want simple, debuggable routing, not an enterprise policy engine.

What about tool calling, vision, and embeddings?

Streaming tool use and vision aren’t hardened yet — use the provider SDK directly for those. Embeddings aren’t exposed through InferBridge at all in v1. Text chat completions, streaming, caching, and fallback are production-ready.

Is my data safe?

Provider API keys are Fernet-encrypted at rest, decrypted only in-memory at request time. We never log content. You can delete any registered key from /v1/keys at any time.

Can I self-host InferBridge?

Not publicly yet. Open-source release is planned for month 2. Email yogesh@inferbridge.dev if you want early access or need an on-prem deployment.

Launching publicly on [Day-14 date].

Get the public launch announcement, early-access pricing, and the v0.2.0 migration guide.

Join the waitlist

Loops embed drops in here — until then, email works too.