One API Key for Claude, GPT, Gemini, Grok, DeepSeek
Stop juggling 5 AI vendor accounts. One sk-aic-* key, one base URL, every major model — Claude, GPT, Gemini, Grok, DeepSeek. Setup in 2 minutes.
The Problem Nobody Warns You About
You start a side project. You pick Claude because the writing is great. Two weeks later you want to try GPT-4o for vision. Then a friend tells you Gemini Flash is dirt cheap and you should benchmark. Then DeepSeek ships v3 and the timeline goes nuts about how good the coding is.
Now you have:
Five billing pages. Five sets of usage limits. Five separate "oops you hit your monthly cap" emails. Five different SDK shapes to learn.
For a solo founder this is death by a thousand papercuts. And the worst part? You're paying full sticker price at every one of them.
This post is the pillar explainer for what we built to fix this: one key, one base URL, every major model. 70-80% off.
The Pitch in One Paragraph
aiapi.cheap gives you a single API key (sk-aic-*). You point your existing Anthropic SDK or OpenAI SDK at our proxy, then you can call Claude, GPT, Gemini, Grok, or DeepSeek by changing one field — the model name. Same code, same SDK, every vendor. Pay-per-use, crypto top-ups, no subscription.
That's the whole product. The rest of this post explains why each piece matters and how to set it up.
Why One Key Beats Five
1. Less Billing Mental Overhead
You top up one balance. You watch one usage chart. When the bill is high, you know exactly where to look. No more "wait, did Anthropic charge me twice or did Google forget to invoice me?"
2. Switching Models Is Free
Want to A/B Claude Sonnet vs GPT-4o on the same prompt? Today, that's two SDK installs, two clients, two credit cards. With aiapi.cheap, it's model: "claude-sonnet-4-6" vs model: "gpt-4o" — same client, same key, both responses cost 70-80% less than retail.
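Here's a sketch of what that A/B looks like in code. The endpoint path is an assumption pieced together from the base URL in the setup sections below, and the model names are illustrative — adjust both to whatever your account actually exposes:

```typescript
// Hypothetical A/B harness: same prompt, two models, one key.
type ChatBody = {
  model: string;
  messages: { role: "user"; content: string }[];
};

// Build one request body per model: only the model field differs.
function abRequests(prompt: string, models: string[]): ChatBody[] {
  return models.map((model) => ({
    model,
    messages: [{ role: "user", content: prompt }],
  }));
}

// Fire all requests with the same key and collect the answers.
async function abTest(prompt: string, models: string[]) {
  return Promise.all(
    abRequests(prompt, models).map(async (body) => {
      const res = await fetch("https://aiapi.cheap/api/proxy/v1/chat/completions", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
        },
        body: JSON.stringify(body),
      });
      const json = await res.json();
      return { model: body.model, text: json.choices[0].message.content };
    })
  );
}
```

Run abTest("your prompt", ["claude-sonnet-4-6", "gpt-4o"]) and diff the two answers — the only thing that varied is the string.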
3. Future-Proofing
The hottest model six months from now is not the hottest model today. Three years ago people were stuck on GPT-3.5. Then Claude 3 Opus. Then Sonnet 3.5 ate everything. Then GPT-4o. Then DeepSeek V3.2 came out of nowhere and made everyone re-evaluate. If you're locked into one vendor, you eat the migration cost every time the leaderboard shifts. With one universal key, you swap a string.
4. Real Savings
At 70-80% off retail across all five vendors, the math gets serious fast. A vibe coder using Claude Sonnet four hours a day might pay $80-120/month at official rates. With us, $15-25/month. Multiply that across five vendors and the savings compound when you start mixing models for different tasks.
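To make that math concrete, the discount calculation is a one-liner; 0.7 and 0.8 are this post's headline discount rates, and the retail figure is whatever your current invoices say:

```typescript
// Back-of-envelope savings: retail spend times (1 - discount).
function discountedMonthly(retailUsd: number, discount: number): number {
  return +(retailUsd * (1 - discount)).toFixed(2);
}

// $100/month retail at the 80% discount leaves $20/month.
```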
Setup: Anthropic SDK Format
If your code already uses @anthropic-ai/sdk or the Python anthropic package, here's the change:
```bash
export ANTHROPIC_API_KEY="sk-aic-your-key-here"
export ANTHROPIC_BASE_URL="https://aiapi.cheap/api/proxy"
```

Then your existing code just works:
```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const message = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "Write a haiku about cheap APIs." }
  ]
});

console.log(message.content);
```

This hits our /v1/messages endpoint, which mirrors the Anthropic format 1:1. Streaming, tool use, vision, prompt caching — all the same shape as the official SDK.
For the canonical request format reference, the Anthropic API getting-started docs describe what we mirror.
Setup: OpenAI SDK Format (Universal)
This is the magic trick. The OpenAI Chat Completions format has become the industry default. Almost every model speaks it now. We expose a single endpoint that routes to all 5 vendors based on the model name.
```bash
export OPENAI_API_KEY="sk-aic-your-key-here"
export OPENAI_BASE_URL="https://aiapi.cheap/api/proxy"
```

Now the same client, same code, hits any vendor:
```typescript
import OpenAI from "openai";

const client = new OpenAI();

// Call GPT-4o
const gpt = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello from OpenAI!" }]
});

// Same client, change model — now you're calling Claude
const claude = await client.chat.completions.create({
  model: "claude-sonnet-4-6",
  messages: [{ role: "user", content: "Hello from Anthropic!" }]
});

// Same client, change model — now you're calling Gemini
const gemini = await client.chat.completions.create({
  model: "gemini-3-pro-preview",
  messages: [{ role: "user", content: "Hello from Google!" }]
});

// Same client, change model — Grok
const grok = await client.chat.completions.create({
  model: "grok-4.2",
  messages: [{ role: "user", content: "Hello from xAI!" }]
});

// Same client, change model — DeepSeek
const deepseek = await client.chat.completions.create({
  model: "deepseek-v3.2",
  messages: [{ role: "user", content: "Hello from DeepSeek!" }]
});
```

Five vendors. One client. One key. The OpenAI Chat Completions request shape is documented in the OpenAI API reference, and we accept the same body for all five.
Vendor Coverage at a Glance
Which vendor for which job? Quick map:
| Vendor | Example Model | Best For |
| --- | --- | --- |
| Claude (Anthropic) | claude-sonnet-4-6, claude-opus-4-7 | Long-form writing, code review, agentic loops, refactoring |
| GPT (OpenAI) | gpt-4o, gpt-4o-mini | Vision, function calling, fast cheap tasks, broad ecosystem |
| Gemini (Google) | gemini-3-pro-preview | Massive context windows, multimodal, real-time low-latency |
| Grok (xAI) | grok-4.2 | Real-time data flavor, casual conversational tone |
| DeepSeek | deepseek-v3.2 | Coding, math, the cheapest tier per token |
This isn't a tier list — it's just what people typically reach for first. The point is: your stack should not force you to commit to one of these and abandon the rest.
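If you'd rather keep that map in code than in your head, a tiny router works. The task labels and model picks below just restate the table above — they're a starting point to tune against your own benchmarks, not canon:

```typescript
// Route by task type: one place to change when the leaderboard shifts.
type Task = "writing" | "vision" | "long-context" | "coding" | "chat";

const MODEL_FOR_TASK: Record<Task, string> = {
  writing: "claude-sonnet-4-6",           // long-form prose, refactoring
  vision: "gpt-4o",                       // images, function calling
  "long-context": "gemini-3-pro-preview", // huge context windows
  coding: "deepseek-v3.2",                // cheapest strong coder
  chat: "grok-4.2",                       // casual conversational tone
};

function pickModel(task: Task): string {
  return MODEL_FOR_TASK[task];
}
```

Swapping vendors later is a one-line edit to the map; none of the call sites change.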
What "Same Quality" Actually Means
We forward your request to the real vendor's API unchanged. We don't swap in a different model, trim your context, or downgrade any capability. If Claude Sonnet 4.6 has a 200K context window on the official API, it has a 200K context window through us. If GPT-4o supports vision, vision works. If Gemini 3 Pro supports the latest tool-use shape, that works.
The one thing we strip from streaming responses is the vendor fingerprint metadata (the headers and IDs that identify the upstream provider on the wire). Functionally that's invisible to your code — you still get the full text, full streaming chunks, full structured responses. We just don't broadcast which upstream credit pool served the request.
Streaming Works the Same Way
Both endpoints support Server-Sent Events streaming exactly like the official APIs:
```typescript
const stream = await client.messages.stream({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Stream me a long response." }]
});

for await (const event of stream) {
  if (event.type === "content_block_delta") {
    process.stdout.write(event.delta.text || "");
  }
}
```

For OpenAI-format streaming, set stream: true on the chat completion call and iterate the chunks. Same as you'd do with the real OpenAI SDK.
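Here's a sketch of that OpenAI-format version. The chunk shape follows the standard Chat Completions streaming format (choices[0].delta.content), and the client is assumed to be an OpenAI SDK instance pointed at the proxy as shown in the setup section:

```typescript
// Pull the text delta out of one streaming chunk (standard Chat
// Completions streaming shape: choices[0].delta.content).
function deltaText(chunk: {
  choices: { delta: { content?: string | null } }[];
}): string {
  return chunk.choices[0]?.delta?.content ?? "";
}

// `client` is an OpenAI SDK instance configured via the env vars
// above. stream: true yields an async iterable of chunks.
async function streamChat(client: any, model: string, prompt: string) {
  const stream = await client.chat.completions.create({
    model,
    stream: true,
    messages: [{ role: "user", content: prompt }],
  });
  for await (const chunk of stream) {
    process.stdout.write(deltaText(chunk));
  }
}
```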
Tools That Already Work
Because we mirror the official wire formats, anything that speaks Anthropic or OpenAI just plugs in:
- Claude Code, Cursor, and anything else that reads ANTHROPIC_BASE_URL or OPENAI_BASE_URL from the environment.
- LangChain: pass base_url to the ChatAnthropic / ChatOpenAI constructor.

The general rule: if the tool lets you set OPENAI_BASE_URL or ANTHROPIC_BASE_URL, it works.
A Real Workflow Example
Let's say you're building a content drafting tool. You want:

- Claude to write the long-form draft
- GPT-4o-mini to punch up the headline, fast and cheap
- Gemini to score the draft against the brief

With three separate vendor accounts, that's three SDK setups, three clients, three billing reconciliations.
With one sk-aic-* key:
```typescript
import OpenAI from "openai";

const client = new OpenAI();

async function draftAndScore(brief: string) {
  const draft = await client.chat.completions.create({
    model: "claude-sonnet-4-6",
    messages: [{ role: "user", content: `Draft a blog post: ${brief}` }]
  });
  const title = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: `Improve this headline: ${draft.choices[0].message.content}` }]
  });
  const score = await client.chat.completions.create({
    model: "gemini-3-pro-preview",
    messages: [{ role: "user", content: `Score this draft 1-10 against the brief.\nBrief: ${brief}\nDraft: ${draft.choices[0].message.content}` }]
  });
  return { draft, title, score };
}
```

Three vendors, one client, one billing line. This is what we mean when we say multi-AI shouldn't be hard.
Pricing in Plain English
Basic (free forever): 70% off all 5 vendors. 200 requests/minute. Pay-per-use, top up when you need it.
Pro ($19 lifetime): 80% off all 5 vendors. 500 requests/minute. One-time payment, never expires.
No monthly subscription. You pay for what you actually use, in tokens, at our discounted rate. Top up via crypto (Oxapay supports 100+ coins). Minimum $5 to start, useful for benchmarking before you commit to anything.
Common Questions
Is this an unofficial API wrapper? No. We forward your request to the real provider. Same models, same versions, same outputs. We just buy the credits in bulk and pass the discount to you.
What about rate limits? Plan-based at our edge (200 RPM Basic, 500 RPM Pro). Beyond that, the upstream vendor's per-model rate still applies — same as if you were calling direct.
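If you're pushing near the RPM cap, a small backoff wrapper keeps things smooth. This is a generic sketch, not a proxy feature: the status check assumes your HTTP client surfaces a 429 as err.status (the OpenAI SDK does), and the delay schedule is arbitrary:

```typescript
// Exponential backoff schedule: 500ms, 1s, 2s, ...
function backoffDelays(attempts: number, baseMs = 500): number[] {
  return Array.from({ length: attempts }, (_, i) => baseMs * 2 ** i);
}

// Retry only on 429 (rate limited); rethrow anything else immediately.
async function withRetry<T>(fn: () => Promise<T>, attempts = 4): Promise<T> {
  const delays = backoffDelays(attempts);
  for (let i = 0; ; i++) {
    try {
      return await fn();
    } catch (err: any) {
      if (err?.status !== 429 || i >= attempts - 1) throw err;
      await new Promise((r) => setTimeout(r, delays[i]));
    }
  }
}
```

Wrap any call site: withRetry(() => client.chat.completions.create({ ... })).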
Do prompts get logged or trained on? No. We don't store prompt content. We log usage metadata (token counts, timestamps, model) for billing — that's it.
What happens if a vendor goes down? That request fails (same as direct). The other four vendors keep working. This is one of the underrated advantages of multi-vendor — when GPT has a bad afternoon, your Claude calls don't care.
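That failover doesn't have to be manual. Since every model is one client and one key away, a fallback chain is a few lines; this is a generic sketch you'd write yourself, not a built-in proxy feature:

```typescript
// Try each call in order; return the first one that succeeds.
async function firstHealthy<T>(calls: Array<() => Promise<T>>): Promise<T> {
  let lastErr: unknown = new Error("no calls provided");
  for (const call of calls) {
    try {
      return await call();
    } catch (err) {
      lastErr = err; // this vendor is down or erroring; try the next
    }
  }
  throw lastErr;
}

// Usage sketch: prefer GPT, fall back to Claude on failure.
// firstHealthy([
//   () => client.chat.completions.create({ model: "gpt-4o", messages }),
//   () => client.chat.completions.create({ model: "claude-sonnet-4-6", messages }),
// ]);
```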
Can I see token usage per model? Yes, your dashboard breaks down spend per vendor and per model so you can spot which one is eating your budget.
Welcome to One-Key Multi-AI
That's the whole shape of the thing. One key, one base URL, every major model, 70-80% cheaper. If your existing code uses the Anthropic SDK or the OpenAI SDK, the migration is two environment variables.
For the broader product context and why we built this, see the welcome post. For tool-specific setup walkthroughs (Claude Code, Cursor, LangChain), the docs cover each in detail.
If you've been putting off trying GPT or Gemini because "ugh, another account," this is the post that says you don't have to. Top up $5, change two env vars, and benchmark for an afternoon. Worst case you've spent a coffee. Best case you save a few hundred dollars a month.
That's the pitch.