Doubao Seed 2.1 API (2026): Pro & Turbo, No Volcano Signup

Call Doubao Seed 2.1 Pro ($0.884/$4.42 per M) and Turbo (half: $0.442/$2.212) from one endpoint. 256K context, one key, no separate Volcano signup.

ByteDance announced Doubao Seed 2.1 on June 24, 2026, at the Volcano Engine FORCE conference. Two variants, Pro and Turbo, both at 256K context. Going to them directly means a Volcano Engine account: either the registration that wants a phone number, real-name verification, and a CNY top-up, or a separate overseas account on Volcano’s international console with an international card. This guide takes a third route. You call both variants from one OpenAI-compatible endpoint with a single key you may already have, and you flip between them by editing one string.

30-second answer

  • What you can do: Call Doubao Seed 2.1 Pro and Turbo from the standard OpenAI SDK (Python or Node), switch between them by changing the model string, and send image input to either one.
  • Time required: About 5 minutes if you already have an ofox key. About 10 if you need to sign up.
  • What you need: An ofox.ai API key, the openai SDK (any recent version), and the two model IDs: volcengine/doubao-seed-2.1-pro and volcengine/doubao-seed-2.1-turbo.

The short version of the pricing, since it drives every routing decision below: Pro is $0.884 input and $4.42 output per million tokens. Turbo is about half, $0.442 and $2.212. Cached input drops the floor further, to $0.177 on Pro and $0.085 on Turbo. Both have the same 256K window.

| | Doubao Seed 2.1 Pro | Doubao Seed 2.1 Turbo |
| --- | --- | --- |
| Model ID | volcengine/doubao-seed-2.1-pro | volcengine/doubao-seed-2.1-turbo |
| Input ($/M) | $0.884 | $0.442 |
| Output ($/M) | $4.42 | $2.212 |
| Cached input ($/M) | $0.177 | $0.085 |
| Context window | 256,000 | 256,000 |
| Max output | 256,000 | 256,000 |
| Modality | Text + image in, text out | Text + image in, text out |
| Positioning | Flagship deep thinking: complex coding, long-chain agents, multi-step delivery | Low cost, low latency: high-frequency enterprise traffic |

Turbo’s per-token price is half of Pro’s on input and within a few percent of half on output and cached input ($2.212 against an exact half of $2.21, and $0.085 against $0.0885). ByteDance says Turbo’s features are complete and its performance is comparable to Pro. That is the vendor’s framing, not a benchmark, so the routing question below is really “how confident are you that the cheap variant holds up on this specific task.”
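
To make the table usable in code, here is a minimal sketch of a per-request cost estimate. The prices are the ones listed above; the function name and the assumption that cached tokens are a subset of input tokens billed at the cached rate are mine, so check your invoice before relying on it.

```python
# USD per million tokens, copied from the pricing table in this guide.
PRICES = {
    "pro":   {"input": 0.884, "output": 4.42,  "cached": 0.177},
    "turbo": {"input": 0.442, "output": 2.212, "cached": 0.085},
}

def request_cost(tier, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one call (hypothetical helper).

    Assumption: cached_tokens is the part of input_tokens that hit the
    cache and is billed at the cached rate instead of the full input rate.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    price = PRICES[tier]
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * price["input"]
        + cached_tokens * price["cached"]
        + output_tokens * price["output"]
    )
    return total / 1_000_000

print(f"${request_cost('turbo', 2_000, 500):.6f}")
```

Feed it the token counts from a real response and you get a number you can compare against the dashboard.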

What You Can Do After This Setup (And What You Can’t)

Setting expectations first, because nobody likes finding the wall after the build.

Here is what the setup gets you:

  • Call both Seed 2.1 variants through the OpenAI Chat Completions shape. Your existing OpenAI code mostly works after three edits: key, base URL, model.
  • Route by cost. Send cheap, high-frequency calls to Turbo and reserve Pro for the hard reasoning, with one string per call deciding which.
  • Send images. Both variants take an image_url content block, so a screenshot or a diagram goes in alongside text.
  • Bill in USD with an international card, on a key that may already cover your other models, with no second account to register.
  • Share one key across Doubao and the other models on the same gateway, which matters when you want a fallback that isn’t another signup.

And here is what it does not get you:

  • Volcano Engine’s exact list price. A gateway sits in the path, so the USD numbers here are the ofox rate, not the raw Volcano rate. They track each other closely (roughly 6.8 RMB to the dollar against ByteDance’s published ¥6 / ¥30 per-million numbers), but they are not identical.
  • A guarantee that “Turbo performs like Pro.” That is ByteDance’s framing from the launch. Test it on your own workload before you route production traffic on the strength of a marketing line.
  • An offline or self-hosted option. Seed 2.1 is an API-only model. There is no open-weight checkpoint to download.
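
If you want to see where the "roughly 6.8" figure comes from, the arithmetic is short. This sketch divides ByteDance's published Pro list price in RMB by the USD price quoted in this guide; the variable names are illustrative.

```python
# ByteDance's published Pro list price, RMB per million tokens (from the text above).
LIST_CNY = {"input": 6.0, "output": 30.0}
# The USD price for Pro quoted in this guide, per million tokens.
GATEWAY_USD = {"input": 0.884, "output": 4.42}

# Implied conversion rate: how many RMB of list price each gateway dollar buys.
implied_rate = {kind: LIST_CNY[kind] / GATEWAY_USD[kind] for kind in LIST_CNY}

for kind, rate in implied_rate.items():
    print(f"{kind}: {rate:.2f} RMB per USD")
```

Both columns land on the same rate, which is what you would expect if the USD prices were derived from the RMB list with one conversion factor.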

If you ran the Doubao Seed 2.0 setup earlier this year, the muscle memory carries over. The difference is the lineup: 2.0 was a four-tier budget family (Pro, Lite, Mini, Code), 2.1 is a two-variant flagship split (a deep-thinking Pro and a half-price Turbo), and the model IDs changed accordingly.

Decision Frame: When to Use This Setup (and When NOT)

Before the steps, decide whether the gateway path is actually your path.

Use it when:

  • You don’t want to open a separate Volcano account, or clear its phone and real-name verification, just to evaluate a model.
  • You want Pro and Turbo behind one key so cost routing is a string swap, not a second integration.
  • You already call other models through an OpenAI-compatible endpoint and want Doubao to join the same code path.

Skip it when:

  • You already keep a verified Volcano Engine account (domestic or international) and only ever call Doubao. Going direct avoids the gateway hop, and you’ve already paid the registration cost.
  • You need Volcano’s exact list price down to the decimal for a procurement spreadsheet. Go to the source.
  • Your compliance rules demand a specific data-residency guarantee. Confirm that with the provider directly; a third-party gateway doesn’t change where inference runs.

One stop rule: if all you wanted was a first successful call to confirm the model exists and answers, you can stop at Step 4. Step 5 repeats that call in Node and Step 6 covers the Pro/Turbo switch; the sections after that are error handling, team setup, and routing.

System Requirements

Nothing heavy. The whole point of an OpenAI-compatible endpoint is that the client is boring.

| Component | Requirement | Notes |
| --- | --- | --- |
| Runtime | Python 3.8+ or Node.js 18+ | Whatever your existing OpenAI SDK already runs on |
| SDK | openai (Python or JS) | Any recent version; the Chat Completions shape is stable |
| API key | One ofox.ai key (sk-ofox-...) | From the ofox dashboard after signup |
| Endpoint | https://api.ofox.ai/v1 | The OpenAI-compatible base URL |
| Network | Outbound HTTPS | Standard outbound HTTPS, no special routing |

You do not need the Volcano Engine SDK, a volces.com endpoint, or any ByteDance-specific client. The gateway normalizes the underlying API into the OpenAI shape.

Step-by-Step Installation

Step 1: Get an API key

Sign up at ofox.ai, open the dashboard, and create a key. It looks like sk-ofox-.... Keep it out of source control; an environment variable is the usual place.

export OFOX_API_KEY="sk-ofox-your-key-here"

Expected result: echo $OFOX_API_KEY prints your key in the current shell.
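
If you would rather check the key from Python than from the shell, a small sanity check might look like this. The `check_key` helper is hypothetical, and the only thing it knows about key format is the `sk-ofox-` prefix mentioned above.

```python
import os

def check_key(env=os.environ):
    """Return a list of problems with OFOX_API_KEY; an empty list means it looks usable."""
    key = env.get("OFOX_API_KEY")
    if not key:
        return ["OFOX_API_KEY is not set in this environment"]
    problems = []
    if key != key.strip():
        # A stray space or newline is a common cause of 401 responses.
        problems.append("key has leading or trailing whitespace")
    if not key.strip().startswith("sk-ofox-"):
        problems.append("key does not start with sk-ofox-")
    return problems
```

Call `check_key()` at startup and fail fast on a non-empty list, so a bad key shows up as a clear message rather than a 401 later on.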

Step 2: Install the SDK

# Python
pip install openai

# or Node
npm install openai

Expected result: pip show openai (or npm ls openai) reports an installed version. Anything recent is fine; the request shape used here hasn’t changed across the modern SDK line.

Step 3: Smoke-test the endpoint with curl

Before writing any code, confirm the key and endpoint talk to each other. This call hits Turbo because it’s the cheaper one to test against.

curl https://api.ofox.ai/v1/chat/completions \
  -H "Authorization: Bearer $OFOX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "volcengine/doubao-seed-2.1-turbo",
    "messages": [{"role": "user", "content": "Reply with the single word: ready"}]
  }'

Expected result: a JSON body with choices[0].message.content containing ready. If you get a 401, the key is wrong or unset. If you get a 404 on the model, recheck the ID spelling (it’s volcengine/doubao-seed-2.1-turbo, with dots in 2.1, not dashes).
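
When you script this check instead of reading the JSON by eye, you can pull the field out like this. The sample body is illustrative rather than a captured response, and the `{"error": {...}}` shape is an assumption based on the usual OpenAI-compatible convention.

```python
import json

def extract_content(body: str) -> str:
    """Return choices[0].message.content from a Chat Completions JSON body."""
    data = json.loads(body)
    if "error" in data:
        # Assumed error shape: {"error": {"message": "..."}}.
        err = data["error"]
        message = err.get("message", err) if isinstance(err, dict) else err
        raise RuntimeError(f"API error: {message}")
    return data["choices"][0]["message"]["content"]

# Illustrative payload only; a real response carries more fields (id, usage, ...).
sample = '{"choices": [{"message": {"role": "assistant", "content": "ready"}}]}'
print(extract_content(sample))
```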

Step 4: First call from Python

import os
from openai import OpenAI

# Read the key exported in Step 1 instead of pasting it into source.
client = OpenAI(
    api_key=os.environ["OFOX_API_KEY"],
    base_url="https://api.ofox.ai/v1",  # ofox's OpenAI-compatible endpoint
)

resp = client.chat.completions.create(
    model="volcengine/doubao-seed-2.1-pro",
    messages=[{"role": "user", "content": "Explain MoE routing in two sentences."}],
)
print(resp.choices[0].message.content)

Expected result: a two-sentence answer on your terminal. Three things differ from a stock OpenAI call: the api_key, the base_url, and the model. Streaming, tools, and structured output all use the same SDK methods you already know.

Step 5: Same call from Node

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.OFOX_API_KEY,
  baseURL: "https://api.ofox.ai/v1",
});

const resp = await client.chat.completions.create({
  model: "volcengine/doubao-seed-2.1-pro",
  messages: [{ role: "user", content: "Explain MoE routing in two sentences." }],
});
console.log(resp.choices[0].message.content);

Expected result: the same two-sentence answer. The JS SDK uses baseURL (camelCase) where Python uses base_url. That’s the only spelling trap.

Step 6: Switch Pro and Turbo with one string

This is the part worth slowing down for, because it’s the whole reason to run both behind one key. Nothing changes except the model value.

MODELS = {
    "pro":   "volcengine/doubao-seed-2.1-pro",
    "turbo": "volcengine/doubao-seed-2.1-turbo",
}

def ask(tier: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=MODELS[tier],
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(ask("turbo", "Summarize this ticket in one line."))   # cheap path
print(ask("pro",   "Plan a three-step refactor for this module."))  # hard path

Expected result: both calls return. The cheap summary goes through Turbo at $0.442/$2.212; the planning task goes through Pro at $0.884/$4.42. You decide per call which one pays.

flowchart TD
    A[Incoming request] --> B{Hard reasoning,<br/>long-chain agent,<br/>multi-step delivery?}
    B -->|Yes| C[model = volcengine/<br/>doubao-seed-2.1-pro<br/>$0.884 in / $4.42 out]
    B -->|No: summarize, classify,<br/>high-frequency call| D[model = volcengine/<br/>doubao-seed-2.1-turbo<br/>$0.442 in / $2.212 out]
    C --> E[Same endpoint<br/>api.ofox.ai/v1]
    D --> E
    E --> F[Response]
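
The decision in the flowchart fits in a few lines of Python. This is a sketch: the task labels are hypothetical placeholders for whatever categories your application already tracks, and only the two model IDs come from this guide.

```python
PRO = "volcengine/doubao-seed-2.1-pro"
TURBO = "volcengine/doubao-seed-2.1-turbo"

# Hypothetical task labels; swap in the categories your application uses.
HARD_TASKS = {"complex_coding", "long_chain_agent", "multi_step_delivery"}

def pick_model(task_kind: str) -> str:
    """Send hard reasoning to Pro and everything else to Turbo."""
    return PRO if task_kind in HARD_TASKS else TURBO

print(pick_model("summarize"))       # Turbo ID
print(pick_model("complex_coding"))  # Pro ID
```

Defaulting unknown task kinds to Turbo keeps the bill predictable. Flip the default if a wrong answer costs you more than the price difference.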

Common Errors During Setup (and Fixes)

The failures here are almost all the same three categories: wrong key, wrong model string, wrong request shape. The table covers what actually shows up.

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| 401 Unauthorized | Key missing, expired, or with a stray space | Re-export the key; confirm the Authorization: Bearer header has no trailing whitespace |
| 404 on the model | Typo in the ID, usually 2-1 instead of 2.1 | Use the exact strings: volcengine/doubao-seed-2.1-pro / volcengine/doubao-seed-2.1-turbo |
| Connection refused / DNS error | Base URL points at OpenAI or a typo’d host | Set base URL to https://api.ofox.ai/v1 (note the /v1) |
| 400 on an image request | image_url block malformed or missing the data: prefix on base64 | Send {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}} |
| Empty or truncated output | max_tokens set too low, or you’re reading the wrong field | Raise max_tokens; read choices[0].message.content |
| 429 Too Many Requests | Burst above your current rate allowance | Add exponential backoff; retry after the delay the response suggests |
| Slow first token on Pro | Deep-thinking model spends time before emitting | Expected on Pro for hard prompts; route latency-sensitive calls to Turbo instead |
| Model works in curl, fails in SDK | SDK pinned to a stale base URL via env var | Check OPENAI_BASE_URL; the explicit base_url/baseURL argument should win, but a leftover env var can confuse older setups |
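
For the 429 row, here is a minimal backoff sketch using only the standard library. The helper name and defaults are illustrative. It does not read a Retry-After header, so it uses a fixed exponential schedule with jitter rather than the delay the response suggests.

```python
import random
import time

def with_backoff(call, retryable=(Exception,), max_attempts=5,
                 base_delay=1.0, sleep=time.sleep):
    """Run call() and retry on the given exception types with exponential backoff.

    Delays grow as base_delay * 2**attempt plus random jitter, and the last
    failure is re-raised once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable:
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

With the openai Python SDK you would likely pass `retryable=(openai.RateLimitError,)` and wrap the `client.chat.completions.create(...)` call in a lambda. The SDK client also has its own `max_retries` option, so check that before layering a second retry loop on top.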

Team / Multi-Developer Configuration

Solo setup is one key in one environment variable. A team needs the key to be shared safely and the model choice to be consistent, so people aren’t each hardcoding a different tier.

The pattern that holds up: keep the key in your secret manager, expose the endpoint and default tier through environment variables, and let a small config decide Pro versus Turbo per environment.

# .env.example (committed); real .env stays out of git
OFOX_API_KEY=          # pulled from the team secret manager, never committed
OFOX_BASE_URL=https://api.ofox.ai/v1
DOUBAO_TIER=turbo      # dev/staging default; prod can override to pro per route

Then read those instead of literals, so no developer pins a tier by accident:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["OFOX_API_KEY"],
    base_url=os.environ.get("OFOX_BASE_URL", "https://api.ofox.ai/v1"),
)
DEFAULT_MODEL = f"volcengine/doubao-seed-2.1-{os.environ.get('DOUBAO_TIER', 'turbo')}"
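
One weakness of building the ID with an f-string is that a typo in DOUBAO_TIER quietly produces a model ID that 404s at request time. A hedged variant that validates the tier first could look like this; `resolve_model` and `VALID_TIERS` are names introduced here for illustration.

```python
import os

VALID_TIERS = ("pro", "turbo")

def resolve_model(env=os.environ) -> str:
    """Build the model ID from DOUBAO_TIER, rejecting anything but pro or turbo."""
    tier = env.get("DOUBAO_TIER", "turbo").strip().lower()
    if tier not in VALID_TIERS:
        raise ValueError(f"DOUBAO_TIER must be one of {VALID_TIERS}, got {tier!r}")
    return f"volcengine/doubao-seed-2.1-{tier}"
```

Failing at startup with a clear message is usually cheaper than discovering the typo from a 404 in production logs.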

A few things that keep a team out of trouble:

| Concern | Solo | Team |
| --- | --- | --- |
| Key storage | One env var locally | Secret manager (Vault, AWS Secrets Manager, Doppler), injected at deploy |
| Tier choice | Hardcoded is fine | Driven by DOUBAO_TIER env var so dev defaults to Turbo, prod opts into Pro |
| Cost visibility | Eyeball the dashboard | Tag requests per service so the Pro/Turbo split is attributable |
| Onboarding | “Here’s a key” | .env.example in the repo, key handed out through the secret manager only |

The single-key, single-endpoint shape is what makes this cheap to administer. One credential to rotate, one base URL, and the only per-team decision is which tier each environment defaults to. For cost attribution, read the usage object on each response (prompt_tokens, completion_tokens) and log it against the tier you called; that’s how you find out after a month whether your Pro/Turbo split matched your plan or quietly drifted toward the expensive variant. If you’re standing up a broader gateway in front of several models, the multi-model router pattern covers the routing layer that sits above this.
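
The usage logging described above can be as small as an in-memory tally. This sketch assumes each response exposes `usage.prompt_tokens` and `usage.completion_tokens`, as the OpenAI SDK response object does; the `UsageLedger` class is hypothetical, and in production you would write to your metrics system rather than a dict.

```python
from collections import defaultdict

class UsageLedger:
    """Tally calls and tokens per tier so the Pro/Turbo split is visible."""

    def __init__(self):
        self.totals = defaultdict(
            lambda: {"calls": 0, "prompt_tokens": 0, "completion_tokens": 0}
        )

    def record(self, tier, usage):
        """Add one response's usage object to the running totals for a tier."""
        row = self.totals[tier]
        row["calls"] += 1
        row["prompt_tokens"] += usage.prompt_tokens
        row["completion_tokens"] += usage.completion_tokens

    def share_of_calls(self, tier):
        """Fraction of all recorded calls that went to the given tier."""
        total = sum(row["calls"] for row in self.totals.values())
        calls = self.totals[tier]["calls"] if tier in self.totals else 0
        return calls / total if total else 0.0
```

After each call, something like `ledger.record("pro", resp.usage)` keeps the tally current, and `ledger.share_of_calls("pro")` tells you whether the split is drifting.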

Advanced: Pro/Turbo Routing and Image Input

Cost-aware routing in one loop

A common production shape is a cheap first pass on Turbo with an escalation to Pro only when the cheap answer isn’t good enough. The escalation rule is yours, and that is the part worth thinking about, since a bad rule either escalates everything (you’ve paid Pro prices for a Turbo-shaped problem) or never escalates (you ship Turbo answers on tasks that needed Pro). A confidence threshold, a length check, or a cheap validator pass are all reasonable triggers. The model swap itself is one line.

def answer(prompt: str, hard: bool) -> str:
    tier = "pro" if hard else "turbo"
    resp = client.chat.completions.create(
        model=f"volcengine/doubao-seed-2.1-{tier}",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
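
The function above takes the hard/easy decision as an input. The escalation pattern in the paragraph before it looks more like the sketch below, where Turbo answers first and Pro only runs when a check fails. Both `call_model` and `good_enough` are hypothetical callables you supply; the validator is the part you have to design for your workload.

```python
from typing import Callable, Tuple

PRO = "volcengine/doubao-seed-2.1-pro"
TURBO = "volcengine/doubao-seed-2.1-turbo"

def answer_with_escalation(
    prompt: str,
    call_model: Callable[[str, str], str],   # (model_id, prompt) -> answer text
    good_enough: Callable[[str], bool],      # your validator for the cheap answer
) -> Tuple[str, str]:
    """Try Turbo first; escalate to Pro only if the validator rejects the draft.

    Returns (answer, tier_used) so the caller can log which tier paid.
    """
    draft = call_model(TURBO, prompt)
    if good_enough(draft):
        return draft, "turbo"
    return call_model(PRO, prompt), "pro"
```

In practice `call_model` would wrap `client.chat.completions.create(...)`. Note that an escalated request pays for both calls, so a validator that rejects too often costs more than going straight to Pro.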

The math is the reason this pays off. Take a workload of one million requests a month, each averaging 500 input and 500 output tokens. All-Pro, that’s roughly 500M input at $0.884 plus 500M output at $4.42, about $2,652 a month before cached-input savings. All-Turbo, the same volume lands near $1,327, half the bill, because Turbo’s input and output rates are half of Pro’s to within rounding. Route 80 percent to Turbo and escalate the hard 20 percent to Pro, and you sit around $1,592, much closer to the Turbo floor than the Pro ceiling. The split is the lever, not the model. Cached input pushes it lower again on prompts that repeat a system block, since the cache rate is $0.177 on Pro and $0.085 on Turbo against the full input rate.
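
To rerun that arithmetic with your own volumes, a short sketch is enough. It uses the input and output prices from this guide and ignores cached input; the function name and parameters are illustrative.

```python
# USD per million tokens as (input, output), from the pricing table in this guide.
PRICES = {"pro": (0.884, 4.42), "turbo": (0.442, 2.212)}

def monthly_cost(requests, in_tokens, out_tokens, turbo_share):
    """Monthly USD cost when turbo_share of requests go to Turbo and the rest to Pro."""
    total = 0.0
    for tier, share in (("turbo", turbo_share), ("pro", 1 - turbo_share)):
        in_price, out_price = PRICES[tier]
        per_request = (in_tokens * in_price + out_tokens * out_price) / 1_000_000
        total += requests * share * per_request
    return total

for share in (0.0, 0.8, 1.0):
    cost = monthly_cost(1_000_000, 500, 500, share)
    print(f"{share:.0%} Turbo: ${cost:,.0f}")
```

With one million requests at 500 input and 500 output tokens, this should reproduce the three figures in the paragraph above. Change the share or the token counts to see how sensitive your own bill is to the split.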

Streaming a response

Long Pro answers feel slow if you wait for the whole completion. Stream tokens as they arrive; the only change is stream=True and iterating the chunks. The model swap stays a one-liner here too.

stream = client.chat.completions.create(
    model="volcengine/doubao-seed-2.1-pro",
    messages=[{"role": "user", "content": "Draft a migration plan, step by step."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)

Expected result: text prints incrementally instead of all at once. This matters more on Pro, where a deep-thinking pass can sit quiet for a beat before it starts emitting. Turbo’s first token usually lands faster, which is the whole reason it exists.
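
If you also need the full text once streaming finishes (for logging, say), collect the deltas as they pass through. This sketch runs against stand-in chunk objects so it works offline; the guard for chunks with an empty `choices` list is defensive, and whether this gateway ever sends such chunks is an assumption.

```python
from types import SimpleNamespace

def collect_stream(chunks) -> str:
    """Join the text deltas from a Chat Completions stream into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:        # defensive: skip chunks that carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:                    # content can be None, e.g. on a role-only chunk
            parts.append(delta)
    return "".join(parts)

def _fake_chunk(text):
    """Stand-in for an SDK chunk, used only for this offline demo."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

demo = [_fake_chunk("Step 1"), SimpleNamespace(choices=[]),
        _fake_chunk(None), _fake_chunk(": back up.")]
print(collect_stream(demo))
```

Against the real API you would pass the `stream` object itself, and print each delta inside the loop if you want both the live output and the final string.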

Sending an image to either variant

Both variants are multimodal (text plus image in, text out). The content block is the standard OpenAI vision shape, so a screenshot or a chart goes straight in.

import base64

with open("screenshot.png", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="volcengine/doubao-seed-2.1-pro",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What does this error dialog say to do?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)

Expected result: a text answer that reads the image. Swap the model string to volcengine/doubao-seed-2.1-turbo and the same call runs on the cheaper variant. If you need image generation rather than understanding, that’s a different ByteDance model; the Seedream 4.5 image API covers that side.
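
The example above hardcodes `image/png`. If your images vary, a small helper can derive the MIME type from the filename. This is a sketch with hypothetical helper names, and this guide does not say which image formats the models accept, so treat anything beyond PNG as something to test.

```python
import base64
import mimetypes

def to_data_uri(data: bytes, filename: str) -> str:
    """Encode image bytes as a base64 data URI, guessing the MIME type from the name."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"{filename!r} does not look like an image file")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"

def image_block(data: bytes, filename: str) -> dict:
    """Build the image_url content block in the OpenAI vision shape."""
    return {"type": "image_url", "image_url": {"url": to_data_uri(data, filename)}}
```

Drop the result of `image_block(...)` into the `content` list next to your text block, exactly where the hand-built dictionary sits in the example above.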

Want to try it on a real workload? A single ofox key calls both Seed 2.1 variants plus the rest of the catalog from https://api.ofox.ai/v1, billed in USD with no Volcano Engine signup. Start on the Doubao Seed 2.1 Pro model page.

Alternatives

If the gateway path isn’t right for you, the honest options:

  • ofox.ai (this guide). One key, both variants, USD billing, OpenAI-compatible endpoint, and other models on the same credential. Best when you want Doubao without opening a separate Volcano account and want a fallback model on the same key. A gateway markup sits over Volcano’s list pricing.
  • Volcano Engine (direct). ByteDance’s own endpoint, on two consoles. The standard registration wants a phone number, real-name verification, and a CNY top-up; the international console takes an email signup and an international card for an overseas account. Cheapest list price if you’ll open and keep one of those accounts, and the right call when Doubao is the only model you use.
  • Another OpenAI-compatible aggregator. Several gateways now carry Doubao. The integration shape is the same as here; compare on price, the breadth of the rest of the catalog, and billing currency. If you’re already on a multi-model setup, the Kimi K2.5 access guide walks through the same single-endpoint pattern for a different model family, which is useful for cross-checking what “one key, many models” actually buys you.

FAQ

What is Doubao Seed 2.1 and when was it released? Doubao Seed 2.1 is ByteDance’s next-generation model family, announced June 24, 2026 at the Volcano Engine FORCE conference. Two variants, Pro and Turbo, both at 256K context. Pro is the flagship deep-thinking model; Turbo is the low-cost, low-latency version for high-volume traffic.

How much does the Doubao Seed 2.1 API cost? Via ofox.ai in USD: Pro is $0.884 input and $4.42 output per million tokens, cached input $0.177. Turbo is about half: $0.442 input, $2.212 output, $0.085 cached input. Both carry 256K context and 256K max output.

Can I use Doubao Seed 2.1 without a Volcano Engine account? Yes. Volcano’s own registration wants a phone number and real-name verification, and its international console wants a separate overseas account. ofox is a third path: an email signup and an international card give you one key that calls both variants plus other models.

What is the difference between Pro and Turbo? Pro is the flagship deep-thinking model for high-complexity work. Turbo costs about half as much per token and targets latency-sensitive, high-frequency production. ByteDance says Turbo’s performance is comparable to Pro; treat that as a vendor claim and verify on your own tasks.

How do I switch between Pro and Turbo in code? Change one string. Both run on the same endpoint, so you swap model between volcengine/doubao-seed-2.1-pro and volcengine/doubao-seed-2.1-turbo. Everything else stays identical.

Does Doubao Seed 2.1 support image input? Yes. Both variants are multimodal (text plus image in, text out). Attach an image_url content block carrying a URL or a base64 data URI alongside your text prompt.

How does Doubao Seed 2.1 compare to GPT-5.5? ByteDance positions Seed 2.1’s three upgrades (coding delivery, agent long-chain tasks, multimodal understanding) against GPT-5.5. That is the vendor framing from the launch, not an independent benchmark, so verify it before you depend on it.

What is the context window? 256,000 tokens of context and up to 256,000 tokens of max output, the same on both Pro and Turbo.
