Q: Why does Fable 5 refuse some requests or route them to Opus 4.8?

By design. Fable 5 ships with three safety classifiers; when a request looks like cybersecurity, biology and chemistry, or model distillation, Anthropic routes it to Claude Opus 4.8 instead of Fable 5. A hard refusal returns as a successful HTTP 200 with `stop_reason: "refusal"`, not an error, so check the stop reason before reading the content. If your work lives in those domains, call `anthropic/claude-opus-4.8` directly and skip the routing layer.

Claude Fable 5 vs Sonnet 5 (2026): 5x Pricier, When It Pays

Q: Is Claude Fable 5 worth 5x the price of Sonnet 5?

Only for the hardest tasks. At introductory pricing Fable 5 is $10/$50 per million tokens against Sonnet 5's $2/$10, exactly 5x on both lines. Fable 5 buys a real capability jump (80.3% SWE-bench Pro vs 63.2%, and 91/100 on Every's Senior Engineer test where Opus 4.8 scores 63), but on cost-per-solved-issue Sonnet 5 stays cheaper even after you account for retries. Fable 5 pays when a wrong first answer costs more than the token difference, or when Sonnet 5 simply cannot close the task.

Q: How much does Claude Fable 5 cost compared to Sonnet 5?

Fable 5 is $10 per million input tokens and $50 per million output, the Anthropic API rate. Sonnet 5 is $2/$10 at introductory pricing through August 31, 2026, then $3/$15. So Fable 5 is 5x Sonnet 5 during the intro window and about 3.3x after it. Cached input reads are $1/M on Fable 5 (Anthropic's standard 0.1x ratio) versus $0.2/M on Sonnet 5. The sticker gap understates the real gap because Fable 5's thinking is always on, so it emits more output tokens per task.

Q: Is Claude Fable 5 available on ofox?

Intermittently. Sonnet 5 sits on the ofox catalog permanently at `anthropic/claude-sonnet-5`. Fable 5 is offered in access windows rather than as a standing listing, so `anthropic/claude-fable-5` comes and goes on the aggregator the same way Anthropic rotated it in and out of subscription plans after June 9. Confirm the model is live on the ofox catalog before you wire it into production; when it is not listed, reach Fable 5 through Anthropic's own API instead.

Q: Is Fable 5 better than Sonnet 5 for coding?

At the frontier, clearly. Fable 5 scores 80.3% on SWE-bench Pro and 91/100 on Every's Senior Engineer benchmark, against Sonnet 5's 63.2% SWE-bench Pro. For routine coding (single-file edits, refactors, test scaffolding, review comments) Sonnet 5 is already enough and costs a fifth as much. The split that works in practice is Sonnet 5 for the everyday 80%, Fable 5 for the hard tail where a failed patch is expensive.

Q: Can I set temperature on Fable 5 or Sonnet 5?

No, on both. Non-default `temperature`, `top_p`, or `top_k` returns a 400 error, and manual `budget_tokens` thinking also 400s. Steer behavior through the system prompt and control depth with the `effort` parameter. One extra Fable 5 rule: thinking is always on, so an explicit `thinking: {type: "disabled"}` also 400s. Omit the parameter entirely.

Q: What is the context window of Fable 5 and Sonnet 5?

Both are 1M tokens with 128K max output. The nominal windows match, but Sonnet 5's newer tokenizer packs less text per token, so a 1M window holds somewhat less actual text on Sonnet 5 than on an older model. For the Fable 5 vs Sonnet 5 choice the window is a wash; price and capability decide it.

TL;DR Fable 5 is Anthropic’s capability ceiling and Sonnet 5 is its value floor, and for the first time both are reachable through one endpoint. Fable 5 lists $10/$50 per million tokens, exactly 5x Sonnet 5’s introductory $2/$10 (3.3x after August 31). It earns that premium on the numbers: 80.3% on SWE-bench Pro against Sonnet 5’s 63.2%, and 91/100 on Every’s Senior Engineer test where Opus 4.8 scores 63. The catch is that the price gap is the floor, not the ceiling, because Fable 5’s always-on thinking emits more output tokens per task, and there is an availability catch too: Sonnet 5 is always listed, Fable 5 comes and goes in access windows. Below: the specs, the benchmark table, cost-per-solved-issue math, and a 10-line way to A/B both on your own traffic.

The 5x sticker gap is the smallest the difference ever gets. Fable 5’s thinking is always on, so on the same task it emits more output tokens than Sonnet 5, and output is the line that bills at $50.

TL;DR: Which One Should You Pick?

For most teams the answer is “Sonnet 5 as the default, Fable 5 for the hard tail you cannot afford to get wrong.” Here is the one-line verdict by scenario.

Scenario	Pick	Why
Classification, extraction, chat, RAG answers	Sonnet 5	Bounded output, capability is plenty, a fifth of the price
Routine coding: edits, refactors, test scaffolds	Sonnet 5	63.2% SWE-bench Pro clears everyday work
Frontier agentic coding where a failed patch is expensive	Fable 5	80.3% SWE-bench Pro, 91/100 senior-engineer test
Long-horizon autonomous runs that must land first try	Fable 5	Fewer retries when correctness is the bottleneck
Cost-sensitive default across a mixed workload	Route both	Cheap work to Sonnet 5, the hard tail to Fable 5
Cybersecurity, bio, or distillation work	Neither, use Opus 4.8	Fable 5 auto-routes these to Opus 4.8 anyway

The rest of this piece is the evidence behind that table, plus the honest version of “when does the $50 tier actually pay.”

What Changed: Fable 5 Came Back, Sonnet 5 Arrived

Two releases three weeks apart reset the top and the middle of the Claude line.

Claude Fable 5 shipped on June 9, 2026 as Anthropic’s first generally available Mythos-class model, the family Anthropic previously held back over cybersecurity capability. It is the Mythos model with three safety classifiers layered on top. Anthropic put it in Pro, Max, and Team subscription plans for two weeks, then removed it from those plans on June 23, leaving the API rate of $10/$50 as the way in. It has been rotating in and out of access windows since, which matters for how you architect around it.

Claude Sonnet 5 shipped on June 30, 2026 at introductory pricing of $2/$10 (standard $3/$15 after August 31). It is Anthropic’s most agentic Sonnet-tier model and the new default for professional work that is not at the frontier. We covered the head-to-head with the middle tier in Sonnet 5 vs Opus 4.8.

The reason to compare the two ends directly, rather than each against Opus 4.8, is that they answer different questions. Sonnet 5 answers “what is the cheapest model that clears my everyday bar.” Fable 5 answers “what is the best model money can buy when the task is hard enough that being wrong is the expensive outcome.” Most teams need both answers, and the interesting decision is where you draw the line between them. If you want the full three-way coding shootout with GPT-5.5 in the mix, that lives in Fable 5 vs Opus 4.8 vs GPT-5.5; this piece is narrower and more practical: two tiers, one routing decision.

Quick Specs Comparison

Both models share the same nominal 1M context window and 128K max output. The real differences are price, availability, and the fact that Fable 5 cannot turn thinking off.

Spec	Claude Fable 5	Claude Sonnet 5
ofox model ID	`anthropic/claude-fable-5`	`anthropic/claude-sonnet-5`
Input	$10/M	$2/M (intro), $3/M (standard)
Output	$50/M	$10/M (intro), $15/M (standard)
Cached input read	$1/M (0.1x ratio)	$0.2/M
Context window	1M	1M
Max output	128K	128K
Thinking	Always on, cannot disable	Adaptive, on by default, can disable
Sampling params	400 error	400 error
Safety routing	Cyber / bio / distillation to Opus 4.8	Real-time cyber refusals
ofox availability	Windowed, not always listed	Permanent listing

The intro Sonnet 5 prices ($2/$10) and the cached read ($0.2/M) match the ofox model page for anthropic/claude-sonnet-5 as of July 2, 2026. Fable 5’s $10/$50 is the Anthropic API rate from Anthropic’s Fable 5 announcement; its cached read is the standard 0.1x-of-input ratio Anthropic applies across the line. Fable 5’s ofox listing was not live at the time of writing, so its numbers here are Anthropic-sourced, not read off a live ofox page. Check the ofox catalog for the current Fable 5 listing before you build against it.

The Price Gap, and Why It Is Bigger Than 5x

On per-token rates the gap is clean: Fable 5 is 5x Sonnet 5 during the introductory window, on input, output, and cached reads alike. After August 31, when Sonnet 5 moves to $3/$15, the multiple drops to about 3.3x. Either way, Sonnet 5 is dramatically cheaper per token.

The sticker understates the real difference for one structural reason. Fable 5’s thinking is always on and you cannot turn it off, so on any non-trivial task it produces a chunk of thinking and output tokens that a leaner call would not. Sonnet 5 has adaptive thinking on by default too, but you can dial it down with the effort parameter or disable it outright for bounded work. Output is the line that bills at $50/M on Fable 5 versus $10/M on Sonnet 5, so more output tokens on the pricier model widens the effective gap beyond the 5x sticker. This is the opposite of the Sonnet-versus-Opus story, where the cheaper model’s own thinking narrows the discount. Here the pricier model thinks harder by default, so the gap only grows.

Cached reads are the one place the ratio is a straight 5x with no asterisk. If your prompts carry a large stable prefix (a system prompt, a tool schema, a repeated document set), a cache read is $0.2/M on Sonnet 5 against $1/M on Fable 5. For a cache-heavy production endpoint, that line alone can dominate the monthly bill, and it never favors Fable 5.
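The way extra output widens the gap can be sketched as a small calculation. This is a minimal illustration that uses only the rates quoted in this article; the `call_cost` and `effective_multiple` helpers and the token counts in the example are hypothetical, picked to show the direction of the effect rather than to model any real workload.

```python
# Per-million-token rates as quoted in this article (USD, Sonnet 5 at intro pricing).
FABLE_5 = {"input": 10.0, "cached": 1.0, "output": 50.0}
SONNET_5 = {"input": 2.0, "cached": 0.2, "output": 10.0}


def call_cost(rates, input_tokens, output_tokens, cached_tokens=0):
    """USD cost of one call; cached_tokens is the share of input read from cache."""
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * rates["input"]
        + cached_tokens * rates["cached"]
        + output_tokens * rates["output"]
    )
    return total / 1_000_000


def effective_multiple(input_tokens, sonnet_output, fable_output, cached_tokens=0):
    """How many times more the same task costs on Fable 5 than on Sonnet 5."""
    fable = call_cost(FABLE_5, input_tokens, fable_output, cached_tokens)
    sonnet = call_cost(SONNET_5, input_tokens, sonnet_output, cached_tokens)
    return fable / sonnet


# Hypothetical task: 40K input tokens, 30K of them read from cache.
same_output = effective_multiple(40_000, 8_000, 8_000, cached_tokens=30_000)
more_output = effective_multiple(40_000, 8_000, 16_000, cached_tokens=30_000)
print(f"identical token counts: {same_output:.1f}x")
print(f"Fable 5 emits 2x the output: {more_output:.1f}x")
```

With identical token counts the multiple is the 5x sticker, cache or no cache. Doubling only Fable 5's output in this made-up case pushes it to roughly 8.8x, which is the mechanism the section describes.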

Coding Benchmark: The Capability Gap Is Real

Benchmarks are noisy, but the gap between these two is wide enough to survive the noise. Here is where they land on the tests that map to production coding, with Opus 4.8 as the middle-tier reference.

Benchmark	Fable 5	Sonnet 5	Opus 4.8
SWE-bench Verified	95.0%	n/a	88.6%
SWE-bench Pro (agentic coding)	80.3%	63.2%	69.2%
Every Senior Engineer (/100)	91	not published	63
Terminal-Bench 2.1	80.5%	n/a	74.6%

Two rows carry the decision.

SWE-bench Pro is the production read. It runs models against real GitHub issues end to end: read the repo, write a patch, the patch either passes the hidden test suite or it does not, no partial credit. Fable 5’s 80.3% against Sonnet 5’s 63.2% is a 17-point spread, and every one of those points is an issue that closes on the first run instead of failing. On a hard multi-file issue, a first-pass miss means a retry loop or a human picking up the pieces, and both cost more than tokens.

Every’s Senior Engineer benchmark is the ceiling read. Every runs it on the hardest problems they can write, the kind a senior engineer takes a working day to solve. Fable 5 at 91/100 lands in human-senior-engineer range. Opus 4.8 sits at 63. Anthropic has not published a Sonnet 5 figure for this test, but Sonnet 5 already trails Opus 4.8 on SWE-bench Pro (63.2% vs 69.2%), so on a harder benchmark it most likely lands at or below Opus 4.8, not near Fable 5. That inference is ours, not a published score. That is the gap the price premium buys: not “a bit better on average,” but “can do a class of task the cheaper model mostly fails.” Treat these leaderboard scores as a snapshot and check Anthropic’s Transparency Hub for the per-benchmark source; the direction is what matters for routing, not the last decimal.

The honest summary of the table: for everyday coding, the extra points do not change the outcome, because Sonnet 5 already closes the issue. For frontier coding, the extra points are the difference between shipping and stalling.

Pricing Math: When the $50 Tier Actually Pays

Sticker price is one number, cost-per-solved-issue is another, and they can point in different directions. Here are two workloads with the assumptions stated so you can swap in your own.

Scenario A, an everyday coding fleet. 5 developers, 20 tasks/day each, 20 workdays (2,000 tasks/month). Per routine task: 40K input, and output of 8K on Sonnet 5 (thinking dialed low) versus 25K on Fable 5 (thinking always on). Assume the task is well within both models’ reach, so first-pass success is near 1 on both.

Line	Sonnet 5 (intro)	Fable 5
Input per task (40K)	$0.08	$0.40
Output per task	$0.08 (8K)	$1.25 (25K)
Cost per task	$0.16	$1.65
Monthly (2,000 tasks)	$320	$3,300
vs the other	baseline	~10x more

On routine work Fable 5 is not 5x more expensive, it is roughly 10x, because the always-on thinking piles onto the $50 output line. Paying that for work Sonnet 5 already closes is pure waste.
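The Scenario A table can be reproduced in a few lines, which makes it easy to swap in your own volumes. This is a sketch built only from the assumptions stated above; the variable and function names are illustrative.

```python
# Scenario A assumptions, copied from the text above.
DEVELOPERS, TASKS_PER_DAY, WORKDAYS = 5, 20, 20
TASKS_PER_MONTH = DEVELOPERS * TASKS_PER_DAY * WORKDAYS  # 2,000 tasks

INPUT_TOKENS = 40_000
# model -> (input $/M, output $/M, output tokens per task)
MODELS = {
    "Sonnet 5 (intro)": (2.0, 10.0, 8_000),
    "Fable 5": (10.0, 50.0, 25_000),
}


def cost_per_task(input_rate, output_rate, output_tokens):
    """USD per task at per-million-token rates."""
    return (INPUT_TOKENS * input_rate + output_tokens * output_rate) / 1_000_000


per_task = {name: cost_per_task(*spec) for name, spec in MODELS.items()}
for name, cost in per_task.items():
    print(f"{name}: ${cost:.2f}/task, ${cost * TASKS_PER_MONTH:,.0f}/month")

ratio = per_task["Fable 5"] / per_task["Sonnet 5 (intro)"]
print(f"Fable 5 costs {ratio:.1f}x Sonnet 5 on this workload")
```

Change the output-token figures first when adapting this: they are the assumption the result is most sensitive to, because output bills at five times the input rate on both models.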

Scenario B, the hard tail. Now take genuinely hard, multi-file issues where first-pass success is the whole game. Use the SWE-bench Pro rates as a stand-in: 80.3% for Fable 5, 63.2% for Sonnet 5. Per attempt: 60K input, 40K output on Fable 5, 30K output on Sonnet 5.

Line	Sonnet 5 (intro)	Fable 5
Cost per attempt	$0.42	$2.60
First-pass success	63.2%	80.3%
Expected attempts to solve	~1.58	~1.25
Cost per solved issue (tokens only)	~$0.66	~$3.24
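The last two rows follow from a simple retry model, sketched below. It assumes every retry is independent and succeeds with the same first-pass probability, so the expected number of attempts is 1/p. That independence is a simplifying assumption: retries on a genuinely hard issue tend to fail for the same reason, which would push the real attempt count higher.

```python
def cost_per_attempt(input_tokens, output_tokens, input_rate, output_rate):
    """USD for one attempt at per-million-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def expected_attempts(first_pass_rate):
    """Mean of a geometric distribution: tries until the first success."""
    return 1 / first_pass_rate


# Scenario B inputs from the text: 60K input, 30K / 40K output, SWE-bench Pro rates.
sonnet_attempt = cost_per_attempt(60_000, 30_000, 2.0, 10.0)
fable_attempt = cost_per_attempt(60_000, 40_000, 10.0, 50.0)

sonnet_solved = sonnet_attempt * expected_attempts(0.632)
fable_solved = fable_attempt * expected_attempts(0.803)

print(f"Sonnet 5: ${sonnet_attempt:.2f}/attempt, ${sonnet_solved:.2f}/solved issue")
print(f"Fable 5:  ${fable_attempt:.2f}/attempt, ${fable_solved:.2f}/solved issue")
```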

On tokens alone, Sonnet 5 is still cheaper per solved issue even after retries, because at roughly a sixth of the per-attempt price ($0.42 against $2.60) it can afford a lot of retries. So the case for Fable 5 is not a token-cost case. It is this: the SWE-bench Pro rate flatters Sonnet 5 on the hardest tasks. On the class of problem Every’s benchmark targets (where Fable 5 scores 91 and Opus 4.8 only 63), Sonnet 5’s real-world solve rate likely falls well below its 63.2% headline, its retry count climbs, and some issues it never closes. Once a failed patch costs an hour of senior engineer time or ships a bug, the roughly $2.60 token delta per solved issue stops being the number that matters. That is when Fable 5 pays: not because it is cheaper, but because being wrong is expensive and it is wrong less often.

Put a number on it. A senior engineer at a loaded cost of $120/hour is $2/minute. If routing a hard issue to Fable 5 instead of Sonnet 5 saves even fifteen minutes of a human untangling a wrong patch, that is $30 of engineer time against a token delta measured in single dollars. The break-even is not close. The trap is applying that logic to the everyday 80%, where there is no wrong-patch cost to avoid because Sonnet 5 was going to close the issue anyway. The whole discipline of tiering is keeping the Fable 5 share small enough that its 10x effective cost lands only on the tasks where a saved engineer-hour is on the table. Size that share by measuring, not by taste: most teams find the genuine frontier is a single-digit percentage of their traffic, and everything above that percentage is money spent on capability the task did not require.
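The break-even and the share discipline can both be put in code. This is a rough sketch: the helper names are hypothetical, the token delta comes from the Scenario B table, and the blended figure reuses the Scenario A per-task costs purely to show how the Fable 5 share moves the average. The 5% and 20% shares are illustrative, not measured.

```python
def break_even_minutes(token_delta_usd, loaded_hourly_rate_usd):
    """Minutes of engineer time that cost as much as the extra tokens."""
    per_minute = loaded_hourly_rate_usd / 60
    return token_delta_usd / per_minute


def blended_cost_per_task(fable_share, sonnet_cost, fable_cost):
    """Average per-task cost when fable_share of traffic goes to Fable 5."""
    return fable_share * fable_cost + (1 - fable_share) * sonnet_cost


# Token delta per solved hard issue from Scenario B: ~$3.24 minus ~$0.66.
minutes = break_even_minutes(3.24 - 0.66, 120)
print(f"break-even: {minutes:.1f} minutes of engineer time")

# Per-task costs from Scenario A; the shares below are made up for illustration.
for share in (0.05, 0.20):
    cost = blended_cost_per_task(share, sonnet_cost=0.16, fable_cost=1.65)
    print(f"{share:.0%} of tasks on Fable 5: ${cost:.2f}/task on average")
```

At $120/hour the extra tokens are covered by a little over one minute of saved engineer time per hard issue, which is why the argument turns on whether there is any engineer time to save, not on the token price.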

When to Pick Claude Sonnet 5

Pick anthropic/claude-sonnet-5 for the large majority of work:

High-volume bounded output. Classification, extraction, routing, moderation. Short outputs, big input volume, often cache-heavy. Sonnet 5’s $2/$10 and $0.2/M cached reads cut these bills to a fraction of Fable 5’s.
RAG answers and summarization. Retrieval does the heavy lifting; the model writes a bounded response. Capability is plenty.
Routine coding. Single-file edits, boilerplate, test scaffolds, review comments. 63.2% SWE-bench Pro clears work that is not at the frontier.
Anything latency-sensitive and interactive. Sonnet-tier speed and price fit chat and assistant surfaces better than a ceiling model that always thinks first.

When to Pick Claude Fable 5

Pick anthropic/claude-fable-5 when the task is at the capability frontier and a wrong answer is the expensive outcome:

Frontier agentic coding. Hard, multi-file issues where the 17-point SWE-bench Pro lead is the difference between one run and a retry loop, and where a shipped-wrong patch costs real engineer time.
Long-horizon autonomous runs. Overnight refactors and multi-step agent loops that have to hold together without a human catching a wrong turn on step 12.
Senior-engineer-class problems. The work Every’s benchmark targets, where Sonnet 5’s real solve rate drops and Fable 5’s 91/100 is the reason to reach for it.
When you have access. Fable 5’s availability is windowed, so architect it as the tier you route to when it is live, not a permanent dependency.

When Not to Pick Either (and What to Use Instead)

Two cases fit neither tier.

The first is cybersecurity, biology and chemistry, or model distillation work. Fable 5 detects these and routes them to Opus 4.8 anyway, so calling Fable 5 for them just adds a routing hop. Call anthropic/claude-opus-4.8 directly and skip it.

The second is the middle of the difficulty range, the tasks that are too hard for Sonnet 5 to close reliably but not hard enough to justify Fable 5’s 10x effective cost. That is exactly where Opus 4.8 lives: $5/$25, 69.2% SWE-bench Pro, and no availability window to plan around. For a lot of teams the real routing tree has three tiers, not two, with Opus 4.8 as the everyday-hard workhorse and Fable 5 reserved for the genuine frontier. The Sonnet 5 vs Opus 4.8 breakdown covers the lower boundary; the Opus 4.8 release review covers the middle.

```mermaid
flowchart TD
    A[Incoming task] --> B{Cyber / bio / distillation?}
    B -->|Yes| C[anthropic/claude-opus-4.8]
    B -->|No| D{Frontier-hard?<br/>failed answer is expensive}
    D -->|No| E[anthropic/claude-sonnet-5]
    D -->|Yes| F{Fable 5 in an access window?}
    F -->|Yes| G[anthropic/claude-fable-5]
    F -->|No| H[anthropic/claude-opus-4.8]
```
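The decision tree above translates directly into a small routing function. This is a minimal sketch: `pick_model`, the domain labels, and the two boolean flags are hypothetical names, and how you compute them (a task-type tag, a classifier, a catalog check) is left to your own plumbing.

```python
OPUS = "anthropic/claude-opus-4.8"
SONNET = "anthropic/claude-sonnet-5"
FABLE = "anthropic/claude-fable-5"

# Hypothetical labels for the domains Fable 5 hands to Opus 4.8 anyway.
ROUTED_DOMAINS = {"cybersecurity", "biology", "chemistry", "distillation"}


def pick_model(domain, frontier_hard, fable_available):
    """Mirror the flowchart: domain check, then difficulty, then availability."""
    if domain in ROUTED_DOMAINS:
        return OPUS  # skip the routing hop and call Opus 4.8 directly
    if not frontier_hard:
        return SONNET  # everyday work stays on the cheap tier
    # Frontier-hard: use Fable 5 only while it is in an access window.
    return FABLE if fable_available else OPUS


print(pick_model("web-backend", frontier_hard=False, fable_available=True))
print(pick_model("web-backend", frontier_hard=True, fable_available=False))
```

Keeping the availability check last means a closed access window degrades hard tasks to Opus 4.8 instead of failing them.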

Try Both via ofox: A/B in 10 Lines

The honest way to settle the routing line is to run both on your own tasks and read the token counts. ofox exposes the Claude line on one OpenAI-compatible endpoint (https://api.ofox.ai/v1), so the only thing that changes between runs is the model ID string, and one key covers all three tiers with no separate Anthropic billing. Two gotchas before you run it: both models reject non-default temperature, top_p, and top_k with a 400, so leave sampling params alone (the examples do). And Fable 5 must be live in an ofox access window for its line to resolve; when it is not listed, either wait for the window or point that one call at Anthropic’s own API.

Python: A/B both models in one loop

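```python
from openai import OpenAI

# ofox is OpenAI-compatible, so the stock SDK works with a swapped base URL.
client = OpenAI(base_url="https://api.ofox.ai/v1", api_key="YOUR_OFOX_KEY")

prompt = "Fix the race condition in this worker pool: ..."
for model in ["anthropic/claude-fable-5", "anthropic/claude-sonnet-5"]:
    # No temperature/top_p/top_k: both models reject non-default values with a 400.
    r = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    u = r.usage
    # Input and output token counts for this call.
    print(model, u.prompt_tokens, u.completion_tokens)
```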

Watch the completion_tokens column. Fable 5’s always-on thinking shows up there, and multiplied by $50/M it is where the effective cost gap lives.

Node: same shape

```javascript
import OpenAI from "openai";

// Same OpenAI-compatible endpoint; only the model ID changes between runs.
const client = new OpenAI({ baseURL: "https://api.ofox.ai/v1", apiKey: process.env.OFOX_KEY });

const prompt = "Fix the race condition in this worker pool: ...";
for (const model of ["anthropic/claude-fable-5", "anthropic/claude-sonnet-5"]) {
  // Top-level await needs an ES module (for example a .mjs file).
  const r = await client.chat.completions.create({
    model,
    messages: [{ role: "user", content: prompt }],
  });
  console.log(model, r.usage.prompt_tokens, r.usage.completion_tokens);
}
```

Run this on 20 or 30 of your genuinely hard tasks, sum input and output tokens per model, multiply by the specs-table rates, and divide by how many each model actually solved. That solved-issue cost, not the sticker, is the number that decides where the routing line goes. For the routing plumbing itself, the Claude Code hybrid routing pattern writeup covers picking the signal (input length, task-type tag, or a confidence check that escalates only on failure).
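The sum-multiply-divide step might look like the sketch below. The `Run` record and `cost_per_solved` helper are hypothetical, the `solved` flag stands for whatever pass/fail check you apply to each output, and the token counts are made up for illustration; only the per-million rates come from the specs table.

```python
from dataclasses import dataclass


@dataclass
class Run:
    prompt_tokens: int
    completion_tokens: int
    solved: bool  # did the output pass your own check?


def cost_per_solved(runs, input_rate, output_rate):
    """Total token spend across all runs divided by the number that solved the task."""
    spend = sum(
        r.prompt_tokens * input_rate + r.completion_tokens * output_rate for r in runs
    ) / 1_000_000
    solved = sum(1 for r in runs if r.solved)
    # Nothing solved means no finite cost per solved issue.
    return spend / solved if solved else float("inf")


# Made-up results for three tasks per model, for illustration only.
sonnet_runs = [Run(60_000, 30_000, True), Run(60_000, 28_000, False), Run(55_000, 32_000, True)]
fable_runs = [Run(60_000, 40_000, True), Run(60_000, 45_000, True), Run(55_000, 38_000, True)]

sonnet_cost = cost_per_solved(sonnet_runs, input_rate=2.0, output_rate=10.0)
fable_cost = cost_per_solved(fable_runs, input_rate=10.0, output_rate=50.0)
print(f"Sonnet 5: ${sonnet_cost:.2f} per solved task")
print(f"Fable 5:  ${fable_cost:.2f} per solved task")
```

Failed runs stay in the numerator on purpose: you paid for those tokens, so they belong in the cost of the tasks that did get solved.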

Migration Gotchas: Same Shape, Three 400s

Both models keep the Messages API shape, but the same request that worked on an older Claude can 400 on either of these.

Change	Old behavior	On Fable 5 / Sonnet 5
Sampling params	`temperature` / `top_p` / `top_k` accepted	Non-default values return 400 on both
Manual thinking	`budget_tokens` accepted on some models	Returns 400 on both; use `effort`
Disable thinking	`thinking: {type: "disabled"}` accepted	Works on Sonnet 5; 400 on Fable 5 (omit the param)
Refusals	thrown as errors	HTTP 200 with `stop_reason: "refusal"` on both; handle it

The Fable 5 rows are the ones that trip people. Thinking is always on, so there is no disable switch, and the safety classifiers can hand a request to Opus 4.8 mid-flight. On the API, opt into a fallback so a refusal does not just stop the request; Anthropic’s server-side fallbacks parameter re-serves a declined request on Opus 4.8 in the same call. If you are moving a Sonnet 5 workload up to Fable 5 for the hard tail, budget for more output tokens per task, not fewer, because the always-on thinking works against the intuition that a smarter model finishes faster.
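A thin guard in front of the client can absorb the rows in the table above. This is a sketch under stated assumptions: `prepare_request` and `is_refusal` are hypothetical helpers, requests and responses are treated as plain dicts in the Messages API shape, and matching on "fable" in the model ID is an assumed shortcut for detecting Fable 5. It drops every sampling parameter rather than only non-default ones, which is the conservative reading of the table.

```python
SAMPLING_KEYS = {"temperature", "top_p", "top_k"}


def prepare_request(params, model):
    """Return a copy of params with the fields that trigger a 400 removed."""
    clean = {k: v for k, v in params.items() if k not in SAMPLING_KEYS}
    clean["model"] = model

    thinking = clean.get("thinking")
    if isinstance(thinking, dict):
        manual_budget = "budget_tokens" in thinking  # 400 on both models
        # Assumption: "fable" in the model ID identifies Fable 5.
        disabled_on_fable = thinking.get("type") == "disabled" and "fable" in model
        if manual_budget or disabled_on_fable:
            del clean["thinking"]  # omit the parameter entirely
    return clean


def is_refusal(response):
    """True when an HTTP 200 body carries stop_reason == "refusal"."""
    return response.get("stop_reason") == "refusal"


# A request written for an older model, then adapted for each tier.
legacy = {"max_tokens": 4096, "temperature": 0.2, "thinking": {"type": "disabled"}}
print(prepare_request(legacy, "anthropic/claude-fable-5"))
print(prepare_request(legacy, "anthropic/claude-sonnet-5"))
print(is_refusal({"stop_reason": "refusal", "content": []}))
```

Call `is_refusal` before reading `content`, since a refusal arrives as a successful response and will not raise.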

The routing test is not the benchmark score, it is cost-per-solved-issue: run both on your real hard tasks, count the tokens, and count how many each one actually closed.

Alternatives

ofox puts Sonnet 5, Opus 4.8, and Fable 5 (when in-window) on one OpenAI-compatible endpoint, so routing between tiers is a one-string change rather than three integrations. Real-time pricing is on the model catalog.
Opus 4.8 is the middle tier worth naming explicitly: $5/$25, 69.2% SWE-bench Pro, always available, no window to plan around. For the tasks between Sonnet 5’s ceiling and Fable 5’s floor, it is often the right pick.
Anthropic direct is the fallback for Fable 5 specifically. When Fable 5 is not listed on an aggregator, its own API keeps the $10/$50 rate available, at the cost of a second key and separate billing.

FAQ

Is Claude Fable 5 worth 5x the price of Sonnet 5? Only for the hardest tasks. Fable 5 buys a real capability jump (80.3% SWE-bench Pro vs 63.2%, and 91/100 on Every’s Senior Engineer test where Opus 4.8 scores 63), but on cost-per-solved-issue Sonnet 5 stays cheaper even after retries. Fable 5 pays when a wrong first answer costs more than the token difference.

How much does Claude Fable 5 cost compared to Sonnet 5? $10/$50 per million tokens versus Sonnet 5’s $2/$10 intro ($3/$15 standard). That is 5x during the intro window, about 3.3x after August 31. Cached reads are $1/M vs $0.2/M.

Is Claude Fable 5 available on ofox? Intermittently. Sonnet 5 is a permanent listing at anthropic/claude-sonnet-5; Fable 5 is offered in access windows, so confirm it is live on the ofox catalog before building against it.

Is Fable 5 better than Sonnet 5 for coding? At the frontier, clearly (80.3% SWE-bench Pro, 91/100 senior-engineer test). For routine coding, Sonnet 5 is already enough at a fifth of the cost.

Why does Fable 5 refuse or route to Opus 4.8? Its safety classifiers hand cybersecurity, bio, and distillation requests to Opus 4.8. A refusal returns as HTTP 200 with stop_reason: "refusal", so check the stop reason before reading content.

Can I set temperature on Fable 5 or Sonnet 5? No. Non-default sampling params 400 on both, as does budget_tokens. Fable 5 also 400s on thinking: {type: "disabled"} because thinking is always on.

What is the context window of Fable 5 and Sonnet 5? Both are 1M tokens, 128K max output. For this choice the window is a wash; price and capability decide it.

Should I switch from Sonnet 5 to Fable 5? Not wholesale. Keep Sonnet 5 as the default and escalate to Fable 5 only when Sonnet 5’s output fails a check. Wholesale switching pays at least 5x per token, and closer to 10x once always-on thinking is counted, for capability most requests do not need.

Sources Checked for This Refresh

Anthropic, “Claude Fable 5 and Mythos 5” announcement (Fable 5 $10/$50, Mythos-class, safety classifiers routing to Opus 4.8), verified July 2, 2026: https://www.anthropic.com/news/claude-fable-5-mythos-5
Anthropic, “Introducing Claude Fable 5” docs (always-on thinking, no sampling params, refusal handling): https://platform.claude.com/docs/en/about-claude/models/introducing-claude-fable-5
Anthropic, “Introducing Claude Sonnet 5” launch post, June 30, 2026: https://www.anthropic.com/news/claude-sonnet-5
Anthropic, “What’s new in Claude Sonnet 5” docs (behavior changes, pricing): https://platform.claude.com/docs/en/about-claude/models/whats-new-sonnet-5
Anthropic Transparency Hub (per-benchmark source): https://www.anthropic.com/transparency
ofox model page for anthropic/claude-sonnet-5 ($2/$10 intro, $0.2/M cached read, 1M context), verified July 2, 2026: https://ofox.io/models/anthropic/claude-sonnet-5
ofox model catalog (Fable 5 listing status, real-time pricing), checked July 2, 2026: https://ofox.io/models
SWE-bench Pro / SWE-bench Verified / Every Senior Engineer figures from Anthropic launch materials and Every’s published benchmark, cross-referenced with our own Fable 5 vs Opus 4.8 vs GPT-5.5 writeup