Smart people get the best results for the least money, and AI is no different.
Just as developers were worrying about Claude Fable 5's token price being double that of Opus 4.8, a twist has emerged: many developers have discovered that by setting Fable 5's effort level to the lowest setting ("low"), the model not only remains powerful but actually becomes remarkably efficient, with significantly reduced token consumption.
Even at its lowest effort setting, Fable 5 still outperforms Opus 4.8 at its highest setting (xhigh) on the SWE-bench Pro benchmark, scoring 75.0 vs. 68.6. (Note: These figures are from the Mythos 5 configuration. Fable 5 is the public version with a safety classifier; both share the same weights, and coding tasks rarely trigger the classifier.)
However, low mode is only part of the story. Over the past day, many developers have reported that Fable 5's savings don't come from low mode alone: in real-world tasks it often delivers better results, runs faster, and ultimately produces a lower bill.
For example, in GameBench tests for a spider-eating-bugs mini-game, Fable 5 generated code faster and produced better results, all while costing less than Opus 4.8. (Left: Fable 5, Right: Opus 4.8)
This makes things interesting. At first, everyone was fixated on Fable 5 costing twice as much per token. Now it's clear that the model is not just stronger but often cheaper overall. The cost shows up in the unit price; the savings show up in the final bill. Fable 5 low mode, here we go!
How a More Expensive Model Saves Money
So why does a more expensive model end up being cheaper?
Fable 5's pricing is $10 per million input tokens and $50 per million output tokens—exactly double the previous flagship, Opus 4.8 ($5/$25).
Boris Cherny, the creator of Claude Code, explained on Threads: "Fable is twice the price per token, but it uses fewer tokens on average to complete the same task because it's smarter and more efficient. On complex tasks, Fable can actually cost less than Opus."
A commenter echoed this sentiment: "That's exactly what I've observed. Fewer tokens per task, fewer correction steps, so fewer wasted tokens."
In other words, less capable agents used to burn through tokens by making mistakes, correcting them, and retrying after failures. The weaker the model, the more errors it made, and every extra round of fixes was another round of tokens to pay for.
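The arithmetic behind this can be sketched in a few lines of Python. The per-token prices are the ones quoted above; the token counts and attempt counts are invented purely for illustration and are not measurements. Treating every retry as a full-price repeat of the same request is also a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_mtok: float   # USD per million input tokens
    output_per_mtok: float  # USD per million output tokens


# Prices as quoted in the article.
FABLE_5 = Pricing(input_per_mtok=10.0, output_per_mtok=50.0)
OPUS_4_8 = Pricing(input_per_mtok=5.0, output_per_mtok=25.0)


def task_cost(pricing: Pricing, input_tokens: int, output_tokens: int,
              attempts: int = 1) -> float:
    """Total USD cost of a task, assuming each attempt uses the same tokens."""
    per_attempt = (
        input_tokens * pricing.input_per_mtok
        + output_tokens * pricing.output_per_mtok
    ) / 1_000_000
    return per_attempt * attempts


# Hypothetical workload: 400k input and 60k output tokens per attempt.
# Assumption for illustration only: the cheaper model needs three attempts,
# the more expensive one gets it right the first time.
opus_cost = task_cost(OPUS_4_8, 400_000, 60_000, attempts=3)
fable_cost = task_cost(FABLE_5, 400_000, 60_000, attempts=1)

print(f"Opus 4.8: ${opus_cost:.2f}  Fable 5: ${fable_cost:.2f}")
```

Because both of Fable 5's prices are exactly double Opus 4.8's, the break-even point is simple: at the same input/output mix, Fable 5 comes out cheaper whenever it finishes the task with less than half the total tokens, whether that comes from shorter outputs or fewer retries.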
Fable 5 cuts precisely that hidden cost. For instance, Fable...
via 量子位
