Anthropic has introduced Claude Sonnet 5, a model that performs on par with the flagship Opus 4.8 while keeping the same nominal token prices. However, its new tokenizer raises the actual cost of processing English text and code by 30-40%.

What Happened

Anthropic released Claude Sonnet 5 with a 1-million-token context window and the ability to generate up to 128,000 output tokens. The official price list is unchanged ($3 per 1 million input tokens and $15 per 1 million output tokens), but changes to the tokenizer increase the number of tokens consumed for the same volume of English text and code.
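To make the gap between nominal and effective prices concrete, here is a minimal Python sketch. It assumes the 30-40% figure translates directly into 30-40% more tokens for the same text; the function name effective_price is illustrative and not part of any Anthropic tooling.

```python
# Nominal prices quoted in the article, in USD per 1 million tokens.
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00


def effective_price(nominal_price: float, token_inflation: float) -> float:
    """Cost of the text that used to fit in 1M tokens, once the same text
    needs (1 + token_inflation) times as many tokens.

    Assumption: token_inflation is a fraction such as 0.30 for +30%.
    """
    return nominal_price * (1.0 + token_inflation)


if __name__ == "__main__":
    # The article's range for English text and code.
    for inflation in (0.30, 0.40):
        print(
            f"+{inflation:.0%} tokens: "
            f"input ${effective_price(INPUT_PRICE, inflation):.2f}, "
            f"output ${effective_price(OUTPUT_PRICE, inflation):.2f} "
            f"per former 1M tokens of text"
        )
```

Under this assumption, the same English text or code that used to cost $3 on input would cost roughly $3.90 to $4.20, and $15 of output would become roughly $19.50 to $21.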

Context

The new model uses Adaptive Thinking by default, which lets it dynamically allocate computational resources to improve response quality. This architectural change is directly linked to the tokenizer update, which was intended to optimize performance but had the side effect of raising effective costs.

Why It Matters for the Industry

The release of Claude Sonnet 5 confirms an industry trend toward building adaptive thinking mechanisms into mid-weight models. For developers, this means revisiting the unit economics of AI products, because the cost of scaling now depends on tokenization architecture as well as computational power.

Why It Matters for Users

API users should note that real infrastructure costs for English text and code will rise by approximately 30-40%. Developers should recalculate budgets immediately and add token cost calculators to every pipeline that currently uses this model.
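A per-pipeline budget recalculation might look like the following sketch. The Pipeline fields, the affected_share parameter, and the sample volumes are hypothetical; the sketch also assumes the token increase applies equally to input and output, which the article does not specify.

```python
from dataclasses import dataclass

# Nominal prices quoted in the article, in USD per 1 million tokens.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00


@dataclass
class Pipeline:
    name: str
    input_tokens: int      # monthly input tokens, counted with the old tokenizer
    output_tokens: int     # monthly output tokens, counted with the old tokenizer
    affected_share: float  # fraction of traffic that is English text or code


def monthly_cost(p: Pipeline, inflation: float = 0.0) -> float:
    """Projected monthly cost in USD for a given token inflation fraction."""
    # Only the English/code share of the traffic is scaled up.
    factor = 1.0 + inflation * p.affected_share
    input_cost = p.input_tokens * factor / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = p.output_tokens * factor / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical pipelines and volumes, for illustration only.
    pipelines = [
        Pipeline("support-bot", 200_000_000, 40_000_000, 1.0),
        Pipeline("multilingual-search", 500_000_000, 20_000_000, 0.5),
    ]
    for p in pipelines:
        base = monthly_cost(p)
        low, high = monthly_cost(p, 0.30), monthly_cost(p, 0.40)
        print(f"{p.name}: ${base:,.2f} now, ${low:,.2f} to ${high:,.2f} projected")
```

In practice the old-tokenizer counts would come from billing or usage logs, and the inflation factor is better measured on a sample of real traffic than taken from the headline range.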

Author

Look at AI, Editorial Team