Agents And Tools Tool Use Token Efficient Tool Use
Starting with Claude Sonnet 3.7, Claude is capable of calling tools in a token-efficient manner. Requests save an average of 14% in output tokens, up to 70%, which also reduces latency. Exact token reduction and latency improvements depend on the overall response shape and size.
Info: Token-efficient tool use is a beta feature that only works with Claude 3.7 Sonnet. To use this beta feature, add the beta header
token-efficient-tools-2025-02-19to a tool use request. This header has no effect on other Claude models.All Claude 4 models support token-efficient tool use by default. No beta header is needed.
Warning: Token-efficient tool use does not currently work with
disable_parallel_tool_use.
Here's an example of how to use token-efficient tools with the API in Claude Sonnet 3.7:
The above request should, on average, use fewer input and output tokens than a normal request. To confirm this, try making the same request but remove token-efficient-tools-2025-02-19 from the beta headers list.
Tip: To keep the benefits of prompt caching, use the beta header consistently for requests you'd like to cache. If you selectively use it, prompt caching will fail.