Approximate the total number of tokens in messages.
The token count includes stringified message content, role, and (optionally) name.
count_tokens_approximately(
messages: Iterable[MessageLikeRepresentation],
*,
chars_per_token: float = 4.0,
extra_tokens_per_message: float = 3.0,
count_name: bool = True,
tokens_per_image: int = 85,
use_usage_metadata_scaling: bool = False,
tools: list[BaseTool | dict[str, Any]] | None = None
) -> int

Note:
This is a simple approximation that may not match the exact token count used by specific models. For accurate counts, use model-specific tokenizers.
For multimodal messages containing images, a fixed token penalty is applied per image instead of counting base64-encoded characters, which provides a more realistic approximation.
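For a quick sanity check, the minimal sketch below calls the helper as documented above. The import path is an assumption and may differ by version (for example, it may live under `langchain_core.messages.utils` or be re-exported elsewhere).

```python
# Minimal usage sketch. The import path is an assumption; adjust it to match
# where count_tokens_approximately is exposed in your installed version.
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_core.messages.utils import count_tokens_approximately

messages = [
    SystemMessage(content="You are a terse assistant."),
    HumanMessage(content="Summarize the plot of Dune in two sentences."),
    AIMessage(content="Paul Atreides leads the Fremen against House Harkonnen."),
]

# Roughly: total stringified characters / chars_per_token, plus
# extra_tokens_per_message for every message in the list.
approx = count_tokens_approximately(messages, chars_per_token=4.0)
print(approx)
```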
| Name | Type | Description |
|---|---|---|
| messages* | Iterable[MessageLikeRepresentation] | List of messages to count tokens for. |
| chars_per_token | float | Number of characters per token to use for the approximation. One token corresponds to ~4 characters for common English text; you can also specify float values for finer-grained control. Default: 4.0 |
| extra_tokens_per_message | float | Number of extra tokens to add per message, e.g. special tokens marking the beginning/end of a message. You can also specify float values for finer-grained control. Default: 3.0 |
| count_name | bool | Whether to include message names in the count. Default: True |
| tokens_per_image | int | Fixed token cost per image, aligned with OpenAI's low-resolution image token cost. Default: 85 |
| use_usage_metadata_scaling | bool | If True, and all AI messages have consistent usage_metadata, the token counts recorded there are used to scale the approximation for better accuracy. Default: False |
| tools | list[BaseTool \| dict[str, Any]] \| None | List of tools to include in the token count. Each tool can be either a BaseTool instance or a tool schema dict. Default: None |
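To make the heuristic concrete, here is a rough sketch of the arithmetic these parameters describe. The helper below is hypothetical, not the library's implementation (which also accounts for roles, names, and tool schemas); it only mirrors the documented character- and image-based approximation.

```python
import math


def approx_token_count(
    message_texts: list[str],
    image_count: int = 0,
    *,
    chars_per_token: float = 4.0,
    extra_tokens_per_message: float = 3.0,
    tokens_per_image: int = 85,
) -> int:
    """Hypothetical sketch of the character-based heuristic.

    Each message contributes len(text) / chars_per_token tokens plus a fixed
    per-message overhead; each image contributes a flat tokens_per_image cost.
    """
    total = sum(
        len(text) / chars_per_token + extra_tokens_per_message
        for text in message_texts
    )
    total += image_count * tokens_per_image
    return math.ceil(total)


# Example: two short messages and one image.
print(approx_token_count(["Hello!", "How are you today?"], image_count=1))
```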