Summarizes conversation history when token limits are approached.
This middleware monitors message token counts and automatically summarizes older messages when a threshold is reached. Recent messages are preserved, and context continuity is maintained by ensuring AI/Tool message pairs remain together.
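To illustrate the pair-preservation behavior described above, here is a minimal sketch (not the library's actual implementation) of how a summarization cutoff can be backed up so a tool result is never separated from the AI message that requested it. The message dicts and `safe_cutoff` helper are simplified stand-ins.

```python
# Illustrative sketch: adjust a summarization cutoff so AI/Tool message
# pairs stay together. Hypothetical helper, not the middleware's API.

def safe_cutoff(messages: list[dict], desired_cutoff: int) -> int:
    """Move the cutoff earlier until it does not split an AI message
    from the tool messages that respond to it."""
    cutoff = desired_cutoff
    # A "tool" message must stay with the preceding "ai" message that
    # issued the tool call, so back up while the boundary lands on one.
    while 0 < cutoff < len(messages) and messages[cutoff]["role"] == "tool":
        cutoff -= 1
    return cutoff

msgs = [
    {"role": "human", "content": "hi"},
    {"role": "ai", "content": "calling tool", "tool_calls": ["t1"]},
    {"role": "tool", "content": "result"},
    {"role": "ai", "content": "done"},
]
# Cutting at index 2 would orphan the tool result, so the cutoff backs up.
print(safe_cutoff(msgs, 2))  # -> 1
```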
```python
SummarizationMiddleware(
    model: str | BaseChatModel,
    *,
    trigger: ContextSize | list[ContextSize] | None = None,
    keep: ContextSize = ("messages", _DEFAULT_MESSAGES_TO_KEEP),
    token_counter: TokenCounter = count_tokens_approximately,
    summary_prompt: str = DEFAULT_SUMMARY_PROMPT,
    trim_tokens_to_summarize: int | None = _DEFAULT_TRIM_TOKEN_LIMIT,
    **deprecated_kwargs: Any,
)
```

| Name | Type | Description |
|---|---|---|
| `model`* | `str \| BaseChatModel` | The language model to use for generating summaries. |
| `trigger` | `ContextSize \| list[ContextSize] \| None` | Default: `None`. One or more thresholds that trigger summarization. Provide a single `ContextSize` or a list of them; summarization runs when any threshold is reached. |
| `keep` | `ContextSize` | Default: `('messages', _DEFAULT_MESSAGES_TO_KEEP)`. Context retention policy applied after summarization. Provide a single `ContextSize`; defaults to keeping the most recent messages. Does not support multiple values like `trigger`. |
| `token_counter` | `TokenCounter` | Default: `count_tokens_approximately`. Function to count tokens in messages. |
| `summary_prompt` | `str` | Default: `DEFAULT_SUMMARY_PROMPT`. Prompt template for generating summaries. |
| `trim_tokens_to_summarize` | `int \| None` | Default: `_DEFAULT_TRIM_TOKEN_LIMIT`. Maximum tokens to keep when preparing messages for the summarization call. Pass `None` to skip trimming. |
| Name | Type |
|---|---|
| `model` | `str \| BaseChatModel` |
| `trigger` | `ContextSize \| list[ContextSize] \| None` |
| `keep` | `ContextSize` |
| `token_counter` | `TokenCounter` |
| `summary_prompt` | `str` |
| `trim_tokens_to_summarize` | `int \| None` |
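The `trigger` and `keep` parameters take `ContextSize` values. As a rough sketch of how such thresholds could be evaluated, assuming `ContextSize` is a `(kind, value)` tuple with kinds such as `"tokens"`, `"messages"`, and `"fraction"` (the `"fraction"` kind and the helper names below are assumptions, not the middleware's actual internals):

```python
# Hypothetical sketch of evaluating ContextSize-style thresholds; the
# real middleware's logic may differ.

def threshold_reached(kind: str, value: float, *, tokens: int,
                      n_messages: int, max_context: int) -> bool:
    if kind == "tokens":
        return tokens >= value
    if kind == "messages":
        return n_messages >= value
    if kind == "fraction":  # assumed: fraction of the model's context window
        return tokens >= value * max_context
    raise ValueError(f"unknown ContextSize kind: {kind}")

def should_summarize(triggers, **state) -> bool:
    # A list of triggers means "summarize when ANY threshold is met".
    return any(threshold_reached(kind, value, **state)
               for kind, value in triggers)

state = dict(tokens=4500, n_messages=30, max_context=8000)
print(should_summarize([("tokens", 4000)], **state))                     # True
print(should_summarize([("messages", 50), ("fraction", 0.9)], **state))  # False
```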
Start the shell session and run startup commands.
Async start the shell session and run startup commands.
Check for parallel write_todos tool calls and return errors if detected.
Async check for parallel write_todos tool calls and return errors if detected.
Update the system message to include the todo system prompt.
Async update the system message to include the todo system prompt.
Run shutdown commands and release resources when an agent completes.
Async run shutdown commands and release resources when an agent completes.
Intercept tool execution for retries, monitoring, or modification.
Intercept and control async tool execution via handler callback.
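A tool-execution interception hook like the one above is commonly used for retries. The following is a hedged sketch of that pattern with hypothetical names, not the middleware's actual handler API:

```python
# Hypothetical retry wrapper around a tool-execution handler.
import time

def wrap_tool_call(request, handler, *, max_retries: int = 3,
                   backoff: float = 0.0):
    """Call handler(request), retrying on failure with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return handler(request)
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error
            time.sleep(backoff * (2 ** attempt))

calls = []
def flaky(req):
    """Simulated tool that fails twice, then succeeds."""
    calls.append(req)
    if len(calls) < 3:
        raise RuntimeError("transient failure")
    return "ok"

print(wrap_tool_call({"tool": "search"}, flaky))  # -> ok (after two retries)
```

The same shape applies to the async variant, with `await handler(request)` and `asyncio.sleep` in place of the blocking calls.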