SummarizationMiddleware(
self,
model: str | BaseChatModel,
*,
trigger: ContextSizeThe name of the middleware instance.
Logic to run before the agent execution starts.
Async logic to run before the agent execution starts.
Logic to run after the model is called.
Async logic to run after the model is called.
Intercept and control model execution via handler callback.
Intercept and control async model execution via handler callback.
Logic to run after the agent execution completes.
Async logic to run after the agent execution completes.
Intercept tool execution for retries, monitoring, or modification.
Intercept and control async tool execution via handler callback.
| Name | Type | Description |
|---|---|---|
model* | str | BaseChatModel | The language model to use for generating summaries. |
trigger | ContextSize | list[ContextSize] | None | Default: NoneOne or more thresholds that trigger summarization. Provide a single
Example
See |
keep | ContextSize | Default: ('messages', _DEFAULT_MESSAGES_TO_KEEP) |
token_counter | TokenCounter | Default: count_tokens_approximately |
summary_prompt | str | Default: DEFAULT_SUMMARY_PROMPT |
trim_tokens_to_summarize | int | None | Default: _DEFAULT_TRIM_TOKEN_LIMIT |
| Name | Type |
|---|---|
| model | str | BaseChatModel |
| trigger | ContextSize | list[ContextSize] | None |
| keep | ContextSize |
| token_counter | TokenCounter |
| summary_prompt | str |
| trim_tokens_to_summarize | int | None |
Summarizes conversation history when token limits are approached.
This middleware monitors message token counts and automatically summarizes older messages when a threshold is reached, preserving recent messages and maintaining context continuity by ensuring AI/Tool message pairs remain together.
Function to count tokens in messages.
Prompt template for generating summaries.
Maximum tokens to keep when preparing messages for the summarization call.
Pass None to skip trimming entirely.
Context retention policy applied after summarization.
Provide a ContextSize
tuple to specify how much history to preserve.
Defaults to keeping the most recent 20 messages.
Does not support multiple values like trigger.
# Keep the most recent 20 messages
("messages", 20)
# Keep the most recent 3000 tokens
("tokens", 3000)
# Keep the most recent 30% of the model's max input tokens
("fraction", 0.3)