summarization

Create a Deep Agents SummarizationMiddleware with model-aware defaults.

Why this exists in `deepagents`

The Deep Agents SummarizationMiddleware wraps langchain.agents.middleware.SummarizationMiddleware to add behavior long-running, file-aware agents need. Prefer LangChain's middleware directly if none of the below apply:

Backend offload of evicted history. Evicted messages are appended to /conversation_history/{thread_id}.md (default path) on the configured backend before the summary replaces them, and the summary embeds that path so the agent can re-open it via read_file when FilesystemMiddleware is registered. LangChain drops evicted messages with no recovery path.

Pre-summarization tool-arg truncation. Large write_file / edit_file arguments in older messages are clipped at a lower threshold than full compaction, often reclaiming enough context to skip summarizing. Configured via truncate_args_settings.
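The clipping step can be illustrated with a minimal sketch. This is not the library's implementation: the plain-dict message shape, the `truncate_old_tool_args` name, and the `keep_recent` and `max_chars` values are all assumptions made for illustration; the real knobs live in truncate_args_settings.

```python
from copy import deepcopy

# Tools whose arguments tend to be large (taken from the text above).
TRUNCATABLE_TOOLS = {"write_file", "edit_file"}


def truncate_old_tool_args(messages, keep_recent=2, max_chars=40):
    """Return a copy of `messages` with long string args clipped in older tool calls.

    Hypothetical helper: `keep_recent` and `max_chars` are illustrative values,
    not the library's defaults.
    """
    result = deepcopy(messages)  # never modify the caller's list
    cutoff = max(len(result) - keep_recent, 0)
    for message in result[:cutoff]:  # only messages older than the recent window
        for call in message.get("tool_calls", []):
            if call["name"] not in TRUNCATABLE_TOOLS:
                continue
            for key, value in list(call["args"].items()):
                if isinstance(value, str) and len(value) > max_chars:
                    call["args"][key] = value[:max_chars] + "...[truncated]"
    return result
```

Only the oversized payload is clipped; short arguments such as the file path survive, so the older tool call still records which file was touched.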

ContextOverflowError fallback. When the provider rejects a request for exceeding the model's context budget, the middleware summarizes the conversation and retries the call instead of letting the error propagate to the caller.
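A rough sketch of the summarize-and-retry pattern follows. The `ContextOverflowError` class here is a local stand-in defined only for this example, and `call_model` and `summarize` are hypothetical callables; the actual middleware hooks into the model call differently.

```python
class ContextOverflowError(Exception):
    """Local stand-in for the provider's over-budget error (illustrative only)."""


def call_with_overflow_fallback(call_model, summarize, messages):
    """Call the model; if the provider rejects the request as too large,
    compact the history once and retry with the shorter message list."""
    try:
        return call_model(messages)
    except ContextOverflowError:
        compacted = summarize(messages)
        # A second overflow is not caught here and reaches the caller.
        return call_model(compacted)
```

Retrying only once keeps the failure mode simple in this sketch: if the compacted history still does not fit, the error surfaces rather than looping.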

Non-mutating message state. Summarization is tracked in a private _summarization_event field via wrap_model_call, leaving state["messages"] intact. LangChain rewrites it with RemoveMessage(id=REMOVE_ALL_MESSAGES) from before_model. Preserving the raw log enables replay, evals, and shared state with SummarizationToolMiddleware's compact_conversation tool.
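To make the non-mutating approach concrete, here is a minimal sketch of deriving the model-facing view from the raw log plus a recorded event. The event fields `summary_message` and `cutoff_index` are hypothetical names chosen for this example; the real shape of _summarization_event is private to the library.

```python
def effective_messages(state):
    """Build the message list the model sees without touching state["messages"]."""
    event = state.get("_summarization_event")
    if event is None:
        # No compaction has happened yet: the model sees the full log.
        return list(state["messages"])
    # Hypothetical event fields: the summary message, plus the index of the
    # first raw message that is still passed to the model verbatim.
    kept = state["messages"][event["cutoff_index"]:]
    return [event["summary_message"], *kept]
```

Because the raw log is only read, a replay or eval harness can still iterate over every original message after any number of compactions.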

Auto-selected trigger/keep thresholds. LangChain accepts fraction-based thresholds but defaults to trigger=None and keep=("messages", 20). This factory picks fraction-based defaults from the model's profile when max_input_tokens is exposed, falling back to fixed counts otherwise — see compute_summarization_defaults.
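The selection logic might look roughly like the sketch below. The function name and the dict-shaped profile are assumptions; the fraction values are borrowed from the usage example on this page, and the fixed fallback values are placeholders rather than the library's actual numbers (compute_summarization_defaults is the authoritative source).

```python
def pick_summarization_defaults(profile):
    """Choose trigger/keep settings from a model profile dict (illustrative)."""
    if profile and profile.get("max_input_tokens"):
        # Context size is known, so thresholds can scale with the model.
        return {"trigger": ("fraction", 0.85), "keep": ("fraction", 0.10)}
    # Context size unknown: fall back to fixed counts (placeholder values).
    return {"trigger": ("tokens", 100_000), "keep": ("messages", 20)}
```

Fraction-based thresholds adapt automatically when the agent is pointed at a model with a larger or smaller context window, which is why they are preferred whenever the profile exposes max_input_tokens.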

create_summarization_tool_middleware (function)

Create a SummarizationToolMiddleware with model-aware defaults.

Convenience factory: builds a SummarizationMiddleware via create_summarization_middleware and wraps it in a SummarizationToolMiddleware. Saves a step and accepts a model string.

What you get

Only the tool layer is registered — the wrapped SummarizationMiddleware is the engine the tool calls into, not a middleware that runs on its own. The agent gains:

A compact_conversation tool to compact its own context window
A system-prompt nudge hinting when to call it
An eligibility gate at ~50% of the auto-summarization trigger so the tool refuses to compact too early
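The eligibility gate can be pictured as a simple comparison. This is a sketch under assumptions: the function name, the token-count inputs, and the exact 0.5 factor are illustrative, since the text only says the gate sits at roughly 50% of the auto-summarization trigger.

```python
def is_eligible_for_manual_compaction(current_tokens, trigger_tokens, gate_fraction=0.5):
    """Return True once context usage reaches the gate.

    `trigger_tokens` is the auto-summarization trigger expressed in tokens;
    `gate_fraction` approximates the "~50%" mentioned above.
    """
    return current_tokens >= trigger_tokens * gate_fraction
```

With a trigger of 100,000 tokens, for instance, a compaction request at 40,000 tokens would be refused and one at 60,000 tokens would be allowed.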

Pairing with auto-summarization

For automatic summarization at the trigger threshold, also register a SummarizationMiddleware. create_deep_agent adds one by default, so dropping create_summarization_tool_middleware(...) into its middleware=[...] gives you both layers; they share state via the _summarization_event key.

Summarization middleware for automatic and tool-based conversation compaction.

This module provides two middleware classes and a convenience factory:

SummarizationMiddleware: automatically compacts the conversation when token usage exceeds a configurable threshold. Older messages are summarized via an LLM call and the evicted messages are offloaded to a backend for later retrieval.

SummarizationToolMiddleware: exposes a compact_conversation tool that lets the agent (or a human-in-the-loop approval flow) trigger compaction on demand. Composes with a SummarizationMiddleware instance and reuses its summarization engine.

create_summarization_tool_middleware: convenience factory that builds a SummarizationMiddleware with model-aware defaults and wraps it in a SummarizationToolMiddleware.

Usage

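from deepagents import create_deep_agent
from deepagents.middleware.summarization import (
    SummarizationMiddleware,
    SummarizationToolMiddleware,
)
from deepagents.backends import FilesystemBackend

# Backend that receives the offloaded conversation history.
backend = FilesystemBackend(root_dir="/data")

# Automatic layer: trigger at 85% of the model's input budget, keep the most recent 10%.
summ = SummarizationMiddleware(
    model="gpt-5.5",
    backend=backend,
    trigger=("fraction", 0.85),
    keep=("fraction", 0.10),
)
# Tool layer: exposes compact_conversation and reuses summ as its engine.
tool_mw = SummarizationToolMiddleware(summ)

# Register both layers so compaction can happen automatically or on demand.
agent = create_deep_agent(middleware=[summ, tool_mw])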

Storage

Offloaded messages are stored as markdown at /conversation_history/{thread_id}.md.

Each summarization event appends a new section to this file, creating a running log of all evicted messages. Base64 media in evicted messages is written separately under <artifacts_root>/conversation_history/media/ and referenced by path from the markdown, so the history file stays text-only (see _offload_inline_media for the exact path).
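The append-and-offload behavior described above can be sketched as follows. Everything beyond the /conversation_history/{thread_id}.md path is an assumption: the dict standing in for a backend, the section heading, the image content shape, and the hashed media file name are illustrative only (the real media path is decided by _offload_inline_media).

```python
import base64
import hashlib


def offload_messages(files, thread_id, messages, artifacts_root=""):
    """Append evicted messages to the thread's history file and return its path.

    `files` is a plain dict standing in for a backend (path -> str or bytes).
    """
    history_path = f"/conversation_history/{thread_id}.md"
    lines = ["## Summarization event", ""]  # hypothetical section heading
    for message in messages:
        content = message["content"]
        if isinstance(content, dict) and content.get("type") == "image":
            # Write the decoded media to its own file; keep only a reference.
            raw = base64.b64decode(content["base64"])
            digest = hashlib.sha256(raw).hexdigest()[:12]
            media_path = f"{artifacts_root}/conversation_history/media/{digest}"
            files[media_path] = raw
            content = f"[media stored at {media_path}]"
        lines.append(f"**{message['role']}**: {content}")
    section = "\n".join(lines) + "\n\n"
    # Append rather than overwrite, so the file becomes a running log.
    files[history_path] = files.get(history_path, "") + section
    return history_path
```

Calling the function twice for the same thread yields two sections in one markdown file, while any base64 payload ends up only in the media file.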

Summary prompt

DEEPAGENTS_DEFAULT_SUMMARY_PROMPT augments LangChain's DEFAULT_SUMMARY_PROMPT with a deepagents-specific addendum explaining the media reference tags that the offloading behavior introduces, so the summarizing model knows to preserve them. It is the default summary_prompt for SummarizationMiddleware and both factories.
