Middleware
Reference docs
This page contains reference documentation for Middleware. See the docs for conceptual guides, tutorials, and examples on using Middleware.
Middleware classes¶
LangChain provides prebuilt middleware for common agent use cases:
| CLASS | DESCRIPTION |
|---|---|
| SummarizationMiddleware | Automatically summarize conversation history when approaching token limits |
| HumanInTheLoopMiddleware | Pause execution for human approval of tool calls |
| ModelCallLimitMiddleware | Limit the number of model calls to prevent excessive costs |
| ToolCallLimitMiddleware | Control tool execution by limiting call counts |
| ModelFallbackMiddleware | Automatically fall back to alternative models when the primary model fails |
| PIIMiddleware | Detect and handle Personally Identifiable Information |
| TodoListMiddleware | Equip agents with task planning and tracking capabilities |
| LLMToolSelectorMiddleware | Use an LLM to select relevant tools before calling the main model |
| ToolRetryMiddleware | Automatically retry failed tool calls with exponential backoff |
| LLMToolEmulator | Emulate tool execution with an LLM for testing purposes |
| ContextEditingMiddleware | Manage conversation context by trimming or clearing tool uses |
| ShellToolMiddleware | Expose a persistent shell session to agents for command execution |
| FilesystemFileSearchMiddleware | Provide Glob and Grep search tools over filesystem files |
| AgentMiddleware | Base class for creating custom middleware |
Decorators¶
Create custom middleware using these decorators:
| DECORATOR | DESCRIPTION |
|---|---|
| @before_agent | Execute logic before agent execution starts |
| @before_model | Execute logic before each model call |
| @after_model | Execute logic after each model call returns a response |
| @after_agent | Execute logic after agent execution completes |
| @wrap_model_call | Wrap and intercept model calls |
| @wrap_tool_call | Wrap and intercept tool calls |
| @dynamic_prompt | Generate dynamic system prompts based on request context |
| @hook_config | Configure hook behavior (e.g., conditional routing) |
Types and utilities¶
Core types for building middleware:
| TYPE | DESCRIPTION |
|---|---|
| AgentState | State container for agent execution |
| ModelRequest | Request details passed to model calls |
| ModelResponse | Response details from model calls |
| ClearToolUsesEdit | Utility for clearing tool usage history from context |
| InterruptOnConfig | Configuration for human-in-the-loop interruptions |
SummarizationMiddleware types:
| TYPE | DESCRIPTION |
|---|---|
| ContextSize | Union type for context size specifications |
| ContextFraction | Summarize at a fraction of the model's total context |
| ContextTokens | Summarize at a token threshold |
| ContextMessages | Summarize at a message threshold |
SummarizationMiddleware
¶
SummarizationMiddleware(
model: str | BaseChatModel,
*,
trigger: ContextSize | list[ContextSize] | None = None,
keep: ContextSize = ("messages", _DEFAULT_MESSAGES_TO_KEEP),
token_counter: TokenCounter = count_tokens_approximately,
summary_prompt: str = DEFAULT_SUMMARY_PROMPT,
trim_tokens_to_summarize: int | None = _DEFAULT_TRIM_TOKEN_LIMIT,
**deprecated_kwargs: Any,
)
Bases: AgentMiddleware
Summarizes conversation history when token limits are approached.
This middleware monitors message token counts and automatically summarizes older messages when a threshold is reached, preserving recent messages and maintaining context continuity by ensuring AI/Tool message pairs remain together.
Initialize summarization middleware.
| PARAMETER | DESCRIPTION |
|---|---|
| model | The language model to use for generating summaries. TYPE: str \| BaseChatModel |
| trigger | One or more thresholds that trigger summarization. Provide a single ContextSize, or a list to summarize when any threshold is reached. TYPE: ContextSize \| list[ContextSize] \| None |
| keep | Context retention policy applied after summarization. Provide a single ContextSize. Defaults to keeping the most recent messages. Unlike trigger, does not support multiple values. TYPE: ContextSize |
| token_counter | Function to count tokens in messages. TYPE: TokenCounter |
| summary_prompt | Prompt template for generating summaries. TYPE: str |
| trim_tokens_to_summarize | Maximum tokens to keep when preparing messages for the summarization call. Pass None to disable trimming. TYPE: int \| None |
HumanInTheLoopMiddleware
¶
HumanInTheLoopMiddleware(
interrupt_on: dict[str, bool | InterruptOnConfig],
*,
description_prefix: str = "Tool execution requires approval",
)
Bases: AgentMiddleware[StateT, ContextT]
Human in the loop middleware.
Initialize the human in the loop middleware.
| PARAMETER | DESCRIPTION |
|---|---|
| interrupt_on | Mapping of tool name to allowed actions. If a tool doesn't have an entry, it's auto-approved by default. TYPE: dict[str, bool \| InterruptOnConfig] |
| description_prefix | The prefix to use when constructing action requests. This provides context about the tool call and the action being requested. Not used if a tool's InterruptOnConfig provides its own description. TYPE: str |
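A minimal usage sketch (hedged: the send_email tool name and the decision set are illustrative, and interrupt-based approval generally requires running the agent with a checkpointer so execution can pause and resume):

```python
from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware

# Require a human decision before the (hypothetical) send_email tool runs;
# tools without an entry in interrupt_on are auto-approved.
hitl = HumanInTheLoopMiddleware(
    interrupt_on={
        "send_email": {"allowed_decisions": ["approve", "edit", "reject"]},
    },
)
agent = create_agent("openai:gpt-4o", middleware=[hitl])
```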
ModelCallLimitMiddleware
¶
ModelCallLimitMiddleware(
*,
thread_limit: int | None = None,
run_limit: int | None = None,
exit_behavior: Literal["end", "error"] = "end",
)
Bases: AgentMiddleware[ModelCallLimitState, Any]
Tracks model call counts and enforces limits.
This middleware monitors the number of model calls made during agent execution and can terminate the agent when specified limits are reached. It supports both thread-level and run-level call counting with configurable exit behaviors.
Thread-level: The middleware tracks the number of model calls and persists call count across multiple runs (invocations) of the agent.
Run-level: The middleware tracks the number of model calls made during a single run (invocation) of the agent.
Example

```python
from langchain.agents.middleware.call_tracking import ModelCallLimitMiddleware
from langchain.agents import create_agent
from langchain_core.messages import HumanMessage

# Create middleware with limits
call_tracker = ModelCallLimitMiddleware(thread_limit=10, run_limit=5, exit_behavior="end")
agent = create_agent("openai:gpt-4o", middleware=[call_tracker])

# The agent will automatically jump to the end when limits are exceeded
result = agent.invoke({"messages": [HumanMessage("Help me with a task")]})
```
Initialize the call tracking middleware.
| PARAMETER | DESCRIPTION |
|---|---|
| thread_limit | Maximum number of model calls allowed per thread. None means no thread-level limit. TYPE: int \| None |
| run_limit | Maximum number of model calls allowed per run. None means no run-level limit. TYPE: int \| None |
| exit_behavior | What to do when limits are exceeded: 'end' stops the agent gracefully, while 'error' raises an exception. TYPE: Literal['end', 'error'] |
| RAISES | DESCRIPTION |
|---|---|
| ValueError | If both limits are None. |
state_schema
class-attribute
instance-attribute
¶
The schema for state passed to the middleware nodes.
ToolCallLimitMiddleware
¶
ToolCallLimitMiddleware(
*,
tool_name: str | None = None,
thread_limit: int | None = None,
run_limit: int | None = None,
exit_behavior: ExitBehavior = "continue",
)
Bases: AgentMiddleware[ToolCallLimitState[ResponseT], ContextT], Generic[ResponseT, ContextT]
Tracks tool call counts and enforces limits during agent execution.
This middleware monitors the number of tool calls made and can terminate or restrict execution when limits are exceeded. It supports both thread-level (persistent across runs) and run-level (per invocation) call counting.
Configuration
exit_behavior: How to handle when limits are exceeded:
- 'continue': Block exceeded tools but let execution continue (default)
- 'error': Raise an exception
- 'end': Stop immediately with a ToolMessage and AI message for the single tool call that exceeded the limit (raises NotImplementedError if there are other pending tool calls due to parallel tool calling)
Examples:
Continue execution with blocked tools (default)

```python
from langchain.agents.middleware.tool_call_limit import ToolCallLimitMiddleware
from langchain.agents import create_agent

# Block exceeded tools but let other tools and the model continue
limiter = ToolCallLimitMiddleware(
    thread_limit=20,
    run_limit=10,
    exit_behavior="continue",  # default
)
agent = create_agent("openai:gpt-4o", middleware=[limiter])
```
Stop immediately when limit exceeded
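A hedged sketch of this mode, mirroring the surrounding examples:

```python
from langchain.agents.middleware.tool_call_limit import ToolCallLimitMiddleware
from langchain.agents import create_agent

# Stop the run with a ToolMessage + AI message once a limit is hit
limiter = ToolCallLimitMiddleware(thread_limit=20, run_limit=10, exit_behavior="end")
agent = create_agent("openai:gpt-4o", middleware=[limiter])
```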
Raise exception on limit

```python
from langchain.agents.middleware.tool_call_limit import (
    ToolCallLimitMiddleware,
    ToolCallLimitExceededError,
)
from langchain.agents import create_agent
from langchain_core.messages import HumanMessage

# Strict limit with exception handling
limiter = ToolCallLimitMiddleware(tool_name="search", thread_limit=5, exit_behavior="error")
agent = create_agent("openai:gpt-4o", middleware=[limiter])

try:
    result = agent.invoke({"messages": [HumanMessage("Task")]})
except ToolCallLimitExceededError as e:
    print(f"Search limit exceeded: {e}")
```
Initialize the tool call limit middleware.
| PARAMETER | DESCRIPTION |
|---|---|
| tool_name | Name of the specific tool to limit. If None, the limits apply to all tool calls. TYPE: str \| None |
| thread_limit | Maximum number of tool calls allowed per thread. None means no thread-level limit. TYPE: int \| None |
| run_limit | Maximum number of tool calls allowed per run. None means no run-level limit. TYPE: int \| None |
| exit_behavior | How to handle when limits are exceeded (see Configuration above). TYPE: ExitBehavior |
| RAISES | DESCRIPTION |
|---|---|
| ValueError | If both limits are None. |
state_schema
class-attribute
instance-attribute
¶
The schema for state passed to the middleware nodes.
ModelFallbackMiddleware
¶
ModelFallbackMiddleware(
first_model: str | BaseChatModel, *additional_models: str | BaseChatModel
)
Bases: AgentMiddleware
Automatic fallback to alternative models on errors.
Retries failed model calls with alternative models in sequence until one succeeds or all models are exhausted. The primary model is specified in create_agent.
Example

```python
from langchain.agents.middleware.model_fallback import ModelFallbackMiddleware
from langchain.agents import create_agent
from langchain_core.messages import HumanMessage

fallback = ModelFallbackMiddleware(
    "openai:gpt-4o-mini",  # Try first on error
    "anthropic:claude-sonnet-4-5-20250929",  # Then this
)
agent = create_agent(
    model="openai:gpt-4o",  # Primary model
    middleware=[fallback],
)

# If the primary model fails: tries gpt-4o-mini, then claude-sonnet-4-5-20250929
result = agent.invoke({"messages": [HumanMessage("Hello")]})
```
Initialize model fallback middleware.
| PARAMETER | DESCRIPTION |
|---|---|
| first_model | First fallback model (string name or BaseChatModel instance). TYPE: str \| BaseChatModel |
| *additional_models | Additional fallbacks, tried in order. TYPE: str \| BaseChatModel |
PIIMiddleware
¶
PIIMiddleware(
pii_type: Literal["email", "credit_card", "ip", "mac_address", "url"] | str,
*,
strategy: Literal["block", "redact", "mask", "hash"] = "redact",
detector: Callable[[str], list[PIIMatch]] | str | None = None,
apply_to_input: bool = True,
apply_to_output: bool = False,
apply_to_tool_results: bool = False,
)
Bases: AgentMiddleware
Detect and handle Personally Identifiable Information (PII) in conversations.
This middleware detects common PII types and applies configurable strategies to handle them. It can detect emails, credit cards, IP addresses, MAC addresses, and URLs in both user input and agent output.
Built-in PII types:
- email: Email addresses
- credit_card: Credit card numbers (validated with the Luhn algorithm)
- ip: IP addresses (validated with the stdlib)
- mac_address: MAC addresses
- url: URLs (both http/https and bare URLs)
Strategies:
- block: Raise an exception when PII is detected
- redact: Replace PII with [REDACTED_TYPE] placeholders
- mask: Partially mask PII (e.g., ****-****-****-1234 for a credit card)
- hash: Replace PII with a deterministic hash (e.g., <email_hash:a1b2c3d4>)
Strategy Selection Guide:
| Strategy | Preserves Identity? | Best For |
|---|---|---|
| block | N/A | Avoid PII completely |
| redact | No | General compliance, log sanitization |
| mask | No | Human readability, customer service UIs |
| hash | Yes (pseudonymous) | Analytics, debugging |
Example

```python
from langchain.agents.middleware import PIIMiddleware
from langchain.agents import create_agent

# Redact all emails in user input
agent = create_agent(
    "openai:gpt-5",
    middleware=[
        PIIMiddleware("email", strategy="redact"),
    ],
)

# Use different strategies for different PII types
agent = create_agent(
    "openai:gpt-4o",
    middleware=[
        PIIMiddleware("credit_card", strategy="mask"),
        PIIMiddleware("url", strategy="redact"),
        PIIMiddleware("ip", strategy="hash"),
    ],
)

# Custom PII type with regex
agent = create_agent(
    "openai:gpt-5",
    middleware=[
        PIIMiddleware("api_key", detector=r"sk-[a-zA-Z0-9]{32}", strategy="block"),
    ],
)
```
Initialize the PII detection middleware.
| PARAMETER | DESCRIPTION |
|---|---|
| pii_type | Type of PII to detect. Can be a built-in type (email, credit_card, ip, mac_address, url) or a custom type name used together with a detector. TYPE: Literal['email', 'credit_card', 'ip', 'mac_address', 'url'] \| str |
| strategy | How to handle detected PII: block, redact, mask, or hash (see Strategies above). TYPE: Literal['block', 'redact', 'mask', 'hash'] |
| detector | Custom detector function or regex pattern. Required for custom PII types. TYPE: Callable[[str], list[PIIMatch]] \| str \| None |
| apply_to_input | Whether to check user messages before the model call. TYPE: bool |
| apply_to_output | Whether to check AI messages after the model call. TYPE: bool |
| apply_to_tool_results | Whether to check tool result messages after tool execution. TYPE: bool |
| RAISES | DESCRIPTION |
|---|---|
| ValueError | If a custom pii_type is used without providing a detector. |
TodoListMiddleware
¶
TodoListMiddleware(
*,
system_prompt: str = WRITE_TODOS_SYSTEM_PROMPT,
tool_description: str = WRITE_TODOS_TOOL_DESCRIPTION,
)
Bases: AgentMiddleware
Middleware that provides todo list management capabilities to agents.
This middleware adds a write_todos tool that allows agents to create and manage
structured task lists for complex multi-step operations. It's designed to help
agents track progress, organize complex tasks, and provide users with visibility
into task completion status.
The middleware automatically injects system prompts that guide the agent on when and how to use the todo functionality effectively.
Example

```python
from langchain.agents.middleware.todo import TodoListMiddleware
from langchain.agents import create_agent
from langchain_core.messages import HumanMessage

agent = create_agent("openai:gpt-4o", middleware=[TodoListMiddleware()])

# The agent now has access to the write_todos tool and todo state tracking
result = agent.invoke({"messages": [HumanMessage("Help me refactor my codebase")]})
print(result["todos"])  # List of todo items with status tracking
```
Initialize the TodoListMiddleware with optional custom prompts.
| PARAMETER | DESCRIPTION |
|---|---|
| system_prompt | Custom system prompt to guide the agent on using the todo tool. TYPE: str |
| tool_description | Custom description for the write_todos tool. TYPE: str |
state_schema
class-attribute
instance-attribute
¶
The schema for state passed to the middleware nodes.
LLMToolSelectorMiddleware
¶
LLMToolSelectorMiddleware(
*,
model: str | BaseChatModel | None = None,
system_prompt: str = DEFAULT_SYSTEM_PROMPT,
max_tools: int | None = None,
always_include: list[str] | None = None,
)
Bases: AgentMiddleware
Uses an LLM to select relevant tools before calling the main model.
When an agent has many tools available, this middleware filters them down to only the most relevant ones for the user's query. This reduces token usage and helps the main model focus on the right tools.
Examples:
Limit to 3 tools
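A hedged sketch (my_tools stands in for your own, larger tool list):

```python
from langchain.agents import create_agent
from langchain.agents.middleware import LLMToolSelectorMiddleware

# Let the selection model pick at most 3 tools per query
selector = LLMToolSelectorMiddleware(max_tools=3)
agent = create_agent("openai:gpt-4o", tools=my_tools, middleware=[selector])
```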
Use a smaller model for selection
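Continuing the sketch above, a smaller model can handle selection while the main model handles the conversation:

```python
selector = LLMToolSelectorMiddleware(model="openai:gpt-4o-mini", max_tools=3)
agent = create_agent("openai:gpt-4o", tools=my_tools, middleware=[selector])
```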
Initialize the tool selector.
| PARAMETER | DESCRIPTION |
|---|---|
| model | Model to use for selection. If not provided, uses the agent's main model. Can be a model identifier string or a BaseChatModel instance. TYPE: str \| BaseChatModel \| None |
| system_prompt | Instructions for the selection model. TYPE: str |
| max_tools | Maximum number of tools to select. If the model selects more, only the first max_tools are used. If not specified, there is no limit. TYPE: int \| None |
| always_include | Tool names to always include regardless of selection. These do not count against the max_tools limit. TYPE: list[str] \| None |
ToolRetryMiddleware
¶
ToolRetryMiddleware(
*,
max_retries: int = 2,
tools: list[BaseTool | str] | None = None,
retry_on: RetryOn = (Exception,),
on_failure: OnFailure = "continue",
backoff_factor: float = 2.0,
initial_delay: float = 1.0,
max_delay: float = 60.0,
jitter: bool = True,
)
Bases: AgentMiddleware
Middleware that automatically retries failed tool calls with configurable backoff.
Supports retrying on specific exceptions and exponential backoff.
Examples:
Basic usage with default settings (2 retries, exponential backoff)
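A hedged sketch of the defaults:

```python
from langchain.agents import create_agent
from langchain.agents.middleware import ToolRetryMiddleware

# Defaults: 2 retries, exponential backoff with jitter
agent = create_agent("openai:gpt-4o", middleware=[ToolRetryMiddleware()])
```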
Retry specific exceptions only
Custom exception filtering
Apply to specific tools with custom error handling
Apply to specific tools using BaseTool instances
Constant backoff (no exponential growth)
Raise exception on failure
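A hedged sketch of the failure-raising case (the 'error' value mirrors the exit-behavior vocabulary used elsewhere on this page; check the OnFailure type for the exact accepted values):

```python
# Raise the final exception instead of reporting it to the model
retry = ToolRetryMiddleware(max_retries=3, on_failure="error")
agent = create_agent("openai:gpt-4o", middleware=[retry])
```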
Initialize ToolRetryMiddleware.
| PARAMETER | DESCRIPTION |
|---|---|
| max_retries | Maximum number of retry attempts after the initial call. Must be non-negative. TYPE: int |
| tools | Optional list of tools or tool names to apply retry logic to. Can be a list of BaseTool instances or tool name strings. If None, retries apply to all tools. TYPE: list[BaseTool \| str] \| None |
| retry_on | Either a tuple of exception types to retry on, or a callable that takes an exception and returns True if the call should be retried. Default is to retry on all exceptions. TYPE: RetryOn |
| on_failure | Behavior when all retries are exhausted: 'continue' reports the failure to the model so execution can continue, while 'error' raises the exception. Deprecated values are still accepted for backwards compatibility. TYPE: OnFailure |
| backoff_factor | Multiplier for exponential backoff. Each retry waits initial_delay * backoff_factor ** attempt seconds. Set to 0.0 for a constant delay. TYPE: float |
| initial_delay | Initial delay in seconds before the first retry. TYPE: float |
| max_delay | Maximum delay in seconds between retries. Caps exponential backoff growth. TYPE: float |
| jitter | Whether to add random jitter to each delay to avoid synchronized retries. TYPE: bool |
| RAISES | DESCRIPTION |
|---|---|
| ValueError | If max_retries is negative. |
LLMToolEmulator
¶
LLMToolEmulator(
*,
tools: list[str | BaseTool] | None = None,
model: str | BaseChatModel | None = None,
)
Bases: AgentMiddleware
Emulates specified tools using an LLM instead of executing them.
This middleware allows selective emulation of tools for testing purposes.
By default (when tools=None), all tools are emulated. You can specify which
tools to emulate by passing a list of tool names or BaseTool instances.
Examples:
Emulate all tools (default behavior)
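A hedged sketch (my_tools is a placeholder for your own tool list):

```python
from langchain.agents import create_agent
from langchain.agents.middleware import LLMToolEmulator

# With tools=None (the default), every tool call is answered by an LLM
agent = create_agent("openai:gpt-4o", tools=my_tools, middleware=[LLMToolEmulator()])
```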
Emulate specific tools by name
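Continuing the sketch, with only the named (illustrative) tools emulated:

```python
emulator = LLMToolEmulator(tools=["get_weather", "search_web"])
agent = create_agent("openai:gpt-4o", tools=my_tools, middleware=[emulator])
```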
Use a custom model for emulation
Emulate specific tools by passing tool instances
Initialize the tool emulator.
| PARAMETER | DESCRIPTION |
|---|---|
| tools | List of tool names (str) or BaseTool instances to emulate. If None (default), all tools are emulated. If an empty list, no tools are emulated. TYPE: list[str \| BaseTool] \| None |
| model | Model to use for emulation. Can be a model identifier string or a BaseChatModel instance; a default model is used when omitted. TYPE: str \| BaseChatModel \| None |
ContextEditingMiddleware
¶
ContextEditingMiddleware(
*,
edits: Iterable[ContextEdit] | None = None,
token_count_method: Literal["approximate", "model"] = "approximate",
)
Bases: AgentMiddleware
Automatically prune tool results to manage context size.
The middleware applies a sequence of edits when the total input token count exceeds configured thresholds.
Currently the ClearToolUsesEdit strategy is supported, aligning with Anthropic's
clear_tool_uses_20250919 behavior.
Initialize an instance of context editing middleware.
| PARAMETER | DESCRIPTION |
|---|---|
| edits | Sequence of edit strategies to apply. Defaults to a single ClearToolUsesEdit with default settings. TYPE: Iterable[ContextEdit] \| None |
| token_count_method | Whether to use approximate token counting (faster, less accurate) or exact counting implemented by the chat model (potentially slower, more accurate). TYPE: Literal['approximate', 'model'] |
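A minimal usage sketch, assuming ContextEditingMiddleware and ClearToolUsesEdit are both exported from langchain.agents.middleware:

```python
from langchain.agents import create_agent
from langchain.agents.middleware import ClearToolUsesEdit, ContextEditingMiddleware

# Clear older tool outputs once the prompt exceeds 100k tokens,
# keeping the 3 most recent tool results intact
editor = ContextEditingMiddleware(
    edits=[ClearToolUsesEdit(trigger=100_000, keep=3)],
)
agent = create_agent("openai:gpt-4o", middleware=[editor])
```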
ShellToolMiddleware
¶
ShellToolMiddleware(
workspace_root: str | Path | None = None,
*,
startup_commands: tuple[str, ...] | list[str] | str | None = None,
shutdown_commands: tuple[str, ...] | list[str] | str | None = None,
execution_policy: BaseExecutionPolicy | None = None,
redaction_rules: tuple[RedactionRule, ...] | list[RedactionRule] | None = None,
tool_description: str | None = None,
tool_name: str = SHELL_TOOL_NAME,
shell_command: Sequence[str] | str | None = None,
env: Mapping[str, Any] | None = None,
)
Bases: AgentMiddleware[ShellToolState, Any]
Middleware that registers a persistent shell tool for agents.
The middleware exposes a single long-lived shell session. Use the execution policy to match your deployment's security posture:
- HostExecutionPolicy – full host access; best for trusted environments where the agent already runs inside a container or VM that provides isolation.
- CodexSandboxExecutionPolicy – reuses the Codex CLI sandbox for additional syscall/filesystem restrictions when the CLI is available.
- DockerExecutionPolicy – launches a separate Docker container for each agent run, providing harder isolation, optional read-only root filesystems, and user remapping.

When no policy is provided, the middleware defaults to HostExecutionPolicy.
Initialize an instance of ShellToolMiddleware.
| PARAMETER | DESCRIPTION |
|---|---|
| workspace_root | Base directory for the shell session. If omitted, a temporary directory is created when the agent starts and removed when it ends. TYPE: str \| Path \| None |
| startup_commands | Optional commands executed sequentially after the session starts. TYPE: tuple[str, ...] \| list[str] \| str \| None |
| shutdown_commands | Optional commands executed before the session shuts down. TYPE: tuple[str, ...] \| list[str] \| str \| None |
| execution_policy | Execution policy controlling timeouts, output limits, and resource configuration. Defaults to HostExecutionPolicy. TYPE: BaseExecutionPolicy \| None |
| redaction_rules | Optional redaction rules to sanitize command output before returning it to the model. TYPE: tuple[RedactionRule, ...] \| list[RedactionRule] \| None |
| tool_description | Optional override for the registered shell tool description. TYPE: str \| None |
| tool_name | Name for the registered shell tool. Defaults to SHELL_TOOL_NAME. TYPE: str |
| shell_command | Optional shell executable (string) or argument sequence used to launch the persistent session. Defaults to an implementation-defined bash command. TYPE: Sequence[str] \| str \| None |
| env | Optional environment variables to supply to the shell session. Values are coerced to strings before command execution. If omitted, the session inherits the parent process environment. TYPE: Mapping[str, Any] \| None |
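A minimal usage sketch with the default HostExecutionPolicy (the workspace path and startup command are illustrative):

```python
from langchain.agents import create_agent
from langchain.agents.middleware import ShellToolMiddleware

shell = ShellToolMiddleware(
    workspace_root="/tmp/agent-workspace",  # illustrative path
    startup_commands="echo session ready",
)
agent = create_agent("openai:gpt-4o", middleware=[shell])
```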
FilesystemFileSearchMiddleware
¶
FilesystemFileSearchMiddleware(
*, root_path: str, use_ripgrep: bool = True, max_file_size_mb: int = 10
)
Bases: AgentMiddleware
Provides Glob and Grep search over filesystem files.
This middleware adds two tools that search the local filesystem:
- Glob: Fast file pattern matching by file path
- Grep: Fast content search using ripgrep or Python fallback
Example
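A minimal sketch (the root path is illustrative):

```python
from langchain.agents import create_agent
from langchain.agents.middleware import FilesystemFileSearchMiddleware

# Gives the agent Glob and Grep tools rooted at the project directory
search = FilesystemFileSearchMiddleware(root_path="/path/to/project")
agent = create_agent("openai:gpt-4o", middleware=[search])
```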
Initialize the search middleware.
| PARAMETER | DESCRIPTION |
|---|---|
| root_path | Root directory to search. TYPE: str |
| use_ripgrep | Whether to use ripgrep for content search. Falls back to a pure-Python implementation if ripgrep is unavailable. TYPE: bool |
| max_file_size_mb | Maximum file size to search, in MB. TYPE: int |
AgentMiddleware
¶
Bases: Generic[StateT, ContextT]
Base middleware class for an agent.
Subclass this and implement any of the defined methods to customize agent behavior between steps in the main agent loop.
before_agent
¶
before_agent(
func: _CallableWithStateAndRuntime[StateT, ContextT] | None = None,
*,
state_schema: type[StateT] | None = None,
tools: list[BaseTool] | None = None,
can_jump_to: list[JumpTo] | None = None,
name: str | None = None,
) -> (
Callable[
[_CallableWithStateAndRuntime[StateT, ContextT]],
AgentMiddleware[StateT, ContextT],
]
| AgentMiddleware[StateT, ContextT]
)
Decorator used to dynamically create a middleware with the before_agent hook.
| PARAMETER | DESCRIPTION |
|---|---|
| func | The function to be decorated. Must accept state (StateT) and runtime (Runtime[ContextT]) arguments. TYPE: _CallableWithStateAndRuntime[StateT, ContextT] \| None |
| state_schema | Optional custom state schema type. If not provided, uses the default AgentState schema. TYPE: type[StateT] \| None |
| tools | Optional list of additional tools to register with this middleware. TYPE: list[BaseTool] \| None |
| can_jump_to | Optional list of valid jump destinations for conditional edges. Valid values are 'tools', 'model', and 'end'. TYPE: list[JumpTo] \| None |
| name | Optional name for the generated middleware class. If not provided, uses the decorated function's name. TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| Callable[[_CallableWithStateAndRuntime[StateT, ContextT]], AgentMiddleware[StateT, ContextT]] \| AgentMiddleware[StateT, ContextT] | Either an AgentMiddleware instance (when used directly) or a decorator function (when called with configuration options). |
The decorated function should return:
- dict[str, Any]: State updates to merge into the agent state
- Command: A command to control flow (e.g., jump to a different node)
- None: No state updates or flow control
Examples:
Basic usage
With conditional jumping
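A single hedged sketch covering both cases above (the hook signature follows the parameter docs; the jump_to state key is assumed from the agent jump mechanism):

```python
from typing import Any

from langchain.agents import create_agent
from langchain.agents.middleware import AgentState, before_agent
from langgraph.runtime import Runtime


@before_agent(can_jump_to=["end"])
def check_input(state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
    # End the run immediately if there is nothing to do
    if not state["messages"]:
        return {"jump_to": "end"}
    return None


agent = create_agent("openai:gpt-4o", middleware=[check_input])
```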
before_model
¶
before_model(
func: _CallableWithStateAndRuntime[StateT, ContextT] | None = None,
*,
state_schema: type[StateT] | None = None,
tools: list[BaseTool] | None = None,
can_jump_to: list[JumpTo] | None = None,
name: str | None = None,
) -> (
Callable[
[_CallableWithStateAndRuntime[StateT, ContextT]],
AgentMiddleware[StateT, ContextT],
]
| AgentMiddleware[StateT, ContextT]
)
Decorator used to dynamically create a middleware with the before_model hook.
| PARAMETER | DESCRIPTION |
|---|---|
| func | The function to be decorated. Must accept state (StateT) and runtime (Runtime[ContextT]) arguments. TYPE: _CallableWithStateAndRuntime[StateT, ContextT] \| None |
| state_schema | Optional custom state schema type. If not provided, uses the default AgentState schema. TYPE: type[StateT] \| None |
| tools | Optional list of additional tools to register with this middleware. TYPE: list[BaseTool] \| None |
| can_jump_to | Optional list of valid jump destinations for conditional edges. Valid values are 'tools', 'model', and 'end'. TYPE: list[JumpTo] \| None |
| name | Optional name for the generated middleware class. If not provided, uses the decorated function's name. TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| Callable[[_CallableWithStateAndRuntime[StateT, ContextT]], AgentMiddleware[StateT, ContextT]] \| AgentMiddleware[StateT, ContextT] | Either an AgentMiddleware instance (when used directly) or a decorator function (when called with configuration options). |
The decorated function should return:
- dict[str, Any]: State updates to merge into the agent state
- Command: A command to control flow (e.g., jump to a different node)
- None: No state updates or flow control
Examples:
Basic usage
With conditional jumping
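A hedged sketch of the basic case (same assumed signature as before_agent above):

```python
from typing import Any

from langchain.agents.middleware import AgentState, before_model
from langgraph.runtime import Runtime


@before_model
def log_model_input(state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
    # Observe the state before each model call; None means no state updates
    print(f"Calling model with {len(state['messages'])} messages")
    return None
```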
after_model
¶
after_model(
func: _CallableWithStateAndRuntime[StateT, ContextT] | None = None,
*,
state_schema: type[StateT] | None = None,
tools: list[BaseTool] | None = None,
can_jump_to: list[JumpTo] | None = None,
name: str | None = None,
) -> (
Callable[
[_CallableWithStateAndRuntime[StateT, ContextT]],
AgentMiddleware[StateT, ContextT],
]
| AgentMiddleware[StateT, ContextT]
)
Decorator used to dynamically create a middleware with the after_model hook.
| PARAMETER | DESCRIPTION |
|---|---|
| func | The function to be decorated. Must accept state (StateT) and runtime (Runtime[ContextT]) arguments. TYPE: _CallableWithStateAndRuntime[StateT, ContextT] \| None |
| state_schema | Optional custom state schema type. If not provided, uses the default AgentState schema. TYPE: type[StateT] \| None |
| tools | Optional list of additional tools to register with this middleware. TYPE: list[BaseTool] \| None |
| can_jump_to | Optional list of valid jump destinations for conditional edges. Valid values are 'tools', 'model', and 'end'. TYPE: list[JumpTo] \| None |
| name | Optional name for the generated middleware class. If not provided, uses the decorated function's name. TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| Callable[[_CallableWithStateAndRuntime[StateT, ContextT]], AgentMiddleware[StateT, ContextT]] \| AgentMiddleware[StateT, ContextT] | Either an AgentMiddleware instance (when used directly) or a decorator function (when called with configuration options). |
The decorated function should return:
- dict[str, Any]: State updates to merge into the agent state
- Command: A command to control flow (e.g., jump to a different node)
- None: No state updates or flow control
Examples:
Basic usage for logging model responses
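A hedged sketch (same assumed signature as the hooks above):

```python
from langchain.agents.middleware import AgentState, after_model
from langgraph.runtime import Runtime


@after_model
def log_model_output(state: AgentState, runtime: Runtime) -> None:
    # The newest AI message is the last entry in state["messages"]
    print(f"Model responded: {state['messages'][-1].content[:100]}")
```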
after_agent
¶
after_agent(
func: _CallableWithStateAndRuntime[StateT, ContextT] | None = None,
*,
state_schema: type[StateT] | None = None,
tools: list[BaseTool] | None = None,
can_jump_to: list[JumpTo] | None = None,
name: str | None = None,
) -> (
Callable[
[_CallableWithStateAndRuntime[StateT, ContextT]],
AgentMiddleware[StateT, ContextT],
]
| AgentMiddleware[StateT, ContextT]
)
Decorator used to dynamically create a middleware with the after_agent hook.
Async version is aafter_agent.
| PARAMETER | DESCRIPTION |
|---|---|
| func | The function to be decorated. Must accept state (StateT) and runtime (Runtime[ContextT]) arguments. TYPE: _CallableWithStateAndRuntime[StateT, ContextT] \| None |
| state_schema | Optional custom state schema type. If not provided, uses the default AgentState schema. TYPE: type[StateT] \| None |
| tools | Optional list of additional tools to register with this middleware. TYPE: list[BaseTool] \| None |
| can_jump_to | Optional list of valid jump destinations for conditional edges. Valid values are 'tools', 'model', and 'end'. TYPE: list[JumpTo] \| None |
| name | Optional name for the generated middleware class. If not provided, uses the decorated function's name. TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| Callable[[_CallableWithStateAndRuntime[StateT, ContextT]], AgentMiddleware[StateT, ContextT]] \| AgentMiddleware[StateT, ContextT] | Either an AgentMiddleware instance (when used directly) or a decorator function (when called with configuration options). |
The decorated function should return:
- dict[str, Any]: State updates to merge into the agent state
- Command: A command to control flow (e.g., jump to a different node)
- None: No state updates or flow control
Examples:
Basic usage for logging agent completion
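A hedged sketch (same assumed signature as the hooks above):

```python
from langchain.agents.middleware import AgentState, after_agent
from langgraph.runtime import Runtime


@after_agent
def log_completion(state: AgentState, runtime: Runtime) -> None:
    print(f"Agent finished with {len(state['messages'])} messages")
```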
wrap_model_call
¶
wrap_model_call(
func: _CallableReturningModelResponse[StateT, ContextT] | None = None,
*,
state_schema: type[StateT] | None = None,
tools: list[BaseTool] | None = None,
name: str | None = None,
) -> (
Callable[
[_CallableReturningModelResponse[StateT, ContextT]],
AgentMiddleware[StateT, ContextT],
]
| AgentMiddleware[StateT, ContextT]
)
Create middleware with wrap_model_call hook from a function.
Converts a function with handler callback into middleware that can intercept model calls, implement retry logic, handle errors, and rewrite responses.
| PARAMETER | DESCRIPTION |
|---|---|
| func | Function accepting (request, handler) that calls handler(request) to execute the model and returns a ModelResponse. The request contains state and runtime. TYPE: _CallableReturningModelResponse[StateT, ContextT] \| None |
| state_schema | Custom state schema. Defaults to AgentState. TYPE: type[StateT] \| None |
| tools | Additional tools to register with this middleware. TYPE: list[BaseTool] \| None |
| name | Middleware class name. Defaults to the function name. TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| Callable[[_CallableReturningModelResponse[StateT, ContextT]], AgentMiddleware[StateT, ContextT]] \| AgentMiddleware[StateT, ContextT] | Either an AgentMiddleware instance (when used directly) or a decorator function. |
Examples:
Basic retry logic
Model fallback
Rewrite response content (full ModelResponse)
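A hedged sketch of the retry case (fallback and response rewriting follow the same request/handler pattern):

```python
from langchain.agents.middleware import wrap_model_call


@wrap_model_call
def retry_model(request, handler):
    # Try the model call up to 3 times before giving up
    last_exc = None
    for _ in range(3):
        try:
            return handler(request)
        except Exception as exc:  # illustrative catch-all; narrow in real code
            last_exc = exc
    raise last_exc
```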
wrap_tool_call
¶
wrap_tool_call(
func: _CallableReturningToolResponse | None = None,
*,
tools: list[BaseTool] | None = None,
name: str | None = None,
) -> Callable[[_CallableReturningToolResponse], AgentMiddleware] | AgentMiddleware
Create middleware with wrap_tool_call hook from a function.
Async version is awrap_tool_call.
Converts a function with handler callback into middleware that can intercept tool calls, implement retry logic, monitor execution, and modify responses.
| PARAMETER | DESCRIPTION |
|---|---|
| func | Function accepting (request, handler) that calls handler(request) to execute the tool and returns the final ToolMessage or Command. Can be sync or async. TYPE: _CallableReturningToolResponse \| None |
| tools | Additional tools to register with this middleware. TYPE: list[BaseTool] \| None |
| name | Middleware class name. Defaults to the function name. TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| Callable[[_CallableReturningToolResponse], AgentMiddleware] \| AgentMiddleware | Either an AgentMiddleware instance (when used directly) or a decorator function. |
Examples:
Retry logic
Async retry logic
Modify request
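A hedged sketch of the retry case (async and request-modification variants follow the same pattern):

```python
from langchain.agents.middleware import wrap_tool_call


@wrap_tool_call
def retry_tool(request, handler):
    # Retry a failed tool call once before letting the error propagate
    try:
        return handler(request)
    except Exception:  # illustrative catch-all; narrow in real code
        return handler(request)
```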
dynamic_prompt
¶
dynamic_prompt(
func: _CallableReturningSystemMessage[StateT, ContextT] | None = None,
) -> (
Callable[
[_CallableReturningSystemMessage[StateT, ContextT]],
AgentMiddleware[StateT, ContextT],
]
| AgentMiddleware[StateT, ContextT]
)
Decorator used to dynamically generate system prompts for the model.
This is a convenience decorator that creates middleware using wrap_model_call
specifically for dynamic prompt generation. The decorated function should return
a string that will be set as the system prompt for the model request.
| PARAMETER | DESCRIPTION |
|---|---|
| func | The function to be decorated. Must accept a ModelRequest and return the system prompt to use for that request. TYPE: _CallableReturningSystemMessage[StateT, ContextT] \| None |

| RETURNS | DESCRIPTION |
|---|---|
| Callable[[_CallableReturningSystemMessage[StateT, ContextT]], AgentMiddleware[StateT, ContextT]] \| AgentMiddleware[StateT, ContextT] | Either an AgentMiddleware instance (when used directly) or a decorator function. |
The decorated function should return one of:
- str: The system prompt string to use for the model request
- SystemMessage: A complete system message to use for the model request
Examples:
Basic usage with dynamic content:
```python
from langchain.agents.middleware import ModelRequest, dynamic_prompt


@dynamic_prompt
def my_prompt(request: ModelRequest) -> str:
    user_name = request.runtime.context.get("user_name", "User")
    return f"You are a helpful assistant helping {user_name}."
```
Using state to customize the prompt:
```python
@dynamic_prompt
def context_aware_prompt(request: ModelRequest) -> str:
    msg_count = len(request.state["messages"])
    if msg_count > 10:
        return "You are in a long conversation. Be concise."
    return "You are a helpful assistant."
```
Using with agent:
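A hedged sketch, passing the decorated middleware to create_agent as elsewhere on this page (my_prompt is the middleware defined above):

```python
from langchain.agents import create_agent

agent = create_agent("openai:gpt-4o", middleware=[my_prompt])
```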
hook_config
¶
Decorator to configure hook behavior in middleware methods.
Use this decorator on before_model or after_model methods in middleware classes
to configure their behavior. Currently supports specifying which destinations they
can jump to, which establishes conditional edges in the agent graph.
| PARAMETER | DESCRIPTION |
|---|---|
| can_jump_to | Optional list of valid jump destinations. Valid values are 'tools', 'model', and 'end'. TYPE: list[JumpTo] \| None |

| RETURNS | DESCRIPTION |
|---|---|
| Callable[[CallableT], CallableT] | Decorator function that marks the method with configuration metadata. |
Examples:
Using decorator on a class method
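A hedged sketch (the method signature and jump_to state key are assumed from the hook docs above; should_stop is an illustrative custom state key):

```python
from langchain.agents.middleware import AgentMiddleware, hook_config


class EarlyExitMiddleware(AgentMiddleware):
    @hook_config(can_jump_to=["end"])
    def before_model(self, state, runtime):
        # Skip the model call entirely when a sentinel flag is set
        if state.get("should_stop"):
            return {"jump_to": "end"}
        return None
```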
Alternative: use the can_jump_to parameter in the before_model/after_model decorators.
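A matching sketch, with the same assumptions as the class example above:

```python
from langchain.agents.middleware import before_model


@before_model(can_jump_to=["end"])
def early_exit(state, runtime):
    if state.get("should_stop"):  # illustrative custom state key
        return {"jump_to": "end"}
    return None
```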
ModelRequest
dataclass
¶
ModelRequest(
*,
model: BaseChatModel,
messages: list[AnyMessage],
system_message: SystemMessage | None = None,
system_prompt: str | None = None,
tool_choice: Any | None = None,
tools: list[BaseTool | dict] | None = None,
response_format: ResponseFormat | None = None,
state: AgentState | None = None,
runtime: Runtime[ContextT] | None = None,
model_settings: dict[str, Any] | None = None,
)
Model request information for the agent.
Initialize ModelRequest with backward compatibility for system_prompt.
| PARAMETER | DESCRIPTION |
|---|---|
| model | The chat model to use. TYPE: BaseChatModel |
| messages | List of messages (excluding the system prompt). TYPE: list[AnyMessage] |
| tool_choice | Tool choice configuration. TYPE: Any \| None |
| tools | List of available tools. TYPE: list[BaseTool \| dict] \| None |
| response_format | Response format specification. TYPE: ResponseFormat \| None |
| state | Agent state. TYPE: AgentState \| None |
| runtime | Runtime context. TYPE: Runtime[ContextT] \| None |
| model_settings | Additional model settings. TYPE: dict[str, Any] \| None |
| system_message | System message instance (preferred). TYPE: SystemMessage \| None |
| system_prompt | System prompt string (deprecated; converted to a SystemMessage). TYPE: str \| None |
| METHOD | DESCRIPTION |
|---|---|
| __setattr__ | Set an attribute with a deprecation warning. |
| override | Replace the request with a new request with the given overrides. |
system_prompt
property
¶
system_prompt: str | None
Get system prompt text from system_message.
| RETURNS | DESCRIPTION |
|---|---|
| str \| None | The content of the system message if present, otherwise None. |
__setattr__
¶
override
¶
override(**overrides: Unpack[_ModelRequestOverrides]) -> ModelRequest
Replace the request with a new request with the given overrides.
Returns a new ModelRequest instance with the specified attributes replaced.
This follows an immutable pattern, leaving the original request unchanged.
| PARAMETER | DESCRIPTION |
|---|---|
| **overrides | Keyword arguments for attributes to override, e.g. model, messages, system_message, tool_choice, tools, response_format, or model_settings. TYPE: Unpack[_ModelRequestOverrides] |
| RETURNS | DESCRIPTION |
|---|---|
| ModelRequest | New ModelRequest instance with the specified overrides applied. |
Examples:
Override system message (preferred)
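A hedged sketch, given an existing request: ModelRequest and handler inside a wrap_model_call hook:

```python
from langchain_core.messages import SystemMessage

# Returns a new request; the original is left unchanged
new_request = request.override(
    system_message=SystemMessage("You are a concise assistant."),
)
response = handler(new_request)
```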
ModelResponse
dataclass
¶
ModelResponse(result: list[BaseMessage], structured_response: Any = None)
Response from model execution including messages and optional structured output.
The result will usually contain a single AIMessage, but may include an additional
ToolMessage if the model used a tool for structured output.
ClearToolUsesEdit
dataclass
¶
ClearToolUsesEdit(
trigger: int = 100000,
clear_at_least: int = 0,
keep: int = 3,
clear_tool_inputs: bool = False,
exclude_tools: Sequence[str] = (),
placeholder: str = DEFAULT_TOOL_PLACEHOLDER,
)
Bases: ContextEdit
Configuration for clearing tool outputs when token limits are exceeded.
| METHOD | DESCRIPTION |
|---|---|
apply |
Apply the clear-tool-uses strategy. |
trigger
class-attribute
instance-attribute
¶
trigger: int = 100000
Token count that triggers the edit.
clear_at_least
class-attribute
instance-attribute
¶
clear_at_least: int = 0
Minimum number of tokens to reclaim when the edit runs.
keep
class-attribute
instance-attribute
¶
keep: int = 3
Number of most recent tool results that must be preserved.
clear_tool_inputs
class-attribute
instance-attribute
¶
clear_tool_inputs: bool = False
Whether to clear the originating tool call parameters on the AI message.
exclude_tools
class-attribute
instance-attribute
¶
exclude_tools: Sequence[str] = ()
List of tool names to exclude from clearing.
placeholder
class-attribute
instance-attribute
¶
placeholder: str = DEFAULT_TOOL_PLACEHOLDER
Placeholder text inserted for cleared tool outputs.
apply
¶
apply(messages: list[AnyMessage], *, count_tokens: TokenCounter) -> None
Apply the clear-tool-uses strategy.
InterruptOnConfig
¶
Bases: TypedDict
Configuration for an action requiring human in the loop.
This is the configuration format used in the HumanInTheLoopMiddleware.__init__
method.
allowed_decisions
instance-attribute
¶
allowed_decisions: list[DecisionType]
The decisions that are allowed for this action.
description
instance-attribute
¶
description: NotRequired[str | _DescriptionFactory]
The description attached to the request for human input.
Can be either:
- A static string describing the approval request
- A callable that dynamically generates the description based on agent state, runtime, and tool call information
Example

```python
# Static string description
config = InterruptOnConfig(
    allowed_decisions=["approve", "reject"],
    description="Please review this tool execution",
)

# Dynamic callable description
def format_tool_description(
    tool_call: ToolCall,
    state: AgentState,
    runtime: Runtime[ContextT],
) -> str:
    import json

    return (
        f"Tool: {tool_call['name']}\n"
        f"Arguments:\n{json.dumps(tool_call['args'], indent=2)}"
    )

config = InterruptOnConfig(
    allowed_decisions=["approve", "edit", "reject"],
    description=format_tool_description,
)
```
args_schema
instance-attribute
¶
args_schema: NotRequired[dict[str, Any]]
JSON schema for the args associated with the action, if edits are allowed.
ContextSize
module-attribute
¶
ContextSize = ContextFraction | ContextTokens | ContextMessages
Union type for context size specifications.
Can be one of:
- ContextFraction: A fraction of the model's maximum input tokens.
- ContextTokens: An absolute number of tokens.
- ContextMessages: An absolute number of messages.

Depending on whether it is used as the trigger or keep parameter, this type indicates either when to trigger summarization or how much context to retain.