Decorator used to dynamically create a middleware with the after_agent hook.
Async version is aafter_agent.
Decorator used to dynamically create a middleware with the after_model hook.
Decorator used to dynamically create a middleware with the before_agent hook.
Decorator used to dynamically create a middleware with the before_model hook.
Decorator used to dynamically generate system prompts for the model.
This is a convenience decorator that creates middleware using wrap_model_call
specifically for dynamic prompt generation. The decorated function should return
a string that will be set as the system prompt for the model request.
Decorator to configure hook behavior in middleware methods.
Use this decorator on before_model or after_model methods in middleware classes
to configure their behavior. Currently supports specifying which destinations they
can jump to, which establishes conditional edges in the agent graph.
Create middleware with wrap_model_call hook from a function.
Converts a function with handler callback into middleware that can intercept model calls, implement retry logic, handle errors, and rewrite responses.
Create middleware with wrap_tool_call hook from a function.
Async version is awrap_tool_call.
Converts a function with handler callback into middleware that can intercept tool calls, implement retry logic, monitor execution, and modify responses.
Configuration for clearing tool outputs when token limits are exceeded.
Automatically prune tool results to manage context size.
The middleware applies a sequence of edits when the total input token count exceeds configured thresholds.
Currently the ClearToolUsesEdit strategy is supported, aligning with Anthropic's
clear_tool_uses_20250919 behavior (read more).
Provides Glob and Grep search over filesystem files.
This middleware adds two tools that search through local filesystem:
Human in the loop middleware.
Configuration for an action requiring human in the loop.
This is the configuration format used in the HumanInTheLoopMiddleware.__init__
method.
Tracks model call counts and enforces limits.
This middleware monitors the number of model calls made during agent execution and can terminate the agent when specified limits are reached. It supports both thread-level and run-level call counting with configurable exit behaviors.
Thread-level: The middleware tracks the number of model calls and persists call count across multiple runs (invocations) of the agent.
Run-level: The middleware tracks the number of model calls made during a single run (invocation) of the agent.
Automatic fallback to alternative models on errors.
Retries failed model calls with alternative models in sequence until
success or all models exhausted. Primary model specified in create_agent.
Middleware that automatically retries failed model calls with configurable backoff.
Supports retrying on specific exceptions and exponential backoff.
Raised when configured to block on detected sensitive values.
Detect and handle Personally Identifiable Information (PII) in conversations.
This middleware detects common PII types and applies configurable strategies to handle them. It can detect emails, credit cards, IP addresses, MAC addresses, and URLs in both user input and agent output.
Built-in PII types:
email: Email addressescredit_card: Credit card numbers (validated with Luhn algorithm)ip: IP addresses (validated with stdlib)mac_address: MAC addressesurl: URLs (both http/https and bare URLs)Strategies:
block: Raise an exception when PII is detectedredact: Replace PII with [REDACTED_TYPE] placeholdersmask: Partially mask PII (e.g., ****-****-****-1234 for credit card)hash: Replace PII with deterministic hash (e.g., <email_hash:a1b2c3d4>)Strategy Selection Guide:
| Strategy | Preserves Identity? | Best For |
|---|---|---|
block |
N/A | Avoid PII completely |
redact |
No | General compliance, log sanitization |
mask |
No | Human readability, customer service UIs |
hash |
Yes (pseudonymous) | Analytics, debugging |
Launch the shell through the Codex CLI sandbox.
Ideal when you have the Codex CLI installed and want the additional syscall and
filesystem restrictions provided by Anthropic's Seatbelt (macOS) or Landlock/seccomp
(Linux) profiles. Commands still run on the host, but within the sandbox requested by
the CLI. If the Codex binary is unavailable or the runtime lacks the required
kernel features (e.g., Landlock inside some containers), process startup fails with a
RuntimeError.
Configure sandbox behavior via config_overrides to align with your Codex CLI
profile. This policy does not add its own resource limits; combine it with
host-level guards (cgroups, container resource limits) as needed.
Run the shell inside a dedicated Docker container.
Choose this policy when commands originate from untrusted users or you require
strong isolation between sessions. By default the workspace is bind-mounted only
when it refers to an existing non-temporary directory; ephemeral sessions run
without a mount to minimise host exposure. The container's network namespace is
disabled by default (--network none) and you can enable further hardening via
read_only_rootfs and user.
The security guarantees depend on your Docker daemon configuration. Run the agent on
a host where Docker is locked down (rootless mode, AppArmor/SELinux, etc.) and
review any additional volumes or capabilities passed through extra_run_args. The
default image is python:3.12-alpine3.19; supply a custom image if you need
preinstalled tooling.
Run the shell directly on the host process.
This policy is best suited for trusted or single-tenant environments (CI jobs, developer workstations, pre-sandboxed containers) where the agent must access the host filesystem and tooling without additional isolation. Enforces optional CPU and memory limits to prevent runaway commands but offers no filesystem or network sandboxing; commands can modify anything the process user can reach.
On Linux platforms resource limits are applied with resource.prlimit after the
shell starts. On macOS, where prlimit is unavailable, limits are set in a
preexec_fn before exec. In both cases the shell runs in its own process group
so timeouts can terminate the full subtree.
Configuration for handling a single PII type.
Middleware that registers a persistent shell tool for agents.
The middleware exposes a single long-lived shell session. Use the execution policy to match your deployment's security posture:
HostExecutionPolicy – full host access; best for trusted environments where the
agent already runs inside a container or VM that provides isolation.CodexSandboxExecutionPolicy – reuses the Codex CLI sandbox for additional
syscall/filesystem restrictions when the CLI is available.DockerExecutionPolicy – launches a separate Docker container for each agent run,
providing harder isolation, optional read-only root filesystems, and user
remapping.When no policy is provided the middleware defaults to HostExecutionPolicy.
Summarizes conversation history when token limits are approached.
This middleware monitors message token counts and automatically summarizes older messages when a threshold is reached, preserving recent messages and maintaining context continuity by ensuring AI/Tool message pairs remain together.
Middleware that provides todo list management capabilities to agents.
This middleware adds a write_todos tool that allows agents to create and manage
structured task lists for complex multi-step operations. It's designed to help
agents track progress, organize complex tasks, and provide users with visibility
into task completion status.
The middleware automatically injects system prompts that guide the agent on when
and how to use the todo functionality effectively. It also enforces that the
write_todos tool is called at most once per model turn, since the tool replaces
the entire todo list and parallel calls would create ambiguity about precedence.
Track tool call counts and enforces limits during agent execution.
This middleware monitors the number of tool calls made and can terminate or restrict execution when limits are exceeded. It supports both thread-level (persistent across runs) and run-level (per invocation) call counting.
Emulates specified tools using an LLM instead of executing them.
This middleware allows selective emulation of tools for testing purposes.
By default (when tools=None), all tools are emulated. You can specify which
tools to emulate by passing a list of tool names or BaseTool instances.
Middleware that automatically retries failed tool calls with configurable backoff.
Supports retrying on specific exceptions and exponential backoff.
Uses an LLM to select relevant tools before calling the main model.
When an agent has many tools available, this middleware filters them down to only the most relevant ones for the user's query. This reduces token usage and helps the main model focus on the right tools.
Base middleware class for an agent.
Subclass this and implement any of the defined methods to customize agent behavior between steps in the main agent loop.
State schema for the agent.
Model response with an optional 'Command' from 'wrap_model_call' middleware.
Use this to return a 'Command' alongside the model response from a 'wrap_model_call' handler. The command is applied as an additional state update after the model node completes, using the graph's reducers (e.g. 'add_messages' for the 'messages' key).
Because each 'Command' is applied through the reducer, messages in the command are added alongside the model response messages rather than replacing them. For non-reducer state fields, later commands overwrite earlier ones (outermost middleware wins over inner).
Model request information for the agent.
Response from model execution including messages and optional structured output.
The result will usually contain a single AIMessage, but may include an additional
ToolMessage if the model used a tool for structured output.
Middleware that exposes a persistent shell tool to agents.
Planning and task management middleware for agents.
Model fallback middleware for agents.
Call tracking middleware for agents.
Types for middleware and agents.
PII detection and handling middleware for agents.
Tool retry middleware for agents.
Human in the loop middleware.
Context editing middleware.
Mirrors Anthropic's context editing capabilities by clearing older tool results once the conversation grows beyond a configurable token threshold.
The implementation is intentionally model-agnostic so it can be used with any LangChain chat model.
Tool call limit middleware for agents.
Summarization middleware.
Tool emulator middleware for testing.
Model retry middleware for agents.
File search middleware for Anthropic text editor and memory tools.
This module provides Glob and Grep search tools that operate on files stored in state or filesystem.
LLM-based tool selector middleware.