langchain › browser

Function · Since v1.2

modelCallLimitMiddleware

Creates a middleware that limits the number of model calls at both the thread and run levels.

This middleware helps prevent excessive model API calls by enforcing limits on how many times the model can be invoked. It supports two types of limits:

  • Thread-level limit: Restricts the total number of model calls across an entire conversation thread
  • Run-level limit: Restricts the number of model calls within a single agent run/invocation

How It Works

The middleware intercepts model requests before they are sent and checks the current call counts against the configured limits. If either limit is exceeded, it throws a ModelCallLimitMiddlewareError to stop execution and prevent further API calls.
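The limit check described above can be sketched as follows. This is an illustrative sketch only, not the middleware's actual implementation: the names `CallCounts`, `checkCallLimits`, and `ModelCallLimitError` are hypothetical stand-ins (the real middleware throws a `ModelCallLimitMiddlewareError`).

```typescript
// Hypothetical sketch of the pre-request limit check.
// None of these names are part of the langchain API.
interface CallCounts {
  threadCalls: number; // total model calls so far in the conversation thread
  runCalls: number;    // model calls so far in the current run
}

class ModelCallLimitError extends Error {}

function checkCallLimits(
  counts: CallCounts,
  limits: { threadLimit?: number; runLimit?: number }
): void {
  // Runs before the model request is sent; throws if either
  // configured limit has already been reached.
  if (limits.threadLimit !== undefined && counts.threadCalls >= limits.threadLimit) {
    throw new ModelCallLimitError(
      `Thread limit of ${limits.threadLimit} model calls reached`
    );
  }
  if (limits.runLimit !== undefined && counts.runCalls >= limits.runLimit) {
    throw new ModelCallLimitError(
      `Run limit of ${limits.runLimit} model calls reached`
    );
  }
}
```

An unset limit is simply skipped, which matches the middleware taking both options as optional configuration.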

Use Cases

  • Cost Control: Prevent runaway costs from excessive model calls in production
  • Testing: Ensure agents don't make too many calls during development/testing
  • Safety: Limit potential infinite loops or recursive agent behaviors
  • Rate Limiting: Enforce organizational policies on model usage per conversation
modelCallLimitMiddleware(
  middlewareOptions: Partial<__type>
): AgentMiddleware<ZodObject<__type, "strip", ZodTypeAny, __type, __type>, ZodObject<__type, "strip", ZodTypeAny, __type, __type>, __type, readonly (ClientTool | ServerTool)[]>

Used in Docs

  • Prebuilt middleware

Parameters

Name: middlewareOptions
Type: Partial<__type>

Configuration options for the call limits

Example 1

import { createAgent, modelCallLimitMiddleware } from "langchain";

// Limit to 10 calls per thread and 3 calls per run
const agent = createAgent({
  model: "openai:gpt-4o-mini",
  tools: [myTool],
  middleware: [
    modelCallLimitMiddleware({
      threadLimit: 10,
      runLimit: 3
    })
  ]
});

Example 2

// Limits can also be configured at runtime via context
const result = await agent.invoke(
  { messages: ["Hello"] },
  {
    configurable: {
      threadLimit: 5  // Override the default limit for this run
    }
  }
);
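The two limits imply two counter lifecycles: the run-level count resets at the start of each invocation, while the thread-level count accumulates across the whole conversation. A minimal sketch with a hypothetical `CallTracker` class (not part of the langchain API) makes the difference concrete:

```typescript
// Hypothetical sketch of the two counter lifecycles;
// CallTracker is not a real langchain export.
class CallTracker {
  threadCalls = 0; // persists for the whole conversation thread
  runCalls = 0;    // reset at the start of every run

  startRun(): void {
    this.runCalls = 0;
  }

  recordModelCall(): void {
    this.threadCalls += 1;
    this.runCalls += 1;
  }
}

const tracker = new CallTracker();
tracker.startRun();        // first invoke
tracker.recordModelCall();
tracker.recordModelCall();
tracker.startRun();        // second invoke: run count resets, thread count does not
tracker.recordModelCall();
// tracker.threadCalls === 3, tracker.runCalls === 1
```

This is why a `runLimit` of 3 can be hit inside a single long agent loop even when the `threadLimit` of 10 is still far away.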