langchain.js

    Function modelCallLimitMiddleware

    • Creates a middleware to limit the number of model calls at both thread and run levels.

      This middleware helps prevent excessive model API calls by enforcing limits on how many times the model can be invoked. It supports two types of limits:

      • Thread-level limit: Restricts the total number of model calls across an entire conversation thread
      • Run-level limit: Restricts the number of model calls within a single agent run/invocation

      The middleware intercepts model requests before they are sent and checks the current call counts against the configured limits. If either limit is exceeded, it throws a ModelCallLimitMiddlewareError to stop execution and prevent further API calls.
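      The check described above can be sketched in isolation. This is a minimal illustration, not the library's implementation: `CallCounts`, `CallLimits`, `CallLimitExceededError`, `beforeModelCall`, and `startRun` are hypothetical names, and the sketch assumes a limit counts as exceeded once the counter has already reached it, so a limit of 3 permits three calls and rejects the fourth.

```typescript
// Hypothetical types and helpers for illustration only; not part of langchain.
interface CallLimits {
  threadLimit?: number; // max model calls across the whole thread
  runLimit?: number;    // max model calls within one run
}

interface CallCounts {
  threadModelCallCount: number;
  runModelCallCount: number;
}

// Stand-in for the error the middleware throws.
class CallLimitExceededError extends Error {
  scope: "thread" | "run";
  constructor(scope: "thread" | "run", limit: number) {
    super(`Model call limit reached for ${scope}: ${limit}`);
    this.name = "CallLimitExceededError";
    this.scope = scope;
  }
}

// Runs before each model request. Throws if a configured limit is already
// used up (assumed semantics); otherwise returns the incremented counts.
function beforeModelCall(counts: CallCounts, limits: CallLimits): CallCounts {
  if (
    limits.threadLimit !== undefined &&
    counts.threadModelCallCount >= limits.threadLimit
  ) {
    throw new CallLimitExceededError("thread", limits.threadLimit);
  }
  if (
    limits.runLimit !== undefined &&
    counts.runModelCallCount >= limits.runLimit
  ) {
    throw new CallLimitExceededError("run", limits.runLimit);
  }
  return {
    threadModelCallCount: counts.threadModelCallCount + 1,
    runModelCallCount: counts.runModelCallCount + 1,
  };
}

// Starting a new run resets only the run-level counter; the thread-level
// counter keeps accumulating across runs.
function startRun(counts: CallCounts): CallCounts {
  return { ...counts, runModelCallCount: 0 };
}
```

      Keeping the two counters separate is what lets a thread-level budget span many runs while each individual run still has its own, smaller cap.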

      Common use cases:

      • Cost Control: Prevent runaway costs from excessive model calls in production
      • Testing: Ensure agents don't make too many calls during development/testing
      • Safety: Limit potential infinite loops or recursive agent behaviors
      • Rate Limiting: Enforce organizational policies on model usage per conversation

      Parameters

      • Optional middlewareOptions: any

        Configuration options for the call limits

      Returns AgentMiddleware<
          ZodObject<
              {
                  runModelCallCount: ZodDefault<ZodNumber>;
                  threadModelCallCount: ZodDefault<ZodNumber>;
              },
              "strip",
              ZodTypeAny,
              { runModelCallCount: number; threadModelCallCount: number },
              { runModelCallCount?: number; threadModelCallCount?: number },
          >,
          ZodObject<
              {
                  exitBehavior: ZodOptional<ZodEnum<["error", "end"]>>;
                  runLimit: ZodOptional<ZodNumber>;
                  threadLimit: ZodOptional<ZodNumber>;
              },
              "strip",
              ZodTypeAny,
              {
                  exitBehavior?: "end" | "error";
                  runLimit?: number;
                  threadLimit?: number;
              },
              {
                  exitBehavior?: "end" | "error";
                  runLimit?: number;
                  threadLimit?: number;
              },
          >,
          any,
      >

      A middleware instance that can be passed to createAgent

      Throws ModelCallLimitMiddlewareError when either the thread or run limit is exceeded

      import { createAgent, modelCallLimitMiddleware } from "langchain";

      // Limit to 10 calls per thread and 3 calls per run
      const agent = createAgent({
        model: "openai:gpt-4o-mini",
        tools: [myTool],
        middleware: [
          modelCallLimitMiddleware({
            threadLimit: 10,
            runLimit: 3
          })
        ]
      });

      // Limits can also be configured at runtime via context
      const result = await agent.invoke(
        { messages: ["Hello"] },
        {
          configurable: {
            threadLimit: 5 // Override the default limit for this run
          }
        }
      );
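
      The runtime override in the example above can be pictured as a simple merge: a value supplied for the run wins, and anything left unset falls back to the default given when the middleware was created. The sketch below assumes that precedence; `LimitOptions` and `resolveLimits` are hypothetical names and not part of the library.

```typescript
// Hypothetical helper for illustration only; not part of langchain.
interface LimitOptions {
  threadLimit?: number;
  runLimit?: number;
}

// Returns the limits that apply to one run: runtime values take precedence,
// and unset runtime values fall back to the middleware defaults.
function resolveLimits(
  defaults: LimitOptions,
  runtime: LimitOptions = {}
): LimitOptions {
  return {
    threadLimit: runtime.threadLimit ?? defaults.threadLimit,
    runLimit: runtime.runLimit ?? defaults.runLimit,
  };
}

// Defaults of 10 per thread and 3 per run, with the thread limit
// overridden to 5 for this run only.
const effective = resolveLimits(
  { threadLimit: 10, runLimit: 3 },
  { threadLimit: 5 }
);
```

      Resolving each field independently means overriding one limit for a run leaves the other at its configured default.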