LangChain Reference
Function ● Since v1.4

bedrockPromptCachingMiddleware

bedrockPromptCachingMiddleware(
  middlewareOptions: Partial<__type>
): AgentMiddleware<
  undefined,
  ZodObject<__type, "strip", ZodTypeAny, __type, __type>,
  __type,
  readonly (ClientTool | ServerTool)[],
  readonly []
>

Parameters

Name               Type             Description
middlewareOptions  Partial<__type>  Configuration options for the caching behavior

Example 1: Basic usage with default settings
Example 2: Custom configuration for longer conversations
Example 3: Conditional caching based on runtime context
Example 4: Optimal setup for customer support chatbot

Creates a prompt caching middleware for AWS Bedrock Converse models to optimize API usage.

This middleware automatically enables Bedrock's prompt caching when using AWS Bedrock Converse models. This can significantly reduce costs for applications with repetitive prompts, long system messages, or extensive conversation histories.

How It Works

The middleware intercepts model requests and sets a cache control signal that ChatBedrockConverse translates into Bedrock cachePoint breakpoints. Cache points are inserted after the system prompt, after the tool definitions, and after the final message, so the stable prefix of each request is cached. On subsequent requests with a matching prefix, the cached representations are reused, skipping redundant token processing. Exact placement varies by model (e.g. Amazon Nova models cache fewer breakpoints and ignore the "1h" TTL).
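The placement described above can be illustrated with a minimal, self-contained sketch. It does not reproduce the middleware's or ChatBedrockConverse's actual code: the `ConverseRequest` type is a simplified, hypothetical stand-in for a Converse-style request (tools are flattened into a `tools` array), and `withCachePoints` is an illustrative helper name. Only the `cachePoint` marker and the three insertion positions come from the description above.

```typescript
// Hypothetical, simplified shapes modeled loosely on a Converse-style request.
// These types are assumptions for illustration, not the library's own types.
type CachePoint = { cachePoint: { type: "default" } };
type TextBlock = { text: string };
type ToolSpecBlock = { toolSpec: { name: string; description: string } };

interface ConverseRequest {
  system: Array<TextBlock | CachePoint>;
  tools: Array<ToolSpecBlock | CachePoint>;
  messages: Array<{
    role: "user" | "assistant";
    content: Array<TextBlock | CachePoint>;
  }>;
}

const CACHE_POINT: CachePoint = { cachePoint: { type: "default" } };

// Returns a copy of the request with a cache point appended after the system
// prompt, after the tool definitions, and after the final message.
function withCachePoints(request: ConverseRequest): ConverseRequest {
  const lastIndex = request.messages.length - 1;
  const messages = request.messages.map((message, index) =>
    index === lastIndex
      ? { ...message, content: [...message.content, CACHE_POINT] }
      : message
  );
  return {
    system:
      request.system.length > 0 ? [...request.system, CACHE_POINT] : request.system,
    tools:
      request.tools.length > 0 ? [...request.tools, CACHE_POINT] : request.tools,
    messages,
  };
}

// Usage: only the stable prefix (system prompt, tools, history) is marked.
const request: ConverseRequest = {
  system: [{ text: "You are a support agent." }],
  tools: [{ toolSpec: { name: "createTicket", description: "Open a ticket" } }],
  messages: [
    { role: "user", content: [{ text: "Hi" }] },
    { role: "assistant", content: [{ text: "Hello" }] },
    { role: "user", content: [{ text: "Where is my order?" }] },
  ],
};
const cached = withCachePoints(request);
```

The helper copies rather than mutates the request, so the original input stays reusable. A model-specific implementation would also need to respect per-model limits on the number of breakpoints, which this sketch ignores.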

Benefits

  • Cost Reduction: Avoid reprocessing the same tokens repeatedly
  • Lower Latency: Requests with a cached prefix are processed faster because the cached portion does not have to be reprocessed
  • Better Scalability: Reduced computational load enables handling more requests
  • Consistent Performance: Stable response times for repetitive queries

Important Notes

  • Bedrock Converse Only: This middleware only applies caching to AWS Bedrock Converse models. Other providers are handled per unsupportedModelBehavior.
  • Supported Families: Bedrock prompt caching is only available on the Anthropic Claude and Amazon Nova model families. Other Bedrock Converse models (e.g. Mistral, Cohere, Meta) reject cache points at request time, so they are treated as unsupported and routed through unsupportedModelBehavior.
  • Automatic Application: Caching is applied automatically when the message count reaches minMessagesToCache
  • TTL Options: Only supports "5m" (5 minutes) and "1h" (1 hour) as TTL values; actual support varies by model
  • Best Use Cases: Long system prompts, multi-turn conversations, repetitive queries, RAG applications
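The gating rules in the list above can be summarized in a small decision function. This is a sketch under stated assumptions, not the middleware's implementation: the `"ignore" | "warn" | "raise"` values for unsupportedModelBehavior, the substring check on the model ID, and the helper names `isSupportedFamily` and `shouldApplyCaching` are all hypothetical.

```typescript
// Assumed value set for unsupportedModelBehavior; the page names the option
// but does not list its values.
type UnsupportedModelBehavior = "ignore" | "warn" | "raise";

interface CachingOptions {
  enableCaching: boolean;
  minMessagesToCache: number;
  unsupportedModelBehavior: UnsupportedModelBehavior;
}

// Assumption: supported families are detected by a substring of the model ID.
const SUPPORTED_FAMILY_MARKERS = ["anthropic.claude", "amazon.nova"];

function isSupportedFamily(modelId: string): boolean {
  return SUPPORTED_FAMILY_MARKERS.some((marker) => modelId.includes(marker));
}

// Decides whether cache points should be requested for this model call.
function shouldApplyCaching(
  modelId: string,
  messageCount: number,
  options: CachingOptions
): boolean {
  if (!options.enableCaching) return false;

  if (!isSupportedFamily(modelId)) {
    const reason = `Prompt caching is not supported for model "${modelId}"`;
    if (options.unsupportedModelBehavior === "raise") throw new Error(reason);
    if (options.unsupportedModelBehavior === "warn") console.warn(reason);
    return false; // "ignore" and "warn" both skip caching
  }

  // Caching starts once the conversation reaches the configured threshold.
  return messageCount >= options.minMessagesToCache;
}

// Usage with illustrative option values.
const options: CachingOptions = {
  enableCaching: true,
  minMessagesToCache: 3,
  unsupportedModelBehavior: "ignore",
};
const claudeDecision = shouldApplyCaching(
  "anthropic.claude-haiku-4-5-20251001-v1:0",
  5,
  options
);
const mistralDecision = shouldApplyCaching(
  "mistral.mistral-large-2402-v1:0",
  5,
  options
);
```

Whether the real message count includes the system prompt is not stated on this page, so the sketch takes the count as an input instead of deriving it.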

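Configuration options for the caching behavior. The options referenced on this page are enableCaching, ttl, minMessagesToCache, and unsupportedModelBehavior.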

// Example 1: Basic usage with default settings
import { createAgent, bedrockPromptCachingMiddleware } from "langchain";

const agent = createAgent({
  model: "bedrock:anthropic.claude-haiku-4-5-20251001-v1:0",
  middleware: [
    // No options: caching uses the middleware defaults
    bedrockPromptCachingMiddleware()
  ]
});

// Example 2: Custom configuration for longer conversations
import { createAgent, bedrockPromptCachingMiddleware } from "langchain";

const cachingMiddleware = bedrockPromptCachingMiddleware({
  ttl: "1h",  // Cache for 1 hour instead of the default 5 minutes
  minMessagesToCache: 5  // Only cache once the conversation has 5 messages
});

const agent = createAgent({
  model: "bedrock:anthropic.claude-haiku-4-5-20251001-v1:0",
  systemPrompt: "You are a helpful assistant with deep knowledge of...", // Long system prompt
  middleware: [cachingMiddleware]
});
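
// Example 3: Conditional caching based on runtime context
import { createAgent, bedrockPromptCachingMiddleware } from "langchain";
import { HumanMessage } from "@langchain/core/messages";

const agent = createAgent({
  model: "bedrock:anthropic.claude-haiku-4-5-20251001-v1:0",
  middleware: [
    bedrockPromptCachingMiddleware({
      enableCaching: true,
      ttl: "5m"
    })
  ]
});

// Disable caching for a specific request by overriding enableCaching at invoke time
await agent.invoke(
  { messages: [new HumanMessage("Process this without caching")] },
  {
    configurable: {
      middleware_context: { enableCaching: false }
    }
  }
);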
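
The per-request override in Example 3 implies a precedence order: values passed at invoke time win over the options given to the middleware, which in turn win over the defaults. A minimal sketch of that merge follows. The `resolveOptions` helper is hypothetical, and the default values for enableCaching and minMessagesToCache are assumptions; only the 5-minute default TTL is stated on this page.

```typescript
type Ttl = "5m" | "1h";

interface CachingOptions {
  enableCaching: boolean;
  ttl: Ttl;
  minMessagesToCache: number;
}

// Assumed defaults. Only ttl: "5m" is documented here; the other two values
// are placeholders for illustration.
const DEFAULTS: CachingOptions = {
  enableCaching: true,
  ttl: "5m",
  minMessagesToCache: 3,
};

// Later spreads win: runtime context overrides middleware options,
// which override the defaults.
function resolveOptions(
  middlewareOptions: Partial<CachingOptions> = {},
  runtimeContext: Partial<CachingOptions> = {}
): CachingOptions {
  return { ...DEFAULTS, ...middlewareOptions, ...runtimeContext };
}

// Usage mirroring Example 3: caching is enabled in the middleware options
// but switched off for one request.
const resolved = resolveOptions(
  { enableCaching: true, ttl: "5m" },
  { enableCaching: false }
);
```

Object spread keeps the merge shallow and predictable, which is enough here because every option is a primitive value.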

// Example 4: Optimal setup for customer support chatbot
import { createAgent, bedrockPromptCachingMiddleware } from "langchain";

// searchKnowledgeBase, createTicket, and checkOrderStatus are assumed to be
// tools defined elsewhere in the application.
const supportAgent = createAgent({
  model: "bedrock:anthropic.claude-haiku-4-5-20251001-v1:0",
  systemPrompt: `You are a customer support agent for ACME Corp.

    Company policies:
    - Always be polite and professional
    - Refer to knowledge base for product information
    - Escalate billing issues to human agents
    ... (extensive policies and guidelines)
  `,
  tools: [searchKnowledgeBase, createTicket, checkOrderStatus],
  middleware: [
    bedrockPromptCachingMiddleware({
      ttl: "1h",  // Long TTL for stable system prompt
      minMessagesToCache: 1  // Cache immediately due to large system prompt
    })
  ]
});