Chat models for conversational AI.
A type representing any defined Message or MessageChunk type.
Internal tracing/callback system identifier.
Get the value of the llm_cache global setting.
Return a dict representation of an object.
Return a JSON string representation of an object.
Convert a sequence of messages to a list of messages.
Check if the provided content block is a data content block.
Returns True for both v0 (old-style) and v1 (new-style) multimodal data blocks.
Convert a message chunk to a Message.
Convert an ImageContentBlock to the format expected by OpenAI Chat Completions.
Merge a list of ChatGenerationChunks into a single ChatGenerationChunk.
Ensure that a config is a dict with all keys present.
Run a function in an executor.
Convert a schema representation to a JSON schema.
Convert a tool-like object to an OpenAI tool schema.
Check if the given class is a subclass of Pydantic BaseModel.
Check if the given class is a subclass of any of the following: pydantic.BaseModel in Pydantic 2.x, or pydantic.v1.BaseModel in Pydantic 2.x.
Create a factory method that gets a value from an environment variable.
Generate from a stream.
Async generate from a stream.
Interface for a caching layer for LLMs and Chat models.
The cache interface consists of the following methods: lookup (look up a value based on a prompt and llm_string), update (update the cache based on a prompt and llm_string), and clear (clear the cache). In addition, the cache interface provides an async version of each method.
The default implementation of the async methods is to run the synchronous method in an executor. It's recommended to override the async methods and provide async implementations to avoid unnecessary overhead.
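For illustration, here is a minimal sketch of a custom synchronous cache, assuming the lookup/update/clear signatures described above (the SimpleDictCache class is hypothetical); the async variants are inherited and, by default, run these methods in an executor.
from collections.abc import Sequence

from langchain_core.caches import BaseCache
from langchain_core.outputs import Generation

class SimpleDictCache(BaseCache):
    """Toy in-memory cache keyed by (prompt, llm_string)."""

    def __init__(self) -> None:
        self._store: dict[tuple[str, str], Sequence[Generation]] = {}

    def lookup(self, prompt: str, llm_string: str) -> Sequence[Generation] | None:
        # Return the cached generations for this (prompt, llm_string) pair, if any.
        return self._store.get((prompt, llm_string))

    def update(self, prompt: str, llm_string: str, return_val: Sequence[Generation]) -> None:
        # Store the generations produced for this (prompt, llm_string) pair.
        self._store[(prompt, llm_string)] = return_val

    def clear(self, **kwargs) -> None:
        # Drop all cached entries.
        self._store.clear()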
Async callback manager that handles callbacks from LangChain.
Async callback manager for LLM run.
Callback manager for LangChain.
Callback manager for LLM run.
Abstract base class for interfacing with language models.
All language model wrappers inherit from BaseLanguageModel.
LangSmith parameters for tracing.
Model profile.
This is a beta feature. The format of model profiles is subject to change.
Provides information about chat model capabilities, such as context window sizes and supported features.
Message from an AI.
An AIMessage is returned from a chat model as a response to a prompt.
This message represents the output of the model and consists of both the raw output as returned by the model and standardized fields (e.g., tool calls, usage metadata) added by the LangChain framework.
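As a small illustration (the values are made up), the standardized fields can be read directly off an AIMessage:
from langchain_core.messages import AIMessage

msg = AIMessage(
    content="The answer is 42.",
    usage_metadata={"input_tokens": 10, "output_tokens": 7, "total_tokens": 17},
)
msg.content         # raw text output from the model
msg.usage_metadata  # standardized usage info added by LangChain
msg.tool_calls      # [] here, since no tool calls were made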
Message chunk from an AI (yielded when streaming).
Base abstract message class.
Messages are the inputs and outputs of a chat model.
Examples include HumanMessage,
AIMessage, and
SystemMessage.
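For example, a short conversation can be represented as a list of these message types (a minimal sketch):
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage

conversation = [
    SystemMessage(content="You are a concise assistant."),
    HumanMessage(content="What is LangChain?"),
    AIMessage(content="A framework for building applications with language models."),
]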
Parse tools from OpenAI response.
Parse tools from OpenAI response.
A single chat generation output.
A subclass of Generation that represents the response from a chat model that
generates chat messages.
The message attribute is a structured representation of the chat message. Most of
the time, the message will be of type AIMessage.
Users working with chat models will usually access information via either
AIMessage (returned from runnable interfaces) or LLMResult (available via
callbacks).
ChatGeneration chunk.
ChatGeneration chunks can be concatenated with other ChatGeneration chunks.
Use to represent the result of a chat model call with a single prompt.
This container is used internally by some implementations of chat models; it will eventually be mapped to a more general LLMResult object and then projected into an AIMessage object.
LangChain users working with chat models will usually access information via
AIMessage (returned from runnable interfaces) or LLMResult (available via
callbacks). Please refer to the AIMessage and LLMResult schema documentation for
more information.
A single text generation output.
Generation represents the response from an "old-fashioned" LLM (string-in, string-out) that generates regular text (not chat messages).
This model is used internally by chat models and will eventually be mapped to a more
general LLMResult object, and then projected into an AIMessage object.
LangChain users working with chat models will usually access information via
AIMessage (returned from runnable interfaces) or LLMResult (available via
callbacks). Please refer to AIMessage and LLMResult for more information.
A container for results of an LLM call.
Both chat models and LLMs generate an LLMResult object. This object contains the
generated outputs and any additional information that the model provider wants to
return.
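A hedged sketch of the LLMResult shape: one inner list of generations per input prompt, plus optional provider-specific output (values here are illustrative).
from langchain_core.messages import AIMessage
from langchain_core.outputs import ChatGeneration, LLMResult

result = LLMResult(
    generations=[[ChatGeneration(message=AIMessage(content="Hi there!"))]],
    llm_output={"model_name": "example-model"},  # provider-specific extras
)
result.generations[0][0].text  # 'Hi there!'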
Class that contains metadata for a single execution of a chain or model.
Defined for backwards compatibility with older versions of langchain_core.
Users can obtain the run_id from callbacks or from the astream_events API (depending on the use case).
Chat prompt value.
A type of a prompt value that is built from messages.
Base abstract class for inputs to any language model.
PromptValues can be converted to both LLM (pure text-generation) inputs and
chat model inputs.
String prompt value.
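For instance, a StringPromptValue can be viewed either way (a minimal sketch):
from langchain_core.prompt_values import StringPromptValue

value = StringPromptValue(text="Tell me a joke.")
value.to_string()    # 'Tell me a joke.'
value.to_messages()  # [HumanMessage(content='Tell me a joke.')]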
Base class for rate limiters.
Usage of the base limiter is through the acquire and aacquire methods depending on whether running in a sync or async context.
Implementations are free to add a timeout parameter to their initialize method to allow users to specify a timeout for acquiring the necessary tokens when using a blocking call.
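As an illustration, the built-in InMemoryRateLimiter follows this interface; the parameter values below are purely illustrative:
from langchain_core.rate_limiters import InMemoryRateLimiter

limiter = InMemoryRateLimiter(
    requests_per_second=0.5,    # allow one request every two seconds
    check_every_n_seconds=0.1,  # how often to check for available tokens
    max_bucket_size=1,          # maximum burst size
)

limiter.acquire()  # blocks until a token is available (sync context)
# In async code: await limiter.aacquire()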
Runnable to passthrough inputs unchanged or with additional keys.
This Runnable behaves almost like the identity function, except that it
can be configured to add additional keys to the output, if the input is a
dict.
The examples below demonstrate how this Runnable works using a few simple
chains. The chains rely on simple lambdas to make the examples easy to execute
and experiment with.
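A minimal sketch of both behaviors, passing input through unchanged and adding keys via RunnablePassthrough.assign:
from langchain_core.runnables import RunnableLambda, RunnablePassthrough

# Identity behavior: the input is returned unchanged
RunnablePassthrough().invoke("hello")  # 'hello'

# Dict input with an added key computed by another Runnable
chain = RunnablePassthrough.assign(doubled=RunnableLambda(lambda d: d["num"] * 2))
chain.invoke({"num": 3})  # {'num': 3, 'doubled': 6}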
A unit of work that can be invoked, batched, streamed, transformed and composed.
invoke/ainvoke: Transforms a single input into an output.
batch/abatch: Efficiently transforms multiple inputs into outputs.
stream/astream: Streams output from a single input as it's produced.
astream_log: Streams output and selected intermediate results from an input.
Built-in optimizations:
Batch: By default, batch runs invoke() in parallel using a thread pool executor. Override to optimize batching.
Async: Methods with 'a' prefix are asynchronous. By default, they execute
the sync counterpart using asyncio's thread pool.
Override for native async.
All methods accept an optional config argument, which can be used to configure execution, add tags and metadata for tracing and debugging etc.
Runnables expose schematic information about their input, output and config via
the input_schema property, the output_schema property and config_schema
method.
Runnable objects can be composed together to create chains in a declarative way.
Any chain constructed this way will automatically have sync, async, batch, and streaming support.
The main composition primitives are RunnableSequence and RunnableParallel.
RunnableSequence invokes a series of runnables sequentially, with
one Runnable's output serving as the next's input. Construct using
the | operator or by passing a list of runnables to RunnableSequence.
RunnableParallel invokes runnables concurrently, providing the same input
to each. Construct it using a dict literal within a sequence or by passing a
dict to RunnableParallel.
For example,
from langchain_core.runnables import RunnableLambda
# A RunnableSequence constructed using the `|` operator
sequence = RunnableLambda(lambda x: x + 1) | RunnableLambda(lambda x: x * 2)
sequence.invoke(1) # 4
sequence.batch([1, 2, 3]) # [4, 6, 8]
# A sequence that contains a RunnableParallel constructed using a dict literal
sequence = RunnableLambda(lambda x: x + 1) | {
    "mul_2": RunnableLambda(lambda x: x * 2),
    "mul_5": RunnableLambda(lambda x: x * 5),
}
sequence.invoke(1) # {'mul_2': 4, 'mul_5': 10}
All Runnables expose additional methods that can be used to modify their
behavior (e.g., add a retry policy, add lifecycle listeners, make them
configurable, etc.).
These methods will work on any Runnable, including Runnable chains
constructed by composing other Runnables.
See the individual methods for details.
For example,
from langchain_core.runnables import RunnableLambda
import random
def add_one(x: int) -> int:
    return x + 1

def buggy_double(y: int) -> int:
    """Buggy code that will fail 70% of the time"""
    if random.random() > 0.3:
        print('This code failed, and will probably be retried!')  # noqa: T201
        raise ValueError('Triggered buggy code')
    return y * 2

sequence = (
    RunnableLambda(add_one) |
    RunnableLambda(buggy_double).with_retry(  # Retry on failure
        stop_after_attempt=10,
        wait_exponential_jitter=False
    )
)
print(sequence.input_schema.model_json_schema()) # Show inferred input schema
print(sequence.output_schema.model_json_schema()) # Show inferred output schema
print(sequence.invoke(2)) # invoke the sequence (note the retry above!!)
As the chains get longer, it can be useful to be able to see intermediate results to debug and trace the chain.
You can set the global debug flag to True to enable debug output for all chains:
from langchain_core.globals import set_debug
set_debug(True)
Alternatively, you can pass existing or custom callbacks to any given chain:
from langchain_core.tracers import ConsoleCallbackHandler
chain.invoke(..., config={"callbacks": [ConsoleCallbackHandler()]})
For a UI (and much more), check out LangSmith.
Configuration for a Runnable.
The TypedDict has total=False set intentionally, so that partial configs can be provided and to support the config propagation pattern used by merge_configs and var_child_runnable_config (a ContextVar that automatically passes config down the call stack without explicit parameter passing), where configs are merged rather than replaced:
# Parent sets tags
chain.invoke(input, config={"tags": ["parent"]})
# Child automatically inherits and can add:
# ensure_config({"tags": ["child"]}) -> {"tags": ["parent", "child"]}
Base class for all LangChain tools.
This abstract class defines the interface that all LangChain tools must implement.
Tools are components that can be called by agents to perform specific actions.
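As a quick sketch, the @tool decorator is one common way to turn a plain function into such a tool (the multiply example is illustrative):
from langchain_core.tools import tool

@tool
def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

multiply.invoke({"a": 6, "b": 7})  # 42
multiply.name  # 'multiply'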
Base class for chat models.
Simplified implementation for a chat model to inherit from.
This implementation is primarily here for backwards compatibility. For new
implementations, please use BaseChatModel directly.
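A hedged, minimal sketch of subclassing BaseChatModel directly; the EchoChatModel class is hypothetical, and the signatures reflect BaseChatModel as understood here, so they may vary slightly between langchain_core versions.
from typing import Any, Optional

from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models import BaseChatModel
from langchain_core.messages import AIMessage, BaseMessage
from langchain_core.outputs import ChatGeneration, ChatResult


class EchoChatModel(BaseChatModel):
    """Toy chat model that echoes the last message back."""

    @property
    def _llm_type(self) -> str:
        return "echo-chat-model"

    def _generate(
        self,
        messages: list[BaseMessage],
        stop: Optional[list[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> ChatResult:
        # Echo the content of the last incoming message as the AI response.
        text = messages[-1].content if messages else ""
        message = AIMessage(content=f"Echo: {text}")
        return ChatResult(generations=[ChatGeneration(message=message)])


EchoChatModel().invoke("Hello")  # AIMessage(content='Echo: Hello')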
Standard, multimodal content blocks for Large Language Model I/O.
This module provides standardized data structures for representing inputs to and outputs
from LLMs. The core abstraction is the Content Block, a TypedDict.
Rationale
Different LLM providers use distinct and incompatible API schemas. This module provides a unified, provider-agnostic format to facilitate these interactions. A message to or from a model is simply a list of content blocks, allowing for the natural interleaving of text, images, and other content in a single ordered sequence.
An adapter for a specific provider is responsible for translating this standard list of blocks into the format required by its API.
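For instance, a hypothetical adapter helper might translate a standard image block into the shape OpenAI's Chat Completions API expects; the function name and mapping below are illustrative, not a LangChain API.
def image_block_to_openai(block: dict) -> dict:
    # Standard block: {"type": "image", "url": "...", "mime_type": "image/png"}
    # OpenAI Chat Completions expects an "image_url" content part.
    return {"type": "image_url", "image_url": {"url": block["url"]}}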
Extensibility
Data not yet mapped to a standard block may be represented using the
NonStandardContentBlock, which allows for provider-specific data to be included
without losing the benefits of type checking and validation.
Furthermore, provider-specific fields within a standard block are fully supported
by default in the extras field of each block. This allows for additional metadata
to be included without breaking the standard structure. For example, Google's thought
signature:
AIMessage(
    content=[
        {
            "type": "text",
            "text": "J'adore la programmation.",
            "extras": {"signature": "EpoWCpc..."},  # Thought signature
        }
    ],
    ...
)
Following widespread adoption of PEP 728, we
intend to add extra_items=Any as a param to Content Blocks. This will signify to
type checkers that additional provider-specific fields are allowed outside of the
extras field, and that will become the new standard approach to adding
provider-specific metadata.
Example with PEP 728 provider-specific fields:
# Content block definition
# NOTE: `extra_items=Any`
class TextContentBlock(TypedDict, extra_items=Any):
    type: Literal["text"]
    id: NotRequired[str]
    text: str
    annotations: NotRequired[list[Annotation]]
    index: NotRequired[int]

from langchain_core.messages.content import TextContentBlock

# Create a text content block with provider-specific fields
my_block: TextContentBlock = {
    # Add required fields
    "type": "text",
    "text": "Hello, world!",
    # Additional fields not specified in the TypedDict
    # These are valid with PEP 728 and are typed as Any
    "openai_metadata": {"model": "gpt-4", "temperature": 0.7},
    "anthropic_usage": {"input_tokens": 10, "output_tokens": 20},
    "custom_field": "any value",
}

# Mutating an existing block to add provider-specific fields
openai_data = my_block["openai_metadata"]  # Type: Any
Example Usage
# Direct construction
from langchain_core.messages.content import TextContentBlock, ImageContentBlock

multimodal_message = AIMessage(
    content_blocks=[
        TextContentBlock(type="text", text="What is shown in this image?"),
        ImageContentBlock(
            type="image",
            url="https://www.langchain.com/images/brand/langchain_logo_text_w_white.png",
            mime_type="image/png",
        ),
    ]
)
# Using factories
from langchain_core.messages.content import create_text_block, create_image_block

multimodal_message = AIMessage(
    content=[
        create_text_block("What is shown in this image?"),
        create_image_block(
            url="https://www.langchain.com/images/brand/langchain_logo_text_w_white.png",
            mime_type="image/png",
        ),
    ]
)
Factory functions offer benefits such as not having to manually specify the type field.
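A small illustration of that benefit, assuming the create_text_block factory referenced above:
from langchain_core.messages.content import create_text_block

block = create_text_block("Hello, world!")
block["type"]  # 'text', populated automatically by the factory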