Base class for chat models.
Key imperative methods:
Methods that actually call the underlying model.
This table provides a brief overview of the main imperative methods. Please see the base Runnable reference for full documentation.
| Method | Input | Output | Description |
|---|---|---|---|
| `invoke` | `str \| list[dict \| tuple \| BaseMessage] \| PromptValue` | `BaseMessage` | A single chat model call. |
| `ainvoke` | Same as `invoke` | `BaseMessage` | Defaults to running `invoke` in an async executor. |
| `stream` | Same as `invoke` | `Iterator[BaseMessageChunk]` | Defaults to yielding output of `invoke`. |
| `astream` | Same as `invoke` | `AsyncIterator[BaseMessageChunk]` | Defaults to yielding output of `ainvoke`. |
| `astream_events` | Same as `invoke` | `AsyncIterator[StreamEvent]` | Event types: `on_chat_model_start`, `on_chat_model_stream`, `on_chat_model_end`. |
| `batch` | `list` of `invoke` inputs | `list[BaseMessage]` | Defaults to running `invoke` in concurrent threads. |
| `abatch` | `list` of `invoke` inputs | `list[BaseMessage]` | Defaults to running `ainvoke` in concurrent threads. |
| `batch_as_completed` | `list` of `invoke` inputs | `Iterator[tuple[int, Union[BaseMessage, Exception]]]` | Defaults to running `invoke` in concurrent threads. |
| `abatch_as_completed` | `list` of `invoke` inputs | `AsyncIterator[tuple[int, Union[BaseMessage, Exception]]]` | Defaults to running `ainvoke` in concurrent threads. |
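As a quick illustration of the imperative surface above, here is a minimal sketch; `ChatOpenAI` and the model name are assumed stand-ins for any concrete chat model subclass (requires `langchain-openai`):

```python
from langchain_openai import ChatOpenAI  # assumed example provider

model = ChatOpenAI(model="gpt-4o-mini")  # model name is illustrative

# invoke: one call in, one BaseMessage out.
reply = model.invoke("Say hello in French.")
print(reply.content)

# stream: yields BaseMessageChunk objects as they arrive.
for chunk in model.stream("Count to five."):
    print(chunk.content, end="", flush=True)

# batch: runs invoke over several inputs in concurrent threads.
replies = model.batch(["Say hi.", "Say bye."])
```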
Key declarative methods:
Methods for creating another Runnable using the chat model.
This table provides a brief overview of the main declarative methods. Please see the reference for each method for full documentation.
| Method | Description |
|---|---|
| `bind_tools` | Create chat model that can call tools. |
| `with_structured_output` | Create wrapper that structures model output using schema. |
| `with_retry` | Create wrapper that retries model calls on failure. |
| `with_fallbacks` | Create wrapper that falls back to other models on failure. |
| `configurable_fields` | Specify init args of the model that can be configured at runtime via the `RunnableConfig`. |
| `configurable_alternatives` | Specify alternative models which can be swapped in at runtime via the `RunnableConfig`. |
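For example, a minimal sketch of composing these wrappers; the provider class, model names, and the `GetWeather` schema are illustrative assumptions, not part of the reference above:

```python
from langchain_openai import ChatOpenAI  # assumed example provider
from pydantic import BaseModel


class GetWeather(BaseModel):
    """Get the current weather for a location."""

    location: str


model = ChatOpenAI(model="gpt-4o-mini")
fallback_model = ChatOpenAI(model="gpt-4o")  # hypothetical second model

# bind_tools returns a new Runnable; the original model is unchanged.
model_with_tools = model.bind_tools([GetWeather])

# Declarative wrappers compose: retry up to 3 attempts, then fall back.
robust_model = model.with_retry(stop_after_attempt=3).with_fallbacks(
    [fallback_model]
)
```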
Creating a custom chat model:
Custom chat model implementations should inherit from this class. Please reference the table below for information about which methods and properties are required or optional for implementations.
| Method/Property | Description | Required |
|---|---|---|
| `_generate` | Use to generate a chat result from a prompt. | Required |
| `_llm_type` (property) | Used to uniquely identify the type of the model. Used for logging. | Required |
| `_identifying_params` (property) | Represent model parameterization for tracing purposes. | Optional |
| `_stream` | Use to implement streaming. | Optional |
| `_agenerate` | Use to implement a native async method. | Optional |
| `_astream` | Use to implement an async version of `_stream`. | Optional |
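A minimal sketch of a custom implementation covering the two required members; the echo behavior is purely illustrative:

```python
from typing import Any, Optional

from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models import BaseChatModel
from langchain_core.messages import AIMessage, BaseMessage
from langchain_core.outputs import ChatGeneration, ChatResult


class EchoChatModel(BaseChatModel):
    """Toy model that echoes the last message back; illustrative only."""

    def _generate(
        self,
        messages: list[BaseMessage],
        stop: Optional[list[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> ChatResult:
        # Wrap the reply in an AIMessage and a single ChatGeneration.
        text = messages[-1].content if messages else ""
        message = AIMessage(content=f"Echo: {text}")
        return ChatResult(generations=[ChatGeneration(message=message)])

    @property
    def _llm_type(self) -> str:
        return "echo-chat-model"
```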
rate_limiter: An optional rate limiter to use for limiting the number of requests.
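For instance, a sketch using the in-memory rate limiter shipped with `langchain-core`; the provider class and parameter values are illustrative:

```python
from langchain_core.rate_limiters import InMemoryRateLimiter
from langchain_openai import ChatOpenAI  # assumed example provider

# Roughly one request every 10 seconds, polling for an available
# token every 100 ms; values here are illustrative.
rate_limiter = InMemoryRateLimiter(
    requests_per_second=0.1,
    check_every_n_seconds=0.1,
    max_bucket_size=10,
)

model = ChatOpenAI(model="gpt-4o-mini", rate_limiter=rate_limiter)
```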
disable_streaming: Whether to disable streaming for this model.

If streaming is bypassed, then `stream`/`astream`/`astream_events` will defer to `invoke`/`ainvoke`.

- If `True`, will always bypass the streaming case.
- If `'tool_calling'`, will bypass the streaming case only when the model is called with a `tools` keyword argument. In other words, LangChain will automatically switch to non-streaming behavior (`invoke`) only when the `tools` argument is provided. This offers the best of both worlds.
- If `False` (default), will always use the streaming case if available.

The main reason for this flag is that code might be written using `stream`, and a user may want to swap out a given model for another model whose implementation does not properly support streaming.
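A one-line sketch of the `'tool_calling'` setting, with the provider class assumed as before:

```python
from langchain_openai import ChatOpenAI  # assumed example provider

# Stream normally, but transparently fall back to invoke() whenever
# the model is called with bound tools.
model = ChatOpenAI(model="gpt-4o-mini", disable_streaming="tool_calling")
```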
output_version: Version of AIMessage output format to store in message content.

AIMessage.content_blocks will lazily parse the contents of content into a standard format. This flag can be used to additionally store the standard format in message content, e.g., for serialization purposes.

Supported values:

- `'v0'`: provider-specific format in content (can lazily parse with `content_blocks`)
- `'v1'`: standardized format in content (consistent with `content_blocks`)

Partner packages (e.g., `langchain-openai`) can also use this field to roll out new content formats in a backward-compatible way.
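A short sketch of the difference, with the provider class assumed as before:

```python
from langchain_openai import ChatOpenAI  # assumed example provider

# 'v0' (default): provider-native content; standard blocks parsed lazily.
model_v0 = ChatOpenAI(model="gpt-4o-mini", output_version="v0")

# 'v1': the standardized format is stored directly in message content.
model_v1 = ChatOpenAI(model="gpt-4o-mini", output_version="v1")

# content_blocks yields the standard format in either case.
blocks = model_v1.invoke("Hello!").content_blocks
```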
profile: Profile detailing model capabilities.

This is a beta feature: the format of model profiles is subject to change. If not specified, the profile is loaded automatically from the provider package on initialization, if data is available. Example profile data includes context window sizes, supported modalities, and support for tool calling, structured output, and other features.
Get the output type for this Runnable.
Pass a sequence of prompts to the model and return model generations.

This method should make use of batched calls for models that expose a batched API.

Use this method when you want to:

1. take advantage of batched calls,
2. get more output from the model than just the top generated value,
3. build chains that are agnostic to the underlying language model type (e.g., pure text completion models vs chat models).
Asynchronously pass a sequence of prompts to a model and return generations.

This method should make use of batched calls for models that expose a batched API.

Use this method when you want to:

1. take advantage of batched calls,
2. get more output from the model than just the top generated value,
3. build chains that are agnostic to the underlying language model type (e.g., pure text completion models vs chat models).
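As an illustration, a minimal sketch of `generate` with two prompts; the provider class and model name are assumptions as before:

```python
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI  # assumed example provider

model = ChatOpenAI(model="gpt-4o-mini")

# generate() takes a list of message lists and returns an LLMResult
# with one list of candidate generations per input prompt.
result = model.generate(
    [
        [HumanMessage(content="Translate 'hello' to French.")],
        [HumanMessage(content="Translate 'hello' to Spanish.")],
    ]
)
for generations in result.generations:
    print(generations[0].text)
```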
Return a dictionary representation of the LLM.
Bind tools to the model.
Model wrapper that returns outputs formatted to match the given schema.
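For instance, a minimal sketch of `with_structured_output` using a Pydantic schema; the `Joke` schema and provider class are illustrative:

```python
from langchain_openai import ChatOpenAI  # assumed example provider
from pydantic import BaseModel, Field


class Joke(BaseModel):
    """A joke to tell the user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline of the joke")


model = ChatOpenAI(model="gpt-4o-mini")

# Returns a new Runnable whose output is a Joke instance, not a message.
structured_model = model.with_structured_output(Joke)
joke = structured_model.invoke("Tell me a joke about cats")
```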
- `cache`: Whether to cache the response.
- `verbose`: Whether to log the model's progress.
- `callbacks`: Callbacks for this call and any sub-calls (e.g. a Chain calling an LLM).
- `tags`: Optional list of tags associated with the model and its calls.
- `metadata`: Optional metadata associated with the model and its calls.
- `custom_get_token_ids`: Optional encoder to use for counting tokens.
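For instance, a sketch of plugging in a custom token encoder; `tiktoken`, the encoding name, and the provider class are assumptions:

```python
import tiktoken  # assumed tokenizer; any callable str -> list[int] works

from langchain_openai import ChatOpenAI  # assumed example provider

encoding = tiktoken.get_encoding("cl100k_base")  # encoding choice is illustrative


def get_token_ids(text: str) -> list[int]:
    return encoding.encode(text)


model = ChatOpenAI(model="gpt-4o-mini", custom_get_token_ids=get_token_ids)
num_tokens = model.get_num_tokens("How many tokens is this?")
```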
Return True as this class is serializable.
Get the namespace of the LangChain object.
Return a unique identifier for this class for serialization purposes.
Convert the graph to a JSON-serializable format.
Serialize a "not implemented" object.
Get a JSON schema that represents the input to the Runnable.
Get a JSON schema that represents the output of the Runnable.
The type of config this Runnable accepts, specified as a Pydantic model.
Get a JSON schema that represents the config of the Runnable.
Return a list of prompts used by this Runnable.
Pipe Runnable objects.
Pick keys from the output dict of this Runnable.
Merge the Dict input with the output produced by the mapping argument.
Run invoke in parallel on a list of inputs.
Run ainvoke in parallel on a list of inputs.
Stream all output from a Runnable, as reported to the callback system.
Generate a stream of events.
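A minimal sketch of consuming the event stream; the provider class and model name are assumptions as before:

```python
import asyncio

from langchain_openai import ChatOpenAI  # assumed example provider

model = ChatOpenAI(model="gpt-4o-mini")


async def main() -> None:
    # Each event is a dict with keys like "event" and "data".
    async for event in model.astream_events("Hello!", version="v2"):
        if event["event"] == "on_chat_model_stream":
            print(event["data"]["chunk"].content, end="")


asyncio.run(main())
```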
Bind arguments to a Runnable, returning a new Runnable.
Bind lifecycle listeners to a Runnable, returning a new Runnable.
Bind async lifecycle listeners to a Runnable.
Bind input and output types to a Runnable, returning a new Runnable.
Create a new Runnable that retries the original Runnable on exceptions.
Map a function to multiple iterables.
Add fallbacks to a Runnable, returning a new Runnable.
Create a BaseTool from a Runnable.