Chat model unit tests¶
ChatModelUnitTests
¶
Bases: ChatModelTests
Base class for chat model unit tests.
Test subclasses must implement the chat_model_class and
chat_model_params properties to specify what model to test and its
initialization parameters.
from typing import Type

from langchain_tests.unit_tests import ChatModelUnitTests

from my_package.chat_models import MyChatModel


class TestMyChatModelUnit(ChatModelUnitTests):
    @property
    def chat_model_class(self) -> Type[MyChatModel]:
        # Return the chat model class to test here
        return MyChatModel

    @property
    def chat_model_params(self) -> dict:
        # Return initialization parameters for the model.
        return {"model": "model-001", "temperature": 0}
Note
API references for individual test methods include troubleshooting tips.
Test subclasses must implement the following two properties:
chat_model_class: The chat model class to test, e.g., ChatParrotLink.
chat_model_params: Initialization parameters for the chat model.
In addition, test subclasses can control what features are tested (such as tool calling or multi-modality) by selectively overriding the following properties.
has_tool_calling
Boolean property indicating whether the chat model supports tool calling.
By default, this is determined by whether the chat model's bind_tools method
is overridden. It typically does not need to be overridden on the test class.
has_tool_choice
Boolean property indicating whether the chat model supports forcing tool
calling via a tool_choice parameter.
By default, this is determined by whether the parameter is included in the
signature for the corresponding bind_tools method.
If True, the minimum requirement for this feature is that
tool_choice='any' will force a tool call, and tool_choice=<tool name>
will force a call to a specific tool.
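The default signature check can be sketched as follows (bind_tools here is an illustrative stand-in, not the actual base implementation):

```python
import inspect


# A hypothetical bind_tools that accepts a tool_choice parameter:
def bind_tools(self, tools, *, tool_choice=None, **kwargs):
    ...


# has_tool_choice defaults to a check along these lines: whether
# "tool_choice" appears in the bind_tools signature.
has_tool_choice = "tool_choice" in inspect.signature(bind_tools).parameters
print(has_tool_choice)  # True
```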
has_structured_output
Boolean property indicating whether the chat model supports structured output.
By default, this is determined by whether the chat model overrides the
with_structured_output or bind_tools methods. If the base
implementations are intended to be used, this method should be overridden.
See docs for Structured output.
structured_output_kwargs
Dict property specifying additional kwargs to pass to
with_structured_output() when running structured output tests.
Override this to customize how your model generates structured output.
The most common use case is specifying the method parameter:
- 'function_calling': Uses tool/function calling to enforce the schema.
- 'json_mode': Uses the model's JSON mode.
- 'json_schema': Uses native JSON schema support (e.g., OpenAI's structured outputs).
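A minimal override of structured_output_kwargs might look like this (shown as a plain class so the snippet runs standalone; in real tests, subclass ChatModelUnitTests):

```python
class TestMyChatModelUnit:  # in real tests: subclass ChatModelUnitTests
    @property
    def structured_output_kwargs(self) -> dict:
        # Run structured output tests via the model's native JSON
        # schema support instead of the default method.
        return {"method": "json_schema"}
```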
supports_json_mode
Boolean property indicating whether the chat model supports
method='json_mode' in with_structured_output.
JSON mode constrains the model to output valid JSON without enforcing
a specific schema (unlike 'function_calling' or 'json_schema' methods).
When using JSON mode, you must prompt the model to output JSON in your message.
Example:
structured_llm = llm.with_structured_output(MySchema, method="json_mode")
structured_llm.invoke("... Return the result as JSON.")
See docs for Structured output.
Defaults to False.
supports_image_inputs
Boolean property indicating whether the chat model supports image inputs.
Defaults to False.
If set to True, the chat model will be tested using the LangChain
ImageContentBlock format:
{
    "type": "image",
    "base64": "<base64 image data>",
    "mime_type": "image/jpeg",  # or appropriate MIME type
}
The model is also tested with OpenAI Chat Completions image_url blocks.
See docs for Multimodality.
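As a sketch, a message payload combining both block styles might look like this (plain dicts with a placeholder for the image data; in real tests these blocks are carried in a HumanMessage):

```python
image_b64 = "<base64 image data>"  # placeholder, not real data

message_content = [
    {"type": "text", "text": "Describe this image."},
    # LangChain ImageContentBlock format:
    {"type": "image", "base64": image_b64, "mime_type": "image/jpeg"},
    # OpenAI Chat Completions image_url format:
    {
        "type": "image_url",
        "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
    },
]
```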
supports_image_urls
Boolean property indicating whether the chat model supports image inputs from URLs.
Defaults to False.
If set to True, the chat model will be tested using content blocks that
reference image URLs.
See docs for Multimodality.
supports_image_tool_message
Boolean property indicating whether the chat model supports a ToolMessage
that includes image content, e.g. in the OpenAI Chat Completions format.
Defaults to False.
ToolMessage(
    content=[
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
        },
    ],
    tool_call_id="1",
    name="random_image",
)
(OpenAI Chat Completions format), as well as LangChain's ImageContentBlock
format:
ToolMessage(
    content=[
        {
            "type": "image",
            "base64": image_data,
            "mime_type": "image/jpeg",
        },
    ],
    tool_call_id="1",
    name="random_image",
)
(standard format).
If set to True, the chat model will be tested with message sequences that
include ToolMessage objects of this form.
supports_pdf_inputs
Boolean property indicating whether the chat model supports PDF inputs.
Defaults to False.
If set to True, the chat model will be tested using the LangChain
FileContentBlock format:
{
    "type": "file",
    "base64": "<base64 PDF data>",
    "mime_type": "application/pdf",
}
See docs for Multimodality.
supports_pdf_tool_message
Boolean property indicating whether the chat model supports a ToolMessage
that includes PDF content using the LangChain FileContentBlock format.
Defaults to False.
ToolMessage(
    content=[
        {
            "type": "file",
            "base64": pdf_data,
            "mime_type": "application/pdf",
        },
    ],
    tool_call_id="1",
    name="random_pdf",
)
(LangChain's FileContentBlock format).
If set to True, the chat model will be tested with message sequences that
include ToolMessage objects of this form.
supports_audio_inputs
Boolean property indicating whether the chat model supports audio inputs.
Defaults to False.
If set to True, the chat model will be tested using the LangChain
AudioContentBlock format:
{
    "type": "audio",
    "base64": "<base64 audio data>",
    "mime_type": "audio/wav",  # or appropriate MIME type
}
See docs for Multimodality.
Warning
This test downloads audio data from wikimedia.org. You may need to set the
LANGCHAIN_TESTS_USER_AGENT environment variable to identify these tests,
e.g.,
export LANGCHAIN_TESTS_USER_AGENT="CoolBot/0.0 (https://example.org/coolbot/; coolbot@example.org) generic-library/0.0"
Refer to the Wikimedia Foundation User-Agent Policy.
supports_video_inputs
Boolean property indicating whether the chat model supports video inputs.
Defaults to False.
No current tests are written for this feature.
returns_usage_metadata
Boolean property indicating whether the chat model returns usage metadata on invoke and streaming responses.
Defaults to True.
usage_metadata is an optional dict attribute on AIMessage objects that tracks
input and output tokens.
Models supporting usage_metadata should also return the name of the
underlying model in the response_metadata of the AIMessage.
supports_anthropic_inputs
Boolean property indicating whether the chat model supports Anthropic-style inputs.
These inputs might feature "tool use" and "tool result" content blocks, e.g.,
[
    {"type": "text", "text": "Hmm let me think about that"},
    {
        "type": "tool_use",
        "input": {"fav_color": "green"},
        "id": "foo",
        "name": "color_picker",
    },
]
If set to True, the chat model will be tested using content blocks of this
form.
supported_usage_metadata_details
Property controlling what usage metadata details are emitted in both invoke
and stream.
usage_metadata is an optional dict attribute on AIMessage objects that tracks
input and output tokens.
It includes optional keys input_token_details and output_token_details
that can track usage details associated with special types of tokens, such as
cached, audio, or reasoning.
Only needs to be overridden if these details are supplied.
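An override declaring which token details the model reports might look like this (values are illustrative; shown as a plain class so the snippet runs standalone, in real tests subclass ChatModelUnitTests):

```python
class TestMyChatModelUnit:  # in real tests: subclass ChatModelUnitTests
    @property
    def supported_usage_metadata_details(self) -> dict:
        # Invoke responses report cached-token and reasoning details;
        # streaming responses report none.
        return {
            "invoke": ["cache_read_input", "reasoning_output"],
            "stream": [],
        }
```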
supports_model_override
Boolean property indicating whether the chat model supports overriding the model name at runtime via kwargs.
If True, the model accepts a model kwarg in invoke(), stream(), etc.
that overrides the model specified at initialization. This enables dynamic
model selection without creating new chat model instances.
Defaults to False.
model_override_value
Alternative model name to use when testing model override.
Should return a valid model name that differs from the default model.
Required if supports_model_override is True.
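Enabling the override tests might look like this ("model-002" is a hypothetical alternative model name; shown as a plain class so the snippet runs standalone, in real tests subclass ChatModelUnitTests):

```python
class TestMyChatModelUnit:  # in real tests: subclass ChatModelUnitTests
    @property
    def supports_model_override(self) -> bool:
        return True

    @property
    def model_override_value(self) -> str:
        # Must differ from the default model configured in
        # chat_model_params.
        return "model-002"
```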
enable_vcr_tests
Property controlling whether to enable select tests that rely on VCR caching of HTTP calls, such as benchmarking tests.
To enable these tests, follow these steps:
- Override the enable_vcr_tests property to return True.
- Configure VCR to exclude sensitive headers and other information from cassettes.

Warning
VCR will by default record authentication headers and other sensitive information in cassettes. Read below for how to configure what information is recorded in cassettes.

To add configuration to VCR, add a conftest.py file to the tests/ directory and implement the vcr_config fixture there.

langchain-tests excludes the headers 'authorization', 'x-api-key', and 'api-key' from VCR cassettes. To pick up this configuration, you will need to add conftest.py as shown below. You can also exclude additional headers, override the default exclusions, or apply other customizations to the VCR configuration. See example below:

tests/conftest.py
import pytest
from langchain_tests.conftest import (
    _base_vcr_config as _base_vcr_config,
)

_EXTRA_HEADERS = [
    # Specify additional headers to redact
    ("user-agent", "PLACEHOLDER"),
]


def remove_response_headers(response: dict) -> dict:
    # If desired, remove or modify headers in the response.
    response["headers"] = {}
    return response


@pytest.fixture(scope="session")
def vcr_config(_base_vcr_config: dict) -> dict:  # noqa: F811
    """Extend the default configuration from langchain_tests."""
    config = _base_vcr_config.copy()
    config.setdefault("filter_headers", []).extend(_EXTRA_HEADERS)
    config["before_record_response"] = remove_response_headers
    return config

Compressing cassettes

langchain-tests includes a custom VCR serializer that compresses cassettes using gzip. To use it, register the yaml.gz serializer to your VCR fixture and enable this serializer in the config. See example below:

tests/conftest.py
import pytest
from langchain_tests.conftest import (
    CustomPersister,
    CustomSerializer,
)
from langchain_tests.conftest import (
    _base_vcr_config as _base_vcr_config,
)
from vcr import VCR

_EXTRA_HEADERS = [
    # Specify additional headers to redact
    ("user-agent", "PLACEHOLDER"),
]


def remove_response_headers(response: dict) -> dict:
    # If desired, remove or modify headers in the response.
    response["headers"] = {}
    return response


@pytest.fixture(scope="session")
def vcr_config(_base_vcr_config: dict) -> dict:  # noqa: F811
    """Extend the default configuration from langchain_tests."""
    config = _base_vcr_config.copy()
    config.setdefault("filter_headers", []).extend(_EXTRA_HEADERS)
    config["before_record_response"] = remove_response_headers
    # New: enable serializer and set file extension
    config["serializer"] = "yaml.gz"
    config["path_transformer"] = VCR.ensure_suffix(".yaml.gz")
    return config


def pytest_recording_configure(config: dict, vcr: VCR) -> None:
    vcr.register_persister(CustomPersister())
    vcr.register_serializer("yaml.gz", CustomSerializer())

You can inspect the contents of the compressed cassettes (e.g., to ensure no sensitive information is recorded) by decompressing them, or by using the serializer.
- Run tests to generate VCR cassettes.

Example:
uv run python -m pytest tests/integration_tests/test_chat_models.py::TestMyModel::test_stream_time

This will generate a VCR cassette for the test in tests/integration_tests/cassettes/.

Warning
You should inspect the generated cassette to ensure that it does not contain sensitive information. If it does, you can modify the vcr_config fixture to exclude headers or modify the response before it is recorded.

You can then commit the cassette to your repository. Subsequent test runs will use the cassette instead of making HTTP calls.
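The property override from the first step above can be sketched as follows (shown as a plain class so the snippet runs standalone; in real tests, subclass ChatModelUnitTests):

```python
class TestMyChatModelUnit:  # in real tests: subclass ChatModelUnitTests
    @property
    def enable_vcr_tests(self) -> bool:
        # Opt in to the VCR-backed tests.
        return True
```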
Testing initialization from environment variables
Some unit tests may require testing initialization from environment variables.
These tests can be enabled by overriding the init_from_env_params
property (see below).
init_from_env_params
This property is used in unit tests to test initialization from environment variables. It should return a tuple of three dictionaries that specify the environment variables, additional initialization args, and expected instance attributes to check.
Defaults to empty dicts. If not overridden, the test is skipped.
Example:
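A hypothetical provider whose client reads a MY_API_KEY environment variable might specify the following (names and values are illustrative; shown as a plain class so the snippet runs standalone, in real tests subclass ChatModelUnitTests):

```python
class TestMyChatModelUnit:  # in real tests: subclass ChatModelUnitTests
    @property
    def init_from_env_params(self) -> tuple[dict, dict, dict]:
        return (
            # Environment variables to set before initialization:
            {"MY_API_KEY": "api_key"},
            # Additional initialization args:
            {"model": "model-001"},
            # Expected attributes on the resulting instance:
            {"my_api_key": "api_key"},
        )
```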
| METHOD | DESCRIPTION |
|---|---|
| test_no_overrides_DO_NOT_OVERRIDE | Test that no standard tests are overridden. |
| model | Model fixture. |
| my_adder_tool | Adder tool fixture. |
| test_init | Test model initialization. This should pass for all integrations. |
| test_init_from_env | Test initialization from environment variables. |
| test_init_streaming | Test that model can be initialized with streaming=True. |
| test_bind_tool_pydantic | Test bind tools with Pydantic models. |
| test_with_structured_output | Test with_structured_output method. |
| test_standard_params | Test that model properly generates standard parameters. |
| test_serdes | Test serialization and deserialization of the model. |
| test_init_time | Test initialization time of the chat model. |
chat_model_class
abstractmethod
property
¶
chat_model_class: type[BaseChatModel]
The chat model class to test, e.g., ChatParrotLink.
has_structured_output
property
¶
has_structured_output: bool
Whether the chat model supports structured output.
structured_output_kwargs
property
¶
structured_output_kwargs: dict
Additional kwargs to pass to with_structured_output() in tests.
Override this property to customize how structured output is generated
for your model. The most common use case is specifying the method
parameter, which controls the mechanism used to enforce structured output:
- 'function_calling': Uses tool/function calling to enforce the schema.
- 'json_mode': Uses the model's JSON mode.
- 'json_schema': Uses native JSON schema support (e.g., OpenAI's structured outputs).
| RETURNS | DESCRIPTION |
|---|---|
| dict | A dict of kwargs passed to with_structured_output. |
supports_image_inputs
property
¶
supports_image_inputs: bool
Supports image inputs.
Whether the chat model supports image inputs, defaults to
False.
supports_image_urls
property
¶
supports_image_urls: bool
Supports image inputs from URLs.
Whether the chat model supports image inputs from URLs, defaults to
False.
supports_pdf_inputs
property
¶
supports_pdf_inputs: bool
Whether the chat model supports PDF inputs, defaults to False.
supports_audio_inputs
property
¶
supports_audio_inputs: bool
Supports audio inputs.
Whether the chat model supports audio inputs, defaults to False.
supports_video_inputs
property
¶
supports_video_inputs: bool
Supports video inputs.
Whether the chat model supports video inputs, defaults to False.
No current tests are written for this feature.
returns_usage_metadata
property
¶
returns_usage_metadata: bool
Returns usage metadata.
Whether the chat model returns usage metadata on invoke and streaming responses.
supports_anthropic_inputs
property
¶
supports_anthropic_inputs: bool
Whether the chat model supports Anthropic-style inputs.
supports_image_tool_message
property
¶
supports_image_tool_message: bool
Supports image ToolMessage objects.
Whether the chat model supports ToolMessage objects that include image
content.
supports_pdf_tool_message
property
¶
supports_pdf_tool_message: bool
Supports PDF ToolMessage objects.
Whether the chat model supports ToolMessage objects that include PDF
content.
enable_vcr_tests
property
¶
enable_vcr_tests: bool
Whether to enable VCR tests for the chat model.
Warning
See the enable_vcr_tests dropdown in ChatModelTests above for more
information.
supported_usage_metadata_details
property
¶
supported_usage_metadata_details: dict[
    Literal["invoke", "stream"],
    list[
        Literal[
            "audio_input",
            "audio_output",
            "reasoning_output",
            "cache_read_input",
            "cache_creation_input",
        ]
    ],
]
Supported usage metadata details.
What usage metadata details are emitted in invoke and stream. Only needs to be overridden if these details are returned by the model.
supports_model_override
property
¶
supports_model_override: bool
Whether the model supports overriding the model name at runtime.
Defaults to False.
If True, the model accepts a model kwarg in invoke(), stream(),
etc. that overrides the model specified at initialization.
This enables dynamic model selection without creating new instances.
model_override_value
property
¶
model_override_value: str | None
Alternative model name to use when testing model override.
Should return a valid model name that differs from the default model.
Required if supports_model_override is True.
standard_chat_model_params
property
¶
standard_chat_model_params: dict
Standard chat model parameters.
init_from_env_params
property
¶
init_from_env_params: tuple[dict, dict, dict]
Init from env params.
Environment variables, additional initialization args, and expected instance attributes for testing initialization from environment variables.
test_no_overrides_DO_NOT_OVERRIDE
¶
Test that no standard tests are overridden.
test_init
¶
Test model initialization. This should pass for all integrations.
Troubleshooting
If this test fails, ensure that:
- chat_model_params is specified and the model can be initialized from those params;
- The model accommodates standard parameters.
test_init_from_env
¶
Test initialization from environment variables.
Relies on the init_from_env_params property. Test is skipped if that
property is not set.
Troubleshooting
If this test fails, ensure that init_from_env_params is specified
correctly and that model parameters are properly set from environment
variables during initialization.
test_init_streaming
¶
Test that model can be initialized with streaming=True.
This is for backward-compatibility purposes.
Troubleshooting
If this test fails, ensure that the model can be initialized with a
boolean streaming parameter.
test_bind_tool_pydantic
¶
test_bind_tool_pydantic(model: BaseChatModel, my_adder_tool: BaseTool) -> None
Test bind tools with Pydantic models.
Test that chat model correctly handles Pydantic models that are passed
into bind_tools. Test is skipped if the has_tool_calling property
on the test class is False.
Troubleshooting
If this test fails, ensure that the model's bind_tools method
properly handles Pydantic V2 models.
langchain_core implements a utility function that will accommodate most
formats. See example implementation of with_structured_output.
test_with_structured_output
¶
test_with_structured_output(model: BaseChatModel, schema: Any) -> None
Test with_structured_output method.
Test is skipped if the has_structured_output property on the test class is
False.
Troubleshooting
If this test fails, ensure that the model's bind_tools method
properly handles Pydantic V2 models.
langchain_core implements a utility function that will accommodate most
formats. See example implementation of with_structured_output.
test_standard_params
¶
test_standard_params(model: BaseChatModel) -> None
Test that model properly generates standard parameters.
These are used for tracing purposes.
Troubleshooting
If this test fails, check that the model accommodates standard parameters.
Check also that the model class is named according to convention
(e.g., ChatProviderName).
test_serdes
¶
test_serdes(model: BaseChatModel, snapshot: SnapshotAssertion) -> None
Test serialization and deserialization of the model.
Test is skipped if the is_lc_serializable property on the chat model class
is not overwritten to return True.
Troubleshooting
If this test fails, check that the init_from_env_params property is
correctly set on the test class.
test_init_time
¶
Test initialization time of the chat model.
Troubleshooting
If this test fails, check that we are not introducing undue overhead in the model's initialization.