LangChain Reference
    langchain-core › language_models › chat_model_stream
    Module · Since v1.3

    chat_model_stream

    function
    finalize_tool_call_chunk

    Parse accumulated tool-chunk args into a finalized block.

    Shared between the compat bridge's _finalize_block and the ChatModelStream end-of-stream sweep. Parses raw_args as JSON: on success builds the requested finalized type (tool_call or server_tool_call) with provider-specific fields (extras) preserved; on failure falls back to invalid_tool_call carrying the raw string so downstream consumers can still introspect the malformed payload.
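The parse-or-fall-back behavior can be sketched in a few lines. The function name, argument names, and block shapes below are illustrative, not the library's actual signature:

```python
import json
from typing import Any


def finalize_tool_call_chunk_sketch(
    name: str, raw_args: str, call_id: str, block_type: str = "tool_call"
) -> dict[str, Any]:
    """Parse accumulated tool-chunk args into a finalized block (sketch)."""
    try:
        args = json.loads(raw_args) if raw_args else {}
    except json.JSONDecodeError:
        # Malformed JSON: fall back to invalid_tool_call, keeping the raw
        # string so downstream consumers can still inspect the payload.
        return {"type": "invalid_tool_call", "name": name, "args": raw_args, "id": call_id}
    return {"type": block_type, "name": name, "args": args, "id": call_id}
```

The same fallback shape works for `server_tool_call` by passing a different `block_type`.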

    class
    AIMessage

    Message from an AI.

    An AIMessage is returned from a chat model as a response to a prompt.

    This message represents the output of the model and consists of both the raw output as returned by the model and standardized fields (e.g., tool calls, usage metadata) added by the LangChain framework.

    class
    SyncProjection

    Sync iterable of deltas with pull-based backpressure.

    Follows the same _request_more convention as langgraph's EventLog: when the cursor catches up to the buffer and the projection is not done, it calls _request_more() to pull more events from the producer.

    Each call to __iter__ creates a new cursor at position 0. Multiple iterators replay all deltas from the start.
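The pull-based convention above can be sketched as follows; the class and hook names are illustrative stand-ins, not the real implementation:

```python
from typing import Callable, Iterator, Optional


class SyncProjectionSketch:
    """Minimal sketch of a pull-based delta projection (names illustrative)."""

    def __init__(self) -> None:
        self._buffer: list[str] = []
        self._done = False
        # Set by the producer; called when a cursor needs more events.
        self._request_more: Optional[Callable[[], None]] = None

    def push(self, delta: str) -> None:
        self._buffer.append(delta)

    def finish(self) -> None:
        self._done = True

    def __iter__(self) -> Iterator[str]:
        # Every __iter__ starts a fresh cursor at 0, so each iterator
        # replays all deltas from the beginning.
        pos = 0
        while True:
            while pos < len(self._buffer):
                yield self._buffer[pos]
                pos += 1
            if self._done:
                return
            # Cursor caught up and the projection isn't done: pull more.
            assert self._request_more is not None
            self._request_more()


# Wire a toy producer to the pull hook.
proj = SyncProjectionSketch()
source = iter(["Hel", "lo", "!"])


def pull() -> None:
    try:
        proj.push(next(source))
    except StopIteration:
        proj.finish()


proj._request_more = pull
```

Iterating `proj` twice yields the same deltas both times, since the second cursor replays the already-filled buffer without pulling.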

    class
    SyncTextProjection

    String-specialized sync projection.

    Adds __str__, __bool__, __repr__ for ergonomic use with .text and .reasoning projections.

    class
    AsyncProjection

    Async iterable of deltas that is also awaitable for the final value.

    Uses an asyncio.Event to notify consumers of state changes. Each waiter — the awaitable (__await__) and each async iterator cursor — shares the event and re-checks its own condition on wake. The event is cleared before a waiter awaits, so stale "something happened" signals don't cause spin loops.

    This is single-loop only — producers and consumers must share an event loop. If cross-thread wake is ever required, revert to a list-of-futures pattern with call_soon_threadsafe.
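The shared-`Event` wake pattern can be sketched like this for the awaitable path (async iterator cursors would run the same clear-then-wait loop). All names here are illustrative:

```python
import asyncio


class AsyncProjectionSketch:
    """Sketch of the shared-Event wake pattern (names illustrative)."""

    def __init__(self) -> None:
        self._buffer: list[str] = []
        self._done = False
        self._event = asyncio.Event()

    def push(self, delta: str) -> None:
        self._buffer.append(delta)
        self._event.set()  # wake every waiter; each re-checks its own condition

    def finish(self) -> None:
        self._done = True
        self._event.set()

    def __await__(self):
        async def _final() -> str:
            while not self._done:
                # Clear before waiting so a stale "something happened"
                # signal can't cause a spin loop.
                self._event.clear()
                await self._event.wait()
            return "".join(self._buffer)

        return _final().__await__()


async def _demo() -> str:
    proj = AsyncProjectionSketch()

    async def produce() -> None:
        for delta in ["Hel", "lo"]:
            proj.push(delta)
            await asyncio.sleep(0)  # yield to the consumer
        proj.finish()

    producer = asyncio.create_task(produce())
    final = await proj  # resolves once finish() runs
    await producer
    return final


result = asyncio.run(_demo())
```

Note that `clear()` followed by `wait()` is race-free here only because producer and consumer share one event loop, which is exactly the single-loop restriction stated above.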

    class
    ChatModelStream

    Synchronous per-message streaming object for a single LLM response.

    Returned by BaseChatModel.stream_v2(). Content-block protocol events are fed into this object and accumulated into typed projections.

    Projections (always return the same cached object):

    • .text — iterable of str deltas; str() for full text
    • .reasoning — same as .text for reasoning content
    • .tool_calls — iterable of ToolCallChunk deltas; .get() returns list[ToolCall]
    • .output — blocking property, returns assembled AIMessage

    Usage info is available on .output.usage_metadata once the stream has finished.

    Output shape is always v1 content blocks

    .output.content is always a list of v1 protocol blocks (text, reasoning, tool_call, image, …), regardless of the underlying model's output_version setting. That attribute only controls the legacy stream() / astream() / invoke() paths; ChatModelStream is built on the content-block protocol and emits v1 shapes by construction.

    Raw event iteration::

    for event in stream:
        print(event)  # MessagesData dicts
    
    class
    AsyncChatModelStream

    Asynchronous per-message streaming object for a single LLM response.

    Returned by BaseChatModel.astream_v2(). Content-block events are fed into this object by a background producer task.

    Projections:

    • .text — async iterable of text deltas; awaitable for full text
    • .reasoning — async iterable of reasoning deltas; awaitable
    • .tool_calls — async iterable of ToolCallChunk deltas; awaitable for list[ToolCall]
    • .output — awaitable for assembled AIMessage

    Usage info is available on .output.usage_metadata once the stream has finished.

    Output shape is always v1 content blocks

    The assembled message's content is always a list of v1 protocol blocks, regardless of the model's output_version setting — see ChatModelStream for the full rationale.

    The stream itself is awaitable (msg = await stream) and async-iterable (async for event in stream).
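The dual awaitable/async-iterable shape can be demonstrated with a toy stand-in; `FakeAsyncStream` is illustrative only, since real streams come from `BaseChatModel.astream_v2()`:

```python
import asyncio


class FakeAsyncStream:
    """Illustrative stand-in for AsyncChatModelStream (not the real class)."""

    def __init__(self, events: list[dict]) -> None:
        self._events = events

    def __aiter__(self):
        async def _gen():
            for event in self._events:
                await asyncio.sleep(0)  # simulate events arriving over time
                yield event

        return _gen()

    def __await__(self):
        async def _final() -> str:
            # Assemble the final value from text events.
            return "".join(e["text"] for e in self._events if e["type"] == "text")

        return _final().__await__()


async def _demo():
    stream = FakeAsyncStream(
        [{"type": "text", "text": "Hel"}, {"type": "text", "text": "lo"}]
    )
    events = [e async for e in stream]  # async-iterable: raw events
    final = await stream                # awaitable: assembled final value
    return events, final


events, final = asyncio.run(_demo())
```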

    Per-message streaming objects for content-block protocol events.

    ChatModelStream is the synchronous variant returned by BaseChatModel.stream_v2(). AsyncChatModelStream is the asynchronous variant returned by BaseChatModel.astream_v2().

    Both expose typed projection properties (.text, .reasoning, .tool_calls, .usage, .output) that accumulate protocol events as they arrive. Projections can be iterated for deltas or drained for the final accumulated value.

    Raw protocol events are also available via direct iteration on the stream object (replay-buffer semantics — multiple independent consumers supported).