_ChatModelStreamBase
Asynchronous per-message streaming object for a single LLM response.
Returned by BaseChatModel.astream_v2(). Content-block events
are fed into this object by a background producer task.
Projections:
.text: async iterable of text deltas; awaitable for full text
.reasoning: async iterable of reasoning deltas; awaitable
.tool_calls: async iterable of ToolCallChunk deltas; awaitable for list[ToolCall]
.output: awaitable for assembled AIMessage
Usage info is available on .output.usage_metadata once the stream
has finished.
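The dual interface of a projection (iterate for deltas, or await for the final value) can be pictured with a small stand-in. This is a sketch only: `TextProjection` and `demo` are hypothetical names invented for illustration, the deltas are hard-coded, and nothing here reflects how the library implements its projections.

```python
import asyncio


class TextProjection:
    """Hypothetical stand-in: async-iterable of deltas, awaitable for full text."""

    def __init__(self, deltas):
        self._deltas = list(deltas)

    def __aiter__(self):
        # Each iteration replays the recorded deltas from the start.
        return self._iterate()

    async def _iterate(self):
        for delta in self._deltas:
            await asyncio.sleep(0)  # yield control, as a real stream would
            yield delta

    def __await__(self):
        # Awaiting the projection resolves to the concatenated text.
        return self._collect().__await__()

    async def _collect(self):
        return "".join([delta async for delta in self])


async def demo():
    projection = TextProjection(["Hel", "lo"])
    seen = [delta async for delta in projection]  # incremental view
    full = await projection                       # assembled view
    return seen, full


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Supporting both `__aiter__` and `__await__` on one object lets a caller choose between incremental and final results without a separate API for each.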
The assembled message's content is always a list of v1
protocol blocks, regardless of the model's output_version
setting — see ChatModelStream for the full rationale.
The stream itself is awaitable (msg = await stream) and
async-iterable (async for event in stream).
Text content: async iterable of deltas; awaitable for the full text.
Reasoning content: same interface as .text.
Tool calls: async iterable of deltas; awaitable for the finalized list.
Assembled AIMessage: awaitable.
Fan the async pump callback out to every projection.
Used by langgraph's AsyncGraphRunStream._wire_arequest_more so
cursors on stream.text, stream.reasoning, etc. can drive the
shared graph pump when their buffer is empty.
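One way to picture the fan-out is a pull-driven toy in which every projection holds the same pump callback and calls it whenever its own buffer runs dry. Everything below is an assumption made for illustration: `Projection`, `Stream`, `set_arequest_more`, and the `pump` closure are hypothetical names, and the real langgraph wiring is not shown.

```python
import asyncio
from collections import deque


class Projection:
    """Hypothetical projection with a buffer and an optional pump callback."""

    def __init__(self):
        self.buffer = deque()
        self.finished = False
        self._arequest_more = None

    def set_arequest_more(self, callback):
        self._arequest_more = callback

    def __aiter__(self):
        return self._iterate()

    async def _iterate(self):
        while True:
            if self.buffer:
                yield self.buffer.popleft()
            elif self.finished:
                return
            elif self._arequest_more is None:
                raise RuntimeError("buffer empty and no pump installed")
            else:
                # Buffer is empty: ask the shared pump for more events.
                await self._arequest_more()


class Stream:
    """Hypothetical stream that fans one callback out to all projections."""

    def __init__(self):
        self.text = Projection()
        self.reasoning = Projection()

    def set_arequest_more(self, callback):
        for projection in (self.text, self.reasoning):
            projection.set_arequest_more(callback)


async def demo():
    stream = Stream()
    events = deque([("reasoning", "think"), ("text", "Hi"), ("text", "!")])

    async def pump():
        # Deliver one event per call; mark everything finished when drained.
        await asyncio.sleep(0)
        if not events:
            stream.text.finished = stream.reasoning.finished = True
            return
        kind, delta = events.popleft()
        getattr(stream, kind).buffer.append(delta)

    stream.set_arequest_more(pump)
    text = [delta async for delta in stream.text]
    reasoning = [delta async for delta in stream.reasoning]
    return text, reasoning


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Iterating only `stream.text` still advances the shared pump, so the reasoning delta is already buffered by the time its projection is read.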
Install a lazy-start callback on this stream and its projections.
Cancel the background producer task and release resources.
If a consumer cancels mid-stream or decides to stop iterating
early, the producer task keeps pumping the provider HTTP call to
completion because asyncio.Task has no implicit link to its
awaiter. Call this method to cancel the producer explicitly; the
stream transitions to an errored state with CancelledError.
If the stream has already produced a message successfully (for
example, after await stream.output), the producer may still be
running post-stream work such as on_llm_end callbacks. In that
case aclose() awaits the task rather than cancelling it —
turning a successful run into a cancelled one would drop the
end callback and corrupt tracing.
Idempotent: safe to call multiple times, including after the
stream has finished normally. Also invoked by the async context
manager protocol on __aexit__.
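The close semantics described above can be sketched with plain asyncio. `TinyStream` is a hypothetical stand-in, not the library class: the sleeps stand in for provider reads and post-stream callbacks, and the `_done` flag is an assumed way of telling "message assembled" apart from "still streaming".

```python
import asyncio
import contextlib


class TinyStream:
    """Hypothetical stream fed by a background producer task."""

    def __init__(self, deltas):
        self._deltas = list(deltas)
        self._queue = asyncio.Queue()
        self._done = False             # True once all deltas were produced
        self.produced = 0
        self.end_callback_ran = False
        self._task = asyncio.create_task(self._produce())

    async def _produce(self):
        for delta in self._deltas:
            await asyncio.sleep(0.01)  # stands in for a provider HTTP read
            self.produced += 1
            await self._queue.put(delta)
        self._done = True
        await self._queue.put(None)    # sentinel: end of stream
        await asyncio.sleep(0.01)      # stands in for post-stream callbacks
        self.end_callback_ran = True

    def __aiter__(self):
        return self._iterate()

    async def _iterate(self):
        while (item := await self._queue.get()) is not None:
            yield item

    async def aclose(self):
        if self._task.done():
            return                     # idempotent: nothing left to do
        if not self._done:
            self._task.cancel()        # mid-stream: stop the producer
        # Otherwise just wait, so the post-stream work is not dropped.
        with contextlib.suppress(asyncio.CancelledError):
            await self._task

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc_info):
        await self.aclose()


async def demo():
    # Early exit: the consumer stops after one delta; __aexit__ cancels.
    async with TinyStream(["a", "b", "c", "d"]) as early:
        async for _ in early:
            break
    # Normal completion: aclose() waits for the end callback instead.
    finished = TinyStream(["x", "y"])
    text = "".join([delta async for delta in finished])
    await finished.aclose()
    await finished.aclose()            # second call is a no-op
    return (early._task.cancelled(), early.produced,
            text, finished.end_callback_ran)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Without the explicit `aclose()` (here via `async with`), the producer in the early-exit case would keep running after the `break`, because the consumer only ever waits on the queue and never on the task itself.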
Fail base projections and async-only projections.