    Classā—Since v0.1

    LLMManagerMixin

    LLMManagerMixin()

    Methods

    method
    on_llm_new_token

    Run on new output token.

    Only available when streaming is enabled.

    For both chat models and non-chat models (legacy text completion LLMs).

    method
    on_llm_end

    Run when LLM ends running.

    method
    on_llm_error

    Run when LLM errors.
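
As a rough illustration of how the three hooks above fit together, the sketch below defines a hypothetical handler, `TokenCollector`, that buffers streamed tokens and records how the run ended. The class name, its attributes, and the simplified `**kwargs` signatures are assumptions made for this example; the real methods also receive keyword arguments such as `run_id`. The hooks are called by hand here to show the sequence. In real use, a streaming model would invoke them.

```python
from typing import Any, List, Optional

try:
    # Real base class, used when langchain-core is installed.
    from langchain_core.callbacks import BaseCallbackHandler
except ImportError:
    # Stand-in so the sketch still runs without the library.
    class BaseCallbackHandler:  # type: ignore[no-redef]
        pass


class TokenCollector(BaseCallbackHandler):
    """Hypothetical handler that buffers tokens and tracks run outcome."""

    def __init__(self) -> None:
        self.tokens: List[str] = []
        self.finished = False
        self.error: Optional[BaseException] = None

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        # Called once per streamed token (streaming must be enabled).
        self.tokens.append(token)

    def on_llm_end(self, response: Any, **kwargs: Any) -> None:
        # Called once when the LLM run completes.
        self.finished = True

    def on_llm_error(self, error: BaseException, **kwargs: Any) -> None:
        # Called if the LLM run raises.
        self.error = error


# Simulate the calls a streaming model would make during one run.
collector = TokenCollector()
for tok in ["Hel", "lo"]:
    collector.on_llm_new_token(tok)
collector.on_llm_end(response=None)

text = "".join(collector.tokens)
print(text, collector.finished, collector.error)
```

With a real model, an instance of such a handler would typically be passed through the `callbacks` entry of the call's config instead of being driven manually.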

    method
    on_stream_event

    Run on each protocol event from stream_events(version="v3").

    Also fires for the async equivalent (astream_events(version="v3")).

    Fires once per MessagesData event — message-start, per-block content-block-start / content-block-delta / content-block-finish, and message-finish. Analogous to on_llm_new_token in v1 streaming, but at event granularity rather than chunk: a single chunk can map to multiple events (e.g. a content-block-start plus its first content-block-delta), and lifecycle boundaries are explicit.

    Fires uniformly whether the provider emits events natively via _stream_chat_model_events or goes through the chunk-to-event compat bridge. Observers see the same event stream regardless of how the underlying model produces output.

    Not fired from v1 stream() / astream(); for those, keep using on_llm_new_token. Purely additive — on_chat_model_start, on_llm_end, and on_llm_error still fire around a v2 call as they do around a v1 call.
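
The event granularity described above can be pictured with a small standalone sketch. It does not use the library: the `EventTally` class, the simplified `on_stream_event(event, **kwargs)` signature, and the shape of each event (a dict with an `"event"` key holding the event name) are all assumptions made for illustration. Only the event names come from the description above. The sketch counts events by type and checks that every opened content block is closed.

```python
from collections import Counter
from typing import Any, Dict


class EventTally:
    """Hypothetical observer; not derived from the real mixin."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()
        self.open_blocks = 0

    def on_stream_event(self, event: Dict[str, Any], **kwargs: Any) -> None:
        kind = event["event"]  # assumed key; the real event type may differ
        self.counts[kind] += 1
        # Lifecycle boundaries are explicit, so block nesting can be tracked.
        if kind == "content-block-start":
            self.open_blocks += 1
        elif kind == "content-block-finish":
            self.open_blocks -= 1


# Simulated event sequence for one message with a single content block.
simulated_events = [
    {"event": "message-start"},
    {"event": "content-block-start"},
    {"event": "content-block-delta"},
    {"event": "content-block-delta"},
    {"event": "content-block-finish"},
    {"event": "message-finish"},
]

tally = EventTally()
for ev in simulated_events:
    tally.on_stream_event(ev)

print(dict(tally.counts), tally.open_blocks)
```

A token-level handler would see only the two deltas in this sequence, whereas an event-level observer also sees where the message and each block begin and end.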

    Mixin for LLM callbacks.