This is the langchain_ollama package. It provides the infrastructure for interacting with the Ollama service.
New in 0.3.4: the `validate_model_on_init` param on all models.
This parameter lets you validate, at initialization, that the model exists
locally in Ollama. If set to `True`, initialization raises an error when the
model has not been downloaded. This is useful for ensuring the model is
available before attempting to use it, especially in environments where models
may not be pre-downloaded.
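A minimal sketch (assuming `gpt-oss:20b` has already been pulled with `ollama pull`):

```python
from langchain_ollama import ChatOllama

# Fails fast at construction time if the model is missing locally,
# instead of erroring later on the first invocation.
llm = ChatOllama(model="gpt-oss:20b", validate_model_on_init=True)
```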
Ollama chat model integration.
Install langchain-ollama and download any models you want to use from Ollama:

```bash
ollama pull gpt-oss:20b
pip install -U langchain-ollama
```

Key init args (completion params):

model: str
    Name of Ollama model to use.
reasoning: bool | None
    Controls the reasoning/thinking mode for supported models.
- `True`: Enables reasoning mode. The model's reasoning process will be
captured and returned separately in the `additional_kwargs` of the
response message, under `reasoning_content`. The main response
content will not include the reasoning tags.
- `False`: Disables reasoning mode. The model will not perform any reasoning,
and the response will not include any reasoning content.
- `None` (default): The model uses its default reasoning behavior. Note,
  however, that if the model's default behavior *is* to perform reasoning,
  think tags (`<think>` and `</think>`) will appear within the main response
  content unless you set `reasoning` to `True`. (A usage sketch follows this
  parameter list.)
temperature: float
Sampling temperature. Ranges from `0.0` to `1.0`.
num_predict: int | None
Max number of tokens to generate.
See the full list of supported init args and their descriptions in the params section.
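A minimal sketch combining these params; `deepseek-r1:8b` is just an example of a reasoning-capable model, substitute any model you have pulled locally:

```python
from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="deepseek-r1:8b",   # any locally pulled, reasoning-capable model
    reasoning=True,           # capture thinking separately from the answer
    temperature=0.6,
    num_predict=512,          # cap on the number of generated tokens
)

msg = llm.invoke("What is 17 * 23?")
print(msg.content)                                     # final answer, no <think> tags
print(msg.additional_kwargs.get("reasoning_content"))  # captured reasoning trace
```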
Ollama embedding model integration.
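A minimal sketch, assuming an embedding model such as `nomic-embed-text` has been pulled locally:

```python
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="nomic-embed-text")

vector = embeddings.embed_query("hello world")    # one query -> one vector
docs = embeddings.embed_documents(["a", "b"])     # batch of texts -> list of vectors
print(len(vector), len(docs))
```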
Ollama large language models.

The package exports three integrations: ChatOllama for chat models,
OllamaEmbeddings for embedding models, and OllamaLLM for large language
models via the plain text-completion interface.
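A minimal sketch of OllamaLLM (again assuming `gpt-oss:20b` is pulled locally):

```python
from langchain_ollama import OllamaLLM

# OllamaLLM is the completion-style interface: a string in, a string out.
llm = OllamaLLM(model="gpt-oss:20b")
print(llm.invoke("The first three prime numbers are"))
```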
Input Flow (LangChain -> Ollama)

- `_convert_messages_to_ollama_messages()`: converts LangChain messages into
  the `ollama.Message` format.
- `format_chat_params()`: assembles the request parameters, including the
  `think` parameter.

Output Flow (Ollama -> LangChain)

Ollama streams dictionary chunks containing:

- `message`: dict with role, content, tool_calls, and thinking
- `done`: boolean indicating completion
- `done_reason`: reason for completion (stop, length, load)

`_iterate_over_stream()` turns each chunk into a `ChatGenerationChunk` with
`AIMessageChunk` content: `message.content` becomes the chunk content, tool
calls become `ToolCalls`, and thinking (when `reasoning=True`) is stored in
`additional_kwargs`. The chunks are then aggregated into a `ChatResult` with a
complete `AIMessage`, with tool calls on `AIMessage.tool_calls` and the
captured reasoning on `AIMessage.additional_kwargs['reasoning_content']`.
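A minimal sketch of this output flow from the caller's side; the model name is an example, any locally pulled, reasoning-capable model works:

```python
from langchain_ollama import ChatOllama

llm = ChatOllama(model="deepseek-r1:8b", reasoning=True)

for chunk in llm.stream("Why is the sky blue?"):
    # Each chunk is an AIMessageChunk built from one Ollama stream dict:
    # thinking deltas land in additional_kwargs, answer deltas in content.
    thinking = chunk.additional_kwargs.get("reasoning_content")
    if thinking:
        print(f"[thinking] {thinking}", end="", flush=True)
    if chunk.content:
        print(chunk.content, end="", flush=True)
```

Tool calls follow the same path: bind tools with `llm.bind_tools([...])`, and any calls the model makes are carried on the aggregated `AIMessage.tool_calls`.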