IBM watsonx.ai chat models integration.
To use, you should have the langchain_ibm Python package installed,
and the environment variable WATSONX_API_KEY set with your API key, or pass
it as a named parameter api_key to the constructor.
pip install -U langchain-ibm
# or using uv
uv add langchain-ibm
export WATSONX_API_KEY="your-api-key"
apikey and WATSONX_APIKEY are deprecated and will be removed in
version 2.0.0. Use api_key and WATSONX_API_KEY instead.
Create a model instance with desired params. For example:
from langchain_ibm import ChatWatsonx
from ibm_watsonx_ai.foundation_models.schema import TextChatParameters
parameters = TextChatParameters(
top_p=1, temperature=0.5, max_completion_tokens=None
)
model = ChatWatsonx(
    model_id="ibm/granite-3-3-8b-instruct",
url="https://us-south.ml.cloud.ibm.com",
project_id="*****",
params=parameters,
# api_key="*****"
)

Generate a response from the model:
messages = [
(
"system",
"You are a helpful translator. Translate the user sentence to French.",
),
("human", "I love programming."),
]
model.invoke(messages)
Results in an AIMessage response:
AIMessage(
content="J'adore programmer.",
additional_kwargs={},
response_metadata={
"token_usage": {
"completion_tokens": 7,
"prompt_tokens": 30,
"total_tokens": 37,
},
"model_name": "ibm/granite-3-3-8b-instruct",
"system_fingerprint": "",
"finish_reason": "stop",
},
id="chatcmpl-529352c4-93ba-4801-8f1d-a3b4e3935194---daed91fb74d0405f200db1e63da9a48a---7a3ef799-4413-47e4-b24c-85d267e37fa2",
usage_metadata={"input_tokens": 30, "output_tokens": 7, "total_tokens": 37},
)

Stream a response from the model:
for chunk in model.stream(messages):
print(chunk.text)
Results in a sequence of AIMessageChunk objects with partial content:
AIMessageChunk(content="", id="run--e48a38d3-1500-4b5e-870c-6313e8cff775")
AIMessageChunk(content="J", id="run--e48a38d3-1500-4b5e-870c-6313e8cff775")
AIMessageChunk(content="'", id="run--e48a38d3-1500-4b5e-870c-6313e8cff775")
AIMessageChunk(content="ad", id="run--e48a38d3-1500-4b5e-870c-6313e8cff775")
AIMessageChunk(content="or", id="run--e48a38d3-1500-4b5e-870c-6313e8cff775")
AIMessageChunk(
content=" programmer", id="run--e48a38d3-1500-4b5e-870c-6313e8cff775"
)
AIMessageChunk(content=".", id="run--e48a38d3-1500-4b5e-870c-6313e8cff775")
AIMessageChunk(
content="",
response_metadata={
"finish_reason": "stop",
"model_name": "ibm/granite-3-3-8b-instruct",
},
id="run--e48a38d3-1500-4b5e-870c-6313e8cff775",
)
AIMessageChunk(
content="",
id="run--e48a38d3-1500-4b5e-870c-6313e8cff775",
usage_metadata={"input_tokens": 30, "output_tokens": 7, "total_tokens": 37},
)
To collect the full message, you can concatenate the chunks:
stream = model.stream(messages)
full = next(stream)
for chunk in stream:
full += chunk
full
AIMessageChunk(
content="J'adore programmer.",
response_metadata={
"finish_reason": "stop",
"model_name": "ibm/granite-3-3-8b-instruct",
},
id="chatcmpl-88a48b71-c149-4a0c-9c02-d6b97ca5dc6c---b7ba15879a8c5283b1e8a3b8db0229f0---0037ca4f-8a74-4f84-a46c-ab3fd1294f24",
usage_metadata={"input_tokens": 30, "output_tokens": 7, "total_tokens": 37},
)

Asynchronous equivalents of invoke, stream, and batch are also available:
# Invoke
await model.ainvoke(messages)
# Stream
async for chunk in model.astream(messages):
print(chunk.text)
# Batch
await model.abatch([messages])
Results in an AIMessage response:
AIMessage(
content="J'adore programmer.",
additional_kwargs={},
response_metadata={
"token_usage": {
"completion_tokens": 7,
"prompt_tokens": 30,
"total_tokens": 37,
},
"model_name": "ibm/granite-3-3-8b-instruct",
"system_fingerprint": "",
"finish_reason": "stop",
},
id="chatcmpl-5bef2d81-ef56-463b-a8fa-c2bcc2a3c348---821e7750d18925f2b36226db66667e26---6396c786-9da9-4468-883e-11ed90a05937",
usage_metadata={"input_tokens": 30, "output_tokens": 7, "total_tokens": 37},
)
Batched calls result in a list[AIMessage].
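A synchronous batch is available as well; a minimal sketch (the second input is illustrative, not from the example above):

# Batch over multiple message lists; results preserve input order
results = model.batch([messages, [("human", "I love writing documentation.")]])
for msg in results:
    print(msg.content)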
For tool calling, define tool schemas and bind them to the model:

from pydantic import BaseModel, Field
class GetWeather(BaseModel):
'''Get the current weather in a given location'''
location: str = Field(
..., description="The city and state, e.g. San Francisco, CA"
)
class GetPopulation(BaseModel):
'''Get the current population in a given location'''
location: str = Field(
..., description="The city and state, e.g. San Francisco, CA"
)
model_with_tools = model.bind_tools(
[GetWeather, GetPopulation]
# strict = True # Enforce tool args schema is respected
)
ai_msg = model_with_tools.invoke(
"Which city is hotter today and which is bigger: LA or NY?"
)
ai_msg.tool_calls
[
{
"name": "GetWeather",
"args": {"location": "Los Angeles, CA"},
"id": "chatcmpl-tool-59632abcee8f48a18a5f3a81422b917b",
"type": "tool_call",
},
{
"name": "GetWeather",
"args": {"location": "New York, NY"},
"id": "chatcmpl-tool-c6f3b033b4594918bb53f656525b0979",
"type": "tool_call",
},
{
"name": "GetPopulation",
"args": {"location": "Los Angeles, CA"},
"id": "chatcmpl-tool-175a23281e4747ea81cbe472b8e47012",
"type": "tool_call",
},
{
"name": "GetPopulation",
"args": {"location": "New York, NY"},
"id": "chatcmpl-tool-e1ccc534835945aebab708eb5e685bf7",
"type": "tool_call",
},
]
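To complete the tool-calling loop, execute each requested tool and pass the results back as ToolMessage objects. A minimal sketch, where get_weather and get_population are hypothetical lookup functions you supply:

from langchain.messages import ToolMessage

# get_weather and get_population are hypothetical functions you implement
tools_by_name = {"GetWeather": get_weather, "GetPopulation": get_population}

follow_up = [ai_msg]
for tool_call in ai_msg.tool_calls:
    result = tools_by_name[tool_call["name"]](**tool_call["args"])
    follow_up.append(
        ToolMessage(content=str(result), tool_call_id=tool_call["id"])
    )
final = model_with_tools.invoke(
    [("human", "Which city is hotter today and which is bigger: LA or NY?")]
    + follow_up
)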
Reasoning output can be requested from models that support it. For example:

from langchain_ibm import ChatWatsonx
from ibm_watsonx_ai.foundation_models.schema import TextChatParameters
parameters = TextChatParameters(
include_reasoning=True, reasoning_effort="medium"
)
model = ChatWatsonx(
model_id="openai/gpt-oss-120b",
url="https://us-south.ml.cloud.ibm.com",
project_id="*****",
params=parameters,
# api_key="*****"
)
response = model.invoke("What is 3^3?")
# Response text
print(f"Output: {response.content}")
# Reasoning summaries
print(f"Reasoning: {response.additional_kwargs['reasoning_content']}")
Output: 3^3 = 27
Reasoning: The user asks "What is 3^3?" That's 27. Provide answer.
langchain-ibm >= 0.3.19 allows users to set reasoning output parameters and
formats reasoning summaries into the AIMessage additional_kwargs field.
For structured output, define a schema and use with_structured_output:

from pydantic import BaseModel, Field
class Joke(BaseModel):
'''Joke to tell user.'''
setup: str = Field(description="The setup of the joke")
punchline: str = Field(description="The punchline to the joke")
rating: int | None = Field(description="How funny the joke is, 1 to 10")
structured_model = model.with_structured_output(Joke)
structured_model.invoke("Tell me a joke about cats")
Joke(
setup="Why was the cat sitting on the computer?",
punchline="To keep an eye on the mouse!",
rating=None,
)
See with_structured_output for more info.
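To keep the raw AIMessage alongside the parsed object, include_raw=True can be passed; a minimal sketch:

structured_model = model.with_structured_output(Joke, include_raw=True)
result = structured_model.invoke("Tell me a joke about cats")
# result is a dict with "raw" (AIMessage), "parsed" (Joke), and "parsing_error"
print(result["parsed"].punchline)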
For JSON mode, bind a response_format:

json_model = model.bind(response_format={"type": "json_object"})
ai_msg = json_model.invoke(
    "Return JSON with 'random_ints': an array of 10 random integers from 0-99."
)
ai_msg.content
'{\n "random_ints": [12, 34, 56, 78, 10, 22, 44, 66, 88, 99]\n}'import base64
Image inputs are also supported. For example:

import base64
import httpx
from langchain.messages import HumanMessage
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")
message = HumanMessage(
content=[
{"type": "text", "text": "describe the weather in this image"},
{
"type": "image_url",
"image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
},
]
)
ai_msg = model.invoke([message])
ai_msg.content
"The weather in the image presents a clear, sunny day with good visibility
and no immediate signs of rain or strong winds. The vibrant blue sky with
scattered white clouds gives the impression of a tranquil, pleasant day
conducive to outdoor activities."

Token usage is available via usage_metadata:

ai_msg = model.invoke(messages)
ai_msg.usage_metadata
{'input_tokens': 30, 'output_tokens': 7, 'total_tokens': 37}
stream = model.stream(messages)
full = next(stream)
for chunk in stream:
full += chunk
full.usage_metadata
{'input_tokens': 30, 'output_tokens': 7, 'total_tokens': 37}

To get token log probabilities, bind logprobs=True:

logprobs_model = model.bind(logprobs=True)
ai_msg = logprobs_model.invoke(messages)
ai_msg.response_metadata["logprobs"]
{
'content': [
{
'token': 'J',
'logprob': -0.0017940393
},
{
'token': "'",
'logprob': -1.7523613e-05
},
{
'token': 'ad',
'logprob': -0.16112353
},
{
'token': 'ore',
'logprob': -0.0003091811
},
{
'token': ' programmer',
'logprob': -0.24849245
},
{
'token': '.',
'logprob': -2.5033638e-05
},
{
'token': '<|end_of_text|>',
'logprob': -7.080781e-05
}
]
}

Response metadata is available via response_metadata:

ai_msg = model.invoke(messages)
ai_msg.response_metadata
{
'token_usage': {
'completion_tokens': 7,
'prompt_tokens': 30,
'total_tokens': 37
},
'model_name': 'ibm/granite-3-3-8b-instruct',
'system_fingerprint': '',
'finish_reason': 'stop'
}

IBM watsonx.ai embedding model integration.
To use, you should have the langchain_ibm Python package installed,
and the environment variable WATSONX_API_KEY set with your API key, or pass
it as a named parameter api_key to the constructor.
pip install -U langchain-ibm
# or using uv
uv add langchain-ibm
export WATSONX_API_KEY="your-api-key"
apikey and WATSONX_APIKEY are deprecated and will be removed in
version 2.0.0. Use api_key and WATSONX_API_KEY instead.
from langchain_ibm import WatsonxEmbeddings
embeddings = WatsonxEmbeddings(
model_id="ibm/granite-embedding-278m-multilingual",
url="https://us-south.ml.cloud.ibm.com",
project_id="*****",
# api_key="*****"
)

Embed a single query:

input_text = "The meaning of life is 42"
vector = embeddings.embed_query("hello")
print(vector[:3])
[-0.0020519258, 0.0147288125, -0.0090887165]

Embed multiple texts:

vectors = embeddings.embed_documents(["hello", "goodbye"])
# Showing only the first 3 coordinates
print(len(vectors))
print(vectors[0][:3])
2
[-0.0020519265, 0.01472881, -0.009088721]
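The returned vectors can be compared directly; a minimal cosine-similarity sketch using only the standard library:

import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Similarity between the "hello" and "goodbye" embeddings from above
print(cosine_similarity(vectors[0], vectors[1]))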
Asynchronous equivalents are also available:

vector = await embeddings.aembed_query(input_text)
print(vector[:3])
# multiple:
# await embeddings.aembed_documents(["hello", "goodbye"])
[-0.0020519258, 0.0147288125, -0.0090887165]
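The embeddings model plugs into any LangChain vector store; a minimal sketch using the in-memory store from langchain_core:

from langchain_core.vectorstores import InMemoryVectorStore

store = InMemoryVectorStore.from_texts(["hello", "goodbye"], embedding=embeddings)
docs = store.similarity_search("hi", k=1)
print(docs[0].page_content)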
IBM watsonx.ai large language models class.

To use the large language models, you need to have the langchain_ibm Python
package installed, and the environment variable WATSONX_API_KEY set with your
API key, or pass it as a named parameter api_key to the constructor.
pip install -U langchain-ibm
# or using uv
uv add langchain-ibm
export WATSONX_API_KEY="your-api-key"
apikey and WATSONX_APIKEY are deprecated and will be removed in
version 2.0.0. Use api_key and WATSONX_API_KEY instead.
from langchain_ibm import WatsonxLLM
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames
parameters = {
GenTextParamsMetaNames.DECODING_METHOD: "sample",
GenTextParamsMetaNames.MAX_NEW_TOKENS: 100,
GenTextParamsMetaNames.MIN_NEW_TOKENS: 1,
GenTextParamsMetaNames.TEMPERATURE: 0.5,
GenTextParamsMetaNames.TOP_K: 50,
GenTextParamsMetaNames.TOP_P: 1,
}
model = WatsonxLLM(
model_id="google/flan-t5-xl",
url="https://us-south.ml.cloud.ibm.com",
project_id="*****",
params=parameters,
# api_key="*****"
)

Invoke the model with a prompt:

input_text = "The meaning of life is "
response = model.invoke(input_text)
print(response)
"42, but what was the question?
The answer to the ultimate question of life, the universe, and everything is 42.
But what was the question? This is a reference to Douglas Adams' science fiction
series "The Hitchhiker's Guide to the Galaxy."for chunk in model.stream(input_text):
print(chunk, end="")
"42, but what was the question?
The answer to the ultimate question of life, the universe, and everything is 42.
But what was the question? This is a reference to Douglas Adams' science fiction
series "The Hitchhiker's Guide to the Galaxy."response = await model.ainvoke(input_text)
# stream:
# async for chunk in model.astream(input_text):
# print(chunk, end="")
# batch:
# await model.abatch([input_text])
"42, but what was the question?
The answer to the ultimate question of life, the universe, and everything is 42.
But what was the question? This is a reference to Douglas Adams' science fiction
series "The Hitchhiker's Guide to the Galaxy."Document compressor that uses watsonx Rerank API.
Document compressor that uses the watsonx Rerank API.

To use, you should have the langchain_ibm Python package installed,
and the environment variable WATSONX_API_KEY set with your API key, or pass
it as a named parameter api_key to the constructor.
pip install -U langchain-ibm
# or using uv
uv add langchain-ibm
export WATSONX_API_KEY="your-api-key"
apikey and WATSONX_APIKEY are deprecated and will be removed in
version 2.0.0. Use api_key and WATSONX_API_KEY instead.
from langchain_ibm import WatsonxRerank
from ibm_watsonx_ai.foundation_models.schema import RerankParameters
parameters = RerankParameters(truncate_input_tokens=20)
ranker = WatsonxRerank(
model_id="cross-encoder/ms-marco-minilm-l-12-v2",
url="https://us-south.ml.cloud.ibm.com",
project_id="*****",
params=parameters,
# api_key="*****"
)

Rerank documents against a query:

query = "red cat chasing a laser pointer"
documents = [
"A red cat darts across the living room, pouncing on a red laser dot.",
"Two dogs play fetch in the park with a tennis ball.",
"The tabby cat naps on a sunny windowsill all afternoon.",
"A recipe for tuna casserole with crispy breadcrumbs.",
]
ranker.rerank(documents=documents, query=query)
[
{"index": 0, "relevance_score": 0.8719543218612671},
{"index": 2, "relevance_score": 0.6520894169807434},
{"index": 1, "relevance_score": 0.6270776391029358},
{"index": 3, "relevance_score": 0.4607713520526886},
]
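Because WatsonxRerank is a document compressor, it can also reorder Document objects directly (for example behind a ContextualCompressionRetriever); a minimal sketch:

from langchain_core.documents import Document

docs = [Document(page_content=text) for text in documents]
compressed = ranker.compress_documents(documents=docs, query=query)
for doc in compressed:
    # relevance_score appears in metadata if the compressor attaches it
    print(doc.metadata.get("relevance_score"), doc.page_content)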