IBM watsonx.ai chat models integration.
To use, you should have the langchain_ibm Python package installed,
and the environment variable WATSONX_API_KEY set with your API key, or pass
it as a named parameter api_key to the constructor.
pip install -U langchain-ibm
# or using uv
uv add langchain-ibm
export WATSONX_API_KEY="your-api-key"
apikey and WATSONX_APIKEY are deprecated and will be removed in
version 2.0.0. Use api_key and WATSONX_API_KEY instead.
Create a model instance with desired params. For example:
from langchain_ibm import ChatWatsonx
from ibm_watsonx_ai.foundation_models.schema import TextChatParameters
parameters = TextChatParameters(
top_p=1, temperature=0.5, max_completion_tokens=None
)
model = ChatWatsonx(
    model_id="ibm/granite-3-3-8b-instruct",
url="https://us-south.ml.cloud.ibm.com",
project_id="*****",
params=parameters,
# api_key="*****"
)

Generate a response from the model:
messages = [
(
"system",
"You are a helpful translator. Translate the user sentence to French.",
),
("human", "I love programming."),
]
model.invoke(messages)
Results in an AIMessage response:
AIMessage(
content="J'adore programmer.",
additional_kwargs={},
response_metadata={
"token_usage": {
"completion_tokens": 7,
"prompt_tokens": 30,
"total_tokens": 37,
},
"model_name": "ibm/granite-3-3-8b-instruct",
"system_fingerprint": "",
"finish_reason": "stop",
},
id="chatcmpl-529352c4-93ba-4801-8f1d-a3b4e3935194---daed91fb74d0405f200db1e63da9a48a---7a3ef799-4413-47e4-b24c-85d267e37fa2",
usage_metadata={"input_tokens": 30, "output_tokens": 7, "total_tokens": 37},
)

Stream a response from the model:
for chunk in model.stream(messages):
print(chunk.text)
Results in a sequence of AIMessageChunk objects with partial content:
AIMessageChunk(content="", id="run--e48a38d3-1500-4b5e-870c-6313e8cff775")
AIMessageChunk(content="J", id="run--e48a38d3-1500-4b5e-870c-6313e8cff775")
AIMessageChunk(content="'", id="run--e48a38d3-1500-4b5e-870c-6313e8cff775")
AIMessageChunk(content="ad", id="run--e48a38d3-1500-4b5e-870c-6313e8cff775")
AIMessageChunk(content="or", id="run--e48a38d3-1500-4b5e-870c-6313e8cff775")
AIMessageChunk(
content=" programmer", id="run--e48a38d3-1500-4b5e-870c-6313e8cff775"
)
AIMessageChunk(content=".", id="run--e48a38d3-1500-4b5e-870c-6313e8cff775")
AIMessageChunk(
content="",
response_metadata={
"finish_reason": "stop",
"model_name": "ibm/granite-3-3-8b-instruct",
},
id="run--e48a38d3-1500-4b5e-870c-6313e8cff775",
)
AIMessageChunk(
content="",
id="run--e48a38d3-1500-4b5e-870c-6313e8cff775",
usage_metadata={"input_tokens": 30, "output_tokens": 7, "total_tokens": 37},
)
To collect the full message, you can concatenate the chunks:
stream = model.stream(messages)
full = next(stream)
for chunk in stream:
full += chunk
full
AIMessageChunk(
content="J'adore programmer.",
response_metadata={
"finish_reason": "stop",
"model_name": "ibm/granite-3-3-8b-instruct",
},
id="chatcmpl-88a48b71-c149-4a0c-9c02-d6b97ca5dc6c---b7ba15879a8c5283b1e8a3b8db0229f0---0037ca4f-8a74-4f84-a46c-ab3fd1294f24",
usage_metadata={"input_tokens": 30, "output_tokens": 7, "total_tokens": 37},
)

Asynchronous equivalents of invoke, stream, and batch are also available:
# Invoke
await model.ainvoke(messages)
# Stream
async for chunk in model.astream(messages):
print(chunk.text)
# Batch
await model.abatch([messages])
Results in an AIMessage response:
AIMessage(
content="J'adore programmer.",
additional_kwargs={},
response_metadata={
"token_usage": {
"completion_tokens": 7,
"prompt_tokens": 30,
"total_tokens": 37,
},
"model_name": "ibm/granite-3-3-8b-instruct",
"system_fingerprint": "",
"finish_reason": "stop",
},
id="chatcmpl-5bef2d81-ef56-463b-a8fa-c2bcc2a3c348---821e7750d18925f2b36226db66667e26---6396c786-9da9-4468-883e-11ed90a05937",
usage_metadata={"input_tokens": 30, "output_tokens": 7, "total_tokens": 37},
)
Batched calls result in a list[AIMessage].
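A synchronous batch is available as well; a minimal sketch (the second input is illustrative, not from the example above):

# Batch over multiple message lists; results preserve input order
results = model.batch([messages, [("human", "I love writing documentation.")]])
for msg in results:
    print(msg.content)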
For tool calling, define tool schemas and bind them to the model:

from pydantic import BaseModel, Field
class GetWeather(BaseModel):
'''Get the current weather in a given location'''
location: str = Field(
..., description="The city and state, e.g. San Francisco, CA"
)
class GetPopulation(BaseModel):
'''Get the current population in a given location'''
location: str = Field(
..., description="The city and state, e.g. San Francisco, CA"
)
model_with_tools = model.bind_tools(
[GetWeather, GetPopulation]
# strict = True # Enforce tool args schema is respected
)
ai_msg = model_with_tools.invoke(
"Which city is hotter today and which is bigger: LA or NY?"
)
ai_msg.tool_calls
[
{
"name": "GetWeather",
"args": {"location": "Los Angeles, CA"},
"id": "chatcmpl-tool-59632abcee8f48a18a5f3a81422b917b",
"type": "tool_call",
},
{
"name": "GetWeather",
"args": {"location": "New York, NY"},
"id": "chatcmpl-tool-c6f3b033b4594918bb53f656525b0979",
"type": "tool_call",
},
{
"name": "GetPopulation",
"args": {"location": "Los Angeles, CA"},
"id": "chatcmpl-tool-175a23281e4747ea81cbe472b8e47012",
"type": "tool_call",
},
{
"name": "GetPopulation",
"args": {"location": "New York, NY"},
"id": "chatcmpl-tool-e1ccc534835945aebab708eb5e685bf7",
"type": "tool_call",
},
]
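To complete the tool-calling loop, execute each requested tool and pass the results back as ToolMessage objects. A minimal sketch, where get_weather and get_population are hypothetical lookup functions you supply:

from langchain.messages import ToolMessage

# get_weather and get_population are hypothetical functions you implement
tools_by_name = {"GetWeather": get_weather, "GetPopulation": get_population}

follow_up = [ai_msg]
for tool_call in ai_msg.tool_calls:
    result = tools_by_name[tool_call["name"]](**tool_call["args"])
    follow_up.append(
        ToolMessage(content=str(result), tool_call_id=tool_call["id"])
    )
final = model_with_tools.invoke(
    [("human", "Which city is hotter today and which is bigger: LA or NY?")]
    + follow_up
)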
Reasoning output can be requested from models that support it. For example:

from langchain_ibm import ChatWatsonx
from ibm_watsonx_ai.foundation_models.schema import TextChatParameters
parameters = TextChatParameters(
include_reasoning=True, reasoning_effort="medium"
)
model = ChatWatsonx(
model_id="openai/gpt-oss-120b",
url="https://us-south.ml.cloud.ibm.com",
project_id="*****",
params=parameters,
# api_key="*****"
)
response = model.invoke("What is 3^3?")
# Response text
print(f"Output: {response.content}")
# Reasoning summaries
print(f"Reasoning: {response.additional_kwargs['reasoning_content']}")
Output: 3^3 = 27
Reasoning: The user asks "What is 3^3?" That's 27. Provide answer.
langchain-ibm >= 0.3.19 allows users to set reasoning output parameters and
formats reasoning summaries into the AIMessage additional_kwargs field.
For structured output, define a schema and use with_structured_output:

from pydantic import BaseModel, Field
class Joke(BaseModel):
'''Joke to tell user.'''
setup: str = Field(description="The setup of the joke")
punchline: str = Field(description="The punchline to the joke")
rating: int | None = Field(description="How funny the joke is, 1 to 10")
structured_model = model.with_structured_output(Joke)
structured_model.invoke("Tell me a joke about cats")
Joke(
setup="Why was the cat sitting on the computer?",
punchline="To keep an eye on the mouse!",
rating=None,
)
See with_structured_output for more info.
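To keep the raw AIMessage alongside the parsed object, include_raw=True can be passed; a minimal sketch:

structured_model = model.with_structured_output(Joke, include_raw=True)
result = structured_model.invoke("Tell me a joke about cats")
# result is a dict with "raw" (AIMessage), "parsed" (Joke), and "parsing_error"
print(result["parsed"].punchline)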
For JSON mode, bind a response_format:

json_model = model.bind(response_format={"type": "json_object"})
ai_msg = json_model.invoke(
    "Return JSON with 'random_ints': an array of 10 random integers from 0-99."
)
ai_msg.content
'{\n "random_ints": [12, 34, 56, 78, 10, 22, 44, 66, 88, 99]\n}'import base64
Image inputs are also supported. For example:

import base64
import httpx
from langchain.messages import HumanMessage
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")
message = HumanMessage(
content=[
{"type": "text", "text": "describe the weather in this image"},
{
"type": "image_url",
"image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
},
]
)
ai_msg = model.invoke([message])
ai_msg.content
"The weather in the image presents a clear, sunny day with good visibility
and no immediate signs of rain or strong winds. The vibrant blue sky with
scattered white clouds gives the impression of a tranquil, pleasant day
conducive to outdoor activities."

Token usage is available via usage_metadata:

ai_msg = model.invoke(messages)
ai_msg.usage_metadata
{'input_tokens': 30, 'output_tokens': 7, 'total_tokens': 37}
stream = model.stream(messages)
full = next(stream)
for chunk in stream:
full += chunk
full.usage_metadata
{'input_tokens': 30, 'output_tokens': 7, 'total_tokens': 37}

To get token log probabilities, bind logprobs=True:

logprobs_model = model.bind(logprobs=True)
ai_msg = logprobs_model.invoke(messages)
ai_msg.response_metadata["logprobs"]
{
'content': [
{
'token': 'J',
'logprob': -0.0017940393
},
{
'token': "'",
'logprob': -1.7523613e-05
},
{
'token': 'ad',
'logprob': -0.16112353
},
{
'token': 'ore',
'logprob': -0.0003091811
},
{
'token': ' programmer',
'logprob': -0.24849245
},
{
'token': '.',
'logprob': -2.5033638e-05
},
{
'token': '<|end_of_text|>',
'logprob': -7.080781e-05
}
]
}

Response metadata is available via response_metadata:

ai_msg = model.invoke(messages)
ai_msg.response_metadata
{
'token_usage': {
'completion_tokens': 7,
'prompt_tokens': 30,
'total_tokens': 37
},
'model_name': 'ibm/granite-3-3-8b-instruct',
'system_fingerprint': '',
'finish_reason': 'stop'
}

IBM watsonx.ai embedding model integration.
To use, you should have the langchain_ibm Python package installed,
and the environment variable WATSONX_API_KEY set with your API key, or pass
it as a named parameter api_key to the constructor.
pip install -U langchain-ibm
# or using uv
uv add langchain-ibm
export WATSONX_API_KEY="your-api-key"
apikey and WATSONX_APIKEY are deprecated and will be removed in
version 2.0.0. Use api_key and WATSONX_API_KEY instead.
from langchain_ibm import WatsonxEmbeddings
embeddings = WatsonxEmbeddings(
model_id="ibm/granite-embedding-278m-multilingual",
url="https://us-south.ml.cloud.ibm.com",
project_id="*****",
# api_key="*****"
)

Embed a single query:

input_text = "The meaning of life is 42"
vector = embeddings.embed_query("hello")
print(vector[:3])
[-0.0020519258, 0.0147288125, -0.0090887165]

Embed multiple texts:

vectors = embeddings.embed_documents(["hello", "goodbye"])
# Showing only the first 3 coordinates
print(len(vectors))
print(vectors[0][:3])
2
[-0.0020519265, 0.01472881, -0.009088721]
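The returned vectors can be compared directly; a minimal cosine-similarity sketch using only the standard library:

import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Similarity between the "hello" and "goodbye" embeddings from above
print(cosine_similarity(vectors[0], vectors[1]))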
Asynchronous equivalents are also available:

vector = await embeddings.aembed_query(input_text)
print(vector[:3])
# multiple:
# await embeddings.aembed_documents(["hello", "goodbye"])
[-0.0020519258, 0.0147288125, -0.0090887165]
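The embeddings model plugs into any LangChain vector store; a minimal sketch using the in-memory store from langchain_core:

from langchain_core.vectorstores import InMemoryVectorStore

store = InMemoryVectorStore.from_texts(["hello", "goodbye"], embedding=embeddings)
docs = store.similarity_search("hi", k=1)
print(docs[0].page_content)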
IBM watsonx.ai large language models class.

To use the large language models, you need to have the langchain_ibm Python
package installed, and the environment variable WATSONX_API_KEY set with your
API key, or pass it as a named parameter api_key to the constructor.
pip install -U langchain-ibm
# or using uv
uv add langchain-ibm
export WATSONX_API_KEY="your-api-key"
apikey and WATSONX_APIKEY are deprecated and will be removed in
version 2.0.0. Use api_key and WATSONX_API_KEY instead.
from langchain_ibm import WatsonxLLM
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames
parameters = {
GenTextParamsMetaNames.DECODING_METHOD: "sample",
GenTextParamsMetaNames.MAX_NEW_TOKENS: 100,
GenTextParamsMetaNames.MIN_NEW_TOKENS: 1,
GenTextParamsMetaNames.TEMPERATURE: 0.5,
GenTextParamsMetaNames.TOP_K: 50,
GenTextParamsMetaNames.TOP_P: 1,
}
model = WatsonxLLM(
model_id="google/flan-t5-xl",
url="https://us-south.ml.cloud.ibm.com",
project_id="*****",
params=parameters,
# api_key="*****"
)

Invoke the model with a prompt:

input_text = "The meaning of life is "
response = model.invoke(input_text)
print(response)
"42, but what was the question?
The answer to the ultimate question of life, the universe, and everything is 42.
But what was the question? This is a reference to Douglas Adams' science fiction
series "The Hitchhiker's Guide to the Galaxy."for chunk in model.stream(input_text):
print(chunk, end="")
"42, but what was the question?
The answer to the ultimate question of life, the universe, and everything is 42.
But what was the question? This is a reference to Douglas Adams' science fiction
series "The Hitchhiker's Guide to the Galaxy."response = await model.ainvoke(input_text)
# stream:
# async for chunk in model.astream(input_text):
# print(chunk, end="")
# batch:
# await model.abatch([input_text])
"42, but what was the question?
The answer to the ultimate question of life, the universe, and everything is 42.
But what was the question? This is a reference to Douglas Adams' science fiction
series "The Hitchhiker's Guide to the Galaxy."Document compressor that uses watsonx Rerank API.
Document compressor that uses the watsonx Rerank API.

To use, you should have the langchain_ibm Python package installed,
and the environment variable WATSONX_API_KEY set with your API key, or pass
it as a named parameter api_key to the constructor.
pip install -U langchain-ibm
# or using uv
uv add langchain-ibm
export WATSONX_API_KEY="your-api-key"
apikey and WATSONX_APIKEY are deprecated and will be removed in
version 2.0.0. Use api_key and WATSONX_API_KEY instead.
from langchain_ibm import WatsonxRerank
from ibm_watsonx_ai.foundation_models.schema import RerankParameters
parameters = RerankParameters(truncate_input_tokens=20)
ranker = WatsonxRerank(
model_id="cross-encoder/ms-marco-minilm-l-12-v2",
url="https://us-south.ml.cloud.ibm.com",
project_id="*****",
params=parameters,
# api_key="*****"
)

Rerank documents against a query:

query = "red cat chasing a laser pointer"
documents = [
"A red cat darts across the living room, pouncing on a red laser dot.",
"Two dogs play fetch in the park with a tennis ball.",
"The tabby cat naps on a sunny windowsill all afternoon.",
"A recipe for tuna casserole with crispy breadcrumbs.",
]
ranker.rerank(documents=documents, query=query)
[
{"index": 0, "relevance_score": 0.8719543218612671},
{"index": 2, "relevance_score": 0.6520894169807434},
{"index": 1, "relevance_score": 0.6270776391029358},
{"index": 3, "relevance_score": 0.4607713520526886},
]
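Because WatsonxRerank is a document compressor, it can also reorder Document objects directly (for example behind a ContextualCompressionRetriever); a minimal sketch:

from langchain_core.documents import Document

docs = [Document(page_content=text) for text in documents]
compressed = ranker.compress_documents(documents=docs, query=query)
for doc in compressed:
    # relevance_score appears in metadata if the compressor attaches it
    print(doc.metadata.get("relevance_score"), doc.page_content)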