
langchain-xai


Reference docs

This page contains reference documentation for xAI. See the docs for conceptual guides, tutorials, and examples on using xAI modules.

langchain_xai

LangChain integration with xAI.

ChatXAI

Bases: BaseChatOpenAI

ChatXAI chat model.

Refer to xAI's documentation for full details on the API's behavior and supported parameters.

Setup

Install langchain-xai and set environment variable XAI_API_KEY.

pip install -U langchain-xai
export XAI_API_KEY="your-api-key"

Key init args — completion params:

  • model: Name of model to use.
  • temperature: Sampling temperature between 0 and 2. Higher values mean more random completions, while lower values (like 0.2) mean more focused and deterministic completions. (Default: 1.)
  • max_tokens: Max number of tokens to generate. Refer to your model's documentation for the maximum number of tokens it can generate.
  • logprobs: Whether to return logprobs.

Key init args — client params:

  • timeout: Timeout for requests.
  • max_retries: Max number of retries.
  • api_key: xAI API key. If not passed in, it will be read from the env var XAI_API_KEY.

Instantiate
from langchain_xai import ChatXAI

model = ChatXAI(
    model="grok-4",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    # api_key="...",
    # other params...
)
Invoke
messages = [
    (
        "system",
        "You are a helpful translator. Translate the user sentence to French.",
    ),
    ("human", "I love programming."),
]
model.invoke(messages)
AIMessage(
    content="J'adore la programmation.",
    response_metadata={
        "token_usage": {
            "completion_tokens": 9,
            "prompt_tokens": 32,
            "total_tokens": 41,
        },
        "model_name": "grok-4",
        "system_fingerprint": None,
        "finish_reason": "stop",
        "logprobs": None,
    },
    id="run-168dceca-3b8b-4283-94e3-4c739dbc1525-0",
    usage_metadata={
        "input_tokens": 32,
        "output_tokens": 9,
        "total_tokens": 41,
    },
)
Stream
for chunk in model.stream(messages):
    print(chunk)
content='J' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content="'" id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content='ad' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content='ore' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content=' la' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content=' programm' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content='ation' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content='.' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content='' response_metadata={'finish_reason': 'stop', 'model_name': 'grok-4'} id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
Async
await model.ainvoke(messages)

# stream:
# async for chunk in model.astream(messages):
#     print(chunk)

# batch:
# await model.abatch([messages])
AIMessage(
    content="J'adore la programmation.",
    response_metadata={
        "token_usage": {
            "completion_tokens": 9,
            "prompt_tokens": 32,
            "total_tokens": 41,
        },
        "model_name": "grok-4",
        "system_fingerprint": None,
        "finish_reason": "stop",
        "logprobs": None,
    },
    id="run-09371a11-7f72-4c53-8e7c-9de5c238b34c-0",
    usage_metadata={
        "input_tokens": 32,
        "output_tokens": 9,
        "total_tokens": 41,
    },
)
Reasoning

Certain xAI models support reasoning, which allows the model to provide reasoning content along with the response.

If provided, reasoning content is returned under the additional_kwargs field of the AIMessage or AIMessageChunk.

If supported, reasoning effort can be specified in the model constructor's extra_body argument, which will control the amount of reasoning the model does. The value can be one of 'low' or 'high'.

model = ChatXAI(
    model="grok-3-mini",
    extra_body={"reasoning_effort": "high"},
)
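
When a model does return reasoning, it can be read back from the message. A minimal sketch, assuming the response populates the reasoning_content key mentioned in the note below:

ai_msg = model.invoke("What is 3^3 + 4^2?")
# Reasoning, if present, sits alongside the regular answer
print(ai_msg.additional_kwargs.get("reasoning_content"))
print(ai_msg.content)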

Note

As of 2025-07-10, reasoning_content is only returned in Grok 3 models, such as Grok 3 Mini.

Note

In Grok 4, as of 2025-07-10, reasoning is not exposed in reasoning_content (other than initial 'Thinking...' text), reasoning cannot be disabled, and reasoning_effort cannot be specified.

Tool calling / function calling:

from pydantic import BaseModel, Field

model = ChatXAI(model="grok-4")


class GetWeather(BaseModel):
    '''Get the current weather in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


class GetPopulation(BaseModel):
    '''Get the current population in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


model_with_tools = model.bind_tools([GetWeather, GetPopulation])
ai_msg = model_with_tools.invoke("Which city is bigger: LA or NY?")
ai_msg.tool_calls
[
    {
        "name": "GetPopulation",
        "args": {"location": "NY"},
        "id": "call_m5tstyn2004pre9bfuxvom8x",
        "type": "tool_call",
    },
    {
        "name": "GetPopulation",
        "args": {"location": "LA"},
        "id": "call_0vjgq455gq1av5sp9eb1pw6a",
        "type": "tool_call",
    },
]

Note

When streaming, a tool / function call is returned whole in a single chunk instead of being streamed across chunks.
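
Chunks can still be accumulated the usual way; the complete call simply arrives inside one chunk. A minimal sketch, reusing model_with_tools from above:

gathered = None
for chunk in model_with_tools.stream("Which city is bigger: LA or NY?"):
    # AIMessageChunk supports `+`, so chunks merge incrementally
    gathered = chunk if gathered is None else gathered + chunk
gathered.tool_calls  # the tool calls, delivered whole in a single chunk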

Tool choice can be controlled by setting the tool_choice parameter in the model constructor's extra_body argument. For example, to disable tool / function calling:

model = ChatXAI(model="grok-4", extra_body={"tool_choice": "none"})
To require that the model always calls a tool / function, set tool_choice to 'required':

model = ChatXAI(model="grok-4", extra_body={"tool_choice": "required"})

To specify a tool / function to call, set tool_choice to the name of the tool / function:

from pydantic import BaseModel, Field

model = ChatXAI(
    model="grok-4",
    extra_body={
        "tool_choice": {"type": "function", "function": {"name": "GetWeather"}}
    },
)

class GetWeather(BaseModel):
    '''Get the current weather in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


class GetPopulation(BaseModel):
    '''Get the current population in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


model_with_tools = model.bind_tools([GetWeather, GetPopulation])
ai_msg = model_with_tools.invoke(
    "Which city is bigger: LA or NY?",
)
ai_msg.tool_calls

The resulting tool call would be:

[
    {
        "name": "GetWeather",
        "args": {"location": "Los Angeles, CA"},
        "id": "call_81668711",
        "type": "tool_call",
    }
]

Parallel tool calling / parallel function calling: By default, parallel tool / function calling is enabled, so you can process multiple function calls in one request/response cycle. When two or more tool calls are required, all of the tool call requests will be included in the response body.
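
A minimal sketch of consuming such a response, dispatching each returned tool call to a local handler (get_weather and get_population are hypothetical stand-ins for your own implementations):

def get_weather(location: str) -> str:
    return f"Sunny in {location}"  # hypothetical implementation

def get_population(location: str) -> str:
    return f"Population of {location}: ..."  # hypothetical implementation

handlers = {"GetWeather": get_weather, "GetPopulation": get_population}

for tool_call in ai_msg.tool_calls:
    # each entry carries the tool name and its parsed arguments
    result = handlers[tool_call["name"]](**tool_call["args"])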

Structured output
from pydantic import BaseModel, Field


class Joke(BaseModel):
    '''Joke to tell user.'''

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: int | None = Field(description="How funny the joke is, from 1 to 10")


structured_model = model.with_structured_output(Joke)
structured_model.invoke("Tell me a joke about cats")
Joke(
    setup="Why was the cat sitting on the computer?",
    punchline="To keep an eye on the mouse!",
    rating=7,
)
Token usage
ai_msg = model.invoke(messages)
ai_msg.usage_metadata
{"input_tokens": 37, "output_tokens": 6, "total_tokens": 43}
Logprobs
logprobs_model = model.bind(logprobs=True)
messages = [("human", "Say Hello World! Do not return anything else.")]
ai_msg = logprobs_model.invoke(messages)
ai_msg.response_metadata["logprobs"]
{
    "content": None,
    "token_ids": [22557, 3304, 28808, 2],
    "tokens": [" Hello", " World", "!", "</s>"],
    "token_logprobs": [-4.7683716e-06, -5.9604645e-07, 0, -0.057373047],
}

Response metadata:

ai_msg = model.invoke(messages)
ai_msg.response_metadata
{
    "token_usage": {
        "completion_tokens": 4,
        "prompt_tokens": 19,
        "total_tokens": 23,
    },
    "model_name": "grok-4",
    "system_fingerprint": None,
    "finish_reason": "stop",
    "logprobs": None,
}
METHOD DESCRIPTION
get_lc_namespace

Get the namespace of the LangChain object.

is_lc_serializable

Return whether this model can be serialized by LangChain.

validate_environment

Validate that the API key and Python package exist in the environment.

with_structured_output

Model wrapper that returns outputs formatted to match the given schema.

model_name class-attribute instance-attribute

model_name: str = Field(default='grok-4', alias='model')

Model name to use.

xai_api_key class-attribute instance-attribute

xai_api_key: SecretStr | None = Field(
    alias="api_key", default_factory=secret_from_env("XAI_API_KEY", default=None)
)

xAI API key.

Automatically read from env variable XAI_API_KEY if not provided.

xai_api_base class-attribute instance-attribute

xai_api_base: str = Field(default='https://api.x.ai/v1/')

Base URL path for API requests.

search_parameters class-attribute instance-attribute

search_parameters: dict[str, Any] | None = None

Parameters for search requests. Example: {"mode": "auto"}.
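
A minimal sketch of enabling search via this attribute, using the {"mode": "auto"} setting shown above:

model = ChatXAI(
    model="grok-4",
    search_parameters={"mode": "auto"},
)
model.invoke("Summarize one recent news story about AI.")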

lc_secrets property

lc_secrets: dict[str, str]

A map of constructor argument names to secret ids.

For example, {"xai_api_key": "XAI_API_KEY"}

lc_attributes property

lc_attributes: dict[str, Any]

List of attribute names that should be included in the serialized kwargs.

These attributes must be accepted by the constructor.

get_lc_namespace classmethod

get_lc_namespace() -> list[str]

Get the namespace of the LangChain object.

RETURNS DESCRIPTION
list[str]

["langchain_xai", "chat_models"]

is_lc_serializable classmethod

is_lc_serializable() -> bool

Return whether this model can be serialized by LangChain.

validate_environment

validate_environment() -> Self

Validate that the API key and Python package exist in the environment.

with_structured_output

with_structured_output(
    schema: _DictOrPydanticClass | None = None,
    *,
    method: Literal[
        "function_calling", "json_mode", "json_schema"
    ] = "function_calling",
    include_raw: bool = False,
    strict: bool | None = None,
    **kwargs: Any,
) -> Runnable[LanguageModelInput, _DictOrPydantic]

Model wrapper that returns outputs formatted to match the given schema.

PARAMETER DESCRIPTION
schema

The output schema. Can be passed in as:

  • An OpenAI function/tool schema,
  • A JSON Schema,
  • A TypedDict class,
  • Or a Pydantic class.

If schema is a Pydantic class then the model output will be a Pydantic instance of that class, and the model-generated fields will be validated by the Pydantic class. Otherwise the model output will be a dict and will not be validated.

See langchain_core.utils.function_calling.convert_to_openai_tool for more on how to properly specify types and descriptions of schema fields when specifying a Pydantic or TypedDict class.

TYPE: _DictOrPydanticClass | None DEFAULT: None

method

The method for steering model generation, one of 'function_calling', 'json_mode', or 'json_schema'.

TYPE: Literal['function_calling', 'json_mode', 'json_schema'] DEFAULT: 'function_calling'

include_raw

If False then only the parsed structured output is returned.

If an error occurs during model output parsing it will be raised.

If True then both the raw model response (a BaseMessage) and the parsed model response will be returned.

If an error occurs during output parsing it will be caught and returned as well.

The final output is always a dict with keys 'raw', 'parsed', and 'parsing_error'.

TYPE: bool DEFAULT: False

strict
  • True: Model output is guaranteed to exactly match the schema. The input schema will also be validated according to the supported schemas.
  • False: Input schema will not be validated and model output will not be validated.
  • None: strict argument will not be passed to the model.

TYPE: bool | None DEFAULT: None

kwargs

Additional keyword args aren't supported.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
Runnable[LanguageModelInput, _DictOrPydantic]

A Runnable that takes the same inputs as a langchain_core.language_models.chat_models.BaseChatModel. If include_raw is False and schema is a Pydantic class, the Runnable outputs an instance of schema (i.e., a Pydantic object). Otherwise, if include_raw is False, the Runnable outputs a dict.

If include_raw is True, then Runnable outputs a dict with keys:

  • 'raw': BaseMessage
  • 'parsed': None if there was a parsing error, otherwise the type depends on the schema as described above.
  • 'parsing_error': BaseException | None
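
For example, a sketch of the include_raw=True output shape, reusing the Joke schema from the structured output example above:

structured_model = model.with_structured_output(Joke, include_raw=True)
result = structured_model.invoke("Tell me a joke about cats")
result["raw"]            # the raw BaseMessage
result["parsed"]         # a Joke instance, or None on a parsing error
result["parsing_error"]  # None, or the exception raised during parsing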