reasoning_format:
The format for reasoning output. Groq will default to raw if left undefined.
- `'parsed'`: Separates reasoning into a dedicated field while keeping the
  response concise. Reasoning will be returned in the
  `additional_kwargs.reasoning_content` field of the response.
- `'raw'`: Includes reasoning within think tags (e.g.
  `<think>{reasoning_content}</think>`).
- `'hidden'`: Returns only the final answer content. Note: this only suppresses
  reasoning content in the response; the model will still perform reasoning
  unless overridden in `reasoning_effort`.
See the Groq documentation for more details and a list of supported models.
reasoning_effort:
The level of effort the model will put into reasoning. Groq will default to
enabling reasoning if left undefined.
See the Groq documentation for more details and a list of options and models
that support setting a reasoning effort.
service_tier:
Optional parameter specifying the service tier to use for requests.
- `'on_demand'`: Default.
- `'flex'`: On-demand processing when capacity is available, with rapid
  timeouts if resources are constrained. Provides a balance between performance
  and reliability for workloads that don't require guaranteed processing.
- `'auto'`: Uses on-demand rate limits, then falls back to `'flex'` if those
  limits are exceeded.
See the Groq documentation for more details and a list of service tiers and
descriptions.
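A minimal sketch combining these parameters (the model name here is an
assumption; substitute any reasoning-capable Groq model):

from langchain_groq import ChatGroq

# Separate the reasoning trace from the final answer, and let Groq fall back
# to flex processing if on-demand rate limits are exceeded.
model = ChatGroq(
    model="qwen/qwen3-32b",  # assumed reasoning-capable model
    reasoning_format="parsed",
    service_tier="auto",
)
msg = model.invoke("What is 17 * 23?")
print(msg.content)  # final answer only
print(msg.additional_kwargs.get("reasoning_content"))  # parsed reasoning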
Groq Chat large language models API.
To use, you should have the
environment variable GROQ_API_KEY set with your API key.
Any parameters that are valid for the groq.create call can be passed in, even if not explicitly saved on this class.
Setup:
Install langchain-groq and set environment variable
GROQ_API_KEY.
pip install -U langchain-groq
export GROQ_API_KEY="your-api-key"
Key init args — completion params:
model:
Name of Groq model to use, e.g. llama-3.1-8b-instant.
temperature:
Sampling temperature. Ranges from 0.0 to 1.0.
max_tokens:
Max number of tokens to generate.
reasoning_format:
The format for reasoning output. Groq will default to raw if left
undefined.
- `'parsed'`: Separates reasoning into a dedicated field while keeping the
response concise. Reasoning will be returned in the
`additional_kwargs.reasoning_content` field of the response.
- `'raw'`: Includes reasoning within think tags (e.g.
`<think>{reasoning_content}</think>`).
- `'hidden'`: Returns only the final answer content. Note: this only
suppresses reasoning content in the response; the model will still perform
reasoning unless overridden in `reasoning_effort`.
See the [Groq documentation](https://console.groq.com/docs/reasoning#reasoning)
for more details and a list of supported models.
model_kwargs:
Holds any model parameters valid for the create call that are not explicitly
specified on this class (see the sketch after this list).
Key init args — client params:
timeout:
Timeout for requests.
max_retries:
Max number of retries.
api_key:
Groq API key. If not passed in will be read from env var GROQ_API_KEY.
base_url:
Base URL path for API requests, leave blank if not using a proxy
or service emulator.
custom_get_token_ids:
Optional encoder to use for counting tokens.
See full list of supported init args and their descriptions in the params section.
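A minimal sketch of model_kwargs (frequency_penalty and seed are assumed here
to be valid groq.create parameters):

from langchain_groq import ChatGroq

# Extra create-call parameters ride along via model_kwargs.
model = ChatGroq(
    model="llama-3.1-8b-instant",
    model_kwargs={"frequency_penalty": 0.5, "seed": 42},
)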
Instantiate:
from langchain_groq import ChatGroq
model = ChatGroq(
model="llama-3.1-8b-instant",
temperature=0.0,
max_retries=2,
# other params...
)
Invoke:
messages = [
("system", "You are a helpful translator. Translate the user sentence to French."),
("human", "I love programming."),
]
model.invoke(messages)
AIMessage(content='The English sentence "I love programming" can
be translated to French as "J\'aime programmer". The word
"programming" is translated as "programmer" in French.',
response_metadata={'token_usage': {'completion_tokens': 38,
'prompt_tokens': 28, 'total_tokens': 66, 'completion_time':
0.057975474, 'prompt_time': 0.005366091, 'queue_time': None,
'total_time': 0.063341565}, 'model_name': 'llama-3.1-8b-instant',
'system_fingerprint': 'fp_c5f20b5bb1', 'finish_reason': 'stop',
'logprobs': None}, id='run-ecc71d70-e10c-4b69-8b8c-b8027d95d4b8-0')
Vision:
from langchain_groq import ChatGroq
from langchain_core.messages import HumanMessage
model = ChatGroq(model="meta-llama/llama-4-scout-17b-16e-instruct")
message = HumanMessage(
content=[
{"type": "text", "text": "Describe this image in detail"},
{"type": "image_url", "image_url": {"url": "example_url.jpg"}},
]
)
response = model.invoke([message])
print(response.content)
Vision-capable models accept a maximum image size of 20MB per request.
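For local images, a hedged sketch: assuming base64 data URLs are accepted (as
in the OpenAI-compatible content format), a file can be inlined directly; the
path below is hypothetical.

import base64

from langchain_core.messages import HumanMessage
from langchain_groq import ChatGroq

model = ChatGroq(model="meta-llama/llama-4-scout-17b-16e-instruct")

# Read and base64-encode a local image (hypothetical path).
with open("/path/to/image.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe this image in detail"},
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
        },
    ]
)
response = model.invoke([message])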
Stream:
# Streaming `text` for each content chunk received
for chunk in model.stream(messages):
print(chunk.text, end="")
content='' id='run-4e9f926b-73f5-483b-8ef5-09533d925853'
content='The' id='run-4e9f926b-73f5-483b-8ef5-09533d925853'
content=' English' id='run-4e9f926b-73f5-483b-8ef5-09533d925853'
content=' sentence' id='run-4e9f926b-73f5-483b-8ef5-09533d925853'
...
content=' program' id='run-4e9f926b-73f5-483b-8ef5-09533d925853'
content='".' id='run-4e9f926b-73f5-483b-8ef5-09533d925853'
content='' response_metadata={'finish_reason': 'stop'} id='run-4e9f926b-73f5-483b-8ef5-09533d925853'
# Reconstructing a full response
stream = model.stream(messages)
full = next(stream)
for chunk in stream:
full += chunk
full
AIMessageChunk(content='The English sentence "I love programming"
can be translated to French as "J\'aime programmer". Here\'s the
breakdown of the sentence: "J\'aime" is the French equivalent of "
I love", and "programmer" is the French infinitive for "to program".
So, the literal translation is "I love to program". However, in
English we often omit the "to" when talking about activities we
love, and the same applies to French. Therefore, "J\'aime
programmer" is the correct and natural way to express "I love
programming" in French.', response_metadata={'finish_reason':
'stop'}, id='run-a3c35ac4-0750-4d08-ac55-bfc63805de76')
Async:
await model.ainvoke(messages)
AIMessage(content='The English sentence "I love programming" can
be translated to French as "J\'aime programmer". The word
"programming" is translated as "programmer" in French. I hope
this helps! Let me know if you have any other questions.',
response_metadata={'token_usage': {'completion_tokens': 53,
'prompt_tokens': 28, 'total_tokens': 81, 'completion_time':
0.083623752, 'prompt_time': 0.007365126, 'queue_time': None,
'total_time': 0.090988878}, 'model_name': 'llama-3.1-8b-instant',
'system_fingerprint': 'fp_c5f20b5bb1', 'finish_reason': 'stop',
'logprobs': None}, id='run-897f3391-1bea-42e2-82e0-686e2367bcf8-0')
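Streaming has an async counterpart as well; a minimal sketch using astream:

# Stream chunks asynchronously, printing text as it arrives.
async for chunk in model.astream(messages):
    print(chunk.text, end="")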
Tool calling:
from pydantic import BaseModel, Field
class GetWeather(BaseModel):
'''Get the current weather in a given location'''
location: str = Field(..., description="The city and state, e.g. San Francisco, CA")
class GetPopulation(BaseModel):
'''Get the current population in a given location'''
location: str = Field(..., description="The city and state, e.g. San Francisco, CA")
model_with_tools = model.bind_tools([GetWeather, GetPopulation])
ai_msg = model_with_tools.invoke("What is the population of NY?")
ai_msg.tool_calls
[
{
"name": "GetPopulation",
"args": {"location": "NY"},
"id": "call_bb8d",
}
]
See ChatGroq.bind_tools() method for more.
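To close the loop, run the tool yourself and return its output as a
ToolMessage so the model can compose a final answer. A minimal sketch (the
population figure is a hypothetical tool result):

from langchain_core.messages import ToolMessage

history = [("human", "What is the population of NY?"), ai_msg]
for tool_call in ai_msg.tool_calls:
    # Execute the requested tool (stubbed here) and echo back its call id.
    history.append(
        ToolMessage(content="8.3 million", tool_call_id=tool_call["id"])
    )
model_with_tools.invoke(history)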
Structured output:
from pydantic import BaseModel, Field
class Joke(BaseModel):
'''Joke to tell user.'''
setup: str = Field(description="The setup of the joke")
punchline: str = Field(description="The punchline to the joke")
rating: int | None = Field(default=None, description="How funny the joke is, from 1 to 10")
structured_model = model.with_structured_output(Joke)
structured_model.invoke("Tell me a joke about cats")
Joke(
setup="Why don't cats play poker in the jungle?",
punchline="Too many cheetahs!",
rating=None,
)
See ChatGroq.with_structured_output() for more.
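If your installed version supports it, with_structured_output can also use
Groq's JSON mode via method="json_mode"; the desired schema then has to be
described in the prompt itself:

structured_model = model.with_structured_output(Joke, method="json_mode")
structured_model.invoke(
    "Tell me a joke about cats. Respond with JSON containing "
    "'setup', 'punchline', and 'rating' keys."
)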
Response metadata:
ai_msg = model.invoke(messages)
ai_msg.response_metadata
{
"token_usage": {
"completion_tokens": 70,
"prompt_tokens": 28,
"total_tokens": 98,
"completion_time": 0.111956391,
"prompt_time": 0.007518279,
"queue_time": None,
"total_time": 0.11947467,
},
"model_name": "llama-3.1-8b-instant",
"system_fingerprint": "fp_c5f20b5bb1",
"finish_reason": "stop",
"logprobs": None,
}
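Token counts are also exposed in a provider-agnostic form on the message
itself (assuming a recent langchain-core), mirroring the token_usage values
above:

ai_msg.usage_metadata
{'input_tokens': 28, 'output_tokens': 70, 'total_tokens': 98}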