ChatGoogleGenerativeAI(
    self,
    **kwargs: Any,
)

Bases: _BaseGoogleGenerativeAI, BaseChatModel

Enforce a schema on the output.
The format of the dictionary should follow the JSON Schema specification.
The Google GenAI SDK automatically transforms schemas for Gemini compatibility, including:

- $defs definitions (enables Union types with anyOf)
- $ref pointers for nested/recursive schemas
- minimum/maximum and minItems/maxItems constraints

Union types in Pydantic models (e.g., field: Union[TypeA, TypeB]) are
automatically converted to anyOf schemas and work correctly with the
json_schema method.
Refer to the Gemini API docs for more details on supported JSON Schema features.
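For example, a minimal sketch of a Union field handled through the default json_schema method (the EmailAction, CalendarAction, and Action models below are illustrative, not part of the library):

from typing import Union

from pydantic import BaseModel
from langchain_google_genai import ChatGoogleGenerativeAI


class EmailAction(BaseModel):
    """Draft an email."""

    to: str
    subject: str


class CalendarAction(BaseModel):
    """Create a calendar event."""

    title: str
    date: str


class Action(BaseModel):
    """Wrapper whose Union field is converted to an anyOf schema."""

    action: Union[EmailAction, CalendarAction]


model = ChatGoogleGenerativeAI(model="gemini-2.5-flash")
structured_model = model.with_structured_output(Action)  # default method="json_schema"
structured_model.invoke("Put 'Team sync' on my calendar for Friday.")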
Google GenAI chat model integration.
Setup:

Install the langchain-google-genai package and configure an API key or
Vertex AI credentials as described below.

Added in langchain-google-genai 4.0.0:
ChatGoogleGenerativeAI now supports both the Gemini Developer API and
Vertex AI Platform as backend options.
For the Gemini Developer API (simplest), provide credentials via:

- the GOOGLE_API_KEY environment variable (recommended), or
- the api_key parameter

from langchain_google_genai import ChatGoogleGenerativeAI
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview", api_key="...")
For Vertex AI Platform with API key:
export GEMINI_API_KEY='your-api-key'
export GOOGLE_GENAI_USE_VERTEXAI=true
export GOOGLE_CLOUD_PROJECT='your-project-id'
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
# Or explicitly:
model = ChatGoogleGenerativeAI(
model="gemini-3.1-pro-preview",
api_key="...",
project="your-project-id",
vertexai=True,
)
For Vertex AI with credentials:
model = ChatGoogleGenerativeAI(
model="gemini-2.5-flash",
project="your-project-id",
# Uses Application Default Credentials (ADC)
)
Automatic backend detection (when vertexai=None / unspecified):

- If the GOOGLE_GENAI_USE_VERTEXAI env var is set, uses that value
- If the credentials parameter is provided, uses Vertex AI
- If the project parameter is provided, uses Vertex AI

Environment variables:
| Variable | Purpose | Backend |
|---|---|---|
| GOOGLE_API_KEY | API key (primary) | Both (see GOOGLE_GENAI_USE_VERTEXAI) |
| GEMINI_API_KEY | API key (fallback) | Both (see GOOGLE_GENAI_USE_VERTEXAI) |
| GOOGLE_GENAI_USE_VERTEXAI | Force Vertex AI backend (true/false) | Vertex AI |
| GOOGLE_CLOUD_PROJECT | GCP project ID | Vertex AI |
| GOOGLE_CLOUD_LOCATION | GCP region (default: global) | Vertex AI |
| HTTPS_PROXY | HTTP/HTTPS proxy URL | Both |
| SSL_CERT_FILE | Custom SSL certificate file | Both |
GOOGLE_API_KEY is checked first for backwards compatibility. (GEMINI_API_KEY
was introduced later to better reflect the API's branding.)
Proxy configuration:
Set these before initializing:
export HTTPS_PROXY='http://username:password@proxy_uri:port'
export SSL_CERT_FILE='path/to/cert.pem' # Optional: custom SSL certificate
For SOCKS5 proxies or advanced proxy configuration, use the client_args parameter:
model = ChatGoogleGenerativeAI(
model="gemini-2.5-flash",
client_args={"proxy": "socks5://user:pass@host:port"},
)
from langchain_google_genai import ChatGoogleGenerativeAI
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
model.invoke("Write me a ballad about LangChain")messages = [
("system", "Translate the user sentence to French."),
("human", "I love programming."),
]
model.invoke(messages)
AIMessage(
content=[
{
"type": "text",
"text": "**J'adore la programmation.**\n\nYou can also say:...",
"extras": {"signature": "Eq0W..."},
}
],
additional_kwargs={},
response_metadata={
"prompt_feedback": {"block_reason": 0, "safety_ratings": []},
"finish_reason": "STOP",
"model_name": "gemini-3.1-pro-preview",
"safety_ratings": [],
"model_provider": "google_genai",
},
id="lc_run--63a04ced-6b63-4cf6-86a1-c32fa565938e-0",
usage_metadata={
"input_tokens": 12,
"output_tokens": 826,
"total_tokens": 838,
"input_token_details": {"cache_read": 0},
"output_token_details": {"reasoning": 777},
},
)
Content format: The shape of content may differ based on the model chosen. See
the docs for more info.
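Regardless of the exact content shape, you can read the response through the message's text property or its content_blocks view (both are used elsewhere on this page), e.g.:

ai_msg = model.invoke(messages)

# Plain-text view of the response, whatever shape `content` takes
print(ai_msg.text)

# Provider-agnostic typed blocks (text, reasoning, tool calls, ...)
for block in ai_msg.content_blocks:
    print(block["type"])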
from langchain_google_genai import ChatGoogleGenerativeAI
model = ChatGoogleGenerativeAI(model="gemini-2.5-flash")
for chunk in model.stream(messages):
print(chunk)
AIMessageChunk(
content="J",
response_metadata={"finish_reason": "STOP", "safety_ratings": []},
id="run-e905f4f4-58cb-4a10-a960-448a2bb649e3",
usage_metadata={
"input_tokens": 18,
"output_tokens": 1,
"total_tokens": 19,
},
)
AIMessageChunk(
content="'adore programmer. \\n",
response_metadata={
"finish_reason": "STOP",
"safety_ratings": [
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
],
},
id="run-e905f4f4-58cb-4a10-a960-448a2bb649e3",
usage_metadata={
"input_tokens": 18,
"output_tokens": 5,
"total_tokens": 23,
},
)
To assemble a full AIMessage from a
stream of chunks:
stream = model.stream(messages)
full = next(stream)
for chunk in stream:
full += chunk
full
AIMessageChunk(
content="J'adore programmer. \\n",
response_metadata={
"finish_reason": "STOPSTOP",
"safety_ratings": [
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
],
},
id="run-3ce13a42-cd30-4ad7-a684-f1f0b37cdeec",
usage_metadata={
"input_tokens": 36,
"output_tokens": 6,
"total_tokens": 42,
},
)
Content format: The shape of content may differ based on the model chosen. See
the docs for more info.
# invoke:
await model.ainvoke(messages)

# stream:
async for chunk in model.astream(messages):
    print(chunk)

# batch:
await model.abatch([messages])

See the docs for more info.
from pydantic import BaseModel, Field
class GetWeather(BaseModel):
    '''Get the current weather in a given location'''

    location: str = Field(
        ..., description="The city and state, e.g. San Francisco, CA"
    )


class GetPopulation(BaseModel):
    '''Get the current population in a given location'''

    location: str = Field(
        ..., description="The city and state, e.g. San Francisco, CA"
    )
model_with_tools = model.bind_tools([GetWeather, GetPopulation])
ai_msg = model_with_tools.invoke(
    "Which city is hotter today and which is bigger: LA or NY?"
)
ai_msg.tool_calls
[
{
"name": "GetWeather",
"args": {"location": "Los Angeles, CA"},
"id": "c186c99f-f137-4d52-947f-9e3deabba6f6",
},
{
"name": "GetWeather",
"args": {"location": "New York City, NY"},
"id": "cebd4a5d-e800-4fa5-babd-4aa286af4f31",
},
{
"name": "GetPopulation",
"args": {"location": "Los Angeles, CA"},
"id": "4f92d897-f5e4-4d34-a3bc-93062c92591e",
},
{
"name": "GetPopulation",
"args": {"location": "New York City, NY"},
"id": "634582de-5186-4e4b-968b-f192f0a93678",
},
]

See the docs for more info.
from typing import Optional
from pydantic import BaseModel, Field
class Joke(BaseModel):
    '''Joke to tell user.'''

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(
        default=None, description="How funny the joke is, from 1 to 10"
    )
# Default method uses json_schema for reliable structured output
structured_model = model.with_structured_output(Joke)
structured_model.invoke("Tell me a joke about cats")
# Alternative: use function_calling method (less reliable)
structured_model_fc = model.with_structured_output(
Joke, method="function_calling"
)
Joke(
setup="Why are cats so good at video games?",
punchline="They have nine lives on the internet",
rating=None,
)
Two methods are supported for structured output:

- method='json_schema' (default): Uses Gemini's native structured output API
  (the response_json_schema API param). The Google GenAI SDK automatically
  transforms schemas to ensure compatibility with Gemini, including $defs
  definitions (Union types work correctly) and $ref references for nested
  schemas. Refer to the Gemini API docs for more details. This method is
  recommended for better reliability, as it constrains the model's generation
  process directly.
- method='function_calling': Uses tool calling to extract structured data.
  Less reliable than json_schema, but compatible with all models.
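As a sketch, with_structured_output also accepts a JSON-Schema-style dict instead of a Pydantic model (the schema below mirrors the Joke model above); the result is then a plain dict:

joke_schema = {
    "title": "Joke",
    "description": "Joke to tell user.",
    "type": "object",
    "properties": {
        "setup": {"type": "string", "description": "The setup of the joke"},
        "punchline": {"type": "string", "description": "The punchline to the joke"},
        "rating": {"type": "integer", "description": "How funny the joke is, from 1 to 10"},
    },
    "required": ["setup", "punchline"],
}

structured_model = model.with_structured_output(joke_schema)
structured_model.invoke("Tell me a joke about cats")
# -> {"setup": "...", "punchline": "...", "rating": ...}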
See the docs for more info.
import base64
import httpx
from langchain.messages import HumanMessage
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")
message = HumanMessage(
content=[
{"type": "text", "text": "describe the weather in this image"},
{
"type": "image_url",
"image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
},
]
)
ai_msg = model.invoke([message])
ai_msg.content
The weather in this image appears to be sunny and pleasant. The sky is a bright
blue with scattered white clouds, suggesting fair weather. The lush green grass
and trees indicate a warm and possibly slightly breezy day. There are no...

See the docs for more info.
import base64
from langchain.messages import HumanMessage
pdf_bytes = open("/path/to/your/test.pdf", "rb").read()
pdf_base64 = base64.b64encode(pdf_bytes).decode("utf-8")
message = HumanMessage(
content=[
{"type": "text", "text": "describe the document in a sentence"},
{
"type": "file",
"source_type": "base64",
"mime_type": "application/pdf",
"data": pdf_base64,
},
]
)
ai_msg = model.invoke([message])

See the docs for more info.
import base64
from langchain.messages import HumanMessage
audio_bytes = open("/path/to/your/audio.mp3", "rb").read()
audio_base64 = base64.b64encode(audio_bytes).decode("utf-8")
message = HumanMessage(
content=[
{"type": "text", "text": "summarize this audio in a sentence"},
{
"type": "file",
"source_type": "base64",
"mime_type": "audio/mp3",
"data": audio_base64,
},
]
)
ai_msg = model.invoke([message])

See the docs for more info.
import base64
from langchain.messages import HumanMessage
video_bytes = open("/path/to/your/video.mp4", "rb").read()
video_base64 = base64.b64encode(video_bytes).decode("utf-8")
message = HumanMessage(
content=[
{
"type": "text",
"text": "describe what's in this video in a sentence",
},
{
"type": "file",
"source_type": "base64",
"mime_type": "video/mp4",
"data": video_base64,
},
]
)
ai_msg = model.invoke([message])
You can also pass YouTube URLs directly:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.messages import HumanMessage
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
message = HumanMessage(
content=[
{"type": "text", "text": "Summarize the video in 3 sentences."},
{
"type": "media",
"file_uri": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
"mime_type": "video/mp4",
},
]
)
response = model.invoke([message])
print(response.text)

See the docs for more info.
See the docs for more info.
Audio generation models (TTS) are currently in preview on Vertex AI
and may require allowlist access. If you receive an INVALID_ARGUMENT
error when using TTS models with vertexai=True, your project may need to
be allowlisted.
See this post on the Google AI forum for more details.
You can also upload files to Google's servers and reference them by URI.
This works for PDFs, images, videos, and audio files.
import time
from google import genai
from langchain.messages import HumanMessage
client = genai.Client()
myfile = client.files.upload(file="/path/to/your/sample.pdf")
while myfile.state.name == "PROCESSING":
    time.sleep(2)
    myfile = client.files.get(name=myfile.name)
message = HumanMessage(
content=[
{"type": "text", "text": "What is in the document?"},
{
"type": "media",
"file_uri": myfile.uri,
"mime_type": "application/pdf",
},
]
)
ai_msg = model.invoke([message])

See the docs for more info.
Gemini 3+ models use thinking_level
('low', 'medium', or 'high') to control reasoning depth. If not specified,
defaults to 'high'.
model = ChatGoogleGenerativeAI(
model="gemini-3.1-pro-preview",
thinking_level="low", # For faster, lower-latency responses
)
Gemini 2.5 models use thinking_budget
(an integer token count) to control reasoning. Set to 0 to disable thinking
(where supported), or -1 for dynamic thinking.
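For example, a sketch assuming a Gemini 2.5 model (supported budget values vary by model):

model = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    thinking_budget=1024,  # cap reasoning at ~1,024 thinking tokens
)

# thinking_budget=0 disables thinking (where supported);
# thinking_budget=-1 enables dynamic thinking.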
See the Gemini API docs for more details on thinking models.
To see a thinking model's thoughts, set include_thoughts=True
to have the model's reasoning summaries included in the response.
model = ChatGoogleGenerativeAI(
model="gemini-3.1-pro-preview",
include_thoughts=True,
)
ai_msg = model.invoke("How many 'r's are in the word 'strawberry'?")Gemini 3+ models return thought signatures—encrypted representations of the model's internal reasoning.
For multi-turn conversations involving tool calls, you must pass the full
AIMessage back to the model so that these
signatures are preserved. This happens automatically when you append the
AIMessage to your message list.
See the LangChain docs for more info as well as a code example.
See the Gemini API docs for more details on thought signatures.
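A minimal sketch of that pattern, reusing the GetWeather tool defined earlier (the tool result string is a placeholder): appending the full AIMessage keeps its thought signatures in the conversation.

from langchain.messages import HumanMessage, ToolMessage

model_with_tools = model.bind_tools([GetWeather])

messages = [HumanMessage("What's the weather in Boston, MA?")]
ai_msg = model_with_tools.invoke(messages)

# Append the full AIMessage (not just its text) so thought signatures are preserved
messages.append(ai_msg)

for tool_call in ai_msg.tool_calls:
    # Run your tool here; the result below is made up for illustration
    messages.append(
        ToolMessage(content="72°F and sunny", tool_call_id=tool_call["id"])
    )

final_response = model_with_tools.invoke(messages)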
See the docs for more info.
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
response = model.invoke(
"When is the next total solar eclipse in US?",
tools=[{"google_search": {}}],
)
response.content_blocks
Alternatively, you can bind the tool to the model for easier reuse across calls:
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
model_with_search = model.bind_tools([{"google_search": {}}])
response = model_with_search.invoke(
"When is the next total solar eclipse in US?"
)
response.content_blocks

See the docs for more info.
See the docs for more info.
from langchain_google_genai import ChatGoogleGenerativeAI
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
model_with_code_interpreter = model.bind_tools([{"code_execution": {}}])
response = model_with_code_interpreter.invoke("Use Python to calculate 3^3.")
response.content_blocks
[{'type': 'server_tool_call',
'name': 'code_interpreter',
'args': {'code': 'print(3**3)', 'language': <Language.PYTHON: 1>},
'id': '...'},
{'type': 'server_tool_result',
'tool_call_id': '',
'status': 'success',
'output': '27\n',
'extras': {'block_type': 'code_execution_result',
'outcome': 1}},
{'type': 'text', 'text': 'The calculation of 3 to the power of 3 is 27.'}]
See the docs for more info.
The Computer Use model is in preview and may produce unexpected behavior.
Always supervise automated tasks and avoid use with sensitive data or critical operations. See the Gemini API docs for safety best practices.
See the docs for more info.
ai_msg = model.invoke(messages)
ai_msg.usage_metadata
{"input_tokens": 18, "output_tokens": 5, "total_tokens": 23}Gemini models have default safety settings that can be overridden. If you
are receiving lots of "Safety Warnings" from your models, you can try
tweaking the safety_settings attribute of the model. For example, to
turn off safety blocking for dangerous content, you can construct your
LLM as follows:
from langchain_google_genai import (
ChatGoogleGenerativeAI,
HarmBlockThreshold,
HarmCategory,
)
llm = ChatGoogleGenerativeAI(
model="gemini-3.1-pro-preview",
safety_settings={
HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
},
)
For an enumeration of the categories and thresholds available, see Google's safety setting types.
See the docs for more info.
Context caching allows you to store and reuse content (e.g., PDFs, images) for
faster processing. The cached_content
parameter accepts a cache name created via the Google Generative AI API.
See the Gemini docs for more details on cached content.
Below are two examples: caching a single file directly and caching multiple
files using Part.
This caches a single file and queries it.
from google import genai
from google.genai import types
import time
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.messages import HumanMessage
client = genai.Client()
# Upload file
file = client.files.upload(file="path/to/your/file")
while file.state.name == "PROCESSING":
    time.sleep(2)
    file = client.files.get(name=file.name)
# Create cache
model = "gemini-3.1-pro-preview"
cache = client.caches.create(
model=model,
config=types.CreateCachedContentConfig(
display_name="Cached Content",
system_instruction=(
"You are an expert content analyzer, and your job is to answer "
"the user's query based on the file you have access to."
),
contents=[file],
ttl="300s",
),
)
# Query with LangChain
llm = ChatGoogleGenerativeAI(
model=model,
cached_content=cache.name,
)
message = HumanMessage(content="Summarize the main points of the content.")
llm.invoke([message])

This caches two files using Part and queries them together.
from google import genai
from google.genai.types import CreateCachedContentConfig, Content, Part
import time
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.messages import HumanMessage
client = genai.Client()
# Upload files
file_1 = client.files.upload(file="./file1")
while file_1.state.name == "PROCESSING":
    time.sleep(2)
    file_1 = client.files.get(name=file_1.name)
file_2 = client.files.upload(file="./file2")
while file_2.state.name == "PROCESSING":
    time.sleep(2)
    file_2 = client.files.get(name=file_2.name)
# Create cache with multiple files
contents = [
Content(
role="user",
parts=[
Part.from_uri(file_uri=file_1.uri, mime_type=file_1.mime_type),
Part.from_uri(file_uri=file_2.uri, mime_type=file_2.mime_type),
],
)
]
model = "gemini-3.1-pro-preview"
cache = client.caches.create(
model=model,
config=CreateCachedContentConfig(
display_name="Cached Contents",
system_instruction=(
"You are an expert content analyzer, and your job is to answer "
"the user's query based on the files you have access to."
),
contents=contents,
ttl="300s",
),
)
# Query with LangChain
llm = ChatGoogleGenerativeAI(
model=model,
cached_content=cache.name,
)
message = HumanMessage(
content="Provide a summary of the key information across both files."
)
llm.invoke([message])

ai_msg = model.invoke(messages)
ai_msg.response_metadata
{
"model_name": "gemini-3.1-pro-preview",
"model_provider": "google_genai",
"prompt_feedback": {"block_reason": 0, "safety_ratings": []},
"finish_reason": "STOP",
"safety_ratings": [
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
],
}