ChatGoogleGenerativeAI(
    self,
    **kwargs: Any,
)

Bases: _BaseGoogleGenerativeAI, BaseChatModel

Enforce a schema on the output.
The format of the dictionary should follow the JSON Schema specification.
The Google GenAI SDK automatically transforms schemas for Gemini compatibility, including:

- $defs definitions (enables Union types with anyOf)
- $ref pointers for nested/recursive schemas
- minimum/maximum and minItems/maxItems constraints

Union types in Pydantic models (e.g., field: Union[TypeA, TypeB]) are
automatically converted to anyOf schemas and work correctly with the
json_schema method.
Refer to the Gemini API docs for more details on supported JSON Schema features.
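For example, a minimal sketch of a Union field handled through the default json_schema method (the EmailAction, CalendarAction, and Action models below are illustrative, not part of the library):

from typing import Union

from pydantic import BaseModel
from langchain_google_genai import ChatGoogleGenerativeAI


class EmailAction(BaseModel):
    """Draft an email."""

    to: str
    subject: str


class CalendarAction(BaseModel):
    """Create a calendar event."""

    title: str
    date: str


class Action(BaseModel):
    """Wrapper whose Union field is converted to an anyOf schema."""

    action: Union[EmailAction, CalendarAction]


model = ChatGoogleGenerativeAI(model="gemini-2.5-flash")
structured_model = model.with_structured_output(Action)  # default method="json_schema"
structured_model.invoke("Put 'Team sync' on my calendar for Friday.")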
Google GenAI chat model integration.
Setup:

Install the langchain-google-genai package and configure an API key or
Vertex AI credentials as described below.

Added in langchain-google-genai 4.0.0:
ChatGoogleGenerativeAI now supports both the Gemini Developer API and
Vertex AI Platform as backend options.
For the Gemini Developer API (simplest), provide credentials via:

- the GOOGLE_API_KEY environment variable (recommended), or
- the api_key parameter

from langchain_google_genai import ChatGoogleGenerativeAI
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview", api_key="...")
For Vertex AI Platform with API key:
export GEMINI_API_KEY='your-api-key'
export GOOGLE_GENAI_USE_VERTEXAI=true
export GOOGLE_CLOUD_PROJECT='your-project-id'
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
# Or explicitly:
model = ChatGoogleGenerativeAI(
model="gemini-3.1-pro-preview",
api_key="...",
project="your-project-id",
vertexai=True,
)
For Vertex AI with credentials:
model = ChatGoogleGenerativeAI(
model="gemini-2.5-flash",
project="your-project-id",
# Uses Application Default Credentials (ADC)
)
Automatic backend detection (when vertexai=None / unspecified):

- If the GOOGLE_GENAI_USE_VERTEXAI env var is set, uses that value
- If the credentials parameter is provided, uses Vertex AI
- If the project parameter is provided, uses Vertex AI

Environment variables:
| Variable | Purpose | Backend |
|---|---|---|
| GOOGLE_API_KEY | API key (primary) | Both (see GOOGLE_GENAI_USE_VERTEXAI) |
| GEMINI_API_KEY | API key (fallback) | Both (see GOOGLE_GENAI_USE_VERTEXAI) |
| GOOGLE_GENAI_USE_VERTEXAI | Force Vertex AI backend (true/false) | Vertex AI |
| GOOGLE_CLOUD_PROJECT | GCP project ID | Vertex AI |
| GOOGLE_CLOUD_LOCATION | GCP region (default: global) | Vertex AI |
| HTTPS_PROXY | HTTP/HTTPS proxy URL | Both |
| SSL_CERT_FILE | Custom SSL certificate file | Both |
GOOGLE_API_KEY is checked first for backwards compatibility. (GEMINI_API_KEY
was introduced later to better reflect the API's branding.)
Proxy configuration:
Set these before initializing:
export HTTPS_PROXY='http://username:password@proxy_uri:port'
export SSL_CERT_FILE='path/to/cert.pem' # Optional: custom SSL certificate
For SOCKS5 proxies or advanced proxy configuration, use the client_args parameter:
model = ChatGoogleGenerativeAI(
model="gemini-2.5-flash",
client_args={"proxy": "socks5://user:pass@host:port"},
)
from langchain_google_genai import ChatGoogleGenerativeAI
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
model.invoke("Write me a ballad about LangChain")messages = [
("system", "Translate the user sentence to French."),
("human", "I love programming."),
]
model.invoke(messages)
AIMessage(
content=[
{
"type": "text",
"text": "**J'adore la programmation.**\n\nYou can also say:...",
"extras": {"signature": "Eq0W..."},
}
],
additional_kwargs={},
response_metadata={
"prompt_feedback": {"block_reason": 0, "safety_ratings": []},
"finish_reason": "STOP",
"model_name": "gemini-3.1-pro-preview",
"safety_ratings": [],
"model_provider": "google_genai",
},
id="lc_run--63a04ced-6b63-4cf6-86a1-c32fa565938e-0",
usage_metadata={
"input_tokens": 12,
"output_tokens": 826,
"total_tokens": 838,
"input_token_details": {"cache_read": 0},
"output_token_details": {"reasoning": 777},
},
)
Content format: The shape of content may differ based on the model chosen. See
the docs for more info.
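Regardless of the exact content shape, you can read the response through the message's text property or its content_blocks view (both are used elsewhere on this page), e.g.:

ai_msg = model.invoke(messages)

# Plain-text view of the response, whatever shape `content` takes
print(ai_msg.text)

# Provider-agnostic typed blocks (text, reasoning, tool calls, ...)
for block in ai_msg.content_blocks:
    print(block["type"])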
from langchain_google_genai import ChatGoogleGenerativeAI
model = ChatGoogleGenerativeAI(model="gemini-2.5-flash")
for chunk in model.stream(messages):
print(chunk)
AIMessageChunk(
content="J",
response_metadata={"finish_reason": "STOP", "safety_ratings": []},
id="run-e905f4f4-58cb-4a10-a960-448a2bb649e3",
usage_metadata={
"input_tokens": 18,
"output_tokens": 1,
"total_tokens": 19,
},
)
AIMessageChunk(
content="'adore programmer. \\n",
response_metadata={
"finish_reason": "STOP",
"safety_ratings": [
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
],
},
id="run-e905f4f4-58cb-4a10-a960-448a2bb649e3",
usage_metadata={
"input_tokens": 18,
"output_tokens": 5,
"total_tokens": 23,
},
)
To assemble a full AIMessage from a
stream of chunks:
stream = model.stream(messages)
full = next(stream)
for chunk in stream:
full += chunk
full
AIMessageChunk(
content="J'adore programmer. \\n",
response_metadata={
"finish_reason": "STOPSTOP",
"safety_ratings": [
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
],
},
id="run-3ce13a42-cd30-4ad7-a684-f1f0b37cdeec",
usage_metadata={
"input_tokens": 36,
"output_tokens": 6,
"total_tokens": 42,
},
)
Content format: The shape of content may differ based on the model chosen. See
the docs for more info.
# invoke:
await model.ainvoke(messages)

# stream:
async for chunk in model.astream(messages):
    print(chunk)

# batch:
await model.abatch([messages])

See the docs for more info.
from pydantic import BaseModel, Field
class GetWeather(BaseModel):
    '''Get the current weather in a given location'''

    location: str = Field(
        ..., description="The city and state, e.g. San Francisco, CA"
    )


class GetPopulation(BaseModel):
    '''Get the current population in a given location'''

    location: str = Field(
        ..., description="The city and state, e.g. San Francisco, CA"
    )
model_with_tools = model.bind_tools([GetWeather, GetPopulation])
ai_msg = model_with_tools.invoke(
    "Which city is hotter today and which is bigger: LA or NY?"
)
ai_msg.tool_calls
[
{
"name": "GetWeather",
"args": {"location": "Los Angeles, CA"},
"id": "c186c99f-f137-4d52-947f-9e3deabba6f6",
},
{
"name": "GetWeather",
"args": {"location": "New York City, NY"},
"id": "cebd4a5d-e800-4fa5-babd-4aa286af4f31",
},
{
"name": "GetPopulation",
"args": {"location": "Los Angeles, CA"},
"id": "4f92d897-f5e4-4d34-a3bc-93062c92591e",
},
{
"name": "GetPopulation",
"args": {"location": "New York City, NY"},
"id": "634582de-5186-4e4b-968b-f192f0a93678",
},
]

See the docs for more info.
from typing import Optional
from pydantic import BaseModel, Field
class Joke(BaseModel):
    '''Joke to tell user.'''

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(
        default=None, description="How funny the joke is, from 1 to 10"
    )
# Default method uses json_schema for reliable structured output
structured_model = model.with_structured_output(Joke)
structured_model.invoke("Tell me a joke about cats")
# Alternative: use function_calling method (less reliable)
structured_model_fc = model.with_structured_output(
Joke, method="function_calling"
)
Joke(
setup="Why are cats so good at video games?",
punchline="They have nine lives on the internet",
rating=None,
)
Two methods are supported for structured output:

- method='json_schema' (default): Uses Gemini's native structured output API
  (the response_json_schema API param). The Google GenAI SDK automatically
  transforms schemas to ensure compatibility with Gemini, including $defs
  definitions (Union types work correctly) and $ref references for nested
  schemas. Refer to the Gemini API docs for more details. This method is
  recommended for better reliability, as it constrains the model's generation
  process directly.
- method='function_calling': Uses tool calling to extract structured data.
  Less reliable than json_schema, but compatible with all models.
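As a sketch, with_structured_output also accepts a JSON-Schema-style dict instead of a Pydantic model (the schema below mirrors the Joke model above); the result is then a plain dict:

joke_schema = {
    "title": "Joke",
    "description": "Joke to tell user.",
    "type": "object",
    "properties": {
        "setup": {"type": "string", "description": "The setup of the joke"},
        "punchline": {"type": "string", "description": "The punchline to the joke"},
        "rating": {"type": "integer", "description": "How funny the joke is, from 1 to 10"},
    },
    "required": ["setup", "punchline"],
}

structured_model = model.with_structured_output(joke_schema)
structured_model.invoke("Tell me a joke about cats")
# -> {"setup": "...", "punchline": "...", "rating": ...}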
See the docs for more info.
import base64
import httpx
from langchain.messages import HumanMessage
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")
message = HumanMessage(
content=[
{"type": "text", "text": "describe the weather in this image"},
{
"type": "image_url",
"image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
},
]
)
ai_msg = model.invoke([message])
ai_msg.content
The weather in this image appears to be sunny and pleasant. The sky is a bright
blue with scattered white clouds, suggesting fair weather. The lush green grass
and trees indicate a warm and possibly slightly breezy day. There are no...

See the docs for more info.
import base64
from langchain.messages import HumanMessage
pdf_bytes = open("/path/to/your/test.pdf", "rb").read()
pdf_base64 = base64.b64encode(pdf_bytes).decode("utf-8")
message = HumanMessage(
content=[
{"type": "text", "text": "describe the document in a sentence"},
{
"type": "file",
"source_type": "base64",
"mime_type": "application/pdf",
"data": pdf_base64,
},
]
)
ai_msg = model.invoke([message])

See the docs for more info.
import base64
from langchain.messages import HumanMessage
audio_bytes = open("/path/to/your/audio.mp3", "rb").read()
audio_base64 = base64.b64encode(audio_bytes).decode("utf-8")
message = HumanMessage(
content=[
{"type": "text", "text": "summarize this audio in a sentence"},
{
"type": "file",
"source_type": "base64",
"mime_type": "audio/mp3",
"data": audio_base64,
},
]
)
ai_msg = model.invoke([message])

See the docs for more info.
import base64
from langchain.messages import HumanMessage
video_bytes = open("/path/to/your/video.mp4", "rb").read()
video_base64 = base64.b64encode(video_bytes).decode("utf-8")
message = HumanMessage(
content=[
{
"type": "text",
"text": "describe what's in this video in a sentence",
},
{
"type": "file",
"source_type": "base64",
"mime_type": "video/mp4",
"data": video_base64,
},
]
)
ai_msg = model.invoke([message])
You can also pass YouTube URLs directly:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.messages import HumanMessage
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
message = HumanMessage(
content=[
{"type": "text", "text": "Summarize the video in 3 sentences."},
{
"type": "media",
"file_uri": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
"mime_type": "video/mp4",
},
]
)
response = model.invoke([message])
print(response.text)

See the docs for more info.
See the docs for more info.
Audio generation models (TTS) are currently in preview on Vertex AI
and may require allowlist access. If you receive an INVALID_ARGUMENT
error when using TTS models with vertexai=True, your project may need to
be allowlisted.
See this post on the Google AI forum for more details.
You can also upload files to Google's servers and reference them by URI.
This works for PDFs, images, videos, and audio files.
import time
from google import genai
from langchain.messages import HumanMessage
client = genai.Client()
myfile = client.files.upload(file="/path/to/your/sample.pdf")
while myfile.state.name == "PROCESSING":
    time.sleep(2)
    myfile = client.files.get(name=myfile.name)
message = HumanMessage(
content=[
{"type": "text", "text": "What is in the document?"},
{
"type": "media",
"file_uri": myfile.uri,
"mime_type": "application/pdf",
},
]
)
ai_msg = model.invoke([message])

See the docs for more info.
Gemini 3+ models use thinking_level
('low', 'medium', or 'high') to control reasoning depth. If not specified,
defaults to 'high'.
model = ChatGoogleGenerativeAI(
model="gemini-3.1-pro-preview",
thinking_level="low", # For faster, lower-latency responses
)
Gemini 2.5 models use thinking_budget
(an integer token count) to control reasoning. Set to 0 to disable thinking
(where supported), or -1 for dynamic thinking.
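For example, a sketch assuming a Gemini 2.5 model (supported budget values vary by model):

model = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    thinking_budget=1024,  # cap reasoning at ~1,024 thinking tokens
)

# thinking_budget=0 disables thinking (where supported);
# thinking_budget=-1 enables dynamic thinking.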
See the Gemini API docs for more details on thinking models.
To see a thinking model's thoughts, set include_thoughts=True
to have the model's reasoning summaries included in the response.
model = ChatGoogleGenerativeAI(
model="gemini-3.1-pro-preview",
include_thoughts=True,
)
ai_msg = model.invoke("How many 'r's are in the word 'strawberry'?")Gemini 3+ models return thought signatures—encrypted representations of the model's internal reasoning.
For multi-turn conversations involving tool calls, you must pass the full
AIMessage back to the model so that these
signatures are preserved. This happens automatically when you append the
AIMessage to your message list.
See the LangChain docs for more info as well as a code example.
See the Gemini API docs for more details on thought signatures.
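A minimal sketch of that pattern, reusing the GetWeather tool defined earlier (the tool result string is a placeholder): appending the full AIMessage keeps its thought signatures in the conversation.

from langchain.messages import HumanMessage, ToolMessage

model_with_tools = model.bind_tools([GetWeather])

messages = [HumanMessage("What's the weather in Boston, MA?")]
ai_msg = model_with_tools.invoke(messages)

# Append the full AIMessage (not just its text) so thought signatures are preserved
messages.append(ai_msg)

for tool_call in ai_msg.tool_calls:
    # Run your tool here; the result below is made up for illustration
    messages.append(
        ToolMessage(content="72°F and sunny", tool_call_id=tool_call["id"])
    )

final_response = model_with_tools.invoke(messages)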
See the docs for more info.
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
response = model.invoke(
"When is the next total solar eclipse in US?",
tools=[{"google_search": {}}],
)
response.content_blocks
Alternatively, you can bind the tool to the model for easier reuse across calls:
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
model_with_search = model.bind_tools([{"google_search": {}}])
response = model_with_search.invoke(
"When is the next total solar eclipse in US?"
)
response.content_blocks

See the docs for more info.
See the docs for more info.
from langchain_google_genai import ChatGoogleGenerativeAI
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
model_with_code_interpreter = model.bind_tools([{"code_execution": {}}])
response = model_with_code_interpreter.invoke("Use Python to calculate 3^3.")
response.content_blocks
[{'type': 'server_tool_call',
'name': 'code_interpreter',
'args': {'code': 'print(3**3)', 'language': <Language.PYTHON: 1>},
'id': '...'},
{'type': 'server_tool_result',
'tool_call_id': '',
'status': 'success',
'output': '27\n',
'extras': {'block_type': 'code_execution_result',
'outcome': 1}},
{'type': 'text', 'text': 'The calculation of 3 to the power of 3 is 27.'}]
See the docs for more info.
The Computer Use model is in preview and may produce unexpected behavior.
Always supervise automated tasks and avoid use with sensitive data or critical operations. See the Gemini API docs for safety best practices.
See the docs for more info.
ai_msg = model.invoke(messages)
ai_msg.usage_metadata
{"input_tokens": 18, "output_tokens": 5, "total_tokens": 23}Gemini models have default safety settings that can be overridden. If you
are receiving lots of "Safety Warnings" from your models, you can try
tweaking the safety_settings attribute of the model. For example, to
turn off safety blocking for dangerous content, you can construct your
LLM as follows:
from langchain_google_genai import (
ChatGoogleGenerativeAI,
HarmBlockThreshold,
HarmCategory,
)
llm = ChatGoogleGenerativeAI(
model="gemini-3.1-pro-preview",
safety_settings={
HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
},
)
For an enumeration of the categories and thresholds available, see Google's safety setting types.
See the docs for more info.
Context caching allows you to store and reuse content (e.g., PDFs, images) for
faster processing. The cached_content
parameter accepts a cache name created via the Google Generative AI API.
See the Gemini docs for more details on cached content.
Below are two examples: caching a single file directly and caching multiple
files using Part.
This caches a single file and queries it.
from google import genai
from google.genai import types
import time
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.messages import HumanMessage
client = genai.Client()
# Upload file
file = client.files.upload(file="path/to/your/file")
while file.state.name == "PROCESSING":
    time.sleep(2)
    file = client.files.get(name=file.name)
# Create cache
model = "gemini-3.1-pro-preview"
cache = client.caches.create(
model=model,
config=types.CreateCachedContentConfig(
display_name="Cached Content",
system_instruction=(
"You are an expert content analyzer, and your job is to answer "
"the user's query based on the file you have access to."
),
contents=[file],
ttl="300s",
),
)
# Query with LangChain
llm = ChatGoogleGenerativeAI(
model=model,
cached_content=cache.name,
)
message = HumanMessage(content="Summarize the main points of the content.")
llm.invoke([message])

This caches two files using Part and queries them together.
from google import genai
from google.genai.types import CreateCachedContentConfig, Content, Part
import time
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.messages import HumanMessage
client = genai.Client()
# Upload files
file_1 = client.files.upload(file="./file1")
while file_1.state.name == "PROCESSING":
    time.sleep(2)
    file_1 = client.files.get(name=file_1.name)
file_2 = client.files.upload(file="./file2")
while file_2.state.name == "PROCESSING":
    time.sleep(2)
    file_2 = client.files.get(name=file_2.name)
# Create cache with multiple files
contents = [
Content(
role="user",
parts=[
Part.from_uri(file_uri=file_1.uri, mime_type=file_1.mime_type),
Part.from_uri(file_uri=file_2.uri, mime_type=file_2.mime_type),
],
)
]
model = "gemini-3.1-pro-preview"
cache = client.caches.create(
model=model,
config=CreateCachedContentConfig(
display_name="Cached Contents",
system_instruction=(
"You are an expert content analyzer, and your job is to answer "
"the user's query based on the files you have access to."
),
contents=contents,
ttl="300s",
),
)
# Query with LangChain
llm = ChatGoogleGenerativeAI(
model=model,
cached_content=cache.name,
)
message = HumanMessage(
content="Provide a summary of the key information across both files."
)
llm.invoke([message])

ai_msg = model.invoke(messages)
ai_msg.response_metadata
{
"model_name": "gemini-3.1-pro-preview",
"model_provider": "google_genai",
"prompt_feedback": {"block_reason": 0, "safety_ratings": []},
"finish_reason": "STOP",
"safety_ratings": [
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "NEGLIGIBLE",
"blocked": False,
},
],
}