Method · Since v0.1

with_structured_output

Model wrapper that returns outputs formatted to match the given schema.

Behavior changed in langchain-google-vertexai 1.1.0

The return type was corrected in version 1.1.0. Previously, when a dict schema was provided, the output was a list containing a single dict of the form [{"args": {}, "name": "schema_name"}], and the value under "args" held the fields described by the schema.

As of 1.1.0, the schema-shaped value (previously found under the "args" key) is returned directly.
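
For code that must run against both older and newer releases, the two output shapes described above can be normalized with a small helper. This is only a sketch: `unwrap_legacy_output` is a hypothetical name, not part of langchain-google-vertexai, and the sample values are made up for illustration.

```python
from typing import Any


def unwrap_legacy_output(result: Any) -> Any:
    """Return the schema-shaped value from either output form.

    Hypothetical helper for illustration; not a library function.
    """
    # Pre-1.1.0 form: a one-element list holding {"args": ..., "name": ...}.
    if (
        isinstance(result, list)
        and len(result) == 1
        and isinstance(result[0], dict)
        and "args" in result[0]
        and "name" in result[0]
    ):
        return result[0]["args"]
    # 1.1.0+ form: the schema-shaped value is already returned directly.
    return result


# Illustrative stand-ins for the two shapes (not real model output).
legacy = [
    {
        "args": {"answer": "They weigh the same."},
        "name": "AnswerWithJustification",
    }
]
current = {"answer": "They weigh the same."}

print(unwrap_legacy_output(legacy))
print(unwrap_legacy_output(current))
```

Both calls print the same dict, so downstream code can treat the result uniformly.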

with_structured_output(
  self,
  schema: dict | type[BaseModel] | type,
  *,
  include_raw: bool = False,
  method: Literal['json_mode'] | None = None,
  **kwargs: Any
) -> Runnable[LanguageModelInput, dict | BaseModel]

Pydantic schema, exclude raw

from pydantic import BaseModel
from langchain_google_vertexai import ChatVertexAI

class AnswerWithJustification(BaseModel):
    '''An answer to the user question along with justification for the answer.'''

    answer: str
    justification: str

llm = ChatVertexAI(model_name="gemini-2.0-flash-001", temperature=0)
structured_llm = llm.with_structured_output(AnswerWithJustification)

structured_llm.invoke(
    "What weighs more, a pound of bricks or a pound of feathers?"
)
# -> AnswerWithJustification(
#     answer='They weigh the same.', justification='A pound is a pound.'
# )

Pydantic schema, include raw

from pydantic import BaseModel
from langchain_google_vertexai import ChatVertexAI

class AnswerWithJustification(BaseModel):
    '''An answer to the user question along with justification for the answer.'''

    answer: str
    justification: str

llm = ChatVertexAI(model_name="gemini-2.0-flash-001", temperature=0)
structured_llm = llm.with_structured_output(
    AnswerWithJustification, include_raw=True
)

structured_llm.invoke(
    "What weighs more, a pound of bricks or a pound of feathers?"
)
# -> {
#     'raw': AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_Ao02pnFYXD6GN1yzc0uXPsvF', 'function': {'arguments': '{"answer":"They weigh the same.","justification":"Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ."}', 'name': 'AnswerWithJustification'}, 'type': 'function'}]}),
#     'parsed': AnswerWithJustification(answer='They weigh the same.', justification='Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ.'),
#     'parsing_error': None
# }
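
With `include_raw=True`, a parsing failure is returned in the result rather than raised, so the caller has to check for it. The sketch below shows one way to do that. `parsed_or_raise` is a hypothetical helper, and the two result dicts are hand-written stand-ins that only mimic the `'raw'` / `'parsed'` / `'parsing_error'` keys shown above.

```python
from typing import Any


def parsed_or_raise(result: dict[str, Any]) -> Any:
    """Return result['parsed'], or re-raise the captured parsing error.

    Hypothetical helper for illustration; not a library function.
    """
    error = result.get("parsing_error")
    if error is not None:
        # The error was caught and returned, so surface it explicitly.
        if isinstance(error, BaseException):
            raise error
        raise ValueError(str(error))
    return result["parsed"]


# Stand-in results; a real 'raw' value would be a message object.
succeeded = {
    "raw": "<raw model message>",
    "parsed": {"answer": "They weigh the same."},
    "parsing_error": None,
}
failed = {
    "raw": "<raw model message>",
    "parsed": None,
    "parsing_error": ValueError("output did not match the schema"),
}

print(parsed_or_raise(succeeded))
try:
    parsed_or_raise(failed)
except ValueError as exc:
    print(f"parsing failed: {exc}")
```

Keeping the raw message next to the error makes it possible to log or retry with the original model response.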

Dict schema, exclude raw

from pydantic import BaseModel
from langchain_core.utils.function_calling import (
    convert_to_openai_function,
)
from langchain_google_vertexai import ChatVertexAI

class AnswerWithJustification(BaseModel):
    '''An answer to the user question along with justification for the answer.'''

    answer: str
    justification: str

dict_schema = convert_to_openai_function(AnswerWithJustification)
llm = ChatVertexAI(model_name="gemini-2.0-flash-001", temperature=0)
structured_llm = llm.with_structured_output(dict_schema)

structured_llm.invoke(
    "What weighs more, a pound of bricks or a pound of feathers?"
)
# -> {
#     'answer': 'They weigh the same',
#     'justification': 'Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume and density of the two substances differ.'
# }
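
Output produced from a dict schema is not validated (see the `schema` parameter), so a caller may want a light check before using it. The following is a minimal sketch of such a check; `check_fields` and the expected-type mapping are assumptions made for illustration and are not provided by the library.

```python
def check_fields(output: dict, expected: dict[str, type]) -> list[str]:
    """List problems found in a dict output; empty list means it looks fine.

    Hypothetical helper for illustration; not a library function.
    """
    problems = []
    for name, expected_type in expected.items():
        if name not in output:
            problems.append(f"missing field: {name}")
        elif not isinstance(output[name], expected_type):
            problems.append(
                f"{name}: expected {expected_type.__name__}, "
                f"got {type(output[name]).__name__}"
            )
    return problems


# Field names taken from the AnswerWithJustification example above.
expected_fields = {"answer": str, "justification": str}

good = {"answer": "They weigh the same", "justification": "A pound is a pound."}
bad = {"answer": 1}

print(check_fields(good, expected_fields))
print(check_fields(bad, expected_fields))
```

When full validation is needed, passing the Pydantic class as `schema` is the simpler route, since the returned attributes are then validated.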

Pydantic schema, streaming

from pydantic import BaseModel, Field
from langchain_google_vertexai import ChatVertexAI

class Explanation(BaseModel):
    '''A topic explanation with examples.'''

    description: str = Field(
        description="A brief description of the topic."
    )
    examples: str = Field(description="Two examples related to the topic.")

llm = ChatVertexAI(model_name="gemini-2.0-flash", temperature=0)
structured_llm = llm.with_structured_output(Explanation, method="json_mode")

for chunk in structured_llm.stream("Tell me about transformer models"):
    print(chunk)
    print("-------------------------")
# -> description='Transformer models are a type of neural network architecture that have revolutionized the field of natural language processing (NLP) and are also increasingly used in computer vision and other domains. They rely on the self-attention mechanism to weigh the importance of different parts of the input data, allowing them to effectively capture long-range dependencies. Unlike recurrent neural networks (RNNs), transformers can process the entire input sequence in parallel, leading to significantly faster training times. Key components of transformer models include: the self-attention mechanism (calculates attention weights between different parts of the input), multi-head attention (performs self-attention multiple times with different learned parameters), positional encoding (adds information about the position of tokens in the input sequence), feedforward networks (applies a non-linear transformation to each position), and encoder-decoder structure (used for sequence-to-sequence tasks).' examples='1. BERT (Bidirectional Encoder Representations from Transformers): A pre-trained transformer'
#    -------------------------
#    description='Transformer models are a type of neural network architecture that have revolutionized the field of natural language processing (NLP) and are also increasingly used in computer vision and other domains. They rely on the self-attention mechanism to weigh the importance of different parts of the input data, allowing them to effectively capture long-range dependencies. Unlike recurrent neural networks (RNNs), transformers can process the entire input sequence in parallel, leading to significantly faster training times. Key components of transformer models include: the self-attention mechanism (calculates attention weights between different parts of the input), multi-head attention (performs self-attention multiple times with different learned parameters), positional encoding (adds information about the position of tokens in the input sequence), feedforward networks (applies a non-linear transformation to each position), and encoder-decoder structure (used for sequence-to-sequence tasks).' examples='1. BERT (Bidirectional Encoder Representations from Transformers): A pre-trained transformer model that can be fine-tuned for various NLP tasks like text classification, question answering, and named entity recognition. 2. GPT (Generative Pre-trained Transformer): A language model that uses transformers to generate coherent and contextually relevant text. GPT models are used in chatbots, content creation, and code generation.'
#    -------------------------
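
In the sample output above, each chunk repeats the earlier content and extends it, which suggests that streamed chunks are cumulative snapshots rather than deltas. Assuming that holds, the complete object is simply the last chunk. The sketch below shows that pattern with a hand-written generator standing in for `structured_llm.stream(...)`; `last_chunk` and `fake_stream` are hypothetical names.

```python
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def last_chunk(chunks: Iterable[T]) -> Optional[T]:
    """Consume a stream and return its final chunk (None if it was empty)."""
    final = None
    for chunk in chunks:
        final = chunk
    return final


def fake_stream() -> Iterator[dict]:
    """Stand-in for a structured stream: each chunk extends the previous one."""
    yield {"description": "Transformer models are", "examples": ""}
    yield {
        "description": "Transformer models are a neural network architecture.",
        "examples": "1. BERT",
    }
    yield {
        "description": "Transformer models are a neural network architecture.",
        "examples": "1. BERT 2. GPT",
    }


print(last_chunk(fake_stream()))
```

Intermediate chunks remain useful for progressive display; only the final one should be treated as the complete result.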

Parameters

Name	Type	Description
`schema`*	`dict \| type[BaseModel] \| type`	The output schema, as a `dict` or a Pydantic class. With a Pydantic class, the model output is an instance of that class and its attributes are validated. With a `dict`, the model output is a `dict` and is not validated. When function calling is used (that is, `method` is not `'json_mode'`) and `schema` is a `dict`, the `dict` must match the OpenAI function-calling spec.
`include_raw`	`bool`	Default: `False`. If `False`, only the parsed structured output is returned, and any error raised while parsing the model output propagates to the caller. If `True`, the raw model response (a `BaseMessage`) is returned together with the parsed response, and any parsing error is caught and returned as well; the output is then always a `dict` with keys `'raw'`, `'parsed'`, and `'parsing_error'`.
`method`	`Literal['json_mode'] \| None`	Default: `None`. If set to `'json_mode'`, controlled generation is used to produce the response instead of function calling. This mode does not work with schemas that contain references or with Pydantic models that reference themselves.
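
Because the `method` option does not work with schemas that contain references, it can help to check a dict schema before choosing it. JSON Schema expresses references with the `"$ref"` keyword, so a recursive scan for that key is one rough way to detect them. `contains_ref` is a hypothetical helper, and both schemas below are invented for illustration.

```python
from typing import Any


def contains_ref(schema: Any) -> bool:
    """Return True if a "$ref" key appears anywhere in a JSON-schema-like value.

    Hypothetical helper for illustration; not a library function.
    """
    if isinstance(schema, dict):
        if "$ref" in schema:
            return True
        return any(contains_ref(value) for value in schema.values())
    if isinstance(schema, list):
        return any(contains_ref(item) for item in schema)
    return False


# A flat schema with no references.
flat_schema = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "justification": {"type": "string"},
    },
}

# A self-referential schema: each node may hold child nodes of the same type.
tree_schema = {
    "$defs": {
        "Node": {
            "type": "object",
            "properties": {
                "value": {"type": "string"},
                "children": {
                    "type": "array",
                    "items": {"$ref": "#/$defs/Node"},
                },
            },
        }
    },
    "$ref": "#/$defs/Node",
}

print(contains_ref(flat_schema))
print(contains_ref(tree_schema))
```

The scan is deliberately conservative: a schema that merely has a property named `$ref` would also be flagged.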

