Call options for the ChatGoogle model.
This interface extends the base Google chat model call options and provides
configuration for individual model invocations. These options can be passed
when calling methods like invoke(), stream(), or batch() to customize
the behavior of a specific request.
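As a sketch of that usage, per-call options are typically supplied alongside the input to invoke(); the property names below (temperature, maxOutputTokens, stopSequences) are taken from the descriptions in this interface, but verify exact spellings against the generated reference.

```typescript
// A per-call options object (a sketch; property names assumed from
// the descriptions in this interface — verify before use).
const callOptions = {
  temperature: 0.2,        // less random sampling
  maxOutputTokens: 256,    // cap response length (~4 chars per token)
  stopSequences: ["\n\n"], // stop generating at the first blank line
};

// These options would customize a single request, for example:
//   const result = await model.invoke("Summarize this text.", callOptions);
```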
interface ChatGoogleCallOptions extends BaseChatGoogleCallOptions
Runtime values for attributes previously made configurable on this Runnable, or sub-Runnables.
If true, enables enhanced civic answers feature.
Positive values penalize tokens that repeatedly appear in the generated text, decreasing the probability of repeating content.
Configuration for image generation.
Returns the log probabilities of the top candidate tokens at each generation step. The model's chosen token might not be the same as the top candidate token at each step. Specify the number of candidates to return by using an integer value in the range of 1-20.
Describes the format of structured outputs. Provide this when the output is expected to be structured.
Maximum number of parallel calls to make.
Maximum number of tokens that can be generated in the response. A token is approximately four characters. 100 tokens correspond to roughly 60-80 words. Specify a lower value for shorter responses and a higher value for potentially longer responses.
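The four-characters-per-token rule of thumb above can be turned into a rough sizing aid (a heuristic only; real tokenization varies by model and content):

```typescript
// Rough token estimate using the ~4 characters/token heuristic from
// the description above. Real tokenizers differ; use only for sizing.
function approxTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// ~100 tokens corresponds to roughly 60-80 words of output.
approxTokens("a".repeat(400)); // → 100
```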
The number of reasoning tokens that the model should generate. If explicitly set, then the reasoning blocks will be returned.
Media resolution for input media processing.
Positive values penalize tokens that already appear in the generated text, increasing the probability of generating more diverse content.
An alias for maxReasoningTokens under Gemini 2.5, or the primary thinking/reasoning setting for Gemini 3. If explicitly set, the reasoning blocks will be returned.
Maximum number of times a call can recurse. If not provided, defaults to 25.
If true, returns the log probabilities of the tokens that were chosen by the model at each step. By default, this parameter is set to false.
The requested modalities of the response. Represents the set of modalities that the model can return. An empty list is equivalent to requesting only text.
The schema that the generated response should match.
Can be a Zod schema or a JSON Schema object.
When set, the response will be structured according to this schema
and responseMimeType will automatically be set to "application/json".
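For illustration, a JSON Schema object matching that description might look like the following (the schema shape itself is hypothetical):

```typescript
// Illustrative JSON Schema for a structured response. When a schema
// like this is set, responseMimeType becomes "application/json".
const responseSchema = {
  type: "object",
  properties: {
    title: { type: "string" },
    rating: { type: "number" },
  },
  required: ["title", "rating"],
};

// Passed per call, e.g. as part of the call options: { responseSchema }
```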
Unique identifier for the tracer run for this call. If not provided, a new UUID will be generated.
Name for the tracer run for this call. Defaults to the name of the class.
Per-request settings for blocking unsafe content.
When seed is fixed to a specific value, the model makes a best effort to provide the same response for repeated requests. Deterministic output isn't guaranteed. Also, changing the model or parameter settings, such as the temperature, can cause variations in the response even when you use the same seed value. By default, a random seed value is used.
Abort signal for this call. If provided, the call will be aborted when the signal is aborted.
Speech generation configuration. You can use either Google's definition of the speech configuration, or a simplified version we've defined (which can be as simple as the name of a pre-defined voice).
Stop tokens to use for this call. If not provided, the default stop tokens for the model will be used.
Specifies a list of strings that tells the model to stop generating text if one of the strings is encountered in the response. If a string appears multiple times in the response, then the response truncates where it's first encountered. The strings are case-sensitive.
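The truncation rule described above (case-sensitive, cut at the first occurrence of any stop string) can be sketched as:

```typescript
// Simulates the stop-sequence rule: truncate at the earliest
// case-sensitive occurrence of any stop string.
function applyStopSequences(text: string, stops: string[]): string {
  let cutoff = text.length;
  for (const stop of stops) {
    const i = text.indexOf(stop); // first occurrence, case-sensitive
    if (i !== -1 && i < cutoff) cutoff = i;
  }
  return text.slice(0, cutoff);
}

applyStopSequences("Answer: 42 END extra text", ["END"]); // → "Answer: 42 "
```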
The temperature is used for sampling during response generation, which occurs when topP and topK are applied. Temperature controls the degree of randomness in token selection. Lower temperatures are good for prompts that require a less open-ended or creative response, while higher temperatures can lead to more diverse or creative results.
Specify a lower value for less random responses and a higher value for more random responses.
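The effect of temperature on randomness can be illustrated with a temperature-scaled softmax (a standard construction, not ChatGoogle's internal implementation):

```typescript
// Temperature-scaled softmax: lower temperature sharpens the
// distribution (more deterministic), higher temperature flattens it.
function softmax(logits: number[], temperature: number): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled);
  const exps = scaled.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

const cold = softmax([2, 1, 0], 0.5); // top token dominates
const hot = softmax([2, 1, 0], 2.0);  // probabilities more even
```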
An alias for maxReasoningTokens for compatibility.
Configuration for the model's thinking process.
An alias for reasoningEffort for compatibility.
Timeout for this call in milliseconds.
Specifies how the chat model should use tools.
A list of tools the model may use to generate the next response. Can be LangChain tools, OpenAI tools, or Gemini function declarations.
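As an illustration of the last form, a Gemini-style function declaration is a plain object like the following (the get_weather tool is hypothetical):

```typescript
// A hypothetical Gemini-style function declaration. LangChain and
// OpenAI tool formats are also accepted per the description above.
const getWeather = {
  name: "get_weather",
  description: "Look up the current weather for a city.",
  parameters: {
    type: "object",
    properties: {
      city: { type: "string", description: "City name, e.g. Paris" },
    },
    required: ["city"],
  },
};

// Per call, this would be passed as: { tools: [getWeather] }
```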
Top-K changes how the model selects tokens for output. A top-K of 1 means the selected token is the most probable among all tokens in the model's vocabulary (also called greedy decoding), while a top-K of 3 means that the next token is selected from among the 3 most probable tokens (using temperature).
Top-P changes how the model selects tokens for output. Tokens are selected from the most probable to least probable until the sum of their probabilities equals the top-P value.
Specify a lower value for less random responses and a higher value for more random responses.
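The two selection rules above can be sketched over a probability list (illustrative only, not the model's actual decoder):

```typescript
// Keep the k most probable candidates (top-K). k = 1 is greedy decoding.
function topK(probs: number[], k: number): number[] {
  return [...probs].sort((a, b) => b - a).slice(0, k);
}

// Keep the smallest most-probable prefix whose probabilities sum to
// at least p (top-P, also called nucleus filtering).
function topP(probs: number[], p: number): number[] {
  const sorted = [...probs].sort((a, b) => b - a);
  const kept: number[] = [];
  let sum = 0;
  for (const prob of sorted) {
    kept.push(prob);
    sum += prob;
    if (sum >= p) break;
  }
  return kept;
}

topK([0.5, 0.3, 0.2], 1);   // → [0.5] (greedy decoding)
topP([0.5, 0.3, 0.2], 0.7); // → [0.5, 0.3]
```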