Configuration parameters for the ChatGoogle model.
This interface extends the base Google chat model parameters and adds specific configuration options for the Generative AI API.
interface ChatGoogleParams

Optional. The API client implementation for making HTTP requests to the Gemini API. If not set, a default client will be used based on the runtime environment.
The API version to use. This forms part of the request path. Usually this is computed based on platformType.
GCP service account credentials for authentication. Can be provided as either a JSON string or a parsed credentials object. If not provided, the client will attempt to read from the GOOGLE_CLOUD_CREDENTIALS environment variable.
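As a sketch of how the authentication-related options above fit together (the property names and union types here are assumptions for illustration, not confirmed against the library's actual interface):

```typescript
// Hypothetical shape of the auth-related ChatGoogleParams fields;
// the real interface lives in the library and may differ.
interface AuthParams {
  credentials?: string | object; // JSON string or parsed service-account object
  apiVersion?: string;           // part of the request path
  platformType?: "gai" | "gcp";  // which platform to run the service on
  location?: string;             // GCP region; defaults to "global"
}

// Credentials given explicitly; if omitted, the client falls back to
// the GOOGLE_CLOUD_CREDENTIALS environment variable.
const params: AuthParams = {
  credentials: '{"client_email": "svc@example.iam.gserviceaccount.com"}',
  location: "global",
};
```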
If true, enables the enhanced civic answers feature.
Hostname for the API call (if this is running on GCP). Usually this is computed based on location and platformType.
Positive values penalize tokens that repeatedly appear in the generated text, decreasing the probability of repeating content.
Google Auth options configuration. Provides fine-grained control over authentication behavior using the google-auth-library. When provided, this takes precedence over simple credential-based auth but is lower priority than API key authentication.
Configuration for image generation.
Region where the LLM is stored (if this is running on GCP). Defaults to "global".
Returns the log probabilities of the top candidate tokens at each generation step. The model's chosen token might not be the same as the top candidate token at each step. Specify the number of candidates to return by using an integer value in the range of 1-20.
Maximum number of parallel calls to make.
Maximum number of tokens that can be generated in the response. A token is approximately four characters. 100 tokens correspond to roughly 60-80 words. Specify a lower value for shorter responses and a higher value for potentially longer responses.
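Since a token is roughly four characters, a quick heuristic for budgeting the maximum output tokens can be sketched as follows (this is our own arithmetic from the rule of thumb above, not a library function):

```typescript
// Rough token estimate from the "one token ≈ four characters" rule of thumb.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// 400 characters ≈ 100 tokens ≈ 60-80 words of output budget.
const approx = estimateTokens("x".repeat(400));
```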
The number of reasoning tokens that the model should generate. If explicitly set, then the reasoning blocks will be returned.
The maximum number of retries that can be made for a single call, with an exponential backoff between each attempt. Defaults to 6.
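The exponential backoff described above can be sketched like this (the base delay and cap are illustrative values, not the library's actual timings):

```typescript
// Illustrative exponential backoff: the delay doubles with each attempt,
// capped so a long run of failures does not wait unboundedly.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 60_000): number {
  return Math.min(baseMs * 2 ** attempt, capMs);
}

// With the default of 6 retries, attempts 0..5 would wait
// 1s, 2s, 4s, 8s, 16s, and 32s under these illustrative values.
```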
Media resolution for input media processing.
Custom handler to handle failed attempts. Takes the originally thrown error object as input, and should itself throw an error if the input error is not retryable.
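A minimal handler matching that contract might look like the following (the way it classifies errors by status code in the message is an assumption for illustration):

```typescript
// A failed-attempt handler: rethrow when the error is not retryable,
// return normally to let the retry loop try again.
function onFailedAttempt(error: Error): void {
  // Treat anything that looks like a 4xx client error as non-retryable.
  if (/\b4\d\d\b/.test(error.message)) {
    throw error; // e.g. 400/401/404: retrying will not help
  }
  // Otherwise (timeouts, 5xx, etc.) return so the call is retried.
}
```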
What platform to run the service on. If not specified, the class should determine this by other means. Either way, the platform actually used will be in the "platform" getter.
Positive values penalize tokens that already appear in the generated text, increasing the probability of generating more diverse content.
Under Gemini 2.5, an alias for maxReasoningTokens; under Gemini 3, the primary thinking/reasoning setting. If explicitly set, then the reasoning blocks will be returned.
If true, returns the log probabilities of the tokens that were chosen by the model at each step. By default, this parameter is set to false.
The requested modalities of the response. Represents the set of modalities that the model can return. An empty list is equivalent to requesting only text.
The schema that the generated response should match.
Can be a Zod schema or a JSON Schema object.
When set, the response will be structured according to this schema
and responseMimeType will automatically be set to "application/json".
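For instance, a plain JSON Schema object for a structured response might look like this (the field names in the schema are illustrative):

```typescript
// A JSON Schema the generated response should match. When a schema like
// this is set, responseMimeType is automatically "application/json".
const responseSchema = {
  type: "object",
  properties: {
    title: { type: "string" },
    rating: { type: "number", minimum: 1, maximum: 5 },
  },
  required: ["title", "rating"],
};
```

A Zod schema could be passed in its place, per the description above.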
Per-request settings for blocking unsafe content.
When seed is fixed to a specific value, the model makes a best effort to provide the same response for repeated requests. Deterministic output isn't guaranteed. Also, changing the model or parameter settings, such as the temperature, can cause variations in the response even when you use the same seed value. By default, a random seed value is used.
Speech generation configuration. You can use either Google's definition of the speech configuration, or a simplified version we've defined (which can be as simple as the name of a pre-defined voice).
Specifies a list of strings that tells the model to stop generating text if one of the strings is encountered in the response. If a string appears multiple times in the response, then the response truncates where it's first encountered. The strings are case-sensitive.
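The truncation rule can be illustrated with a small helper (this mirrors the behavior as described; it is not the library's implementation):

```typescript
// Truncate at the first occurrence of any stop string (case-sensitive).
function applyStop(text: string, stop: string[]): string {
  let cut = text.length;
  for (const s of stop) {
    const i = text.indexOf(s);
    if (i !== -1 && i < cut) cut = i;
  }
  return text.slice(0, cut);
}
```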
The temperature is used for sampling during response generation, which occurs when topP and topK are applied. Temperature controls the degree of randomness in token selection. Lower temperatures are good for prompts that require a less open-ended or creative response, while higher temperatures can lead to more diverse or creative results.
Specify a lower value for less random responses and a higher value for more random responses.
An alias for maxReasoningTokens for compatibility.
Configuration for the model's thinking process.
An alias for reasoningEffort for compatibility.
A list of tools the model may use to generate the next response. Can be LangChain tools, OpenAI tools, or Gemini function declarations.
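As one of the accepted forms, a Gemini-style function declaration is a plain object with a name, description, and an OpenAPI-style parameter schema (the weather example and exact casing here are illustrative):

```typescript
// A Gemini-style function declaration the model may choose to call.
const getWeather = {
  name: "get_weather",
  description: "Look up the current weather for a city.",
  parameters: {
    type: "object",
    properties: {
      city: { type: "string", description: "City name, e.g. Paris" },
    },
    required: ["city"],
  },
};

// This object would go in the tools list alongside any LangChain or
// OpenAI-format tools.
```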
Top-K changes how the model selects tokens for output. A top-K of 1 means the selected token is the most probable among all tokens in the model's vocabulary (also called greedy decoding), while a top-K of 3 means that the next token is selected from among the 3 most probable tokens (using temperature).
Top-P changes how the model selects tokens for output. Tokens are selected from the most probable to least probable until the sum of their probabilities equals the top-P value.
Specify a lower value for less random responses and a higher value for more random responses.
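The top-P selection rule above can be sketched as follows (our own illustration of nucleus selection, not the model's actual sampler):

```typescript
// Keep tokens from most to least probable until their cumulative
// probability reaches topP; sampling then happens within this set.
function nucleus(probs: [string, number][], topP: number): string[] {
  const sorted = [...probs].sort((a, b) => b[1] - a[1]);
  const kept: string[] = [];
  let cum = 0;
  for (const [token, p] of sorted) {
    kept.push(token);
    cum += p;
    if (cum >= topP) break;
  }
  return kept;
}
```

With a lower topP, fewer candidate tokens survive, so responses are less random; a higher topP admits more of the tail.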
Whether to print out response text.
For compatibility with Google's libraries, should this use Vertex? The "platformType" parameter takes precedence.