interface GoogleAIBaseLanguageModelCallOptions
Frequency penalty applied to the next token's logprobs,
Custom metadata labels to associate with the request.
Whether to return log probabilities of the output tokens or not.
Maximum number of tokens to generate in the completion.
The maximum number of the output tokens that will be used
Model to use
Model to use
Presence penalty applied to the next token's logprobs
An OpenAI compatible parameter that will map to "maxReasoningTokens"
An OpenAI compatible parameter that will map to "thinkingLevel"
Available for gemini-1.5-pro.
The modalities of the response.
The schema that the model's output should conform to.
Seed used in decoding. If not set, the request uses a randomly generated seed.
Speech generation configuration.
Whether or not to stream.
Sampling temperature to use
An alias for "maxReasoningTokens"
Optional. The level of thought tokens that the model should generate.
Top-k changes how the model selects tokens for output.
An integer between 0 and 5 specifying the number of
Top-p changes how the model selects tokens for output.
Allowed functions to call when the mode is "any". If empty, any one of the provided functions is called.
Runtime values for attributes previously made configurable on this Runnable, or sub-Runnables.
Frequency penalty applied to the next token's logprobs, multiplied by the number of times each token has been seen in the response so far. A positive penalty will discourage the use of tokens that have already been used, proportional to the number of times the token has been used: the more a token is used, the more difficult it is for the model to use that token again, which increases the vocabulary of responses. Caution: A negative penalty will encourage the model to reuse tokens proportional to the number of times the token has been used. Small negative values will reduce the vocabulary of a response. Larger negative values will cause the model to start repeating a common token until it hits the maxOutputTokens limit.
Custom metadata labels to associate with the request. Only supported on Vertex AI (Google Cloud Platform). Labels are key-value pairs where both keys and values must be strings.
Example:
{
  labels: {
    "team": "research",
    "component": "frontend",
    "environment": "production"
  }
}
Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.
Describes the format of structured outputs. This should be provided if an output is considered to be structured.
The maximum number of concurrent calls that can be made.
Defaults to Infinity, which means no limit.
Maximum number of tokens to generate in the completion. This may include reasoning tokens (for backwards compatibility).
The maximum number of the output tokens that will be used for the "thinking" or "reasoning" stages.
Model to use
Model to use
Alias for model
Presence penalty applied to the next token's logprobs if the token has already been seen in the response. This penalty is binary (on/off) and does not depend on the number of times the token is used after the first. Use frequencyPenalty for a penalty that increases with each use. A positive penalty will discourage the use of tokens that have already been used in the response, increasing the vocabulary. A negative penalty will encourage the use of tokens that have already been used in the response, decreasing the vocabulary.
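The difference between the frequency penalty and the presence penalty can be sketched as follows. This is an illustration only: the actual adjustment happens inside the model service, and the formula used here (subtract `count * frequencyPenalty` plus a one-time `presencePenalty`) is an assumption borrowed from the commonly documented form of these penalties, not something this page confirms. The function name `applyPenalties` and the sample scores are hypothetical.

```typescript
// Illustrative sketch, not the service's implementation.
// Assumed formula: score -= count * frequencyPenalty + (count > 0 ? presencePenalty : 0)
function applyPenalties(
  scores: Map<string, number>,
  generatedTokens: string[],
  frequencyPenalty: number,
  presencePenalty: number,
): Map<string, number> {
  // Count how often each token already appears in the response.
  const counts = new Map<string, number>();
  for (const token of generatedTokens) {
    counts.set(token, (counts.get(token) ?? 0) + 1);
  }

  const adjusted = new Map<string, number>();
  scores.forEach((score, token) => {
    const count = counts.get(token) ?? 0;
    // Frequency part grows with every use; presence part is applied once.
    const penalty = count * frequencyPenalty + (count > 0 ? presencePenalty : 0);
    adjusted.set(token, score - penalty);
  });
  return adjusted;
}

// Hypothetical scores for three candidate tokens.
const penaltyScores = new Map<string, number>([
  ["the", 2.0],
  ["cat", 1.0],
  ["dog", 1.0],
]);
const penalized = applyPenalties(penaltyScores, ["the", "cat", "the"], 0.5, 0.25);
```

In this sketch "the" (used twice) is penalized more than "cat" (used once), and "dog" (unused) is left alone. With negative values the same arithmetic raises the scores of repeated tokens, which is why large negative frequency penalties can lead to runaway repetition.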
An OpenAI compatible parameter that will map to "maxReasoningTokens"
An OpenAI compatible parameter that will map to "thinkingLevel"
Maximum number of times a call can recurse. If not provided, defaults to 25.
Available for gemini-1.5-pro.
The output format of the generated candidate text.
Supported MIME types:
text/plain: Text output.
application/json: JSON response in the candidates.
The modalities of the response.
The schema that the model's output should conform to. When this is set, the model will output JSON that conforms to the schema.
Unique identifier for the tracer run for this call. If not provided, a new UUID will be generated.
Name for the tracer run for this call. Defaults to the name of the class.
Seed used in decoding. If not set, the request uses a randomly generated seed.
Speech generation configuration. You can use either Google's definition of the speech configuration, or a simplified version we've defined (which can be as simple as the name of a pre-defined voice).
Stop tokens to use for this call. If not provided, the default stop tokens for the model will be used.
Whether or not to stream.
Whether or not to include usage data, such as token counts, in the streamed response chunks.
Sampling temperature to use
An alias for "maxReasoningTokens"
Optional. The level of thought tokens that the model should generate. Can be specified directly or via reasoningLevel for OpenAI compatibility.
Timeout for this call in milliseconds.
Top-k changes how the model selects tokens for output.
A top-k of 1 means the selected token is the most probable among all tokens in the model’s vocabulary (also called greedy decoding), while a top-k of 3 means that the next token is selected from among the 3 most probable tokens (using temperature).
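As a rough illustration of the top-k description above, the following sketch keeps only the k most probable candidates before sampling. The type `TopKCandidate`, the function `topKFilter`, and the sample probabilities are hypothetical, and renormalizing the surviving probabilities is an assumption about how such a filter is usually completed, not a statement about the service.

```typescript
interface TopKCandidate {
  token: string;
  probability: number;
}

// Keep the k most probable candidates and renormalize so they sum to 1.
function topKFilter(candidates: TopKCandidate[], k: number): TopKCandidate[] {
  const kept = candidates
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.probability - a.probability)
    .slice(0, k);
  const total = kept.reduce((sum, c) => sum + c.probability, 0);
  return kept.map((c) => ({ token: c.token, probability: c.probability / total }));
}

const topKInput: TopKCandidate[] = [
  { token: "B", probability: 0.3 },
  { token: "A", probability: 0.5 },
  { token: "C", probability: 0.2 },
];

// k = 1 leaves only the single most probable token (greedy decoding).
const greedy = topKFilter(topKInput, 1);
// k = 2 leaves the two most probable tokens to sample from.
const topTwo = topKFilter(topKInput, 2);
```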
An integer between 0 and 5 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used.
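The two constraints stated above (an integer from 0 to 5, and logprobs set to true) could be checked on the client before sending a request. This is a minimal sketch: the property name `topLogprobs` is an assumption, since the name does not appear in this text, and `validateLogprobOptions` is a hypothetical helper, not part of the library.

```typescript
// Hypothetical local shape; `topLogprobs` is an assumed property name.
interface LogprobOptions {
  logprobs?: boolean;
  topLogprobs?: number;
}

// Returns a list of human-readable problems; an empty list means the options look valid.
function validateLogprobOptions(options: LogprobOptions): string[] {
  const errors: string[] = [];
  const { logprobs, topLogprobs } = options;
  if (topLogprobs !== undefined) {
    if (!Number.isInteger(topLogprobs) || topLogprobs < 0 || topLogprobs > 5) {
      errors.push("topLogprobs must be an integer between 0 and 5");
    }
    if (logprobs !== true) {
      errors.push("logprobs must be true when topLogprobs is set");
    }
  }
  return errors;
}

const validLogprobs = validateLogprobOptions({ logprobs: true, topLogprobs: 3 });
const missingFlag = validateLogprobOptions({ topLogprobs: 3 });
const outOfRange = validateLogprobOptions({ logprobs: true, topLogprobs: 9 });
```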
Top-p changes how the model selects tokens for output.
Tokens are selected from most probable to least until the sum of their probabilities equals the top-p value.
For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1 and the top-p value is 0.5, then the model will select either A or B as the next token (using temperature).
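The A/B/C example above can be reproduced with a small sketch of top-p filtering. The names `TopPCandidate` and `topPFilter` are hypothetical, and the small tolerance used when comparing the running sum against top-p is an implementation choice made here to avoid floating-point surprises, not documented behavior.

```typescript
interface TopPCandidate {
  token: string;
  probability: number;
}

// Walk candidates from most to least probable, keeping tokens until the
// running sum of probabilities reaches the top-p value.
function topPFilter(candidates: TopPCandidate[], topP: number): TopPCandidate[] {
  const sorted = candidates.slice().sort((a, b) => b.probability - a.probability);
  const kept: TopPCandidate[] = [];
  let cumulative = 0;
  for (const candidate of sorted) {
    kept.push(candidate);
    cumulative += candidate.probability;
    // Small tolerance guards against floating-point rounding in the sum.
    if (cumulative >= topP - 1e-9) {
      break;
    }
  }
  return kept;
}

// Tokens A, B, and C with probabilities 0.3, 0.2, and 0.1, and top-p = 0.5.
const nucleus = topPFilter(
  [
    { token: "A", probability: 0.3 },
    { token: "B", probability: 0.2 },
    { token: "C", probability: 0.1 },
  ],
  0.5,
);
```

Here `nucleus` contains A and B only, matching the example: C is excluded because A and B already account for the top-p mass. The next token would then be sampled from the kept candidates according to temperature.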
The params which can be passed to the API at request time.