interface ChatOpenAICompletionsCallOptions
Parameters for audio output. Required when audio output is requested with
modalities: ["audio"].
Callbacks for this call and any sub-calls (e.g., a Chain calling an LLM). Tags are passed to all callbacks; metadata is passed to handle*Start callbacks.
Runtime values for attributes previously made configurable on this Runnable, or sub-Runnables.
Describes the format of structured outputs. This should be provided if an output is considered to be structured.
Maximum number of parallel calls to make.
Output types that you would like the model to generate for this request. Most models are capable of generating text, which is the default:
["text"]
The gpt-4o-audio-preview model can also be used to generate audio. To request that this model generate both text and audio responses, you can use:
["text", "audio"]
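The pairing between modalities and the audio parameters can be sketched as a small consistency check. This is only an illustration: the types below are local stand-ins rather than imports from any library, the helper checkAudioOptions is hypothetical, and the voice and format values are example settings.

```typescript
// Local stand-in types for this sketch; not imported from any library.
type Modality = "text" | "audio";

interface AudioParams {
  voice: string;  // example value: "alloy"
  format: string; // example value: "wav"
}

interface AudioCallOptions {
  modalities?: Modality[];
  audio?: AudioParams;
}

// Hypothetical helper: returns a list of problems, empty when the options are consistent.
function checkAudioOptions(options: AudioCallOptions): string[] {
  const problems: string[] = [];
  // Text is the default output type when modalities is omitted.
  const modalities: Modality[] = options.modalities ?? ["text"];
  const wantsAudio = modalities.some((m) => m === "audio");
  if (wantsAudio && options.audio === undefined) {
    problems.push('modalities includes "audio" but no audio parameters were given');
  }
  return problems;
}

// Request both text and audio, with the audio parameters supplied.
const textAndAudio: AudioCallOptions = {
  modalities: ["text", "audio"],
  audio: { voice: "alloy", format: "wav" },
};
```

Running the check before a call surfaces a missing audio block locally instead of as a request error.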
Additional options to pass to the underlying axios request.
Version of AIMessage output format to store in message content.
AIMessage.contentBlocks will lazily parse the contents of content into a
standard format. This flag can be used to additionally store the standard format
as the message content, e.g., for serialization purposes.
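"v0": provider-specific format in content (can be lazily parsed with .contentBlocks)
"v1": standardized format in content (consistent with .contentBlocks)
You can also set LC_OUTPUT_VERSION as an environment variable to "v1" to enable this by default.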
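How a per-call value and the environment default might interact can be sketched as follows. This rests on two assumptions that the text above does not confirm: that an explicitly passed value wins over LC_OUTPUT_VERSION, and that the fallback is the provider-specific format, labeled "v0" here. The function resolveOutputVersion is hypothetical and not part of any library.

```typescript
type OutputVersion = "v0" | "v1";

// Hypothetical helper. The environment is passed in explicitly so the sketch
// does not depend on any runtime global.
function resolveOutputVersion(
  explicit: OutputVersion | undefined,
  env: Record<string, string | undefined>,
): OutputVersion {
  // Assumption: a value given for the call takes precedence over the environment.
  if (explicit !== undefined) return explicit;
  // Assumption: anything other than "v1" falls back to the provider-specific format.
  return env["LC_OUTPUT_VERSION"] === "v1" ? "v1" : "v0";
}
```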
The model may choose to call multiple functions in a single turn. Set parallel_tool_calls to false to ensure that at most one tool is called.
Static predicted output content, such as the content of a text file that is being regenerated.
Used by OpenAI to cache responses for similar requests to optimize your cache
hit rates. Replaces the user field.
Used by OpenAI to set the cache retention time.
Adds a prompt index to prompts passed to the model to track what prompt is being used for a given generation.
Options for reasoning models.
Note that some options, like reasoning summaries, are only available when using the responses API. If these options are set, the responses API will be used to fulfill the request.
These options will be ignored when not using a reasoning model.
Maximum number of times a call can recurse. If not provided, defaults to 25.
An object specifying the format that the model must output.
Unique identifier for the tracer run for this call. If not provided, a new UUID will be generated.
Name for the tracer run for this call. Defaults to the name of the class.
When provided, the completions API will make a best effort to sample
deterministically, such that repeated requests with the same seed
and parameters should return the same result.
Service tier to use for this request, for prioritization and latency optimization. Can be "auto", "default", or "flex".
Abort signal for this call. If provided, the call will be aborted when the signal is aborted.
Stop tokens to use for this call. If not provided, the default stop tokens for the model will be used.
Additional options to pass to streamed completions. If provided, this takes precedence over "streamUsage" set at initialization time.
If true, model output is guaranteed to exactly match the JSON Schema
provided in the tool definition. If true, the input schema will also be
validated according to
https://platform.openai.com/docs/guides/structured-outputs/supported-schemas.
If false, the input schema will not be validated and model output will not
be validated.
If undefined, the strict argument will not be passed to the model.
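Two of the constraints listed on the supported-schemas page are that every object sets additionalProperties to false and that every property appears in required. A minimal pre-flight check for just those two rules might look like the sketch below. The SimpleSchema type and the strictProblems function are hypothetical, and the check covers only a small part of what strict mode validates.

```typescript
// Minimal JSON Schema shape for this sketch; real schemas have many more keywords.
interface SimpleSchema {
  type: string;
  properties?: Record<string, SimpleSchema>;
  required?: string[];
  additionalProperties?: boolean;
  items?: SimpleSchema;
}

// Hypothetical helper: walks the schema and reports violations of the two rules.
function strictProblems(schema: SimpleSchema, path: string = "$"): string[] {
  const problems: string[] = [];
  if (schema.type === "object") {
    if (schema.additionalProperties !== false) {
      problems.push(`${path}: additionalProperties must be false`);
    }
    const required = schema.required ?? [];
    const properties = schema.properties ?? {};
    for (const key of Object.keys(properties)) {
      const child = properties[key];
      if (child === undefined) continue;
      if (required.indexOf(key) === -1) {
        problems.push(`${path}.${key}: must be listed in required`);
      }
      // Recurse into nested objects and arrays.
      problems.push(...strictProblems(child, `${path}.${key}`));
    }
  }
  if (schema.type === "array" && schema.items !== undefined) {
    problems.push(...strictProblems(schema.items, `${path}[]`));
  }
  return problems;
}

// Example tool parameters that satisfy both rules.
const weatherParams: SimpleSchema = {
  type: "object",
  properties: { city: { type: "string" } },
  required: ["city"],
  additionalProperties: false,
};
```

An empty result does not guarantee that the schema is accepted; it only rules out these two common mistakes.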
Timeout for this call in milliseconds.
Specifies which tool the model should use to respond. Can be an OpenAIToolChoice or a ResponsesToolChoice. If not set, the model will decide which tool to use automatically.
A list of tools that the model may use to generate responses. Each tool can be a function, a built-in tool, or a custom tool definition. If not provided, the model will not use any tools.
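As an illustration of how tools and the tool choice fit together, the sketch below builds a function tool in the OpenAI Chat Completions wire format and a matching choice that forces that tool. The types are local stand-ins and cover only function tools, not built-in or custom ones. The get_weather tool and the forceTool helper are hypothetical.

```typescript
// Local stand-in types mirroring the Chat Completions function-tool format.
interface FunctionTool {
  type: "function";
  function: {
    name: string;
    description?: string;
    parameters: Record<string, unknown>;
  };
}

type ToolChoice =
  | "auto"
  | "none"
  | "required"
  | { type: "function"; function: { name: string } };

// Hypothetical helper: forces a named tool, refusing names missing from the tools list.
function forceTool(tools: FunctionTool[], name: string): ToolChoice {
  const known = tools.some((tool) => tool.function.name === name);
  if (!known) {
    throw new Error(`tool "${name}" is not in the tools list`);
  }
  return { type: "function", function: { name } };
}

// Hypothetical example tool.
const tools: FunctionTool[] = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Look up the current weather for a city.",
      parameters: {
        type: "object",
        properties: { city: { type: "string" } },
        required: ["city"],
        additionalProperties: false,
      },
    },
  },
];

// Call options that force the tool and allow at most one tool call per turn.
const callOptions = {
  tools,
  tool_choice: forceTool(tools, "get_weather"),
  parallel_tool_calls: false,
};
```

Leaving tool_choice unset lets the model decide which tool to use, as described above.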
The verbosity of the model's response.
Constrains effort on reasoning for reasoning models. Reduces reasoning in responses, which can reduce latency and cost at the expense of quality.
Accepts values: "low", "medium", or "high".