Input parameters for the DeepInfra embeddings
interface DeepInfraEmbeddingsParams
The API token to use for authentication.
If not provided, it will be read from the DEEPINFRA_API_TOKEN environment variable.
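The token fallback described above can be sketched as follows. This is only an illustration of the lookup order: `resolveApiToken` is a hypothetical helper, not part of the library, and throwing when no token is found is an assumption about reasonable behavior rather than documented behavior.

```typescript
// Hypothetical helper illustrating the lookup order: explicit value first,
// then the DEEPINFRA_API_TOKEN environment variable.
function resolveApiToken(
  explicit: string | undefined,
  env: Record<string, string | undefined>
): string {
  const token = explicit ?? env["DEEPINFRA_API_TOKEN"];
  if (!token) {
    // Assumption: failing early is preferable to sending an unauthenticated request.
    throw new Error(
      "DeepInfra API token not found: pass it explicitly or set DEEPINFRA_API_TOKEN."
    );
  }
  return token;
}
```

In a Node.js program the second argument would typically be `process.env`.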
Prompt processing batch size.
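As a rough illustration, and assuming the batch size means the number of input texts grouped into one request, batching could look like the sketch below. `chunkArray` is a hypothetical helper name, not a function taken from the library.

```typescript
// Hypothetical helper: splits a list of inputs into consecutive batches
// of at most `batchSize` items each.
function chunkArray<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```

With five texts and a batch size of 2, this produces three batches, the last holding a single text.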
The maximum number of concurrent calls that can be made.
Defaults to Infinity, which means no limit.
The maximum number of retries that can be made for a single call, with an exponential backoff between each attempt. Defaults to 6.
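To make the retry behavior concrete, here is a minimal sketch of retrying with exponential backoff. The base delay of 1000 ms and the doubling factor are assumptions chosen for illustration; the actual delays used by the library are not stated here. `backoffDelays` and `retryWithBackoff` are hypothetical names.

```typescript
// Delay before each retry: base, 2*base, 4*base, ... (assumed schedule).
function backoffDelays(maxRetries: number, baseDelayMs = 1000): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt += 1) {
    delays.push(baseDelayMs * 2 ** attempt);
  }
  return delays;
}

// Calls `fn` once, then retries up to `maxRetries` more times on failure,
// waiting according to the schedule above between attempts.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 6,
  baseDelayMs = 1000
): Promise<T> {
  const delays = backoffDelays(maxRetries, baseDelayMs);
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries) {
        throw err; // retries exhausted: surface the last error
      }
      await new Promise<void>((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
}
```

Under these assumptions, the default of 6 retries means a call is attempted at most 7 times in total.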
Model name to use. Available options are: qwen-turbo, qwen-plus, qwen-max, or other compatible models.
Alias for model
Custom handler for failed attempts. It receives the originally thrown error object as input and should itself throw an error if the input error is not retryable.
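A handler following this contract might look like the sketch below. The `HttpLikeError` shape with a numeric `status` field and the set of status codes treated as non-retryable are assumptions made for illustration; real errors may carry this information differently.

```typescript
// Hypothetical error shape: an Error that may carry an HTTP status code.
interface HttpLikeError extends Error {
  status?: number;
}

// Assumption: client errors such as bad credentials will not succeed on retry.
const NON_RETRYABLE_STATUSES = new Set([400, 401, 403, 404]);

// Throws for errors that should stop the retry loop; returns normally
// for errors that may be retried.
function onFailedAttempt(error: HttpLikeError): void {
  if (error.status !== undefined && NON_RETRYABLE_STATUSES.has(error.status)) {
    throw error;
  }
}
```

Returning without throwing signals that the call may be attempted again, subject to the retry limit.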