interface TogetherAIInputs
Whether to return log probabilities of the output tokens. If true, returns the log probability of each output token in the content of the message.
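As an illustration of what returned log probabilities can be used for, the following minimal sketch converts them to plain probabilities and to a sequence-level perplexity. The `TokenLogprob` shape and both helper names are hypothetical and are not the library's response type.

```typescript
// Hypothetical shape for one output token and its log probability.
interface TokenLogprob {
  token: string;
  logprob: number; // natural log of the token's probability
}

// Convert each log probability to a plain probability in [0, 1].
function toProbabilities(tokens: TokenLogprob[]): number[] {
  return tokens.map((t) => Math.exp(t.logprob));
}

// Perplexity: exp of the negative mean log probability.
function perplexity(tokens: TokenLogprob[]): number {
  if (tokens.length === 0) {
    throw new Error("perplexity is undefined for an empty sequence");
  }
  const total = tokens.reduce((sum, t) => sum + t.logprob, 0);
  return Math.exp(-total / tokens.length);
}
```

A lower perplexity means the model assigned higher probability, on average, to the tokens it produced.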
The maximum number of concurrent calls that can be made.
Defaults to Infinity, which means no limit.
The maximum number of retries that can be made for a single call, with an exponential backoff between each attempt. Defaults to 6.
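To make the exponential backoff concrete, here is a minimal sketch that computes the wait before each retry. The base delay, growth factor, and cap are illustrative assumptions; the source only states the default of 6 retries, not the actual delays used.

```typescript
// Delay (in ms) before each retry: baseMs * factor^attempt, capped at maxMs.
// baseMs, factor, and maxMs are assumed values, not documented defaults.
function backoffDelays(
  maxRetries: number,
  baseMs = 1000,
  factor = 2,
  maxMs = 60000
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(Math.min(baseMs * Math.pow(factor, attempt), maxMs));
  }
  return delays;
}

// With the documented default of 6 retries and the assumed parameters above:
const schedule = backoffDelays(6);
// schedule is [1000, 2000, 4000, 8000, 16000, 32000]
```

Real retry helpers often add random jitter to each delay so that many clients do not retry at the same instant; that is omitted here for clarity.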
Model name to use. Available options are: qwen-turbo, qwen-plus, qwen-max, or other compatible models.
Alias for model
Custom handler to handle failed attempts. Takes the originally thrown error object as input, and should itself throw an error if the input error is not retryable.
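A handler of this kind might look like the sketch below. It rethrows errors that retrying cannot fix and returns normally otherwise. The `HttpLikeError` shape and the list of non-retryable status codes are assumptions made for illustration.

```typescript
// Hypothetical error shape: an Error that may carry an HTTP status code.
interface HttpLikeError extends Error {
  status?: number;
}

// Status codes treated as non-retryable here; this list is an assumption.
const NON_RETRYABLE_STATUSES = new Set<number>([400, 401, 403, 404]);

// Rethrows the original error when retrying cannot help; returns normally
// otherwise so the caller's retry loop can schedule another attempt.
function onFailedAttempt(error: HttpLikeError): void {
  if (error.status !== undefined && NON_RETRYABLE_STATUSES.has(error.status)) {
    throw error;
  }
}
```

Errors without a status code, such as network failures, pass through this handler and remain eligible for retry.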
Penalizes repeated tokens according to frequency. Ranges from 1.0 to 2.0. Defaults to 1.0.
Stop tokens to use for this call. If not provided, the default stop tokens for the model will be used.
Whether to stream the results or not. Defaults to false.
Amount of randomness injected into the response. Ranges from 0 (exclusive) to 1 (inclusive). Use a temperature closer to 0 for analytical or multiple-choice tasks, and closer to 1 for creative and generative tasks. Defaults to 0.95.
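The effect of temperature can be sketched with the common softmax-with-temperature formulation. This shows the general technique only; the source does not say how the model backend applies the value, and `softmaxWithTemperature` is a hypothetical helper.

```typescript
// Softmax over raw scores (logits) after dividing each by the temperature.
// Temperature must be greater than 0, matching the documented range.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  if (temperature <= 0) {
    throw new Error("temperature must be greater than 0");
  }
  const scaled = logits.map((x) => x / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}
```

Lower temperatures concentrate probability on the highest-scoring token, which suits analytical tasks; values near 1 leave the distribution closer to the model's raw scores.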
Total probability mass of tokens to consider at each step. Ranges from 0 to 1.0. Defaults to 0.8.
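The idea of limiting sampling to a probability mass is usually called nucleus (top-p) sampling. Here is a minimal sketch of the filtering step, assuming the general technique rather than this model's exact implementation; `topPFilter` is a hypothetical helper.

```typescript
// Keep the most probable tokens until their cumulative mass reaches topP,
// then renormalize. Returns [tokenIndex, probability] pairs, most probable first.
function topPFilter(probs: number[], topP: number): Array<[number, number]> {
  const ranked = probs
    .map((p, i): [number, number] => [i, p])
    .sort((a, b) => b[1] - a[1]);
  const kept: Array<[number, number]> = [];
  let mass = 0;
  for (const entry of ranked) {
    kept.push(entry);
    mass += entry[1];
    if (mass >= topP) break;
  }
  return kept.map(([i, p]): [number, number] => [i, p / mass]);
}

// Example: with topP = 0.7, only the two most probable tokens survive.
const nucleus = topPFilter([0.5, 0.3, 0.15, 0.05], 0.7);
```

A smaller value restricts sampling to fewer, more likely tokens; a value of 1.0 keeps the whole distribution.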
Whether to print out response text.
Note that modelPath is the only required parameter. For testing, you
can set it through the LLAMA_PATH environment variable.
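One way to combine an explicit path with the environment variable is sketched below. `resolveModelPath` is a hypothetical helper, not part of the library; it takes the environment as an argument so it can be exercised without touching the real process environment.

```typescript
// Resolve the model path from an explicit value or the LLAMA_PATH variable.
// `resolveModelPath` is a hypothetical helper, not a library function.
function resolveModelPath(
  env: Record<string, string | undefined>,
  explicitPath?: string
): string {
  // An explicitly passed path takes precedence over the environment.
  const modelPath = explicitPath ?? env["LLAMA_PATH"];
  if (!modelPath) {
    throw new Error("modelPath is required: pass it explicitly or set LLAMA_PATH");
  }
  return modelPath;
}

// In Node.js this would typically be called as resolveModelPath(process.env).
```

The resolved string can then be supplied as the modelPath parameter.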