Yuan2(**kwargs: Any)

Bases: ``LLM``
:param infer_api: URL of the Yuan2.0 inference API.
:param max_tokens: Token context window.
:param temp: The temperature to use for sampling.
:param top_p: The top-p value to use for sampling.
:param top_k: The top-k value to use for sampling.
:param do_sample: Whether to use the sampling method during text generation.
:param echo: Whether to echo the prompt.
:param stop: A list of strings that stop generation when encountered.
:param repeat_last_n: Number of most recent tokens the repetition penalty applies to.
:param repeat_penalty: The penalty to apply to repeated tokens.
:param streaming: Whether to stream the results or not.
:param history: History of the conversation.
:param use_history: Whether to use the conversation history or not.
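For intuition only: a minimal sketch of how a repetition penalty over the last *n* tokens is commonly applied to logits (CTRL-style). This illustrates the idea behind ``repeat_penalty`` and ``repeat_last_n``; it is an assumption about the general technique, not Yuan2.0's actual server-side implementation.

```python
def apply_repeat_penalty(logits, recent_tokens, penalty):
    """Penalize recently generated tokens: divide positive logits by
    `penalty`, multiply negative logits by it (illustrative sketch only,
    not the Yuan2.0 server's implementation)."""
    out = dict(logits)
    for tok in set(recent_tokens):
        if tok in out:
            out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

# Tokens "the" and "sat" were generated within the last `repeat_last_n`
# tokens, so their logits are pushed down before sampling.
logits = {"the": 2.0, "cat": 1.0, "sat": -0.5}
penalized = apply_repeat_penalty(logits, recent_tokens=["the", "sat"], penalty=1.2)
```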
Yuan2.0 language models.
Example:
    .. code-block:: python

        from langchain_community.llms import Yuan2

        yuan_llm = Yuan2(
            infer_api="http://127.0.0.1:8000/yuan",
            max_tokens=1024,
            temp=1.0,
            top_p=0.9,
            top_k=40,
        )
        print(yuan_llm)
        print(yuan_llm.invoke("你是谁?"))  # "Who are you?"
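For intuition on how the ``temp``, ``top_k``, and ``top_p`` fields interact, here is a minimal, self-contained sampling sketch (temperature scaling, then top-k truncation, then nucleus filtering). This is an illustration of the standard technique, not the Yuan2.0 server's actual decoding code.

```python
import math
import random

def sample_next(logits, temp=1.0, top_k=40, top_p=0.9, rng=random):
    """Sample one token id/string from a dict of logits.

    Illustrative only: temperature-scale, softmax, keep the top_k most
    likely tokens, then keep the smallest prefix whose cumulative
    probability reaches top_p, renormalize, and sample.
    """
    # Temperature scaling followed by a numerically stable softmax.
    scaled = {t: l / temp for t, l in logits.items()}
    m = max(scaled.values())
    exps = {t: math.exp(l - m) for t, l in scaled.items()}
    z = sum(exps.values())
    probs = sorted(((t, e / z) for t, e in exps.items()), key=lambda x: -x[1])
    # Top-k: keep only the k most likely tokens.
    probs = probs[:top_k]
    # Top-p (nucleus): smallest prefix with cumulative mass >= top_p.
    kept, cum = [], 0.0
    for t, p in probs:
        kept.append((t, p))
        cum += p
        if cum >= top_p:
            break
    # Renormalize the surviving tokens and draw one.
    z = sum(p for _, p in kept)
    r, acc = rng.random() * z, 0.0
    for t, p in kept:
        acc += p
        if acc >= r:
            return t
    return kept[-1][0]
```

With ``top_k=1`` (or a very small ``top_p``) the choice collapses to greedy decoding, which is why low values of either field make output more deterministic.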