HuggingFaceEndpoint()

``typical_p``: Typical Decoding mass. See "Typical Decoding for Natural Language Generation" for more information.

``repetition_penalty``: The parameter for repetition penalty. 1.0 means no penalty. See this paper for more details.

``watermark``: Watermarking with `A Watermark for Large Language Models <https://arxiv.org/abs/2301.10226>`_.
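
To make "1.0 means no penalty" concrete, here is a minimal sketch of one common way a repetition penalty is applied to raw token scores. It assumes a multiplicative scheme in which positive scores are divided by the penalty and negative scores are multiplied by it. The helper ``apply_repetition_penalty`` is hypothetical and is not part of this class; the server behind the endpoint may implement the penalty differently.

```python
def apply_repetition_penalty(logits, generated_ids, penalty):
    """Return a copy of `logits` with already-generated tokens penalized.

    Hypothetical helper for illustration only; it is not part of
    HuggingFaceEndpoint, and the real server-side logic may differ.
    """
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # With penalty > 1.0, dividing a positive score or multiplying a
        # negative score both make the token less likely to be picked again.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


logits = [2.0, -1.0, 0.5]
unchanged = apply_repetition_penalty(logits, generated_ids=[0, 1], penalty=1.0)
penalized = apply_repetition_penalty(logits, generated_ids=[0, 1], penalty=2.0)
```

With a penalty of 1.0 the scores come back unchanged. Larger values lower the scores of tokens that have already been generated, while tokens that have not appeared yet (index 2 here) are left alone.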
HuggingFace Endpoint.

To use this class, install the ``huggingface_hub`` package and provide your
API token, either by setting the ``HUGGINGFACEHUB_API_TOKEN`` environment
variable or by passing it to the constructor as the ``huggingfacehub_api_token``
named parameter.
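
The two ways of supplying the token can be combined into one early check that runs before the endpoint is constructed. The sketch below is illustrative only: ``resolve_api_token`` is a hypothetical helper, not part of this class, and it assumes that an explicitly passed token should take precedence over the environment variable.

```python
import os

ENV_VAR = "HUGGINGFACEHUB_API_TOKEN"


def resolve_api_token(explicit_token=None):
    """Return the API token to use, or raise if none is configured.

    Hypothetical helper: assumes an explicit token wins over the
    environment variable.
    """
    token = explicit_token or os.environ.get(ENV_VAR)
    if not token:
        raise ValueError(
            f"No API token found: pass one explicitly or set {ENV_VAR}."
        )
    return token
```

The result could then be passed on, for example as ``huggingfacehub_api_token=resolve_api_token()``, so that a missing token fails with a clear message before any request is sent.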
Example:

.. code-block:: python

    # Import path is an assumption; adjust it to the package that provides the class.
    from langchain_huggingface import HuggingFaceEndpoint

    # Basic example (no streaming)
    llm = HuggingFaceEndpoint(
        endpoint_url="http://localhost:8010/",
        max_new_tokens=512,
        top_k=10,
        top_p=0.95,
        typical_p=0.95,
        temperature=0.01,
        repetition_penalty=1.03,
        huggingfacehub_api_token="my-api-key",
    )
    print(llm.invoke("What is Deep Learning?"))

    # Streaming example: the callback writes tokens to stdout as they arrive
    from langchain_core.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

    callbacks = [StreamingStdOutCallbackHandler()]
    llm = HuggingFaceEndpoint(
        endpoint_url="http://localhost:8010/",
        max_new_tokens=512,
        top_k=10,
        top_p=0.95,
        typical_p=0.95,
        temperature=0.01,
        repetition_penalty=1.03,
        callbacks=callbacks,
        streaming=True,
        huggingfacehub_api_token="my-api-key",
    )
    print(llm.invoke("What is Deep Learning?"))