Xinference embedding models.
To use, you should have the xinference library installed:
.. code-block:: bash
pip install xinference
If you're simply using the services provided by Xinference, you can utilize the xinference_client package:
.. code-block:: bash
pip install xinference_client
Check out: https://github.com/xorbitsai/inference

To run, you need to start a Xinference supervisor on one server and Xinference workers on the other servers.
Example:
To start a local instance of Xinference, run:
.. code-block:: bash
$ xinference
You can also deploy Xinference in a distributed cluster. Here are the steps:
Starting the supervisor:
.. code-block:: bash
$ xinference-supervisor
Starting the worker:
.. code-block:: bash
$ xinference-worker
Then, launch a model using the command line interface (CLI).
Example:
.. code-block:: bash
$ xinference launch -n orca -s 3 -q q4_0
It will return a model UID. Then you can use Xinference embeddings with LangChain.
Example:
.. code-block:: python
from langchain_community.embeddings import XinferenceEmbeddings
xinference = XinferenceEmbeddings(
    server_url="http://0.0.0.0:9997",
    model_uid={model_uid},  # replace {model_uid} with the model UID returned from launching the model
)
Attributes:
    server_url: URL of the Xinference server.
    model_uid: UID of the launched model.
Embed a list of documents using Xinference.

Args:
    texts: The list of texts to embed.

Returns:
    List of embeddings, one for each text.
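For illustration, here is a minimal sketch of calling ``embed_documents`` on the ``xinference`` instance configured above; the sample texts are hypothetical:

.. code-block:: python

    # hypothetical sample texts, for illustration only
    texts = [
        "Xinference serves open-source models.",
        "LangChain integrates with Xinference embeddings.",
    ]
    vectors = xinference.embed_documents(texts)
    # embed_documents returns one embedding (a list of floats) per input text
    assert len(vectors) == len(texts)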
Embed a query using Xinference.

Args:
    text: The text to embed.

Returns:
    Embeddings for the text.
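And a matching sketch for ``embed_query``, which embeds a single string; the query text is hypothetical:

.. code-block:: python

    # returns a single embedding (a list of floats) for the query string
    query_vector = xinference.embed_query("What models does Xinference support?")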