Xinference large-scale model inference service.
To use, you should have the xinference library installed:
.. code-block:: bash
pip install "xinference[all]"
If you only need to connect to an existing Xinference service as a client, the lighter xinference_client package is sufficient:
.. code-block:: bash
pip install xinference_client
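With a service already running, the client can also launch a model and return its UID programmatically. The following is a minimal sketch assuming the ``RESTfulClient`` class shipped with xinference_client; exact launch parameters may vary between Xinference versions:
.. code-block:: python
from xinference_client import RESTfulClient

# Connect to a running Xinference service
client = RESTfulClient("http://0.0.0.0:9997")

# Launch a built-in model; the returned value is the model UID
model_uid = client.launch_model(
    model_name="orca",
    model_size_in_billions=3,
    quantization="q4_0",
)
print(model_uid)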
Check out: https://github.com/xorbitsai/inference. For a distributed deployment, you need to start the Xinference supervisor on one server and Xinference workers on the other servers.
Example:
To start a local instance of Xinference, run
.. code-block:: bash
$ xinference
You can also deploy Xinference in a distributed cluster. Here are the steps:
Starting the supervisor:
.. code-block:: bash
$ xinference-supervisor
Starting the worker:
.. code-block:: bash
$ xinference-worker
Then, launch a model using the command-line interface (CLI).
Example:
.. code-block:: bash
$ xinference launch -n orca -s 3 -q q4_0
It will return a model UID. Then, you can use Xinference with LangChain.
Example:
.. code-block:: python
from langchain_community.llms import Xinference
llm = Xinference(
    server_url="http://0.0.0.0:9997",
    model_uid={model_uid},  # replace {model_uid} with the model UID returned when launching the model
)
llm.invoke(
    "Q: where can we visit in the capital of France? A:",
    generate_config={"max_tokens": 1024, "stream": True},
)
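Note that ``invoke`` returns the completion as a single string. To consume output incrementally, you can iterate over the standard ``stream`` method instead; a minimal sketch (the server URL and model UID are placeholders, and the output may arrive in one or more chunks depending on the Xinference version):
.. code-block:: python
from langchain_community.llms import Xinference

llm = Xinference(
    server_url="http://0.0.0.0:9997",
    model_uid={model_uid},  # replace {model_uid} with the model UID returned when launching the model
)

# Iterate over the streamed output and print each chunk as it arrives
for chunk in llm.stream("Q: where can we visit in the capital of France? A:"):
    print(chunk, end="", flush=True)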
Example:
.. code-block:: python
from langchain_community.llms import Xinference
from langchain_classic.prompts import PromptTemplate
llm = Xinference(
    server_url="http://0.0.0.0:9997",
    model_uid={model_uid},  # replace {model_uid} with the model UID returned when launching the model
    stream=True,
)
prompt = PromptTemplate(
    input_variables=["country"],
    template="Q: where can we visit in the capital of {country}? A:",
)
chain = prompt | llm
# stream() returns a generator; iterate to consume the output chunks
for chunk in chain.stream({"country": "France"}):
    print(chunk, end="", flush=True)
To view all the supported built-in models, run:
.. code-block:: bash
$ xinference list --all