async_client_kwargs: Additional kwargs to merge with client_kwargs before passing to the httpx AsyncClient.
These are kwargs unique to the async client; for shared args use client_kwargs.
For a full list of the params, see the httpx documentation.
sync_client_kwargs: Additional kwargs to merge with client_kwargs before passing to the httpx Client.
These are kwargs unique to the sync client; for shared args use client_kwargs.
For a full list of the params, see the httpx documentation.
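As an illustration, a minimal sketch of wiring these through (the timeout and header values are hypothetical, not library defaults):

from langchain_ollama import OllamaEmbeddings

embed = OllamaEmbeddings(
    model="llama3",
    # Shared by both the sync and async httpx clients.
    client_kwargs={"timeout": 30},
    # Merged into client_kwargs for the async client only (illustrative header).
    async_client_kwargs={"headers": {"x-request-source": "docs-example"}},
)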
Ollama embedding model integration.
Set up a local Ollama instance:
Install the Ollama package and set up a local Ollama instance using the instructions at https://github.com/ollama/ollama.
You will need to choose a model to serve.
You can view a list of available models via the model library.
To fetch a model from the Ollama model library, use ollama pull <name-of-model>.
For example, to pull the llama3 model:
ollama pull llama3
This will download the default tagged version of the model. Typically, the default points to the latest, smallest-sized parameter model.
On Mac, models are downloaded to ~/.ollama/models; on Linux (or WSL), they are stored at /usr/share/ollama/.ollama/models.
You can specify the exact version of the model of interest, e.g. ollama pull vicuna:13b-v1.5-16k-q4_0.
To view pulled models:
ollama list
To start serving:
ollama serve
View the Ollama documentation for more commands, or list them from the terminal:
ollama help
Install the langchain-ollama integration package:
pip install -U langchain-ollama
Key init args — completion params:
    model: str
        Name of Ollama model to use.
    base_url: str | None
        Base url the model is hosted under.
See full list of supported init args and their descriptions in the params section.
Instantiate:
from langchain_ollama import OllamaEmbeddings
embed = OllamaEmbeddings(model="llama3")
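The Ollama server listens on http://localhost:11434 by default; if yours runs elsewhere, pass base_url (the host below is hypothetical):

from langchain_ollama import OllamaEmbeddings

# Connect to an Ollama server on a non-default host (hypothetical URL).
embed = OllamaEmbeddings(
    model="llama3",
    base_url="http://my-ollama-host:11434",
)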
Embed single text:
input_text = "The meaning of life is 42"
vector = embed.embed_query(input_text)
print(vector[:3])
[-0.024603435769677162, -0.007543657906353474, 0.0039630369283258915]
Embed multiple texts:
input_texts = ["Document 1...", "Document 2..."]
vectors = embed.embed_documents(input_texts)
print(len(vectors))
# The first 3 coordinates for the first vector
print(vectors[0][:3])
2
[-0.024603435769677162, -0.007543657906353474, 0.0039630369283258915]
Async:
vector = await embed.aembed_query(input_text)
print(vector[:3])
# multiple:
# await embed.aembed_documents(input_texts)
[-0.009100092574954033, 0.005071679595857859, -0.0029193938244134188]
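Since both async methods are coroutines, they compose with standard asyncio tooling. A sketch, using only the methods shown above, that embeds a query and a document batch concurrently (the asyncio.gather wiring is illustrative):

import asyncio

from langchain_ollama import OllamaEmbeddings

embed = OllamaEmbeddings(model="llama3")

async def main() -> None:
    # Run both embedding calls concurrently on the async client.
    query_vector, doc_vectors = await asyncio.gather(
        embed.aembed_query("The meaning of life is 42"),
        embed.aembed_documents(["Document 1...", "Document 2..."]),
    )
    print(len(query_vector), len(doc_vectors))

asyncio.run(main())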