`langchain-weaviate`¶

Reference docs

This page contains reference documentation for Weaviate. See the docs for conceptual guides, tutorials, and examples on using Weaviate modules.

langchain_weaviate ¶

WeaviateVectorStore ¶

Bases: VectorStore

Weaviate vector store.

To use this class, install the `weaviate-client` Python package.

Example

.. code-block:: python

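import weaviate
from langchain_weaviate import WeaviateVectorStore

# Assumes a Weaviate instance running locally; for remote deployments,
# use one of the other weaviate.connect_to_* helpers instead.
client = weaviate.connect_to_local()
vector_store = WeaviateVectorStore(client, index_name, text_key)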

METHOD	DESCRIPTION
`get_by_ids`	Get documents by their IDs.
`aget_by_ids`	Async get documents by their IDs.
`adelete`	Async delete by vector ID or other criteria.
`aadd_texts`	Async run more texts through the embeddings and add to the `VectorStore`.
`add_documents`	Add or update documents in the `VectorStore`.
`aadd_documents`	Async run more documents through the embeddings and add to the `VectorStore`.
`search`	Return docs most similar to query using a specified search type.
`asearch`	Async return docs most similar to query using a specified search type.
`asimilarity_search_with_score`	Async run similarity search with distance.
`similarity_search_with_relevance_scores`	Return docs and relevance scores in the range `[0, 1]`.
`asimilarity_search_with_relevance_scores`	Async return docs and relevance scores in the range `[0, 1]`.
`asimilarity_search`	Async return docs most similar to query.
`similarity_search_by_vector`	Return docs most similar to embedding vector.
`asimilarity_search_by_vector`	Async return docs most similar to embedding vector.
`amax_marginal_relevance_search`	Async return docs selected using the maximal marginal relevance.
`amax_marginal_relevance_search_by_vector`	Async return docs selected using the maximal marginal relevance.
`from_documents`	Return `VectorStore` initialized from documents and embeddings.
`afrom_documents`	Async return `VectorStore` initialized from documents and embeddings.
`afrom_texts`	Async return `VectorStore` initialized from texts and embeddings.
`as_retriever`	Return `VectorStoreRetriever` initialized from this `VectorStore`.
`__init__`	Initialize with Weaviate client.
`add_texts`	Upload texts with metadata (properties) to Weaviate.
`similarity_search`	Return docs most similar to query.
`max_marginal_relevance_search`	Return docs selected using the maximal marginal relevance.
`max_marginal_relevance_search_by_vector`	Return docs selected using the maximal marginal relevance.
`similarity_search_with_score`	Return documents most similar to the query text, with the cosine distance for each.
`from_texts`	Construct Weaviate wrapper from raw documents.
`delete`	Delete by vector IDs.

embeddings `property` ¶

embeddings: Embeddings | None

Access the query embedding object if available.

get_by_ids ¶

get_by_ids(ids: Sequence[str]) -> list[Document]

Get documents by their IDs.

The returned documents are expected to have the ID field set to the ID of the document in the vector store.

Fewer documents may be returned than requested if some IDs are not found or if there are duplicated IDs.

Users should not assume that the order of the returned documents matches the order of the input IDs. Instead, users should rely on the ID field of the returned documents.

This method should NOT raise exceptions if no documents are found for some IDs.

PARAMETER	DESCRIPTION
`ids`	List of IDs to retrieve. TYPE: `Sequence[str]`

RETURNS	DESCRIPTION
`list[Document]`	List of `Document` objects.
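Because result order is not guaranteed and unknown IDs are skipped without an error, callers often re-key the results by ID. The following is a minimal sketch of that pattern. `FakeStore` and `fetch_in_input_order` are hypothetical names used only for illustration, and the sketch assumes the ID field is exposed as an `id` attribute on each returned document.

```python
from types import SimpleNamespace


def fetch_in_input_order(store, ids):
    """Return one entry per requested ID, in input order; None where missing."""
    found = {doc.id: doc for doc in store.get_by_ids(ids)}
    return [found.get(doc_id) for doc_id in ids]


class FakeStore:
    """Hypothetical in-memory stand-in that mimics the documented contract."""

    def __init__(self, docs):
        self._docs = {doc.id: doc for doc in docs}

    def get_by_ids(self, ids):
        # Deduplicate, return hits in reversed order, and skip unknown IDs
        # silently, as the contract above allows.
        unique_ids = list(dict.fromkeys(ids))
        return [self._docs[i] for i in reversed(unique_ids) if i in self._docs]


store = FakeStore(
    [
        SimpleNamespace(id="a", page_content="alpha"),
        SimpleNamespace(id="b", page_content="beta"),
    ]
)
ordered = fetch_in_input_order(store, ["a", "missing", "b"])
contents = [doc.page_content if doc else None for doc in ordered]
print(contents)
```

Keying on the returned ID rather than on position keeps the caller correct even when the store drops duplicates or missing entries.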

aget_by_ids `async` ¶

aget_by_ids(ids: Sequence[str]) -> list[Document]

Async get documents by their IDs.

The returned documents are expected to have the ID field set to the ID of the document in the vector store.

Fewer documents may be returned than requested if some IDs are not found or if there are duplicated IDs.

Users should not assume that the order of the returned documents matches the order of the input IDs. Instead, users should rely on the ID field of the returned documents.

This method should NOT raise exceptions if no documents are found for some IDs.

PARAMETER	DESCRIPTION
`ids`	List of IDs to retrieve. TYPE: `Sequence[str]`

RETURNS	DESCRIPTION
`list[Document]`	List of `Document` objects.
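The async variant can be awaited concurrently for several ID batches. The sketch below illustrates this with `asyncio.gather`. `FakeAsyncStore` and `fetch_batches` are hypothetical stand-ins, not part of the library, and the `id` attribute on documents is an assumption.

```python
import asyncio
from types import SimpleNamespace


class FakeAsyncStore:
    """Hypothetical stand-in exposing only an aget_by_ids coroutine."""

    def __init__(self, docs):
        self._docs = {doc.id: doc for doc in docs}

    async def aget_by_ids(self, ids):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return [self._docs[i] for i in ids if i in self._docs]


async def fetch_batches(store, batches):
    """Fetch several ID batches concurrently and merge the hits by ID."""
    results = await asyncio.gather(*(store.aget_by_ids(b) for b in batches))
    return {doc.id: doc for batch in results for doc in batch}


store = FakeAsyncStore(
    [SimpleNamespace(id=str(n), page_content=f"doc {n}") for n in range(4)]
)
merged = asyncio.run(fetch_batches(store, [["0", "1"], ["2", "9"]]))
print(sorted(merged))
```

The unknown ID `"9"` is absent from the merged mapping rather than raising, which matches the behavior described above.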

adelete `async` ¶

adelete(ids: list[str] | None = None, **kwargs: Any) -> bool | None

Async delete by vector ID or other criteria.

PARAMETER	DESCRIPTION
`ids`	List of IDs to delete. If `None`, delete all. TYPE: `list[str] \| None` DEFAULT: `None`
`**kwargs`	Other keyword arguments that subclasses might use. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`bool \| None`	`True` if deletion is successful, `False` otherwise, `None` if not implemented.
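Since the return value has three possible states, callers may want to handle `None` separately from `False`. A minimal sketch of that handling follows. `FakeStore` and `delete_and_report` are hypothetical names, and the messages are illustrative only.

```python
import asyncio


async def delete_and_report(store, ids):
    """Call adelete and translate its tri-state result into a message."""
    result = await store.adelete(ids=ids)
    if result is None:
        # None carries no success/failure information.
        return "status not reported"
    return "deleted" if result else "deletion failed"


class FakeStore:
    """Hypothetical stand-in whose adelete returns a preset value."""

    def __init__(self, outcome):
        self._outcome = outcome

    async def adelete(self, ids=None, **kwargs):
        return self._outcome


messages = [
    asyncio.run(delete_and_report(FakeStore(outcome), ["a"]))
    for outcome in (True, False, None)
]
print(messages)
```

Testing with `is None` before the truthiness check avoids treating an unreported status as a failed deletion.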

aadd_texts `async` ¶

aadd_texts(
    texts: Iterable[str],
    metadatas: list[dict] | None = None,
    *,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> list[str]

Async run more texts through the embeddings and add to the VectorStore.

PARAMETER	DESCRIPTION
`texts`	Iterable of strings to add to the `VectorStore`. TYPE: `Iterable[str]`
`metadatas`	Optional list of metadatas associated with the texts. TYPE: `list[dict] \| None` DEFAULT: `None`
`ids`	Optional list of IDs associated with the texts. TYPE: `list[str] \| None` DEFAULT: `None`
`**kwargs`	`VectorStore` specific parameters. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`list[str]`	List of IDs from adding the texts into the `VectorStore`.

RAISES	DESCRIPTION
`ValueError`	If the number of metadatas does not match the number of texts.
`ValueError`	If the number of IDs does not match the number of texts.
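The two `ValueError` conditions can be checked before calling the method, so that a mismatch surfaces early. The following is a sketch of such a pre-check under the assumption that lengths must match exactly. `validate_batch` is a hypothetical helper, not a library function, and the error messages are invented for illustration.

```python
def validate_batch(texts, metadatas=None, ids=None):
    """Return texts as a list, raising ValueError on any length mismatch."""
    texts = list(texts)  # materialize so that an iterator can be measured
    if metadatas is not None and len(metadatas) != len(texts):
        raise ValueError(
            f"Got {len(metadatas)} metadatas for {len(texts)} texts."
        )
    if ids is not None and len(ids) != len(texts):
        raise ValueError(f"Got {len(ids)} ids for {len(texts)} texts.")
    return texts


checked = validate_batch(iter(["a", "b"]), ids=["1", "2"])

try:
    validate_batch(["a", "b"], metadatas=[{"source": "x"}])
except ValueError as exc:
    error_message = str(exc)

print(checked, error_message)
```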

add_documents ¶

add_documents(documents: list[Document], **kwargs: Any) -> list[str]

Add or update documents in the VectorStore.

PARAMETER	DESCRIPTION
`documents`	Documents to add to the `VectorStore`. TYPE: `list[Document]`
`**kwargs`	Additional keyword arguments. If kwargs contains IDs and documents contain ids, the IDs in the kwargs will receive precedence. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`list[str]`	List of IDs of the added texts.
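The precedence rule for IDs described in the `**kwargs` row can be sketched as follows. This is an illustration of the documented rule, not the library's implementation. `resolve_ids` is a hypothetical helper, and the `id` attribute on documents is an assumption.

```python
from types import SimpleNamespace


def resolve_ids(documents, **kwargs):
    """Pick the IDs to store: explicit kwargs win over IDs on the documents."""
    if "ids" in kwargs:
        return list(kwargs["ids"])
    return [doc.id for doc in documents]


docs = [
    SimpleNamespace(id="doc-1", page_content="first"),
    SimpleNamespace(id="doc-2", page_content="second"),
]
from_documents = resolve_ids(docs)
from_kwargs = resolve_ids(docs, ids=["x", "y"])
print(from_documents, from_kwargs)
```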

aadd_documents `async` ¶

aadd_documents(documents: list[Document], **kwargs: Any) -> list[str]

Async run more documents through the embeddings and add to the VectorStore.

PARAMETER	DESCRIPTION
`documents`	Documents to add to the `VectorStore`. TYPE: `list[Document]`
`**kwargs`	Additional keyword arguments. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`list[str]`	List of IDs of the added texts.

search ¶

search(query: str, search_type: str, **kwargs: Any) -> list[Document]

Return docs most similar to query using a specified search type.

PARAMETER	DESCRIPTION
`query`	Input text. TYPE: `str`
`search_type`	Type of search to perform. Can be `'similarity'`, `'mmr'`, or `'similarity_score_threshold'`. TYPE: `str`
`**kwargs`	Arguments to pass to the search method. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`list[Document]`	List of `Document` objects most similar to the query.

RAISES	DESCRIPTION
`ValueError`	If `search_type` is not one of `'similarity'`, `'mmr'`, or `'similarity_score_threshold'`.
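The relationship between `search_type` and the underlying search methods can be pictured as a simple dispatch. The sketch below is an assumption about how the three values map to the methods listed on this page, not a copy of the library's code. `dispatch_search` and `RecordingStore` are hypothetical names.

```python
def dispatch_search(store, query, search_type, **kwargs):
    """Route a query to a search method based on search_type."""
    if search_type == "similarity":
        return store.similarity_search(query, **kwargs)
    if search_type == "similarity_score_threshold":
        pairs = store.similarity_search_with_relevance_scores(query, **kwargs)
        return [doc for doc, _score in pairs]  # drop the scores
    if search_type == "mmr":
        return store.max_marginal_relevance_search(query, **kwargs)
    raise ValueError(
        f"search_type {search_type!r} is not one of 'similarity', 'mmr', "
        "or 'similarity_score_threshold'."
    )


class RecordingStore:
    """Hypothetical stand-in that records which method was called."""

    def __init__(self):
        self.calls = []

    def similarity_search(self, query, **kwargs):
        self.calls.append("similarity_search")
        return [f"doc for {query}"]

    def similarity_search_with_relevance_scores(self, query, **kwargs):
        self.calls.append("similarity_search_with_relevance_scores")
        return [(f"doc for {query}", 0.9)]

    def max_marginal_relevance_search(self, query, **kwargs):
        self.calls.append("max_marginal_relevance_search")
        return [f"doc for {query}"]


store = RecordingStore()
for kind in ("similarity", "similarity_score_threshold", "mmr"):
    dispatch_search(store, "weaviate", kind)
print(store.calls)
```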

asearch `async` ¶

asearch(query: str, search_type: str, **kwargs: Any) -> list[Document]

Async return docs most similar to query using a specified search type.

PARAMETER	DESCRIPTION
`query`	Input text. TYPE: `str`
`search_type`	Type of search to perform. Can be `'similarity'`, `'mmr'`, or `'similarity_score_threshold'`. TYPE: `str`
`**kwargs`	Arguments to pass to the search method. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`list[Document]`	List of `Document` objects most similar to the query.

RAISES	DESCRIPTION
`ValueError`	If `search_type` is not one of `'similarity'`, `'mmr'`, or `'similarity_score_threshold'`.

asimilarity_search_with_score `async` ¶

asimilarity_search_with_score(
    *args: Any, **kwargs: Any
) -> list[tuple[Document, float]]

Async run similarity search with distance.

PARAMETER	DESCRIPTION
`*args`	Arguments to pass to the search method. TYPE: `Any` DEFAULT: `()`
`**kwargs`	Arguments to pass to the search method. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`list[tuple[Document, float]]`	List of tuples of `(doc, similarity_score)`.

similarity_search_with_relevance_scores ¶

similarity_search_with_relevance_scores(
    query: str, k: int = 4, **kwargs: Any
) -> list[tuple[Document, float]]

Return docs and relevance scores in the range [0, 1].

0 is dissimilar, 1 is most similar.

PARAMETER	DESCRIPTION
`query`	Input text. TYPE: `str`
`k`	Number of `Document` objects to return. TYPE: `int` DEFAULT: `4`
`**kwargs`	Kwargs to be passed to similarity search. May include `score_threshold`, an optional floating point value between `0` and `1` used to filter the resulting set of retrieved docs. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`list[tuple[Document, float]]`	List of tuples of `(doc, similarity_score)`.
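The effect of `score_threshold` can be illustrated on a plain list of `(doc, score)` pairs. This is a sketch that assumes pairs scoring at or above the threshold are kept. `filter_by_threshold` is a hypothetical helper, and the strings stand in for `Document` objects.

```python
def filter_by_threshold(docs_and_scores, score_threshold=None):
    """Keep (doc, score) pairs whose relevance score meets the threshold."""
    if score_threshold is None:
        return list(docs_and_scores)
    return [
        (doc, score)
        for doc, score in docs_and_scores
        if score >= score_threshold
    ]


pairs = [("a", 0.91), ("b", 0.55), ("c", 0.80)]
kept = filter_by_threshold(pairs, score_threshold=0.8)
print(kept)
```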

asimilarity_search_with_relevance_scores `async` ¶

asimilarity_search_with_relevance_scores(
    query: str, k: int = 4, **kwargs: Any
) -> list[tuple[Document, float]]

Async return docs and relevance scores in the range [0, 1].

0 is dissimilar, 1 is most similar.

PARAMETER	DESCRIPTION
`query`	Input text. TYPE: `str`
`k`	Number of `Document` objects to return. TYPE: `int` DEFAULT: `4`
`**kwargs`	Kwargs to be passed to similarity search. May include `score_threshold`, an optional floating point value between `0` and `1` used to filter the resulting set of retrieved docs. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`list[tuple[Document, float]]`	List of tuples of `(doc, similarity_score)`.

asimilarity_search `async` ¶

asimilarity_search(query: str, k: int = 4, **kwargs: Any) -> list[Document]

Async return docs most similar to query.

PARAMETER	DESCRIPTION
`query`	Input text. TYPE: `str`
`k`	Number of `Document` objects to return. TYPE: `int` DEFAULT: `4`
`**kwargs`	Arguments to pass to the search method. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`list[Document]`	List of `Document` objects most similar to the query.

similarity_search_by_vector ¶

similarity_search_by_vector(
    embedding: list[float], k: int = 4, **kwargs: Any
) -> list[Document]

Return docs most similar to embedding vector.

PARAMETER	DESCRIPTION
`embedding`	Embedding to look up documents similar to. TYPE: `list[float]`
`k`	Number of `Document` objects to return. TYPE: `int` DEFAULT: `4`
`**kwargs`	Arguments to pass to the search method. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`list[Document]`	List of `Document` objects most similar to the query vector.

asimilarity_search_by_vector `async` ¶

asimilarity_search_by_vector(
    embedding: list[float], k: int = 4, **kwargs: Any
) -> list[Document]

Async return docs most similar to embedding vector.

PARAMETER	DESCRIPTION
`embedding`	Embedding to look up documents similar to. TYPE: `list[float]`
`k`	Number of `Document` objects to return. TYPE: `int` DEFAULT: `4`
`**kwargs`	Arguments to pass to the search method. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`list[Document]`	List of `Document` objects most similar to the query vector.

amax_marginal_relevance_search `async` ¶

amax_marginal_relevance_search(
    query: str, k: int = 4, fetch_k: int = 20, lambda_mult: float = 0.5, **kwargs: Any
) -> list[Document]

Async return docs selected using the maximal marginal relevance.

Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.

PARAMETER	DESCRIPTION
`query`	Text to look up documents similar to. TYPE: `str`
`k`	Number of `Document` objects to return. TYPE: `int` DEFAULT: `4`
`fetch_k`	Number of `Document` objects to fetch to pass to MMR algorithm. TYPE: `int` DEFAULT: `20`
`lambda_mult`	Number between `0` and `1` that determines the degree of diversity among the results with `0` corresponding to maximum diversity and `1` to minimum diversity. TYPE: `float` DEFAULT: `0.5`
`**kwargs`	Arguments to pass to the search method. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`list[Document]`	List of `Document` objects selected by maximal marginal relevance.

amax_marginal_relevance_search_by_vector `async` ¶

amax_marginal_relevance_search_by_vector(
    embedding: list[float],
    k: int = 4,
    fetch_k: int = 20,
    lambda_mult: float = 0.5,
    **kwargs: Any,
) -> list[Document]

Async return docs selected using the maximal marginal relevance.

Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.

PARAMETER	DESCRIPTION
`embedding`	Embedding to look up documents similar to. TYPE: `list[float]`
`k`	Number of `Document` objects to return. TYPE: `int` DEFAULT: `4`
`fetch_k`	Number of `Document` objects to fetch to pass to MMR algorithm. TYPE: `int` DEFAULT: `20`
`lambda_mult`	Number between `0` and `1` that determines the degree of diversity among the results with `0` corresponding to maximum diversity and `1` to minimum diversity. TYPE: `float` DEFAULT: `0.5`
`**kwargs`	Arguments to pass to the search method. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`list[Document]`	List of `Document` objects selected by maximal marginal relevance.

from_documents `classmethod` ¶

from_documents(documents: list[Document], embedding: Embeddings, **kwargs: Any) -> Self

Return VectorStore initialized from documents and embeddings.

PARAMETER	DESCRIPTION
`documents`	List of `Document` objects to add to the `VectorStore`. TYPE: `list[Document]`
`embedding`	Embedding function to use. TYPE: `Embeddings`
`**kwargs`	Additional keyword arguments. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`Self`	`VectorStore` initialized from documents and embeddings.

afrom_documents `async` `classmethod` ¶

afrom_documents(
    documents: list[Document], embedding: Embeddings, **kwargs: Any
) -> Self

Async return VectorStore initialized from documents and embeddings.

PARAMETER	DESCRIPTION
`documents`	List of `Document` objects to add to the `VectorStore`. TYPE: `list[Document]`
`embedding`	Embedding function to use. TYPE: `Embeddings`
`**kwargs`	Additional keyword arguments. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`Self`	`VectorStore` initialized from documents and embeddings.

afrom_texts `async` `classmethod` ¶

afrom_texts(
    texts: list[str],
    embedding: Embeddings,
    metadatas: list[dict] | None = None,
    *,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> Self

Async return VectorStore initialized from texts and embeddings.

PARAMETER	DESCRIPTION
`texts`	Texts to add to the `VectorStore`. TYPE: `list[str]`
`embedding`	Embedding function to use. TYPE: `Embeddings`
`metadatas`	Optional list of metadatas associated with the texts. TYPE: `list[dict] \| None` DEFAULT: `None`
`ids`	Optional list of IDs associated with the texts. TYPE: `list[str] \| None` DEFAULT: `None`
`**kwargs`	Additional keyword arguments. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`Self`	`VectorStore` initialized from texts and embeddings.

as_retriever ¶

as_retriever(**kwargs: Any) -> VectorStoreRetriever

Return VectorStoreRetriever initialized from this VectorStore.

PARAMETER	DESCRIPTION
`**kwargs`	Keyword arguments to pass to the search function. TYPE: `Any` DEFAULT: `{}`

`**kwargs` can include:

- `search_type`: Defines the type of search that the retriever should perform. Can be `'similarity'` (default), `'mmr'`, or `'similarity_score_threshold'`.
- `search_kwargs`: Keyword arguments to pass to the search function. Can include:
    - `k`: Number of documents to return (default: `4`).
    - `score_threshold`: Minimum relevance threshold for `similarity_score_threshold`.
    - `fetch_k`: Number of documents to pass to the MMR algorithm (default: `20`).
    - `lambda_mult`: Diversity of results returned by MMR; `1` for minimum diversity and `0` for maximum (default: `0.5`).
    - `filter`: Filter by document metadata.

RETURNS	DESCRIPTION
`VectorStoreRetriever`	Retriever class for `VectorStore`.

Examples:

# Retrieve more documents with higher diversity
# Useful if your dataset has many similar documents
docsearch.as_retriever(
    search_type="mmr", search_kwargs={"k": 6, "lambda_mult": 0.25}
)

# Fetch more documents for the MMR algorithm to consider
# But only return the top 5
docsearch.as_retriever(search_type="mmr", search_kwargs={"k": 5, "fetch_k": 50})

# Only retrieve documents that have a relevance score
# Above a certain threshold
docsearch.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"score_threshold": 0.8},
)

# Only get the single most similar document from the dataset
docsearch.as_retriever(search_kwargs={"k": 1})

# Use a filter to only retrieve documents from a specific paper
docsearch.as_retriever(
    search_kwargs={"filter": {"paper_title": "GPT-4 Technical Report"}}
)

__init__ ¶

__init__(
    client: WeaviateClient,
    index_name: str | None,
    text_key: str,
    embedding: Embeddings | None = None,
    attributes: list[str] | None = None,
    relevance_score_fn: Callable[[float], float] | None = _default_score_normalizer,
    use_multi_tenancy: bool = False,
)

Initialize with Weaviate client.

add_texts ¶

add_texts(
    texts: Iterable[str],
    metadatas: list[dict] | None = None,
    tenant: str | None = None,
    **kwargs: Any,
) -> list[str]

Upload texts with metadata (properties) to Weaviate.

similarity_search ¶

similarity_search(query: str, k: int = 4, **kwargs: Any) -> list[Document]

Return docs most similar to query.

PARAMETER	DESCRIPTION
`query`	Text to look up documents similar to. TYPE: `str`
`k`	Number of `Document` objects to return. TYPE: `int` DEFAULT: `4`
`**kwargs`	Additional keyword arguments will be passed to the `hybrid()` function of the weaviate client. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`list[Document]`	List of Documents most similar to the query.

max_marginal_relevance_search ¶

max_marginal_relevance_search(
    query: str, k: int = 4, fetch_k: int = 20, lambda_mult: float = 0.5, **kwargs: Any
) -> list[Document]

Return docs selected using the maximal marginal relevance.

Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.

PARAMETER	DESCRIPTION
`query`	Text to look up documents similar to. TYPE: `str`
`k`	Number of `Document` objects to return. TYPE: `int` DEFAULT: `4`
`fetch_k`	Number of `Document` objects to fetch to pass to MMR algorithm. TYPE: `int` DEFAULT: `20`
`lambda_mult`	Number between `0` and `1` that determines the degree of diversity among the results, with `0` corresponding to maximum diversity and `1` to minimum diversity. TYPE: `float` DEFAULT: `0.5`

RETURNS	DESCRIPTION
`list[Document]`	List of Documents selected by maximal marginal relevance.
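The trade-off controlled by `lambda_mult` can be seen in a small pure-Python sketch of the standard greedy MMR formulation, where each step picks the candidate maximizing `lambda_mult * sim(query, d) - (1 - lambda_mult) * max sim(d, already_selected)`. This illustrates the general technique, not the library's implementation. `mmr_select` and `cosine` are hypothetical helpers, and the vectors are made up.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query_vec, candidate_vecs, k=4, lambda_mult=0.5):
    """Greedily pick up to k candidate indices, balancing relevance and diversity."""
    selected = []
    remaining = list(range(len(candidate_vecs)))
    while remaining and len(selected) < k:

        def score(i):
            relevance = cosine(query_vec, candidate_vecs[i])
            # Similarity to the closest already-selected item (0.0 if none yet).
            redundancy = max(
                (cosine(candidate_vecs[i], candidate_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
candidates = [
    [1.0, 0.0],   # index 0: identical to the query
    [1.0, 0.05],  # index 1: near-duplicate of index 0
    [0.6, 0.8],   # index 2: less relevant but different
]
relevance_only = mmr_select(query, candidates, k=2, lambda_mult=1.0)
diverse = mmr_select(query, candidates, k=2, lambda_mult=0.3)
print(relevance_only, diverse)
```

With `lambda_mult=1.0` the near-duplicate is chosen second, whereas with `lambda_mult=0.3` the redundancy penalty pushes the selection toward the more distinct candidate. In the methods above, `fetch_k` would correspond to the size of `candidates` and `k` to the number of indices returned.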

max_marginal_relevance_search_by_vector ¶

max_marginal_relevance_search_by_vector(
    embedding: list[float],
    k: int = 4,
    fetch_k: int = 20,
    lambda_mult: float = 0.5,
    **kwargs: Any,
) -> list[Document]

Return docs selected using the maximal marginal relevance.

Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.

PARAMETER	DESCRIPTION
`embedding`	Embedding to look up documents similar to. TYPE: `list[float]`
`k`	Number of `Document` objects to return. TYPE: `int` DEFAULT: `4`
`fetch_k`	Number of `Document` objects to fetch to pass to MMR algorithm. TYPE: `int` DEFAULT: `20`
`lambda_mult`	Number between `0` and `1` that determines the degree of diversity among the results, with `0` corresponding to maximum diversity and `1` to minimum diversity. TYPE: `float` DEFAULT: `0.5`

RETURNS	DESCRIPTION
`list[Document]`	List of Documents selected by maximal marginal relevance.

similarity_search_with_score ¶

similarity_search_with_score(
    query: str, k: int = 4, **kwargs: Any
) -> list[tuple[Document, float]]

Return the documents most similar to the query text, each paired with its cosine distance as a float. A lower score indicates greater similarity.
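To make the "lower is more similar" convention concrete, the sketch below computes cosine distance as `1 - cosine_similarity` for a few made-up vectors and sorts in ascending order. It is a pure-Python illustration of the metric, not a call into Weaviate. `cosine_distance` and the sample vectors are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 0.0 for the same direction, 2.0 for opposite directions."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [1.0, 0.0]
vectors = {
    "opposite": [-1.0, 0.0],
    "same": [2.0, 0.0],
    "orthogonal": [0.0, 3.0],
}
# Sort ascending: the smallest distance is the best match.
ranked = sorted(
    ((name, cosine_distance(query, vec)) for name, vec in vectors.items()),
    key=lambda pair: pair[1],
)
print(ranked)
```

Note that this ordering is the reverse of the relevance scores returned by `similarity_search_with_relevance_scores`, where higher values indicate a better match.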

from_texts `classmethod` ¶

from_texts(
    texts: list[str],
    embedding: Embeddings | None,
    metadatas: list[dict] | None = None,
    *,
    tenant: str | None = None,
    client: WeaviateClient | None = None,
    index_name: str | None = None,
    text_key: str = "text",
    relevance_score_fn: Callable[[float], float] | None = _default_score_normalizer,
    **kwargs: Any,
) -> WeaviateVectorStore

Construct Weaviate wrapper from raw documents.

This is a user-friendly interface that:

- Embeds documents.
- Creates a new index for the embeddings in the Weaviate instance.
- Adds the documents to the newly created Weaviate index.

This is intended to be a quick way to get started.

PARAMETER	DESCRIPTION
`texts`	Texts to add to vector store. TYPE: `list[str]`
`embedding`	Text embedding model to use. TYPE: `Embeddings \| None`
`client`	`WeaviateClient` instance to use. TYPE: `WeaviateClient \| None` DEFAULT: `None`
`metadatas`	Metadata associated with each text. TYPE: `list[dict] \| None` DEFAULT: `None`
`tenant`	The tenant name. TYPE: `str \| None` DEFAULT: `None`
`index_name`	Index name. TYPE: `str \| None` DEFAULT: `None`
`text_key`	Key to use for uploading/retrieving text to/from vectorstore. TYPE: `str` DEFAULT: `'text'`
`relevance_score_fn`	Function for converting whatever distance function the vector store uses to a relevance score, which is a normalized similarity score (0 means dissimilar, 1 means similar). TYPE: `Callable[[float], float] \| None` DEFAULT: `_default_score_normalizer`
`**kwargs`	Additional named parameters to pass to `WeaviateVectorStore.__init__()`. TYPE: `Any` DEFAULT: `{}`

Example

.. code-block:: python

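from langchain_community.embeddings import OpenAIEmbeddings
from langchain_weaviate import WeaviateVectorStore

# `texts` (a list of strings) and `client` (a connected WeaviateClient)
# are assumed to be defined earlier.
embeddings = OpenAIEmbeddings()
vector_store = WeaviateVectorStore.from_texts(
    texts,
    embeddings,
    client=client,
)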
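A custom `relevance_score_fn` only needs to map a raw distance to a score in `[0, 1]`. The sketch below shows one possible mapping for cosine distance, assuming distances fall in `[0, 2]`. It is not the library's `_default_score_normalizer`, and `cosine_distance_to_relevance` is a hypothetical name.

```python
def cosine_distance_to_relevance(distance: float) -> float:
    """Map a cosine distance in [0, 2] linearly to a relevance score in [0, 1]."""
    score = 1.0 - distance / 2.0
    # Clamp so that out-of-range inputs still yield a valid score.
    return min(1.0, max(0.0, score))


scores = [cosine_distance_to_relevance(d) for d in (0.0, 1.0, 2.0, 3.0)]
print(scores)
```

Under these assumptions, such a function could be passed as `relevance_score_fn=cosine_distance_to_relevance` to `from_texts` or `__init__`.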

delete ¶

delete(ids: list[str] | None = None, tenant: str | None = None, **kwargs: Any) -> None

Delete by vector IDs.

PARAMETER	DESCRIPTION
`ids`	List of IDs to delete. TYPE: `list[str] \| None` DEFAULT: `None`
`tenant`	The tenant name. TYPE: `str \| None` DEFAULT: `None`
