langchain-azure-postgresql

Reference docs

This page contains reference documentation for Azure AI. See the docs for conceptual guides, tutorials, and examples on using Azure AI.

langchain_azure_postgresql

Common utilities and models for Azure PostgreSQL AI integrations.

FUNCTION DESCRIPTION
async_check_connection

Check if the connection to Azure Database for PostgreSQL is valid and required extensions are installed.

check_connection

Check if the connection to Azure Database for PostgreSQL is valid and required extensions are installed.

create_extensions

Create required extensions in the Azure Database for PostgreSQL connection.

HNSW

Bases: Algorithm[HNSWSearchParams]

HNSW algorithm settings.

Provides build-time and (via :class:HNSWSearchParams) search-time parameters for HNSW vector indexes.

:param m: The maximum number of connections per layer for HNSW index building. :type m: PositiveInt | None :param ef_construction: The size of the dynamic candidate list for constructing the HNSW graph. :type ef_construction: PositiveInt | None

Notes:

If ef_construction is not at least twice the value of m, a ValueError will be raised during validation.
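The validation rule in the note can be illustrated with a small stand-in. This is a minimal sketch using only the standard library; `HNSWSketch` is a hypothetical name and does not reproduce the package's pydantic model. How the real model behaves when only one of the two values is set is not stated on this page, so the sketch assumes the check runs only when both are given.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HNSWSketch:
    """Hypothetical stand-in for the documented HNSW build parameters."""

    m: Optional[int] = None
    ef_construction: Optional[int] = None

    def __post_init__(self) -> None:
        # Assumption: the rule is enforced only when both values are provided.
        if self.m is not None and self.ef_construction is not None:
            if self.ef_construction < 2 * self.m:
                raise ValueError("ef_construction must be at least twice m")


ok = HNSWSketch(m=16, ef_construction=64)  # 64 >= 2 * 16, accepted

try:
    HNSWSketch(m=16, ef_construction=20)  # 20 < 2 * 16, rejected
    rejected = False
except ValueError:
    rejected = True
```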

METHOD DESCRIPTION
index_settings

Return the general index settings for the algorithm.

build_settings

Return the specific index build settings for the algorithm.

default_search_params

Return the default search parameters for the algorithm.

index_settings

index_settings(exclude_none: bool = True) -> dict[str, Any]

Return the general index settings for the algorithm.

:param exclude_none: Whether to exclude keys with None values in the dictionary. :type exclude_none: bool :return: A dictionary containing the index settings. :rtype: dict[str, Any]

build_settings

build_settings(exclude_none: bool = True) -> dict[str, Any]

Return the specific index build settings for the algorithm.

:param exclude_none: Whether to exclude keys with None values in the dictionary. :type exclude_none: bool :return: A dictionary containing the settings. :rtype: dict[str, Any]
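The effect of exclude_none can be shown with a plain dictionary. The helper below is a sketch, not the package's code; `settings_dict` is a hypothetical name and the keys are the HNSW build parameters documented above.

```python
from typing import Any, Optional


def settings_dict(
    values: dict[str, Optional[Any]], exclude_none: bool = True
) -> dict[str, Any]:
    """Return a copy of ``values``, dropping None entries when requested."""
    if exclude_none:
        return {key: value for key, value in values.items() if value is not None}
    return dict(values)


build = {"m": 16, "ef_construction": None}
compact = settings_dict(build)                   # unset keys are dropped
full = settings_dict(build, exclude_none=False)  # unset keys are kept as None
```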

default_search_params

default_search_params() -> HNSWSearchParams

Return the default search parameters for the algorithm.

:return: An instance of the search parameters model. :rtype: SP

Algorithm

Bases: BaseModel, Generic[SP]

Base class for vector index algorithms and their settings.

Subclasses provide index build-time settings via :meth:build_settings and the default search-time settings via :meth:default_search_params.

The generic type parameter SP is a :class:SearchParams subtype that models the search-time parameters for the algorithm.

:param op_class: The operator class to use for the vector index. :type op_class: VectorOpClass
:param maintenance_work_mem: The amount of memory to use for maintenance operations. :type maintenance_work_mem: str | None
:param max_parallel_maintenance_workers: The maximum number of parallel workers for maintenance operations. :type max_parallel_maintenance_workers: NonNegativeInt | None
:param max_parallel_workers: The maximum number of parallel workers for query execution. :type max_parallel_workers: NonNegativeInt | None

METHOD DESCRIPTION
default_search_params

Return the default search parameters for the algorithm.

build_settings

Return the specific index build settings for the algorithm.

index_settings

Return the general index settings for the algorithm.

default_search_params abstractmethod

default_search_params() -> SP

Return the default search parameters for the algorithm.

:return: An instance of the search parameters model. :rtype: SP

build_settings abstractmethod

build_settings(exclude_none: bool = True) -> dict[str, Any]

Return the specific index build settings for the algorithm.

:param exclude_none: Whether to exclude keys with None values in the dictionary. :type exclude_none: bool :return: A dictionary containing the settings. :rtype: dict[str, Any]

index_settings

index_settings(exclude_none: bool = True) -> dict[str, Any]

Return the general index settings for the algorithm.

:param exclude_none: Whether to exclude keys with None values in the dictionary. :type exclude_none: bool :return: A dictionary containing the index settings. :rtype: dict[str, Any]

AsyncAzurePGConnectionPool

Bases: AsyncConnectionPool

Async connection pool for Azure Database for PostgreSQL connections.

AsyncConnectionInfo

Bases: BaseConnectionInfo

Base connection information for Azure Database for PostgreSQL connections.

:param host: Hostname of the Azure Database for PostgreSQL server. :type host: str | None :param dbname: Name of the database to connect to. :type dbname: str :param port: Port number for the connection. :type port: int :param credentials: Credentials for authentication. :type credentials: BasicAuth | AsyncTokenCredential :param sslmode: SSL mode for the connection. :type sslmode: SSLMode

AzurePGConnectionPool

Bases: ConnectionPool

Connection pool for Azure Database for PostgreSQL connections.

BasicAuth

Bases: BaseModel

Basic username/password authentication for Azure Database for PostgreSQL connections.

:param username: Username for the connection. :type username: str :param password: Password for the connection. :type password: str

ConnectionInfo

Bases: BaseConnectionInfo

Base connection information for Azure Database for PostgreSQL connections.

:param host: Hostname of the Azure Database for PostgreSQL server. :type host: str | None :param dbname: Name of the database to connect to. :type dbname: str :param port: Port number for the connection. :type port: int :param sslmode: SSL mode for the connection. :type sslmode: SSLMode :param credentials: Credentials for the connection. :type credentials: BasicAuth | TokenCredential

DiskANN

Bases: Algorithm[DiskANNSearchParams]

DiskANN algorithm settings.

Provides build-time and (via :class:DiskANNSearchParams) search-time parameters for DiskANN vector indexes.

:param max_neighbors: The maximum number of edges per node in the graph. :type max_neighbors: PositiveInt | None
:param l_value_ib: The value of the L parameter for DiskANN index building. :type l_value_ib: PositiveInt | None
:param product_quantized: Whether to use product quantization (PQ) for the index. :type product_quantized: bool | None
:param pq_param_num_chunks: Number of chunks for product quantization (PQ). :type pq_param_num_chunks: NonNegativeInt | None
:param pq_param_training_samples: Number of training samples for product quantization (PQ). :type pq_param_training_samples: NonNegativeInt | None

Notes:

If product_quantized is True, pq_param_num_chunks and pq_param_training_samples can be provided. Otherwise, these parameters are invalid and raise a ValueError during validation.
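The rule in the note can be sketched as follows. `DiskANNSketch` is a hypothetical standard-library stand-in, not the package's model, and it treats an unset product_quantized (None) the same as False, which is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiskANNSketch:
    """Hypothetical stand-in for the documented DiskANN PQ parameters."""

    product_quantized: Optional[bool] = None
    pq_param_num_chunks: Optional[int] = None
    pq_param_training_samples: Optional[int] = None

    def __post_init__(self) -> None:
        pq_params_given = (
            self.pq_param_num_chunks is not None
            or self.pq_param_training_samples is not None
        )
        # Assumption: None is treated like False for product_quantized.
        if pq_params_given and not self.product_quantized:
            raise ValueError("PQ parameters require product_quantized=True")


with_pq = DiskANNSketch(product_quantized=True, pq_param_num_chunks=8)

try:
    DiskANNSketch(product_quantized=False, pq_param_num_chunks=8)
    pq_rejected = False
except ValueError:
    pq_rejected = True
```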

METHOD DESCRIPTION
index_settings

Return the general index settings for the algorithm.

build_settings

Return the specific index build settings for the algorithm.

default_search_params

Return the default search parameters for the algorithm.

index_settings

index_settings(exclude_none: bool = True) -> dict[str, Any]

Return the general index settings for the algorithm.

:param exclude_none: Whether to exclude keys with None values in the dictionary. :type exclude_none: bool :return: A dictionary containing the index settings. :rtype: dict[str, Any]

build_settings

build_settings(exclude_none: bool = True) -> dict[str, Any]

Return the specific index build settings for the algorithm.

:param exclude_none: Whether to exclude keys with None values in the dictionary. :type exclude_none: bool :return: A dictionary containing the settings. :rtype: dict[str, Any]

default_search_params

default_search_params() -> DiskANNSearchParams

Return the default search parameters for the algorithm.

:return: An instance of the search parameters model. :rtype: SP

DiskANNIterativeScanMode

Bases: str, Enum

Enumeration for DiskANN iterative scan modes.

DiskANNSearchParams

Bases: SearchParams

Search-time parameters for DiskANN indexes.

All settings are exported with the diskann. prefix when used in SQL.

:param l_value_is: The value of the L parameter for DiskANN index searching. :type l_value_is: PositiveInt | None :param iterative_search: The iterative search mode for DiskANN index searching. :type iterative_search: DiskANNIterativeScanMode | None

METHOD DESCRIPTION
search_settings

Return the specific index search settings for the algorithm.

search_settings

search_settings(exclude_none: bool = True) -> dict[str, Any]

Return the specific index search settings for the algorithm.

:param exclude_none: Whether to exclude keys with None values in the dictionary. :type exclude_none: bool :return: A dictionary containing the search settings. :rtype: dict[str, Any]

Extension

Bases: BaseModel

Model representing a PostgreSQL extension.

:param ext_name: Name of the extension to be created, checked or dropped. :type ext_name: str :param ext_version: Optional version of the extension to be created or checked. :type ext_version: str | None :param schema_name: Optional schema name where the extension should be created or checked. :type schema_name: str | None :param cascade: Whether to automatically install the extension dependencies or drop the objects that depend on the extension. :type cascade: bool

HNSWIterativeScanMode

Bases: str, Enum

Enumeration for HNSW iterative scan modes.

HNSWSearchParams

Bases: SearchParams

Search-time parameters for HNSW indexes.

All settings are exported with the hnsw. prefix when used in SQL.

:param ef_search: Size of the dynamic candidate list for HNSW index searching. :type ef_search: PositiveInt | None
:param iterative_scan: The iterative search mode for HNSW index searching. :type iterative_scan: HNSWIterativeScanMode | None
:param max_scan_tuples: The maximum number of tuples to visit during HNSW index searching. :type max_scan_tuples: PositiveInt | None
:param scan_mem_multiplier: The maximum amount of memory to use, as a multiple of work_mem, during HNSW index searching. :type scan_mem_multiplier: PositiveFloat | None

METHOD DESCRIPTION
search_settings

Return the specific index search settings for the algorithm.

search_settings

search_settings(exclude_none: bool = True) -> dict[str, Any]

Return the specific index search settings for the algorithm.

:param exclude_none: Whether to exclude keys with None values in the dictionary. :type exclude_none: bool :return: A dictionary containing the search settings. :rtype: dict[str, Any]
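The hnsw. prefix mentioned above can be pictured with a small helper. This is a sketch under assumptions: `prefixed_settings` is a hypothetical function, the exact shape of the dictionary returned by search_settings is not shown on this page, and "relaxed_order" is a pgvector value used only for illustration because the members of HNSWIterativeScanMode are not listed here.

```python
from typing import Any, Optional


def prefixed_settings(
    prefix: str, params: dict[str, Optional[Any]], exclude_none: bool = True
) -> dict[str, Any]:
    """Prefix each parameter name, optionally skipping unset parameters."""
    return {
        f"{prefix}.{name}": value
        for name, value in params.items()
        if not (exclude_none and value is None)
    }


settings = prefixed_settings(
    "hnsw",
    {"ef_search": 100, "iterative_scan": "relaxed_order", "max_scan_tuples": None},
)
```

The same helper with the prefix "diskann" or "ivfflat" models the other two search parameter classes.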

IVFFlat

Bases: Algorithm[IVFFlatSearchParams]

IVF-Flat algorithm settings.

Provides build-time and (via :class:IVFFlatSearchParams) search-time parameters for IVF-Flat vector indexes.

:param lists: The number of inverted lists to use for IVF-Flat indexing. :type lists: PositiveInt | None

METHOD DESCRIPTION
index_settings

Return the general index settings for the algorithm.

build_settings

Return the specific index build settings for the algorithm.

default_search_params

Return the default search parameters for the algorithm.

index_settings

index_settings(exclude_none: bool = True) -> dict[str, Any]

Return the general index settings for the algorithm.

:param exclude_none: Whether to exclude keys with None values in the dictionary. :type exclude_none: bool :return: A dictionary containing the index settings. :rtype: dict[str, Any]

build_settings

build_settings(exclude_none: bool = True) -> dict[str, Any]

Return the specific index build settings for the algorithm.

:param exclude_none: Whether to exclude keys with None values in the dictionary. :type exclude_none: bool :return: A dictionary containing the settings. :rtype: dict[str, Any]

default_search_params

default_search_params() -> IVFFlatSearchParams

Return the default search parameters for the algorithm.

:return: An instance of the search parameters model. :rtype: SP

IVFFlatIterativeScanMode

Bases: str, Enum

Enumeration for IVFFlat iterative scan modes.

IVFFlatSearchParams

Bases: SearchParams

Search-time parameters for IVF-Flat indexes.

All settings are exported with the ivfflat. prefix when used in SQL.

:param probes: The number of probes to use during IVF-Flat index searching. :type probes: PositiveInt | None :param iterative_scan: The iterative search mode for IVF-Flat index searching. :type iterative_scan: IVFFlatIterativeScanMode | None :param max_probes: The maximum number of probes to use during IVF-Flat index searching. :type max_probes: PositiveInt | None

METHOD DESCRIPTION
search_settings

Return the specific index search settings for the algorithm.

search_settings

search_settings(exclude_none: bool = True) -> dict[str, Any]

Return the specific index search settings for the algorithm.

:param exclude_none: Whether to exclude keys with None values in the dictionary. :type exclude_none: bool :return: A dictionary containing the search settings. :rtype: dict[str, Any]

SSLMode

Bases: str, Enum

SSL mode for Azure Database for PostgreSQL connections.

VectorOpClass

Bases: str, Enum

Enumeration for operator classes used in vector indexes.

METHOD DESCRIPTION
to_operator

Return the distance operator as a string.

to_operator

to_operator() -> str

Return the distance operator as a string.

:return: The distance operator string. :rtype: str :raises ValueError: If the vector operator class is unsupported.
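A mapping from operator class to distance operator might look like the sketch below. The enum's members are not listed on this page, so the three operator classes and their operators are taken from pgvector as an assumption, and `to_operator` here is a free function rather than the enum method.

```python
# Assumption: pgvector operator classes and their distance operators.
OPERATORS = {
    "vector_l2_ops": "<->",      # Euclidean distance
    "vector_ip_ops": "<#>",      # negative inner product
    "vector_cosine_ops": "<=>",  # cosine distance
}


def to_operator(op_class: str) -> str:
    """Return the distance operator for an operator class name."""
    try:
        return OPERATORS[op_class]
    except KeyError:
        raise ValueError(f"Unsupported operator class: {op_class}") from None


cosine_operator = to_operator("vector_cosine_ops")
```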

VectorType

Bases: str, Enum

Enumeration for vector types used in vector similarity search.

AsyncAzurePGVectorStore

Bases: BaseModel, VectorStore

LangChain VectorStore backed by Azure Database for PostgreSQL (async).

The store validates or creates the backing table on initialization, and optionally discovers an existing vector index configuration. It supports inserting, deleting, fetching by id, similarity search, and MMR search.

Fields such as schema_name, table_name, and column names control the schema layout. embedding_type, embedding_dimension, and embedding_index describe the vector column and its index behavior.

Metadata can be stored in a single JSONB column by passing a string (default "metadata"), in multiple typed columns via a list of strings/tuples, or disabled by setting metadata_columns=None.

:param embedding: The embedding model to use for embedding vector generation. :type embedding: Embeddings | None
:param connection: The database connection or connection pool to use. :type connection: AsyncConnection | AsyncConnectionPool
:param schema_name: The name of the database schema to use. :type schema_name: str
:param table_name: The name of the database table to use. :type table_name: str
:param id_column: The name of the column containing document IDs (UUIDs). :type id_column: str
:param content_column: The name of the column containing document content. :type content_column: str
:param embedding_column: The name of the column containing document embeddings. :type embedding_column: str
:param embedding_type: The type of the embedding vectors. :type embedding_type: VectorType | None
:param embedding_dimension: The dimensionality of the embedding vectors. :type embedding_dimension: PositiveInt | None
:param embedding_index: The algorithm used for indexing the embedding vectors. :type embedding_index: Algorithm | None
:param _embedding_index_name: (internal) The name of the discovered or created index. :type _embedding_index_name: str | None
:param metadata_columns: The columns to use for storing metadata. :type metadata_columns: list[str] | list[tuple[str, str]] | str | None

METHOD DESCRIPTION
asearch

Asynchronously return docs most similar to query using a specified search type.

asimilarity_search_with_relevance_scores

Asynchronously return docs and relevance scores in the range [0, 1].

as_retriever

Return VectorStoreRetriever initialized from this VectorStore.

create_index

Create the vector index on the embedding column (if it does not already exist).

reindex

Reindex the existing vector index.

afrom_documents

Create a store and add documents in one step.

afrom_texts

Create a store and add texts with optional metadata.

aadd_documents

Insert or upsert a batch of LangChain documents.

aadd_texts

Insert or upsert texts with optional metadatas and embeddings.

adelete

Delete by ids or truncate the table.

aget_by_ids

Fetch documents by their ids.

asimilarity_search

Similarity search for a query string using the configured index.

asimilarity_search_with_score

Similarity search returning (document, distance) pairs.

asimilarity_search_by_vector

Similarity search for a precomputed embedding vector.

amax_marginal_relevance_search

MMR search for a query string.

amax_marginal_relevance_search_by_vector

MMR search for a precomputed embedding vector.

from_documents

Return VectorStore initialized from documents and embeddings.

from_texts

Return VectorStore initialized from texts and embeddings.

add_documents

Add or update documents in the VectorStore.

add_texts

Run more texts through the embeddings and add to the VectorStore.

delete

Delete by vector ID or other criteria.

get_by_ids

Get documents by their IDs.

search

Return docs most similar to query using a specified search type.

similarity_search

Return docs most similar to query.

similarity_search_with_score

Run similarity search with distance.

similarity_search_by_vector

Return docs most similar to embedding vector.

similarity_search_with_relevance_scores

Return docs and relevance scores in the range [0, 1].

max_marginal_relevance_search

Return docs selected using the maximal marginal relevance.

max_marginal_relevance_search_by_vector

Return docs selected using the maximal marginal relevance.

embeddings property

embeddings: Embeddings | None

Access the query embedding object if available.

asearch async

asearch(query: str, search_type: str, **kwargs: Any) -> list[Document]

Asynchronously return docs most similar to query using a specified search type.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

search_type

Type of search to perform.

Can be 'similarity', 'mmr', or 'similarity_score_threshold'.

TYPE: str

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects most similar to the query.

RAISES DESCRIPTION
ValueError

If search_type is not one of 'similarity', 'mmr', or 'similarity_score_threshold'.

asimilarity_search_with_relevance_scores async

asimilarity_search_with_relevance_scores(
    query: str, k: int = 4, **kwargs: Any
) -> list[tuple[Document, float]]

Asynchronously return docs and relevance scores in the range [0, 1].

0 is dissimilar, 1 is most similar.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

**kwargs

Kwargs to be passed to similarity search.

Should include score_threshold, an optional floating point value between 0 and 1 used to filter the resulting set of retrieved docs.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[tuple[Document, float]]

List of tuples of (doc, similarity_score)
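The effect of score_threshold on the returned pairs can be sketched with plain tuples. `filter_by_threshold` is a hypothetical helper, strings stand in for Document objects, and keeping scores equal to the threshold is an assumption.

```python
from typing import Optional


def filter_by_threshold(
    scored: list[tuple[str, float]], score_threshold: Optional[float] = None
) -> list[tuple[str, float]]:
    """Keep (doc, relevance) pairs whose relevance meets the threshold."""
    if score_threshold is None:
        return list(scored)
    # Assumption: a score equal to the threshold is kept.
    return [(doc, score) for doc, score in scored if score >= score_threshold]


scored = [("doc-a", 0.9), ("doc-b", 0.8), ("doc-c", 0.4)]
kept = filter_by_threshold(scored, score_threshold=0.8)
```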

as_retriever

as_retriever(**kwargs: Any) -> VectorStoreRetriever

Return VectorStoreRetriever initialized from this VectorStore.

PARAMETER DESCRIPTION
**kwargs

Keyword arguments to pass to the search function.

Can include:

  • search_type: Defines the type of search that the Retriever should perform. Can be 'similarity' (default), 'mmr', or 'similarity_score_threshold'.
  • search_kwargs: Keyword arguments to pass to the search function.

    Can include things like:

    • k: Number of documents to return (Default: 4)
    • score_threshold: Minimum relevance threshold for similarity_score_threshold
    • fetch_k: Number of documents to pass to MMR algorithm (Default: 20)
    • lambda_mult: Diversity of results returned by MMR; 1 for minimum diversity and 0 for maximum. (Default: 0.5)
    • filter: Filter by document metadata

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
VectorStoreRetriever

Retriever class for VectorStore.

Examples:

# Retrieve more documents with higher diversity
# Useful if your dataset has many similar documents
docsearch.as_retriever(
    search_type="mmr", search_kwargs={"k": 6, "lambda_mult": 0.25}
)

# Fetch more documents for the MMR algorithm to consider
# But only return the top 5
docsearch.as_retriever(search_type="mmr", search_kwargs={"k": 5, "fetch_k": 50})

# Only retrieve documents that have a relevance score
# Above a certain threshold
docsearch.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"score_threshold": 0.8},
)

# Only get the single most similar document from the dataset
docsearch.as_retriever(search_kwargs={"k": 1})

# Use a filter to only retrieve documents from a specific paper
docsearch.as_retriever(
    search_kwargs={"filter": {"paper_title": "GPT-4 Technical Report"}}
)

create_index async

create_index(*, concurrently: bool = False) -> bool

Create the vector index on the embedding column (if it does not already exist).

Builds a vector index for the configured embedding_column using the algorithm specified by embedding_index (DiskANN, HNSW or IVFFlat). The effective index type name is inferred from the concrete Algorithm instance and the index name is generated as <table>_<column>_<type>_idx. If an index has already been discovered (_embedding_index_name is not None) the operation is skipped.

Prior to executing create index the per-build tuning parameters (returned by :meth:Algorithm.index_settings) are applied via set GUCs so they only affect this session. Build-time options (returned by :meth:Algorithm.build_settings) are appended in a with (...) clause.

For quantized operator classes:
  • halfvec_* (scalar quantization) casts both the stored column and future query vectors to halfvec(dim).
  • bit_* (binary quantization) wraps the column with binary_quantize(col)::bit(dim).

Otherwise the raw column is indexed.

:param concurrently: When True uses create index concurrently to avoid long write-locks at the expense of a slower build. :type concurrently: bool :return: True if the index was created, False when an existing index prevented creation. :rtype: bool :raises AssertionError: If required attributes (embedding_index or embedding_dimension) are unexpectedly None.
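The naming and expression rules described above can be sketched as string helpers. These functions are hypothetical and only build text; they are not the statements the store executes, they do no identifier quoting, and the exact SQL layout is an assumption.

```python
def index_name(table: str, column: str, index_type: str) -> str:
    """Compose the documented <table>_<column>_<type>_idx name."""
    return f"{table}_{column}_{index_type}_idx"


def index_expression(column: str, op_class: str, dim: int) -> str:
    """Pick the indexed expression from the operator class prefix."""
    if op_class.startswith("halfvec_"):
        return f"({column}::halfvec({dim}))"  # scalar quantization
    if op_class.startswith("bit_"):
        return f"(binary_quantize({column})::bit({dim}))"  # binary quantization
    return column  # raw column


def with_clause(build_settings: dict[str, object]) -> str:
    """Render build-time options as a with (...) clause, or nothing if empty."""
    if not build_settings:
        return ""
    options = ", ".join(f"{key} = {value}" for key, value in build_settings.items())
    return f" with ({options})"


name = index_name("documents", "embedding", "hnsw")
expression = index_expression("embedding", "halfvec_cosine_ops", 1536)
clause = with_clause({"m": 16, "ef_construction": 64})
```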

reindex async

reindex(*, concurrently: bool = False, verbose: bool = False) -> bool

Reindex the existing vector index.

Issues a reindex (concurrently <bool>, verbose <bool>) index command for the previously discovered or created index (tracked in _embedding_index_name). The session-level index tuning GUCs (returned by :meth:Algorithm.index_settings) are applied beforehand to influence the reindex process (useful for algorithms whose maintenance cost or accuracy depends on these settings).

:param concurrently: When True performs a concurrent reindex to minimize locking, trading speed for availability. :type concurrently: bool :param verbose: When True enables PostgreSQL verbose output, which may aid in diagnosing build performance issues. :type verbose: bool :return: True if reindex succeeded, False if no index existed. :rtype: bool :raises AssertionError: If embedding_index is unexpectedly None.
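The command shape described above can be rendered as text. `reindex_statement` is a hypothetical helper; it performs no identifier quoting, so real code would compose the statement with the driver's SQL composition utilities instead.

```python
def reindex_statement(
    index_name: str, *, concurrently: bool = False, verbose: bool = False
) -> str:
    """Render reindex (concurrently <bool>, verbose <bool>) index <name>."""
    options = (
        f"concurrently {str(concurrently).lower()}, "
        f"verbose {str(verbose).lower()}"
    )
    return f"reindex ({options}) index {index_name}"


statement = reindex_statement("documents_embedding_hnsw_idx", concurrently=True)
```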

afrom_documents async classmethod

afrom_documents(
    documents: list[Document], embedding: Embeddings, **kwargs: Any
) -> Self

Create a store and add documents in one step.

:param documents: The list of documents to add to the store. :type documents: list[Document] :param embedding: The embedding model to use for embedding vector generation. :type embedding: Embeddings

Kwargs
  • connection: (required) psycopg AsyncConnection or AsyncConnectionPool
  • schema_name, table_name, id_column, content_column, embedding_column: customize table/column names
  • embedding_type: VectorType of the embedding column
  • embedding_dimension: dimension of the vector column
  • embedding_index: Algorithm describing the vector index
  • metadata_columns: str | list[str | (str, str)] | None to configure metadata storage
  • on_conflict_update (passed to add): bool to upsert existing rows

:return: The created vector store instance. :rtype: Self

afrom_texts async classmethod

afrom_texts(
    texts: list[str],
    embedding: Embeddings,
    metadatas: list[dict] | None = None,
    *,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> Self

Create a store and add texts with optional metadata.

:param texts: The list of texts to add to the store. :type texts: list[str] :param embedding: The embedding model to use for embedding vector generation. :type embedding: Embeddings :param metadatas: The list of metadata dictionaries corresponding to each text. :type metadatas: list[dict] | None :param ids: The list of custom IDs corresponding to each text. When ids are not provided, UUIDs are generated. :type ids: list[str] | None

Kwargs

See :meth:afrom_documents for required and/or supported kwargs.

:return: The created vector store instance. :rtype: Self

aadd_documents async

aadd_documents(documents: list[Document], **kwargs: Any) -> list[str]

Insert or upsert a batch of LangChain documents.

Kwargs
  • ids: list[str] custom ids, otherwise UUIDs or doc.id are used
  • on_conflict_update: bool to update existing rows on id conflict

:return: Inserted ids. :rtype: list[str]
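One way to read the ids rule is sketched below. `DocSketch` and `resolve_ids` are hypothetical names; the fallback order (explicit ids, then doc.id, then a fresh UUID) is an assumption based on the kwargs description, not the package's code.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class DocSketch:
    """Hypothetical stand-in for a LangChain Document."""

    page_content: str
    id: Optional[str] = None


def resolve_ids(
    documents: list[DocSketch], ids: Optional[list[str]] = None
) -> list[str]:
    """Choose one id per document."""
    if ids is not None:
        if len(ids) != len(documents):
            raise ValueError("ids and documents must have the same length")
        return list(ids)  # explicit ids win
    # Assumption: fall back to doc.id, then to a generated UUID.
    return [
        doc.id if doc.id is not None else str(uuid.uuid4()) for doc in documents
    ]


docs = [DocSketch("first", id="a"), DocSketch("second")]
explicit = resolve_ids(docs, ids=["x", "y"])
fallback = resolve_ids(docs)
```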

aadd_texts async

aadd_texts(
    texts: Iterable[str],
    metadatas: list[dict] | None = None,
    *,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> list[str]

Insert or upsert texts with optional metadatas and embeddings.

If an embeddings model is present, embeddings are computed and stored. When metadata_columns is a string, metadata is written as JSONB; otherwise only provided keys matching configured columns are stored.

Kwargs
  • ids: list[str] custom ids, otherwise UUIDs are used
  • on_conflict_update: bool to update existing rows on id conflict

:return: Inserted ids. :rtype: list[str] :raises ValueError: If the lengths of 'metadatas', 'texts', and 'ids' do not match.
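The three metadata_columns modes can be pictured with a small mapping function. `metadata_values` is a hypothetical helper, the JSONB value is represented as a JSON string, and reading each tuple as (column name, SQL type) is an assumption.

```python
import json
from typing import Any, Optional, Union

MetadataColumns = Optional[Union[str, list[Union[str, tuple[str, str]]]]]


def metadata_values(
    metadata_columns: MetadataColumns, metadata: dict[str, Any]
) -> dict[str, Any]:
    """Map one metadata dictionary onto the configured columns."""
    if metadata_columns is None:
        return {}  # metadata storage disabled
    if isinstance(metadata_columns, str):
        # Single JSONB column: the whole dictionary goes into one value.
        return {metadata_columns: json.dumps(metadata)}
    # Typed columns: keep only keys that match a configured column.
    names = [col if isinstance(col, str) else col[0] for col in metadata_columns]
    return {name: metadata[name] for name in names if name in metadata}


meta = {"author": "Ada", "year": 2024, "extra": True}
as_json = metadata_values("metadata", {"author": "Ada"})
as_columns = metadata_values(["author", ("year", "integer")], meta)
```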

adelete async

adelete(ids: list[str] | None = None, **kwargs: Any) -> bool | None

Delete by ids or truncate the table.

If ids is None, the table is truncated.

Kwargs
  • restart: bool to restart (when True) or continue (when False) identity, when truncating
  • cascade: bool to cascade (when True) or restrict (when False), when truncating

:return: True if the operation was successful, False otherwise. :rtype: bool | None
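The two truncation kwargs correspond to PostgreSQL's identity and dependency clauses. The helper below is a sketch that renders the statement as text; `truncate_statement` is a hypothetical name, it does no identifier quoting, and both flags are required because their defaults are not stated on this page.

```python
def truncate_statement(table: str, *, restart: bool, cascade: bool) -> str:
    """Render a truncate statement for the given flags."""
    identity = "restart identity" if restart else "continue identity"
    behavior = "cascade" if cascade else "restrict"
    return f"truncate table {table} {identity} {behavior}"


statement = truncate_statement("documents", restart=True, cascade=False)
```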

aget_by_ids async

aget_by_ids(ids: Sequence[str]) -> list[Document]

Fetch documents by their ids.

:param ids: Sequence of string ids. :type ids: Sequence[str] :return: Documents with metadata reconstructed from configured columns. :rtype: list[Document]

asimilarity_search async

asimilarity_search(query: str, k: int = 4, **kwargs: Any) -> list[Document]

Similarity search for a query string using the configured index.

:param query: Query text to embed and search. :type query: str :param k: Number of most similar documents. :type k: int

Kwargs
  • filter: Filter | None; Optional filter to apply to the search.
  • top_m: int; Number of top results to prefetch when re-ranking (default: 5 * k).

:return: Top-k documents. :rtype: list[Document]

asimilarity_search_with_score async

asimilarity_search_with_score(
    query: str, k: int = 4, **kwargs: Any
) -> list[tuple[Document, float]]

Similarity search returning (document, distance) pairs.

:param query: Query text to embed and search. :type query: str :param k: Number of most similar documents. :type k: int

Kwargs

See :meth:asimilarity_search for supported kwargs.

:return: Top-k (document, distance) pairs. :rtype: list[tuple[Document, float]]

asimilarity_search_by_vector async

asimilarity_search_by_vector(
    embedding: list[float], k: int = 4, **kwargs: Any
) -> list[Document]

Similarity search for a precomputed embedding vector.

:param embedding: The precomputed embedding vector to search for. :type embedding: list[float] :param k: Number of most similar documents. :type k: int

Kwargs

See :meth:asimilarity_search for supported kwargs.

:return: Top-k documents. :rtype: list[Document]

amax_marginal_relevance_search async

amax_marginal_relevance_search(
    query: str, k: int = 4, fetch_k: int = 20, lambda_mult: float = 0.5, **kwargs: Any
) -> list[Document]

MMR search for a query string.

:param query: The query string to search for. :type query: str :param k: Number of most similar documents to return. :type k: int :param fetch_k: Candidate pool size before MMR reranking. :type fetch_k: int :param lambda_mult: Diversity vs. relevance trade-off parameter. :type lambda_mult: float

Kwargs

See :meth:asimilarity_search for supported kwargs.

:return: Top-k documents. :rtype: list[Document]

amax_marginal_relevance_search_by_vector async

amax_marginal_relevance_search_by_vector(
    embedding: list[float],
    k: int = 4,
    fetch_k: int = 20,
    lambda_mult: float = 0.5,
    **kwargs: Any,
) -> list[Document]

MMR search for a precomputed embedding vector.

:param embedding: The precomputed embedding vector to search for. :type embedding: list[float] :param k: Number of most similar documents to return. :type k: int :param fetch_k: Candidate pool size before MMR reranking. :type fetch_k: int :param lambda_mult: Diversity vs. relevance trade-off parameter. :type lambda_mult: float

Kwargs

See :meth:asimilarity_search for supported kwargs.

:return: Top-k documents. :rtype: list[Document]
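The roles of k and lambda_mult can be seen in a small pure-Python version of maximal marginal relevance. This is a generic sketch of the technique with cosine similarity, not the store's implementation; `mmr_select` is a hypothetical name and the candidate list plays the part of the fetch_k pool.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms


def mmr_select(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> list[int]:
    """Return indices of up to k candidates chosen by MMR."""
    selected: list[int] = []
    remaining = list(range(len(candidates)))

    def score(i: int) -> float:
        relevance = cosine(query, candidates[i])
        # Redundancy is the closest match among already selected candidates.
        redundancy = max(
            (cosine(candidates[i], candidates[j]) for j in selected), default=0.0
        )
        return lambda_mult * relevance - (1 - lambda_mult) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = (1.0, 0.0)
pool = [(1.0, 0.0), (0.99, 0.1), (0.6, 0.8)]  # the second is a near duplicate
relevance_only = mmr_select(query, pool, k=2, lambda_mult=1.0)
diverse = mmr_select(query, pool, k=2, lambda_mult=0.25)
```

With lambda_mult=1.0 the near duplicate is picked second; with lambda_mult=0.25 the redundancy penalty favors the third, less similar candidate.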

from_documents classmethod

from_documents(documents: list[Document], embedding: Embeddings, **kwargs: Any) -> Self

Return VectorStore initialized from documents and embeddings.

PARAMETER DESCRIPTION
documents

List of Document objects to add to the VectorStore.

TYPE: list[Document]

embedding

Embedding function to use.

TYPE: Embeddings

**kwargs

Additional keyword arguments.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
Self

VectorStore initialized from documents and embeddings.

from_texts classmethod

from_texts(
    texts: list[str],
    embedding: Embeddings,
    metadatas: list[dict] | None = None,
    *,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> Self

Return VectorStore initialized from texts and embeddings.

PARAMETER DESCRIPTION
texts

Texts to add to the VectorStore.

TYPE: list[str]

embedding

Embedding function to use.

TYPE: Embeddings

metadatas

Optional list of metadatas associated with the texts.

TYPE: list[dict] | None DEFAULT: None

ids

Optional list of IDs associated with the texts.

TYPE: list[str] | None DEFAULT: None

**kwargs

Additional keyword arguments.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
Self

VectorStore initialized from texts and embeddings.

add_documents

add_documents(documents: list[Document], **kwargs: Any) -> list[str]

Add or update documents in the VectorStore.

PARAMETER DESCRIPTION
documents

Documents to add to the VectorStore.

TYPE: list[Document]

**kwargs

Additional keyword arguments.

If kwargs contains ids and the documents also contain ids, the ids in kwargs take precedence.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[str]

List of IDs of the added texts.

add_texts

add_texts(
    texts: Iterable[str],
    metadatas: list[dict] | None = None,
    *,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> list[str]

Run more texts through the embeddings and add to the VectorStore.

PARAMETER DESCRIPTION
texts

Iterable of strings to add to the VectorStore.

TYPE: Iterable[str]

metadatas

Optional list of metadatas associated with the texts.

TYPE: list[dict] | None DEFAULT: None

ids

Optional list of IDs associated with the texts.

TYPE: list[str] | None DEFAULT: None

**kwargs

VectorStore-specific parameters. One of the kwargs should be ids, a list of IDs associated with the texts.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[str]

List of IDs from adding the texts into the VectorStore.

RAISES DESCRIPTION
ValueError

If the number of metadatas does not match the number of texts.

ValueError

If the number of IDs does not match the number of texts.

delete

delete(ids: list[str] | None = None, **kwargs: Any) -> bool | None

Delete by vector ID or other criteria.

PARAMETER DESCRIPTION
ids

List of IDs to delete. If None, delete all.

TYPE: list[str] | None DEFAULT: None

**kwargs

Other keyword arguments that subclasses might use.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
bool | None

True if deletion is successful, False otherwise, None if not implemented.

get_by_ids

get_by_ids(ids: Sequence[str]) -> list[Document]

Get documents by their IDs.

The returned documents are expected to have the ID field set to the ID of the document in the vector store.

Fewer documents may be returned than requested if some IDs are not found or if there are duplicated IDs.

Users should not assume that the order of the returned documents matches the order of the input IDs. Instead, users should rely on the ID field of the returned documents.

This method should NOT raise exceptions if no documents are found for some IDs.

PARAMETER DESCRIPTION
ids

List of IDs to retrieve.

TYPE: Sequence[str]

RETURNS DESCRIPTION
list[Document]

List of Document objects.
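
Because the order of returned documents is not guaranteed and missing IDs are silently skipped, callers typically re-key the result by ID. The sketch below uses a hypothetical Doc dataclass as a stand-in for a LangChain Document and a hard-coded list in place of a real get_by_ids call.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""

    id: str
    page_content: str


def index_by_id(docs: list[Doc]) -> dict[str, Doc]:
    """Key documents by their ID field instead of relying on list order."""
    return {doc.id: doc for doc in docs}


requested = ["a", "b", "missing", "a"]
# Stand-in for what a store might return: different order, no "missing".
returned = [Doc("b", "second"), Doc("a", "first")]

by_id = index_by_id(returned)
# Restore the requested order, dropping duplicates and IDs that were not found.
ordered = [by_id[i] for i in dict.fromkeys(requested) if i in by_id]
```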

search

search(query: str, search_type: str, **kwargs: Any) -> list[Document]

Return docs most similar to query using a specified search type.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

search_type

Type of search to perform.

Can be 'similarity', 'mmr', or 'similarity_score_threshold'.

TYPE: str

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects most similar to the query.

RAISES DESCRIPTION
ValueError

If search_type is not one of 'similarity', 'mmr', or 'similarity_score_threshold'.
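
The dispatch-and-validate behavior described above can be sketched as follows. The function dispatch_search and the stub handlers are hypothetical; they only illustrate how an unknown search_type leads to a ValueError.

```python
from __future__ import annotations

from typing import Any, Callable

SEARCH_TYPES = ("similarity", "mmr", "similarity_score_threshold")


def dispatch_search(
    handlers: dict[str, Callable[..., list[Any]]],
    query: str,
    search_type: str,
    **kwargs: Any,
) -> list[Any]:
    """Route a query to the handler registered for search_type."""
    if search_type not in SEARCH_TYPES:
        raise ValueError(
            f"search_type must be one of {SEARCH_TYPES}, got {search_type!r}"
        )
    return handlers[search_type](query, **kwargs)


# Hypothetical stubs standing in for the store's real search methods.
handlers = {
    "similarity": lambda query, **kw: [f"similarity:{query}"],
    "mmr": lambda query, **kw: [f"mmr:{query}"],
    "similarity_score_threshold": lambda query, **kw: [f"threshold:{query}"],
}
```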

similarity_search

similarity_search(query: str, k: int = 4, **kwargs: Any) -> list[Document]

Return docs most similar to query.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects most similar to the query.

similarity_search_with_score

similarity_search_with_score(
    query: str, k: int = 4, **kwargs: Any
) -> list[tuple[Document, float]]

Run similarity search with distance.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[tuple[Document, float]]

List of tuples of (doc, similarity_score).

similarity_search_by_vector

similarity_search_by_vector(
    embedding: list[float], k: int = 4, **kwargs: Any
) -> list[Document]

Return docs most similar to embedding vector.

PARAMETER DESCRIPTION
embedding

Embedding to look up documents similar to.

TYPE: list[float]

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects most similar to the query vector.

similarity_search_with_relevance_scores

similarity_search_with_relevance_scores(
    query: str, k: int = 4, **kwargs: Any
) -> list[tuple[Document, float]]

Return docs and relevance scores in the range [0, 1].

0 is dissimilar, 1 is most similar.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

**kwargs

Kwargs to be passed to similarity search.

May include score_threshold, an optional floating-point value between 0 and 1 used to filter the resulting set of retrieved docs.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[tuple[Document, float]]

List of tuples of (doc, similarity_score).
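
The effect of score_threshold can be sketched on a list of (doc, score) pairs. This is an illustration only: filter_by_threshold is a hypothetical helper, plain strings stand in for Document objects, and the inclusive comparison is an assumption.

```python
from __future__ import annotations


def filter_by_threshold(
    scored: list[tuple[str, float]],
    score_threshold: float | None = None,
) -> list[tuple[str, float]]:
    """Keep only pairs whose relevance score reaches the threshold.

    Scores are assumed to already be in [0, 1], where 1 is most similar.
    """
    if score_threshold is None:
        return list(scored)
    # Assumption: a score equal to the threshold is kept.
    return [(doc, score) for doc, score in scored if score >= score_threshold]


results = [("doc-a", 0.92), ("doc-b", 0.80), ("doc-c", 0.41)]
kept = filter_by_threshold(results, score_threshold=0.8)
```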

max_marginal_relevance_search

max_marginal_relevance_search(
    query: str, k: int = 4, fetch_k: int = 20, lambda_mult: float = 0.5, **kwargs: Any
) -> list[Document]

Return docs selected using the maximal marginal relevance.

Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.

PARAMETER DESCRIPTION
query

Text to look up documents similar to.

TYPE: str

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

fetch_k

Number of Document objects to fetch to pass to MMR algorithm.

TYPE: int DEFAULT: 20

lambda_mult

Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity.

TYPE: float DEFAULT: 0.5

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects selected by maximal marginal relevance.
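
To make the role of lambda_mult concrete, here is a small pure-Python sketch of greedy maximal marginal relevance over cosine similarity. It is a textbook-style illustration under stated assumptions, not the code this library runs; the names cosine and mmr are hypothetical, and the candidates list plays the role of the fetch_k pool.

```python
from __future__ import annotations

import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr(
    query: list[float],
    candidates: list[list[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> list[int]:
    """Greedily pick up to k candidate indices, trading relevance for diversity."""
    selected: list[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_index, best_score = remaining[0], -math.inf
        for i in remaining:
            relevance = cosine(query, candidates[i])
            # Redundancy: similarity to the closest already-selected candidate.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_index, best_score = i, score
        selected.append(best_index)
        remaining.remove(best_index)
    return selected


query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
```

With lambda_mult=1.0 the selection is purely by relevance, so the near-duplicate at index 1 is picked second; lowering lambda_mult penalizes that redundancy and favors the more distinct vector at index 2.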

max_marginal_relevance_search_by_vector

max_marginal_relevance_search_by_vector(
    embedding: list[float],
    k: int = 4,
    fetch_k: int = 20,
    lambda_mult: float = 0.5,
    **kwargs: Any,
) -> list[Document]

Return docs selected using the maximal marginal relevance.

Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.

PARAMETER DESCRIPTION
embedding

Embedding to look up documents similar to.

TYPE: list[float]

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

fetch_k

Number of Document objects to fetch to pass to MMR algorithm.

TYPE: int DEFAULT: 20

lambda_mult

Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity.

TYPE: float DEFAULT: 0.5

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects selected by maximal marginal relevance.

AzurePGVectorStore

Bases: BaseModel, VectorStore

LangChain VectorStore backed by Azure Database for PostgreSQL (sync).

The store validates or creates the backing table on initialization, and optionally discovers an existing vector index configuration. It supports inserting, deleting, fetching by id, similarity search, and MMR search.

Fields such as schema_name, table_name, and column names control the schema layout. embedding_type, embedding_dimension, and embedding_index describe the vector column and its index behavior.

Metadata can be stored in a single JSONB column by passing a string (default "metadata"), in multiple typed columns via a list of strings/tuples, or disabled by setting metadata_columns=None.

:param embedding: The embedding model to use for embedding vector generation.
:type embedding: Embeddings | None
:param connection: The database connection or connection pool to use.
:type connection: Connection | ConnectionPool
:param schema_name: The name of the database schema to use.
:type schema_name: str
:param table_name: The name of the database table to use.
:type table_name: str
:param id_column: The name of the column containing document IDs (UUIDs).
:type id_column: str
:param content_column: The name of the column containing document content.
:type content_column: str
:param embedding_column: The name of the column containing document embeddings.
:type embedding_column: str
:param embedding_type: The type of the embedding vectors.
:type embedding_type: VectorType | None
:param embedding_dimension: The dimensionality of the embedding vectors.
:type embedding_dimension: PositiveInt | None
:param embedding_index: The algorithm used for indexing the embedding vectors.
:type embedding_index: Algorithm | None
:param _embedding_index_name: (internal) The name of the discovered or created index.
:type _embedding_index_name: str | None
:param metadata_columns: The columns to use for storing metadata.
:type metadata_columns: list[str] | list[tuple[str, str]] | str | None

METHOD DESCRIPTION
search

Return docs most similar to query using a specified search type.

similarity_search_with_relevance_scores

Return docs and relevance scores in the range [0, 1].

as_retriever

Return VectorStoreRetriever initialized from this VectorStore.

create_index

Create the vector index on the embedding column (if it does not already exist).

reindex

Reindex the existing vector index.

from_documents

Create a store and add documents in one step.

from_texts

Create a store and add texts with optional metadata.

add_documents

Insert or upsert a batch of LangChain documents.

add_texts

Insert or upsert texts with optional metadatas and embeddings.

delete

Delete by ids or truncate the table.

get_by_ids

Fetch documents by their ids.

similarity_search

Similarity search for a query string using the configured index.

similarity_search_with_score

Similarity search returning (document, distance) pairs.

similarity_search_by_vector

Similarity search for a precomputed embedding vector.

max_marginal_relevance_search

MMR search for a query string.

max_marginal_relevance_search_by_vector

MMR search for a precomputed embedding vector.

afrom_documents

Async return VectorStore initialized from documents and embeddings.

afrom_texts

Async return VectorStore initialized from texts and embeddings.

aadd_documents

Async run more documents through the embeddings and add to the VectorStore.

aadd_texts

Async run more texts through the embeddings and add to the VectorStore.

adelete

Async delete by vector ID or other criteria.

aget_by_ids

Async get documents by their IDs.

asearch

Async return docs most similar to query using a specified search type.

asimilarity_search

Async return docs most similar to query.

asimilarity_search_with_score

Async run similarity search with distance.

asimilarity_search_by_vector

Async return docs most similar to embedding vector.

asimilarity_search_with_relevance_scores

Async return docs and relevance scores in the range [0, 1].

amax_marginal_relevance_search

Async return docs selected using the maximal marginal relevance.

amax_marginal_relevance_search_by_vector

Async return docs selected using the maximal marginal relevance.

embeddings property

embeddings: Embeddings | None

Access the query embedding object if available.

search

search(query: str, search_type: str, **kwargs: Any) -> list[Document]

Return docs most similar to query using a specified search type.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

search_type

Type of search to perform.

Can be 'similarity', 'mmr', or 'similarity_score_threshold'.

TYPE: str

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects most similar to the query.

RAISES DESCRIPTION
ValueError

If search_type is not one of 'similarity', 'mmr', or 'similarity_score_threshold'.

similarity_search_with_relevance_scores

similarity_search_with_relevance_scores(
    query: str, k: int = 4, **kwargs: Any
) -> list[tuple[Document, float]]

Return docs and relevance scores in the range [0, 1].

0 is dissimilar, 1 is most similar.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

**kwargs

Kwargs to be passed to similarity search.

May include score_threshold, an optional floating-point value between 0 and 1 used to filter the resulting set of retrieved docs.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[tuple[Document, float]]

List of tuples of (doc, similarity_score).

as_retriever

as_retriever(**kwargs: Any) -> VectorStoreRetriever

Return VectorStoreRetriever initialized from this VectorStore.

PARAMETER DESCRIPTION
**kwargs

Keyword arguments to pass to the search function.

Can include:

  • search_type: Defines the type of search that the Retriever should perform. Can be 'similarity' (default), 'mmr', or 'similarity_score_threshold'.
  • search_kwargs: Keyword arguments to pass to the search function.

    Can include things like:

    • k: Number of documents to return (Default: 4)
    • score_threshold: Minimum relevance threshold for similarity_score_threshold
    • fetch_k: Number of documents to pass to the MMR algorithm (Default: 20)
    • lambda_mult: Diversity of results returned by MMR; 1 for minimum diversity and 0 for maximum. (Default: 0.5)
    • filter: Filter by document metadata

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
VectorStoreRetriever

Retriever class for VectorStore.

Examples:

# Retrieve more documents with higher diversity
# Useful if your dataset has many similar documents
docsearch.as_retriever(
    search_type="mmr", search_kwargs={"k": 6, "lambda_mult": 0.25}
)

# Fetch more documents for the MMR algorithm to consider
# But only return the top 5
docsearch.as_retriever(search_type="mmr", search_kwargs={"k": 5, "fetch_k": 50})

# Only retrieve documents that have a relevance score
# Above a certain threshold
docsearch.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"score_threshold": 0.8},
)

# Only get the single most similar document from the dataset
docsearch.as_retriever(search_kwargs={"k": 1})

# Use a filter to only retrieve documents from a specific paper
docsearch.as_retriever(
    search_kwargs={"filter": {"paper_title": "GPT-4 Technical Report"}}
)

create_index

create_index(*, concurrently: bool = False) -> bool

Create the vector index on the embedding column (if it does not already exist).

Builds a vector index for the configured embedding_column using the algorithm specified by embedding_index (DiskANN, HNSW or IVFFlat). The effective index type name is inferred from the concrete Algorithm instance and the index name is generated as <table>_<column>_<type>_idx. If an index has already been discovered (_embedding_index_name is not None) the operation is skipped.

Before create index is executed, the per-build tuning parameters (returned by :meth:Algorithm.index_settings) are applied as GUCs via set, so they affect only the current session. Build-time options (returned by :meth:Algorithm.build_settings) are appended in a with (...) clause.

For quantized operator classes:

  • halfvec_* (scalar quantization) casts both the stored column and future query vectors to halfvec(dim).
  • bit_* (binary quantization) wraps the column with binary_quantize(col)::bit(dim).

Otherwise the raw column is indexed.

:param concurrently: When True uses create index concurrently to avoid long write-locks at the expense of a slower build. :type concurrently: bool :return: True if the index was created, False when an existing index prevented creation. :rtype: bool :raises AssertionError: If required attributes (embedding_index or embedding_dimension) are unexpectedly None.
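
The naming rule and the quantization handling described above can be sketched as two small string helpers. Both function names are hypothetical and the output is illustrative only; the library builds its SQL internally and may quote or format identifiers differently.

```python
def index_name(table: str, column: str, index_type: str) -> str:
    """Build the index name following the <table>_<column>_<type>_idx pattern."""
    return f"{table}_{column}_{index_type}_idx"


def indexed_expression(column: str, dimension: int, opclass: str) -> str:
    """Return the expression to index for a given operator class.

    Hypothetical helper: real code should compose identifiers safely
    (for example with psycopg.sql) rather than with f-strings.
    """
    if opclass.startswith("halfvec_"):
        # Scalar quantization: index the column cast to halfvec(dim).
        return f"({column}::halfvec({dimension}))"
    if opclass.startswith("bit_"):
        # Binary quantization: index the binary-quantized column as bit(dim).
        return f"(binary_quantize({column})::bit({dimension}))"
    # No quantization: index the raw column.
    return column
```

For example, a table named docs with an embedding column and an HNSW index would yield the name docs_embedding_hnsw_idx under this pattern.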

reindex

reindex(*, concurrently: bool = False, verbose: bool = False) -> bool

Reindex the existing vector index.

Issues a reindex (concurrently <bool>, verbose <bool>) index command for the previously discovered or created index (tracked in _embedding_index_name). The session-level index tuning GUCs (returned by :meth:Algorithm.index_settings) are applied beforehand to influence the reindex process (useful for algorithms whose maintenance cost or accuracy depends on these settings).

:param concurrently: When True performs a concurrent reindex to minimize locking, trading speed for availability. :type concurrently: bool :param verbose: When True enables PostgreSQL verbose output, which may aid in diagnosing build performance issues. :type verbose: bool :return: True if reindex succeeded, False if no index existed. :rtype: bool :raises AssertionError: If embedding_index is unexpectedly None.
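
As a rough illustration of the command shape described above, the following sketch renders the statement text from the two flags. The helper reindex_statement is hypothetical; the library issues the command itself and may format it differently.

```python
def reindex_statement(
    index_name: str, *, concurrently: bool = False, verbose: bool = False
) -> str:
    """Render a reindex statement with explicit boolean options."""
    # Double any embedded quotes so the name stays a single quoted identifier.
    quoted = '"' + index_name.replace('"', '""') + '"'
    options = (
        f"concurrently {str(concurrently).lower()}, "
        f"verbose {str(verbose).lower()}"
    )
    return f"reindex ({options}) index {quoted}"
```

The parenthesized option list with explicit booleans lets both flags be passed unconditionally instead of assembling the statement from optional keywords.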

from_documents classmethod

from_documents(documents: list[Document], embedding: Embeddings, **kwargs: Any) -> Self

Create a store and add documents in one step.

:param documents: The list of documents to add to the store. :type documents: list[Document] :param embedding: The embedding model to use for embedding vector generation. :type embedding: Embeddings

Kwargs
  • connection: (required) psycopg Connection or ConnectionPool
  • schema_name, table_name, id_column, content_column, embedding_column: customize table/column names
  • embedding_type: VectorType of the embedding column
  • embedding_dimension: dimension of the vector column
  • embedding_index: Algorithm describing the vector index
  • metadata_columns: str | list[str | (str, str)] | None to configure metadata storage
  • on_conflict_update (passed to add): bool to upsert existing rows

:return: The created vector store instance. :rtype: Self

from_texts classmethod

from_texts(
    texts: list[str],
    embedding: Embeddings,
    metadatas: list[dict] | None = None,
    *,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> Self

Create a store and add texts with optional metadata.

:param texts: The list of texts to add to the store. :type texts: list[str] :param embedding: The embedding model to use for embedding vector generation. :type embedding: Embeddings :param metadatas: The list of metadata dictionaries corresponding to each text. :type metadatas: list[dict] | None :param ids: The list of custom IDs corresponding to each text. When ids are not provided, UUIDs are generated. :type ids: list[str] | None

Kwargs

See :meth:from_documents for required and/or supported kwargs.

:return: The created vector store instance. :rtype: Self

add_documents

add_documents(documents: list[Document], **kwargs: Any) -> list[str]

Insert or upsert a batch of LangChain documents.

Kwargs
  • ids: list[str] custom ids, otherwise UUIDs or doc.id are used
  • on_conflict_update: bool to update existing rows on id conflict

:return: Inserted ids. :rtype: list[str]
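
One way to picture the ID resolution described above: explicit ids win, otherwise a document's own id is used, otherwise a UUID is generated. The sketch below is an assumption-laden illustration; the Doc dataclass and resolve_ids are hypothetical, and the exact fallback order between doc.id and generated UUIDs is assumed rather than confirmed.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""

    page_content: str
    id: str | None = None


def resolve_ids(docs: list[Doc], ids: list[str] | None = None) -> list[str]:
    """Pick one ID per document: kwargs ids, then doc.id, then a new UUID."""
    if ids is not None:
        if len(ids) != len(docs):
            raise ValueError(f"Got {len(ids)} ids for {len(docs)} documents.")
        return list(ids)
    # Assumption: an existing doc.id is preferred over a generated UUID.
    return [doc.id if doc.id is not None else str(uuid.uuid4()) for doc in docs]


docs = [Doc("first", id="x"), Doc("second")]
```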

add_texts

add_texts(
    texts: Iterable[str],
    metadatas: list[dict] | None = None,
    *,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> list[str]

Insert or upsert texts with optional metadatas and embeddings.

If an embeddings model is present, embeddings are computed and stored. When metadata_columns is a string, metadata is written as JSONB; otherwise only provided keys matching configured columns are stored.

Kwargs
  • ids: list[str] custom ids, otherwise UUIDs are used
  • on_conflict_update: bool to update existing rows on id conflict

:return: Inserted ids. :rtype: list[str] :raises ValueError: If the lengths of 'metadatas', 'texts', and 'ids' do not match.
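
The two metadata storage modes can be sketched as a function that maps one metadata dictionary to column values. This is a simplified illustration: split_metadata is a hypothetical name, and serializing with json.dumps is an assumption about how a JSONB value might be prepared.

```python
from __future__ import annotations

import json
from typing import Any


def split_metadata(
    metadata: dict[str, Any],
    metadata_columns: str | list[str | tuple[str, str]] | None,
) -> dict[str, Any]:
    """Map a metadata dict onto the configured metadata column(s)."""
    if metadata_columns is None:
        # Metadata storage is disabled.
        return {}
    if isinstance(metadata_columns, str):
        # Single JSONB column: the whole dict is stored as one value.
        return {metadata_columns: json.dumps(metadata)}
    # Typed columns: entries are either "name" or ("name", "sql_type").
    names = [c if isinstance(c, str) else c[0] for c in metadata_columns]
    # Only keys that match a configured column are kept.
    return {name: metadata[name] for name in names if name in metadata}
```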

delete

delete(ids: list[str] | None = None, **kwargs: Any) -> bool | None

Delete by ids or truncate the table.

If ids is None, the table is truncated.

Kwargs
  • restart: bool to restart (when True) or continue (when False) identity, when truncating
  • cascade: bool to cascade (when True) or restrict (when False), when truncating

:return: True if the operation was successful, False otherwise. :rtype: bool | None
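
The restart and cascade options correspond to clauses of PostgreSQL's truncate command. The sketch below renders such a statement as text; truncate_statement is a hypothetical helper, both flags are made required because this page does not state their defaults, and the library's actual SQL may differ.

```python
def truncate_statement(
    schema: str, table: str, *, restart: bool, cascade: bool
) -> str:
    """Render a truncate statement for the given identity and cascade choices."""
    identity = "restart identity" if restart else "continue identity"
    behavior = "cascade" if cascade else "restrict"
    # Naive quoting for illustration; real code should use a SQL composition API.
    return f'truncate table "{schema}"."{table}" {identity} {behavior}'
```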

get_by_ids

get_by_ids(ids: Sequence[str]) -> list[Document]

Fetch documents by their ids.

:param ids: Sequence of string ids. :type ids: Sequence[str] :return: Documents with metadata reconstructed from configured columns. :rtype: list[Document]

similarity_search

similarity_search(query: str, k: int = 4, **kwargs: Any) -> list[Document]

Similarity search for a query string using the configured index.

:param query: Query text to embed and search. :type query: str :param k: Number of most similar documents. :type k: int

Kwargs
  • filter: Filter | None; Optional filter to apply to the search.
  • top_m: int; Number of top results to prefetch when re-ranking (default: 5 * k).

:return: Top-k documents. :rtype: list[Document]

similarity_search_with_score

similarity_search_with_score(
    query: str, k: int = 4, **kwargs: Any
) -> list[tuple[Document, float]]

Similarity search returning (document, distance) pairs.

:param query: Query text to embed and search. :type query: str :param k: Number of most similar documents (and their distances). :type k: int

Kwargs

See :meth:similarity_search for supported kwargs.

:return: Top-k (document, distance) pairs. :rtype: list[tuple[Document, float]]

similarity_search_by_vector

similarity_search_by_vector(
    embedding: list[float], k: int = 4, **kwargs: Any
) -> list[Document]

Similarity search for a precomputed embedding vector.

:param embedding: The precomputed embedding vector to search for. :type embedding: list[float] :param k: Number of most similar documents. :type k: int

Kwargs

See :meth:similarity_search for supported kwargs.

:return: Top-k documents. :rtype: list[Document]

max_marginal_relevance_search

max_marginal_relevance_search(
    query: str, k: int = 4, fetch_k: int = 20, lambda_mult: float = 0.5, **kwargs: Any
) -> list[Document]

MMR search for a query string.

:param query: The query string to search for. :type query: str :param k: Number of most similar documents to return. :type k: int :param fetch_k: Candidate pool size before MMR reranking. :type fetch_k: int :param lambda_mult: Diversity vs. relevance trade-off parameter. :type lambda_mult: float

Kwargs

See :meth:similarity_search for supported kwargs.

:return: Top-k documents. :rtype: list[Document]

max_marginal_relevance_search_by_vector

max_marginal_relevance_search_by_vector(
    embedding: list[float],
    k: int = 4,
    fetch_k: int = 20,
    lambda_mult: float = 0.5,
    **kwargs: Any,
) -> list[Document]

MMR search for a precomputed embedding vector.

:param embedding: The precomputed embedding vector to search for. :type embedding: list[float] :param k: Number of most similar documents to return. :type k: int :param fetch_k: Candidate pool size before MMR reranking. :type fetch_k: int :param lambda_mult: Diversity vs. relevance trade-off parameter. :type lambda_mult: float

Kwargs

See :meth:similarity_search for supported kwargs.

:return: Top-k documents. :rtype: list[Document]

afrom_documents async classmethod

afrom_documents(
    documents: list[Document], embedding: Embeddings, **kwargs: Any
) -> Self

Async return VectorStore initialized from documents and embeddings.

PARAMETER DESCRIPTION
documents

List of Document objects to add to the VectorStore.

TYPE: list[Document]

embedding

Embedding function to use.

TYPE: Embeddings

**kwargs

Additional keyword arguments.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
Self

VectorStore initialized from documents and embeddings.

afrom_texts async classmethod

afrom_texts(
    texts: list[str],
    embedding: Embeddings,
    metadatas: list[dict] | None = None,
    *,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> Self

Async return VectorStore initialized from texts and embeddings.

PARAMETER DESCRIPTION
texts

Texts to add to the VectorStore.

TYPE: list[str]

embedding

Embedding function to use.

TYPE: Embeddings

metadatas

Optional list of metadatas associated with the texts.

TYPE: list[dict] | None DEFAULT: None

ids

Optional list of IDs associated with the texts.

TYPE: list[str] | None DEFAULT: None

**kwargs

Additional keyword arguments.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
Self

VectorStore initialized from texts and embeddings.

aadd_documents async

aadd_documents(documents: list[Document], **kwargs: Any) -> list[str]

Async run more documents through the embeddings and add to the VectorStore.

PARAMETER DESCRIPTION
documents

Documents to add to the VectorStore.

TYPE: list[Document]

**kwargs

Additional keyword arguments.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[str]

List of IDs of the added texts.

aadd_texts async

aadd_texts(
    texts: Iterable[str],
    metadatas: list[dict] | None = None,
    *,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> list[str]

Async run more texts through the embeddings and add to the VectorStore.

PARAMETER DESCRIPTION
texts

Iterable of strings to add to the VectorStore.

TYPE: Iterable[str]

metadatas

Optional list of metadatas associated with the texts.

TYPE: list[dict] | None DEFAULT: None

ids

Optional list of IDs associated with the texts.

TYPE: list[str] | None DEFAULT: None

**kwargs

VectorStore specific parameters.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[str]

List of IDs from adding the texts into the VectorStore.

RAISES DESCRIPTION
ValueError

If the number of metadatas does not match the number of texts.

ValueError

If the number of IDs does not match the number of texts.

adelete async

adelete(ids: list[str] | None = None, **kwargs: Any) -> bool | None

Async delete by vector ID or other criteria.

PARAMETER DESCRIPTION
ids

List of IDs to delete. If None, delete all.

TYPE: list[str] | None DEFAULT: None

**kwargs

Other keyword arguments that subclasses might use.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
bool | None

True if deletion is successful, False otherwise, None if not implemented.

aget_by_ids async

aget_by_ids(ids: Sequence[str]) -> list[Document]

Async get documents by their IDs.

The returned documents are expected to have the ID field set to the ID of the document in the vector store.

Fewer documents may be returned than requested if some IDs are not found or if there are duplicated IDs.

Users should not assume that the order of the returned documents matches the order of the input IDs. Instead, users should rely on the ID field of the returned documents.

This method should NOT raise exceptions if no documents are found for some IDs.

PARAMETER DESCRIPTION
ids

List of IDs to retrieve.

TYPE: Sequence[str]

RETURNS DESCRIPTION
list[Document]

List of Document objects.

asearch async

asearch(query: str, search_type: str, **kwargs: Any) -> list[Document]

Async return docs most similar to query using a specified search type.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

search_type

Type of search to perform.

Can be 'similarity', 'mmr', or 'similarity_score_threshold'.

TYPE: str

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects most similar to the query.

RAISES DESCRIPTION
ValueError

If search_type is not one of 'similarity', 'mmr', or 'similarity_score_threshold'.

asimilarity_search async

asimilarity_search(query: str, k: int = 4, **kwargs: Any) -> list[Document]

Async return docs most similar to query.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects most similar to the query.

asimilarity_search_with_score async

asimilarity_search_with_score(
    *args: Any, **kwargs: Any
) -> list[tuple[Document, float]]

Async run similarity search with distance.

PARAMETER DESCRIPTION
*args

Arguments to pass to the search method.

TYPE: Any DEFAULT: ()

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[tuple[Document, float]]

List of tuples of (doc, similarity_score).

asimilarity_search_by_vector async

asimilarity_search_by_vector(
    embedding: list[float], k: int = 4, **kwargs: Any
) -> list[Document]

Async return docs most similar to embedding vector.

PARAMETER DESCRIPTION
embedding

Embedding to look up documents similar to.

TYPE: list[float]

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects most similar to the query vector.

asimilarity_search_with_relevance_scores async

asimilarity_search_with_relevance_scores(
    query: str, k: int = 4, **kwargs: Any
) -> list[tuple[Document, float]]

Async return docs and relevance scores in the range [0, 1].

0 is dissimilar, 1 is most similar.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

**kwargs

Kwargs to be passed to similarity search.

May include score_threshold, an optional floating-point value between 0 and 1 used to filter the resulting set of retrieved docs.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[tuple[Document, float]]

List of tuples of (doc, similarity_score).

amax_marginal_relevance_search async

amax_marginal_relevance_search(
    query: str, k: int = 4, fetch_k: int = 20, lambda_mult: float = 0.5, **kwargs: Any
) -> list[Document]

Async return docs selected using the maximal marginal relevance.

Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.

PARAMETER DESCRIPTION
query

Text to look up documents similar to.

TYPE: str

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

fetch_k

Number of Document objects to fetch to pass to MMR algorithm.

TYPE: int DEFAULT: 20

lambda_mult

Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity.

TYPE: float DEFAULT: 0.5

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects selected by maximal marginal relevance.

amax_marginal_relevance_search_by_vector async

amax_marginal_relevance_search_by_vector(
    embedding: list[float],
    k: int = 4,
    fetch_k: int = 20,
    lambda_mult: float = 0.5,
    **kwargs: Any,
) -> list[Document]

Async return docs selected using the maximal marginal relevance.

Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.

PARAMETER DESCRIPTION
embedding

Embedding to look up documents similar to.

TYPE: list[float]

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

fetch_k

Number of Document objects to fetch to pass to MMR algorithm.

TYPE: int DEFAULT: 20

lambda_mult

Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity.

TYPE: float DEFAULT: 0.5

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects selected by maximal marginal relevance.

async_check_connection async

async_check_connection(
    conn: AsyncConnection, /, required_extensions: list[Extension] = []
)

Check if the connection to Azure Database for PostgreSQL is valid and required extensions are installed.

:param conn: Async connection to the Azure Database for PostgreSQL. :type conn: AsyncConnection :param required_extensions: List of required extensions to check if they are installed. :type required_extensions: list[Extension] :raises RuntimeError: If the connection check fails or required extensions are not installed.

check_connection

check_connection(conn: Connection, /, required_extensions: list[Extension] = [])

Check if the connection to Azure Database for PostgreSQL is valid and required extensions are installed.

:param conn: Connection to the Azure Database for PostgreSQL. :type conn: Connection :param required_extensions: List of required extensions to check if they are installed. :type required_extensions: list[Extension] :raises RuntimeError: If the connection check fails or required extensions are not installed.
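
The extension check can be pictured as a set comparison that raises RuntimeError when something is missing. This sketch is database-free and hypothetical: check_required_extensions is not a library function, extensions are represented as plain names rather than Extension objects, and the installed set is assumed to have been read elsewhere (for instance from the pg_extension catalog).

```python
from __future__ import annotations


def check_required_extensions(installed: set[str], required: list[str]) -> None:
    """Raise RuntimeError if any required extension name is not installed."""
    # Preserve the order of `required` so the error message is deterministic.
    missing = [name for name in required if name not in installed]
    if missing:
        raise RuntimeError(
            "Required extensions are not installed: " + ", ".join(missing)
        )


# Hypothetical snapshot of installed extension names.
installed = {"plpgsql", "vector"}
```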

create_extensions

create_extensions(conn: Connection, /, required_extensions: list[Extension] = [])

Create required extensions in the Azure Database for PostgreSQL connection.

:param conn: Connection to the Azure Database for PostgreSQL. :type conn: Connection :param required_extensions: List of required extensions to create. :type required_extensions: list[Extension] :raises Exception: If the connection is not valid or if an error occurs during extension creation.