A vector store stores embedded data and performs vector search.
One of the most common ways to store and search over unstructured data is to embed it and store the resulting embedding vectors, then query the store to retrieve the data 'most similar' to the embedded query.
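As a minimal illustration of this embed-and-search pattern (pure Python, with toy 3-dimensional vectors standing in for a real embedding model and store), a query vector is compared against stored vectors by cosine similarity and the most similar entry is returned:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "store": texts with made-up pre-computed embedding vectors.
store = {
    "cats are mammals": [0.9, 0.1, 0.0],
    "stocks fell today": [0.0, 0.2, 0.9],
}

def most_similar(query_vector):
    # Return the stored text whose embedding is closest to the query.
    return max(store, key=lambda text: cosine_similarity(query_vector, store[text]))

print(most_similar([0.8, 0.2, 0.1]))
```

Real vector stores replace the linear scan with an index structure (HNSW, IVF, etc.), but the retrieval contract is the same.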
Aerospike vector store.
To use, you should have the aerospike_vector_search python package installed.
Alibaba Cloud OpenSearch vector store.
Alibaba Cloud OpenSearch client configuration.
AnalyticDB (distributed PostgreSQL) vector store.
AnalyticDB is a distributed, cloud-native database with full PostgreSQL syntax support.
- connection_string: a Postgres connection string.
- embedding_function: any embedding function implementing the langchain.embeddings.base.Embeddings interface.
- collection_name: the name of the collection to use (default: langchain).
- pre_delete_collection: if True, deletes the collection if it already exists (default: False).
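A hedged sketch of assembling the Postgres connection string that parameter lists like this one expect (the helper name, driver prefix, and all credential values below are hypothetical placeholders, not part of this API):

```python
def make_pg_connection_string(user, password, host, port, database):
    # Standard libpq-style URI; the driver prefix may vary by integration
    # (e.g. "postgresql+psycopg2" for SQLAlchemy-based stores).
    return f"postgresql+psycopg2://{user}:{password}@{host}:{port}/{database}"

conn_str = make_pg_connection_string(
    "postgres", "secret", "localhost", 5432, "langchain"
)
print(conn_str)
```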
Annoy vector store.
To use, you should have the annoy python package installed.
Apache Doris vector store.
You need the pymysql python package and a valid account
to connect to Apache Doris.
For more information, please visit the Apache Doris official site and the Apache Doris GitHub repository.
Atlas vector store.
Atlas is Nomic's neural database and rhizomatic instrument.
To use, you should have the nomic python package installed.
AwaDB vector store.
Azure Cosmos DB for MongoDB vCore vector store.
To use, you should have the pymongo python package installed.
Azure Cognitive Search vector store.
Bagel.net Inference platform.
To use, you should have the bagelML python package installed.
Baidu Elasticsearch vector store.
Baidu VectorDB as a vector store.
To use this, you need to have a database instance. See the following documentation for details: https://cloud.baidu.com/doc/VDB/index.html
Clarifai AI vector store.
To use, you should have the clarifai python SDK package installed.
ClickHouse vector store integration.
ClickHouse client configuration.
DashVector vector store.
To use, you should have the dashvector python package installed.
Dingo vector store.
To use, you should have the dingodb python package installed.
HnswLib storage using DocArray package.
To use it, you should have the docarray package with version >=0.32.0 installed.
You can install it with pip install docarray.
In-memory DocArray storage for exact search.
To use it, you should have the docarray package with version >=0.32.0 installed.
You can install it with pip install docarray.
Amazon DocumentDB (with MongoDB compatibility) vector store.
Please refer to the official Vector Search documentation for more details:
https://docs.aws.amazon.com/documentdb/latest/developerguide/vector-search.html
To use, you should have the pymongo python package installed.
DuckDB vector store.
This class provides a vector store interface for adding texts and performing similarity searches using DuckDB.
For more information about DuckDB, see: https://duckdb.org/
This integration requires the duckdb Python package.
You can install it with pip install duckdb.
Security Notice: The default DuckDB configuration is not secure.
By **default**, DuckDB can interact with files across the entire file system,
which includes abilities to read, write, and list files and directories.
It can also access some python variables present in the global namespace.
When using this DuckDB vectorstore, we suggest that you initialize the
DuckDB connection with a secure configuration.
For example, you can set `enable_external_access` to `false` in the connection
configuration to disable external access to the DuckDB connection.
You can view the DuckDB configuration options here:
https://duckdb.org/docs/configuration/overview.html
Please review other relevant security considerations in the DuckDB
documentation (e.g., setting "autoinstall_known_extensions": "false" and
"autoload_known_extensions": "false").
See https://python.langchain.com/docs/security for more information.
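A sketch of the hardened configuration described above. The option names come from the DuckDB configuration documentation; actually passing the dict to duckdb.connect assumes the duckdb package is installed, so that call is shown only in a comment:

```python
# Configuration options that restrict what a DuckDB connection can do.
secure_config = {
    "enable_external_access": "false",        # no file-system / network access
    "autoinstall_known_extensions": "false",  # no automatic extension installs
    "autoload_known_extensions": "false",     # no automatic extension loading
}

# With the duckdb package installed, the dict would be passed like:
#   import duckdb
#   con = duckdb.connect(database=":memory:", config=secure_config)
print(sorted(secure_config))
```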
ecloud Elasticsearch vector store.
Wrapper around Epsilla vector database.
As a prerequisite, you need to install the pyepsilla package
and have a running Epsilla vector database (for example, through its Docker image).
See the following documentation for how to run an Epsilla vector database:
https://epsilla-inc.gitbook.io/epsilladb/quick-start
FAISS vector store integration.
See The FAISS Library paper.
Hologres API vector store.
- connection_string: a Hologres connection string.
- embedding_function: any embedding function implementing the langchain.embeddings.base.Embeddings interface.
- ndims: the number of dimensions of the embedding output.
- table_name: the name of the table to store embeddings and data (default: langchain_pg_embedding).
- pre_delete_table: if True, deletes the table if it already exists (default: False).
Infinispan VectorStore interface.
This class exposes the method to present Infinispan as a VectorStore. It relies on the Infinispan class (below) which takes care of the REST interface with the server.
KDB.AI vector store.
See https://kdb.ai.
To use, you should have the kdbai_client python package installed.
Enumerator of the Distance strategies.
Kinetica vector store.
To use, you should have the gpudb python package installed.
Kinetica client configuration.
LanceDB vector store.
To use, you should have the lancedb python package installed.
You can install it with pip install lancedb.
Postgres with the lantern extension as a vector store.
Lantern uses a sequential scan by default, but you can create an HNSW index using the create_hnsw_index method.
- connection_string: a Postgres connection string.
- embedding_function: any embedding function implementing the langchain.embeddings.base.Embeddings interface.
- collection_name: the name of the collection to use (default: langchain).
- distance_strategy: the distance strategy to use (default: EUCLIDEAN). EUCLIDEAN is the Euclidean distance, COSINE is the cosine distance, and HAMMING is the Hamming distance.
- pre_delete_collection: if True, deletes the collection if it already exists (default: False).
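The three distance strategies can be illustrated in pure Python. This is a sketch of the metrics themselves, not of Lantern's implementation:

```python
import math

def euclidean(a, b):
    # Straight-line distance between two vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # 1 - cosine similarity; 0 for identical directions.
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return 1 - dot / norms

def hamming(a, b):
    # Number of positions at which the entries differ (binary vectors).
    return sum(x != y for x, y in zip(a, b))

print(euclidean([0, 0], [3, 4]))      # 5.0
print(hamming([1, 0, 1], [1, 1, 1]))  # 1
```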
Implementation of Vector Store using LLMRails.
ManticoreSearch Engine vector store.
To use, you should have the manticoresearch python package installed.
Marqo vector store.
Marqo indexes have their own models associated with them to generate your embeddings. This means that you can select from a range of different models and also use CLIP models to create multimodal indexes with images and text together.
Marqo also supports more advanced queries with multiple weighted terms; see https://docs.marqo.ai/latest/#searching-using-weights-in-queries. This class can flexibly take strings or dictionaries for weighted queries in its similarity search methods.
To use, you should have the marqo python package installed; you can do this with
pip install marqo.
Meilisearch vector store.
To use this, you need to have the meilisearch python package installed,
and a running Meilisearch instance.
To learn more about Meilisearch Python, refer to the in-depth Meilisearch Python documentation: https://meilisearch.github.io/meilisearch-python/.
See the following documentation for how to run a Meilisearch instance: https://www.meilisearch.com/docs/learn/getting_started/quick_start.
Momento Vector Index (MVI) vector store.
Momento Vector Index is a serverless vector index that can be used to store and
search vectors. To use you should have the momento python package installed.
MyScale vector store.
You need the clickhouse-connect python package and a valid account
to connect to MyScale.
MyScale can not only search with simple vector indexes; it also supports complex queries with multiple conditions, constraints, and even sub-queries.
For more information, please visit the MyScale official site.
MyScale client configuration.
Amazon OpenSearch Vector Engine vector store.
VectorStore connecting to Pathway Vector Store.
Postgres with the pg_embedding extension as a vector store.
pg_embedding uses a sequential scan by default, but you can create an HNSW index using the create_hnsw_index method.
- connection_string: a Postgres connection string.
- embedding_function: any embedding function implementing the langchain.embeddings.base.Embeddings interface.
- collection_name: the name of the collection to use (default: langchain).
- distance_strategy: the distance strategy to use (default: EUCLIDEAN). EUCLIDEAN is the Euclidean distance.
- pre_delete_collection: if True, deletes the collection if it already exists (default: False).
Relyt (distributed PostgreSQL) vector store.
Relyt is a distributed, cloud-native database with full PostgreSQL syntax support.
- connection_string: a Postgres connection string.
- embedding_function: any embedding function implementing the langchain.embeddings.base.Embeddings interface.
- collection_name: the name of the collection to use (default: langchain).
- pre_delete_collection: if True, deletes the collection if it already exists (default: False).
Rockset vector store.
To use, you should have the rockset python package installed. Note that to use
this, the collection being used must already exist in your Rockset instance.
You must also ensure you use a Rockset ingest transformation to apply
VECTOR_ENFORCE on the column being used to store embedding_key in the
collection.
See https://rockset.com/blog/introducing-vector-search-on-rockset/ for more details.
Everything below assumes the commons Rockset workspace.
ScaNN vector store.
To use, you should have the scann python package installed.
SemaDB vector store.
This vector store is a wrapper around the SemaDB database.
Simple in-memory vector store based on the scikit-learn NearestNeighbors implementation.
SQLite with Vec extension as a vector database.
To use, you should have the sqlite-vec python package installed.
Example:
.. code-block:: python
from langchain_community.vectorstores import SQLiteVec
from langchain_community.embeddings.openai import OpenAIEmbeddings
...
SQLite with VSS extension as a vector database.
To use, you should have the sqlite-vss python package installed.
Example:
.. code-block:: python
from langchain_community.vectorstores import SQLiteVSS
from langchain_community.embeddings.openai import OpenAIEmbeddings
...
StarRocks vector store.
You need the pymysql python package and a valid account
to connect to StarRocks.
Right now StarRocks has only implemented the cosine_similarity function to
compute the distance between two vectors, and there is no vector index yet,
so we have to iterate over all vectors and compute the spatial distance.
For more information, please visit the StarRocks official site and the StarRocks GitHub repository.
Supabase Postgres vector store.
It assumes you have the pgvector
extension installed and a match_documents (or similar) function. For more details:
https://integrations.langchain.com/vectorstores?integration_name=SupabaseVectorStore
You can implement your own match_documents function in order to limit the search
space to a subset of documents based on your own authorization or business logic.
Note that the Supabase Python client does not yet support async operations.
If you'd like to use max_marginal_relevance_search, please review the instructions
below on modifying the match_documents function to return matched embeddings.
Examples:
.. code-block:: python
from langchain_community.embeddings.openai import OpenAIEmbeddings
from langchain_core.documents import Document
from langchain_community.vectorstores import SupabaseVectorStore
from supabase.client import create_client
docs = [
Document(page_content="foo", metadata={"id": 1}),
]
embeddings = OpenAIEmbeddings()
supabase_client = create_client("my_supabase_url", "my_supabase_key")
vector_store = SupabaseVectorStore.from_documents(
docs,
embeddings,
client=supabase_client,
table_name="documents",
query_name="match_documents",
chunk_size=500,
)
To load from an existing table:
.. code-block:: python
from langchain_community.embeddings.openai import OpenAIEmbeddings
from langchain_community.vectorstores import SupabaseVectorStore
from supabase.client import create_client
embeddings = OpenAIEmbeddings()
supabase_client = create_client("my_supabase_url", "my_supabase_key")
vector_store = SupabaseVectorStore(
client=supabase_client,
embedding=embeddings,
table_name="documents",
query_name="match_documents",
)
SurrealDB as a vector store.
To use, you should have the surrealdb python package installed.
Tablestore vector store.
To use, you should have the tablestore python package installed.
Tair vector store.
Tencent VectorDB as a vector store.
To use this, you need to have a database instance. See the following documentation for details: https://cloud.tencent.com/document/product/1709/104489
Vectorstore that uses ThirdAI's NeuralDB Enterprise Python Client for NeuralDBs.
To use, you should have the thirdai[neural_db] python package installed.
Vectorstore that uses ThirdAI's NeuralDB.
To use, you should have the thirdai[neural_db] python package installed.
TiDB Vector Store.
TileDB vector store.
To use, you should have the tiledb-vector-search python package installed.
Timescale Postgres vector store.
To use, you should have the timescale_vector python package installed.
Typesense vector store.
To use, you should have the typesense python package installed.
Upstash Vector vector store.
To use, the upstash-vector python package must be installed.
An Upstash Vector index is also required. First create a new Upstash Vector index
and copy the index_url and index_token variables. Then either pass
them through the constructor or set the environment
variables UPSTASH_VECTOR_REST_URL and UPSTASH_VECTOR_REST_TOKEN.
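A sketch of the environment-variable route described above. The URL and token values are placeholders standing in for the ones copied from your index:

```python
import os

# Placeholder values; in practice, copy these from the Upstash console.
os.environ["UPSTASH_VECTOR_REST_URL"] = "https://example-index.upstash.io"
os.environ["UPSTASH_VECTOR_REST_TOKEN"] = "example-token"

# With both variables set, the vector store can be constructed
# without passing index_url / index_token explicitly.
print(os.environ["UPSTASH_VECTOR_REST_URL"])
```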
USearch vector store.
To use, you should have the usearch python package installed.
Vald vector database.
To use, you should have the vald-client-python python package installed.
Vectara API vector store.
See https://vectara.com.
Vespa vector store.
To use, you should have the python client library pyvespa installed.
VLite is a simple and fast vector database for semantic search.
Yellowbrick as a vector database.
Example:
.. code-block:: python
from langchain_community.vectorstores import Yellowbrick
from langchain_community.embeddings.openai import OpenAIEmbeddings
...
Zep vector store.
It provides methods for adding texts or documents to the store, searching for similar documents, and deleting documents.
Search scores are calculated using cosine similarity normalized to [0, 1].
Zep vector store.
It provides methods for adding texts or documents to the store, searching for similar documents, and deleting documents.
Search scores are calculated using cosine similarity normalized to [0, 1].
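One common way to map cosine similarity, which ranges over [-1, 1], onto [0, 1] is a shift-and-scale. This is a sketch of that normalization in general, offered as an assumption rather than Zep's exact formula:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity in [-1, 1].
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norms

def normalized_score(a, b):
    # Shift and scale cosine similarity from [-1, 1] into [0, 1].
    return (cosine_similarity(a, b) + 1) / 2

print(normalized_score([1.0, 0.0], [1.0, 0.0]))   # identical vectors -> 1.0
print(normalized_score([1.0, 0.0], [-1.0, 0.0]))  # opposite vectors -> 0.0
```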
Zilliz vector store.
You need to have pymilvus installed and a
running Zilliz database.
See the following documentation for how to run a Zilliz instance: https://docs.zilliz.com/docs/create-cluster
If using the L2/IP metric, it is highly suggested to normalize your data.
Azure Cosmos DB for NoSQL vector store.
To use, you should have the azure-cosmos python package installed.
You can read more about vector search, full-text search, and hybrid search using AzureCosmosDBNoSQL here:
https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/vector-search
https://learn.microsoft.com/en-us/azure/cosmos-db/gen-ai/full-text-search
https://learn.microsoft.com/en-us/azure/cosmos-db/gen-ai/hybrid-search
Google Cloud BigQuery vector store.
To use, you need the following packages installed: google-cloud-bigquery
ChromaDB vector store.
To use, you should have the chromadb python package installed.
Couchbase Vector Store vector store.
To use it, you need the couchbase library.
[DEPRECATED] Elasticsearch with k-nearest neighbor search
(k-NN) vector store.
Recommended to use ElasticsearchStore instead, which supports metadata filtering, customising the query retriever and much more!
You can read more on ElasticsearchStore: https://python.langchain.com/docs/integrations/vectorstores/elasticsearch
It creates an Elasticsearch index of text data that can be searched using k-NN search. The text data is transformed into vector embeddings using a provided embedding model, and these embeddings are stored in the Elasticsearch index.
ElasticVectorSearch uses the brute force method of searching on vectors.
Recommended to use ElasticsearchStore instead, which gives you the option to use the approximate HNSW algorithm, which performs better on large datasets.
ElasticsearchStore also supports metadata filtering, customising the query retriever and much more!
You can read more on ElasticsearchStore: https://python.langchain.com/docs/integrations/vectorstores/elasticsearch
To connect to an Elasticsearch instance that does not require
login credentials, pass the Elasticsearch URL and index name along with the
embedding object to the constructor.
Elasticsearch vector store.
Postgres/PGVector vector store.
DEPRECATED: This class is pending deprecation and will likely receive
no updates. An improved version of this class is available in
langchain_postgres as PGVector. Please use that class instead.
When migrating please keep in mind that:
* The new implementation works with psycopg3, not with psycopg2
(This implementation does not work with psycopg3).
* Filtering syntax has changed to use $ prefixed operators for JSONB
metadata fields. (New implementation only uses JSONB field for metadata)
* The new implementation made some schema changes to address issues
with the existing implementation. So you will need to re-create
your tables and re-index your data or else carry out a manual
migration.
To use, you should have the pgvector python package installed.
Redis vector database.
SingleStore DB vector store.
The prerequisite for using this class is the installation of the singlestoredb
Python package.
The SingleStoreDB vectorstore can be created by providing an embedding function and the relevant parameters for the database connection, connection pool, and optionally, the names of the table and the fields to use.
Wrapper around Vald vector database.
Pathway Vector Store client.
The Pathway Vector Server is a pipeline written in the Pathway framework which indexes all files in a given folder, embeds them, and builds a vector index. The pipeline reacts to changes in source files, automatically updating appropriate index entries.
The PathwayVectorClient implements the LangChain VectorStore interface and queries the PathwayVectorServer to retrieve up-to-date documents.
You can use the client with managed instances of Pathway Vector Store, or run your own instance as described at https://pathway.com/developers/user-guide/llm-xpack/vectorstore_pipeline/
Wrapper around the Baidu vector database.
Utility functions for working with vectors and vectorstores.
Wrapper around TileDB vector database.
Vector Store in Google Cloud BigQuery.
Module providing Infinispan as a VectorStore.
Wrapper around Epsilla vector database.
Wrapper around scikit-learn NearestNeighbors implementation.
The vector store can be persisted in json, bson or parquet format.
Wrapper around LLMRails vector database.
VectorStore wrapper around a Postgres-TimescaleVector database.
Wrapper around the Tencent vector database.