Chroma vector store integration.
Setup:
Install @langchain/community and chromadb.
npm install @langchain/community chromadb
import { Chroma } from '@langchain/community/vectorstores/chroma';
// Or other embeddings
import { OpenAIEmbeddings } from '@langchain/openai';
const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});
const vectorStore = new Chroma(
  embeddings,
  {
    collectionName: "foo",
    host: "localhost",
  }
);
import type { Document } from '@langchain/core/documents';
const document1 = { pageContent: "foo", metadata: { baz: "bar" } };
const document2 = { pageContent: "thud", metadata: { bar: "baz" } };
const document3 = { pageContent: "I will be deleted :(", metadata: {} };
const documents: Document[] = [document1, document2, document3];
const ids = ["1", "2", "3"];
await vectorStore.addDocuments(documents, { ids });
await vectorStore.delete({ ids: ["3"] });
const results = await vectorStore.similaritySearch("thud", 1);
for (const doc of results) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
// Output: * thud [{"bar":"baz"}]
const resultsWithFilter = await vectorStore.similaritySearch("thud", 1, { baz: "bar" });
for (const doc of resultsWithFilter) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
// Output: * foo [{"baz":"bar"}]
const resultsWithScore = await vectorStore.similaritySearchWithScore("qux", 1);
for (const [doc, score] of resultsWithScore) {
  console.log(`* [SIM=${score.toFixed(6)}] ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
// Output format: * [SIM=<score>] <pageContent> [<metadata>]
// "qux" is not stored, so the closest remaining document is returned; the score depends on the embedding model.
const retriever = vectorStore.asRetriever({
  searchType: "mmr", // Leave blank for standard similarity search
  k: 1,
});
const resultAsRetriever = await retriever.invoke("thud");
console.log(resultAsRetriever);
// Output: [Document({ metadata: { "bar":"baz" }, pageContent: "thud" })]
The embeddings generated for the input texts.
Returns a string representing the type of vector store, which subclasses must implement to identify their specific vector storage type.
Adds an array of documents to the collection. The documents are first
converted to vectors using the embedDocuments method of the
embeddings instance.
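As a rough illustration of the embed-then-store flow described above, the sketch below uses a toy in-memory store. `ToyEmbeddings`, `ToyStore`, and the length/vowel-count "embedding" are hypothetical stand-ins, not part of @langchain/community; the real method is asynchronous and calls the configured embeddings model.

```typescript
// Minimal document shape, mirroring pageContent/metadata from the usage examples.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical embedder: maps each text to [length, vowel count].
// A real embeddings class would call a model instead.
class ToyEmbeddings {
  embedDocuments(texts: string[]): number[][] {
    return texts.map((t) => [t.length, (t.match(/[aeiou]/gi) ?? []).length]);
  }
}

// Hypothetical in-memory store showing the two-step flow.
class ToyStore {
  readonly rows: { id: string; vector: number[]; doc: Doc }[] = [];
  private readonly embeddings: ToyEmbeddings;

  constructor(embeddings: ToyEmbeddings) {
    this.embeddings = embeddings;
  }

  // Step 1: turn page contents into vectors. Step 2: delegate to addVectors.
  addDocuments(docs: Doc[], ids: string[]): string[] {
    const vectors = this.embeddings.embedDocuments(docs.map((d) => d.pageContent));
    return this.addVectors(vectors, docs, ids);
  }

  // Stores each vector alongside its document and id.
  addVectors(vectors: number[][], docs: Doc[], ids: string[]): string[] {
    if (vectors.length !== docs.length || docs.length !== ids.length) {
      throw new Error("vectors, docs, and ids must have the same length");
    }
    docs.forEach((doc, i) => this.rows.push({ id: ids[i], vector: vectors[i], doc }));
    return ids;
  }
}

const toyStore = new ToyStore(new ToyEmbeddings());
const storedIds = toyStore.addDocuments(
  [
    { pageContent: "foo", metadata: { baz: "bar" } },
    { pageContent: "thud", metadata: { bar: "baz" } },
  ],
  ["1", "2"]
);
console.log(storedIds, toyStore.rows.map((r) => r.vector));
```

Keeping the embedding step separate from the storage step is what lets callers who already have vectors skip straight to the second method.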
Adds an array of vectors and corresponding documents to the collection. The vectors and documents are batch inserted into the database.
Creates a VectorStoreRetriever instance with flexible configuration options.
Deletes documents from the Chroma collection that match the given ids or filter.
Ensures that a collection exists in the Chroma database. If the collection does not exist, it is created.
Returns documents selected using maximal marginal relevance (MMR). MMR optimizes for similarity to the query and for diversity among the selected documents.
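To make the similarity-versus-diversity trade-off concrete, here is a minimal sketch of greedy MMR selection over raw vectors. The helper names `cosine` and `mmrSelect`, the `lambda` weighting parameter, and the sample vectors are assumptions made for illustration; this is the textbook formulation, not necessarily how the library implements it.

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Greedy MMR: returns the indexes of up to k candidates.
// lambda = 1 ranks purely by similarity to the query; lower values favor diversity.
function mmrSelect(query: number[], candidates: number[][], k: number, lambda = 0.5): number[] {
  const selected: number[] = [];
  const remaining = candidates.map((_, i) => i);
  while (selected.length < k && remaining.length > 0) {
    let bestPos = 0;
    let bestScore = -Infinity;
    remaining.forEach((idx, pos) => {
      const relevance = cosine(query, candidates[idx]);
      // Redundancy: highest similarity to anything already selected.
      const redundancy =
        selected.length === 0
          ? 0
          : Math.max(...selected.map((s) => cosine(candidates[idx], candidates[s])));
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestPos = pos;
      }
    });
    selected.push(remaining.splice(bestPos, 1)[0]);
  }
  return selected;
}

// Candidates 0 and 1 are near-duplicates; candidate 2 points in another direction.
const mmrCandidates = [[1, 0], [0.99, 0.1], [0.6, 0.8]];
const mmrPicked = mmrSelect([1, 0], mmrCandidates, 2, 0.3);
const similarityOnly = mmrSelect([1, 0], mmrCandidates, 2, 1);
console.log(mmrPicked, similarityOnly);
// prints: [ 0, 2 ] [ 0, 1 ]
```

With `lambda` at 0.3 the near-duplicate candidate 1 is skipped in favor of the less similar but more diverse candidate 2; with `lambda` at 1 the selection collapses to plain similarity ranking.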
Searches for documents similar to a text query by embedding the query and performing a similarity search on the resulting vector.
Performs a similarity search on the vectors in the collection. The search is performed using the given query vector and returns the top k most similar vectors along with their corresponding documents and similarity scores.
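The vector search described above can be sketched as a brute-force top-k scan. The names `Row`, `squaredL2`, and `searchByVector` are hypothetical, and ranking by squared Euclidean distance (lower score means closer) is an assumption for illustration; the metric an actual collection uses depends on how it is configured.

```typescript
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// A stored vector together with the document it was derived from.
interface Row {
  vector: number[];
  doc: Doc;
}

// Squared Euclidean distance between two equal-length vectors.
function squaredL2(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Scores every row against the query vector and keeps the k closest.
function searchByVector(rows: Row[], query: number[], k: number): [Doc, number][] {
  return rows
    .map((row): [Doc, number] => [row.doc, squaredL2(row.vector, query)])
    .sort((x, y) => x[1] - y[1])
    .slice(0, k);
}

const rows: Row[] = [
  { vector: [0, 0], doc: { pageContent: "foo", metadata: { baz: "bar" } } },
  { vector: [3, 4], doc: { pageContent: "thud", metadata: { bar: "baz" } } },
];
const hits = searchByVector(rows, [3, 4], 1);
for (const [doc, score] of hits) {
  console.log(`* [SIM=${score.toFixed(6)}] ${doc.pageContent} [${JSON.stringify(doc.metadata)}]`);
}
// prints: * [SIM=0.000000] thud [{"bar":"baz"}]
```

Because the score here is a distance, an exact match scores 0 and larger values mean less similar results.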
Searches for documents similar to a text query by embedding the query, and returns results with similarity scores.
Creates a new Chroma instance from an existing collection in the
Chroma database.
Creates a new Chroma instance from an array of texts
and corresponding metadata. The texts are first converted to Document
instances before being added to the collection.
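The text-to-Document conversion step can be sketched as follows. The helper name `textsToDocuments` is hypothetical, and treating a single metadata object as shared by every text is an assumption about the intended behavior; this is not the library's own code.

```typescript
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

type Metadata = Record<string, unknown>;

// Pairs each text with its metadata. An array supplies one entry per text;
// a single object is reused for every text.
function textsToDocuments(texts: string[], metadatas: Metadata[] | Metadata): Doc[] {
  return texts.map((text, i) => ({
    pageContent: text,
    metadata: Array.isArray(metadatas) ? (metadatas[i] ?? {}) : metadatas,
  }));
}

const perText = textsToDocuments(["foo", "thud"], [{ baz: "bar" }, { bar: "baz" }]);
const shared = textsToDocuments(["foo", "thud"], { source: "demo" });
console.log(perText, shared);
```

The resulting documents would then go through the same add-documents path as any other input.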
The name of the serializable. Override to provide an alias or to preserve the serialized module name in minified environments.
Implemented as a static method to support loading logic.