MariaDB vector store integration.
Setup:
Install @langchain/community and mariadb.
If you wish to generate ids, you should also install the uuid package.
npm install @langchain/community mariadb uuid
import {
  MariaDBStore,
  DistanceStrategy,
} from "@langchain/community/vectorstores/mariadb";
// Or any other supported embeddings
import { OpenAIEmbeddings } from "@langchain/openai";
import type { PoolConfig } from "mariadb";

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

// Sample config
const config = {
  connectionOptions: {
    host: "127.0.0.1",
    port: 3306,
    user: "myuser",
    password: "ChangeMe",
    database: "api",
  } as PoolConfig,
  tableName: "testlangchainjs",
  columns: {
    idColumnName: "id",
    vectorColumnName: "vector",
    contentColumnName: "content",
    metadataColumnName: "metadata",
  },
  // Supported distance strategies: COSINE (default) or EUCLIDEAN
  distanceStrategy: "COSINE" as DistanceStrategy,
};

const vectorStore = await MariaDBStore.initialize(embeddings, config);
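To build intuition for the two supported distance strategies: COSINE compares only the angle between vectors, while EUCLIDEAN also accounts for magnitude. A standalone sketch of both metrics (plain TypeScript, no database required; these helper functions are illustrative, not part of @langchain/community):

```typescript
// Illustrative implementations of the two supported distance metrics.
// These helpers only show how the strategies compare vectors.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // 0 = identical direction, 2 = opposite direction
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function euclideanDistance(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += (a[i] - b[i]) ** 2;
  }
  return Math.sqrt(sum);
}

// Same direction, different magnitude: cosine distance is ~0,
// but euclidean distance is not.
console.log(cosineDistance([1, 2], [2, 4]));
console.log(euclideanDistance([1, 2], [2, 4]));
```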
import type { Document } from "@langchain/core/documents";
const document1 = { pageContent: "foo", metadata: { baz: "bar" } };
const document2 = { pageContent: "thud", metadata: { bar: "baz" } };
const document3 = { pageContent: "i will be deleted :(", metadata: {} };
const documents: Document[] = [document1, document2, document3];
const ids = ["1", "2", "3"];
await vectorStore.addDocuments(documents, { ids });
await vectorStore.delete({ ids: ["3"] });
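Rather than hard-coding ids, you can generate them, either with the uuid package mentioned in the setup or with Node's built-in crypto.randomUUID. A small sketch (the variable names are illustrative):

```typescript
// Generate one id per document using Node's built-in randomUUID
// (an alternative to the uuid package mentioned in the setup).
import { randomUUID } from "node:crypto";

const docsToAdd = [
  { pageContent: "foo", metadata: { baz: "bar" } },
  { pageContent: "thud", metadata: { bar: "baz" } },
];
const generatedIds = docsToAdd.map(() => randomUUID());
console.log(generatedIds.length); // 2
// Then: await vectorStore.addDocuments(docsToAdd, { ids: generatedIds });
```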
const results = await vectorStore.similaritySearch("thud", 1);
for (const doc of results) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
// Output: * thud [{"bar":"baz"}]
const resultsWithFilter = await vectorStore.similaritySearch("thud", 1, {
  baz: "bar",
});
for (const doc of resultsWithFilter) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
// Output: * foo [{"baz":"bar"}]
const resultsWithScore = await vectorStore.similaritySearchWithScore("thud", 1);
for (const [doc, score] of resultsWithScore) {
  console.log(
    `* [SIM=${score.toFixed(6)}] ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`
  );
}
// Output: * [SIM=0.000000] thud [{"bar":"baz"}]
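Note that the reported score is a distance: with the COSINE strategy, 0 means identical. If you want a bounded similarity in [0, 1] instead, you can convert it yourself. A hedged sketch (this helper and its mapping are an assumption for cosine distance, not part of the MariaDBStore API):

```typescript
// Convert a cosine distance (0 = identical, 2 = opposite) into a
// bounded similarity in [0, 1]. Illustrative helper, not part of the
// MariaDBStore API; assumes the COSINE distance strategy.
function cosineDistanceToSimilarity(distance: number): number {
  return 1 - distance / 2;
}

console.log(cosineDistanceToSimilarity(0)); // 1 (identical)
console.log(cosineDistanceToSimilarity(2)); // 0 (opposite)
```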
const retriever = vectorStore.asRetriever({
  searchType: "mmr", // Leave blank for standard similarity search
  k: 1,
});
const resultAsRetriever = await retriever.invoke("thud");
console.log(resultAsRetriever);
// Output: [Document({ metadata: { "bar": "baz" }, pageContent: "thud" })]
The distance strategy to use for vector similarity calculations. Defaults to COSINE.
The embeddings generated for the input texts.
Returns a string representing the type of vector store, which subclasses must implement to identify their specific vector storage type.
Adds an array of documents to the collection. The documents are first
converted to vectors using the embedDocuments method of the
embeddings instance.
Adds an array of vectors and corresponding documents to the collection. The vectors and documents are batch inserted into the database.
Creates a VectorStoreRetriever instance with flexible configuration options.
Deletes rows from the MariaDB table that match the specified conditions.
Terminates the connection pool.
Method to ensure the existence of the collection table in the database. It creates the table if it does not already exist.
Method to ensure the existence of the table in the database. It creates the table if it does not already exist.
Return documents selected using the maximal marginal relevance. Maximal marginal relevance optimizes for similarity to the query AND diversity among selected documents.
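Conceptually, MMR greedily picks each next document by trading off relevance to the query against redundancy with documents already selected. A simplified standalone sketch of the selection loop (operating on precomputed similarity scores rather than embeddings; the function name and lambda default are illustrative, not the library's implementation):

```typescript
// Simplified maximal marginal relevance over precomputed similarities.
// queryScores[i] = similarity of candidate i to the query;
// pairScores[i][j] = similarity between candidates i and j.
function mmrSelect(
  queryScores: number[],
  pairScores: number[][],
  k: number,
  lambda = 0.5
): number[] {
  const selected: number[] = [];
  const remaining = queryScores.map((_, i) => i);
  while (selected.length < k && remaining.length > 0) {
    let bestIdx = -1;
    let bestScore = -Infinity;
    for (const i of remaining) {
      // Redundancy = highest similarity to anything already selected
      const redundancy =
        selected.length === 0
          ? 0
          : Math.max(...selected.map((j) => pairScores[i][j]));
      const score = lambda * queryScores[i] - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestIdx = i;
      }
    }
    selected.push(bestIdx);
    remaining.splice(remaining.indexOf(bestIdx), 1);
  }
  return selected;
}

// Candidates 0 and 1 are near-duplicates; MMR picks 0, then skips 1
// in favor of the more diverse candidate 2.
const qs = [0.9, 0.85, 0.7];
const ps = [
  [1.0, 0.95, 0.1],
  [0.95, 1.0, 0.1],
  [0.1, 0.1, 1.0],
];
console.log(mmrSelect(qs, ps, 2)); // [0, 2]
```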
Searches for documents similar to a text query by embedding the query and performing a similarity search on the resulting vector.
Performs a similarity search on the vectors in the collection. The search is performed using the given query vector and returns the top k most similar vectors along with their corresponding documents and similarity scores.
Searches for documents similar to a text query by embedding the query, and returns results with similarity scores.
Creates an instance of MariaDBStore from an array of texts
and corresponding metadata. The texts are first converted to Document
instances before being added to the collection.
The name of the serializable. Override to provide an alias or to preserve the serialized module name in minified environments.
Implemented as a static method to support loading logic.