PGVector vector store integration.
Setup:
Install @langchain/community and pg.
If you wish to generate ids, you should also install the uuid package.
npm install @langchain/community pg uuid
import {
  PGVectorStore,
  DistanceStrategy,
} from "@langchain/community/vectorstores/pgvector";
// Or other embeddings
import { OpenAIEmbeddings } from "@langchain/openai";
import { PoolConfig } from "pg";

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

// Sample config
const config = {
  postgresConnectionOptions: {
    type: "postgres",
    host: "127.0.0.1",
    port: 5433,
    user: "myuser",
    password: "ChangeMe",
    database: "api",
  } as PoolConfig,
  tableName: "testlangchainjs",
  columns: {
    idColumnName: "id",
    vectorColumnName: "vector",
    contentColumnName: "content",
    metadataColumnName: "metadata",
  },
  // Supported distance strategies: cosine (default), innerProduct, or euclidean
  distanceStrategy: "cosine" as DistanceStrategy,
};

const vectorStore = await PGVectorStore.initialize(embeddings, config);
import type { Document } from "@langchain/core/documents";
const document1 = { pageContent: "foo", metadata: { baz: "bar", num: 4 } };
const document2 = { pageContent: "thud", metadata: { bar: "baz" } };
const document3 = { pageContent: "i will be deleted :(", metadata: {} };
const documents: Document[] = [document1, document2, document3];
const ids = ["1", "2", "3"];
await vectorStore.addDocuments(documents, { ids });
await vectorStore.delete({ ids: ["3"] });
const results = await vectorStore.similaritySearch("thud", 1);
for (const doc of results) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
// Output: * thud [{"bar":"baz"}]
const resultsWithFilter = await vectorStore.similaritySearch("thud", 1, { baz: "bar" });
for (const doc of resultsWithFilter) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
// Output: * foo [{"baz":"bar","num":4}]
Available filter operators: in, notIn, lte, lt, gte, gt, neq
const resultsWithFilters = await vectorStore.similaritySearch("thud", 1, {
  baz: {
    in: ["bar", "car"],
  },
  num: {
    lte: 10,
  },
});
for (const doc of resultsWithFilters) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
// Output: * foo [{"baz":"bar","num":4}]
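The remaining comparison operators listed above follow the same nested-object shape; a minimal sketch (assuming notIn and gt behave as documented, with conditions combined like the example above):
const resultsWithOtherFilters = await vectorStore.similaritySearch("thud", 1, {
  baz: {
    notIn: ["qux", "quux"],
  },
  num: {
    gt: 1,
  },
});
// Only documents whose metadata satisfies every condition are returned.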
const resultsWithScore = await vectorStore.similaritySearchWithScore("thud", 1);
for (const [doc, score] of resultsWithScore) {
  console.log(`* [SIM=${score.toFixed(6)}] ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
// Output: * [SIM=0.000000] thud [{"bar":"baz"}]
const retriever = vectorStore.asRetriever({
  searchType: "mmr", // Leave blank for standard similarity search
  k: 1,
});
const resultAsRetriever = await retriever.invoke("thud");
console.log(resultAsRetriever);
// Output: [Document({ metadata: { "bar":"baz" }, pageContent: "thud" })]
The distance strategy to use for vector similarity calculations. Defaults to cosine.
The embeddings generated for the input texts.
Returns a string representing the type of vector store, which subclasses must implement to identify their specific vector storage type.
Adds an array of documents to the collection. The documents are first converted to vectors using the embedDocuments method of the embeddings instance.
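If you prefer generated ids, a minimal sketch using the uuid package mentioned in the setup (the variable names are illustrative):
import { v4 as uuidv4 } from "uuid";

const generatedIds = documents.map(() => uuidv4());
await vectorStore.addDocuments(documents, { ids: generatedIds });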
Adds an array of vectors and corresponding documents to the collection. The vectors and documents are batch inserted into the database.
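A minimal sketch of adding precomputed vectors, assuming an addVectors(vectors, documents, { ids }) signature:
const texts = documents.map((doc) => doc.pageContent);
const vectors = await embeddings.embedDocuments(texts);
await vectorStore.addVectors(vectors, documents, { ids });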
Creates a VectorStoreRetriever instance with flexible configuration options.
Creates an HNSW index on the vector column to speed up approximate nearest-neighbor search. Optional build parameters can be provided; otherwise pgvector's defaults are used.
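A sketch of creating the index; the option names (dimensions, m, efConstruction) mirror pgvector's HNSW build parameters, and the values below are illustrative assumptions:
await vectorStore.createHnswIndex({
  dimensions: 1536, // must match the embedding model's output dimension
  m: 16,
  efConstruction: 64,
});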
Deletes rows from the table by document ids or by a metadata filter.
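Deleting by ids was shown earlier; a sketch of deleting by metadata filter, assuming the { filter } form is accepted:
await vectorStore.delete({ filter: { baz: "bar" } });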
Terminates the connection pool.
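For example, once you are finished with the store:
await vectorStore.end();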
Method to ensure the existence of the collection table in the database. It creates the table if it does not already exist.
Method to ensure the existence of the table in the database. It creates the table if it does not already exist.
Inserts a row for the collectionName provided at initialization if it does not exist and returns the collectionId.
Return documents selected using the maximal marginal relevance. Maximal marginal relevance optimizes for similarity to the query AND diversity among selected documents.
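A sketch of calling it directly, assuming the usual k / fetchK options:
const mmrResults = await vectorStore.maxMarginalRelevanceSearch("thud", {
  k: 1,
  fetchK: 5, // candidates retrieved before re-ranking for diversity
});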
Searches for documents similar to a text query by embedding the query and performing a similarity search on the resulting vector.
Performs a similarity search on the vectors in the collection. The search is performed using the given query vector and returns the top k most similar vectors along with their corresponding documents and similarity scores.
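A sketch that embeds the query up front, assuming a (queryVector, k) call signature:
const queryVector = await embeddings.embedQuery("thud");
const vectorResults = await vectorStore.similaritySearchVectorWithScore(queryVector, 2);
for (const [doc, score] of vectorResults) {
  console.log(doc.pageContent, score);
}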
Performs a similarity search and returns both the raw distance and the normalized similarity score for each result.
Searches for documents similar to a text query by embedding the query, and returns results with similarity scores.
Creates an instance of PGVectorStore from an array of texts and corresponding metadata. The texts are first converted to Document instances before being added to the store.
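A sketch of that factory, assuming the (texts, metadatas, embeddings, config) argument order:
const storeFromTexts = await PGVectorStore.fromTexts(
  ["foo", "thud"],
  [{ baz: "bar" }, { bar: "baz" }],
  embeddings,
  config
);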
The name of the serializable. Override to provide an alias or to preserve the serialized module name in minified environments. Implemented as a static method to support loading logic.