Abstract base class representing the interface for a record manager.
The record manager abstraction is used by the LangChain indexing API.
The record manager keeps track of which documents have been
written into a VectorStore and when they were written.
The indexing API computes hashes for each document and stores the hash together with the write time and the source id in the record manager.
On subsequent indexing runs, the indexing API can check the record manager to determine which documents have already been indexed and which have not.
This allows the indexing API to avoid re-indexing documents that have already been indexed, and to only index new documents.
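The skip-if-already-indexed behavior described above can be illustrated with a minimal sketch. The names here (`ToyRecordManager`, `doc_hash`, `index`) are hypothetical stand-ins rather than the LangChain API, the hashing scheme is an assumption, and a plain dictionary replaces the real database.

```python
import hashlib
import time


def doc_hash(text: str, source_id: str) -> str:
    """Hash the document content together with its source id (illustrative scheme)."""
    return hashlib.sha256(f"{source_id}\x00{text}".encode("utf-8")).hexdigest()


class ToyRecordManager:
    """Hypothetical in-memory stand-in: maps hash -> (write time, source id)."""

    def __init__(self) -> None:
        self.records: dict[str, tuple[float, str]] = {}

    def exists(self, keys: list[str]) -> list[bool]:
        return [key in self.records for key in keys]

    def update(self, keys: list[str], source_ids: list[str]) -> None:
        now = time.time()  # a real implementation would ask the database server
        for key, source_id in zip(keys, source_ids):
            self.records[key] = (now, source_id)


def index(docs: list[tuple[str, str]], manager: ToyRecordManager) -> list[str]:
    """Return the texts that actually needed indexing; skip known hashes."""
    keys = [doc_hash(text, source_id) for text, source_id in docs]
    seen = manager.exists(keys)
    new_docs = [(k, d) for k, d, s in zip(keys, docs, seen) if not s]
    # A real pipeline would write the new documents to the vector store here.
    manager.update([k for k, _ in new_docs], [d[1] for _, d in new_docs])
    return [d[0] for _, d in new_docs]


manager = ToyRecordManager()
first_run = index([("hello", "a.txt"), ("world", "b.txt")], manager)
second_run = index([("hello", "a.txt"), ("new text", "c.txt")], manager)
```

In this sketch the second run indexes only the document whose hash has not been recorded yet; the unchanged document from `a.txt` is skipped.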
The main benefit of this abstraction is that it works across many vector stores.
To be supported, a VectorStore only needs to support adding and
deleting documents by ID. Using the record manager, the indexing API can
delete outdated documents and avoid redundantly re-indexing documents
that have already been indexed.
The main constraints of this abstraction are:
VectorStore fails.

`RecordManager(self, namespace: str)`

| Name | Type | Description |
|---|---|---|
| namespace* | str | The namespace for the record manager. |
Create the database schema for the record manager.
Asynchronously create the database schema for the record manager.
Get the current server time as a high resolution timestamp.
It's important to get this from the server to ensure a monotonic clock; otherwise, data may be lost when old documents are cleaned up.
Asynchronously get the current server time as a high resolution timestamp.
It's important to get this from the server to ensure a monotonic clock; otherwise, data may be lost when old documents are cleaned up.
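To illustrate why the clock source matters, here is a minimal sketch of timestamp-based cleanup. It assumes a hypothetical cleanup rule in which any record not rewritten during the current run (that is, with a timestamp older than the run's start) is treated as stale. The function name `stale_keys` and the sample data are illustrative only.

```python
def stale_keys(records: dict[str, float], run_started_at: float) -> list[str]:
    """Keys whose last write time precedes the start of the current indexing run."""
    return sorted(key for key, written_at in records.items() if written_at < run_started_at)


# Timestamps as a single, monotonic server clock would report them.
records = {"doc-a": 100.0, "doc-b": 100.0}
run_started_at = 200.0
records["doc-a"] = 201.0  # doc-a is seen again and rewritten during this run
healthy = stale_keys(records, run_started_at)  # only doc-b is stale

# If a lagging client clock stamped the same write instead, the freshly
# written record would look older than the run and be flagged for deletion.
records["doc-a"] = 150.0  # skewed timestamp from a slow client clock
skewed = stale_keys(records, run_started_at)  # doc-a is now wrongly flagged too
```

With consistent timestamps only `doc-b` is flagged; with the skewed timestamp the record that was just rewritten is flagged as well, which is the kind of data loss the warning above refers to.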
Upsert records into the database.
Asynchronously upsert records into the database.
Check if the provided keys exist in the database.
Asynchronously check if the provided keys exist in the database.
List records in the database based on the provided filters.
Asynchronously list records in the database based on the provided filters.
Delete specified records from the database.
Asynchronously delete specified records from the database.
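The operations listed above (upsert, existence check, filtered listing, deletion, each with an asynchronous counterpart) can be sketched with an in-memory store. Everything in this example is an assumption made for illustration: the class name `SketchRecordManager`, the method names, the `before`/`after` time filters, and the counter that stands in for the server clock are not taken from the library.

```python
import asyncio
from typing import Optional


class SketchRecordManager:
    """Hypothetical in-memory store; keys map to the time of their last upsert."""

    def __init__(self, namespace: str) -> None:
        self.namespace = namespace
        self._records: dict[str, float] = {}
        self._clock = 0.0  # stand-in for the server clock

    def get_time(self) -> float:
        self._clock += 1.0  # strictly increasing, like a monotonic server clock
        return self._clock

    def upsert(self, keys: list[str]) -> None:
        now = self.get_time()
        for key in keys:
            self._records[key] = now  # insert new keys, refresh existing ones

    def exists(self, keys: list[str]) -> list[bool]:
        return [key in self._records for key in keys]

    def list_keys(
        self, before: Optional[float] = None, after: Optional[float] = None
    ) -> list[str]:
        return [
            key
            for key, ts in self._records.items()
            if (before is None or ts < before) and (after is None or ts > after)
        ]

    def delete(self, keys: list[str]) -> None:
        for key in keys:
            self._records.pop(key, None)  # ignore keys that are already gone

    async def aexists(self, keys: list[str]) -> list[bool]:
        # Async counterpart; it simply delegates because nothing here blocks.
        return self.exists(keys)


rm = SketchRecordManager(namespace="demo")
rm.upsert(["k1", "k2"])  # both written at time 1.0
rm.upsert(["k2", "k3"])  # k2 refreshed and k3 added at time 2.0
old_keys = rm.list_keys(before=2.0)
rm.delete(old_keys)
after_delete = asyncio.run(rm.aexists(["k1", "k2", "k3"]))
```

A database-backed implementation would replace the dictionary with a table and make the asynchronous methods perform real non-blocking I/O; the sketch only shows how the operations relate to one another.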