class ApifyDatasetLoader
A class that extends BaseDocumentLoader and implements the DocumentLoader interface. It is a document loader that loads documents from an Apify dataset.
Subclasses should route all of their async calls through the async caller, so that those calls benefit from its concurrency and retry logic.
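To make "concurrency and retry logic" concrete, here is a minimal sketch of a caller that caps the number of in-flight calls and retries failed ones. The class name `SimpleAsyncCaller` and its two parameters are hypothetical; this is not the library's implementation, only an illustration of the idea under those assumptions.

```typescript
// Hypothetical sketch, NOT the library's async caller.
// It illustrates two behaviours: a cap on concurrent calls and retries on failure.
class SimpleAsyncCaller {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly maxConcurrency: number;
  private readonly maxRetries: number;

  constructor(maxConcurrency: number = 2, maxRetries: number = 3) {
    this.maxConcurrency = maxConcurrency;
    this.maxRetries = maxRetries;
  }

  // Take a concurrency slot, waiting in a queue if none is free.
  private async acquire(): Promise<void> {
    if (this.active < this.maxConcurrency) {
      this.active += 1;
      return;
    }
    await new Promise<void>((resolve) => this.waiting.push(resolve));
  }

  // Hand the slot to the next waiter, or free it if nobody is waiting.
  private release(): void {
    const next = this.waiting.shift();
    if (next) {
      next(); // the slot is transferred, so the active count is unchanged
    } else {
      this.active -= 1;
    }
  }

  // Run fn under the concurrency cap: one initial attempt plus up to maxRetries retries.
  async call<T>(fn: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      let lastError: unknown;
      for (let attempt = 0; attempt <= this.maxRetries; attempt += 1) {
        try {
          return await fn();
        } catch (error) {
          lastError = error;
        }
      }
      throw lastError;
    } finally {
      this.release();
    }
  }
}
```

A subclass would wrap each network request in `caller.call(() => request())` rather than awaiting the request directly. A production caller would usually also add a backoff delay between attempts, which this sketch omits.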
A method that loads the text file or blob and returns a promise that
resolves to an array of Document instances. It reads the text from
the file or blob using the readFile function from the
node:fs/promises module or the text() method of the blob. It then
parses the text using the parse() method and creates a Document
instance for each parsed page. The metadata includes the source of the
text (file path or blob) and, if there are multiple pages, the line
number of each page.
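The metadata rule described above can be sketched as follows. `SimpleDocument` is a local stand-in for the library's Document type, and `pagesToDocuments` is a hypothetical helper; both are assumptions made so the example runs on its own, without reading a real file or blob.

```typescript
// Local stand-in for the library's Document type, so the sketch is self-contained.
interface SimpleDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: parse the text into pages and build one document per page.
// A line number is attached only when the text yields more than one page.
function pagesToDocuments(
  text: string,
  source: string,
  parse: (raw: string) => string[],
): SimpleDocument[] {
  const pages = parse(text);
  return pages.map((pageContent, i) => ({
    pageContent,
    metadata: pages.length === 1 ? { source } : { source, line: i + 1 },
  }));
}

// One page: metadata carries only the source.
const single = pagesToDocuments("hello", "notes.txt", (raw) => [raw]);

// Two pages: each document also records its 1-based line number.
const multi = pagesToDocuments("a\nb", "notes.txt", (raw) => raw.split("\n"));
```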
Create an ApifyDatasetLoader by calling an Actor on the Apify platform and waiting for its results to be ready.
Create an ApifyDatasetLoader by calling a saved Actor task on the Apify platform and waiting for its results to be ready.
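Both factory methods follow the same pattern: start a run, wait for it to finish, then build a loader that points at the dataset the run produced. The sketch below shows that pattern offline. `ActorRunner`, `SketchDatasetLoader`, and `stubRunner` are hypothetical names introduced for illustration, and the assumption that a finished run exposes a `defaultDatasetId` is modelled here by a stub rather than a real platform call.

```typescript
// Minimal shapes assumed for this sketch; a real platform client has a richer API.
interface ActorRun {
  defaultDatasetId: string;
}

interface ActorRunner {
  // Starts the Actor (or saved task) and resolves once the run has finished.
  call(actorId: string, input: Record<string, unknown>): Promise<ActorRun>;
}

// Hypothetical loader that only remembers which dataset it should read.
class SketchDatasetLoader {
  readonly datasetId: string;

  constructor(datasetId: string) {
    this.datasetId = datasetId;
  }

  // Factory: run the Actor, wait for it, then point a loader at its output dataset.
  static async fromActorCall(
    runner: ActorRunner,
    actorId: string,
    input: Record<string, unknown>,
  ): Promise<SketchDatasetLoader> {
    const run = await runner.call(actorId, input);
    return new SketchDatasetLoader(run.defaultDatasetId);
  }
}

// Stub runner standing in for the platform, so the sketch runs without network access.
const stubRunner: ActorRunner = {
  async call(actorId) {
    return { defaultDatasetId: `dataset-for-${actorId}` };
  },
};
```

Because the factory is async, the caller awaits it before calling `load()`; a constructor alone could not wait for the run to complete.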
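// Assumption: import paths vary between LangChain.js versions; adjust them to your install.
import { ApifyDatasetLoader } from "@langchain/community/document_loaders/web/apify_dataset";
import { Document } from "@langchain/core/documents";
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
import { RetrievalQAChain } from "langchain/chains";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

// Load items from an existing Apify dataset and map each one to a Document.
const loader = new ApifyDatasetLoader("your-dataset-id", {
  datasetMappingFunction: (item) =>
    new Document({
      pageContent: item.text || "",
      metadata: { source: item.url },
    }),
  clientOptions: {
    token: "your-apify-token",
  },
});
const docs = await loader.load();

// Assumption: an in-memory vector store and OpenAI models; any retriever and LLM would do.
const vectorStore = await MemoryVectorStore.fromDocuments(docs, new OpenAIEmbeddings());
const chain = RetrievalQAChain.fromLLM(new ChatOpenAI(), vectorStore.asRetriever(), {
  returnSourceDocuments: true, // needed for res.sourceDocuments below
});
const res = await chain.invoke({ query: "What is LangChain?" });

console.log(res.text);
console.log(res.sourceDocuments.map((d) => d.metadata.source));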