langchain.js

    A document loader for web pages. It extends BaseDocumentLoader, implements the DocumentLoader interface, and uses Cheerio to parse the documents it fetches.

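    // Import path assumes the @langchain/community package; adjust it for your installed version.
    import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
    const loader = new CheerioWebBaseLoader("https://exampleurl.com");
    const docs = await loader.load(); // resolves to an array of Document instances
    console.log({ docs });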

    Properties

    caller: AsyncCaller
    headers?: HeadersInit
    selector?: SelectorType
    textDecoder?: TextDecoder
    timeout: number
    webPath: string

    Methods

    • Extracts the text matched by the selector from the loaded document and wraps it, together with metadata, in a Document instance.

      Returns Promise<Document[]>

      A Promise that resolves to an array of Document instances.
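
The last step of this method can be pictured as wrapping the extracted text in a Document-like object. The sketch below is illustrative only and is not the library's code: `DocLike`, `toDocuments`, and the `source` metadata key are assumptions made for this example.

```typescript
// Hypothetical stand-in for the library's Document type: text plus free-form metadata.
interface DocLike {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Wrap already-extracted text in a one-element array, mirroring the
// Promise<Document[]> return shape described above.
// Assumption: the page URL is recorded under a `source` metadata key.
function toDocuments(extractedText: string, webPath: string): DocLike[] {
  return [{ pageContent: extractedText, metadata: { source: webPath } }];
}

const sketchDocs = toDocuments("Hello from the page body", "https://exampleurl.com");
console.log(sketchDocs.length, sketchDocs[0].metadata.source);
```

Returning an array even for a single page keeps the result shape uniform for callers that process documents in bulk.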

    • Fetches the web document at webPath and parses it with Cheerio.

      Returns Promise<CheerioAPI>

      A Promise that resolves to a CheerioAPI instance.

    • Parameters

      • url: string
      • caller: AsyncCaller
      • timeout: undefined | number
      • Optional textDecoder: TextDecoder
      • Optional options: CheerioOptions & { headers?: HeadersInit }

      Returns Promise<CheerioAPI>
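
To make the timeout, headers, and textDecoder parameters more concrete, here is a minimal sketch of one plausible way such values could be applied to a request and its response body. The helper names `buildFetchOptions` and `decodeBody` are hypothetical, and the UTF-8 fallback is an assumption; none of this is taken from the library's source.

```typescript
// Hypothetical helper: build fetch options from a timeout and extra headers.
// AbortSignal.timeout() yields a signal that aborts after the given milliseconds.
function buildFetchOptions(
  timeout: number | undefined,
  headers: Record<string, string> = {},
): { headers: Record<string, string>; signal?: AbortSignal } {
  return timeout === undefined
    ? { headers }
    : { headers, signal: AbortSignal.timeout(timeout) };
}

// Hypothetical helper: decode a response body with the caller's TextDecoder
// when one is given, otherwise fall back to UTF-8 (an assumption of this sketch).
function decodeBody(body: Uint8Array, textDecoder?: TextDecoder): string {
  return (textDecoder ?? new TextDecoder("utf-8")).decode(body);
}

// Usage: a 10-second timeout plus a custom header, and a UTF-16LE encoded body.
const sketchOptions = buildFetchOptions(10000, { "User-Agent": "docs-example" });
const utf16Body = new Uint8Array([0x48, 0x00, 0xe9, 0x00]);
const decodedHtml = decodeBody(utf16Body, new TextDecoder("utf-16le"));
console.log(sketchOptions.headers, decodedHtml);
```

Passing an explicit TextDecoder matters for pages served in a non-UTF-8 encoding, where default decoding would garble the text.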

    • A static method that dynamically imports the Cheerio library and returns its load function. It throws an error if the import fails.

      Returns Promise<
          {
              load: (
                  content: string | Buffer<ArrayBufferLike> | AnyNode | AnyNode[],
                  options?: null | CheerioOptions,
                  isDocument?: boolean,
              ) => CheerioAPI;
          },
      >

      A Promise that resolves to an object containing the load function from the Cheerio library.
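
The dynamic-import-with-error pattern described above can be sketched generically. The helper `importOrExplain` and its error message are hypothetical and only illustrate the idea of turning a failed optional import into an actionable error; the library's own wording and behavior may differ.

```typescript
// Hypothetical helper: dynamically import an optional dependency and
// replace a failed import with an error that says how to install it.
async function importOrExplain(
  moduleName: string,
  installHint: string,
): Promise<Record<string, unknown>> {
  try {
    return await import(moduleName);
  } catch {
    throw new Error(`Could not load "${moduleName}". Try: ${installHint}`);
  }
}

// Usage: a built-in module resolves; a missing package rejects with the hint.
importOrExplain("node:path", "(built in)").then((mod) => {
  console.log(typeof mod.join);
});
importOrExplain("package-that-is-not-installed", "npm install package-that-is-not-installed")
  .catch((err: Error) => console.log(err.message));
```

Deferring the import until it is needed lets code that never touches the optional dependency run without it being installed.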

    • Fetches the web documents at the given URLs and parses each one with Cheerio.

      Parameters

      • urls: string[]

        An array of URLs to fetch and load.

      • caller: AsyncCaller
      • timeout: undefined | number
      • Optional textDecoder: TextDecoder
      • Optional options: CheerioOptions & { headers?: HeadersInit }

      Returns Promise<CheerioAPI[]>

      A Promise that resolves to an array of CheerioAPI instances.
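
One straightforward way to load several URLs and get back an array in matching order is to map each URL to a promise and await them together. The sketch below assumes that approach; `scrapeAllWith` and `fakeScrape` are hypothetical names, and the stand-in scraper returns a string so the example runs without network access or Cheerio.

```typescript
// Hypothetical helper: run one scraper per URL concurrently and
// collect the results in the same order as the input URLs.
function scrapeAllWith<T>(
  urls: string[],
  scrapeOne: (url: string) => Promise<T>,
): Promise<T[]> {
  return Promise.all(urls.map((url) => scrapeOne(url)));
}

// Stand-in scraper so the sketch runs offline.
const fakeScrape = async (url: string): Promise<string> => `<html>${url}</html>`;

scrapeAllWith(["https://a.example", "https://b.example"], fakeScrape).then((pages) => {
  console.log(pages.length);
});
```

With `Promise.all`, a single failed URL rejects the whole batch; `Promise.allSettled` is an alternative when partial results are acceptable.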