langchain.js

    A transformer that converts HTML content to plain text.

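    // Import paths assume the current @langchain package layout; adjust for your version.
    import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
    import { HtmlToTextTransformer } from "@langchain/community/document_transformers/html_to_text";
    import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

    // Load the raw HTML of the page as documents
    const loader = new CheerioWebBaseLoader("https://example.com/some-page");
    const docs = await loader.load();

    // Split into chunks of at most 1000 characters
    const splitter = new RecursiveCharacterTextSplitter({
      chunkSize: 1000,
    });
    const transformer = new HtmlToTextTransformer();

    // The sequence of text splitting followed by HTML to text transformation
    const sequence = splitter.pipe(transformer);

    // Processing the loaded documents through the sequence
    const newDocuments = await sequence.invoke(docs);

    console.log(newDocuments);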

    Constructors

    • new HtmlToTextTransformer(options?: HtmlToTextOptions): HtmlToTextTransformer

      Parameters

      • options: HtmlToTextOptions = {}

      Returns HtmlToTextTransformer

    Properties

    options: HtmlToTextOptions = {}

    Methods

    • Parameters

      • document: Document

      Returns Promise<Document>
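
The per-document method above takes one Document and resolves to a new Document. A minimal sketch of that shape follows, under stated assumptions: `SimpleDocument`, `naiveHtmlToText`, and `transformDocumentSketch` are hypothetical names invented for illustration, the tag stripping is a naive regex stand-in rather than the conversion the library performs, and copying metadata unchanged is an assumption about the intended behavior.

```typescript
// Minimal stand-in for the library's Document type (assumption for this sketch).
interface SimpleDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: a naive tag stripper, NOT the library's real conversion.
function naiveHtmlToText(html: string): string {
  return html
    .replace(/<(script|style)[^>]*>[\s\S]*?<\/\1>/gi, " ") // drop script/style blocks
    .replace(/<[^>]+>/g, " ") // replace remaining tags with a space
    .replace(/&amp;/g, "&") // decode one common entity only
    .replace(/\s+/g, " ") // collapse runs of whitespace
    .trim();
}

// Hypothetical per-document transform: converted content, metadata copied as-is.
async function transformDocumentSketch(
  doc: SimpleDocument
): Promise<SimpleDocument> {
  return {
    pageContent: naiveHtmlToText(doc.pageContent),
    metadata: { ...doc.metadata },
  };
}
```

Returning a fresh object instead of mutating the input keeps the original documents usable by other steps in a pipeline, which is why the sketch copies the metadata rather than sharing the reference.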

    • Returns string