Class MozillaReadabilityTransformer

A transformer that uses the Mozilla Readability library to extract the main content from a web page.

Example

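import { HTMLWebBaseLoader } from "@langchain/community/document_loaders/web/html";
import { MozillaReadabilityTransformer } from "@langchain/community/document_transformers/mozilla_readability";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

const loader = new HTMLWebBaseLoader("https://example.com/article");
const docs = await loader.load();

// Split the extracted text into chunks of at most 5000 characters.
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 5000,
});
const transformer = new MozillaReadabilityTransformer();

// The sequence runs the transformer first, then passes its output to the splitter.
const sequence = transformer.pipe(splitter);

// Invoke the sequence to extract the readable content and split it into chunks.
const newDocuments = await sequence.invoke(docs);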

console.log(newDocuments);
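
The example relies on `pipe` composing steps left to right: the object `pipe` is called on runs first, and its output feeds the argument. A minimal sketch of that ordering follows. The `Step` class, `extract`, and `split` are hypothetical stand-ins invented for illustration; they are not LangChain's classes, and the regex tag stripping is only a placeholder for real content extraction.

```typescript
// Hypothetical minimal step with pipe/invoke, for illustration only.
class Step<I, O> {
  private readonly fn: (input: I) => Promise<O>;

  constructor(fn: (input: I) => Promise<O>) {
    this.fn = fn;
  }

  invoke(input: I): Promise<O> {
    return this.fn(input);
  }

  // a.pipe(b) returns a step that runs a first, then feeds its output to b.
  pipe<N>(next: Step<O, N>): Step<I, N> {
    return new Step<I, N>(async (input) => next.invoke(await this.invoke(input)));
  }
}

// Records the order in which the steps actually run.
const order: string[] = [];

// Placeholder "extraction": remove anything that looks like a tag.
const extract = new Step<string, string>(async (html) => {
  order.push("extract");
  return html.replace(/<[^>]*>/g, "");
});

// Placeholder "splitting": break the text on sentence boundaries.
const split = new Step<string, string[]>(async (text) => {
  order.push("split");
  return text.split(". ");
});

const pipeline = extract.pipe(split);
```

Invoking `pipeline` with `"<p>One. Two</p>"` resolves to `["One", "Two"]`, and `order` ends up as `["extract", "split"]`, which shows that the left-hand step runs before the piped one.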

Hierarchy

MappingDocumentTransformer
- MozillaReadabilityTransformer

Constructors

constructor

new MozillaReadabilityTransformer(
options?: Options,
): MozillaReadabilityTransformer
Parameters
- options: Options = {}
Returns MozillaReadabilityTransformer
Overrides MappingDocumentTransformer.constructor
- Defined in remotes/langchain-ai/langchainjs/main/libs/langchain-community/src/document_transformers/mozilla_readability.ts:36

Properties

`Protected` options

options: Options = {}

Methods

_transformDocument

_transformDocument(document: Document): Promise<Document>
Parameters
- document: Document
Returns Promise<Document>
- Defined in remotes/langchain-ai/langchainjs/main/libs/langchain-community/src/document_transformers/mozilla_readability.ts:40
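
The signature above suggests a per-document mapping: each input document is transformed independently into one output document. A minimal, self-contained sketch of that pattern follows. The `Doc` type, the `MappingTransformerSketch` base class, and `TagStrippingTransformer` are hypothetical stand-ins invented for illustration; they are not LangChain's classes, and the naive tag stripping is not what Readability does.

```typescript
// Hypothetical stand-in for a document: text plus arbitrary metadata.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical base class: applies _transformDocument to each document independently.
abstract class MappingTransformerSketch {
  async transformDocuments(docs: Doc[]): Promise<Doc[]> {
    return Promise.all(docs.map((doc) => this._transformDocument(doc)));
  }

  protected abstract _transformDocument(doc: Doc): Promise<Doc>;
}

// Toy subclass: strips tags with a regex as a placeholder for real content extraction.
class TagStrippingTransformer extends MappingTransformerSketch {
  protected async _transformDocument(doc: Doc): Promise<Doc> {
    const text = doc.pageContent
      .replace(/<[^>]*>/g, " ") // replace each tag with a space
      .replace(/\s+/g, " ") // collapse runs of whitespace
      .trim();
    // Copy the metadata so the input document is left untouched.
    return { pageContent: text, metadata: { ...doc.metadata } };
  }
}

async function demo(): Promise<Doc[]> {
  const transformer = new TagStrippingTransformer();
  return transformer.transformDocuments([
    {
      pageContent: "<article><h1>Title</h1><p>Body text.</p></article>",
      metadata: { source: "a" },
    },
  ]);
}
```

Calling `demo()` resolves to a single document whose `pageContent` is `"Title Body text."` and whose metadata is carried over unchanged. Keeping the per-document logic in one overridable method means the batching behavior lives in the base class and each subclass only describes how to transform a single document.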

`Static` lc_name

lc_name(): string
Returns string
- Defined in remotes/langchain-ai/langchainjs/main/libs/langchain-community/src/document_transformers/mozilla_readability.ts:32
