Class (since v0.3)

LLMSherpaFileLoader

LLMSherpaFileLoader(
  self,
  file_path: Union[str, Path],
  new_indent_parser: bool = True,
  ...
)

Bases

BaseLoader

Inherited from BaseLoader (langchain_core)

Methods

load, aload, load_and_split, alazy_load

Name	Type
file_path	Union[str, Path]
new_indent_parser	bool
apply_ocr	bool
strategy	str
llmsherpa_api_url	str

Load Documents using LLMSherpa.

LLMSherpaFileLoader uses LayoutPDFReader, which is part of the LLMSherpa library. LayoutPDFReader parses PDFs while preserving their layout information, which most PDF-to-text parsers discard.

Examples

from langchain_community.document_loaders.llmsherpa import LLMSherpaFileLoader

loader = LLMSherpaFileLoader(
    "example.pdf",
    strategy="chunks",
    llmsherpa_api_url="http://localhost:5010/api/parseDocument?renderFormat=all",
)
docs = loader.load()
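Once documents are loaded, it can help to check what each one contains. The following is a minimal sketch, not part of the library: the helper names load_pdf and preview are hypothetical, and load_pdf assumes an LLMSherpa server is reachable at the URL passed in (for instance, the local URL used in the example above). preview relies only on the page_content attribute of each returned document.

```python
from typing import Iterable, List


def load_pdf(path: str, api_url: str):
    """Load a PDF via LLMSherpaFileLoader (hypothetical helper; needs a running server)."""
    # Imported lazily so preview() below can be used without langchain_community installed.
    from langchain_community.document_loaders.llmsherpa import LLMSherpaFileLoader

    loader = LLMSherpaFileLoader(path, strategy="chunks", llmsherpa_api_url=api_url)
    return loader.load()


def preview(docs: Iterable, width: int = 60) -> List[str]:
    """Return one short line per document: index, character count, leading text."""
    lines = []
    for i, doc in enumerate(docs):
        # Collapse newlines and repeated spaces so each preview fits on one line.
        text = " ".join(doc.page_content.split())
        lines.append(f"[{i}] {len(doc.page_content)} chars: {text[:width]}")
    return lines


# Example usage (requires a reachable LLMSherpa server):
# docs = load_pdf("example.pdf", "http://localhost:5010/api/parseDocument?renderFormat=all")
# print("\n".join(preview(docs)))
```

Keeping the network call in its own function means the formatting helper can be tried out on any objects that expose page_content, without a parser service running.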
