ZeroxPDFLoader

ZeroxPDFLoader(
  self,
  file_path: Union[str, PurePath],
  model: str = 'gpt-4o-mini',

Bases

BasePDFLoader

Name	Type
file_path	Union[str, PurePath]
model	str

Document loader that uses the Zerox library: https://github.com/getomni-ai/zerox

Zerox converts a PDF document into a series of images (one per page) and uses a vision-capable LLM to generate a Markdown representation.

Zerox uses async operations. Therefore, when using this loader inside a Jupyter Notebook (or any environment that already runs an asyncio event loop), you will need to:

    import nest_asyncio
    nest_asyncio.apply()
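To illustrate why the patch is needed, here is a minimal standard-library sketch. It assumes the loader drives Zerox through a blocking call such as `asyncio.run()`; this page does not confirm that detail. `convert_page`, `blocking_load`, and `notebook_cell` are hypothetical stand-ins, not part of Zerox or this loader.

```python
import asyncio


async def convert_page() -> str:
    # Hypothetical stand-in for an async page conversion.
    await asyncio.sleep(0)
    return "# Page 1"


def blocking_load() -> str:
    # Assumption: a synchronous API that drives async work with asyncio.run().
    coro = convert_page()
    try:
        return asyncio.run(coro)
    except RuntimeError:
        # Close the coroutine that never started, then re-raise.
        coro.close()
        raise


async def notebook_cell() -> str:
    # Mimics a Jupyter cell: code here runs while an event loop is active.
    try:
        return blocking_load()
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"


# Plain script: no loop is running yet, so the blocking call succeeds.
outside = blocking_load()

# Inside a running loop: asyncio.run() refuses to start a second loop.
inside = asyncio.run(notebook_cell())

print(outside)
print(inside)
```

Calling `nest_asyncio.apply()` first patches asyncio so that such nested calls are allowed, which is why it should run before the loader is used in a notebook.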