ArxivLoader

ArxivLoader(
  self,
  query: str,
  doc_content_chars_max: Optional[int] = None,
  **kwargs
)

Bases

BaseLoader

Inherited from BaseLoader (langchain_core)

Methods: load, aload, load_and_split, alazy_load

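=========================  =================  ========  ========
Name                       Type               Required  Default
=========================  =================  ========  ========
``query``                  ``str``            yes
``doc_content_chars_max``  ``Optional[int]``  no        ``None``
=========================  =================  ========  ========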
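
The parameter list gives no description for ``doc_content_chars_max``. A minimal sketch of what the name suggests, assuming the value caps the number of characters kept in each document's ``page_content`` and that ``None`` means no cap. ``truncate_content`` is a hypothetical helper written for illustration, not part of the loader:

.. code-block:: python

    from typing import Optional


    def truncate_content(text: str, doc_content_chars_max: Optional[int] = None) -> str:
        """Hypothetical helper: keep at most doc_content_chars_max characters."""
        # Slicing with None as the upper bound returns the whole string,
        # so the default leaves the text untouched.
        return text[:doc_content_chars_max]


    full_text = "Understanding the Reasoning Ability of Language Models"
    print(truncate_content(full_text, 13))         # Understanding
    print(truncate_content(full_text) == full_text)  # True

Under this assumption, a small cap keeps downstream prompts short, while the default keeps the full text.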

Setup:

Install the ``arxiv`` and ``PyMuPDF`` packages. PyMuPDF converts the PDF files downloaded from arxiv.org into plain text.

.. code-block:: bash

    pip install -U arxiv pymupdf

Instantiate:

.. code-block:: python

    from langchain_community.document_loaders import ArxivLoader

    loader = ArxivLoader(
        query="reasoning",
        # load_max_docs=2,
        # load_all_available_meta=False
    )

Load:

.. code-block:: python

    docs = loader.load()
    print(docs[0].page_content[:100])
    print(docs[0].metadata)

.. code-block:: python

    Understanding the Reasoning Ability of Language Models
    From the Perspective of Reasoning Paths Aggre
    {
        'Published': '2024-02-29',
        'Title': 'Understanding the Reasoning Ability of Language Models From the
                Perspective of Reasoning Paths Aggregation',
        'Authors': 'Xinyi Wang, Alfonso Amayuelas, Kexun Zhang, Liangming Pan,
                Wenhu Chen, William Yang Wang',
        'Summary': 'Pre-trained language models (LMs) are able to perform complex reasoning
                without explicit fine-tuning...'
    }

Lazy load:

.. code-block:: python

    docs = []
    docs_lazy = loader.lazy_load()

    # async variant (alazy_load is an async generator, so iterate it
    # with `async for` instead of awaiting it):
    # async for doc in loader.alazy_load():
    #     docs.append(doc)

    for doc in docs_lazy:
        docs.append(doc)
    print(docs[0].page_content[:100])
    print(docs[0].metadata)

.. code-block:: python

    Understanding the Reasoning Ability of Language Models
    From the Perspective of Reasoning Paths Aggre
    {
        'Published': '2024-02-29',
        'Title': 'Understanding the Reasoning Ability of Language Models From the
                Perspective of Reasoning Paths Aggregation',
        'Authors': 'Xinyi Wang, Alfonso Amayuelas, Kexun Zhang, Liangming Pan,
                Wenhu Chen, William Yang Wang',
        'Summary': 'Pre-trained language models (LMs) are able to perform complex reasoning
                without explicit fine-tuning...'
    }
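
Because a lazy loader hands back documents one at a time, iteration can stop early without building the full list. A minimal offline sketch of that pattern, in which ``fake_lazy_load`` is a hypothetical stand-in for ``loader.lazy_load()`` that yields plain dicts rather than real documents:

.. code-block:: python

    from itertools import islice
    from typing import Dict, Iterator


    def fake_lazy_load() -> Iterator[Dict[str, str]]:
        """Hypothetical stand-in for loader.lazy_load(); yields dicts, not Documents."""
        for i in range(1000):
            yield {"page_content": f"paper {i}", "Title": f"Title {i}"}


    # islice pulls only the first three items; the rest are never produced.
    first_three = list(islice(fake_lazy_load(), 3))
    print([doc["Title"] for doc in first_three])  # ['Title 0', 'Title 1', 'Title 2']

The same ``islice`` call should work on any iterator, so it can be tried on the real ``docs_lazy`` object as well.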

Async load:

.. code-block:: python

    docs = await loader.aload()
    print(docs[0].page_content[:100])
    print(docs[0].metadata)

.. code-block:: python

    Understanding the Reasoning Ability of Language Models
    From the Perspective of Reasoning Paths Aggre
    {
        'Published': '2024-02-29',
        'Title': 'Understanding the Reasoning Ability of Language Models From the
                Perspective of Reasoning Paths Aggregation',
        'Authors': 'Xinyi Wang, Alfonso Amayuelas, Kexun Zhang, Liangming Pan,
                Wenhu Chen, William Yang Wang',
        'Summary': 'Pre-trained language models (LMs) are able to perform complex reasoning
                without explicit fine-tuning...'
    }
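
A bare ``await`` only works where an event loop is already running, such as a notebook. In a plain script the call has to be wrapped in a coroutine and started with ``asyncio.run``. The sketch below shows that wrapper and the difference between awaiting a coroutine and iterating an async generator. ``FakeLoader`` is a hypothetical stand-in that only mirrors the method names ``aload`` and ``alazy_load`` and returns strings instead of documents:

.. code-block:: python

    import asyncio
    from typing import AsyncIterator, List


    class FakeLoader:
        """Hypothetical stand-in that mirrors the async method names of a loader."""

        async def alazy_load(self) -> AsyncIterator[str]:
            # Async generator: consumed with `async for`, not `await`.
            for title in ("paper A", "paper B"):
                await asyncio.sleep(0)  # yield control, as real I/O would
                yield title

        async def aload(self) -> List[str]:
            # Coroutine: awaited once, returns the full list.
            return [doc async for doc in self.alazy_load()]


    async def main() -> List[str]:
        loader = FakeLoader()
        docs = await loader.aload()
        streamed = [doc async for doc in loader.alazy_load()]
        assert docs == streamed
        return docs


    result = asyncio.run(main())
    print(result)  # ['paper A', 'paper B']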

Use summaries of articles as docs:

.. code-block:: python

    from langchain_community.document_loaders import ArxivLoader

    loader = ArxivLoader(
        query="reasoning"
    )

    docs = loader.get_summaries_as_docs()
    print(docs[0].page_content[:100])
    print(docs[0].metadata)

.. code-block:: python

    Pre-trained language models (LMs) are able to perform complex reasoning
    without explicit fine-tuning
    {
        'Entry ID': 'http://arxiv.org/abs/2402.03268v2',
        'Published': datetime.date(2024, 2, 29),
        'Title': 'Understanding the Reasoning Ability of Language Models From the
                Perspective of Reasoning Paths Aggregation',
        'Authors': 'Xinyi Wang, Alfonso Amayuelas, Kexun Zhang, Liangming Pan,
                Wenhu Chen, William Yang Wang'
    }
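
In the sample outputs on this page, ``'Published'`` is a string for loaded documents but a ``datetime.date`` for summary documents. When both kinds of metadata end up in the same store, it may help to bring them to one representation first. A minimal sketch, in which ``normalize_published`` is a hypothetical helper and the input dicts are trimmed copies of the outputs shown on this page:

.. code-block:: python

    import datetime
    from typing import Any, Dict


    def normalize_published(metadata: Dict[str, Any]) -> Dict[str, Any]:
        """Hypothetical helper: return a copy with 'Published' as an ISO date string."""
        normalized = dict(metadata)
        published = normalized.get("Published")
        if isinstance(published, datetime.date):
            normalized["Published"] = published.isoformat()
        return normalized


    from_summaries = {"Published": datetime.date(2024, 2, 29)}
    from_load = {"Published": "2024-02-29"}
    print(normalize_published(from_summaries) == normalize_published(from_load))  # True

The helper copies the dict instead of mutating it, so the original metadata stays available unchanged.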
