| Name | Type | Description |
|---|---|---|
urls* | List[str] | URLs to load. Each is loaded into its own document. |
text_mode | bool | Default: TrueIf True, extract text from URL and use that for page content. Otherwise, extract raw HTML. |
nlp | bool | Default: FalseIf True, perform NLP on the extracted contents, like providing a summary and extracting keywords. |
continue_on_failure | bool | Default: True |
show_progress_bar | bool | Default: False |
**newspaper_kwargs | Any | Default: {} |
Load news articles from URLs using Unstructured.
Example:
.. code-block:: python
from langchain_community.document_loaders import NewsURLLoader
loader = NewsURLLoader(
urls=["
Newspaper reference:
If True, continue loading documents even if loading fails for a particular URL.
If True, use tqdm to show a loading progress bar. Requires
tqdm to be installed, pip install tqdm.
Any additional named arguments to pass to newspaper.Article().