# BSHTMLLoader

> **Class** in `langchain_community`

📖 [View in docs](https://reference.langchain.com/python/langchain-community/document_loaders/html_bs/BSHTMLLoader)

HTML document loader integration that parses files with BeautifulSoup.

## Signature

```python
BSHTMLLoader(
    self,
    file_path: Union[str, Path],
    open_encoding: Union[str, None] = None,
    bs_kwargs: Union[dict, None] = None,
    get_text_separator: str = '',
)
```

## Description

**Setup:**

Install ``langchain-community`` and ``bs4``.

.. code-block:: bash

    pip install -U langchain-community bs4

**Instantiate:**

.. code-block:: python

    from langchain_community.document_loaders import BSHTMLLoader

    loader = BSHTMLLoader(
        file_path="./example_data/fake-content.html",
    )

**Lazy load:**

.. code-block:: python

    docs = []
    docs_lazy = loader.lazy_load()

    # async variant (alazy_load() is iterated with `async for`, not awaited):
    # async for doc in loader.alazy_load():
    #     docs.append(doc)

    for doc in docs_lazy:
        docs.append(doc)
    print(docs[0].page_content[:100])
    print(docs[0].metadata)

.. code-block:: text

    Test Title

    My First Heading
    My first paragraph.

    {'source': './example_data/fake-content.html', 'title': 'Test Title'}

**Async load:**

.. code-block:: python

    docs = await loader.aload()
    print(docs[0].page_content[:100])
    print(docs[0].metadata)

.. code-block:: text

    Test Title

    My First Heading
    My first paragraph.

    {'source': './example_data/fake-content.html', 'title': 'Test Title'}

## Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `file_path` | `Union[str, Path]` | Yes | The path to the file to load. |
| `open_encoding` | `Union[str, None]` | No | The encoding to use when opening the file. (default: `None`) |
| `bs_kwargs` | `Union[dict, None]` | No | Any kwargs to pass to the BeautifulSoup object. (default: `None`) |
| `get_text_separator` | `str` | No | The separator to use when calling `get_text` on the soup. (default: `''`) |

## Extends

- `BaseLoader`

## Constructors

```python
__init__(
    self,
    file_path: Union[str, Path],
    open_encoding: Union[str, None] = None,
    bs_kwargs: Union[dict, None] = None,
    get_text_separator: str = '',
) -> None
```

| Name | Type |
|------|------|
| `file_path` | `Union[str, Path]` |
| `open_encoding` | `Union[str, None]` |
| `bs_kwargs` | `Union[dict, None]` |
| `get_text_separator` | `str` |


## Properties

- `file_path`
- `open_encoding`
- `bs_kwargs`
- `get_text_separator`

## Methods

- [`lazy_load()`](https://reference.langchain.com/python/langchain-community/document_loaders/html_bs/BSHTMLLoader/lazy_load)

---

[View source on GitHub](https://github.com/langchain-ai/langchain-community/blob/4b280287bd55b99b44db2dd849f02d66c89534d5/libs/community/langchain_community/document_loaders/html_bs.py#L13)