# BeautifulSoupTransformer

> **Class** in `langchain_community`

📖 [View in docs](https://reference.langchain.com/python/langchain-community/document_transformers/beautiful_soup_transformer/BeautifulSoupTransformer)

Transform HTML content by extracting specific tags and removing unwanted ones.

## Signature

```python
BeautifulSoupTransformer(
    self,
)
```

## Description

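**Example:**

```python
from langchain_community.document_transformers import BeautifulSoupTransformer

# `docs` is assumed to be a sequence of Document objects whose
# page_content holds raw HTML (for example, the output of an HTML loader).
bs4_transformer = BeautifulSoupTransformer()
docs_transformed = bs4_transformer.transform_documents(docs)
```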

## Extends

- `BaseDocumentTransformer`

## Constructors

```python
__init__(
    self,
) -> None
```


## Methods

- [`transform_documents()`](https://reference.langchain.com/python/langchain-community/document_transformers/beautiful_soup_transformer/BeautifulSoupTransformer/transform_documents)
- [`remove_unwanted_classnames()`](https://reference.langchain.com/python/langchain-community/document_transformers/beautiful_soup_transformer/BeautifulSoupTransformer/remove_unwanted_classnames)
- [`remove_unwanted_tags()`](https://reference.langchain.com/python/langchain-community/document_transformers/beautiful_soup_transformer/BeautifulSoupTransformer/remove_unwanted_tags)
- [`extract_tags()`](https://reference.langchain.com/python/langchain-community/document_transformers/beautiful_soup_transformer/BeautifulSoupTransformer/extract_tags)
- [`remove_unnecessary_lines()`](https://reference.langchain.com/python/langchain-community/document_transformers/beautiful_soup_transformer/BeautifulSoupTransformer/remove_unnecessary_lines)
- [`atransform_documents()`](https://reference.langchain.com/python/langchain-community/document_transformers/beautiful_soup_transformer/BeautifulSoupTransformer/atransform_documents)
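
The method names above suggest a three-step cleanup: drop unwanted tags, keep the text of selected tags, then collapse leftover blank lines. The sketch below illustrates that idea with Python's standard-library `html.parser` instead of BeautifulSoup, so it is not the library's implementation and its output may differ from what `transform_documents()` returns. `TagTextExtractor`, `clean_html`, and the default tag lists are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser


class TagTextExtractor(HTMLParser):
    """Collect text inside wanted tags, skipping text inside unwanted tags.

    Hypothetical helper for illustration; assumes well-formed HTML and
    non-void wanted tags (each opening tag has a matching closing tag).
    """

    def __init__(self, wanted, unwanted):
        super().__init__(convert_charrefs=True)
        self.wanted = set(wanted)
        self.unwanted = set(unwanted)
        self.wanted_depth = 0    # number of wanted tags currently open
        self.unwanted_depth = 0  # number of unwanted tags currently open
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.unwanted:
            self.unwanted_depth += 1
        elif tag in self.wanted:
            self.wanted_depth += 1

    def handle_endtag(self, tag):
        if tag in self.unwanted and self.unwanted_depth > 0:
            self.unwanted_depth -= 1
        elif tag in self.wanted and self.wanted_depth > 0:
            self.wanted_depth -= 1

    def handle_data(self, data):
        # Keep text only when inside a wanted tag and outside every unwanted tag.
        if self.unwanted_depth == 0 and self.wanted_depth > 0:
            self.chunks.append(data)


def clean_html(html, wanted=("p", "li"), unwanted=("script", "style")):
    """Return the text of wanted tags as a single space-separated string."""
    parser = TagTextExtractor(wanted, unwanted)
    parser.feed(html)
    parser.close()
    # Strip each line, drop the empty ones, and join with single spaces.
    lines = "\n".join(parser.chunks).split("\n")
    return " ".join(line.strip() for line in lines if line.strip())


sample = (
    "<html><head><style>p {color: red;}</style></head><body>"
    "<script>var x = 1;</script><h1>Title</h1>"
    "<p>First\n   paragraph.</p><ul><li>Item one</li></ul>"
    "</body></html>"
)
print(clean_html(sample))  # prints: First paragraph. Item one
```

In this sketch the `<h1>` text is dropped because `h1` is not in the wanted list, and the `<script>` and `<style>` bodies never reach the output. For the real transformer's defaults and keyword arguments, follow the per-method links above.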

---

[View source on GitHub](https://github.com/langchain-ai/langchain-community/blob/4b280287bd55b99b44db2dd849f02d66c89534d5/libs/community/langchain_community/document_transformers/beautiful_soup_transformer.py#L6)