# ExperimentalMarkdownSyntaxTextSplitter

> **Class** in `langchain_text_splitters`

📖 [View in docs](https://reference.langchain.com/python/langchain-text-splitters/markdown/ExperimentalMarkdownSyntaxTextSplitter)

An experimental text splitter for handling Markdown syntax.

This splitter aims to retain the exact whitespace of the original text while
extracting structured metadata, such as headers. It is a re-implementation of the
`MarkdownHeaderTextSplitter` with notable changes to the approach and additional
features.

Key Features:

* Retains the original whitespace and formatting of the Markdown text.
* Extracts headers, code blocks, and horizontal rules as metadata.
* Splits out code blocks and includes the language in the "Code" metadata key.
* Splits text on horizontal rules (`---`) as well.
* Defaults to sensible splitting behavior, which can be overridden using the
    `headers_to_split_on` parameter.
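
The features above can be pictured with a small, self-contained sketch. This is not the library's implementation: `sketch_split`, `Chunk`, and the three regular expressions are hypothetical names introduced only for illustration, and the sketch assumes default `"Header N"` metadata keys and header stripping. It shows one way to keep whitespace untouched while splitting on headers, fenced code blocks, and horizontal rules.

```python
import re
from dataclasses import dataclass, field

# Illustrative patterns (assumptions, not the library's own expressions).
HEADER_RE = re.compile(r"^(#{1,6}) (.*)")            # e.g. "## Usage"
FENCE_RE = re.compile(r"^```(.*)")                   # opening or closing fence
RULE_RE = re.compile(r"^(\*\*\*+|---+|___+)\s*$")    # horizontal rule


@dataclass
class Chunk:
    content: str = ""
    metadata: dict = field(default_factory=dict)


def sketch_split(text: str) -> list[Chunk]:
    """Hypothetical helper: split Markdown without altering whitespace."""
    chunks = []
    current = Chunk()
    stack = []       # (depth, title) pairs for the enclosing headers
    in_code = False

    def flush():
        # Close the current chunk, attaching the active headers as metadata.
        nonlocal current
        if current.content.strip():
            for depth, title in stack:
                current.metadata.setdefault(f"Header {depth}", title)
            chunks.append(current)
        current = Chunk()

    # keepends=True preserves every newline exactly as written.
    for line in text.splitlines(keepends=True):
        if in_code:
            current.content += line
            if FENCE_RE.match(line):  # closing fence ends the code chunk
                in_code = False
                flush()
            continue
        header = HEADER_RE.match(line)
        fence = FENCE_RE.match(line)
        if header:
            flush()
            depth = len(header.group(1))
            # A new header replaces any header at the same or deeper level.
            while stack and stack[-1][0] >= depth:
                stack.pop()
            stack.append((depth, header.group(2).strip()))
        elif fence:
            flush()
            # The code block becomes its own chunk, tagged with its language.
            current = Chunk(content=line, metadata={"Code": fence.group(1).strip()})
            in_code = True
        elif RULE_RE.match(line):
            flush()  # a horizontal rule only ends the current chunk
        else:
            current.content += line
    flush()
    return chunks


text = (
    "# Title\n\nIntro text.\n\n## Usage\n\n"
    "```python\nprint('hi')\n```\n\n---\n\nAfter rule.\n"
)
chunks = sketch_split(text)
for chunk in chunks:
    print(repr(chunk.content), chunk.metadata)
```

In this sketch, header lines and rules are dropped from the chunk text and whitespace-only chunks are discarded; the real class may differ on both points.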

## Signature

```python
ExperimentalMarkdownSyntaxTextSplitter(
    headers_to_split_on: list[tuple[str, str]] | None = None,
    return_each_line: bool = False,
    strip_headers: bool = True,
)
```

## Description

**Example:**

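```python
from langchain_text_splitters.markdown import ExperimentalMarkdownSyntaxTextSplitter

# Map each Markdown header prefix to the metadata key it populates.
headers_to_split_on = [
    ("#", "Header 1"),
    ("##", "Header 2"),
]
splitter = ExperimentalMarkdownSyntaxTextSplitter(
    headers_to_split_on=headers_to_split_on
)

# Sample input; replace with the Markdown string you want to split.
text = "# Title\n\nIntro paragraph.\n\n## Section\n\nSection body.\n"
chunks = splitter.split_text(text)
for chunk in chunks:
    print(chunk)
```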

This class is currently experimental and subject to change based on feedback and
further development.

## Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `headers_to_split_on` | `list[tuple[str, str]] \| None` | No | A list of tuples, where each tuple contains a Markdown header prefix (e.g., `"#"`) and its corresponding metadata key. If `None`, default headers are used. (default: `None`) |
| `return_each_line` | `bool` | No | Whether to return each line as an individual chunk. If `False`, lines are aggregated into larger chunks. (default: `False`) |
| `strip_headers` | `bool` | No | Whether to exclude headers from the resulting chunks. (default: `True`) |
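
As a rough illustration of how `headers_to_split_on` pairs might be consulted, the sketch below turns the list into a dictionary and looks up each header prefix. The helper `header_metadata`, the name `prefix_to_key`, and the example keys `"Title"` and `"Section"` are assumptions made for this example; they are not part of the class's API, and the sketch assumes that header levels missing from the list are treated as ordinary text.

```python
import re
from typing import Optional

# Example configuration: only level-1 and level-2 headers trigger a split.
headers_to_split_on = [("#", "Title"), ("##", "Section")]
prefix_to_key = dict(headers_to_split_on)


def header_metadata(line: str) -> Optional[tuple[str, str]]:
    """Hypothetical helper: return (metadata key, header text) or None."""
    match = re.match(r"^(#{1,6}) (.*)", line)
    if match and match.group(1) in prefix_to_key:
        return prefix_to_key[match.group(1)], match.group(2).strip()
    # Not a header, or a header level that was not configured.
    return None


print(header_metadata("## Install"))
print(header_metadata("### Details"))
```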

## Constructors

```python
__init__(
    self,
    headers_to_split_on: list[tuple[str, str]] | None = None,
    return_each_line: bool = False,
    strip_headers: bool = True,
) -> None
```

| Name | Type |
|------|------|
| `headers_to_split_on` | `list[tuple[str, str]] \| None` |
| `return_each_line` | `bool` |
| `strip_headers` | `bool` |


## Properties

- `chunks`
- `current_chunk`
- `current_header_stack`
- `strip_headers`
- `splittable_headers`
- `return_each_line`

## Methods

- [`split_text()`](https://reference.langchain.com/python/langchain-text-splitters/markdown/ExperimentalMarkdownSyntaxTextSplitter/split_text)

---

[View source on GitHub](https://github.com/langchain-ai/langchain/blob/6fb37dba71da807af60aa7b909f71f0625a666bf/libs/text-splitters/langchain_text_splitters/markdown.py#L298)