# TextSplitter

> **Class** in `langchain_text_splitters`

📖 [View in docs](https://reference.langchain.com/python/langchain-text-splitters/base/TextSplitter)

Interface for splitting text into chunks.

## Signature

```python
TextSplitter(
    chunk_size: int = 4000,
    chunk_overlap: int = 200,
    length_function: Callable[[str], int] = len,
    keep_separator: bool | Literal['start', 'end'] = False,
    add_start_index: bool = False,
    strip_whitespace: bool = True,
)
```

## Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
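| `chunk_size` | `int` | No | Maximum size of each returned chunk, as measured by `length_function` (default: `4000`) |
| `chunk_overlap` | `int` | No | Overlap between consecutive chunks, measured by `length_function` (characters with the default `len`); must not exceed `chunk_size` (default: `200`) |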
| `length_function` | `Callable[[str], int]` | No | Function that measures the length of given chunks (default: `len`) |
| `keep_separator` | `bool \| Literal['start', 'end']` | No | Whether to keep the separator and, if so, whether to attach it to the start or the end of each chunk; `True` is equivalent to `'start'` (default: `False`) |
| `add_start_index` | `bool` | No | If `True`, stores each chunk's start index in its metadata under the `start_index` key (default: `False`) |
| `strip_whitespace` | `bool` | No | If `True`, strips whitespace from the start and end of every document (default: `True`) |
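
To make the relationship between `chunk_size`, `chunk_overlap`, and a start index concrete, here is a deliberately simplified sketch in plain Python. It is not the library's splitting algorithm (concrete splitters break on separators and then merge the pieces); `window_chunks` is a hypothetical helper that slides a fixed-step character window over the text, so it only shows what the numbers mean.

```python
def window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[tuple[int, str]]:
    """Return (start_index, chunk) pairs from a fixed-step character window.

    Illustrative only: a stand-in for the idea of size and overlap,
    not the behaviour of any langchain_text_splitters class.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # This sketch needs a positive step, so it requires overlap < size.
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks: list[tuple[int, str]] = []
    start = 0
    while start < len(text):
        chunks.append((start, text[start:start + chunk_size]))
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks


pairs = window_chunks("abcdefghij", chunk_size=4, chunk_overlap=1)
print(pairs)  # [(0, 'abcd'), (3, 'defg'), (6, 'ghij')]
```

Each chunk after the first repeats the last character of the previous one, which is what an overlap of 1 means when length is measured in characters.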

## Extends

- `BaseDocumentTransformer`
- `ABC`

## Constructors

```python
__init__(
    self,
    chunk_size: int = 4000,
    chunk_overlap: int = 200,
    length_function: Callable[[str], int] = len,
    keep_separator: bool | Literal['start', 'end'] = False,
    add_start_index: bool = False,
    strip_whitespace: bool = True,
) -> None
```

| Name | Type |
|------|------|
| `chunk_size` | `int` |
| `chunk_overlap` | `int` |
| `length_function` | `Callable[[str], int]` |
| `keep_separator` | `bool \| Literal['start', 'end']` |
| `add_start_index` | `bool` |
| `strip_whitespace` | `bool` |


## Methods

- [`split_text()`](https://reference.langchain.com/python/langchain-text-splitters/base/TextSplitter/split_text)
- [`create_documents()`](https://reference.langchain.com/python/langchain-text-splitters/base/TextSplitter/create_documents)
- [`split_documents()`](https://reference.langchain.com/python/langchain-text-splitters/base/TextSplitter/split_documents)
- [`from_huggingface_tokenizer()`](https://reference.langchain.com/python/langchain-text-splitters/base/TextSplitter/from_huggingface_tokenizer)
- [`from_tiktoken_encoder()`](https://reference.langchain.com/python/langchain-text-splitters/base/TextSplitter/from_tiktoken_encoder)
- [`transform_documents()`](https://reference.langchain.com/python/langchain-text-splitters/base/TextSplitter/transform_documents)

---

[View source on GitHub](https://github.com/langchain-ai/langchain/blob/fb6ab993a73180538f6cca876b3c85d46c08845f/libs/text-splitters/langchain_text_splitters/base.py#L44)