# MWDumpLoader

> **Class** in `langchain_community`

📖 [View in docs](https://reference.langchain.com/python/langchain-community/document_loaders/mediawikidump/MWDumpLoader)

Load a MediaWiki dump from an XML file.

## Signature

```python
MWDumpLoader(
    file_path: Union[str, Path],
    encoding: Optional[str] = 'utf8',
    namespaces: Optional[Sequence[int]] = None,
    skip_redirects: Optional[bool] = False,
    stop_on_error: Optional[bool] = True,
)
```

## Description

**Example:**

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import MWDumpLoader

loader = MWDumpLoader(
    file_path="myWiki.xml",
    encoding="utf8"
)
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=0
)
texts = text_splitter.split_documents(docs)
```

**Parameters:**

- `file_path` (`Union[str, Path]`): Path to the local XML dump file.
- `encoding` (`str`, optional): Character encoding of the file. Defaults to `"utf8"`.
- `namespaces` (`Sequence[int]`, optional): Namespaces of the pages to parse. See https://www.mediawiki.org/wiki/Help:Namespaces#Localisation for a list of common namespaces.
- `skip_redirects` (`bool`, optional): `True` to skip pages that redirect to other pages, `False` to keep them. Defaults to `False`.
- `stop_on_error` (`bool`, optional): `False` to skip pages that cause parsing errors, `True` to stop. Defaults to `True`.
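The optional parameters above can be combined to narrow what gets loaded. The following is a minimal sketch, not a definitive recipe: the helper name `load_article_pages` and the constant `MAIN_NAMESPACE` are hypothetical, and it assumes the dump uses the standard MediaWiki numbering in which namespace `0` holds ordinary content pages.

```python
from pathlib import Path

# Assumption: standard MediaWiki numbering, where 0 is the main (article) namespace.
MAIN_NAMESPACE = 0


def load_article_pages(dump_path):
    """Hypothetical helper: load article pages only, tolerating bad pages."""
    # Imported inside the function so the module can be imported even when
    # langchain_community is not installed.
    from langchain_community.document_loaders import MWDumpLoader

    loader = MWDumpLoader(
        file_path=Path(dump_path),    # a str or a Path is accepted
        encoding="utf8",
        namespaces=[MAIN_NAMESPACE],  # keep only main-namespace pages
        skip_redirects=True,          # drop pages that redirect elsewhere
        stop_on_error=False,          # skip pages that fail to parse
    )
    return loader.load()
```

Setting `stop_on_error=False` trades completeness for robustness: pages that cause parsing errors are skipped rather than aborting the whole load.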

## Extends

- `BaseLoader`

## Constructors

```python
__init__(
    self,
    file_path: Union[str, Path],
    encoding: Optional[str] = 'utf8',
    namespaces: Optional[Sequence[int]] = None,
    skip_redirects: Optional[bool] = False,
    stop_on_error: Optional[bool] = True,
)
```

| Name | Type |
|------|------|
| `file_path` | `Union[str, Path]` |
| `encoding` | `Optional[str]` |
| `namespaces` | `Optional[Sequence[int]]` |
| `skip_redirects` | `Optional[bool]` |
| `stop_on_error` | `Optional[bool]` |


## Properties

- `file_path`
- `encoding`
- `namespaces`
- `skip_redirects`
- `stop_on_error`

## Methods

- [`lazy_load()`](https://reference.langchain.com/python/langchain-community/document_loaders/mediawikidump/MWDumpLoader/lazy_load)
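
For large dumps, iterating over `lazy_load()` avoids building the full list of documents that `load()` returns. The sketch below is illustrative only: `summarize` and `preview_dump` are hypothetical helpers, and reading a `"source"` key from each document's metadata is an assumption about what the loader stores there.

```python
from itertools import islice


def summarize(docs, limit=5):
    """Return (source, character count) pairs for the first `limit` documents."""
    return [
        # Assumption: the loader records an identifier under "source".
        (doc.metadata.get("source"), len(doc.page_content))
        for doc in islice(docs, limit)
    ]


def preview_dump(dump_path, limit=5):
    """Hypothetical helper: summarize the first few pages of a dump."""
    # Imported inside the function so the module can be imported even when
    # langchain_community is not installed.
    from langchain_community.document_loaders import MWDumpLoader

    loader = MWDumpLoader(file_path=dump_path)
    # lazy_load() yields documents one at a time, so islice stops early
    # without consuming the rest of the dump.
    return summarize(loader.lazy_load(), limit)
```

Because `summarize` accepts any iterable of documents, it works equally well with the list returned by `load()`.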

---

[View source on GitHub](https://github.com/langchain-ai/langchain-community/blob/4b280287bd55b99b44db2dd849f02d66c89534d5/libs/community/langchain_community/document_loaders/mediawikidump.py#L15)