Load from Hugging Face Hub datasets.

| Name | Type | Description |
|---|---|---|
| path* | str | Path or name of the dataset. |
| page_content_column | str | Page content column name. Default: "text". |
| name | Optional[str] | Name of the dataset configuration. Default: None. |
| data_dir | Optional[str] | Data directory of the dataset configuration. Default: None. |
| data_files | Optional[Union[str, Sequence[str], Mapping[str, Union[str, Sequence[str]]]]] | Path(s) to source data file(s). Default: None. |
| cache_dir | Optional[str] | Directory to read/write data. Default: None. |
| keep_in_memory | Optional[bool] | Whether to copy the dataset in-memory. Default: None. |
| save_infos | bool | Save the dataset information (checksums/size/splits/...). Default: False. |
| use_auth_token | Optional[Union[bool, str]] | Bearer token for remote files on the Dataset Hub. Default: None. |
| num_proc | Optional[int] | Number of processes. Default: None. |
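
The parameters above can be passed as keyword arguments when constructing the loader. The sketch below is illustrative only. It assumes the table documents LangChain's `HuggingFaceDatasetLoader` (importable from `langchain_community.document_loaders`); the helper names `build_loader_kwargs` and `make_loader` and the dataset name `"imdb"` are hypothetical choices for the example, not part of the documented API.

```python
from typing import Any, Dict, Optional


def build_loader_kwargs(
    path: str,
    page_content_column: str = "text",
    name: Optional[str] = None,
    cache_dir: Optional[str] = None,
    num_proc: Optional[int] = None,
) -> Dict[str, Any]:
    """Collect constructor arguments, dropping optional ones left at None."""
    kwargs: Dict[str, Any] = {
        "path": path,
        "page_content_column": page_content_column,
    }
    optional = {"name": name, "cache_dir": cache_dir, "num_proc": num_proc}
    # Only forward optional parameters the caller actually set.
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs


def make_loader(**kwargs: Any):
    """Construct the loader lazily so the import is only needed when used."""
    # Assumption: the class documented above is LangChain's
    # HuggingFaceDatasetLoader; this import requires langchain-community
    # and the Hugging Face `datasets` package to be installed.
    from langchain_community.document_loaders import HuggingFaceDatasetLoader

    return HuggingFaceDatasetLoader(**kwargs)


# "imdb" is an illustrative dataset name; substitute your own.
kwargs = build_loader_kwargs("imdb", page_content_column="text", num_proc=2)
print(kwargs)

# With the packages installed and network access available, loading
# documents would look like this:
#     docs = make_loader(**kwargs).load()
```

Building the keyword arguments separately keeps unset optional parameters out of the call, so the loader falls back to the defaults listed in the table.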