| Name | Type | Description |
|---|---|---|
| file_path* | Union[str, Path] | The path to the CSV file. |
| source_column | Optional[str] | Default: None. The name of the column in the CSV file to use as the source. Optional. |
| metadata_columns | Sequence[str] | Default: () |
| csv_args | Optional[Dict] | Default: None |
| encoding | Optional[str] | Default: None |
| autodetect_encoding | bool | Default: False |
| content_columns | Sequence[str] | Default: () |
Load a CSV file into a list of Document objects.
Each document represents one row of the CSV file. Every column in the row is converted into a key/value pair and written on its own line in the document's page_content.
By default, the source of every document loaded from the CSV file is set to
the value of the file_path argument.
You can override this by setting the source_column argument to the
name of a column in the CSV file.
The source of each document is then taken from that row's value in the
column named by source_column.
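The source rule described above can be sketched with the standard library alone. This is a minimal illustration, not the loader's actual implementation: the helper `row_source`, the sample CSV text, and the `./sample.csv` path are all hypothetical.

```python
import csv
import io


def row_source(row, file_path, source_column=None):
    """Pick the source for one row (hypothetical helper, not part of CSVLoader)."""
    if source_column is None:
        # Default: every row shares the file path as its source.
        return file_path
    # Override: the source comes from the named column of this row.
    return row[source_column]


# Made-up sample data standing in for a real CSV file.
sample = io.StringIO("id,url\n1,https://example.com/a\n2,https://example.com/b\n")
rows = list(csv.DictReader(sample))

default_sources = [row_source(r, "./sample.csv") for r in rows]
column_sources = [row_source(r, "./sample.csv", source_column="url") for r in rows]
print(default_sources)
print(column_sources)
```

With no source_column, both rows report the file path; with `source_column="url"`, each row reports its own URL.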
Output Example:

.. code-block:: txt

    column1: value1
    column2: value2
    column3: value3
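To make the output format concrete, here is a rough sketch of how one row could be turned into that text using only `csv.DictReader`. The function name `row_to_page_content` and the sample data are assumptions for illustration; the loader's real code may differ in details such as whitespace handling.

```python
import csv
import io


def row_to_page_content(row):
    """Join one CSV row into 'key: value' lines (hypothetical helper)."""
    return "\n".join(f"{key}: {value}" for key, value in row.items())


# Made-up sample matching the column names in the output example.
sample = io.StringIO("column1,column2,column3\nvalue1,value2,value3\n")
first_row = next(csv.DictReader(sample))

page_content = row_to_page_content(first_row)
print(page_content)
```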
Instantiate:

.. code-block:: python

    from langchain_community.document_loaders import CSVLoader

    loader = CSVLoader(
        file_path='./hw_200.csv',
        csv_args={
            'delimiter': ',',
            'quotechar': '"',
            'fieldnames': ['Index', 'Height', 'Weight'],
        },
    )
Load:

.. code-block:: python

    docs = loader.load()
    print(docs[0].page_content[:100])
    print(docs[0].metadata)

.. code-block:: python

    Index: Index
    Height: Height(Inches)"
    Weight: "Weight(Pounds)"
    {'source': './hw_200.csv', 'row': 0}
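The first document (row 0) contains header text rather than measurements. A plausible explanation, assuming csv_args is forwarded unchanged to `csv.DictReader`, is that supplying `fieldnames` makes the reader treat the file's own header line as ordinary data. The sketch below demonstrates that standard-library behavior on made-up sample text; the numeric values are invented and are not taken from hw_200.csv.

```python
import csv
import io

# Made-up sample: one header line and one data line.
raw = "Index,Height,Weight\n1,65.78,112.99\n"

# With explicit fieldnames, the header line is returned as the first data row.
with_names = list(
    csv.DictReader(io.StringIO(raw), fieldnames=["Index", "Height", "Weight"])
)

# Without fieldnames, the header line is consumed to name the columns.
without_names = list(csv.DictReader(io.StringIO(raw)))

print(len(with_names), with_names[0])
print(len(without_names), without_names[0])
```

If the header row is not wanted as a document, one option is to leave `fieldnames` out of csv_args so the file's header line names the columns.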
Async load:

.. code-block:: python

    # Run inside an async function or a notebook cell that allows top-level await.
    docs = await loader.aload()
    print(docs[0].page_content[:100])
    print(docs[0].metadata)

.. code-block:: python

    Index: Index
    Height: Height(Inches)"
    Weight: "Weight(Pounds)"
    {'source': './hw_200.csv', 'row': 0}
Lazy load:

.. code-block:: python

    docs = []
    docs_lazy = loader.lazy_load()

    # async variant (alazy_load yields documents asynchronously,
    # so iterate with ``async for`` instead of awaiting it):
    # async for doc in loader.alazy_load():
    #     docs.append(doc)

    for doc in docs_lazy:
        docs.append(doc)
    print(docs[0].page_content[:100])
    print(docs[0].metadata)

.. code-block:: python

    Index: Index
    Height: Height(Inches)"
    Weight: "Weight(Pounds)"
    {'source': './hw_200.csv', 'row': 0}
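For the async variant of lazy loading, the usual consumption pattern is `async for`. The sketch below shows that pattern with a hypothetical stand-in class, `FakeLoader`, that yields plain strings instead of Document objects; it assumes the real `alazy_load` is an async iterator and does not touch any file.

```python
import asyncio


class FakeLoader:
    """Hypothetical stand-in for a loader; yields strings, not Documents."""

    async def alazy_load(self):
        for i in range(3):
            await asyncio.sleep(0)  # hand control back to the event loop
            yield f"row {i}"


async def collect(loader):
    """Consume an async iterator of documents one item at a time."""
    docs = []
    async for doc in loader.alazy_load():
        docs.append(doc)
    return docs


docs = asyncio.run(collect(FakeLoader()))
print(docs)
```

Each item is handled as soon as it is produced, so the full result never has to be materialized unless the caller chooses to collect it.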
metadata_columns: A sequence of column names to use as metadata. Optional.
csv_args: A dictionary of arguments to pass to csv.DictReader. Optional. Defaults to None.
encoding: The encoding of the CSV file. Optional. Defaults to None.
autodetect_encoding: Whether to try to autodetect the file encoding.
content_columns: A sequence of column names to use for the document content. If not present, all columns that are not part of the metadata are used.
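The interplay between content columns and metadata columns can be sketched as follows. This is an illustration of the rule stated above under simple assumptions, not the loader's implementation: `split_row` and the sample data are hypothetical.

```python
import csv
import io


def split_row(row, content_columns=(), metadata_columns=()):
    """Split one row into page content and metadata (hypothetical helper)."""
    if content_columns:
        # Explicit content columns: keep only those, in row order.
        content_keys = [k for k in row if k in content_columns]
    else:
        # Fallback: every column that is not used as metadata.
        content_keys = [k for k in row if k not in metadata_columns]
    page_content = "\n".join(f"{k}: {row[k]}" for k in content_keys)
    metadata = {k: row[k] for k in metadata_columns}
    return page_content, metadata


# Made-up sample data.
sample = io.StringIO("name,team,city\nAda,red,Paris\n")
row = next(csv.DictReader(sample))

fallback = split_row(row, metadata_columns=("city",))
explicit = split_row(row, content_columns=("name",), metadata_columns=("city",))
print(fallback)
print(explicit)
```

In the fallback case the content holds `name` and `team`; with `content_columns=("name",)` only `name` remains, and `city` is metadata in both cases.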