csv_loader

Try to detect the file encoding.

Returns a list of FileEncoding tuples with the detected encodings ordered by confidence.
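One way a caller might consume such a confidence-ordered list is to try each candidate in turn until one decodes the data. The sketch below is illustrative only: the FileEncoding stand-in, its field names, and the decode_with_candidates helper are assumptions made for this example, not the library's own definitions.

```python
from typing import List, NamedTuple, Optional


class FileEncoding(NamedTuple):
    """Hypothetical stand-in for the tuple described above (field names assumed)."""

    encoding: Optional[str]
    confidence: float
    language: Optional[str]


def decode_with_candidates(raw: bytes, candidates: List[FileEncoding]) -> str:
    """Try each candidate in the given order; return the first successful decode."""
    for candidate in candidates:
        if candidate.encoding is None:
            continue  # nothing to try for this candidate
        try:
            return raw.decode(candidate.encoding)
        except (UnicodeDecodeError, LookupError):
            continue  # wrong or unknown encoding, move on to the next one
    raise ValueError("none of the candidate encodings could decode the data")


# Bytes that are valid Latin-1 but not valid UTF-8.
raw = "café".encode("latin-1")

# Candidates ordered by (made-up) confidence, highest first.
candidates = [
    FileEncoding("utf-8", 0.6, None),
    FileEncoding("latin-1", 0.4, None),
]

text = decode_with_candidates(raw, candidates)
print(text)
```

Here the UTF-8 attempt fails, so the helper falls through to the second candidate.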

Load a CSV file into a list of Document objects.

Each document represents one row of the CSV file. Every column in the row is converted into a key/value pair and written on its own line in the document's page_content.

By default, the source of every document loaded from the CSV is set to the value of the file_path argument. You can override this by setting the source_column argument to the name of a column in the CSV file; the source of each document is then set to that row's value in the named column.
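The row-to-document conversion described above can be sketched with the standard library alone. This is a minimal illustration rather than the loader's actual implementation: the rows_to_documents helper is hypothetical, plain dictionaries stand in for Document objects, and the "key: value" line format and the sample data are assumptions.

```python
import csv
import io
from typing import Dict, List, Optional


def rows_to_documents(
    csv_text: str, file_path: str, source_column: Optional[str] = None
) -> List[Dict]:
    """Turn each CSV row into a dict with page_content and metadata (sketch)."""
    documents = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        # One "key: value" pair per line (the exact format is an assumption).
        page_content = "\n".join(f"{key}: {value}" for key, value in row.items())
        # Default source is the file path; source_column overrides it per row.
        source = row[source_column] if source_column else file_path
        documents.append(
            {"page_content": page_content, "metadata": {"source": source}}
        )
    return documents


# Made-up sample data for illustration.
csv_text = (
    "team,year,url\n"
    "Bruins,2011,https://example.com/bruins\n"
    "Kings,2012,https://example.com/kings\n"
)

docs = rows_to_documents(csv_text, "teams.csv")
docs_by_url = rows_to_documents(csv_text, "teams.csv", source_column="url")

print(docs[0]["page_content"])
print(docs_by_url[1]["metadata"]["source"])
```

Without source_column, every document carries the file path as its source; with source_column="url", each document's source comes from that row's url value.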

Load CSV files using Unstructured.

Like other Unstructured loaders, UnstructuredCSVLoader can be used in both "single" and "elements" mode. In "elements" mode, the CSV file is returned as a single Unstructured Table element, and an HTML representation of the table is available under the "text_as_html" key in the document metadata.

Examples

from langchain_community.document_loaders.csv_loader import UnstructuredCSVLoader

loader = UnstructuredCSVLoader("stanley-cups.csv", mode="elements")
docs = loader.load()

Load files using Unstructured.

The file loader uses the unstructured partition function and will automatically detect the file type. You can run the loader in different modes: "single", "elements", and "paged". The default "single" mode returns a single langchain Document object. In "elements" mode, the unstructured library splits the document into elements such as Title and NarrativeText and returns them as individual langchain Document objects. In addition to these post-processing modes (which are specific to the LangChain loaders), Unstructured has its own "chunking" parameters for post-processing elements into more useful chunks for use cases such as Retrieval Augmented Generation (RAG). You can pass additional unstructured kwargs to configure other unstructured settings.

Examples

from langchain_community.document_loaders import UnstructuredFileLoader

loader = UnstructuredFileLoader(
    "example.pdf",
    mode="elements",
    strategy="fast",
)
docs = loader.load()

References

https://docs.unstructured.io/open-source/core-functionality/partitioning
https://docs.unstructured.io/open-source/core-functionality/chunking
