Load LangSmith Dataset examples as Document objects.
Loads the example inputs as the Document page content and places the entire
example into the Document metadata. This allows you to easily create few-shot
example retrievers from the loaded documents.

```python
from langchain_core.document_loaders import LangSmithLoader

loader = LangSmithLoader(dataset_id="...", limit=100)
docs = []
for doc in loader.lazy_load():
    docs.append(doc)
# -> [Document("...", metadata={"inputs": {...}, "outputs": {...}, ...}), ...]
```

```python
LangSmithLoader(
    self,
    *,
    dataset_id: uuid.UUID | str | None = None,
    dataset_name: str | None = None,
    example_ids: Sequence[uuid.UUID | str] | None = None,
    as_of: datetime.datetime | str | None = None,
    splits: Sequence[str] | None = None,
    inline_s3_urls: bool = True,
    offset: int = 0,
    limit: int | None = None,
    metadata: dict | None = None,
    filter: str | None = None,
    content_key: str = '',
    format_content: Callable[..., str] | None = None,
    client: LangSmithClient | None = None,
    **client_kwargs: Any = {}
)
```

| Name | Type | Description |
|---|---|---|
| dataset_id | `uuid.UUID \| str \| None` | Default: `None`. The ID of the dataset to filter by. |
| dataset_name | `str \| None` | Default: `None`. The name of the dataset to filter by. |
| content_key | `str` | Default: `''`. The inputs key to set as the Document page content. |
| format_content | `Callable[..., str] \| None` | Default: `None`. Function for converting the content extracted from the example inputs into a string. Defaults to JSON-encoding the contents. |
| example_ids | `Sequence[uuid.UUID \| str] \| None` | Default: `None`. The IDs of the examples to filter by. |
| as_of | `datetime.datetime \| str \| None` | Default: `None`. The dataset version tag or timestamp to retrieve the examples as of. Only examples that were present at the time of the tagged (or timestamped) version are returned. |
| splits | `Sequence[str] \| None` | Default: `None`. A list of dataset splits, which are divisions of your dataset. Returns examples only from the specified splits. |
| inline_s3_urls | `bool` | Default: `True`. Whether to inline S3 URLs. |
| offset | `int` | Default: `0`. The offset to start from. |
| limit | `int \| None` | Default: `None`. The maximum number of examples to return. |
| metadata | `dict \| None` | Default: `None`. Metadata to filter by. |
| filter | `str \| None` | Default: `None`. A structured filter string to apply to the examples. |
| client | `LangSmithClient \| None` | Default: `None`. LangSmith Client. If not provided, it will be initialized from `client_kwargs`. |
| client_kwargs | `Any` | Default: `{}`. Keyword args to pass to LangSmith client init. Should only be specified if `client` is not provided. |
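
To make `content_key` and `format_content` more concrete, here is a small stdlib-only sketch of how a loader of this kind could turn one example's inputs into page content. It is an illustration under stated assumptions, not the library's implementation: `build_page_content` and `default_format` are hypothetical helpers, and both the dotted-path reading of `content_key` and the pass-through of plain strings are assumptions made for this sketch.

```python
import json
from typing import Any, Callable, Optional


def default_format(content: Any) -> str:
    """Assumed default: strings pass through, everything else is JSON-encoded."""
    if isinstance(content, str):
        return content
    return json.dumps(content, indent=2)


def build_page_content(
    inputs: dict,
    content_key: str = "",
    format_content: Optional[Callable[..., str]] = None,
) -> str:
    """Hypothetical helper: pick part of the inputs, then stringify it."""
    content: Any = inputs
    # Assumption: "a.b" means inputs["a"]["b"]; an empty key keeps all inputs.
    for key in content_key.split(".") if content_key else []:
        content = content[key]
    formatter = format_content or default_format
    return formatter(content)


example_inputs = {"question": {"text": "What is 2 + 2?", "lang": "en"}}

# No content_key: the whole inputs dict is JSON-encoded.
whole = build_page_content(example_inputs)
# Dotted content_key: only the nested value becomes the page content.
nested = build_page_content(example_inputs, content_key="question.text")
# Custom formatter: any callable that returns a string can replace the default.
shouted = build_page_content(
    example_inputs, content_key="question.text", format_content=str.upper
)
```

In this sketch the selection step and the formatting step are kept separate, so a custom `format_content` only ever sees the value already chosen by `content_key`.
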
| Name | Type |
|---|---|
| dataset_id | `uuid.UUID \| str \| None` |
| dataset_name | `str \| None` |
| example_ids | `Sequence[uuid.UUID \| str] \| None` |
| as_of | `datetime.datetime \| str \| None` |
| splits | `Sequence[str] \| None` |
| inline_s3_urls | `bool` |
| offset | `int` |
| limit | `int \| None` |
| metadata | `dict \| None` |
| filter | `str \| None` |
| content_key | `str` |
| format_content | `Callable[..., str] \| None` |
| client | `LangSmithClient \| None` |
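
Because each loaded Document carries the entire example in its metadata, the documents can feed a few-shot example selector. The sketch below is a rough, stdlib-only illustration of that idea. `Doc`, `word_overlap`, and `select_few_shot` are hypothetical names, `Doc` is a stand-in for a loaded Document, and plain word overlap is swapped in for the embedding-based similarity a real retriever would more likely use.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a loaded Document: text plus the full example as metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def word_overlap(a: str, b: str) -> int:
    """Count distinct lowercase words shared by two strings."""
    return len(set(a.lower().split()) & set(b.lower().split()))


def select_few_shot(docs: list, query: str, k: int = 2) -> list:
    """Return (inputs, outputs) pairs from the k docs that best match the query."""
    ranked = sorted(
        docs,
        key=lambda d: word_overlap(d.page_content, query),
        reverse=True,
    )
    return [(d.metadata["inputs"], d.metadata["outputs"]) for d in ranked[:k]]


# Made-up examples shaped like the metadata shown in the usage example above.
docs = [
    Doc(
        "how do I reset my password",
        {"inputs": {"q": "how do I reset my password"},
         "outputs": {"a": "Use the reset link."}},
    ),
    Doc(
        "what is the refund policy",
        {"inputs": {"q": "what is the refund policy"},
         "outputs": {"a": "Refunds within 30 days."}},
    ),
    Doc(
        "how do I change my email",
        {"inputs": {"q": "how do I change my email"},
         "outputs": {"a": "Edit it in settings."}},
    ),
]

shots = select_few_shot(docs, "how can I reset my email password", k=2)
```

The selected pairs can then be formatted into a prompt as worked examples ahead of the new query.
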