Class ● Since v0.2

LangSmithLoader

Load LangSmith Dataset examples as Document objects.

Loads the example inputs as the Document page content and places the entire example into the Document metadata. This allows you to easily create few-shot example retrievers from the loaded documents.

Lazy loading

from langchain_core.document_loaders import LangSmithLoader

loader = LangSmithLoader(dataset_id="...", limit=100)
docs = []
for doc in loader.lazy_load():
    docs.append(doc)

# -> [Document("...", metadata={"inputs": {...}, "outputs": {...}, ...}), ...]
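As a rough illustration of how the loaded documents could feed a few-shot prompt, the sketch below formats the `inputs` and `outputs` stored in each document's metadata. `FakeDoc`, `format_few_shot`, and the sample data are hypothetical stand-ins so the snippet runs without a LangSmith dataset; it only assumes that each document carries `page_content` and a `metadata` dict with `inputs` and `outputs` keys, as in the example above.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a loaded Document (page_content + metadata)."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def format_few_shot(docs, max_examples=2):
    """Render up to max_examples documents as input/output pairs."""
    blocks = []
    for doc in docs[:max_examples]:
        inputs = json.dumps(doc.metadata["inputs"], sort_keys=True)
        outputs = json.dumps(doc.metadata["outputs"], sort_keys=True)
        blocks.append(f"Input: {inputs}\nOutput: {outputs}")
    return "\n\n".join(blocks)


# Sample data shaped like the metadata in the example above (made up for illustration).
docs = [
    FakeDoc(
        page_content='{"question": "2+2?"}',
        metadata={"inputs": {"question": "2+2?"}, "outputs": {"answer": "4"}},
    ),
    FakeDoc(
        page_content='{"question": "Capital of France?"}',
        metadata={"inputs": {"question": "Capital of France?"}, "outputs": {"answer": "Paris"}},
    ),
    FakeDoc(
        page_content='{"question": "Largest planet?"}',
        metadata={"inputs": {"question": "Largest planet?"}, "outputs": {"answer": "Jupiter"}},
    ),
]

prompt = format_few_shot(docs, max_examples=2)
print(prompt)
```

In practice the documents would more likely be indexed in a vector store and retrieved by similarity to the incoming query; the fixed slice here just keeps the sketch dependency-free.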

LangSmithLoader(
  self,
  *,
  dataset_id: uuid.UUID | str | None = None,
  dataset_name: str | None = None,
  example_ids: Sequence[uuid.UUID | str] | None = None,
  as_of: datetime.datetime | str | None = None,
  splits: Sequence[str] | None = None,
  inline_s3_urls: bool = True,
  offset: int = 0,
  limit: int | None = None,
  metadata: dict | None = None,
  filter: str | None = None,
  content_key: str = '',
  format_content: Callable[..., str] | None = None,
  client: LangSmithClient | None = None,
  **client_kwargs: Any
)

Bases

BaseLoader

Parameters

Name	Type	Description
`dataset_id`	`uuid.UUID \| str \| None`	Default:`None` The ID of the dataset to filter by.
`dataset_name`	`str \| None`	Default:`None` The name of the dataset to filter by.
`content_key`	`str`	Default:`''` The inputs key to set as `Document` page content. `'.'` characters are interpreted as nested keys, e.g. `content_key="first.second"` will result in `Document(page_content=format_content(example.inputs["first"]["second"]))`
`format_content`	`Callable[..., str] \| None`	Default:`None` Function for converting the content extracted from the example inputs into a string. Defaults to JSON-encoding the contents.
`example_ids`	`Sequence[uuid.UUID \| str] \| None`	Default:`None` The IDs of the examples to filter by.
`as_of`	`datetime.datetime \| str \| None`	Default:`None` The dataset version tag or timestamp to retrieve the examples as of. Only examples that were present at the time of the tagged (or timestamped) version are returned.
`splits`	`Sequence[str] \| None`	Default:`None` A list of dataset splits, which are divisions of your dataset such as `train`, `test`, or `validation`. Returns examples only from the specified splits.
`inline_s3_urls`	`bool`	Default:`True` Whether to inline S3 URLs.
`offset`	`int`	Default:`0` The offset to start from.
`limit`	`int \| None`	Default:`None` The maximum number of examples to return.
`metadata`	`dict \| None`	Default:`None` Metadata to filter by.
`filter`	`str \| None`	Default:`None` A structured filter string to apply to the examples.
`client`	`LangSmithClient \| None`	Default:`None` LangSmith client. If not provided, one is initialized from `client_kwargs`.
`client_kwargs`	`Any`	Default:`{}` Keyword arguments to pass to the LangSmith client constructor. Specify these only when `client` is not provided.
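The interaction between `content_key` and `format_content` can be sketched in plain Python. This is an illustration of the behavior described in the table, not the library's code: `extract_content` is a hypothetical helper, and two details are assumptions, namely that an empty `content_key` selects the whole inputs dict and that the default formatter leaves strings unchanged while JSON-encoding everything else.

```python
import json


def default_format(value):
    """Assumed default: strings pass through, other values are JSON-encoded."""
    if isinstance(value, str):
        return value
    return json.dumps(value, indent=2)


def extract_content(inputs, content_key="", format_content=None):
    """Hypothetical helper: walk dotted keys, then stringify the result."""
    value = inputs
    # "first.second" becomes ["first", "second"]; an empty key means no descent.
    for key in content_key.split(".") if content_key else []:
        value = value[key]
    formatter = format_content or default_format
    return formatter(value)


inputs = {"first": {"second": "hello"}, "other": 1}

nested = extract_content(inputs, "first.second")
whole = extract_content(inputs)
shouted = extract_content(inputs, "first.second", format_content=str.upper)
```

Here `nested` is the string stored under `inputs["first"]["second"]`, `whole` is a JSON rendering of the full inputs dict, and `shouted` shows a custom `format_content` callable replacing the default.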

Constructors

__init__

Name	Type
dataset_id	uuid.UUID \| str \| None
dataset_name	str \| None
example_ids	Sequence[uuid.UUID \| str] \| None
as_of	datetime.datetime \| str \| None
splits	Sequence[str] \| None
inline_s3_urls	bool
offset	int
limit	int \| None
metadata	dict \| None
filter	str \| None
content_key	str
format_content	Callable[..., str] \| None
client	LangSmithClient \| None

Attributes

dataset_id: dataset_id

dataset_name: dataset_name

example_ids: example_ids

inline_s3_urls: inline_s3_urls

Methods

lazy_load

Inherited from BaseLoader

Methods

load: Eagerly load all Document objects into memory.

aload: Asynchronously load data into Document objects.

load_and_split: Load Document objects and split them into chunks. Chunks are returned as Document objects.

alazy_load: An asynchronous lazy loader for Document objects.
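The difference between lazy and eager loading can be sketched with a toy class. `SketchLoader` is hypothetical and is not the real `BaseLoader`; it only assumes the common pattern in which `load` collects everything that `lazy_load` yields into a list.

```python
from typing import Iterator, List


class SketchLoader:
    """Hypothetical loader used only to contrast lazy and eager loading."""

    def __init__(self, items: List[str]) -> None:
        self.items = items
        self.yielded = 0  # counts how many items have been produced so far

    def lazy_load(self) -> Iterator[str]:
        # Items are produced one at a time, only when the caller asks for them.
        for item in self.items:
            self.yielded += 1
            yield item

    def load(self) -> List[str]:
        # Eager loading drains the lazy iterator into a list.
        return list(self.lazy_load())


loader = SketchLoader(["a", "b", "c"])

first = next(loader.lazy_load())  # pulls a single item
after_lazy = loader.yielded       # only one item has been produced
everything = loader.load()        # produces all three items at once
```

With a large dataset, iterating over `lazy_load()` keeps one item in flight at a time, whereas `load()` holds the full list in memory.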
