| Name | Type | Description |
|---|---|---|
| experiment* | Union[str, uuid.UUID] | The identifier of the experiment to evaluate. |
| evaluators | Optional[Sequence[EVALUATOR_T]] | Default: None |
| summary_evaluators | Optional[Sequence[SUMMARY_EVALUATOR_T]] | Default: None |
| metadata | Optional[dict] | Default: None |
| max_concurrency | int \| None | Default: 0 |
| client | Optional[langsmith.Client] | Default: None |
| load_nested | bool | Default: False |
| blocking | bool | Default: True |
Evaluate existing experiment runs.
Environment:
LANGSMITH_TEST_CACHE: If set, API calls will be cached to disk to save time and
cost during testing. Committing the cache files to your repository is recommended for faster CI/CD runs.
Requires the 'langsmith[vcr]' package to be installed.
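As a minimal sketch of how the function might be called, the example below defines one row-level evaluator and one summary evaluator and passes them to `evaluate_existing`. The evaluator names, the `"output"` key, and the `evaluate_my_experiment` helper are assumptions made for illustration; the call itself needs the `langsmith` package and valid API credentials, so the import is deferred until the helper is invoked.

```python
from types import SimpleNamespace


def exact_match(run, example):
    """Row-level evaluator: score 1 if the run output equals the reference."""
    # The "output" key is an assumption; use whatever key your runs store.
    predicted = (run.outputs or {}).get("output")
    expected = (example.outputs or {}).get("output")
    return {"key": "exact_match", "score": int(predicted == expected)}


def accuracy(runs, examples):
    """Summary evaluator: fraction of runs whose output equals the reference."""
    pairs = list(zip(runs, examples))
    if not pairs:
        return {"key": "accuracy", "score": 0.0}
    correct = sum(exact_match(r, e)["score"] for r, e in pairs)
    return {"key": "accuracy", "score": correct / len(pairs)}


def evaluate_my_experiment(experiment_name):
    """Hypothetical helper: re-score an existing experiment by name or ID."""
    # Imported lazily so the evaluators above can be tested without langsmith.
    from langsmith.evaluation import evaluate_existing

    return evaluate_existing(
        experiment_name,
        evaluators=[exact_match],
        summary_evaluators=[accuracy],
        metadata={"note": "re-scored with exact_match"},
        max_concurrency=0,  # evaluate sequentially
        blocking=True,      # wait until evaluation is complete
    )


# Local check with stand-in objects (no network access needed).
run = SimpleNamespace(outputs={"output": "4"})
example = SimpleNamespace(outputs={"output": "4"})
print(exact_match(run, example))
```

Keeping the evaluators as plain functions makes them easy to unit test before pointing them at a real experiment.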
evaluators: Optional sequence of evaluators to use for individual run evaluation.
summary_evaluators: Optional sequence of evaluators to apply over the entire dataset.
metadata: Optional metadata to include in the evaluation results.
max_concurrency: The maximum number of concurrent evaluations to run. If None, no limit is set. If 0, evaluations run without concurrency.
client: Optional LangSmith client to use for evaluation.
load_nested: Whether to load all child runs for the experiment. By default, only the top-level root runs are loaded.
blocking: Whether to block until evaluation is complete.
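The `max_concurrency` semantics (0 for no concurrency, None for no explicit limit) can be illustrated with a small standalone sketch. The `run_all` helper below is hypothetical and is not how the library schedules evaluations; it only shows one way such a parameter could be interpreted with the standard library's thread pool.

```python
from concurrent.futures import ThreadPoolExecutor


def run_all(tasks, max_concurrency=0):
    """Run zero-argument callables and return their results in input order.

    Hypothetical helper mirroring the documented semantics:
    0 runs tasks one after another in the calling thread, None leaves the
    worker count to the executor's default (an approximation of "no limit"),
    and a positive integer caps the number of worker threads.
    """
    if max_concurrency == 0:
        return [task() for task in tasks]
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        futures = [pool.submit(task) for task in tasks]
        # Collect results in submission order, not completion order.
        return [future.result() for future in futures]


tasks = [lambda i=i: i * i for i in range(5)]
print(run_all(tasks, max_concurrency=0))
print(run_all(tasks, max_concurrency=2))
```

Both calls return the same results; only the degree of parallelism differs.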