Evaluate existing experiment runs.
evaluate_existing(
    experiment: Union[str, uuid.UUID, schemas.TracerSession],
    /,
    evaluators: Optional[Sequence[EVALUATOR_T]] = None,
    summary_evaluators: Optional[Sequence[SUMMARY_EVALUATOR_T]] = None,
    metadata: Optional[dict] = None,
    max_concurrency: Optional[int] = 0,
    client: Optional[langsmith.Client] = None,
    load_nested: bool = False,
    blocking: bool = True
) -> ExperimentResults
LANGSMITH_TEST_CACHE: If set, API calls will be cached to disk to save time and cost during testing. It is recommended to commit the cache files to your repository for faster CI/CD runs. Requires the 'langsmith[vcr]' package to be installed.
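As a sketch of the caching setup described above (the cache directory path here is an arbitrary example, not a required location):

```shell
# Cache LangSmith API calls to disk during tests; commit the cache
# directory for faster CI/CD runs.
# Requires: pip install "langsmith[vcr]"
export LANGSMITH_TEST_CACHE="tests/cassettes"
```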
| Name | Type | Description |
|---|---|---|
| experiment* | Union[str, uuid.UUID, schemas.TracerSession] | The identifier of the experiment to evaluate. |
| evaluators | Optional[Sequence[EVALUATOR_T]] | Default: None. Optional sequence of evaluators to use for individual run evaluation. |
| summary_evaluators | Optional[Sequence[SUMMARY_EVALUATOR_T]] | Default: None. Optional sequence of evaluators to apply over the entire dataset. |
| metadata | Optional[dict] | Default: None. Optional metadata to include in the evaluation results. |
| max_concurrency | Optional[int] | Default: 0. The maximum number of concurrent evaluations to run. If None, no limit is set; if 0, no concurrency. |
| client | Optional[langsmith.Client] | Default: None. Optional LangSmith client to use for evaluation. |
| load_nested | bool | Default: False. Whether to load all child runs for the experiment. Default is to only load the top-level root runs. |
| blocking | bool | Default: True. Whether to block until evaluation is complete. |
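A minimal usage sketch of the parameters above. The evaluator's scoring logic and the experiment name "my-experiment" are hypothetical; the network call is guarded so the snippet only contacts LangSmith when an API key is configured:

```python
import os

# A simple run-level evaluator: compares a run's output to the reference
# answer on the dataset example. The {"key", "score"} return shape follows
# LangSmith's evaluator result convention.
def correct(run, example):
    score = run.outputs.get("answer") == example.outputs.get("answer")
    return {"key": "correct", "score": int(score)}

if os.environ.get("LANGSMITH_API_KEY"):
    from langsmith.evaluation import evaluate_existing

    # Re-score a previously logged experiment by name or UUID.
    # "my-experiment" is a placeholder identifier.
    results = evaluate_existing(
        "my-experiment",
        evaluators=[correct],
        max_concurrency=4,  # run up to 4 evaluations at once
    )
```

Summary evaluators would be passed the same way via `summary_evaluators=[...]`, receiving all runs and examples at once rather than one pair at a time.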