Evaluate a run.
evaluate_run(
self,
run: Union[ls_schemas.Run, ls_schemas.RunBase, str, uuid.UUID],
evaluator: ls_evaluator.RunEvaluator,
*,
source_info: Optional[dict[str, Any]] = None,
reference_example: Optional[Union[ls_schemas.Example, str, dict, uuid.UUID]] = None,
load_child_runs: bool = False
) -> ls_evaluator.EvaluationResult

| Name | Type | Description |
|---|---|---|
| run* | Union[Run, RunBase, str, UUID] | The run to evaluate. |
| evaluator* | RunEvaluator | The evaluator to use. |
| source_info | Optional[Dict[str, Any]] | Default: None. Additional information about the source of the evaluation to log as feedback metadata. |
| reference_example | Optional[Union[Example, str, dict, UUID]] | Default: None. The example to use as a reference for the evaluation. If not provided, the run's reference example will be used. |
| load_child_runs | bool | Default: False. Whether to load child runs when resolving the run ID. |
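
The calling convention in the signature (two positional arguments, then keyword-only options after the `*`) can be illustrated with a minimal sketch. Everything named here is hypothetical and not part of the library: `evaluate_with_metadata` is an illustrative wrapper, `RecordingClient` is a stand-in that only mirrors the documented signature so the snippet runs without network access, and the `source_info` contents are invented for the example.

```python
from typing import Any, Optional


def evaluate_with_metadata(
    client: Any,
    run_id: str,
    evaluator: Any,
    reference_example: Optional[Any] = None,
) -> Any:
    """Hypothetical helper: forward a run and evaluator to client.evaluate_run.

    `run` and `evaluator` are passed positionally; the remaining options
    are keyword-only in the documented signature, so they are named here.
    """
    return client.evaluate_run(
        run_id,
        evaluator,
        # Assumed metadata; logged as feedback metadata per the table above.
        source_info={"triggered_by": "nightly-check"},
        # None means the run's own reference example is used.
        reference_example=reference_example,
        # Ask the client to load child runs when resolving the run ID.
        load_child_runs=True,
    )


class RecordingClient:
    """Stand-in for a real client (an assumption for this sketch).

    It copies the documented parameter list and records each call
    instead of contacting any service.
    """

    def __init__(self) -> None:
        self.calls: list[dict[str, Any]] = []

    def evaluate_run(
        self,
        run: Any,
        evaluator: Any,
        *,
        source_info: Optional[dict[str, Any]] = None,
        reference_example: Optional[Any] = None,
        load_child_runs: bool = False,
    ) -> dict[str, Any]:
        self.calls.append(
            {
                "run": run,
                "evaluator": evaluator,
                "source_info": source_info,
                "reference_example": reference_example,
                "load_child_runs": load_child_runs,
            }
        )
        # Placeholder return value; the real method returns an EvaluationResult.
        return {"recorded": True}
```

With a real client, the same helper would presumably be called with a client instance and a `RunEvaluator` in place of the stand-ins; that usage is not shown because it needs credentials and an existing run.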