ComparisonEvaluationResult()
BaseModel
The aspect, metric name, or label for this evaluation.
The scores for each run in the comparison.
The ID of the trace of the evaluator itself.
Comment for the scores. If a string, it's shared across all target runs.
If a dict, it maps run IDs to individual comments.
dict
Feedback scores for the results of comparative evaluations.
These are generated by functions that compare two or more runs, returning a ranking or other feedback.
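A minimal sketch of how such a result might be assembled by a comparison function. To stay self-contained it uses a standard-library dataclass as a stand-in rather than the real class. The field names (`key`, `scores`, `source_run_id`, `comment`) are assumed from the descriptions above, and `ComparisonResultSketch`, `comment_for`, and `prefer_shorter` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass
class ComparisonResultSketch:
    """Stand-in mirroring the fields described above (not the library class)."""

    key: str                                   # aspect, metric name, or label
    scores: Dict[str, Optional[float]]         # run ID -> score for that run
    source_run_id: Optional[str] = None        # trace ID of the evaluator itself
    comment: Optional[Union[str, Dict[str, str]]] = None

    def comment_for(self, run_id: str) -> Optional[str]:
        """Resolve the comment for one run: a dict is per-run, a string is shared."""
        if isinstance(self.comment, dict):
            return self.comment.get(run_id)
        return self.comment


def prefer_shorter(outputs: Dict[str, str]) -> ComparisonResultSketch:
    """Hypothetical evaluator: the run with the shortest output scores 1.0."""
    shortest = min(outputs, key=lambda run_id: len(outputs[run_id]))
    scores = {run_id: 1.0 if run_id == shortest else 0.0 for run_id in outputs}
    comments = {run_id: f"{len(text)} characters" for run_id, text in outputs.items()}
    return ComparisonResultSketch(key="conciseness", scores=scores, comment=comments)


result = prefer_shorter({"run-a": "short", "run-b": "a much longer answer"})
```

Here `result.scores` holds one entry per compared run, and `result.comment_for("run-a")` returns the per-run comment because `comment` was given as a dict. Passing a plain string instead would return the same comment for every run.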