The types of the evaluators.
EvaluatorType()
Question answering evaluator, which grades answers to questions directly using an LLM.
Chain of thought question answering evaluator, which grades answers to questions using chain of thought 'reasoning'.
Question answering evaluator that incorporates 'context' in the response.
The pairwise string evaluator, which predicts which of two models' predictions is preferred.
The scored string evaluator, which gives a score between 1 and 10 to a prediction.
The labeled pairwise string evaluator, which predicts which of two models' predictions is preferred, based on a ground truth reference label.
The labeled scored string evaluator, which gives a score between 1 and 10 to a prediction based on a ground truth reference label.
The agent trajectory evaluator, which grades the agent's intermediate steps.
The criteria evaluator, which evaluates a model based on a custom set of criteria without any reference labels.
The labeled criteria evaluator, which evaluates a model based on a custom set of criteria, with a reference label.
Compare predictions to a reference answer using string edit distances.
Compare predictions to a reference answer using exact matching.
Compare predictions to a reference answer using regular expressions.
Compare two predictions to each other using string edit distances.
Compare a prediction to a reference label using embedding distance.
Compare two predictions using embedding distance.
Check if a prediction is valid JSON.
Check if a prediction is equal to a reference JSON.
Compute the edit distance between two JSON strings after canonicalization.
Check if a prediction is valid JSON according to a JSON schema.
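The non-LLM comparisons described above (exact match, regular expressions, string edit distance, embedding distance, and the JSON checks) can be illustrated with a small standard-library sketch. This is not the library's implementation: every function name below is hypothetical, the edit distance is assumed to be plain Levenshtein distance, the embedding distance is assumed to be cosine distance over already-computed vectors, and "canonicalization" is assumed to mean re-serializing with sorted keys and no extra whitespace.

```python
import json
import math
import re


def exact_match(prediction: str, reference: str) -> bool:
    # Hypothetical helper: strict string equality, no normalization.
    return prediction == reference


def regex_match(prediction: str, pattern: str) -> bool:
    # Assumption: the reference is treated as a regular expression
    # that may match anywhere in the prediction.
    return re.search(pattern, prediction) is not None


def edit_distance(a: str, b: str) -> int:
    # Levenshtein distance computed row by row (assumed metric).
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def cosine_distance(u: list[float], v: list[float]) -> float:
    # Assumed embedding metric: 1 minus cosine similarity.
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def is_valid_json(prediction: str) -> bool:
    # A prediction is valid JSON if the parser accepts it.
    try:
        json.loads(prediction)
    except json.JSONDecodeError:
        return False
    return True


def canonicalize(text: str) -> str:
    # Assumed canonical form: sorted keys, compact separators.
    return json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))


def json_equal(prediction: str, reference: str) -> bool:
    # Compare parsed values, so key order and whitespace are ignored.
    return json.loads(prediction) == json.loads(reference)


def json_edit_distance(prediction: str, reference: str) -> int:
    # Edit distance between the canonical serializations.
    return edit_distance(canonicalize(prediction), canonicalize(reference))
```

Comparing parsed values rather than raw strings means two JSON documents that differ only in key order or spacing are treated as equal, and have an edit distance of zero after canonicalization.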