Python · langsmith · evaluation · llm_evaluator · LLMEvaluator
Class · Since v0.1

LLMEvaluator

LLMEvaluator(
  self,
  *,
  prompt_template: Union[str, list[tuple[str, str]]],
  score_config: Union[CategoricalScoreConfig, ContinuousScoreConfig],
  map_variables: Optional[Callable[[Run, Optional[Example]], dict]] = None,
  model_name: str = 'gpt-4o',
  model_provider: str = 'openai',
  **kwargs
)

Bases

RunEvaluator

Parameters (* = required)

Name | Type | Default
prompt_template* | Union[str, list[tuple[str, str]]] | (none)
score_config* | Union[CategoricalScoreConfig, ContinuousScoreConfig] | (none)
map_variables | Optional[Callable[[Run, Optional[Example]], dict]] | None
model_name | str | 'gpt-4o'
model_provider | str | 'openai'

prompt_template: The prompt template to use for the evaluation. If a string is provided, it is treated as a human / user message.
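
The string-versus-list rule for prompt_template can be pictured with a small sketch. The helper name normalize_prompt_template and the "human" and "system" role labels are assumptions chosen for illustration, not part of the langsmith API. The sketch only mirrors the documented behavior that a bare string is treated as a single human / user message.

```python
from typing import Union

PromptTemplate = Union[str, list[tuple[str, str]]]


def normalize_prompt_template(prompt_template: PromptTemplate) -> list[tuple[str, str]]:
    """Return the template as a list of (role, text) pairs.

    Hypothetical helper: a bare string becomes one human / user message,
    while a list of pairs is passed through unchanged.
    """
    if isinstance(prompt_template, str):
        return [("human", prompt_template)]
    return list(prompt_template)


# A plain string is wrapped as a single human message.
single = normalize_prompt_template("Is {output} a correct answer to {input}?")

# A list of (role, text) pairs is kept as given.
multi = normalize_prompt_template(
    [
        ("system", "You are a strict grader."),
        ("human", "Is {output} a correct answer to {input}?"),
    ]
)
```

Either form ends up as a sequence of role-tagged messages, which is why the two types are interchangeable in the constructor signature.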
constructor
__init__

Name | Type
prompt_template | Union[str, list[tuple[str, str]]]
score_config | Union[CategoricalScoreConfig, ContinuousScoreConfig]
map_variables | Optional[Callable[[Run, Optional[Example]], dict]]
model_name | str
model_provider | str
method
from_model

Create an LLMEvaluator instance from a BaseChatModel instance.

method
evaluate_run

Evaluate a run.

method
aevaluate_run

Asynchronously evaluate a run.

A class for building LLM-as-a-judge evaluators.

Deprecated since version 0.5.0: LLMEvaluator is deprecated. Use openevals instead: https://github.com/langchain-ai/openevals

score_config: The configuration for the score, either categorical or continuous.

map_variables: A function that maps the run and example to the variables in the prompt. If None, the prompt is assumed to require only 'input', 'output', and 'expected'.

model_name: The model to use for the evaluation.

model_provider: The model provider to use for the evaluation.
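
As a rough sketch of what a map_variables callable might look like: it receives the run and the (optional) example and returns a dict whose keys match the placeholders in the prompt. The SimpleNamespace objects below are stand-ins for Run and Example so the snippet runs without langsmith installed. The "question" and "answer" keys inside inputs and outputs are assumptions for illustration, as is the choice to return an empty string when no example is supplied.

```python
from types import SimpleNamespace
from typing import Any, Optional


def map_variables(run: Any, example: Optional[Any] = None) -> dict:
    """Map a run (and optional example) to the prompt's variables."""
    return {
        "input": run.inputs["question"],
        "output": run.outputs["answer"],
        # With no reference example there is nothing to compare against.
        "expected": example.outputs["answer"] if example is not None else "",
    }


# Stand-ins for langsmith's Run and Example (assumed shape, not the real classes).
run = SimpleNamespace(inputs={"question": "2 + 2?"}, outputs={"answer": "4"})
example = SimpleNamespace(inputs={"question": "2 + 2?"}, outputs={"answer": "4"})

# The prompt uses the three default variable names mentioned above.
prompt = "Question: {input}\nAnswer: {output}\nReference: {expected}"
rendered = prompt.format(**map_variables(run, example))
```

Because the second argument is typed Optional[Example], a mapping function should handle the case where no example is available, as the sketch does.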