    langchain_classic.smith.evaluation.runner_utils
    Module · Since v1.0

    runner_utils

    Utilities for running language models or Chains over datasets.

    Attributes

    attribute
    logger

    Functions

    function
    load_evaluator

    Load the requested evaluation chain specified by a string.

    Parameters

    evaluator : EvaluatorType
        The type of evaluator to load.
    llm : BaseLanguageModel, optional
        The language model to use for evaluation, by default None.
    **kwargs : Any
        Additional keyword arguments to pass to the evaluator.

    Returns:

    Chain
        The loaded evaluation chain.

    Examples:

    from langchain_classic.evaluation import load_evaluator, EvaluatorType

    evaluator = load_evaluator(EvaluatorType.QA)
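
    A slightly fuller sketch showing the llm parameter documented above; the criteria keyword is an assumption (it mirrors the RunEvalConfig.Criteria usage later on this page) and the model choice is a placeholder:

    from langchain_openai import ChatOpenAI
    from langchain_classic.evaluation import EvaluatorType, load_evaluator

    # Load a criteria evaluator with an explicit model and an evaluator-specific kwarg.
    evaluator = load_evaluator(
        EvaluatorType.CRITERIA,
        llm=ChatOpenAI(temperature=0),
        criteria="helpfulness",  # assumed evaluator-specific keyword argument
    )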

    function
    arun_on_dataset

    Run on dataset asynchronously.

    Asynchronously run the Chain or language model on a dataset and store traces to the specified project name.

    For the synchronous version of this function, see run_on_dataset.
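
    A minimal async sketch mirroring the call shape of the run_on_dataset example further down this page, assuming arun_on_dataset is exported from langchain_classic.smith alongside run_on_dataset; the dataset name is a placeholder and the model/evaluator choices are assumptions:

    import asyncio

    from langsmith import Client
    from langchain_openai import ChatOpenAI
    from langchain_classic.smith import EvaluatorType, RunEvalConfig, arun_on_dataset

    async def main() -> None:
        # A bare model works here; a zero-argument chain factory is equally valid.
        results = await arun_on_dataset(
            Client(),
            "<my_dataset_name>",
            ChatOpenAI(temperature=0),
            evaluation=RunEvalConfig(evaluators=[EvaluatorType.QA]),
        )

    asyncio.run(main())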

    function
    run_on_dataset

    Run on dataset.

    Run the Chain or language model on a dataset and store traces to the specified project name.

    For the (usually faster) async version of this function, see arun_on_dataset.
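
    A compact sketch; a fuller, factory-based example appears in the smith_eval module section below. The dataset name is a placeholder and the project_name keyword is an assumption:

    from langsmith import Client
    from langchain_openai import ChatOpenAI
    from langchain_classic.smith import EvaluatorType, RunEvalConfig, run_on_dataset

    results = run_on_dataset(
        Client(),
        "<my_dataset_name>",
        ChatOpenAI(temperature=0),  # a bare model; see MODEL_OR_CHAIN_FACTORY below
        evaluation=RunEvalConfig(evaluators=[EvaluatorType.QA]),
        project_name="qa-baseline",  # assumed keyword for naming the traced project
    )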

    Classes

    class
    Chain

    Abstract base class for creating structured sequences of calls to components.

    Chains should be used to encode a sequence of calls to components like models, document retrievers, other chains, etc., and provide a simple interface to this sequence.
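
    A minimal illustrative subclass, assuming Chain is importable from langchain_classic.chains.base and follows the usual input_keys / output_keys / _call contract:

    from typing import Any

    from langchain_classic.chains.base import Chain  # import path assumed

    class EchoChain(Chain):
        """Toy chain that echoes its single input key."""

        @property
        def input_keys(self) -> list[str]:
            return ["text"]

        @property
        def output_keys(self) -> list[str]:
            return ["echo"]

        def _call(self, inputs: dict[str, Any], run_manager: Any = None) -> dict[str, str]:
            return {"echo": inputs["text"]}

    # invoke() typically returns inputs merged with outputs: {'text': 'hi', 'echo': 'hi'}
    EchoChain().invoke({"text": "hi"})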

    class
    EvaluatorType

    The types of evaluators.
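
    A short sketch, assuming EvaluatorType is a string-valued enum; the literal values shown are assumptions based on the members used elsewhere on this page:

    from langchain_classic.evaluation import EvaluatorType

    # Members can be passed wherever an evaluator name is expected.
    assert EvaluatorType.QA.value == "qa"
    assert EvaluatorType.EMBEDDING_DISTANCE.value == "embedding_distance"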

    class
    PairwiseStringEvaluator

    Compare the output of two models (or two outputs of the same model).
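
    A hedged sketch of the pairwise interface, assuming load_evaluator can build a pairwise string evaluator and that the comparison method is evaluate_string_pairs; model and inputs are placeholders:

    from langchain_openai import ChatOpenAI
    from langchain_classic.evaluation import EvaluatorType, load_evaluator

    evaluator = load_evaluator(
        EvaluatorType.PAIRWISE_STRING, llm=ChatOpenAI(temperature=0)
    )
    result = evaluator.evaluate_string_pairs(
        prediction="Paris",
        prediction_b="Lyon",
        input="What is the capital of France?",
    )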

    class
    StringEvaluator

    String evaluator interface.

    Grade, tag, or otherwise evaluate predictions relative to their inputs and/or reference labels.
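
    A hedged sketch of the single-string interface, assuming the grading method is evaluate_strings and that the QA evaluator expects a prediction, the original input, and a reference answer:

    from langchain_openai import ChatOpenAI
    from langchain_classic.evaluation import EvaluatorType, load_evaluator

    qa_evaluator = load_evaluator(EvaluatorType.QA, llm=ChatOpenAI(temperature=0))
    graded = qa_evaluator.evaluate_strings(
        prediction="4",
        input="What is 2 + 2?",
        reference="4",
    )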

    class
    InputFormatError

    Raised when the input format is invalid.

    class
    TestResult

    A dictionary of the results of a single test run.

    class
    EvalError

    The evaluated architecture (the chain or model under test) raised an error.

    class
    ChatModelInput

    Input for a chat model.

    Type Aliases

    typeAlias
    MODEL_OR_CHAIN_FACTORY: Callable[[], Chain | Runnable] | BaseLanguageModel | Callable[[dict], Any] | Runnable | Chain
    typeAlias
    MCF: Callable[[], Chain | Runnable] | BaseLanguageModel
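
    MODEL_OR_CHAIN_FACTORY describes what run_on_dataset and arun_on_dataset accept as their chain-or-model argument (MCF appears to be the narrower internal form). A short illustration; the factory form is typically preferred when the chain carries per-run state such as memory:

    from langchain_openai import ChatOpenAI
    from langchain_classic.chains import LLMChain

    # Either of these satisfies MODEL_OR_CHAIN_FACTORY:
    model = ChatOpenAI(temperature=0)  # a BaseLanguageModel instance

    def chain_factory():  # a zero-argument factory returning a fresh Chain per run
        return LLMChain.from_string(
            ChatOpenAI(temperature=0), "Answer briefly: {question}"
        )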

    Modules

    module
    smith_eval

    LangSmith evaluation utilities.

    This module provides utilities for evaluating Chains and other language model applications using LangChain evaluators and LangSmith.

    For more information on the LangSmith API, see the LangSmith API documentation.

    Example

    from langsmith import Client
    from langchain_openai import ChatOpenAI
    from langchain_classic.chains import LLMChain
    from langchain_classic.smith import EvaluatorType, RunEvalConfig, run_on_dataset
    
    def construct_chain():
        model = ChatOpenAI(temperature=0)
        chain = LLMChain.from_string(model, "What's the answer to {your_input_key}")
        return chain
    
    evaluation_config = RunEvalConfig(
        evaluators=[
            EvaluatorType.QA,  # "Correctness" against a reference answer
            EvaluatorType.EMBEDDING_DISTANCE,
            RunEvalConfig.Criteria("helpfulness"),
            RunEvalConfig.Criteria(
                {
                    "fifth-grader-score": "Do you have to be smarter than a fifth "
                    "grader to answer this question?"
                }
            ),
        ]
    )
    
    client = Client()
    run_on_dataset(
        client, "<my_dataset_name>", construct_chain, evaluation=evaluation_config
    )

    Attributes

    • arun_on_dataset: Asynchronous function to evaluate a chain or other LangChain component over a dataset.
    • run_on_dataset: Function to evaluate a chain or other LangChain component over a dataset.
    • RunEvalConfig: Class representing the configuration for running evaluation.
    • StringRunEvaluatorChain: Class representing a string run evaluator chain.
    • InputFormatError: Exception raised when the input format is incorrect.
    module
    smith_eval_config

    Configuration for run evaluators.

    module
    name_generation
    module
    progress

    A simple progress bar for the console.
