It is more general than a vector store. A retriever does not need to be able to store documents, only to return (or retrieve) them. Vector stores can be used as the backbone of a retriever, but there are other types of retrievers as well.
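As a rough illustration of that distinction, the sketch below uses only the standard library. `KeywordRetriever`, `keyword_search`, and `CORPUS` are hypothetical names invented for this example and are not part of the LangChain API. The retriever holds no documents of its own; it only delegates to a search function.

```python
from typing import Callable, List

# Hypothetical external source; the retriever itself never stores these.
CORPUS = [
    "LangChain has retrievers",
    "Vector stores hold embeddings",
    "Bananas are yellow",
]


def keyword_search(query: str) -> List[str]:
    """Return every corpus entry that shares at least one word with the query."""
    terms = set(query.lower().split())
    return [doc for doc in CORPUS if terms & set(doc.lower().split())]


class KeywordRetriever:
    """Toy retriever: it can return documents but has no way to store them."""

    def __init__(self, search_fn: Callable[[str], List[str]]):
        self._search_fn = search_fn

    def retrieve(self, query: str) -> List[str]:
        return self._search_fn(query)


retriever = KeywordRetriever(keyword_search)
results = retriever.retrieve("vector stores")
```

Swapping `keyword_search` for a function that queries a vector store would give a vector-store-backed retriever without changing the `retrieve` interface.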

module

load

Serialization and deserialization.

module

chat_loaders

Chat Loaders load chat messages from common communications platforms.

Load chat messages from various communications platforms such as Facebook Messenger, Telegram, and WhatsApp. The loaded chat messages can be used for fine-tuning models.

module

chat_models

Chat Models are a variation on language models.

While Chat Models use language models under the hood, the interface they expose is a bit different. Rather than expose a "text in, text out" API, they expose an interface where "chat messages" are the inputs and outputs.
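The difference between the two interfaces can be sketched with the standard library alone. `Message`, `text_model`, and `chat_model` are hypothetical stand-ins, not LangChain classes, and the "model" here simply upper-cases its prompt.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    role: str
    content: str


def text_model(prompt: str) -> str:
    """Stand-in for a "text in, text out" language model."""
    return prompt.upper()


def chat_model(messages: List[Message]) -> Message:
    """Wrap the text model so that chat messages are the inputs and outputs."""
    prompt = "\n".join(f"{m.role}: {m.content}" for m in messages)
    return Message(role="assistant", content=text_model(prompt))


reply = chat_model([Message("system", "Be brief."), Message("user", "hi")])
```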

module

output_parsers

OutputParser classes parse the output of an LLM call.
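A minimal sketch of the idea, assuming the model was asked to answer with a comma-separated list. `CommaSeparatedParser` is a hypothetical class written for this example, not the library's implementation.

```python
from typing import List


class CommaSeparatedParser:
    """Turn raw model text such as "a, b, c" into a list of clean strings."""

    def parse(self, text: str) -> List[str]:
        # Drop surrounding whitespace and any empty trailing items.
        return [item.strip() for item in text.split(",") if item.strip()]


parsed = CommaSeparatedParser().parse("red, green ,blue,")
```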

module

tools

Tools are classes that an Agent uses to interact with the world.

Each tool has a description. The Agent uses the description to choose the right tool for the job.
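The sketch below illustrates description-based selection using only the standard library. In a real agent a language model reads the descriptions; here a simple word-overlap score stands in for that reasoning, and `Tool` and `choose_tool` are hypothetical names for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tool:
    name: str
    description: str
    func: Callable[[str], str]


def choose_tool(task: str, tools: List[Tool]) -> Tool:
    """Pick the tool whose description shares the most words with the task."""
    task_words = set(task.lower().split())
    return max(
        tools,
        key=lambda t: len(task_words & set(t.description.lower().split())),
    )


tools = [
    Tool("calculator", "evaluate arithmetic expressions with numbers", lambda q: "4"),
    Tool("search", "look up current facts on the web", lambda q: "no results"),
]
chosen = choose_tool("look up facts about llamas", tools)
```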

module

utilities

Utilities are the integrations with third-party systems and packages.

Other LangChain classes use Utilities to interact with third-party systems and packages.

module

graphs

Graphs provide a natural language interface to graph databases.

module

smith

LangSmith utilities.

This module provides utilities for connecting to LangSmith.

Evaluation

LangSmith helps you evaluate Chains and other language model application components using a number of LangChain evaluators. An example of this is shown below, assuming you've created a LangSmith dataset called <my_dataset_name>:

from langsmith import Client
from langchain_openai import ChatOpenAI
from langchain_classic.chains import LLMChain
from langchain_classic.smith import RunEvalConfig, run_on_dataset

# Chains may have memory. Passing in a constructor function lets the
# evaluation framework avoid cross-contamination between runs.
def construct_chain():
    model = ChatOpenAI(temperature=0)
    chain = LLMChain.from_string(model, "What's the answer to {your_input_key}")
    return chain

# Load off-the-shelf evaluators via config or the EvaluatorType (string or enum)
evaluation_config = RunEvalConfig(
    evaluators=[
        "qa",  # "Correctness" against a reference answer
        "embedding_distance",
        RunEvalConfig.Criteria("helpfulness"),
        RunEvalConfig.Criteria(
            {
                "fifth-grader-score": "Do you have to be smarter than a fifth "
                "grader to answer this question?"
            }
        ),
    ]
)

client = Client()
run_on_dataset(
    client,
    "<my_dataset_name>",
    construct_chain,
    evaluation=evaluation_config,
)

You can also create custom evaluators by subclassing the StringEvaluator <langchain.evaluation.schema.StringEvaluator> or LangSmith's RunEvaluator classes.

from typing import Optional
from langchain_classic.evaluation import StringEvaluator

class MyStringEvaluator(StringEvaluator):
    @property
    def requires_input(self) -> bool:
        return False

    @property
    def requires_reference(self) -> bool:
        return True

    @property
    def evaluation_name(self) -> str:
        return "exact_match"

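    def _evaluate_strings(
        self,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs,
    ) -> dict:
        # The score is True only when the prediction equals the reference exactly.
        return {"score": prediction == reference}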

evaluation_config = RunEvalConfig(
    custom_evaluators=[MyStringEvaluator()],
)

run_on_dataset(
    client,
    "<my_dataset_name>",
    construct_chain,
    evaluation=evaluation_config,
)

Primary Functions

arun_on_dataset <langchain.smith.evaluation.runner_utils.arun_on_dataset>: Asynchronous function to evaluate a chain, agent, or other LangChain component over a dataset.
run_on_dataset <langchain.smith.evaluation.runner_utils.run_on_dataset>: Function to evaluate a chain, agent, or other LangChain component over a dataset.
RunEvalConfig <langchain.smith.evaluation.config.RunEvalConfig>: Class representing the configuration for running evaluation. You can select evaluators by EvaluatorType <langchain.evaluation.schema.EvaluatorType> or config, or you can pass in custom_evaluators.

module

agents

An Agent is a class that uses an LLM to choose a sequence of actions to take.

In Chains, a sequence of actions is hardcoded. In Agents, a language model is used as a reasoning engine to determine which actions to take and in which order.

Agents select and use Tools and Toolkits for actions.
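The contrast between a hardcoded sequence and a model-chosen sequence can be sketched as follows. This is a standard-library toy: `toy_policy` stands in for the language model's reasoning, and all names are hypothetical rather than LangChain APIs.

```python
from typing import Callable, Dict, List

ACTIONS: Dict[str, Callable[[int], int]] = {
    "double": lambda x: x * 2,
    "increment": lambda x: x + 1,
}


def run_chain(value: int) -> int:
    """Chain: the order of actions is fixed in code."""
    for name in ["double", "increment"]:
        value = ACTIONS[name](value)
    return value


def toy_policy(value: int, target: int) -> str:
    """Stand-in for the LLM: decide the next action from the current state."""
    return "double" if value * 2 <= target else "increment"


def run_agent(value: int, target: int, max_steps: int = 10) -> List[str]:
    """Agent: the policy picks each action until the target is reached."""
    trace: List[str] = []
    while value < target and len(trace) < max_steps:
        action = toy_policy(value, target)
        value = ACTIONS[action](value)
        trace.append(action)
    return trace


chain_result = run_chain(3)
trace_to_seven = run_agent(3, 7)
trace_to_five = run_agent(3, 5)
```

The chain always runs the same two steps, while the agent's trace depends on the goal it is given.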

module

docstore

Docstores are classes to store and load Documents.

The Docstore is a simplified version of the Document Loader.

module

document_transformers

Document Transformers are classes to transform Documents.

Document Transformers are usually used to transform many Documents in a single run.

module

prompts

A Prompt is the input to the model.

A Prompt is often constructed from multiple components. Prompt classes and functions make constructing and working with prompts easy.

module

chains

Chains are easily reusable components linked together.

Chains encode a sequence of calls to components like models, document retrievers, other Chains, etc., and provide a simple interface to this sequence.

The Chain interface makes it easy to create apps that are:

- **Stateful:** add Memory to any Chain to give it state,
- **Observable:** pass Callbacks to a Chain to execute additional functionality,
    like logging, outside the main sequence of component calls,
- **Composable:** combine Chains with other components, including other Chains.
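The observable and composable properties listed above can be sketched with a small standard-library class. `SimpleChain` and the step functions are hypothetical names for illustration only; the sketch does not cover Memory and does not reproduce the actual Chain interface.

```python
from typing import Callable, List, Optional, Tuple

Step = Callable[[str], str]
Callback = Callable[[str, str], None]


class SimpleChain:
    """Run steps in order; notify callbacks after each step."""

    def __init__(self, steps: List[Step], callbacks: Optional[List[Callback]] = None):
        self.steps = steps
        self.callbacks = callbacks or []

    def __call__(self, text: str) -> str:
        for step in self.steps:
            text = step(text)
            # Functions carry __name__; chain instances fall back to the class name.
            name = getattr(step, "__name__", type(step).__name__)
            for callback in self.callbacks:
                callback(name, text)
        return text


events: List[Tuple[str, str]] = []


def log_event(step_name: str, output: str) -> None:
    events.append((step_name, output))


def strip_text(text: str) -> str:
    return text.strip()


def shout(text: str) -> str:
    return text.upper()


def exclaim(text: str) -> str:
    return text + "!"


inner = SimpleChain([strip_text, shout])
# A chain is itself a valid step, so chains compose with other chains.
outer = SimpleChain([inner, exclaim], callbacks=[log_event])
result = outer("  hello ")
```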

module

callbacks

Callback handlers allow listening to events in LangChain.

module

vectorstores

A vector store stores embedded data and performs vector search.

One of the most common ways to store and search over unstructured data is to embed it and store the resulting embedding vectors, and then query the store and retrieve the data that are 'most similar' to the embedded query.
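The embed, store, and query flow can be sketched with the standard library. This is an assumption-laden toy: `embed` uses word counts rather than a learned embedding model, and `ToyVectorStore` is a hypothetical class, not a LangChain vector store.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, int]:
    """Toy embedding: a sparse bag-of-words count vector."""
    return dict(Counter(text.lower().split()))


def cosine(a: Dict[str, int], b: Dict[str, int]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    def __init__(self) -> None:
        self._items: List[Tuple[str, Dict[str, int]]] = []

    def add(self, text: str) -> None:
        # Store the text alongside its embedding.
        self._items.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> List[str]:
        # Embed the query, then rank stored items by similarity to it.
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
for text in ["cats chase mice", "dogs chase cats", "stocks fell sharply"]:
    store.add(text)
top = store.search("mice and cats", k=1)
```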

module

memory

Memory maintains Chain state, incorporating context from past runs.

module

indexes

Indexes.

Index is used to avoid writing duplicated content into the vectorstore and to avoid overwriting content if it's unchanged.

Indexes also:

Create knowledge graphs from data.
Support indexing workflows from LangChain data loaders to vectorstores.

Importantly, Index keeps on working even if the content being written is derived via a set of transformations from some source content (e.g., indexing child documents that were derived from parent documents by chunking).
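One way to picture the deduplication behavior is content hashing, sketched below with the standard library. This is an assumption about the general technique, not a description of how the library implements it; `ToyIndex` is a hypothetical class and its dictionary stands in for the vectorstore.

```python
import hashlib
from typing import Dict, List


class ToyIndex:
    """Skip any text whose content hash has already been written."""

    def __init__(self) -> None:
        self._seen: Dict[str, str] = {}  # content hash -> text

    def index(self, texts: List[str]) -> Dict[str, int]:
        added = skipped = 0
        for text in texts:
            key = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if key in self._seen:
                skipped += 1  # unchanged content is not rewritten
            else:
                self._seen[key] = text
                added += 1
        return {"added": added, "skipped": skipped}


index = ToyIndex()
first = index.index(["chunk a", "chunk b"])
second = index.index(["chunk a", "chunk b", "chunk c"])
```

Because the key depends only on the final text, it makes no difference whether a chunk came straight from a loader or was derived from a parent document by splitting.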

module

llms

LLMs.

LLM classes provide access to the large language model (LLM) APIs and services.

module

evaluation

Evaluation chains for grading LLM and Chain outputs.

This module contains off-the-shelf evaluation chains for grading the output of LangChain primitives such as language models and chains.

Loading an evaluator

To load an evaluator, you can use the load_evaluators <langchain.evaluation.loading.load_evaluators> or load_evaluator <langchain.evaluation.loading.load_evaluator> functions with the names of the evaluators to load.

from langchain_classic.evaluation import load_evaluator

evaluator = load_evaluator("qa")
evaluator.evaluate_strings(
    prediction="We sold more than 40,000 units last week",
    input="How many units did we sell last week?",
    reference="We sold 32,378 units",
)

The evaluator must be one of EvaluatorType <langchain.evaluation.schema.EvaluatorType>.

Datasets

To load one of the LangChain HuggingFace datasets, you can use the load_dataset <langchain.evaluation.loading.load_dataset> function with the name of the dataset to load.

from langchain_classic.evaluation import load_dataset

ds = load_dataset("llm-math")

Some common use cases for evaluation include:

Grading the accuracy of a response against ground truth answers: QAEvalChain <langchain.evaluation.qa.eval_chain.QAEvalChain>
Comparing the output of two models: PairwiseStringEvalChain <langchain.evaluation.comparison.eval_chain.PairwiseStringEvalChain> or LabeledPairwiseStringEvalChain <langchain.evaluation.comparison.eval_chain.LabeledPairwiseStringEvalChain> when there is additionally a reference label.
Judging the efficacy of an agent's tool usage: TrajectoryEvalChain <langchain.evaluation.agents.trajectory_eval_chain.TrajectoryEvalChain>
Checking whether an output complies with a set of criteria: CriteriaEvalChain <langchain.evaluation.criteria.eval_chain.CriteriaEvalChain> or LabeledCriteriaEvalChain <langchain.evaluation.criteria.eval_chain.LabeledCriteriaEvalChain> when there is additionally a reference label.
Computing semantic difference between a prediction and reference: EmbeddingDistanceEvalChain <langchain.evaluation.embedding_distance.base.EmbeddingDistanceEvalChain> or between two predictions: PairwiseEmbeddingDistanceEvalChain <langchain.evaluation.embedding_distance.base.PairwiseEmbeddingDistanceEvalChain>
Measuring the string distance between a prediction and reference: StringDistanceEvalChain <langchain.evaluation.string_distance.base.StringDistanceEvalChain> or between two predictions: PairwiseStringDistanceEvalChain <langchain.evaluation.string_distance.base.PairwiseStringDistanceEvalChain>
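To make the string-distance use case in the list above concrete, here is a standard-library sketch of one possible metric, Levenshtein edit distance normalized by the longer string's length. The choice of metric and the function names are assumptions for illustration; the evaluation chains may use a different metric or normalization.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        cur = [i]
        for j, char_b in enumerate(b, start=1):
            cur.append(
                min(
                    prev[j] + 1,  # deletion
                    cur[j - 1] + 1,  # insertion
                    prev[j - 1] + (char_a != char_b),  # substitution
                )
            )
        prev = cur
    return prev[-1]


def normalized_distance(prediction: str, reference: str) -> float:
    """Scale the edit distance to [0, 1]; 0.0 means the strings are identical."""
    longest = max(len(prediction), len(reference))
    return levenshtein(prediction, reference) / longest if longest else 0.0


score = normalized_distance("kitten", "sitting")
```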

Low-level API

These evaluators implement one of the following interfaces:

StringEvaluator <langchain.evaluation.schema.StringEvaluator>: Evaluate a prediction string against a reference label and/or input context.
PairwiseStringEvaluator <langchain.evaluation.schema.PairwiseStringEvaluator>: Evaluate two prediction strings against each other. Useful for scoring preferences, measuring similarity between two chains or LLM agents, or comparing outputs on similar inputs.
AgentTrajectoryEvaluator <langchain.evaluation.schema.AgentTrajectoryEvaluator>: Evaluate the full sequence of actions taken by an agent.

These interfaces make the evaluators easier to compose and to use within a higher-level evaluation framework.
