# aevaluate

> **Function** in `langsmith`

📖 [View in docs](https://reference.langchain.com/python/langsmith/evaluation/_arunner/aevaluate)

Evaluate an async target system on a given dataset.

## Signature

```python
aevaluate(
    target: Union[ATARGET_T, AsyncIterable[dict], Runnable, str, uuid.UUID, schemas.TracerSession],
    /,
    data: Union[DATA_T, AsyncIterable[schemas.Example], Iterable[schemas.Example], None] = None,
    evaluators: Optional[Sequence[Union[EVALUATOR_T, AEVALUATOR_T]]] = None,
    summary_evaluators: Optional[Sequence[SUMMARY_EVALUATOR_T]] = None,
    metadata: Optional[dict] = None,
    experiment_prefix: Optional[str] = None,
    description: Optional[str] = None,
    max_concurrency: Optional[int] = 0,
    num_repetitions: int = 1,
    client: Optional[langsmith.Client] = None,
    blocking: bool = True,
    experiment: Optional[Union[schemas.TracerSession, str, uuid.UUID]] = None,
    upload_results: bool = True,
    error_handling: Literal['log', 'ignore'] = 'log',
    **kwargs: Any,
) -> AsyncExperimentResults
```
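
A minimal sketch of a typical call, assuming a dataset named `"my-dataset"` exists in LangSmith (the target and evaluator below are hypothetical):

```python
import asyncio

from langsmith import aevaluate

# Hypothetical async target: maps an example's inputs dict to an outputs dict.
async def target(inputs: dict) -> dict:
    return {"answer": inputs["question"].strip().lower()}

# Hypothetical row-level evaluator using the classic (run, example) signature.
def exact_match(run, example):
    return {
        "key": "exact_match",
        "score": run.outputs["answer"] == example.outputs["answer"],
    }

async def main() -> None:
    await aevaluate(
        target,
        data="my-dataset",
        evaluators=[exact_match],
        experiment_prefix="aevaluate-quickstart",
    )

asyncio.run(main())
```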

## Description

**Environment:**

- `LANGSMITH_TEST_CACHE`: If set, API calls will be cached to disk to save time and
  cost during testing. Committing the cache files to your repository is recommended
  for faster CI/CD runs. Requires the `langsmith[vcr]` extra to be installed.
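
For example, the cache can be pointed at a directory checked into the repository (the path below is illustrative):

```python
import os

# Illustrative cache directory; any writable path works.
# Requires the VCR extra: pip install "langsmith[vcr]"
os.environ["LANGSMITH_TEST_CACHE"] = "tests/cassettes"
```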

!!! warning "Behavior changed in `langsmith` 0.2.0"

    The `max_concurrency` default was updated from `None` (no limit on concurrency)
    to `0` (no concurrency at all).
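
To opt back into parallel evaluation, pass `max_concurrency` explicitly. A sketch, reusing the `target` from the earlier example (run inside an async context; values are illustrative):

```python
# Up to 4 evaluations in flight at once:
results = await aevaluate(target, data="my-dataset", max_concurrency=4)

# Pre-0.2.0 behavior, i.e. no limit on concurrency:
results = await aevaluate(target, data="my-dataset", max_concurrency=None)
```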

## Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `target` | `AsyncCallable[[dict], dict] \| AsyncIterable[dict] \| Runnable \| EXPERIMENT_T \| Tuple[EXPERIMENT_T, EXPERIMENT_T]` | Yes | The target system or experiment(s) to evaluate. Can be an async function that takes a `dict` and returns a `dict`, a langchain `Runnable`, an existing experiment ID, or a two-tuple of experiment IDs. |
| `data` | `Union[DATA_T, AsyncIterable[schemas.Example]]` | No | The dataset to evaluate on. Can be a dataset name, a list of examples, an async generator of examples, or an async iterable of examples. (default: `None`) |
| `evaluators` | `Optional[Sequence[Union[EVALUATOR_T, AEVALUATOR_T]]]` | No | A list of evaluators to run on each example. (default: `None`) |
| `summary_evaluators` | `Optional[Sequence[SUMMARY_EVALUATOR_T]]` | No | A list of summary evaluators to run on the entire dataset. (default: `None`) |
| `metadata` | `Optional[dict]` | No | Metadata to attach to the experiment. (default: `None`) |
| `experiment_prefix` | `Optional[str]` | No | A prefix to provide for your experiment name. (default: `None`) |
| `description` | `Optional[str]` | No | A description of the experiment. (default: `None`) |
| `max_concurrency` | `Optional[int]` | No | The maximum number of concurrent evaluations to run. If `None`, no limit is set; if `0`, evaluations run serially. (default: `0`) |
| `num_repetitions` | `int` | No | The number of times to run the evaluation. Each item in the dataset will be run and evaluated this many times. (default: `1`) |
| `client` | `Optional[langsmith.Client]` | No | The LangSmith client to use. (default: `None`) |
| `blocking` | `bool` | No | Whether to block until the evaluation is complete. (default: `True`) |
| `experiment` | `Optional[Union[schemas.TracerSession, str, uuid.UUID]]` | No | An existing experiment to extend. If provided, `experiment_prefix` is ignored. For advanced usage only. (default: `None`) |
| `upload_results` | `bool` | No | Whether to upload the results of the evaluation to LangSmith. (default: `True`) |
| `error_handling` | `Literal['log', 'ignore']` | No | How to handle individual run errors. `'log'` will trace the runs with the error message as part of the experiment; `'ignore'` will not count the run as part of the experiment at all. (default: `'log'`) |
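
The sketch below illustrates `summary_evaluators`, `num_repetitions`, and `error_handling` together, reusing the hypothetical `target` from the first example (run inside an async context; the dataset name is assumed):

```python
from langsmith.schemas import Example, Run

# Hypothetical summary evaluator: computes dataset-level accuracy over all runs.
def accuracy(runs: list[Run], examples: list[Example]) -> dict:
    correct = sum(
        run.outputs["answer"] == example.outputs["answer"]
        for run, example in zip(runs, examples)
    )
    return {"key": "accuracy", "score": correct / len(runs)}

results = await aevaluate(
    target,
    data="my-dataset",             # assumed dataset name
    summary_evaluators=[accuracy],
    num_repetitions=2,             # each example is run and evaluated twice
    error_handling="ignore",       # failed runs are excluded from the experiment
)
```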

## Returns

`AsyncExperimentResults`

An async iterator over the experiment results.
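
Because the result is an async iterator, per-example rows can be consumed as they complete. A sketch reusing `target` and `exact_match` from the first example, assuming the SDK's `ExperimentResultRow` shape (`run`, `example`, `evaluation_results`):

```python
# Inside an async function:
results = await aevaluate(target, data="my-dataset", evaluators=[exact_match])
async for row in results:
    # Each row bundles the traced run, the dataset example, and evaluator feedback.
    print(row["run"].id, row["evaluation_results"]["results"])
```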

---

[View source on GitHub](https://github.com/langchain-ai/langsmith-sdk/blob/cc54cc2729bb1cfe402fd9f34cf953de60bde30c/python/langsmith/evaluation/_arunner.py#L74)