The `expect` API lets you make approximate assertions on test results and log scores as feedback to LangSmith. Use it within test cases decorated with `@pytest.mark.langsmith`.
- `expect.embedding_distance(prediction, reference)` — Compare texts using embedding similarity. Returns a matcher.
- `expect.edit_distance(prediction, reference)` — Compare texts using edit distance. Returns a matcher.
- `expect.value(val)` — Wrap a value for assertions like `.to_contain()`, `.to_equal()`, or `.against(fn)`.
- `expect.score(score, key=...)` — Log a numeric score. Optionally chain assertions.

All matchers support:

- `.to_be_less_than(threshold)`
- `.to_be_greater_than(threshold)`
- `.to_be_between(min, max)`
- `.to_be_approximately(target, precision)`
- `.to_equal(value)`
- `.to_be_none()`
- `.to_contain(substring)`
- `.against(fn)` — custom assertion callable

```python
import pytest
from langsmith import expect
from openai import OpenAI

oai_client = OpenAI()

@pytest.mark.langsmith
def test_output_semantically_close():
    response = oai_client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Say hello!"},
        ],
    )
    response_txt = response.choices[0].message.content

    # Compare with embedding distance
    expect.embedding_distance(
        prediction=response_txt,
        reference="Hello!",
    ).to_be_less_than(0.9)

    # Compare with edit distance
    expect.edit_distance(
        prediction=response_txt,
        reference="Hello!",
    ).to_be_less_than(1)

    # Assert on values directly
    expect.value(response_txt).to_contain("Hello!")
    expect.value(response_txt).against(lambda x: "Hello" in x)

    # Log scores
    expect.score(0.8)
    expect.score(0.8, key="similarity").to_be_greater_than(0.7)
```
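For intuition about what the embedding comparison measures, a common choice of metric is cosine distance between the two texts' embedding vectors: near 0 when the vectors point the same way, larger as they diverge. This is only a sketch of that metric; the actual encoder and distance used by `expect.embedding_distance` depend on your configuration.

```python
import math

def cosine_distance(u: list[float], v: list[float]) -> float:
    """1 - cosine similarity: ~0.0 for aligned vectors, up to 2.0 for opposite ones."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norm

# Aligned vectors → distance near 0; orthogonal vectors → distance near 1.
print(cosine_distance([1.0, 0.0], [2.0, 0.0]))  # ~0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # ~1.0
```

This is why a threshold like `.to_be_less_than(0.9)` is a loose semantic check: it only rules out texts whose embeddings point in clearly different directions.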
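Likewise, the edit-distance comparison counts character-level insertions, deletions, and substitutions between the prediction and the reference. The sketch below implements plain Levenshtein distance for intuition; whether `expect.edit_distance` normalizes the result or uses a variant metric depends on your langsmith version and configuration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j]
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # 3
print(levenshtein("Hello!", "Hello!"))   # 0
```

On raw counts, an assertion like `.to_be_less_than(1)` would demand an exact match (distance 0); with a normalized metric it only excludes completely dissimilar strings, so check which form your version returns before picking a threshold.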