testing
LangSmith pytest testing module.
| FUNCTION | DESCRIPTION |
|---|---|
| log_feedback | Log run feedback from within a pytest test run. |
| log_inputs | Log run inputs from within a pytest test run. |
| log_outputs | Log run outputs from within a pytest test run. |
| log_reference_outputs | Log example reference outputs from within a pytest test run. |
| trace_feedback | Trace the computation of a pytest run feedback as its own run. |
log_feedback
log_feedback(
    feedback: dict | list[dict] | None = None,
    /,
    *,
    key: str,
    score: int | bool | float | None = None,
    value: str | int | float | bool | None = None,
    **kwargs: Any,
) -> None
Log run feedback from within a pytest test run.
Warning
This API is in beta and might change in future versions.
Should only be used in pytest tests decorated with @pytest.mark.langsmith.
| PARAMETER | DESCRIPTION |
|---|---|
| key | Feedback name. TYPE: str |
| score | Numerical feedback value. TYPE: int \| bool \| float \| None |
| value | Categorical feedback value. TYPE: str \| int \| float \| bool \| None |
| kwargs | Any other Client.create_feedback args. TYPE: Any |
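A minimal sketch of logging feedback from a test decorated with @pytest.mark.langsmith. The feedback keys ("exact_match", "grade") and the hard-coded values are illustrative, not part of the API:

import pytest
from langsmith import testing as t


@pytest.mark.langsmith
def test_exact_match_feedback():
    expected = "Paris"
    actual = "Paris"  # stand-in for your application's output

    # Numerical feedback via `score` (bool/int/float are all accepted)
    t.log_feedback(key="exact_match", score=actual == expected)

    # Categorical feedback via `value`
    t.log_feedback(key="grade", value="pass" if actual == expected else "fail")

    assert actual == expected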
log_inputs
log_inputs(inputs: dict) -> None
Log run inputs from within a pytest test run.
log_outputs
log_outputs(outputs: dict) -> None
Log run outputs from within a pytest test run.
log_reference_outputs
log_reference_outputs(reference_outputs: dict) -> None
Log example reference outputs from within a pytest test run.
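The three log_* helpers above are typically used together: inputs and outputs populate the test-case run, while reference outputs are stored on the corresponding example for evaluators to compare against. A minimal sketch, where format_greeting stands in for your own application code:

import pytest
from langsmith import testing as t


def format_greeting(name: str) -> str:
    # Stand-in for the application code under test.
    return f"Hello, {name}!"


@pytest.mark.langsmith
def test_format_greeting():
    name = "Ada"
    t.log_inputs({"name": name})

    greeting = format_greeting(name)
    t.log_outputs({"greeting": greeting})

    # Reference outputs are attached to the example, not the run.
    t.log_reference_outputs({"greeting": "Hello, Ada!"})

    assert greeting == "Hello, Ada!"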
trace_feedback
Trace the computation of a pytest run feedback as its own run.
Warning
This API is in beta and might change in future versions.
| PARAMETER | DESCRIPTION |
|---|---|
| name | Feedback run name. Defaults to "Feedback". TYPE: str |
Example
import openai
import pytest
from langsmith import testing as t
from langsmith import wrappers
oai_client = wrappers.wrap_openai(openai.Client())
@pytest.mark.langsmith
def test_openai_says_hello():
    # Traced code will be included in the test case
    text = "Say hello!"
    response = oai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": text},
        ],
    )
    t.log_inputs({"text": text})
    t.log_outputs({"response": response.choices[0].message.content})
    t.log_reference_outputs({"response": "hello!"})

    # Use this context manager to trace any steps used for generating evaluation
    # feedback separately from the main application logic
    with t.trace_feedback():
        grade = oai_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {
                    "role": "system",
                    "content": "Return 1 if 'hello' is in the user message and 0 otherwise.",
                },
                {
                    "role": "user",
                    "content": response.choices[0].message.content,
                },
            ],
        )
        # Make sure to log relevant feedback within the context for the
        # trace to be associated with this feedback.
        t.log_feedback(
            key="llm_judge", score=float(grade.choices[0].message.content)
        )

    assert "hello" in response.choices[0].message.content.lower()