Method · Since v0.0

upload_dataframe

Upload a dataframe as individual examples to the LangSmith API.

upload_dataframe(
  self,
  df: pd.DataFrame,
  name: str,
  input_keys: Sequence[str],
  output_keys: Sequence[str],
  *,
  description: Optional[str] = None,
  data_type: Optional[ls_schemas.DataType] = ls_schemas.DataType.kv
) -> ls_schemas.Dataset

Example:

from langsmith import Client
import pandas as pd

client = Client()

df = pd.read_parquet("path/to/your/myfile.parquet")
input_keys = ["column1", "column2"]  # replace with your input column names
output_keys = ["output1", "output2"]  # replace with your output column names

dataset = client.upload_dataframe(
    df=df,
    input_keys=input_keys,
    output_keys=output_keys,
    name="My Parquet Dataset",
    description="Dataset created from a parquet file",
    data_type="kv",  # The default
)
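
Before uploading, it can help to confirm that every name in input_keys and output_keys is actually a column of the dataframe. The following is a minimal sketch of such a check. `missing_keys` is a hypothetical helper, not part of the LangSmith API, and the plain list of column names stands in for `df.columns`.

```python
from typing import Iterable, List, Sequence


def missing_keys(columns: Iterable[str], keys: Sequence[str]) -> List[str]:
    """Hypothetical helper: return the keys that are not among the columns."""
    present = set(columns)
    return [key for key in keys if key not in present]


# Stand-in for list(df.columns); these names are illustrative only.
columns = ["column1", "column2", "output1"]

input_keys = ["column1", "column2"]
output_keys = ["output1", "output2"]

# Collect any key that has no matching column, preserving the order given.
missing = missing_keys(columns, input_keys + output_keys)
if missing:
    print(f"Columns not found in dataframe: {missing}")
```

With a real dataframe, the same check could be run on `df.columns` before calling `client.upload_dataframe`.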

Parameters

df: pd.DataFrame (required)
  The dataframe to upload.

name: str (required)
  The name of the dataset.

input_keys: Sequence[str] (required)
  The names of the dataframe columns to use as example inputs.

output_keys: Sequence[str] (required)
  The names of the dataframe columns to use as example outputs.

description: Optional[str] (default: None)
  The description of the dataset.

data_type: Optional[ls_schemas.DataType] (default: ls_schemas.DataType.kv)
  The data type of the dataset.
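
To make the role of input_keys and output_keys concrete, here is a small sketch of how a row could be split into an example's inputs and outputs. It assumes that each row becomes one example, with the input_keys columns as its inputs and the output_keys columns as its outputs. `rows_to_examples` is a hypothetical helper for illustration, not the library's implementation, and the list of dicts stands in for `df.to_dict(orient="records")`.

```python
from typing import Any, Dict, List, Sequence, Tuple

Row = Dict[str, Any]


def rows_to_examples(
    rows: Sequence[Row],
    input_keys: Sequence[str],
    output_keys: Sequence[str],
) -> List[Tuple[Row, Row]]:
    """Hypothetical helper: split each row into (inputs, outputs) dicts."""
    examples = []
    for row in rows:
        inputs = {key: row[key] for key in input_keys}
        outputs = {key: row[key] for key in output_keys}
        examples.append((inputs, outputs))
    return examples


# Illustrative rows; the column names here are assumptions.
rows = [
    {"question": "2 + 2?", "context": "arithmetic", "answer": "4"},
    {"question": "Capital of France?", "context": "geography", "answer": "Paris"},
]

examples = rows_to_examples(
    rows,
    input_keys=["question", "context"],
    output_keys=["answer"],
)

for inputs, outputs in examples:
    print(inputs, "->", outputs)
```

Columns named in neither list are simply left out of the resulting pairs in this sketch.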
