PySparkDataFrameLoader( self, spark_session: Optional[SparkSession] = None, df: Optional[Any
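A minimal usage sketch for the constructor above. It assumes `pyspark` and `langchain_community` are installed and that the class is importable from `langchain_community.document_loaders`. The helper name `load_spark_rows_as_documents` and its arguments are hypothetical, introduced only for illustration.

```python
def load_spark_rows_as_documents(rows, columns, page_content_column="text"):
    """Hypothetical helper: wrap a small Spark DataFrame in the loader."""
    # Imports are deferred so that pyspark and langchain_community are only
    # required when the function is actually called.
    from pyspark.sql import SparkSession
    from langchain_community.document_loaders import PySparkDataFrameLoader

    # Reuse the active session if one exists, otherwise start a local one.
    spark = SparkSession.builder.getOrCreate()

    # Build a DataFrame from plain Python rows and a list of column names.
    df = spark.createDataFrame(rows, schema=columns)

    loader = PySparkDataFrameLoader(
        spark_session=spark,
        df=df,
        page_content_column=page_content_column,
    )
    # load() returns the documents; lazy_load() would yield them one by one.
    return loader.load()
```

Calling `load_spark_rows_as_documents([("hello", 1), ("world", 2)], ["text", "id"])` would be expected to return one document per row, with the `text` column as page content.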
Load PySpark DataFrames.

Bases: BaseLoader

Parameters

spark_session (Optional[SparkSession], default None): The SparkSession object.
df (Optional[Any]): The Spark DataFrame object.
page_content_column (str, default 'text'): The name of the column containing the page content.
fraction_of_memory (float, default 0.1): The fraction of memory to use.

Methods

get_num_rows: Gets the number of "feasible" rows for the DataFrame.
lazy_load: A lazy loader for document content.
load: Load from the dataframe.
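The page does not define what makes a row count "feasible". One plausible reading is that `fraction_of_memory` caps the number of rows by a memory budget. The sketch below illustrates that reading in plain Python. It is an assumption and not the library's implementation, and the function name and all of its parameters are hypothetical.

```python
def feasible_row_count(total_rows, row_size_bytes, available_bytes,
                       fraction_of_memory=0.1):
    """Illustrative only: cap a row count by a fraction of available memory."""
    if row_size_bytes <= 0:
        raise ValueError("row_size_bytes must be positive")
    if not 0 < fraction_of_memory <= 1:
        raise ValueError("fraction_of_memory must be in (0, 1]")

    # Memory budget that the rows are allowed to occupy.
    budget_bytes = available_bytes * fraction_of_memory

    # Largest number of whole rows that fits in the budget.
    max_rows = int(budget_bytes // row_size_bytes)

    # Never report more rows than the DataFrame actually has.
    return min(total_rows, max_rows)
```

Under this reading, a larger `fraction_of_memory` allows more rows to be loaded, while a small DataFrame is always loaded in full.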