Parses a list of blobs lazily.
```python
batch_parse(
    self,
    blobs: Sequence[Blob],
    gcs_output_path: Optional[str] = None,
    timeout_sec: int = 3600,
    check_in_interval_sec: int = 60,
) -> Iterator[Document]
```

This is a long-running operation. A recommended way is to decouple parsing from creating LangChain Documents:
```python
operations = parser.docai_parse(blobs, gcs_path)
parser.is_running(operations)
```

You can get the operation names and save them:

```python
names = [op.operation.name for op in operations]
```

And when all operations are finished, you can use their results:

```python
operations = parser.operations_from_names(operation_names)
results = parser.get_results(operations)
docs = parser.parse_from_results(results)
```
| Name | Type | Description |
|---|---|---|
| blobs* | Sequence[Blob] | A list of blobs to parse. |
| gcs_output_path | Optional[str] | Default: `None`. A path on Google Cloud Storage to store parsing results. |
| timeout_sec | int | Default: `3600`. A timeout to wait for Document AI to complete, in seconds. |
| check_in_interval_sec | int | Default: `60`. An interval to wait until the next check whether parsing operations have completed, in seconds. |
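To make the roles of `timeout_sec` and `check_in_interval_sec` concrete, here is a minimal sketch of the polling loop such a long-running call performs. The helper name `wait_for_operations` and the `is_running` callback are hypothetical stand-ins for illustration, not part of the parser's API:

```python
import time


def wait_for_operations(operations, is_running,
                        timeout_sec=3600, check_in_interval_sec=60):
    """Poll until all operations finish or the timeout elapses.

    `operations` stands in for Document AI long-running operations;
    `is_running` is a callback returning True while any is in progress.
    """
    elapsed = 0
    while is_running(operations):
        if elapsed >= timeout_sec:
            raise TimeoutError(
                f"operations still running after {timeout_sec} s"
            )
        # Sleep between status checks instead of polling continuously.
        time.sleep(check_in_interval_sec)
        elapsed += check_in_interval_sec
    return operations
```

Raising the interval reduces API calls at the cost of noticing completion later; the timeout bounds how long the caller blocks overall.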