TavilyCrawl(**kwargs: Any)

Bases: BaseTool

Tool that sends requests to the Tavily Crawl API with dynamically settable parameters.
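Because every option has a default, the tool can be constructed bare and pointed at a base URL. A minimal sketch (assumes the langchain-tavily package is installed and a TAVILY_API_KEY environment variable is set; the docs URL is illustrative):

```python
from langchain_tavily import TavilyCrawl

# Construct with defaults; any of the parameters below can be
# passed here as keyword arguments.
crawler = TavilyCrawl()

# The crawl starts from the base URL passed at invocation time.
result = crawler.invoke({"url": "https://docs.tavily.com"})
print(result)
```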
Parameters:

max_depth: Max depth of the crawl. Defines how far from the base URL the crawler can explore. Must be greater than 0. Default: 1.

max_breadth: The maximum number of links to follow per level of the tree (i.e., per page). Must be greater than 0. Default: 20.

limit: Total number of links the crawler will process before stopping. Must be greater than 0. Default: 50.
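Together these three settings bound a crawl: depth caps how many hops the crawler takes from the base URL, breadth caps links followed per page, and limit caps total pages processed. A sketch of a deliberately small crawl (values are illustrative):

```python
from langchain_tavily import TavilyCrawl

# At most 2 hops from the base URL, at most 10 links followed
# per page, and no more than 25 pages in total.
crawler = TavilyCrawl(max_depth=2, max_breadth=10, limit=25)
result = crawler.invoke({"url": "https://docs.tavily.com"})
```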
instructions: Natural language instructions for the crawler, e.g. "Python SDK".

select_paths: Regex patterns to select only URLs with specific path patterns, e.g. ["/api/v1.*"].

select_domains: Regex patterns to select only URLs from specific domains or subdomains, e.g. ["^docs.example.com$"].

exclude_paths: Regex patterns to exclude URLs with specific path patterns, e.g. ["/private/.*", "/admin/.*"].

exclude_domains: Regex patterns to exclude specific domains or subdomains from crawling, e.g. ["^private.example.com$"].
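The select_* and exclude_* patterns compose: selection restricts the crawl to matching URLs, while exclusion removes matches from that set. A sketch reusing the example patterns above (the example.com domains are placeholders):

```python
from langchain_tavily import TavilyCrawl

# Stay on the docs subdomain, follow only API-reference paths,
# and skip private/admin sections. All patterns are regexes.
crawler = TavilyCrawl(
    select_domains=[r"^docs\.example\.com$"],
    select_paths=[r"/api/v1.*"],
    exclude_domains=[r"^private\.example\.com$"],
    exclude_paths=[r"/private/.*", r"/admin/.*"],
)
result = crawler.invoke({"url": "https://docs.example.com"})
```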
allow_external: Whether to allow following links that go to external domains. Default: False.

include_images: Whether to include images in the crawl results.

categories: Filter URLs using predefined categories like 'Documentation', 'Blogs', etc.

extract_depth: Advanced extraction retrieves more data, including tables and embedded content, with higher success, but may increase latency. Default: "basic".

format: The format of the extracted web page content. "markdown" returns content in markdown format; "text" returns plain text and may increase latency. Default: "markdown".
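Extraction and formatting options combine freely with the scope and filter settings above. A sketch opting into richer (and potentially slower) extraction; the category name is taken from the examples above:

```python
from langchain_tavily import TavilyCrawl

# Richer extraction (tables, embedded content) at some latency cost;
# return plain text, keep the crawl on-site, and include images.
crawler = TavilyCrawl(
    extract_depth="advanced",
    format="text",
    allow_external=False,
    include_images=True,
    categories=["Documentation"],
)
result = crawler.invoke({"url": "https://docs.tavily.com"})
```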
include_favicon: Whether to include the favicon URL for each result. Default: False.

Whether to include credit usage information in the response.

chunks_per_source: Number of content chunks to return per source URL. Use this to limit the amount of content returned from each crawled URL.
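Putting it together, a sketch of iterating over crawl output. The response shape shown here (a dict with a "results" list of per-page entries) mirrors the Tavily Crawl API and is an assumption; the exact return type may vary by package version:

```python
from langchain_tavily import TavilyCrawl

crawler = TavilyCrawl(max_depth=2, limit=20, format="markdown")
response = crawler.invoke({"url": "https://docs.tavily.com"})

# Assumed shape: {"results": [{"url": ..., "raw_content": ...}, ...]}.
for page in response.get("results", []):
    print(page["url"], len(page.get("raw_content") or ""))
```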