    Python › langchain-classic › chains › natbot › crawler › Crawler
    Class · Since v1.0

    Crawler

    A crawler for web pages.

    Security Note: This is an implementation of a crawler that uses a browser via Playwright.

    This crawler can be used to load arbitrary webpages INCLUDING content
    from the local file system.
    
    Control who can submit crawling requests and what network access
    the crawler has.
    
    Make sure to scope permissions to the minimal permissions necessary for
    the application.
    
    See https://docs.langchain.com/oss/python/security-policy for more information.
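
One way to act on the note above is to validate every URL before it reaches the crawler. The following is a minimal sketch using only the standard library. The names `ALLOWED_SCHEMES`, `ALLOWED_HOSTS`, and `is_url_allowed` are hypothetical and not part of the `Crawler` API, and the host values are placeholders.

```python
from urllib.parse import urlsplit

# Hypothetical policy values (assumptions); adjust to the deployment.
ALLOWED_SCHEMES = {"http", "https"}
ALLOWED_HOSTS = {"example.com", "docs.example.com"}


def is_url_allowed(url: str) -> bool:
    """Return True only for http(s) URLs whose host is on the allowlist."""
    parts = urlsplit(url)
    # Rejects file://, data:, and input that carries no scheme at all.
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        return False
    host = (parts.hostname or "").lower()
    return host in ALLOWED_HOSTS


if __name__ == "__main__":
    for candidate in ("https://example.com/page", "file:///etc/passwd"):
        print(candidate, is_url_allowed(candidate))
```

A check like this covers only the URL that is submitted. Redirects and links followed later are not inspected, so it complements network-level restrictions rather than replacing them.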
    
    Crawler(
        self,
    )

    Constructors

    constructor
    __init__

    Attributes

    attribute
    browser: Browser
    attribute
    page: Page
    attribute
    page_element_buffer: dict[int, ElementInViewPort]
    attribute
    client: CDPSession

    Methods

    method
    go_to_page

    Navigate to the given URL.

    method
    scroll

    Scroll the page in the given direction.

    method
    click

    Click on an element with the given id.

    method
    type

    Type text into an element with the given id.

    method
    enter

    Press the Enter key.

    method
    crawl

    Crawl the current page.
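
The methods listed above suggest a navigate, inspect, act, inspect loop. The sketch below illustrates that call order against a stand-in object so that it runs without a browser. The `CrawlerLike` protocol, the `submit_text` helper, the `RecordingCrawler` class, and all parameter names and return types are assumptions inferred from the one-line descriptions, not signatures confirmed by this page.

```python
from typing import Any, List, Protocol, Tuple


class CrawlerLike(Protocol):
    """Hypothetical interface mirroring the method names listed above."""

    def go_to_page(self, url: str) -> None: ...
    def type(self, element_id: int, text: str) -> None: ...
    def enter(self) -> None: ...
    def crawl(self) -> Any: ...


def submit_text(crawler: CrawlerLike, url: str, element_id: int, text: str) -> Any:
    """Open a page, type into one element, press Enter, and crawl the result."""
    crawler.go_to_page(url)
    crawler.crawl()  # assumption: crawling first makes element ids available
    crawler.type(element_id, text)
    crawler.enter()
    return crawler.crawl()


class RecordingCrawler:
    """Stand-in that records calls; it drives no browser."""

    def __init__(self) -> None:
        self.calls: List[Tuple[Any, ...]] = []

    def go_to_page(self, url: str) -> None:
        self.calls.append(("go_to_page", url))

    def type(self, element_id: int, text: str) -> None:
        self.calls.append(("type", element_id, text))

    def enter(self) -> None:
        self.calls.append(("enter",))

    def crawl(self) -> Any:
        self.calls.append(("crawl",))
        return []


if __name__ == "__main__":
    fake = RecordingCrawler()
    submit_text(fake, "https://example.com", 3, "hello")
    print([call[0] for call in fake.calls])
```

Writing the helper against a protocol keeps the control flow testable in isolation. Whether the real `Crawler` accepts the same argument shapes should be checked against its source before substituting it for the stand-in.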
