  • Type: object
This property defines how the Crawler acquires a session to access protected content.

Basic auth

The Crawler extracts the Set-Cookie response header from the login page, stores that cookie and sends it in a Cookie header when crawling all pages of the website defined in the configuration.
This cookie is only retrieved at the beginning of each complete crawl. It won’t be renewed automatically if it expires.
The Crawler can interact with your login page in these ways:
  • With a direct request with the credentials to your login endpoint, like a standard curl command (fetchRequest).
  • By emulating a web browser, loading your login page, entering the credentials and validating the login form (browserRequest).
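The cookie handling described above can be sketched as follows. This is only an illustration of turning Set-Cookie response header values into a Cookie request header, not the Crawler's actual implementation; the helper name buildCookieHeader and the sample cookie values are hypothetical.

```javascript
// Hypothetical helper (not part of the Crawler API): build a Cookie request
// header from the Set-Cookie values returned by a login response.
function buildCookieHeader(setCookieValues) {
  return setCookieValues
    // Keep only the leading "name=value" pair; drop attributes such as Path or HttpOnly.
    .map((value) => value.split(";")[0].trim())
    // Ignore malformed entries that have no "=" separator.
    .filter((pair) => pair.includes("="))
    .join("; ");
}

// Sample Set-Cookie values, assumed for illustration only.
const loginResponseCookies = [
  "sessionid=abc123; Path=/; HttpOnly; Secure",
  "csrftoken=xyz789; Path=/; SameSite=Lax",
];

// This header would then be sent with every page request of the crawl.
const cookieHeader = buildCookieHeader(loginResponseCookies);
console.log(cookieHeader);
```

Because the cookie is captured once per crawl, a session that expires partway through is not refreshed, so the session lifetime on your site should cover the duration of a complete crawl.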

OAuth 2.0

The Crawler supports the OAuth 2.0 Client Credentials Grant flow. It performs an Access Token Request using the provided credentials, stores the retrieved token in an Authorization header, and sends it when crawling all pages of the website defined in the configuration.
This token is only retrieved at the beginning of each complete crawl. It won’t be renewed automatically if it expires.
Client authentication is performed by passing the client credentials (client_id and client_secret) in the request body as described in RFC 6749. The following providers are supported.
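As a rough sketch of what such an Access Token Request looks like, the snippet below builds a form-encoded request body containing the client credentials, in the style of RFC 6749. The helper names buildAccessTokenRequest and buildAuthorizationHeader are hypothetical, and the Bearer scheme for the Authorization header is an assumption: the exact header format the Crawler uses is not stated here.

```javascript
// Hypothetical helper: describe the token request for the Client Credentials Grant.
// Client credentials travel in the form-encoded request body.
function buildAccessTokenRequest({ url, client_id, client_secret, extraParameters = {} }) {
  const body = new URLSearchParams({
    grant_type: "client_credentials",
    client_id,
    client_secret,
    ...extraParameters, // provider-specific fields, for example "resource"
  });
  return {
    url,
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: body.toString(),
  };
}

// Hypothetical helper: assumes the provider returns a bearer token in "access_token".
function buildAuthorizationHeader(tokenResponse) {
  return `Bearer ${tokenResponse.access_token}`;
}

const tokenRequest = buildAccessTokenRequest({
  url: "https://example.com/oauth2/token",
  client_id: "my-client-id",
  client_secret: "my-client-secret",
  extraParameters: { resource: "https://protected.example.com/" },
});
console.log(tokenRequest.body);

// Sample token response, assumed for illustration only.
const authorization = buildAuthorizationHeader({ access_token: "sample-token", token_type: "Bearer" });
console.log(authorization);
```

The request description could be passed to any HTTP client. As with the session cookie, the token is fetched once per crawl, so its lifetime should cover the duration of a complete crawl.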

Parameters

fetchRequest
object
Manually create the login request. Example:
JavaScript
{
  login: {
    fetchRequest: {
      url: "https://example.com/secure/login-with-post",
      requestOptions: {
        method: "POST",
        headers: { "Content-Type": "application/x-www-form-urlencoded" },
        body: "id=my-id&password=my-password",
        timeout: 5000 // in milliseconds
      }
    }
  }
}
browserRequest
object
Use a web browser to visit your login page and submit the login form, as a human user would. Example:
JavaScript
{
  login: {
    browserRequest: {
      url: "https://example.com/secure/login-page",
      username: "my-id",
      password: "my-password",
    }
  }
}
oauthRequest
object
Use the OAuth 2.0 Client Credentials Grant flow to generate an Authorization header. Example:
JavaScript
{
  login: {
    oauthRequest: {
      accessTokenRequest: {
        url: "https://example.com/oauth2/token",
        grant_type: "client_credentials",
        client_id: "my-client-id",
        client_secret: "my-client-secret",
        extraParameters: {
          resource: "https://protected.example.com/"
        }
      }
    }
  }
}