Sign up at dashboard.crawlcrawl.com. Copy your crk_... key. Set it as CRAWLCRAWL_API_KEY in your environment.
export CRAWLCRAWL_API_KEY=crk_...
Check your API key works:
curl -X GET \
https://api.crawlcrawl.com/v1/usage \
-H "Authorization: Bearer ${CRAWLCRAWL_API_KEY}"
Expect an HTTP 200 response.
Start a crawl:
curl -X POST \
https://api.crawlcrawl.com/v1/crawls \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${CRAWLCRAWL_API_KEY}" \
-d '{"url": "https://example.com", "max_pages": 5}'
Response:
{
"id": "crawl_...",
"status": "queued"
}
Poll GET /v1/crawls/{id} until status is done, replacing {id} with the id returned when the crawl was started. Then retrieve the pages:
curl -X GET \
https://api.crawlcrawl.com/v1/crawls/{id}/pages \
-H "Authorization: Bearer ${CRAWLCRAWL_API_KEY}"
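The poll-then-retrieve step can be scripted. What follows is a minimal Python sketch, not an official client: the helper names (get_json, wait_for_crawl, fetch_pages_when_done), the 5-second interval, and the 60-attempt cap are all assumptions made for illustration. Only the queued and done statuses appear in this guide, so the loop simply gives up after a fixed number of polls rather than guessing at other terminal states.

```python
import json
import time
import urllib.request

API_BASE = "https://api.crawlcrawl.com/v1"


def get_json(path, api_key):
    """GET an API path with the bearer token and decode the JSON body."""
    req = urllib.request.Request(
        f"{API_BASE}{path}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_crawl(crawl_id, fetch, interval=5.0, max_attempts=60, sleep=time.sleep):
    """Poll /crawls/{id} until status is "done"; raise after max_attempts polls.

    `fetch` is any callable that maps an API path to decoded JSON, which keeps
    the loop independent of the HTTP layer (hypothetical design, for testing).
    """
    for attempt in range(max_attempts):
        crawl = fetch(f"/crawls/{crawl_id}")
        if crawl.get("status") == "done":
            return crawl
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"crawl {crawl_id} not done after {max_attempts} polls")


def fetch_pages_when_done(crawl_id, api_key):
    """Wait for the crawl to finish, then return the decoded pages response."""
    def fetch(path):
        return get_json(path, api_key)

    wait_for_crawl(crawl_id, fetch)
    return fetch(f"/crawls/{crawl_id}/pages")
```

Call fetch_pages_when_done with the crawl id and the value of CRAWLCRAWL_API_KEY (for example, read from os.environ). The shape of the pages response is not described here, so the sketch returns the decoded JSON unchanged.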
Schedule a recurring crawl with a webhook:
curl -X POST \
https://api.crawlcrawl.com/v1/crawls \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${CRAWLCRAWL_API_KEY}" \
-d '{"url": "https://example.com", "cron": "*/5 * * * *", "webhook_url": "https://your-webhook.com", "return_only_changed": true}'
Webhooks are signed with HMAC-SHA256. Verify each signature server-side before acting on the payload.
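As a rough illustration of server-side verification, here is a Python sketch built on the standard hmac module. Several details are assumptions, because this guide does not specify them: that the signature is a hex digest, that it is computed over the raw request body, and that you hold a shared signing secret. The function names are hypothetical, and the header that carries the signature is not named here, so check the dashboard or API reference for the real scheme.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 of the raw body (assumed signing scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Return True only if the received signature matches the expected one.

    Both values are compared as bytes with hmac.compare_digest, which runs in
    constant time and so avoids leaking the match position through timing.
    """
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode(), received_signature.encode())
```

Verify against the exact bytes received, before any JSON parsing or re-serialization, since even a whitespace change alters the digest. Reject the request if verify_signature returns False.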