# crawlcrawl

> The crawler API built for RAG ingestion. Markdown, structured signals, scheduled crawls, dataset storage, and change-detection diff all included at every paid tier. Start free at 1,000 credits per month, scale to enterprise. The Firecrawl alternative with more credits at every tier.

Base URL: https://api.crawlcrawl.com
Auth: `Authorization: Bearer crk_...`
Spec: https://crawlcrawl.com/openapi.json
Signup: https://crawlcrawl.com/signup (no card required)

## What crawlcrawl does

crawlcrawl is a single REST API that turns any URL into clean, LLM-ready markdown with structured-data signals (schema.org, Open Graph, JSON-LD, hreflang, canonical) in the same response. Scheduled crawls, change-detection diffs, dataset storage, and HMAC-signed webhooks are included at every paid tier. Global routing across 190+ countries.

## Products

- [Crawler](https://crawlcrawl.com/products/crawler): Multi-page crawl with link discovery and orphan detection. Returns markdown for every page. `POST /v1/crawls`.
- [Monitor](https://crawlcrawl.com/products/monitor): Cron-scheduled recurring crawls. HMAC-SHA256 signed webhooks fire only on real changes.
- [Scrape](https://crawlcrawl.com/products/scrape): Single-URL fetch with markdown and structured signals.
- [Search](https://crawlcrawl.com/products/search): Live web search returning structured results. `POST /v1/cloud/search`.
- [Transform](https://crawlcrawl.com/products/transform): HTML or PDF to clean markdown.
- [Render](https://crawlcrawl.com/products/render): Browser-rendered HTML for sites that need JavaScript execution.
- [Unblock](https://crawlcrawl.com/products/unblock): One API call bypasses Cloudflare, Akamai, PerimeterX, Datadome, reCAPTCHA. Drop-in for any existing scraper. `POST /v1/cloud/unblock`.
- [LLMs.txt builder](https://crawlcrawl.com/products/llms-txt): Generate /llms.txt for any site.
- [AI-bot audit](https://crawlcrawl.com/products/ai-bot-audit): Resolve which AI crawlers a site allows.
- [Per-site scrapers](https://crawlcrawl.com/products/per-site-scrapers): Pre-built configs for popular sites.
- [Proxy fetch](https://crawlcrawl.com/products/proxy-fetch): Bytes-billed HTML fetch via spider.cloud proxy. ~13x cheaper than chrome render. 4 pools, 199 countries. No JS execution. `POST /v1/cloud/proxy-fetch`.

## Actors

- [audit-onpage](https://crawlcrawl.com/products/audit-onpage): ~30 on-page SEO rules per call. Errors, warnings, info. `POST /v1/actors/audit-onpage`.
- [extract-article](https://crawlcrawl.com/products/extract-article): Trafilatura body extraction with author + date metadata. `POST /v1/actors/extract-article`.
- [check-links](https://crawlcrawl.com/products/check-links): Lychee-based broken-link validation with optional chrome or proxy retry. `POST /v1/actors/check-links`.
- [structured-data](https://crawlcrawl.com/products/structured-data): JSON-LD, Microdata, RDFa, OpenGraph, Dublin Core, Microformats in one response. `POST /v1/actors/structured-data`.
- [render-diff](https://crawlcrawl.com/products/render-diff): Static vs JS-rendered DOM diff. Returns `ai_bot_blind_pct`, the 2026 AEO metric. `POST /v1/actors/render-diff`.
- [internal-link-graph](https://crawlcrawl.com/products/internal-link-graph): PageRank + WCC + orphan detection on any existing crawl_id. `POST /v1/actors/internal-link-graph`.
- [sitemap-audit](https://crawlcrawl.com/products/sitemap-audit): 7-bucket sitemap health. Supports `dry_run` for free cost preview. `POST /v1/actors/sitemap-audit`.

## Changelog

- [Changelog](https://crawlcrawl.com/changelog): Every meaningful API or product change, dated.

## Documentation

- [Quickstart](https://crawlcrawl.com/docs/quickstart): Crawl your first URL in 5 minutes.
- [Authentication](https://crawlcrawl.com/docs/authentication): Bearer keys, rotation, audit log.
- [Webhooks](https://crawlcrawl.com/docs/webhooks): HMAC-SHA256 signed deliveries.
- [Errors](https://crawlcrawl.com/docs/errors): Status codes and error envelope.
- [Rate limits](https://crawlcrawl.com/docs/rate-limits): Per-tier caps and concurrency.
- [API reference](https://crawlcrawl.com/api): Full endpoint reference.

## Use cases

- [RAG ingestion](https://crawlcrawl.com/use-cases/rag-ingestion): Crawl docs sites to clean markdown for vector databases.
- [Competitor monitoring](https://crawlcrawl.com/use-cases/competitor-monitoring): Webhook on pricing-page changes.
- [SEO audit at scale](https://crawlcrawl.com/use-cases/seo-audit): Orphan pages, broken links, weekly site diff.

## Comparison

- [crawlcrawl vs Firecrawl](https://crawlcrawl.com/compare/firecrawl): Half of Firecrawl's price at every tier, rounded up to the whole dollar. Same page allowances, same concurrency, half the bill.
- [crawlcrawl vs TinyFish](https://crawlcrawl.com/compare/tinyfish): The crawler for RAG ingestion, not an agent platform.
- [crawlcrawl vs Apify](https://crawlcrawl.com/compare/apify): API-first vs marketplace.
- [crawlcrawl vs ScrapingBee](https://crawlcrawl.com/compare/scrapingbee): Crawler vs single-page scrape API.
- [crawlcrawl vs Screaming Frog](https://crawlcrawl.com/compare/screaming-frog): Hosted vs desktop SEO tool.
- [crawlcrawl vs ScrapeGraphAI](https://crawlcrawl.com/compare/scrapegraphai): Same credits at every tier, more features included. Multi-page crawl + dataset storage where ScrapeGraphAI focuses on LLM-driven extraction.
- [crawlcrawl vs Olostep](https://crawlcrawl.com/compare/olostep): Workload-comparison economics + integrated platform breadth at every tier.

## Blog

- [10 Best Web Crawlers for LLM and RAG Pipelines in 2026](https://crawlcrawl.com/blog/top-10-web-crawlers-llm-rag.html): Ranked comparison of the 10 best crawlers for retrieval-augmented generation. Verified 2026-05-16.
- [robots.txt for AI Crawlers in 2026: The Complete Guide](https://crawlcrawl.com/blog/robots-txt-for-ai-crawlers.html): Complete reference for configuring robots.txt for GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and every other AI crawler worth knowing. Copy-paste templates. Verified 2026-05-16.
- [AEO vs SEO in 2026: What Changes When AI Becomes the Search Engine](https://crawlcrawl.com/blog/aeo-vs-seo.html): Answer engine optimization vs traditional SEO. How they differ, where they overlap, technical setup checklist, and practical moves for the quarter. Verified 2026-05-16.
- [10 Best Open-Source Web Crawlers in 2026](https://crawlcrawl.com/blog/open-source-web-crawlers.html): Ranked comparison of Scrapy, Crawl4AI, Katana, Colly, Playwright, Puppeteer, Apache Nutch, Heritrix, node-crawler, MechanicalSoup. When to self-host versus pick managed. Verified 2026-05-16.
- [10 Best Bright Data Alternatives in 2026](https://crawlcrawl.com/blog/bright-data-alternatives.html): Ranked comparison of Bright Data alternatives for crawling, scraping, and RAG ingestion. Includes Oxylabs, ScraperAPI, Apify, ZenRows, Scrapfly, Zyte, Crawlbase, NetNut, Decodo. Real-workload pricing math. Verified 2026-05-16.
- [10 Best ScrapingBee Alternatives in 2026](https://crawlcrawl.com/blog/scrapingbee-alternatives.html): Ranked comparison of ScrapingBee alternatives for single-page scraping, multi-page crawling, and RAG ingestion. Includes Scrape.do, ScraperAPI, ZenRows, Scrapfly, Firecrawl, Apify, Crawlbase, ScrapingDog, Zyte. Verified 2026-05-16.
- [Firecrawl Pricing in 2026: Every Tier Explained, Real Costs, and How to Pick One](https://crawlcrawl.com/blog/firecrawl-pricing.html): Tier-by-tier breakdown of Firecrawl's six 2026 pricing plans (Free, Hobby $16, Standard $83, Growth $333, Scale $599, Enterprise). Real-world workload math and how to pick a tier. Verified 2026-05-16.
- [The Firecrawl Alternative for Teams Scaling Past the Hobby Tier](https://crawlcrawl.com/blog/firecrawl-alternative.html): Half the price of Firecrawl at every tier. Verified 2026-05-19.

## Legal

- [Privacy](https://crawlcrawl.com/privacy)
- [Terms of service](https://crawlcrawl.com/terms)
- [Security](https://crawlcrawl.com/security)
- [Sub-processors](https://crawlcrawl.com/sub-processors)

## Pricing

Every paid tier includes every feature: JavaScript rendering, 190+ country routing, markdown output, structured-data extraction, scheduled crawls, HMAC-signed webhooks, dataset storage, change-detection diff, LLMs.txt generation, screenshot API, search API, key rotation, robots policy management. No add-ons. No surcharges.

- Free: 1,500 pages per month, 2 concurrent, $0, permanent, no card required. Basic crawl + scrape only (no cloud / anti-bot).
- Pro: 5,000 pages per month, 5 concurrent, $8/month. All features unlocked. (Half of Firecrawl Hobby $16.)
- Studio: 100,000 pages per month, 50 concurrent, $42/month. Most popular. (Half of Firecrawl Standard $83.)
- Agency: 500,000 pages per month, 100 concurrent, $167/month. (Half of Firecrawl Growth $333.)
- Scale: 1,000,000 pages per month, 150 concurrent, $300/month. (Half of Firecrawl Scale $599.)
- Enterprise: custom caps, custom SLA, dedicated capacity, named support.

Billing model: two integer counters tracked per project: pages (standard fetches) and cloud_pages (anti-bot edge network). No multipliers. Search bills per result returned (a 50-result query consumes 50 cloud pages, a 5-result query consumes 5). Intelligence endpoints (link graph, orphan detection, change-detection diff, AI-bot policy audit) are zero-cost on every paid tier. Every chargeable call records a row in an append-only ledger with an idempotency token; retries never double-bill.

The headline: crawlcrawl costs half of Firecrawl's price at every tier, rounded up to the whole dollar. Same page allowances, same concurrency, same features unlocked, for about 50% of the bill.
Pro $8 vs Firecrawl Hobby $16 (5K pages). Studio $42 vs Standard $83 (100K). Agency $167 vs Growth $333 (500K). Scale $300 vs Scale $599 (1M). crawlcrawl bills cloud_pages as an integer counter with no multipliers; Firecrawl's anti-bot multiplier is +4 credits per scrape.
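
The billing rules described under Pricing (two integer counters, per-result search billing, and an append-only ledger keyed by idempotency token) can be illustrated with a small local model. This is a minimal sketch, not crawlcrawl's implementation or API: the names `Ledger`, `LedgerRow`, `charge`, `total`, and `search_cost` are hypothetical and exist only to show why a retried call does not double-bill.

```python
from dataclasses import dataclass, field

# Hypothetical counter names taken from the billing description above.
COUNTERS = ("pages", "cloud_pages")


@dataclass(frozen=True)
class LedgerRow:
    idempotency_token: str
    counter: str
    amount: int


@dataclass
class Ledger:
    """Illustrative append-only ledger: rows are only ever added."""
    rows: list = field(default_factory=list)
    seen_tokens: set = field(default_factory=set)

    def charge(self, token: str, counter: str, amount: int) -> bool:
        """Record a charge once per token; return False for a retry."""
        if counter not in COUNTERS:
            raise ValueError(f"unknown counter: {counter}")
        if token in self.seen_tokens:
            return False  # same token seen before, so nothing is billed
        self.seen_tokens.add(token)
        self.rows.append(LedgerRow(token, counter, amount))
        return True

    def total(self, counter: str) -> int:
        """Sum all recorded amounts for one counter."""
        return sum(r.amount for r in self.rows if r.counter == counter)


def search_cost(results_returned: int) -> int:
    """Search bills one cloud page per result returned, with no multiplier."""
    return results_returned


ledger = Ledger()
ledger.charge("req-1", "cloud_pages", search_cost(50))  # 50-result query
ledger.charge("req-1", "cloud_pages", search_cost(50))  # retry, ignored
ledger.charge("req-2", "cloud_pages", search_cost(5))   # 5-result query
print(ledger.total("cloud_pages"))
```

In this model the retried `req-1` call adds no row, so the `cloud_pages` total reflects one 50-result query plus one 5-result query.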