Pipe raw HTML or PDF bytes in, get clean markdown out. Navigation, footers, and other page chrome are stripped by default, which makes this a short path from a webpage to a vector store.
curl -X POST https://api.crawlcrawl.com/v1/cloud/transform \
  -H "Authorization: Bearer crk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "data": "<html>...</html>",
    "input_kind": "html",
    "return_format": "markdown",
    "readability": true
  }'
# returns
{ "content": "# Page title\n\nMain article body...", "cost_usd": 0.0001 }
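For callers who would rather not shell out to curl, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the function names build_transform_request and transform are hypothetical, the Content-Type header is an assumption about what the API expects, and the only response field relied on is "content" from the example above.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.crawlcrawl.com/v1/cloud/transform"


def build_transform_request(data, api_key, input_kind="html", readability=True):
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {
        "data": data,
        "input_kind": input_kind,
        "return_format": "markdown",
        "readability": readability,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the API expects a JSON content type for JSON bodies.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def transform(data, api_key, **options):
    """Send the request and return the markdown from the "content" field."""
    request = build_transform_request(data, api_key, **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return body["content"]
```

Splitting request construction from sending keeps the payload easy to inspect or log before any network call is made. Calling transform("<html>...</html>", "crk_...") would perform the actual request.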
The readability option defaults to true. It strips repeated layout (nav, footer, ads, comment threads) so the response contains only the main content. Pass "readability": false if you need the full document.
One call per document. Same key as the rest of the API.
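Because each call handles a single document, processing a batch comes down to a loop. The sketch below shows one way to do that while totalling the cost_usd values reported in the responses. The names transform_all, send, and fake_send are hypothetical; send stands in for whatever function actually posts the payload and returns the parsed JSON, so the example runs without network access.

```python
def transform_all(documents, send):
    """Transform each document with its own call and total the reported cost.

    `send` is any callable that takes one request payload (a dict) and
    returns the parsed JSON response (a dict with "content" and "cost_usd").
    """
    results = []
    total_cost = 0.0
    for doc in documents:
        payload = {
            "data": doc,
            "input_kind": "html",
            "return_format": "markdown",
            "readability": True,
        }
        response = send(payload)  # one API call per document
        results.append(response["content"])
        total_cost += response.get("cost_usd", 0.0)
    return results, total_cost


# Stand-in sender (hypothetical) so the sketch can run offline.
def fake_send(payload):
    return {"content": "# " + payload["data"], "cost_usd": 0.0001}


pages, cost = transform_all(["a", "b"], fake_send)
```

Passing the sender in as an argument keeps the loop independent of any particular HTTP library and makes it straightforward to add retries or rate limiting around a single call.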