Six syntaxes in one call: JSON-LD, Microdata, RDFa, OpenGraph, Dublin Core, Microformats. Uniform JSON output. Drop-in for schema.org audits, rich-results readiness, and AI-search visibility checks.
Most "schema extraction" tools handle only JSON-LD. Production sites, however, still use Microdata (legacy ecommerce), RDFa (older CMS templates), OpenGraph (social previews), Dublin Core (publishing), and Microformats (older blog templates). A complete audit needs all six. The structured-data actor extracts all of them in one call with a consistent output shape.
curl -X POST https://api.crawlcrawl.com/v1/actors/structured-data \
  -H "Authorization: Bearer crk_..." \
  -d '{"url":"https://your-client.com/product/123"}'
# → 200
{
  "actor": "structured-data",
  "elapsed_ms": 114,
  "data": {
    "counts": {
      "json-ld": 2, "microdata": 0, "rdfa": 1,
      "opengraph": 1, "dublincore": 1, "microformat": 0
    },
    "data": {
      "json-ld": [ { "@context": "https://schema.org", "@type": "Product", ... } ],
      "rdfa": [ ... ],
      "opengraph": [ { "og:title": "...", "og:image": "...", ... } ],
      "dublincore": [ { "elements": [ ... ] } ]
    }
  }
}
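The same call can be sketched in Python using only the standard library. The endpoint, the bearer-token header, and the response fields are taken from the example above; the helper names (build_request, present_syntaxes) and the explicit Content-Type header are assumptions made for illustration, not part of a documented client.

```python
import json
import urllib.request

API_URL = "https://api.crawlcrawl.com/v1/actors/structured-data"


def build_request(page_url: str, api_key: str) -> urllib.request.Request:
    """Build the POST request from the curl example (nothing is sent here)."""
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the API accepts an explicit JSON content type.
            "Content-Type": "application/json",
        },
    )


def present_syntaxes(response: dict) -> list[str]:
    """Return the syntaxes with at least one item, based on the counts object."""
    counts = response["data"]["counts"]
    return sorted(name for name, n in counts.items() if n > 0)


# To actually send the request (requires a real key and network access):
# with urllib.request.urlopen(build_request(url, key)) as resp:
#     result = json.load(resp)
```

Reading counts first is a cheap way to decide which keys of the inner data object are worth inspecting, since syntaxes with a count of zero are absent from it in the sample response.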
Parsing is handled by extruct, the open-source Python library from Scrapinghub (now Zyte). It is the same library Zyte uses in production and the same library most academic structured-data crawlers rely on. We run it as a Python sidecar accessible only to the crawler API, so you get production-grade parsing without managing the dependency.
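To make concrete what this kind of parsing involves, here is a deliberately small standard-library sketch that covers just two of the six syntaxes. It is not extruct and not the service's implementation; the class and function names are hypothetical, and the output shape is simplified compared with the API response.

```python
import json
from html.parser import HTMLParser


class SchemaSniffer(HTMLParser):
    """Collect JSON-LD blocks and og:* meta tags from an HTML string."""

    def __init__(self) -> None:
        super().__init__()
        self.json_ld: list = []
        self.opengraph: dict = {}
        self._in_json_ld = False
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "script" and (attr.get("type") or "").lower() == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []
        elif tag == "meta" and (attr.get("property") or "").startswith("og:"):
            self.opengraph[attr["property"]] = attr.get("content") or ""

    def handle_data(self, data):
        # Script bodies can arrive in several chunks, so buffer them.
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.json_ld.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the whole page


def sniff(html: str) -> dict:
    """Hypothetical helper: parse one page and return what was found."""
    parser = SchemaSniffer()
    parser.feed(html)
    parser.close()
    return {"json-ld": parser.json_ld, "opengraph": parser.opengraph}
```

Microdata, RDFa, Dublin Core, and Microformats each need their own tree-walking rules, which is a large part of why delegating to a maintained library is attractive.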
JSON-LD. <script type="application/ld+json"> blocks, parsed and validated. Multiple blocks per page are returned as an array.
Microdata. itemscope / itemprop trees, resolved to typed objects.
OpenGraph. og:* meta tags plus type-specific extensions (article:*, product:*).
Rich-results readiness. Audit every product page for valid Product/Offer JSON-LD before Google's structured-data report catches issues at scale.
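A readiness check over the returned JSON-LD array might look like the sketch below. The rule set is an assumption chosen for illustration (a Product needs a name and at least one offer with price and priceCurrency); it is not Google's full requirement list, and the function name is hypothetical.

```python
def _has_type(item: dict, wanted: str) -> bool:
    """True if the item's @type is, or includes, the wanted type."""
    types = item.get("@type")
    return wanted in (types if isinstance(types, list) else [types])


def product_offer_issues(json_ld_items: list) -> list[str]:
    """Return readable problems for Product items; an empty list means none found."""
    products = [i for i in json_ld_items if isinstance(i, dict) and _has_type(i, "Product")]
    if not products:
        return ["no Product item found"]

    issues = []
    for idx, product in enumerate(products):
        if not product.get("name"):
            issues.append(f"Product[{idx}]: missing name")
        offers = product.get("offers")
        if offers is None:
            issues.append(f"Product[{idx}]: missing offers")
            continue
        # offers may be a single object or a list of them
        for offer in offers if isinstance(offers, list) else [offers]:
            for field in ("price", "priceCurrency"):
                # Assumed minimal rule set: each offer carries both fields.
                if not isinstance(offer, dict) or offer.get(field) in (None, ""):
                    issues.append(f"Product[{idx}]: offer missing {field}")
    return issues
```

Run it against the "json-ld" array from the response for each product URL and collect the non-empty results into the audit report.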
AI-search visibility. AI search engines (Perplexity, ChatGPT search, Claude) lean heavily on JSON-LD for entity extraction. Pages without it are invisible to the new search surface. Pair with render-diff to see how much of your schema appears only after JavaScript rendering (and is thus skipped by non-rendering crawlers).
Migration verification. After a CMS or theme change, run structured-data on every template type to confirm schema didn't regress.
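One simple way to spot a regression is to compare the counts object captured before the change with the one captured after it. The sketch below assumes you have stored both; the function name is hypothetical.

```python
def schema_regressions(before: dict, after: dict) -> dict:
    """Map each syntax whose item count dropped to a (before, after) pair."""
    return {
        syntax: (count, after.get(syntax, 0))  # a missing syntax counts as zero
        for syntax, count in before.items()
        if after.get(syntax, 0) < count
    }
```

An empty result means no syntax lost items on that template. It does not prove the remaining items are unchanged, so a property-level comparison is still worthwhile for critical templates.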
Competitor research. Pull the JSON-LD off a competitor's product page to see exactly what typed properties they emit.
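To turn a returned JSON-LD array into a quick inventory of types and properties, a recursive walk is enough. This is a minimal sketch with a hypothetical function name; it treats any key not starting with "@" as a property.

```python
def typed_properties(json_ld_items: list) -> dict:
    """Map each @type to the sorted property names seen on nodes of that type."""
    seen: dict = {}

    def walk(node) -> None:
        if isinstance(node, list):
            for child in node:
                walk(child)
        elif isinstance(node, dict):
            types = node.get("@type")
            if types is not None:
                for t in types if isinstance(types, list) else [types]:
                    props = {k for k in node if not k.startswith("@")}
                    seen.setdefault(t, set()).update(props)
            # Descend into nested objects such as offers or @graph entries.
            for value in node.values():
                walk(value)

    walk(json_ld_items)
    return {t: sorted(props) for t, props in seen.items()}
```

Comparing this inventory for your page against a competitor's shows which properties they emit that you do not.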
One page-credit per call. The $42 Studio tier includes 50,000 page-credits a month.
$42/mo for 100,000 schema audits. Six syntaxes, one JSON.