Web Toolkit
On-demand validation and alternative formats of web page content.
Our Web Toolkit offers access to some of the algorithms and practices that we use internally while processing the web. Learn more from the links below, or keep reading about the free and paid plans.
- Structured Data extracts semantic data, such as schema.org markup, that publishers explicitly include in their content. Beyond extraction, we apply common data normalization and validation practices so that downstream clients can consume the results more easily.
- Text Content converts web pages into text-based formats, such as Markdown. Our hybrid processors can analyze semantic markup, stylesheets, and common accessibility practices to build a meaningful profile of the underlying content.
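To make "semantic data that publishers explicitly include" concrete, here is a minimal sketch of pulling schema.org JSON-LD blocks out of a page using only the Python standard library. It is an illustration of the general idea, not the Toolkit's own algorithm: the class name `JsonLdExtractor`, the helper `extract_json_ld`, and the sample HTML are all hypothetical, and it covers only JSON-LD (not other embeddings such as Microdata or RDFa) and performs no normalization or validation.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Hypothetical helper: collects <script type="application/ld+json"> payloads."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # Start buffering when a JSON-LD script element opens.
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the whole page.


def extract_json_ld(html):
    """Return a list of parsed JSON-LD objects found in the HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Made-up sample page for illustration only.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "headline": "Hello"}
</script>
</head><body><p>Hello</p></body></html>
"""

for item in extract_json_ld(SAMPLE_HTML):
    print(item["@type"], item["headline"])
```

A production extractor would also need to handle the other schema.org syntaxes and the normalization and validation steps described above, which this sketch deliberately leaves out.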
Pricing
The following self-service plans are available from your Settings page, so you can get started quickly on your own. For more advanced requirements, contact us to discuss additional options.
Feature | Free | 100K Plan | 1M Plan |
---|---|---|---|
Price | — | 25 USD /month | 250 USD /month |
Usage | 250 credits /day | 100,000 credits /month | 1,000,000 credits /month |
Document Extraction | | | |
↳ Structured Data | Yes | Yes | Yes |
↳ Text Content | Yes | Yes | Yes |
↳ User-provided File | 1 credit /request | 1 credit /request | 1 credit /request |
↳ Verified Fetch + Render | 6 credits /request | 6 credits /request | 6 credits /request |
↳ Maximum Document Size | 2 MiB | 8 MiB | 8 MiB |
Workspace Features | | | |
↳ Access Tokens | — | 2 tokens | 5 tokens |
↳ Verified Domains | 2 domains | 10 domains | 50 domains |
Tools | | | |
↳ Web Application | Yes | Yes | Yes |
↳ Browser Extension | Yes | Yes | Yes |
↳ Developer API | — | Yes | Yes |
Anonymous, IP-based access is limited to 50 credits per day and does not support Workspace-related features.
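As a rough guide to what each plan's allowance covers, the sketch below divides the credit figures from the pricing table by the per-request costs. The function name `max_requests` is hypothetical, and the calculation assumes that all credits in a period are spent on a single request type and that credits are consumed only in whole requests; neither assumption is stated on this page.

```python
# Credit allowances and request costs copied from the pricing table.
PLAN_CREDITS = {
    "Free": (250, "day"),
    "100K Plan": (100_000, "month"),
    "1M Plan": (1_000_000, "month"),
}
REQUEST_COST = {
    "User-provided File": 1,
    "Verified Fetch + Render": 6,
}


def max_requests(plan, request_type):
    """Whole requests of one type that fit in a plan's allowance (assumed model)."""
    credits, _period = PLAN_CREDITS[plan]
    return credits // REQUEST_COST[request_type]


for plan, (credits, period) in PLAN_CREDITS.items():
    for request_type in REQUEST_COST:
        count = max_requests(plan, request_type)
        print(f"{plan}: up to {count} '{request_type}' requests per {period}")
```

Under these assumptions, the Free plan's 250 daily credits cover 41 Verified Fetch + Render requests, with 4 credits left over.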