Extractions Overview
Submit documents for AI extraction, track processing status, and retrieve structured results.
An Extraction is an asynchronous job that processes one or more documents against a parser. Submit files, poll until processing finishes, then download structured JSON results.
Documents are nested under extractions — there is no top-level /v1/documents resource. You always need both extraction_id and document_id to fetch a single document.
Which endpoint do I call?
| Goal | Endpoint |
|---|---|
| Submit files | POST /v1/extractions |
| Is the job done? | GET /v1/extractions/:extraction_id |
| All results at once | GET /v1/extractions/:extraction_id/results |
| One result (webhook retry, large batch) | GET /v1/extractions/:extraction_id/documents/:document_id |
| Export batch | GET /v1/extractions/:extraction_id/export |
| Export one file | GET /v1/extractions/:extraction_id/documents/:document_id/export |
Typical workflow
- Submit an extraction: POST /v1/extractions with parser_id and file(s). Returns 202 with extraction.extraction_id and per-document IDs.
- Check status: poll GET /v1/extractions/:extraction_id until status is completed, failed, or partial.
- Get results: fetch structured data for all documents in the batch.
- Export results (optional): download results as JSON, CSV, or Excel.
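The status-check step can be sketched as a small polling loop. This is a minimal sketch, not the official client: `fetch_status` is a hypothetical callable you supply that performs GET /v1/extractions/:extraction_id and returns the status string, and the interval and timeout defaults are arbitrary assumptions. Only the three terminal statuses come from the workflow above.

```python
import time
from typing import Callable

# Terminal statuses named in the workflow above.
TERMINAL_STATUSES = {"completed", "failed", "partial"}


def wait_for_extraction(
    fetch_status: Callable[[str], str],
    extraction_id: str,
    interval: float = 2.0,    # assumed polling interval, in seconds
    timeout: float = 300.0,   # assumed overall deadline, in seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the extraction reaches a terminal status, then return it.

    fetch_status is a hypothetical helper: it should call
    GET /v1/extractions/:extraction_id and return the status string.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status(extraction_id)
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(
                f"extraction {extraction_id} still {status!r} after {timeout}s"
            )
        sleep(interval)
```

Injecting `fetch_status`, `sleep`, and `clock` keeps the loop independent of any particular HTTP library and makes it easy to test without network access. A caller would typically branch on the returned value, since failed and partial are terminal but do not mean every document succeeded.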
For real-time delivery instead of polling, configure Webhooks. Webhook payloads include both extraction_id and document_id for nested follow-up requests.
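Because documents are nested under extractions, a webhook handler needs both IDs to build its follow-up request. The sketch below assumes the payload is a JSON object with extraction_id and document_id as top-level string fields; the exact payload shape is not specified here, so adjust the lookups to match the real schema. The function name is illustrative.

```python
import json
from urllib.parse import quote


def document_path_from_webhook(raw_body: bytes) -> str:
    """Return the nested document path for a webhook payload.

    Assumes extraction_id and document_id are top-level string fields
    of the JSON body (an assumption, not a documented schema).
    """
    payload = json.loads(raw_body)
    try:
        extraction_id = payload["extraction_id"]
        document_id = payload["document_id"]
    except KeyError as missing:
        raise ValueError(f"webhook payload is missing {missing}") from None
    # Percent-encode each ID so it cannot alter the path structure.
    return (
        f"/v1/extractions/{quote(extraction_id, safe='')}"
        f"/documents/{quote(document_id, safe='')}"
    )
```

Appending /export to the returned path gives the single-document export route listed in the endpoint tables.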
Endpoints
| Operation | Method and path |
|---|---|
| Submit Extraction | POST /v1/extractions |
| List Extractions | GET /v1/extractions |
| Get Status | GET /v1/extractions/:extraction_id |
| Get Results | GET /v1/extractions/:extraction_id/results |
| Export Results | GET /v1/extractions/:extraction_id/export |
| Get Document | GET /v1/extractions/:extraction_id/documents/:document_id |
| Export Document | GET /v1/extractions/:extraction_id/documents/:document_id/export |
Billing
Credits are charged per page (1 page = 1 credit). The submit endpoint counts pages server-side and rejects the request with 402 INSUFFICIENT_CREDITS if your balance is too low. Note that POST /v1/extractions also has a tighter rate limit: 20 requests/minute.
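The two numbers above translate into simple client-side arithmetic: a batch costs one credit per page, and 20 requests/minute works out to one submission every 3 seconds if spaced evenly. The sketch below is an illustration under assumptions: `credits_needed` and `SubmitThrottle` are hypothetical helpers, the server's page count remains authoritative, and even spacing is a conservative choice because the server's rate-limit window algorithm is not described here.

```python
import time
from typing import Callable, Iterable, Optional

SUBMIT_LIMIT_PER_MINUTE = 20                      # from the Billing section
MIN_SUBMIT_GAP = 60.0 / SUBMIT_LIMIT_PER_MINUTE   # 3.0 seconds between submits


def credits_needed(page_counts: Iterable[int]) -> int:
    """Estimate the cost of a batch at 1 credit per page.

    This is only a local pre-check; the server counts pages itself.
    """
    return sum(page_counts)


class SubmitThrottle:
    """Space calls to POST /v1/extractions at least MIN_SUBMIT_GAP apart."""

    def __init__(
        self,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None

    def wait(self) -> float:
        """Block until the next submission is allowed; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
        # The submission happens at now + delay; schedule the following slot.
        self._next_allowed = now + delay + MIN_SUBMIT_GAP
        return delay
```

Call `wait()` immediately before each submission. A request can still be rejected with 402 INSUFFICIENT_CREDITS if the local page estimate is wrong or the balance changes in the meantime, so the response should be checked regardless.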
Related
- Parsers — configure the parser and schema before extracting.
- Automation Integrations — polling, webhooks, and idempotency patterns.