Transform unstructured documents into clean, structured data. The document_parse task type handles PDFs, Word docs, scanned images, invoices, contracts, and more, returning JSON-structured results.
Extract vendor, line items, totals, due dates, and payment terms from any invoice format.
Pull out parties, key clauses, obligations, dates, and termination terms automatically.
Parse clinical notes, lab results, and patient data while preserving structure.
Extract balance sheet line items, P&L data, and footnotes from SEC filings.
Process job applications, insurance forms, and government documents at scale.
Extract abstracts, citations, tables, and key findings from academic literature.
Parse an invoice from a URL or base64-encoded content:
curl -X POST https://api.crowdsorcerer.dev/v1/tasks \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "document_parse",
    "input": {
      "url": "https://example.com/invoice-2024-01.pdf",
      "extract": ["vendor_name", "invoice_number", "total_amount",
                  "line_items", "due_date", "payment_terms"]
    }
  }'
# Response output:
{
  "vendor_name": "Acme Corp",
  "invoice_number": "INV-2024-0042",
  "total_amount": 4850.00,
  "due_date": "2024-02-15",
  "line_items": [
    { "description": "Software License", "qty": 1, "unit_price": 4500.00 },
    { "description": "Support", "qty": 1, "unit_price": 350.00 }
  ]
}

500 free credits on signup. No credit card required.
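The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names (build_task, submit_task, line_items_total) are made up for illustration, and the "content" field used for base64 input is an assumed name, since the example above only shows the "url" form. Check the API reference for the actual field before relying on it.

```python
import base64
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.crowdsorcerer.dev/v1/tasks"


def build_task(fields, url=None, file_bytes=None):
    """Build a document_parse task body from a URL or from raw file bytes."""
    if (url is None) == (file_bytes is None):
        raise ValueError("pass exactly one of url or file_bytes")
    task_input = {"extract": list(fields)}
    if url is not None:
        task_input["url"] = url
    else:
        # ASSUMPTION: "content" is a hypothetical field name for base64 input;
        # the source only documents the "url" field.
        task_input["content"] = base64.b64encode(file_bytes).decode("ascii")
    return {"type": "document_parse", "input": task_input}


def submit_task(task, api_key):
    """POST the task and return the decoded JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(task).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def line_items_total(result):
    """Sum qty * unit_price over the parsed line items."""
    return sum(item["qty"] * item["unit_price"] for item in result["line_items"])
```

A call such as submit_task(build_task(["total_amount", "line_items"], url="https://example.com/invoice-2024-01.pdf"), "YOUR_API_KEY") mirrors the curl request. Comparing line_items_total(result) with result["total_amount"] is a cheap sanity check on a parsed invoice. For the sample response above, both come to 4850.0.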