📄

Document Parsing & Extraction

Transform unstructured documents into clean, structured data. The document_parse task type handles PDFs, Word documents, scanned images, invoices, contracts, and more, and returns the extracted fields as structured JSON.

🧾

Invoice Processing

Extract vendor, line items, totals, due dates, and payment terms from any invoice format.

📑

Contract Analysis

Pull out parties, key clauses, obligations, dates, and termination terms automatically.

🏥

Medical Records

Parse clinical notes, lab results, and patient data while preserving structure.

🏦

Financial Statements

Extract balance sheet line items, P&L data, and footnotes from SEC filings.

📋

Forms & Applications

Process job applications, insurance forms, and government documents at scale.

📚

Research Papers

Extract abstracts, citations, tables, and key findings from academic literature.

API Example

Parse an invoice from a URL or base64-encoded content:

curl -X POST https://api.crowdsorcerer.dev/v1/tasks \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "document_parse",
    "input": {
      "url": "https://example.com/invoice-2024-01.pdf",
      "extract": ["vendor_name", "invoice_number", "total_amount",
                  "line_items", "due_date", "payment_terms"]
    }
  }'

# Response output:
{
  "vendor_name": "Acme Corp",
  "invoice_number": "INV-2024-0042",
  "total_amount": 4850.00,
  "due_date": "2024-02-15",
  "line_items": [
    { "description": "Software License", "qty": 1, "unit_price": 4500.00 },
    { "description": "Support", "qty": 1, "unit_price": 350.00 }
  ]
}
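
The same request body can be built in code, and the parsed result can be sanity-checked before it goes into a downstream system. The following is a minimal Python sketch, not an official client. It only constructs the payload and inspects a result, it does not send anything. The helper names (build_payload, missing_fields, line_items_match_total) are hypothetical, and the "content_base64" field name is an assumption, since the curl example above only shows the "url" form. The total check also assumes the invoice total is the plain sum of its line items, which will not hold for invoices with tax or discounts.

```python
import base64
import math

# Fields requested in the curl example above.
FIELDS = ["vendor_name", "invoice_number", "total_amount",
          "line_items", "due_date", "payment_terms"]

# The sample response output shown above, copied verbatim.
SAMPLE_RESPONSE = {
    "vendor_name": "Acme Corp",
    "invoice_number": "INV-2024-0042",
    "total_amount": 4850.00,
    "due_date": "2024-02-15",
    "line_items": [
        {"description": "Software License", "qty": 1, "unit_price": 4500.00},
        {"description": "Support", "qty": 1, "unit_price": 350.00},
    ],
}


def build_payload(fields, url=None, raw_bytes=None):
    """Build the JSON body for a document_parse task from a URL or raw bytes."""
    if (url is None) == (raw_bytes is None):
        raise ValueError("pass exactly one of url or raw_bytes")
    task_input = {"extract": list(fields)}
    if url is not None:
        task_input["url"] = url
    else:
        # ASSUMPTION: "content_base64" is a hypothetical field name;
        # check the API reference for the real one.
        task_input["content_base64"] = base64.b64encode(raw_bytes).decode("ascii")
    return {"type": "document_parse", "input": task_input}


def missing_fields(requested, result):
    """Return the requested fields that are absent from the parsed result."""
    return [name for name in requested if name not in result]


def line_items_match_total(result):
    """Check that qty * unit_price summed over line items equals total_amount."""
    computed = sum(item["qty"] * item["unit_price"] for item in result["line_items"])
    return math.isclose(computed, result["total_amount"], abs_tol=0.005)


payload = build_payload(FIELDS, url="https://example.com/invoice-2024-01.pdf")
print(payload["type"])                           # document_parse
print(missing_fields(FIELDS, SAMPLE_RESPONSE))   # ['payment_terms']
print(line_items_match_total(SAMPLE_RESPONSE))   # True
```

Run against the sample output above, the check reports payment_terms as missing, since that field was requested but does not appear in the response shown. Treating requested fields as optional in this way lets a pipeline flag incomplete extractions for review instead of failing on a missing key.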

Supported Formats

PDF, DOCX, XLSX, PNG/JPG (OCR), HTML, CSV, TXT, RTF, PPTX, Scanned docs

Automate your document workflow

500 free credits on signup. No credit card required.