
Async processing

Every extraction job runs asynchronously. The recommended path is to register a webhook and let DocuClipper push the completed extraction to you. Polling GET /api/v1/agent/jobs/:id is supported as a fallback.

Webhooks (recommended)

Subscribe to the doctype-specific event for your job:

  • bank_statement.extraction.completed — bank statements + check images
  • invoice.extraction.completed — invoices / receipts
  • form.extraction.completed — tax forms
  • document.extraction.failed — always subscribe to catch errors

Full setup, signature verification, retries, and cleanup live on the Webhooks page.
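
As a rough sketch of how a receiver might route these events, the Python below maps each event name from the list above to a handler. The payload field names (`event`, `data`, `error`) and the handler functions are assumptions made for illustration, not the documented schema. Signature verification is deliberately left out; follow the Webhooks page for that.

```python
import json

def make_completed_handler(kind):
    """Build a placeholder handler for a '*.extraction.completed' event."""
    def handler(payload):
        # "data" is an assumed field name for the extracted JSON.
        return (kind, payload.get("data"))
    return handler

def on_failed(payload):
    # "error" is an assumed field name for the inline error message.
    return ("failed", payload.get("error"))

# Event names are taken from the list above.
HANDLERS = {
    "bank_statement.extraction.completed": make_completed_handler("bank_statement"),
    "invoice.extraction.completed": make_completed_handler("invoice"),
    "form.extraction.completed": make_completed_handler("form"),
    "document.extraction.failed": on_failed,
}

def dispatch(raw_body):
    """Parse a webhook body and call the handler registered for its event."""
    payload = json.loads(raw_body)
    event = payload.get("event")  # assumed field name for the event type
    handler = HANDLERS.get(event)
    if handler is None:
        # Unknown events are ignored rather than treated as errors.
        return ("ignored", event)
    return handler(payload)
```

Keeping the failure event in the same table as the completed events makes it harder to forget to subscribe to it.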

Polling (fallback)

bash
curl -s -H "Authorization: Bearer $PAT" \
  "https://www.docuclipper.com/api/v1/agent/jobs/12345"

Sample response when the job is finished:

json
{
  "id": "12345",
  "name": "sdk-demo-bank",
  "status": "Succeeded",
  "type": "ExtractData",
  "isGeneric": false,
  "isBankMode": true,
  "reviewed": false,
  "createdAt": "2026-05-26T15:27:04.682Z",
  "updatedAt": "2026-05-26T15:27:25.911Z",
  "documents": [
    { "id": "2666907", "originalname": "statement.pdf", "mimetype": "application/pdf", "numPages": 4, "isProcessed": true, "isScanned": false }
  ],
  "summary": { "totalDocuments": 1, "totalPages": 4, "transactionLineCount": 137 }
}
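
To show how the fields in this response might be consumed, here is a small Python sketch that pulls out the status, the counts under `summary`, and the time between `createdAt` and `updatedAt`. The `summarize` helper is hypothetical, and treating `updatedAt` as the completion time is an assumption; the sample is trimmed to the fields the helper reads.

```python
import json
from datetime import datetime

# Trimmed copy of the sample response above.
SAMPLE = """
{
  "id": "12345",
  "status": "Succeeded",
  "createdAt": "2026-05-26T15:27:04.682Z",
  "updatedAt": "2026-05-26T15:27:25.911Z",
  "summary": { "totalDocuments": 1, "totalPages": 4, "transactionLineCount": 137 }
}
"""

def parse_ts(value):
    # Replace the trailing "Z" with an explicit UTC offset so that
    # datetime.fromisoformat also accepts it on Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def summarize(job):
    """Return a flat summary of a job response (hypothetical helper)."""
    elapsed = parse_ts(job["updatedAt"]) - parse_ts(job["createdAt"])
    return {
        "id": job["id"],
        "status": job["status"],
        "elapsed_seconds": elapsed.total_seconds(),
        "pages": job["summary"]["totalPages"],
        "transactions": job["summary"]["transactionLineCount"],
    }

result = summarize(json.loads(SAMPLE))
print(result)
```

For the sample above, the two timestamps are about 21.2 seconds apart.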

Job statuses

  • Pending — queued, not yet picked up by a worker
  • InProgress — worker actively processing
  • Succeeded — done; webhook fired (if subscribed) and GET /agent/jobs/:id/data is ready
  • Failed — see document.extraction.failed webhook for the error message
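
If you do fall back to polling, the statuses above suggest a simple loop: keep fetching until the job reaches Succeeded or Failed. The Python below is a minimal sketch under stated assumptions: the function names, the starting interval, the 1.5x backoff, and the overall timeout are illustrative choices, not documented limits. Only the URL and the Bearer header come from the curl example.

```python
import json
import time
import urllib.request

BASE_URL = "https://www.docuclipper.com/api/v1/agent"
TERMINAL_STATUSES = {"Succeeded", "Failed"}

def fetch_job(job_id, token):
    """GET the job row, mirroring the curl example above."""
    request = urllib.request.Request(
        f"{BASE_URL}/jobs/{job_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)

def wait_for_job(job_id, token, fetch=fetch_job, sleep=time.sleep,
                 interval=2.0, max_interval=15.0, timeout=300.0):
    """Poll until the job is Succeeded or Failed, backing off between polls.

    interval, max_interval, and timeout are assumed values; tune them
    to your own workload.
    """
    waited = 0.0
    while True:
        job = fetch(job_id, token)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} still {job['status']} after {waited:.0f}s")
        sleep(interval)
        waited += interval
        # Grow the delay so long-running jobs generate fewer requests.
        interval = min(interval * 1.5, max_interval)
```

Passing `fetch` and `sleep` as parameters keeps the loop testable without network access or real delays. A caller would typically invoke `wait_for_job("12345", PAT)` and then branch on the returned status.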

Why webhooks beat polling

  • No wasted requests. Most polls during the average job's ~30 second runtime return InProgress. Skip them.
  • The payload is free. The webhook body already contains the full extracted JSON, so you don't need a follow-up GET /agent/jobs/:id/data call to get the result.
  • Failure handling. document.extraction.failed carries the error message inline. Polling makes you reconstruct it from the job row.