Tax Form OCR API
Extract key-value fields from W-2, 1099, and other tax forms. Output preserves the form’s labels as keys so you can map fields into your own schema.
- W-2 and 1099 support
- Label-preserving key-value extraction
- JSON export keyed by documentId
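Because keys are the form's own labels, mapping into your schema is mostly a lookup. The sketch below is one way to do it, not part of the API: the label strings, the target field names, and the `mapForm` helper are all illustrative assumptions, so check a real export for the exact labels before relying on them.

```javascript
// Hypothetical label-to-field mapping. The exact label strings returned by the
// API are an assumption here; inspect a real export and adjust.
const LABEL_TO_FIELD = {
  'Wages, tips, other compensation': 'wages',
  'Federal income tax withheld': 'federalTaxWithheld',
  "Employee's social security number": 'employeeSsn',
};

// Normalize a label so differences in case, spacing, or punctuation still match.
const normalize = (label) => label.toLowerCase().replace(/[^a-z0-9]+/g, ' ').trim();

const NORMALIZED = new Map(
  Object.entries(LABEL_TO_FIELD).map(([label, field]) => [normalize(label), field])
);

// Split one extracted key-value map into fields we recognize and fields we do not.
function mapForm(form) {
  const mapped = {};
  const unmapped = {};
  for (const [label, value] of Object.entries(form)) {
    const field = NORMALIZED.get(normalize(label));
    if (field) mapped[field] = value;
    else unmapped[label] = value;
  }
  return { mapped, unmapped };
}
```

Keeping the unmapped labels around makes it easier to spot fields the lookup table does not cover yet.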
Quick example
```bash
curl -X POST "https://www.docuclipper.com/api/v1/protected/document" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "document=@w2.pdf"
```
After the upload, create a job with jobType: "Form", then export with jobType: "Form" and format: "json".
Sample file: Use data/w2-b (dragged).pdf for testing.
Field reference
| Field | Type | Description |
|---|---|---|
| [documentId][] | array | List of extracted form objects for that document (one per detected form instance) |
| [documentId][i].rowNumber | number | Form index within the document (0-based) |
| [documentId][i].pageNumber | number | Page number associated with that extracted form |
| [documentId][i].form | object | Key-value map of extracted fields. Keys are the form’s labels (strings); values are extracted strings. |
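To make the shape in the table concrete, here is a minimal sketch that walks an export and produces one flat record per detected form. The `listForms` helper and the sample payload are hypothetical; the document ID, labels, and values are made up for illustration.

```javascript
// Flatten an export shaped like the field reference into one record per form.
function listForms(exportJson) {
  const records = [];
  for (const [documentId, forms] of Object.entries(exportJson)) {
    if (!Array.isArray(forms)) continue; // skip anything that is not a list of forms
    for (const entry of forms) {
      records.push({
        documentId,
        rowNumber: entry.rowNumber,
        pageNumber: entry.pageNumber,
        fields: entry.form ?? {},
      });
    }
  }
  return records;
}

// Made-up sample payload; not real API output.
const sample = {
  '12345': [
    { rowNumber: 0, pageNumber: 1, form: { 'Example label': 'example value' } },
    { rowNumber: 1, pageNumber: 2, form: { 'Example label': 'another value' } },
  ],
};

for (const r of listForms(sample)) {
  console.log(r.documentId, r.rowNumber, r.pageNumber, Object.keys(r.fields));
}
```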
Full example (no SDK)
Complete flow: upload → create job (jobType: Form) → poll until Succeeded → export.
```javascript
// Requires Node.js 18+ for the global fetch, FormData, and Blob.
const fs = require('fs');

const BASE = 'https://www.docuclipper.com/api/v1';
const API_KEY = process.env.DOCUCLIPPER_API_KEY;
const auth = (h = {}) => ({ Authorization: `Bearer ${API_KEY}`, ...h });

// Top-level await is not available in CommonJS, so the flow runs inside an async function.
async function main() {
  // 1. Upload the document.
  const form = new FormData();
  form.append('document', new Blob([fs.readFileSync('w2.pdf')]), 'w2.pdf');
  const up = await fetch(`${BASE}/protected/document`, { method: 'POST', headers: auth(), body: form });
  const docId = (await up.json()).document.id;

  // 2. Create a Form job for the uploaded document.
  const job = await fetch(`${BASE}/protected/job`, {
    method: 'POST', headers: auth({ 'Content-Type': 'application/json' }),
    body: JSON.stringify({ jobName: '', documents: [docId], jobType: 'Form' })
  });
  const jobId = (await job.json()).id;

  // 3. Poll every 2 seconds until the job succeeds; stop if it fails.
  let status;
  while ((status = (await fetch(`${BASE}/protected/job/${jobId}`, { headers: auth() }).then(r => r.json())).status) !== 'Succeeded') {
    if (status === 'Failed') throw new Error('Job failed');
    await new Promise(r => setTimeout(r, 2000));
  }

  // 4. Export the extracted fields as JSON.
  const exp = await fetch(`${BASE}/protected/job/${jobId}/export`, {
    method: 'POST', headers: auth({ 'Content-Type': 'application/json' }),
    body: JSON.stringify({ jobType: 'Form', flattenTables: true, format: 'json' })
  });
  console.log(await exp.json());
}

main().catch((err) => { console.error(err); process.exitCode = 1; });
```
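The polling loop in the example above keeps running if the job never reaches Succeeded or Failed. One possible variation, sketched below, adds a deadline. The `waitForJob` helper, its option names, and the default timeout are assumptions for illustration and are not part of the API; the status check is passed in as a function so the helper does not depend on any particular endpoint.

```javascript
// Poll `getStatus` until it resolves to 'Succeeded'.
// Throws if the status is 'Failed' or if the deadline would be exceeded.
async function waitForJob(getStatus, { intervalMs = 2000, timeoutMs = 120000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const status = await getStatus();
    if (status === 'Succeeded') return status;
    if (status === 'Failed') throw new Error('Job failed');
    // Give up when the next wait would run past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Timed out waiting for job (last status: ${status})`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage with the names from the full example (BASE, auth, jobId):
//
//   await waitForJob(() =>
//     fetch(`${BASE}/protected/job/${jobId}`, { headers: auth() })
//       .then((r) => r.json())
//       .then((j) => j.status)
//   );
```

Any status other than Succeeded or Failed is treated as "still running", which matches the behavior of the original loop.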