Invoice OCR API

Extract structured data from invoice PDFs: vendor, line items, totals, and dates. Output is JSON or CSV for accounting systems and expense tools.

  • Vendor name, invoice number, dates
  • Line items with quantities and amounts
  • Multi-page and scanned invoice support

Quick example

bash
curl -X POST "https://www.docuclipper.com/api/v1/protected/document" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "document=@invoice.pdf"

Create job with jobType: "Invoice", then export with jobType: "Invoice".

Sample file: In the repo, use data/invoice2 copy.pdf for testing.

Field reference

FieldTypeDescription
vendorstringVendor/supplier name
invoiceNumberstringInvoice number
datestringInvoice date
lineItems[]arrayLine items with description, quantity, amount
totaldecimalTotal amount

Full example (no SDK)

Complete flow using only HTTP and JSON: upload → create job (jobType: Invoice) → poll → export.

javascript
const fs = require('fs');
const BASE = 'https://www.docuclipper.com/api/v1';
const API_KEY = process.env.DOCUCLIPPER_API_KEY;
const auth = (h = {}) => ({ Authorization: `Bearer ${API_KEY}`, ...h });

const form = new FormData();
form.append('document', new Blob([fs.readFileSync('invoice.pdf')]), 'invoice.pdf');
const up = await fetch(`${BASE}/protected/document`, { method: 'POST', headers: auth(), body: form });
const docId = (await up.json()).document.id;

const job = await fetch(`${BASE}/protected/job`, {
  method: 'POST', headers: auth({ 'Content-Type': 'application/json' }),
  body: JSON.stringify({ jobName: '', documents: [docId], jobType: 'Invoice' })
});
const jobId = (await job.json()).id;

let status;
while ((status = (await fetch(`${BASE}/protected/job/${jobId}`, { headers: auth() }).then(r => r.json())).status) !== 'Succeeded') {
  if (status === 'Failed') throw new Error('Job failed');
  await new Promise(r => setTimeout(r, 2000));
}

const exp = await fetch(`${BASE}/protected/job/${jobId}/export`, {
  method: 'POST', headers: auth({ 'Content-Type': 'application/json' }),
  body: JSON.stringify({ jobType: 'Invoice', flattenTables: true, format: 'json' })
});
console.log(await exp.json());