AWS Textract
Amazon’s cloud-based document text and data extraction service. The Analyze Expense feature is purpose-built for receipts and invoices, providing the highest line-item accuracy in third-party benchmarks.
Why It Matters for Project Aries
Textract is the accuracy leader for receipt parsing among cloud APIs. If Project Aries’ receipt scanning backend runs on AWS infrastructure, Textract is the natural choice. Its structured output is also the most detailed, providing labeled fields and line items out of the box.
Key Facts
- Provider: Amazon Web Services
- Pricing: $0.008/page (1M+ pages), 1,000 pages free for the first 3 months
- Accuracy: 93% field-level, 89% line-item across 100 tested receipts
- Features: Tables, forms, key-value pairs, Analyze Expense (receipt/invoice specific)
Receipt-Specific Performance
| Receipt Type | Field Accuracy | Line-Item Accuracy |
|---|---|---|
| Restaurant | 96% | 93% |
| Fuel station | 95% | 91% |
| Pharmacy | 93% | 88% |
| Supermarket | 91% | 86% |
| Hardware store | 92% | 87% |
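As a quick sanity check, the per-type figures in the table can be averaged and compared with the headline 93% field-level and 89% line-item numbers. This is a minimal sketch that assumes the five receipt types were equally represented among the 100 tested receipts; the source does not state the split, so the unweighted mean is only an approximation.

```python
# Per-type accuracies copied from the table above: (field %, line-item %).
per_type = {
    "Restaurant": (96, 93),
    "Fuel station": (95, 91),
    "Pharmacy": (93, 88),
    "Supermarket": (91, 86),
    "Hardware store": (92, 87),
}


def unweighted_mean(values):
    """Plain average; assumes every receipt type contributed equally (an assumption)."""
    values = list(values)
    return sum(values) / len(values)


field_mean = unweighted_mean(f for f, _ in per_type.values())
line_item_mean = unweighted_mean(li for _, li in per_type.values())

print(f"field-level: {field_mean:.1f}%")
print(f"line-item:   {line_item_mean:.1f}%")
```

Under that equal-weight assumption the averages land at roughly 93% and 89%, which is consistent with the overall figures listed under Key Facts.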
Strengths
- Highest overall line-item accuracy tested
- Best structured output with labeled field types (VENDOR_NAME, TOTAL, INVOICE_RECEIPT_DATE, ITEM, PRICE)
- Native AWS integration (S3, Lambda, DynamoDB)
- Handles complex layouts (multi-column, price modifiers)
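To illustrate the native AWS integration, here is a minimal sketch of an S3-triggered Lambda function that sends each uploaded receipt to Analyze Expense and stores the labeled summary fields in DynamoDB. The table name `receipts`, the item attribute names, and the helper function names are hypothetical; the clients are passed in as parameters so the logic can be exercised without AWS credentials.

```python
from urllib.parse import unquote_plus


def summary_texts(response):
    """Map each labeled summary field type (e.g. TOTAL) to its detected text."""
    out = {}
    for doc in response.get("ExpenseDocuments", []):
        for field in doc.get("SummaryFields", []):
            label = field.get("Type", {}).get("Text")
            value = field.get("ValueDetection", {}).get("Text")
            # Keep the first occurrence of each label; later duplicates are ignored.
            if label and value and label not in out:
                out[label] = value
    return out


def handle_s3_event(event, textract, table):
    """Process an S3 "object created" event.

    textract: a boto3 Textract client, e.g. boto3.client("textract")
    table:    a boto3 DynamoDB Table resource
    """
    stored = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        response = textract.analyze_expense(
            Document={"S3Object": {"Bucket": bucket, "Name": key}}
        )
        # Attribute names below are illustrative, not a fixed schema.
        item = {"receipt_key": key, "bucket": bucket, "fields": summary_texts(response)}
        table.put_item(Item=item)
        stored.append(item)
    return stored


def lambda_handler(event, context):
    import boto3  # imported lazily so the module also loads without the AWS SDK

    textract = boto3.client("textract")
    table = boto3.resource("dynamodb").Table("receipts")  # hypothetical table name
    return handle_s3_event(event, textract, table)
```

Only string values are written to DynamoDB here, which sidesteps the float-to-Decimal conversion that boto3 requires for numeric attributes. A real pipeline would likely add error handling and parse amounts explicitly.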
Weaknesses
- Requires AWS account/IAM setup (steeper learning curve than a simple API-key REST service)
- No permanent receipt-specific free tier (3 months only)
- Verbose response format — needs SDK helpers or custom parsing
- Cloud-only (no on-device option)
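The verbose response format can be tamed with a small amount of custom parsing. The sketch below flattens the nested `ExpenseDocuments` / `LineItemGroups` / `LineItems` / `LineItemExpenseFields` structure of an Analyze Expense response into one dictionary per line item, dropping values below a confidence threshold. The function name, the threshold, and the sample response are illustrative assumptions; the sample is hand-written, not real Textract output.

```python
def parse_line_items(response, min_confidence=0.0):
    """Flatten an AnalyzeExpense response into a list of {field_type: text} dicts.

    Values whose detection confidence (0-100) is below min_confidence are omitted.
    """
    rows = []
    for doc in response.get("ExpenseDocuments", []):
        for group in doc.get("LineItemGroups", []):
            for line in group.get("LineItems", []):
                row = {}
                for field in line.get("LineItemExpenseFields", []):
                    label = field.get("Type", {}).get("Text")
                    value = field.get("ValueDetection", {})
                    if label and value.get("Confidence", 0.0) >= min_confidence:
                        row[label] = value.get("Text", "")
                if row:  # skip line items where nothing passed the threshold
                    rows.append(row)
    return rows


# Hand-written sample in the shape of an AnalyzeExpense response (illustrative only).
sample = {
    "ExpenseDocuments": [{
        "LineItemGroups": [{
            "LineItems": [
                {"LineItemExpenseFields": [
                    {"Type": {"Text": "ITEM"},
                     "ValueDetection": {"Text": "Coffee", "Confidence": 98.1}},
                    {"Type": {"Text": "PRICE"},
                     "ValueDetection": {"Text": "3.50", "Confidence": 97.4}},
                ]},
                {"LineItemExpenseFields": [
                    {"Type": {"Text": "ITEM"},
                     "ValueDetection": {"Text": "Bagel", "Confidence": 95.0}},
                    {"Type": {"Text": "PRICE"},
                     "ValueDetection": {"Text": "2.?5", "Confidence": 41.2}},
                ]},
            ]
        }]
    }]
}

rows = parse_line_items(sample, min_confidence=80.0)
print(rows)  # [{'ITEM': 'Coffee', 'PRICE': '3.50'}, {'ITEM': 'Bagel'}]
```

Dropping low-confidence values rather than whole line items leaves a visible gap (the second item has no `PRICE`), which a receipt-scanning UI could surface for manual correction.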
Related Pages
- ios-receipt-scanning — Comprehensive receipt scanning guide
- tabscanner — Receipt-specialized competitor
- veryfi — Another receipt OCR API