Tesseract OCR
Open-source optical character recognition engine, originally developed by HP, later sponsored by Google, and now maintained by the open-source community. One of the oldest and most widely used OCR engines. Can be built for iOS through community wrappers distributed via Swift Package Manager or CocoaPods.
Why It Matters for Project Aries
Tesseract is the “free and offline” option. It is useful for prototyping, offline-first apps, or as a fallback OCR step in a multi-engine pipeline. However, its raw accuracy on receipts is poor (50-70%) and it provides no structured extraction, so significant custom development is needed to make it usable for receipts.
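The fallback role described above can be sketched roughly as follows. This is a minimal illustration, not an Aries design: `OcrResult`, `run_with_fallback`, and the 0.8 confidence threshold are hypothetical names and values, and each engine is assumed to be a callable that returns a text and confidence pair.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class OcrResult:
    engine: str        # label for the engine that produced the text
    text: str          # raw recognised text
    confidence: float  # assumed to be normalised to 0.0-1.0


# Hypothetical engine interface: image bytes in, result (or None) out.
OcrEngine = Callable[[bytes], Optional[OcrResult]]


def run_with_fallback(
    image: bytes,
    engines: Sequence[OcrEngine],
    min_confidence: float = 0.8,  # assumed threshold, tune per project
) -> Optional[OcrResult]:
    """Try engines in order; return the first confident result, else the best seen."""
    best: Optional[OcrResult] = None
    for engine in engines:
        try:
            result = engine(image)
        except Exception:
            # An engine that fails (e.g. no network) is skipped, not fatal.
            continue
        if result is None:
            continue
        if result.confidence >= min_confidence:
            return result
        if best is None or result.confidence > best.confidence:
            best = result
    return best
```

Placing an offline Tesseract wrapper last in `engines` means it is only consulted when the preferred engines fail or report low confidence.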
Key Facts
- Development began: 1985 (HP); open-sourced 2005; Google-sponsored from 2006; now community-maintained
- Version: 5.x (LSTM-based recognition engine, introduced in 4.0)
- License: Apache 2.0
- Language support: 100+ languages via training data packs
- iOS availability: Via community wrappers (SwiftyTesseract, Tesseract-OCR-iOS)
Accuracy on Receipts
- 50-70% on real-world receipts (Google AI Mode estimate, September 2025)
- About 75% with a well-tuned custom preprocessing pipeline
- No line-item extraction: returns raw text strings only
- Struggles with complex layouts, multi-column text, and handwriting
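Because the engine returns only raw text, line items have to be recovered in post-processing. The sketch below shows one simple approach, assuming each item sits on its own line and ends in a price with two decimals. The `LINE_ITEM` pattern and `parse_line_items` are hypothetical, and real receipts will need far more rules than this.

```python
import re
from decimal import Decimal
from typing import List, Tuple

# Assumed layout: description, whitespace, optional "$", price with two decimals.
LINE_ITEM = re.compile(r"^(?P<desc>.+?)\s+\$?(?P<price>\d+[.,]\d{2})\s*$")


def parse_line_items(raw_text: str) -> List[Tuple[str, Decimal]]:
    """Extract (description, price) pairs from raw OCR text, skipping other lines."""
    items: List[Tuple[str, Decimal]] = []
    for line in raw_text.splitlines():
        match = LINE_ITEM.match(line.strip())
        if match is None:
            continue  # headers, footers, and misread lines are ignored
        # Normalise a decimal comma to a decimal point before parsing.
        price = Decimal(match.group("price").replace(",", "."))
        items.append((match.group("desc").strip(), price))
    return items
```

Note that totals and tax lines match the same pattern, so separating them from purchased items needs additional keyword or position rules.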
Advantages
- Completely free (no per-page cost, no API limits)
- Offline — no network needed, privacy-preserving
- Highly customisable (preprocessing, character whitelisting, dictionary tuning)
- Multi-language
Disadvantages
- Requires heavy custom development for receipt use cases
- Image preprocessing (binarization, deskew, orientation correction) needed for usable results
- No structured output — post-processing entirely custom
- Poor on handwritten text
- Performance tuning is domain-specific and time-intensive
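To make the binarization step concrete, here is a dependency-free sketch using Otsu's method, a common global thresholding technique chosen here for illustration rather than taken from the notes above. It assumes the image is already a flat sequence of 8-bit grayscale values; `otsu_threshold` and `binarize` are hypothetical helper names, and a production pipeline would more likely use an image library and also handle deskew and orientation.

```python
from typing import List, Optional, Sequence


def otsu_threshold(pixels: Sequence[int]) -> int:
    """Return the 0-255 level that maximises between-class variance (Otsu's method)."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for level in range(256):
        weight_bg += histogram[level]
        sum_bg += level * histogram[level]
        if weight_bg == 0:
            continue  # no pixels at or below this level yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is now in the background class
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(pixels: Sequence[int], threshold: Optional[int] = None) -> List[int]:
    """Map each pixel to 0 (at or below threshold) or 255 (above threshold)."""
    if threshold is None:
        threshold = otsu_threshold(pixels)
    return [255 if value > threshold else 0 for value in pixels]
```

A single global threshold tends to fail on receipts with shadows or uneven lighting, which is one reason the tuning effort listed above is domain-specific.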
Related Pages
- ios-receipt-scanning — Comprehensive receipt scanning guide
- apple-vision-framework — Higher-accuracy on-device alternative
- google-ml-kit — Faster on-device alternative