Apple Vision Framework
Apple’s native on-device machine learning framework for computer vision tasks. Available on iOS, macOS, iPadOS, tvOS, and visionOS. All processing runs entirely on-device for privacy and performance.
Why It Matters for Project Aries
Vision is Apple's first-party option for on-device OCR and document scanning in iOS apps. It is free, deeply integrated with the OS, and has gained features at each WWDC; WWDC25 added structured document reading that directly supports receipt scanning.
Key Capabilities
Text Recognition (VNRecognizeTextRequest)
- Fast mode: Character-detection + small ML model — similar to traditional OCR
- Accurate mode: Neural network finds strings/lines, then words — human-like reading
- Multi-language support with language correction via NLP
- Custom word lists for domain-specific jargon
- Confidence scores per recognized string (0-1)
- Word-level bounding boxes
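The options above map onto a handful of properties on VNRecognizeTextRequest. The following is a minimal sketch, not a definitive implementation: `RecognizedLine`, `recognizeLines`, and `confidentLines` are hypothetical names introduced here, and the "en-US" language and 0.5 threshold are illustrative assumptions. It assumes a recent SDK in which `request.results` is typed as `[VNRecognizedTextObservation]?`.

```swift
import Foundation
import Vision

/// One recognized line: its text, confidence (0 to 1), and per-word boxes.
/// Hypothetical type introduced for this sketch.
struct RecognizedLine {
    let text: String
    let confidence: Float
    /// Normalized boxes (origin at lower-left), one per word Vision could locate.
    let wordBoxes: [CGRect]
}

/// Runs accurate-mode recognition on `image` and returns one entry per detected line.
func recognizeLines(in image: CGImage, customWords: [String] = []) throws -> [RecognizedLine] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate       // use .fast for the lighter model
    request.usesLanguageCorrection = true      // NLP-based correction
    request.recognitionLanguages = ["en-US"]   // illustrative choice
    request.customWords = customWords          // ignored when language correction is off

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    return (request.results ?? []).compactMap { observation -> RecognizedLine? in
        // Take the single best candidate string for this line.
        guard let candidate = observation.topCandidates(1).first else { return nil }
        let text = candidate.string
        var boxes: [CGRect] = []
        // Ask Vision for a box around each word of the candidate string.
        text.enumerateSubstrings(in: text.startIndex..<text.endIndex, options: .byWords) { _, range, _, _ in
            if let box = try? candidate.boundingBox(for: range) {
                boxes.append(box.boundingBox)
            }
        }
        return RecognizedLine(text: text, confidence: candidate.confidence, wordBoxes: boxes)
    }
}

/// Keeps only lines at or above a confidence threshold (threshold is an assumption).
func confidentLines(_ lines: [RecognizedLine], minimumConfidence: Float = 0.5) -> [RecognizedLine] {
    lines.filter { $0.confidence >= minimumConfidence }
}
```

A caller would typically run `recognizeLines` off the main thread, since `perform` is synchronous, and then pass the result through `confidentLines` before parsing receipt fields.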
Document Scanning (VisionKit)
- VNDocumentCameraViewController: built-in document scanning UI since iOS 13
- Automatic edge detection and perspective correction
- Page-by-page scanning with delegate callbacks
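A rough sketch of how the scanner is usually wired up is shown below. `ScanCoordinator` and its `onFinish` callback are hypothetical names for illustration; the delegate methods and `imageOfPage(at:)` are the VisionKit calls. Note that the scanned pages arrive together in the finish callback, where they can be read one page at a time.

```swift
import UIKit
import VisionKit

/// Hypothetical helper that presents the system scanner and collects its pages.
final class ScanCoordinator: NSObject, VNDocumentCameraViewControllerDelegate {
    /// Called with the perspective-corrected page images once the user saves the scan.
    var onFinish: (([UIImage]) -> Void)?

    func present(from presenter: UIViewController) {
        // The scanner is unavailable on some devices and in some environments.
        guard VNDocumentCameraViewController.isSupported else { return }
        let scanner = VNDocumentCameraViewController()
        scanner.delegate = self
        presenter.present(scanner, animated: true)
    }

    func documentCameraViewController(_ controller: VNDocumentCameraViewController,
                                      didFinishWith scan: VNDocumentCameraScan) {
        // Read every scanned page in order.
        let pages = (0..<scan.pageCount).map { scan.imageOfPage(at: $0) }
        controller.dismiss(animated: true) { [weak self] in
            self?.onFinish?(pages)
        }
    }

    func documentCameraViewControllerDidCancel(_ controller: VNDocumentCameraViewController) {
        controller.dismiss(animated: true)
    }

    func documentCameraViewController(_ controller: VNDocumentCameraViewController,
                                      didFailWithError error: Error) {
        // A real app would surface the error; this sketch only dismisses.
        controller.dismiss(animated: true)
    }
}
```

The app also needs an NSCameraUsageDescription entry in its Info.plist, and the coordinator must be kept alive (for example as a property of the presenting view controller) because the scanner holds its delegate weakly.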
Document Structure (RecognizeDocumentsRequest — WWDC25)
See recognizedocumentsrequest for full details.
- Tables, lists, paragraphs, barcodes
- Data Detection: email, phone, URLs, dates, currency, addresses
- 26 languages, all on-device
Other
- 31+ existing APIs: face/body/hand detection, pose tracking, trajectory analysis
- DetectLensSmudgeRequest (WWDC25): image quality rejection
- Rectangle detection for custom camera experiences
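For the rectangle detection case, a custom camera flow could look roughly like the sketch below. It uses VNDetectRectanglesRequest; the function names `detectDocumentRectangle` and `pixelRect(forNormalized:imageWidth:imageHeight:)` are hypothetical, and the confidence and aspect-ratio thresholds are illustrative assumptions rather than recommended values.

```swift
import CoreGraphics
import Vision

/// Returns the most prominent rectangle in `image`, if Vision finds one.
func detectDocumentRectangle(in image: CGImage) throws -> VNRectangleObservation? {
    let request = VNDetectRectanglesRequest()
    request.maximumObservations = 1    // only the best match is needed
    request.minimumConfidence = 0.8    // illustrative threshold
    request.minimumAspectRatio = 0.3   // illustrative; allows long, narrow receipts

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])
    return request.results?.first
}

/// Converts a normalized Vision box (origin at lower-left) into pixel
/// coordinates with a top-left origin, as used by UIKit drawing.
func pixelRect(forNormalized box: CGRect, imageWidth: Int, imageHeight: Int) -> CGRect {
    let width = CGFloat(imageWidth)
    let height = CGFloat(imageHeight)
    return CGRect(x: box.minX * width,
                  y: (1 - box.maxY) * height,   // flip the vertical axis
                  width: box.width * width,
                  height: box.height * height)
}
```

The coordinate flip matters because Vision reports normalized boxes with the origin at the bottom-left, while overlay views are usually drawn from the top-left.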
Trade-offs vs ML Kit
- Vision is 6x slower (~0.31s vs ~0.05s per image) but has richer API features
- Slightly better on rotated text (>20°)
- Apple platforms only (ML Kit also runs on Android)
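- Supports Chinese (ML Kit's original on-device recognizer was Latin-only; its Text Recognition v2 API adds Chinese, Devanagari, Japanese, and Korean)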
Platform
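iOS 11+ (basic Vision requests), iOS 13+ (text recognition and document scanning), iOS 26+ (RecognizeDocumentsRequest; the release announced at WWDC25 is numbered iOS 26, not iOS 19)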
Related Pages
- ios-receipt-scanning — Comprehensive receipt scanning guide
- recognizedocumentsrequest — WWDC25 structured document API
- google-ml-kit — Competing on-device OCR framework