Apple Vision Framework

Apple’s native on-device machine learning framework for computer vision tasks. Available on iOS, macOS, iPadOS, tvOS, and visionOS. All processing runs entirely on-device for privacy and performance.

Why It Matters for Project Aries

Vision provides the primary on-device OCR and document scanning capability for any iOS app. It is free, deeply integrated with the OS, and has expanded with each WWDC. WWDC25 added structured document reading that directly supports receipt scanning.

Key Capabilities

Text Recognition (VNRecognizeTextRequest)

Fast mode: Character-detection + small ML model — similar to traditional OCR
Accurate mode: Neural network finds strings/lines, then words — human-like reading
Multi-language support with language correction via NLP
Custom word lists for domain-specific jargon
Confidence scores per recognized string (0-1)
Word-level bounding boxes
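
The options listed above correspond to a handful of properties on the request. The following is a minimal sketch, assuming a CGImage input and English text. RecognizedLine, keepConfident, and recognizeLines are hypothetical names introduced for illustration, and the language list is an assumption rather than a recommendation.

```swift
import Foundation
#if canImport(Vision)
import Vision
import CoreGraphics
#endif

/// Hypothetical container for one recognized string (not a Vision type).
struct RecognizedLine {
    let text: String
    let confidence: Float    // 0...1, as reported by Vision
    let boundingBox: CGRect  // normalized coordinates, origin at bottom-left
}

/// Drops lines whose confidence falls below `minimum`.
func keepConfident(_ lines: [RecognizedLine], minimum: Float) -> [RecognizedLine] {
    lines.filter { $0.confidence >= minimum }
}

#if canImport(Vision)
/// Runs accurate-mode OCR with language correction and a custom word list.
@available(iOS 13.0, macOS 10.15, tvOS 13.0, *)
func recognizeLines(in image: CGImage, customWords: [String] = []) throws -> [RecognizedLine] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate      // switch to .fast for the lighter pipeline
    request.usesLanguageCorrection = true     // NLP-based correction
    request.recognitionLanguages = ["en-US"]  // assumption: English documents
    request.customWords = customWords         // domain-specific jargon

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    // Keep the single best candidate for each detected string.
    return (request.results ?? []).compactMap { observation in
        guard let best = observation.topCandidates(1).first else { return nil }
        return RecognizedLine(text: best.string,
                              confidence: best.confidence,
                              boundingBox: observation.boundingBox)
    }
}
#endif
```

A caller would typically pass the result through keepConfident with a threshold tuned on real samples. For boxes tighter than a whole string, VNRecognizedText also offers boundingBox(for:), which takes a range of the recognized string.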

Document Scanning (VisionKit)

VNDocumentCameraViewController: built-in document scanning UI since iOS 13
Automatic edge detection and perspective correction
Page-by-page scanning with delegate callbacks
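
The scanner flow can be sketched as a small coordinator object that presents the system UI and collects the pages from the delegate callbacks. ScannerCoordinator and onFinish are hypothetical names, and the sketch assumes a UIKit app that already has a presenting view controller.

```swift
#if os(iOS)
import UIKit
import VisionKit

/// Hypothetical coordinator that presents the system scanner and collects pages.
@MainActor
final class ScannerCoordinator: NSObject, VNDocumentCameraViewControllerDelegate {
    /// Called with one perspective-corrected image per scanned page.
    var onFinish: ([UIImage]) -> Void = { _ in }

    func presentScanner(from presenter: UIViewController) {
        // The scanner is unavailable on some devices and in some environments.
        guard VNDocumentCameraViewController.isSupported else { return }
        let scanner = VNDocumentCameraViewController()
        scanner.delegate = self
        presenter.present(scanner, animated: true)
    }

    func documentCameraViewController(_ controller: VNDocumentCameraViewController,
                                      didFinishWith scan: VNDocumentCameraScan) {
        // Pages are retrieved one at a time by index.
        let pages = (0..<scan.pageCount).map { scan.imageOfPage(at: $0) }
        controller.dismiss(animated: true)
        onFinish(pages)
    }

    func documentCameraViewControllerDidCancel(_ controller: VNDocumentCameraViewController) {
        controller.dismiss(animated: true)
    }

    func documentCameraViewController(_ controller: VNDocumentCameraViewController,
                                      didFailWithError error: Error) {
        controller.dismiss(animated: true)
    }
}
#endif
```

Because the scanner uses the camera, the app also needs a camera usage description in its Info.plist. Each returned page can then be handed to a text recognition request.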

Document Structure (RecognizeDocumentsRequest — WWDC25)

See recognizedocumentsrequest for full details.

Tables, lists, paragraphs, barcodes
Data Detection: email, phone, URLs, dates, currency, addresses
26 languages, all on-device

Other

31+ existing APIs: face/body/hand detection, pose tracking, trajectory analysis
DetectLensSmudgeRequest (WWDC25): image quality rejection
Rectangle detection for custom camera experiences
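
For a custom camera experience, rectangle detection returns corner points in Vision's normalized coordinate space, which has to be mapped to pixels before drawing an overlay. A minimal sketch follows. pixelPoint and detectDocumentCorners are hypothetical helpers, and the confidence and aspect-ratio values are illustrative assumptions.

```swift
import Foundation
#if canImport(Vision)
import Vision
import CoreGraphics
#endif

/// Maps a point from Vision's normalized space (0...1, origin bottom-left)
/// to pixel coordinates with a top-left origin.
func pixelPoint(fromNormalized point: CGPoint, imageWidth: CGFloat, imageHeight: CGFloat) -> CGPoint {
    CGPoint(x: point.x * imageWidth, y: (1 - point.y) * imageHeight)
}

#if canImport(Vision)
/// Returns the four corners (top-left, top-right, bottom-right, bottom-left)
/// of the most prominent rectangle, in pixels, or nil if none was found.
@available(iOS 11.0, macOS 10.13, tvOS 11.0, *)
func detectDocumentCorners(in image: CGImage) throws -> [CGPoint]? {
    let request = VNDetectRectanglesRequest()
    request.maximumObservations = 1
    request.minimumConfidence = 0.8   // assumption: illustrative threshold
    request.minimumAspectRatio = 0.2  // assumption: allow tall, narrow receipts

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    guard let rectangle = request.results?.first else { return nil }
    let width = CGFloat(image.width)
    let height = CGFloat(image.height)
    return [rectangle.topLeft, rectangle.topRight, rectangle.bottomRight, rectangle.bottomLeft]
        .map { pixelPoint(fromNormalized: $0, imageWidth: width, imageHeight: height) }
}
#endif
```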

Trade-offs vs ML Kit

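Vision is roughly 6x slower per image (~0.31 s vs ~0.05 s) but has a richer API
Vision is slightly better on rotated text (>20°)
Apple platforms only (ML Kit runs on both iOS and Android)
Supports Chinese out of the box (ML Kit's default text recognizer is Latin-only; Chinese and other scripts require its separate script-specific models)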
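
The speed ratio follows directly from the quoted per-image latencies: 0.31 / 0.05 ≈ 6.2. To check the figure on your own images and devices, a small timing helper is enough. This is a sketch only: averageLatency and slowdown are hypothetical helpers, and the closure passed in would wrap whichever OCR call is being measured.

```swift
import Foundation

/// Average wall-clock seconds per call of `work` over `runs` iterations.
func averageLatency(runs: Int, _ work: () throws -> Void) rethrows -> Double {
    precondition(runs > 0, "runs must be positive")
    let start = Date()
    for _ in 0..<runs {
        try work()
    }
    return Date().timeIntervalSince(start) / Double(runs)
}

/// How many times slower `candidate` is than `baseline`.
func slowdown(candidate: Double, baseline: Double) -> Double {
    candidate / baseline
}

// Latencies quoted in the list above: ~0.31 s (Vision) vs ~0.05 s (ML Kit).
let quotedRatio = slowdown(candidate: 0.31, baseline: 0.05)
print(String(format: "Vision is about %.1fx slower", quotedRatio))
```

Latency varies with device, image size, and recognition level, so measuring a representative batch is more informative than a single image.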

Platform

iOS 11+ (core framework), iOS 13+ (text recognition and document scanning), iOS 26+ (RecognizeDocumentsRequest; the WWDC25 release is numbered iOS 26, not iOS 19)
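
In an app that supports several OS versions, these tiers are usually selected at runtime with availability checks. A minimal sketch, assuming the WWDC25 releases are numbered iOS 26 and macOS 26. VisionTier and availableTier are hypothetical names.

```swift
import Foundation

/// Hypothetical tiers matching the availability levels above.
enum VisionTier: String {
    case structuredDocuments  // RecognizeDocumentsRequest
    case textRecognition      // VNRecognizeTextRequest and the VisionKit scanner
    case coreOnly             // earlier Vision requests only
}

/// Picks the richest tier the running OS supports.
/// Note: the `*` wildcard means platforms not listed here take the first branch.
func availableTier() -> VisionTier {
    if #available(iOS 26.0, macOS 26.0, *) {
        return .structuredDocuments
    }
    if #available(iOS 13.0, macOS 10.15, *) {
        return .textRecognition
    }
    return .coreOnly
}

print(availableTier().rawValue)
```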

Related Pages

ios-receipt-scanning: Comprehensive receipt scanning guide
recognizedocumentsrequest: WWDC25 structured document API
google-ml-kit: Competing on-device OCR framework

type	entity
tags	ios, mobile, ocr, computer-vision, text-recognition
confidence	high
