How Business Card OCR Works — From Card Photo to Contact Fields
Business card OCR does two things that sound simple but take real engineering: reading every character on a card, then working out which characters are a name, which are an email address, and which are a phone number. Here's how both steps work and what affects accuracy.
Step 1: OCR — Reading the Characters
OCR converts an image of text into machine-readable characters. The process:
- Image preprocessing: The input photo is deskewed, contrast-adjusted, and denoised to make the text as clear as possible before recognition
- Text detection: Regions of the image containing text are identified and isolated
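- Character recognition: Characters are identified by trained recognition models. Older engines match each isolated character against learned shapes, while Tesseract 4 and later use an LSTM neural network that reads a whole line of text at a time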
- Word assembly: Characters are assembled into words, maintaining the spatial relationships from the card layout
Browser-based OCR tools typically use Tesseract, a well-regarded open-source OCR engine that supports more than 100 languages and a wide range of font styles, usually through a JavaScript/WebAssembly port such as Tesseract.js. The output is the raw text extracted from the card: every character the engine detected, arranged roughly as it appeared on the card.
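As a rough illustration of the word assembly step, the sketch below groups recognized words into lines using their positions on the card. It is not how Tesseract works internally. The `WordBox` type, the `assemble_lines` function, and the `tolerance` parameter are hypothetical names chosen for this example, and the coordinates are made up.

```python
from dataclasses import dataclass


@dataclass
class WordBox:
    """One recognized word and its bounding box (hypothetical structure)."""
    text: str
    x: float       # left edge, in pixels
    y: float       # top edge, in pixels
    height: float  # box height, in pixels


def assemble_lines(words, tolerance=0.5):
    """Group word boxes into text lines, top to bottom, left to right.

    Two words share a line when their vertical centers differ by at most
    `tolerance` times the taller box's height (an assumed heuristic).
    """
    lines = []
    for word in sorted(words, key=lambda w: w.y + w.height / 2):
        center = word.y + word.height / 2
        if lines:
            ref = lines[-1][-1]
            ref_center = ref.y + ref.height / 2
            if abs(center - ref_center) <= tolerance * max(word.height, ref.height):
                lines[-1].append(word)
                continue
        lines.append([word])
    # Within each line, restore reading order by horizontal position.
    return [" ".join(w.text for w in sorted(line, key=lambda w: w.x))
            for line in lines]


# Made-up boxes in the scrambled order a detector might emit them.
words = [
    WordBox("Doe", 120, 21, 30),
    WordBox("Jane", 20, 20, 30),
    WordBox("jane@example.com", 20, 80, 16),
    WordBox("Engineer", 110, 56, 18),
    WordBox("Senior", 20, 55, 18),
]
print(assemble_lines(words))
```

Keeping the lines in card order matters for the next step, because position is one of the few clues that separates a name from a title or company.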
Step 2: Field Classification — Name vs. Email vs. Phone
After OCR produces raw text, the second step categorizes that text into contact fields. This uses a combination of pattern matching and heuristics:
- Email addresses: Easy. Regex pattern matching for the name@example.com format
- Phone numbers: Pattern matching for sequences of digits with standard phone number formatting (+1, dashes, parentheses)
- URLs/websites: Pattern matching for http://, www., or common TLDs
- Name: Harder. The name is typically the largest text on the card, so it is identified by font size, by position, or by elimination: it is the prominent line that does not match the phone, email, or company patterns
- Company name: Often the second-largest text or placed in a prominent position; may be identified by organizational suffixes (Inc., LLC, Ltd., Corp.)
- Title/Position: Short phrases associated with professional roles; positioned near the name on most cards
- Address: Multi-line text matching street number, street name, city, state, and ZIP code patterns
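The rules above can be sketched as a small line classifier. This is a minimal illustration under stated assumptions, not the scanner's actual implementation: the regular expressions are deliberately simple, the `TITLE_WORDS` list and the `classify_lines` function are hypothetical, and because plain text carries no font size, the name is picked purely by elimination.

```python
import re

# Illustrative patterns only; a production classifier needs broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"(?:https?://|www\.)\S+|\b[\w-]+\.(?:com|net|org|io)\b",
                    re.IGNORECASE)
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")
COMPANY_RE = re.compile(r"\b(?:Inc|LLC|Ltd|Corp)\b", re.IGNORECASE)
# Crude address check: leading street number, or a 5-digit ZIP code.
ADDRESS_RE = re.compile(r"^\d+\s+[A-Za-z]|\b\d{5}\b")
# Hypothetical keyword list for job titles.
TITLE_WORDS = {"engineer", "manager", "director", "founder", "designer",
               "consultant"}


def classify_lines(lines):
    """Assign each OCR text line to one contact field, most specific rule first."""
    fields = {key: [] for key in ("name", "title", "company", "email",
                                  "phone", "url", "address", "other")}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        email = EMAIL_RE.search(line)
        url = URL_RE.search(line)
        phone = PHONE_RE.search(line)
        words = {w.strip(".,").lower() for w in line.split()}
        if email:
            fields["email"].append(email.group())
        elif url:
            fields["url"].append(url.group())
        elif phone:
            fields["phone"].append(phone.group())
        elif COMPANY_RE.search(line):
            fields["company"].append(line)
        elif ADDRESS_RE.search(line):
            fields["address"].append(line)
        elif words & TITLE_WORDS:
            fields["title"].append(line)
        elif (not fields["name"]
              and not any(c.isdigit() for c in line)
              and 2 <= len(line.split()) <= 4):
            # Elimination: first short, digit-free line that matched nothing else.
            fields["name"].append(line)
        else:
            fields["other"].append(line)
    return fields


card_lines = [
    "Jane Doe",
    "Senior Engineer",
    "Acme Widgets Inc.",
    "+1 (555) 010-1234",
    "jane@example.com",
    "www.acme.example",
    "123 Main Street, Springfield, IL 62704",
]
for field, values in classify_lines(card_lines).items():
    print(field, values)
```

The order of the checks is the design choice that matters here: the strict patterns (email, URL, phone) run first so that the looser heuristics for company, title, and name only see lines nothing else has claimed.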
Why Accuracy Varies Between Cards
Several factors affect how well OCR works on a specific card:
- Font choice: Standard serif and sans-serif fonts are recognized reliably. Script, decorative, and custom fonts may be misread
- Color contrast: Dark text on light background works best. Light text on dark backgrounds, colored text on colored backgrounds, and metallic foil text are harder
- Embossed or textured printing: Raised text or textured paper can confuse edge detection
- Very small text size: Sub-8pt text is harder to resolve clearly even in high-resolution photos
- Photo quality: Blurry, angled, or poorly lit photos give the OCR engine less information to work with
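To see why small text and photo resolution interact, it can help to estimate how many pixels tall a line of text ends up in the photo. The sketch below is back-of-the-envelope arithmetic only. It assumes a 3.5-inch-wide card (the common US size; other regions differ), uses the fact that one inch is 72 points, and treats point size as the nominal text height. The function name `text_height_px` is made up for this example.

```python
POINTS_PER_INCH = 72  # typographic points per inch


def text_height_px(font_pt, card_width_px, card_width_in=3.5):
    """Approximate nominal text height in pixels for a card photo.

    card_width_px is how many pixels the card spans in the photo;
    card_width_in defaults to an assumed 3.5-inch card width.
    """
    pixels_per_inch = card_width_px / card_width_in
    return font_pt / POINTS_PER_INCH * pixels_per_inch


# 8pt text on a card that fills 1400 px of the frame versus only 700 px.
print(round(text_height_px(8, 1400), 1))
print(round(text_height_px(8, 700), 1))
```

Halving the number of pixels the card occupies halves the height of every glyph, which is why filling the frame with the card tends to help more than any other change when a card has fine print.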
The raw OCR text panel in the scanner shows exactly what the engine read — if a field is missing or wrong, the raw text helps you identify what was captured and fix it manually.
Browser-Based OCR vs. Cloud OCR
Cloud OCR services (like Google Cloud Vision or AWS Textract) have access to large training datasets and can apply more sophisticated models than what runs in a browser. This generally gives them an accuracy edge on difficult cards — unusual fonts, complex layouts, rare languages.
Browser-based OCR (Tesseract in your browser) performs comparably to cloud services for standard business cards with common fonts and clear photos. The tradeoff is privacy: browser-based processing keeps your data local; cloud services upload the image.
For most business cards you'll encounter at professional events, browser-based OCR produces accurate enough results that manual correction is minimal.
Try It Free — No Signup Required
Runs 100% in your browser. No data is collected, stored, or sent anywhere.
Open Free Business Card Scanner
Frequently Asked Questions
Can OCR read handwritten business cards?
Standard OCR is trained on printed fonts and has low accuracy on handwriting. On cards with printed text and handwritten additions (like a mobile number written in pen), the handwritten portion may be missed or misread. For mostly handwritten cards, use the Handwriting to Text OCR tool.
Why does OCR sometimes mix up the name and company fields?
If the card uses similar font sizes for the name and company, or if the layout is unusual, the classifier may not correctly distinguish them. The raw OCR text always shows both — copy the correct text manually into the right field.

