Browser-Based Passport MRZ Reader with Tesseract.js

Why?

This project was born out of a real-world need: VAT refund processing for tourists shopping in Dubai. Visitors who make purchases in the UAE are eligible for a VAT refund, but claiming it requires passport information at the point of sale. Rather than manually typing passport details (slow, error-prone, and frustrating for both staff and customers), we needed a fast, accurate way to capture passport data directly from the document.

We needed a practical solution that speeds up retail transactions while ensuring accurate tourist identification for VAT compliance.

How?

The solution lies in the Machine Readable Zone (MRZ): those two lines of seemingly cryptic text at the bottom of your passport's data page. In this post, we'll walk through a fully client-side MRZ reader built with nothing but JavaScript and WebAssembly.

What is MRZ?

The Machine Readable Zone is a standardized format defined by ICAO (International Civil Aviation Organization). For passports (TD3 format), it consists of two lines of 44 characters each, totaling 88 characters. This compact format encodes:

Document type and issuing country
Holder's name (surname and given names)
Passport number with check digit
Nationality
Date of birth with check digit
Gender
Expiration date with check digit
Personal number (optional)

Here's what a typical MRZ looks like:

P<UTOERIKSSON<<ANNA<MARIA<<<<<<<<<<<<<<<<<<<
L898902C36UTO7408122F1204159ZE184226B<<<<<10
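
Because every field sits at a fixed position, the second line can be cut apart by index alone. The following is a minimal sketch of that idea, not the project's parser: the offsets follow the TD3 layout described above, and the names `TD3_LINE2_FIELDS` and `sliceLine2` are hypothetical.

```javascript
// Fixed TD3 field positions on line 2 (0-based, end-exclusive).
// The table and function names are illustrative, not from the project.
const TD3_LINE2_FIELDS = {
  passportNumber: [0, 9],
  passportNumberCheck: [9, 10],
  nationality: [10, 13],
  dateOfBirth: [13, 19],
  dateOfBirthCheck: [19, 20],
  sex: [20, 21],
  expirationDate: [21, 27],
  expirationDateCheck: [27, 28],
  personalNumber: [28, 42],
  personalNumberCheck: [42, 43],
  compositeCheck: [43, 44],
};

// Cut a 44-character second line into its raw fields (fillers are kept).
function sliceLine2(line2) {
  if (line2.length !== 44) {
    throw new Error(`Expected 44 characters, got ${line2.length}`);
  }
  const fields = {};
  for (const [name, [start, end]] of Object.entries(TD3_LINE2_FIELDS)) {
    fields[name] = line2.slice(start, end);
  }
  return fields;
}

const sample = sliceLine2('L898902C36UTO7408122F1204159ZE184226B<<<<<10');
console.log(sample.passportNumber); // L898902C3
```

Index-based slicing only works once the OCR output is exactly 44 characters long, which is why detection and cleanup come first in practice.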

The Challenge

Traditional MRZ readers require specialized hardware: infrared scanners or dedicated OCR devices. We wanted something different: a solution that works with any smartphone or laptop camera, runs entirely in the browser, and keeps sensitive passport data private by never sending it to a server.

Our Approach

The entire solution is built with vanilla JavaScript: no frameworks, no build process, no bundler. It consists of plain HTML, CSS, and JavaScript files that you can serve directly, which keeps the project simple, easy to understand, and free of tooling complexity.

1. Tesseract.js with Custom Training

The backbone of the solution is Tesseract.js, a JavaScript library that runs the Tesseract OCR engine compiled to WebAssembly. Out of the box, Tesseract's models are trained for general text recognition. MRZ, however, uses a specific font (OCR-B) and a limited character set (A-Z, 0-9, and <).

We trained a custom Tesseract model specifically for MRZ recognition. This dramatically improved accuracy compared to the default English model, especially for commonly confused characters like:

`0` (zero) vs `O` (letter O)
`1` (one) vs `I` (letter I)
`<` (filler) vs `K` or `X`
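
Even with a custom model, an occasional letter can still land in a field that may only contain digits (dates and check digits). One complementary safeguard is a post-OCR substitution pass on those fields. The sketch below is an assumption for illustration and not part of the project: the mapping table and the name `fixDigitField` are hypothetical, and the pass must never be applied to names or the passport number, where letters are legitimate.

```javascript
// Letters that OCR commonly produces in place of digits.
// This mapping is an illustrative assumption, not taken from the project.
const LETTER_TO_DIGIT = { O: '0', Q: '0', I: '1', Z: '2', S: '5', B: '8' };

// Replace look-alike letters in a field that should contain only digits.
// Letters without a known digit look-alike are left unchanged.
function fixDigitField(field) {
  return field.replace(/[A-Z]/g, (ch) => LETTER_TO_DIGIT[ch] || ch);
}

console.log(fixDigitField('74O8I2')); // 740812
```

A substitution like this can hide a genuine misread, so it is best paired with check digit verification rather than trusted on its own.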

2. Camera Integration

Using the MediaDevices API, we access the device camera with a preference for the rear-facing camera (better for document scanning):

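// Prefer the rear-facing camera and require at least 888x500 pixels.
const constraints = {
  video: {
    facingMode: 'environment',
    width: { min: 888 },
    height: { min: 500 }
  }
};

// `video` is the page's <video> element.
navigator.mediaDevices.getUserMedia(constraints)
  .then(stream => {
    video.srcObject = stream;
  })
  .catch(err => {
    // Permission denied, no camera found, or the constraints cannot be met.
    console.error('Could not access the camera:', err);
  });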
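
Once the stream is playing, a single frame has to be copied onto a canvas before it can be handed to the OCR engine. Here is a minimal sketch of that step, assuming a standard `<video>` and `<canvas>` pair; the function name `captureFrame` is hypothetical and the project's own capture code may differ.

```javascript
// Copy the current video frame onto a canvas at the stream's native resolution.
// `video` is expected to be an HTMLVideoElement, `canvas` an HTMLCanvasElement.
function captureFrame(video, canvas) {
  // videoWidth and videoHeight are 0 until the stream has delivered metadata.
  if (!video.videoWidth || !video.videoHeight) {
    throw new Error('Video stream is not ready yet');
  }
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const ctx = canvas.getContext('2d');
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
  return canvas;
}
```

Sizing the canvas from `videoWidth` and `videoHeight`, rather than from the element's CSS size, keeps the full camera resolution available for recognition.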

3. MRZ Detection and Validation

Not every captured image contains a valid MRZ. We use regex pattern matching to detect passport MRZ signatures:

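function isPassportMRZ(text) {
  // "P<", a three-character issuing state code, then the start of the name field.
  const passportMRZPattern = /P<[A-Z<]{3}[A-Z0-9<]+/;
  return passportMRZPattern.test(text);
}

The `P<` prefix indicates a passport, followed by the three-character issuing state code. The code is usually three letters, but shorter codes are padded with `<` (for example, `D<<` for Germany), so the pattern accepts fillers there. Note that `\w` would also have accepted lowercase letters, digits, and underscores, none of which are valid in that position.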
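
OCR output usually contains more than the MRZ: stray text from the data page, blank lines, and spaces inside the lines. A stricter step after detection is to isolate exactly two consecutive 44-character lines. The sketch below shows one way to do that; the function name `extractMrzLines` and the whitespace cleanup are assumptions, not the project's code.

```javascript
// Pull the two TD3 lines out of raw OCR text, or return null if none is found.
function extractMrzLines(ocrText) {
  const lines = ocrText
    .split(/\r?\n/)
    .map((line) => line.replace(/\s+/g, '')) // drop spaces the OCR inserted
    .filter((line) => line.length > 0);

  for (let i = 0; i < lines.length - 1; i++) {
    const first = lines[i];
    const second = lines[i + 1];
    // Line 1: "P", a subtype letter or filler, issuing state, 39-character name field.
    const firstOk = /^P[A-Z<][A-Z<]{3}[A-Z<]{39}$/.test(first);
    // Line 2: any 44 characters from the MRZ alphabet.
    const secondOk = /^[A-Z0-9<]{44}$/.test(second);
    if (firstOk && secondOk) {
      return [first, second];
    }
  }
  return null;
}
```

Anchoring both patterns to the full line length rejects partial reads early, so the parser only ever sees candidates of the right shape.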

4. Parsing the MRZ

Once detected, we parse the 88-character string into structured data. The first line contains identity information:

const firstLineMatch = firstLine.match(/P<([A-Z<]{3})([A-Z<]*?)<<([A-Z<]*)/);
// Group 1: Issuing state (3 characters; nationality is on the second line)
// Group 2: Surname (matched lazily so it stops at the first <<)
// Group 3: Given names (< separates names and pads the rest of the line)

The second line contains document details:

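const secondLineMatch = secondLine.match(
  /([A-Z0-9<]{9})([0-9])([A-Z<]{3})([0-9]{6})([0-9])([MF<])([0-9]{6})([0-9])([A-Z0-9<]*)/
);
// Groups 1-2: passport number + check digit; 3: nationality; 4-5: date of birth (YYMMDD) + check digit
// Group 6: gender (M, F, or < for unspecified); 7-8: expiration date (YYMMDD) + check digit
// Group 9: personal number plus the two trailing check digits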
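
To turn raw matches into readable output, the `<` fillers need to become spaces and the trailing check digits need to be separated from the personal number. The following sketch combines both lines under those assumptions. The names `cleanField`, `GENDER_LABELS`, and `parseTd3` are hypothetical, and the stricter, fully anchored patterns are a variation on the expressions above rather than the project's exact code.

```javascript
// Replace MRZ fillers with spaces, collapse runs, and trim the padding.
function cleanField(value) {
  return value.replace(/</g, ' ').replace(/\s+/g, ' ').trim();
}

const GENDER_LABELS = { M: 'Male', F: 'Female', '<': 'Unspecified' };

// Parse two 44-character TD3 lines into an object, or return null on mismatch.
function parseTd3(firstLine, secondLine) {
  const first = firstLine.match(/^P[A-Z<]([A-Z<]{3})([A-Z<]*?)<<([A-Z<]*)$/);
  const second = secondLine.match(
    /^([A-Z0-9<]{9})([0-9])([A-Z<]{3})([0-9]{6})([0-9])([MF<])([0-9]{6})([0-9])([A-Z0-9<]{14})([0-9<])([0-9])$/
  );
  if (!first || !second) return null;

  return {
    'Issuing Country': cleanField(first[1]),
    'Surname': cleanField(first[2]),
    'Given Names': cleanField(first[3]),
    'Passport Number': cleanField(second[1]),
    'Nationality': cleanField(second[3]),
    'Date of Birth': second[4],
    'Gender': GENDER_LABELS[second[6]],
    'Expiration Date': second[7],
    'Personal Number': cleanField(second[9]),
  };
}

const parsed = parseTd3(
  'P<UTOERIKSSON<<ANNA<MARIA'.padEnd(44, '<'),
  'L898902C36UTO7408122F1204159ZE184226B<<<<<10'
);
console.log(JSON.stringify(parsed, null, 2));
```

Dates are left in their raw YYMMDD form here, because expanding the two-digit year requires a century rule that the MRZ itself does not encode.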

5. Visual Feedback

To help users understand what the OCR engine is seeing, we draw bounding boxes around recognized words:

function drawBoundingBoxes(words) {
  context.strokeStyle = 'red';
  context.lineWidth = 2;

  words.forEach((word) => {
    const { bbox } = word;
    context.strokeRect(bbox.x0, bbox.y0, bbox.x1 - bbox.x0, bbox.y1 - bbox.y0);
  });
}

Privacy by Design

A critical design decision was keeping everything client-side. The passport image and extracted data never leave the user's browser. The entire OCR process runs locally via WebAssembly:

No server uploads
No API calls with personal data
No data retention

This makes the solution suitable for privacy-sensitive applications where transmitting passport data to external servers is not acceptable.

Technical Architecture

┌─────────────────────────────────────────────────────────┐
│                      Browser                            │
├─────────────────────────────────────────────────────────┤
│  ┌─────────┐    ┌─────────┐    ┌──────────────────┐    │
│  │ Camera  │───▶│ Canvas  │───▶│ Tesseract.js     │    │
│  │  API    │    │ Capture │    │ (WASM Engine)    │    │
│  └─────────┘    └─────────┘    └────────┬─────────┘    │
│                                         │              │
│                                         ▼              │
│                               ┌──────────────────┐     │
│                               │  MRZ Parser      │     │
│                               │  & Validator     │     │
│                               └────────┬─────────┘     │
│                                         │              │
│                                         ▼              │
│                               ┌──────────────────┐     │
│                               │  Structured      │     │
│                               │  JSON Output     │     │
│                               └──────────────────┘     │
└─────────────────────────────────────────────────────────┘

Results

The custom model achieves high accuracy on well-lit, properly positioned passport images. The entire recognition process takes 1-3 seconds depending on device capabilities, with SIMD-enabled browsers seeing the fastest results.

Sample output:

{
  "Nationality": "UTO",
  "Surname": "ERIKSSON",
  "Given Names": "ANNA MARIA",
  "Passport Number": "L898902C3",
  "Issuing Country": "UTO",
  "Date of Birth": "740812",
  "Gender": "Female",
  "Expiration Date": "120415",
  "Personal Number": "ZE184226B"
}

Lessons Learned

1. Custom training matters - Generic OCR models struggle with MRZ's specific character set and font.

2. WebAssembly is production-ready - Running Tesseract in the browser via WASM provides near-native performance.

3. Camera quality varies wildly - Desktop webcams, phone cameras, and tablets all produce different results. Good lighting is crucial.

4. The MRZ spec is well-designed - Check digits, fixed positions, and limited character sets make parsing reliable once OCR accuracy is high enough.

What's Next

We're exploring several enhancements:

Real-time detection - Automatically capture when a valid MRZ is detected
Image preprocessing - Contrast enhancement and skew correction before OCR
TD1/TD2 support - Expand beyond passports to ID cards
Check digit validation - Verify OCR accuracy using built-in checksums
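
Check digit validation follows a public algorithm from the ICAO 9303 standard: each character is mapped to a number (digits as themselves, A-Z as 10-35, `<` as 0), multiplied by the repeating weights 7, 3, 1, and the sum is taken modulo 10. Here is a minimal sketch of how that could look; the function names are hypothetical and this is not yet part of the project.

```javascript
// Map one MRZ character to its numeric value.
function charValue(ch) {
  if (ch === '<') return 0;
  if (ch >= '0' && ch <= '9') return ch.charCodeAt(0) - 48;
  if (ch >= 'A' && ch <= 'Z') return ch.charCodeAt(0) - 55; // A=10 ... Z=35
  throw new Error(`Invalid MRZ character: ${ch}`);
}

// Compute the check digit of a field using the repeating weights 7, 3, 1.
function checkDigit(field) {
  const weights = [7, 3, 1];
  let sum = 0;
  for (let i = 0; i < field.length; i++) {
    sum += charValue(field[i]) * weights[i % 3];
  }
  return sum % 10;
}

// Verify every check digit on a 44-character TD3 second line.
function validateLine2(line2) {
  if (!/^[A-Z0-9<]{44}$/.test(line2)) return false;
  const digit = (i) => Number(line2[i]); // NaN for '<', which never matches
  const personal = line2.slice(28, 42);
  // An unused personal number may carry '<' instead of a check digit.
  const personalOk = line2[42] === '<'
    ? /^<+$/.test(personal)
    : checkDigit(personal) === digit(42);
  // The composite digit covers positions 1-10, 14-20, and 22-43 (1-based).
  const composite = line2.slice(0, 10) + line2.slice(13, 20) + line2.slice(21, 43);
  return (
    checkDigit(line2.slice(0, 9)) === digit(9) &&
    checkDigit(line2.slice(13, 19)) === digit(19) &&
    checkDigit(line2.slice(21, 27)) === digit(27) &&
    personalOk &&
    checkDigit(composite) === digit(43)
  );
}

console.log(checkDigit('L898902C3')); // 6
console.log(validateLine2('L898902C36UTO7408122F1204159ZE184226B<<<<<10')); // true
```

A failed check is a cheap signal that the OCR misread at least one character, so the frame can be discarded and a new one captured instead of returning wrong data.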

Try It Yourself

The entire project runs in any modern browser. Clone the repository, serve the files over HTTPS (or localhost), and point your camera at a passport. All processing happens on your device, so your data stays with you.

This project demonstrates how modern web technologies can deliver functionality that once required specialized hardware, all while respecting user privacy.