A privacy-first, in-browser OCR engine for the Mon language (mnw), powered by Rust, WebAssembly, and ONNX Runtime.
[!NOTE] The Mon language is classified as a "vulnerable" language in UNESCO's Atlas of the World’s Languages in Danger.
This project aims to digitize the Mon script, establishing a digital foundation suitable for future development, system integrations, and AI-driven preservation efforts.
MonOCR Web brings high-performance optical character recognition for the Mon script directly to the browser. Using ONNX Runtime Web and a custom Wasm backend, it performs all processing locally on the user's device. This architecture avoids network round trips during recognition, works offline, and keeps data private: no images ever leave the browser.
[!TIP] File size is limited to 50 MB on web and 20 MB on mobile. To process larger files or use more powerful hardware, use the CLI or the package directly via `uv add monocr` or `pip install monocr`.
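The size limits above could be enforced client-side before any decoding starts. The following is a minimal sketch, not the project's actual check: the `Platform` type, the `isWithinSizeLimit` helper, and the reading of "MB" as 10^6 bytes are all assumptions made for illustration.

```typescript
// Hypothetical helper: names and the decimal-megabyte interpretation are assumptions.
type Platform = "web" | "mobile";

const MB = 1_000_000; // assumption: 1 MB = 10^6 bytes

// Limits taken from the tip above: 50 MB on web, 20 MB on mobile.
const MAX_BYTES: Record<Platform, number> = {
  web: 50 * MB,
  mobile: 20 * MB,
};

/** Returns true when a file of `sizeBytes` may be processed on `platform`. */
function isWithinSizeLimit(sizeBytes: number, platform: Platform): boolean {
  return sizeBytes <= MAX_BYTES[platform];
}
```

In a browser, `sizeBytes` would typically come from `File.size` on the uploaded file.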
Image (Canvas/Blob)
LineSegmenter → horizontal projection profile → List<LineSegment>
ImagePreprocessor → grayscale + normalize [-1.0, 1.0]
MonOcrEngine → ONNX Runtime Web (monocr.onnx)
CtcDecoder → greedy CTC decode → String
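The non-model stages of the pipeline above can be sketched in a few lines. This is an illustrative approximation under stated assumptions, not the code in `src/lib/engine/`: the function names, the ink threshold of 128, the `x / 127.5 - 1` normalization formula, and the blank label at index 0 are all hypothetical choices.

```typescript
// Sketch only: names, the ink threshold, and the blank index are assumptions.

interface LineSegment {
  top: number;    // first row of the line (inclusive)
  bottom: number; // last row of the line (exclusive)
}

/** Split a grayscale page (row-major, 0 = black, 255 = white) into text lines
 *  by counting dark pixels per row (a horizontal projection profile). */
function segmentLines(
  gray: Uint8Array, width: number, height: number, inkThreshold = 128,
): LineSegment[] {
  const segments: LineSegment[] = [];
  let start = -1; // -1 means "currently between lines"
  for (let y = 0; y < height; y++) {
    let ink = 0;
    for (let x = 0; x < width; x++) {
      if (gray[y * width + x] < inkThreshold) ink++;
    }
    if (ink > 0 && start < 0) {
      start = y; // a run of inked rows begins
    } else if (ink === 0 && start >= 0) {
      segments.push({ top: start, bottom: y }); // the run ends at a blank row
      start = -1;
    }
  }
  if (start >= 0) segments.push({ top: start, bottom: height });
  return segments;
}

/** Map 8-bit grayscale values to floats in [-1.0, 1.0]. */
function normalize(gray: Uint8Array): Float32Array {
  const out = new Float32Array(gray.length);
  for (let i = 0; i < gray.length; i++) out[i] = gray[i] / 127.5 - 1.0;
  return out;
}

/** Greedy CTC decode: take the argmax per time step, collapse consecutive
 *  repeats, then drop blanks. `charset[k]` is the character for class k. */
function ctcGreedyDecode(logits: number[][], charset: string[], blank = 0): string {
  let prev = -1;
  let text = "";
  for (const step of logits) {
    let best = 0;
    for (let k = 1; k < step.length; k++) {
      if (step[k] > step[best]) best = k;
    }
    if (best !== blank && best !== prev) text += charset[best];
    prev = best;
  }
  return text;
}
```

Each `LineSegment` would be cropped, normalized, and passed to the model, and the per-step outputs decoded into a string. A blank between two identical labels is what lets CTC emit a doubled character.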
| Attribute | Specification |
|---|---|
| Architecture | MobileNetV3 + BiLSTM-384 + CTC |
| Precision | FP32 (ONNX) |
| Parameters | ~6.6M |
| Input | 128 × Variable (H × W) |
| Asset Size | ~25 MB |
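As a rough consistency check on the table above, the parameter count and FP32 precision together account for most of the asset size. The arithmetic below is only an estimate: it ignores ONNX graph metadata, and whether "~25 MB" means 10^6-byte or 2^20-byte megabytes is not stated.

```typescript
// Back-of-the-envelope estimate; variable names are illustrative.
const PARAMS = 6.6e6;      // ~6.6M parameters (from the table)
const BYTES_PER_FP32 = 4;  // one 32-bit float per parameter

const weightBytes = PARAMS * BYTES_PER_FP32;   // 26,400,000 bytes
const weightMiB = weightBytes / (1024 * 1024); // about 25.2 MiB

console.log(`${weightBytes} bytes ≈ ${weightMiB.toFixed(1)} MiB`);
```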
monocr-web/
├── src/
│ ├── lib/
│ │ ├── engine/ # OCR Pipeline (ONNX/Wasm)
│ │ ├── components/ # Svelte UI Components
│ │ └── utils/ # Image & PDF Processing
│ └── routes/ # Application Pages
├── static/
│ ├── wasm/ # ONNX Runtime Wasm Binaries
│ └── fonts/ # Mon/Myanmar Unicode Fonts
├── scripts/ # Build & Asset Management
└── playwright/ # E2E Testing Suite
MonOCR is a unified cross-platform ecosystem designed for parity and performance:
pnpm install
Copy the pre-built ONNX Runtime WASM files to the static directory:
pnpm run copy-wasm
pnpm dev
pnpm build
[!IMPORTANT] The build script automatically optimizes the `monocr.onnx` model deployment to comply with edge asset limits. In production, models are fetched from the Hugging Face CDN.
MIT