Available OCR Engines
MIT OCR Models (Recommended)
Overview
Manga-image-translator OCR models. Supports Japanese, English, and Chinese with automatic color extraction.
Available Variants
- MIT 32px OCR
- MIT 48px OCR (Better Accuracy)
- MIT 48px CTC OCR
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| chunk_size | selector | 16 | Batch size for processing (8, 16, 24, 32) |
| device | selector | auto | Processing device (cuda/cpu/mps) |
Features
- Multi-language: Japanese, English, Chinese
- Color Extraction: Automatically extracts foreground/background colors
- GPU Accelerated: Fast processing with CUDA support
- Line-based: Processes text line by line
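The interaction between line-based processing and chunk_size can be sketched as follows. This is a minimal illustration, not the engine's actual code: the helper name iter_chunks is hypothetical, and the integers stand in for cropped text-line images.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Values offered by the chunk_size selector in the table above.
ALLOWED_CHUNK_SIZES = (8, 16, 24, 32)


def iter_chunks(lines: Sequence[T], chunk_size: int = 16) -> Iterator[List[T]]:
    """Yield consecutive batches of at most chunk_size text lines."""
    if chunk_size not in ALLOWED_CHUNK_SIZES:
        raise ValueError(f"chunk_size must be one of {ALLOWED_CHUNK_SIZES}")
    for start in range(0, len(lines), chunk_size):
        yield list(lines[start:start + chunk_size])


# 40 placeholder line crops, split with the default chunk_size of 16.
batches = list(iter_chunks(list(range(40))))
print([len(b) for b in batches])  # [16, 16, 8]
```

Each batch would be sent through the model in one forward pass, which is why a larger chunk_size trades memory for fewer passes.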
Model Files
- MIT 32px: data/models/mit32px_ocr.ckpt
- MIT 48px: data/models/ocr_ar_48px.ckpt + data/alphabet-all-v7.txt
- MIT 48px CTC: data/models/mit48pxctc_ocr.ckpt + data/alphabet-all-v5.txt
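A quick way to confirm that the files for a variant are in place is to check the listed paths before selecting the engine. The sketch below assumes the paths are relative to the application's root directory; the helper name missing_model_files is hypothetical.

```python
from pathlib import Path
from typing import Dict, List

# Paths as listed above, assumed to be relative to the application root.
MIT_MODEL_FILES: Dict[str, List[str]] = {
    "MIT 32px": ["data/models/mit32px_ocr.ckpt"],
    "MIT 48px": ["data/models/ocr_ar_48px.ckpt", "data/alphabet-all-v7.txt"],
    "MIT 48px CTC": ["data/models/mit48pxctc_ocr.ckpt", "data/alphabet-all-v5.txt"],
}


def missing_model_files(variant: str, root: Path = Path(".")) -> List[str]:
    """Return the listed files for a variant that do not exist under root."""
    return [p for p in MIT_MODEL_FILES[variant] if not (root / p).is_file()]


# An empty list means every file for the variant was found.
print(missing_model_files("MIT 48px"))
```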
Manga OCR
Overview
manga-ocr by kha-white. Optimized for Japanese manga text.
Configuration
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| device | selector | auto | Processing device (cuda/cpu/mps) |
Features
- Japanese Focused: Excellent for Japanese manga
- Transformer-based: Uses Vision Encoder-Decoder architecture
- No Color Detection: Returns text only
Model Files
data/models/manga-ocr-base/(multiple files from HuggingFace)
PaddleOCR
Overview
Multi-language OCR supporting 80+ languages. Highly configurable with multiple model versions.
Configuration
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| language | selector | English | Target language (80+ options) |
| device | selector | auto | Processing device |
| use_angle_cls | checkbox | False | Enable rotation detection |
| ocr_version | selector | PP-OCRv4 | Model version (v2-v4) |
| enable_mkldnn | checkbox | False | CPU acceleration via MKL-DNN |
| det_limit_side_len | int | 960 | Max detection side length |
| rec_batch_num | int | 6 | Recognition batch size |
| drop_score | float | 0.5 | Confidence threshold |
| text_case | selector | Capitalize | Output text case |
| output_format | selector | As Recognized | Format output text |
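As a rough illustration, the recognition-related settings in the table could be collected and validated like this before being handed to PaddleOCR. The PaddleSettings class is hypothetical, and the keyword names in to_kwargs assume the PaddleOCR 2.x constructor; newer releases may name them differently. The device setting is left out because how it maps to PaddleOCR is not described here.

```python
from dataclasses import dataclass
from typing import Any, Dict

# Display name -> PaddleOCR language code (small illustrative subset, assumed).
LANG_CODES = {"English": "en", "Chinese": "ch", "Japanese": "japan", "Korean": "korean"}


@dataclass
class PaddleSettings:
    """Hypothetical container using the defaults from the table above."""
    language: str = "English"
    use_angle_cls: bool = False
    ocr_version: str = "PP-OCRv4"
    enable_mkldnn: bool = False
    det_limit_side_len: int = 960
    rec_batch_num: int = 6
    drop_score: float = 0.5

    def to_kwargs(self) -> Dict[str, Any]:
        """Validate the settings and return constructor keyword arguments."""
        if self.language not in LANG_CODES:
            raise ValueError(f"no language code known for {self.language!r}")
        if self.ocr_version not in ("PP-OCRv2", "PP-OCRv3", "PP-OCRv4"):
            raise ValueError(f"unsupported ocr_version: {self.ocr_version!r}")
        if not 0.0 <= self.drop_score <= 1.0:
            raise ValueError("drop_score must be between 0 and 1")
        return {
            "lang": LANG_CODES[self.language],
            "use_angle_cls": self.use_angle_cls,
            "ocr_version": self.ocr_version,
            "enable_mkldnn": self.enable_mkldnn,
            "det_limit_side_len": self.det_limit_side_len,
            "rec_batch_num": self.rec_batch_num,
            "drop_score": self.drop_score,
        }


kwargs = PaddleSettings(language="Japanese", use_angle_cls=True).to_kwargs()
print(kwargs)
# With paddleocr 2.x installed, this dict could be passed as PaddleOCR(**kwargs).
```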
Supported Languages
Over 80 languages, including:
- Asian: Chinese (Simplified/Traditional), Japanese, Korean, Thai, Vietnamese
- European: English, French, German, Spanish, Italian, Russian, etc.
- Middle Eastern: Arabic, Persian, Urdu, Hebrew
- Indic: Hindi, Tamil, Telugu, Bengali, etc.
Text Case Options
- Uppercase: ALL CAPS
- Capitalize Sentences: First word of each sentence capitalized
- Lowercase: all lowercase
Output Format
- Single Line: Concatenate all text to one line
- As Recognized: Preserve detected line breaks
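The two post-processing options can be sketched in a few lines. This is an illustration under assumptions rather than the engine's implementation: the function names are hypothetical, sentence boundaries are taken to be ".", "!" or "?" followed by whitespace, only ASCII letters are capitalized, and Single Line joins lines with a space.

```python
import re
from typing import List


def apply_text_case(text: str, mode: str) -> str:
    """Apply one of the Text Case options to recognized text."""
    if mode == "Uppercase":
        return text.upper()
    if mode == "Lowercase":
        return text.lower()
    if mode == "Capitalize Sentences":
        # Lowercase everything, then uppercase the first letter of the text
        # and the first letter after each sentence-ending punctuation mark.
        return re.sub(
            r"(^\s*|[.!?]\s+)([a-z])",
            lambda m: m.group(1) + m.group(2).upper(),
            text.lower(),
        )
    raise ValueError(f"unknown text case mode: {mode!r}")


def format_output(lines: List[str], mode: str) -> str:
    """Apply one of the Output Format options to recognized lines."""
    if mode == "Single Line":
        return " ".join(lines)  # assumption: lines are joined with a space
    if mode == "As Recognized":
        return "\n".join(lines)
    raise ValueError(f"unknown output format: {mode!r}")


print(apply_text_case("HELLO THERE. how are you?", "Capitalize Sentences"))
# Hello there. How are you?
print(format_output(["Hello", "there"], "Single Line"))
# Hello there
```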
Features
- Multi-language: 80+ languages
- Version Selection: PP-OCRv2, v3, v4
- Angle Classification: Detect and correct rotated text
- Text Formatting: Built-in case conversion
- CPU Acceleration: MKL-DNN support
PaddleOCR VL Manga
Overview
PaddleOCR-VL-For-Manga: a Vision-Language model for manga.
Features
- Manga-Optimized: Trained specifically for manga text
- Japanese Support: Excellent Japanese recognition
- VL Architecture: Vision-Language model approach
Stariver OCR
Overview
Cloud-based OCR from Stariver Cloud.
Setup
Features
- Cloud-Based: No local model required
- Combined with Detector: Best used with Stariver Detector
- High Accuracy: Professional OCR service
Recommended Usage
For optimal performance with Stariver:
- Use Stariver Detector for text detection
- Set OCR to none_ocr
- Text is extracted directly from detector
- Saves API credits and processing time
Google Vision OCR
Bing Lens OCR
macOS/Windows Built-in OCR
LLM-based OCR
OneOCR
None OCR
Usage Example
Choosing an OCR Engine
| Use Case | Recommended Engine | Notes |
|---|---|---|
| Japanese manga | MIT 48px or Manga OCR | Best accuracy for Japanese |
| English comics | MIT 48px or PaddleOCR | Good general performance |
| Chinese manga | MIT 48px or PaddleOCR | Supports both variants |
| Multi-language | PaddleOCR | 80+ languages |
| Need colors | MIT models | Only MIT extracts colors |
| Cloud-based | Stariver or Google Vision | High accuracy, costs credits |
| Offline/Free | MIT or PaddleOCR | Best offline options |
Performance Tips
- Chunk Size: Larger chunks (24-32) are faster but use more VRAM
- Device: Always use CUDA if available
- Model Selection: MIT 48px offers best accuracy/speed balance
- Batch Processing: Process multiple pages in batches
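The device preference can be sketched as a small resolver for the auto setting. It assumes the models run on PyTorch and falls back to the CPU when PyTorch is not installed; the function name resolve_device is hypothetical, and preferring mps over cpu when CUDA is absent is an assumption.

```python
def resolve_device(requested: str = "auto") -> str:
    """Map the device setting to a concrete backend name."""
    if requested != "auto":
        if requested not in ("cuda", "cpu", "mps"):
            raise ValueError(f"unknown device: {requested!r}")
        return requested
    try:
        import torch  # assumption: the OCR models are PyTorch-based
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps is only present in newer PyTorch releases.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(resolve_device())  # prints "cuda", "mps" or "cpu" depending on the machine
```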
Color Extraction
Only MIT OCR models extract text colors.
Advanced Configuration
PaddleOCR Custom Models
Models are stored in data/models/paddle-ocr/{lang}/{version}/:
- det/: Detection model
- rec/: Recognition model
- cls/: Angle classification model (optional)
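The directory layout can be checked with a short helper before pointing PaddleOCR at a custom model. This is a sketch only: the function name paddle_model_dirs is hypothetical, and the values "en" and "PP-OCRv4" used for {lang} and {version} are illustrative guesses, since the exact directory names are not specified here.

```python
from pathlib import Path
from typing import Dict, Optional


def paddle_model_dirs(
    lang: str,
    version: str,
    root: Path = Path("data/models/paddle-ocr"),
) -> Dict[str, Optional[Path]]:
    """Locate det/, rec/ and the optional cls/ directory for one model."""
    base = root / lang / version
    dirs: Dict[str, Optional[Path]] = {"det": base / "det", "rec": base / "rec"}
    # det/ and rec/ are required; cls/ is optional.
    missing = [name for name, path in dirs.items() if not path.is_dir()]
    if missing:
        raise FileNotFoundError(f"missing under {base}: {', '.join(missing)}")
    cls_dir = base / "cls"
    dirs["cls"] = cls_dir if cls_dir.is_dir() else None
    return dirs


try:
    print(paddle_model_dirs("en", "PP-OCRv4"))  # illustrative lang/version values
except FileNotFoundError as exc:
    print(f"custom model not usable: {exc}")
```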
