Overview
Inpainters remove text from images by filling masked text regions with plausible background content. BallonsTranslator provides multiple inpainting algorithms optimized for manga/comic translation.
InpainterBase Class
Base class for all inpainting modules.
Import
from modules.inpaint.base import InpainterBase, INPAINTERS, register_inpainter
Class Definition
class InpainterBase(BaseModule):
    """
    Base class for inpainting modules.
    Removes text from images by filling masked regions.
    Supports both full-image and block-based inpainting.
    """
    inpaint_by_block = True      # Process blocks individually
    check_need_inpaint = True    # Skip inpainting for solid backgrounds
    _preprocess_hooks = OrderedDict()
    _postprocess_hooks = OrderedDict()
Methods
inpaint
Inpaint text regions in an image.

Parameters:
img (np.ndarray): Input image (RGB or RGBA)
mask (np.ndarray): Binary mask of text regions (255 = text, 0 = background)
textblock_list (List[TextBlock] | None): Text blocks for block-based inpainting
check_need_inpaint (bool): Whether to skip solid backgrounds (default: False)
Returns: np.ndarray - Inpainted image (same format as input)

inpainter = INPAINTERS['lama_large_512px']()
# Full image inpainting
inpainted = inpainter.inpaint(img, mask)
# Block-based inpainting
inpainted = inpainter.inpaint(img, mask, text_blocks)
The inpaint method automatically:
- Loads the model if not already loaded
- Preserves alpha channel for RGBA images
- Handles CUDA out-of-memory errors by falling back to CPU
- Optionally skips inpainting for solid color backgrounds
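The steps above can be sketched as follows. This is a simplified illustration, not the actual implementation: `inpaint_fn` stands in for a subclass's `_inpaint`, and model loading, block handling, and the CUDA fallback are omitted.

```python
import numpy as np

def inpaint_wrapper(inpaint_fn, img, mask):
    """Sketch of the automatic handling around _inpaint (simplified)."""
    alpha = None
    if img.ndim == 3 and img.shape[2] == 4:
        # RGBA input: split off the alpha channel and inpaint RGB only
        alpha = img[:, :, 3:4]
        img = img[:, :, :3]
    out = inpaint_fn(img, mask)
    if alpha is not None:
        # Re-attach alpha so the output matches the input format
        out = np.concatenate([out, alpha], axis=2)
    return out
```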
_inpaint (Override Required)
Internal inpainting method. Must be implemented by subclasses.

Parameters:
img (np.ndarray): Input image (RGB only)
mask (np.ndarray): Binary mask
textblock_list (List[TextBlock] | None): Text blocks
Returns: np.ndarray - Inpainted image (RGB)

def _inpaint(self, img: np.ndarray, mask: np.ndarray, textblock_list=None) -> np.ndarray:
    # Your inpainting logic
    result = self.model.inpaint(img, mask)
    return result
moveToDevice
Move the model to a different device.

Parameters:
device (str): Target device (cpu, cuda, etc.)
precision (str | None): Model precision (fp32, fp16, bf16)
inpainter.moveToDevice('cuda', precision='fp16')
Properties
name: Inpainter name from the registry, set automatically during initialization.
inpaint_by_block: Whether to process each text block individually. Block-based inpainting is more memory-efficient.
check_need_inpaint: Whether to detect solid backgrounds and skip inpainting (filling with the median color instead).
Available Inpainters
LaMa Large (Recommended)
High-quality deep learning inpainter with large receptive field.
from modules.inpaint.base import INPAINTERS
inpainter = INPAINTERS['lama_large_512px'](
device='cuda',
inpaint_size=1536,
precision='bf16'
)
Parameters
device: Computation device. Options: cpu, cuda, mps, xpu (not privateuseone). Default: auto-detected.
inpaint_size: Maximum inpainting size. Larger values improve quality but use more memory. Options: 512, 768, 1024, 1536, 2048. Default: 1536.
precision: Model precision for inference. Options: fp32, bf16. Default: bf16 (if supported), otherwise fp32.
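To see how inpaint_size interacts with image resolution, here is a hypothetical sketch of the resize rule: the longer side is capped at inpaint_size and both dimensions are rounded to a multiple of the model stride. The stride-64 assumption mirrors the resize_keepasp(..., stride=64) call shown later on this page; the helper name is illustrative, not part of the API.

```python
def target_size(h, w, inpaint_size=1536, stride=64):
    """Cap the longer side at inpaint_size and round to the model stride.

    Hypothetical helper for illustration; the real resizing is done by
    resize_keepasp in utils.imgproc_utils.
    """
    scale = min(1.0, inpaint_size / max(h, w))
    nh = max(stride, round(h * scale / stride) * stride)
    nw = max(stride, round(w * scale / stride) * stride)
    return nh, nw
```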
Features
- State-of-the-art quality
- Large architecture for better context
- Supports bfloat16 for faster inference
- Optimized for manga/comic inpainting
Model Files
Downloads from HuggingFace: dreMaz/AnimeMangaInpainting/lama_large_512px.ckpt
LaMa MPE
LaMa with Masked Position Encoding - good balance of quality and speed.
inpainter = INPAINTERS['lama_mpe'](
device='cuda',
inpaint_size=2048
)
Parameters
device: Computation device. Default: auto-detected.
inpaint_size: Maximum inpainting size. Options: 1024, 2048. Default: 2048.
Features
- Fast inference with MPE
- Good quality for most use cases
- Lower memory usage than LaMa Large
AOT Inpainter
Aggregated Contextual Transformations inpainter.
inpainter = INPAINTERS['aot'](
device='cuda',
inpaint_size=2048
)
Parameters
device: Computation device. Default: auto-detected.
inpaint_size: Maximum inpainting size. Options: 1024, 2048. Default: 2048.
Features
- Fast and efficient
- Good for simple backgrounds
- Lower quality than LaMa models
PatchMatch Inpainter
Traditional algorithm-based inpainter (no deep learning).
inpainter = INPAINTERS['patchmatch']()
Features
- No GPU required
- Fast for small regions
- Good for textured backgrounds
- Lower quality than deep learning methods
Requires platform-specific libraries:
- macOS: macos_patchmatch_libs.7z
- Windows: windows_patchmatch_libs.7z
OpenCV Telea
Simple OpenCV-based inpainter.
inpainter = INPAINTERS['opencv-tela']()
Features
- Lightweight
- No model download needed
- Fastest option
- Lowest quality
Example Usage
Basic Inpainting
import cv2
from modules.base import init_inpainter_registries
from modules.inpaint.base import INPAINTERS
# Initialize
init_inpainter_registries()
# Load image and mask
img = cv2.imread('page.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
mask = cv2.imread('text_mask.png', cv2.IMREAD_GRAYSCALE)
# Get inpainter
inpainter = INPAINTERS['lama_large_512px'](device='cuda')
# Inpaint
inpainted = inpainter.inpaint(img, mask)
# Save result
cv2.imwrite('result.jpg', cv2.cvtColor(inpainted, cv2.COLOR_RGB2BGR))
Complete Translation Pipeline
from modules.textdetector.base import TEXTDETECTORS
from modules.inpaint.base import INPAINTERS
# Detect text
detector = TEXTDETECTORS['ctd'](device='cuda')
mask, text_blocks = detector.detect(img)
# Inpaint text regions
inpainter = INPAINTERS['lama_large_512px'](device='cuda')
inpainted = inpainter.inpaint(img, mask, text_blocks)
# Now you can render translated text on clean background
RGBA Image Support
# Load RGBA image (with transparency); OpenCV loads it as BGRA
img = cv2.imread('page.png', cv2.IMREAD_UNCHANGED)
img = cv2.cvtColor(img, cv2.COLOR_BGRA2RGBA)
# Inpaint (the alpha channel is preserved)
inpainted = inpainter.inpaint(img, mask)
# Result keeps the alpha channel
assert inpainted.shape[2] == 4  # RGBA
Memory-Efficient Block-Based Inpainting
# For large images, process blocks individually
inpainter = INPAINTERS['lama_large_512px'](
device='cuda',
inpaint_size=1024 # Smaller size for memory efficiency
)
# Block-based inpainting (default)
inpainted = inpainter.inpaint(img, mask, text_blocks)
# Force full-image inpainting
inpainter.inpaint_by_block = False
inpainted = inpainter.inpaint(img, mask)
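A hypothetical sketch of what block-based processing does: crop a padded bounding box around each text block, inpaint the crop, and paste it back. The real implementation lives in InpainterBase and uses TextBlock geometry; here plain (x1, y1, x2, y2) boxes stand in for text blocks.

```python
import numpy as np

def inpaint_by_blocks(inpaint_fn, img, mask, boxes, pad=16):
    """Inpaint each (x1, y1, x2, y2) box separately instead of the full image."""
    out = img.copy()
    h, w = img.shape[:2]
    for x1, y1, x2, y2 in boxes:
        # Expand the box by `pad` pixels of context, clamped to the image
        x1, y1 = max(0, x1 - pad), max(0, y1 - pad)
        x2, y2 = min(w, x2 + pad), min(h, y2 + pad)
        out[y1:y2, x1:x2] = inpaint_fn(out[y1:y2, x1:x2], mask[y1:y2, x1:x2])
    return out
```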
Creating Custom Inpainters
Basic Template
import numpy as np
from typing import List
import torch
from modules.inpaint.base import InpainterBase, register_inpainter, TextBlock, DEVICE_SELECTOR
@register_inpainter('my_inpainter')
class MyInpainter(InpainterBase):

    params = {
        'device': DEVICE_SELECTOR(),
        'inpaint_size': {
            'type': 'selector',
            'options': [512, 1024, 2048],
            'value': 1024
        },
        'description': 'My custom inpainter'
    }
    _load_model_keys = {'model'}

    def __init__(self, **params):
        super().__init__(**params)
        self.device = self.get_param_value('device')
        self.inpaint_size = self.get_param_value('inpaint_size')
        self.model = None

    def _load_model(self):
        """Load the inpainting model."""
        self.model = load_my_inpainting_model(device=self.device)

    def _inpaint(self, img: np.ndarray, mask: np.ndarray, textblock_list=None) -> np.ndarray:
        """Inpaint the image."""
        # Preprocess
        img_tensor = self.preprocess(img, mask)
        # Run model
        with torch.no_grad():
            output = self.model(img_tensor)
        # Postprocess back to an HxWx3 uint8 array
        result = self.postprocess(output, img.shape)
        return result

    def moveToDevice(self, device: str, precision: str = None):
        """Move the model to a device and optionally switch precision."""
        self.model.to(device)
        if precision == 'fp16':
            self.model.half()
        elif precision == 'bf16':
            self.model.to(torch.bfloat16)
        elif precision == 'fp32':
            self.model.float()
        self.device = device
Advanced: With Preprocessing
import cv2
import numpy as np
import torch
from utils.imgproc_utils import resize_keepasp

@register_inpainter('advanced_inpainter')
class AdvancedInpainter(InpainterBase):

    def inpaint_preprocess(self, img: np.ndarray, mask: np.ndarray):
        """Preprocess image and mask."""
        img_original = np.copy(img)
        mask_original = np.copy(mask)

        # Binarize the mask (0/1) and add a channel axis for blending
        mask_original[mask_original < 127] = 0
        mask_original[mask_original >= 127] = 1
        mask_original = mask_original[:, :, None]

        # Downscale only if the image exceeds inpaint_size
        new_shape = self.inpaint_size if max(img.shape[:2]) > self.inpaint_size else None
        img_resized = resize_keepasp(img, new_shape, stride=64)
        mask_resized = resize_keepasp(mask, new_shape, stride=64)

        # Convert to tensors in [0, 1]
        img_tensor = torch.from_numpy(img_resized).permute(2, 0, 1).unsqueeze(0).float() / 255.0
        mask_tensor = torch.from_numpy(mask_resized).unsqueeze(0).unsqueeze(0).float() / 255.0
        if self.device != 'cpu':
            img_tensor = img_tensor.to(self.device)
            mask_tensor = mask_tensor.to(self.device)

        # Zero out the masked (text) pixels before feeding the model
        img_tensor *= (1 - mask_tensor)
        return img_tensor, mask_tensor, img_original, mask_original

    @torch.no_grad()
    def _inpaint(self, img: np.ndarray, mask: np.ndarray, textblock_list=None) -> np.ndarray:
        """Inpaint with preprocessing."""
        im_h, im_w = img.shape[:2]

        # Preprocess
        img_t, mask_t, img_orig, mask_orig = self.inpaint_preprocess(img, mask)

        # Inpaint
        result_t = self.model(img_t, mask_t)

        # Postprocess
        result = (result_t.cpu().squeeze(0).permute(1, 2, 0).numpy() * 255).astype(np.uint8)

        # Resize back to the original resolution
        if result.shape[:2] != (im_h, im_w):
            result = cv2.resize(result, (im_w, im_h), interpolation=cv2.INTER_LINEAR)

        # Blend: keep the original pixels outside the mask
        result = result * mask_orig + img_orig * (1 - mask_orig)
        return result.astype(np.uint8)
Best Practices
1. Choosing the Right Inpainter
# For best quality (slower, more memory)
inpainter = INPAINTERS['lama_large_512px'](
device='cuda',
inpaint_size=1536,
precision='bf16'
)
# For balanced quality/speed
inpainter = INPAINTERS['lama_mpe'](device='cuda')
# For speed (lower quality)
inpainter = INPAINTERS['aot'](device='cuda')
# For CPU-only systems
inpainter = INPAINTERS['patchmatch']()
2. Memory Management
# For large images, use smaller inpaint_size
inpainter = INPAINTERS['lama_large_512px'](
device='cuda',
inpaint_size=1024 # Instead of 2048
)
# Process by blocks to save memory
inpainter.inpaint_by_block = True
inpainted = inpainter.inpaint(img, mask, text_blocks)
# Unload when done
inpainter.unload_model(empty_cache=True)
3. Handling Out-of-Memory Errors
# The inpainter automatically falls back to CPU on OOM
try:
    inpainted = inpainter.inpaint(img, mask)
except Exception as e:
    # Manual fallback
    print(f"Failed: {e}")
    inpainter.updateParam('device', 'cpu')
    inpainted = inpainter.inpaint(img, mask)
4. Optimizing for Solid Backgrounds
# Enable automatic solid background detection
inpainter = INPAINTERS['lama_large_512px']()
# Skips inpainting for solid colors, fills with median instead
inpainted = inpainter.inpaint(img, mask, text_blocks, check_need_inpaint=True)
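The solid-background shortcut can be sketched like this. It is an assumption-laden illustration (the function name and the uniformity threshold are hypothetical; the actual heuristic in InpainterBase may differ): if the background pixels around the masked text are nearly uniform, skip the model and fill the mask with the median background color.

```python
import numpy as np

def fill_if_solid(img, mask, std_thresh=8.0):
    """Hypothetical sketch: fill with the median color if the background
    around the masked text is nearly uniform; return None otherwise."""
    bg = img[mask == 0]                       # pixels outside the text mask
    if bg.size and bg.std(axis=0).max() < std_thresh:
        out = img.copy()
        out[mask > 0] = np.median(bg, axis=0).astype(img.dtype)
        return out
    return None  # textured background: run the real inpainter
```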
5. Batch Processing
import glob
import os

inpainter = INPAINTERS['lama_large_512px'](device='cuda')

# Example paths; adjust the glob patterns to your data layout
image_paths = sorted(glob.glob('pages/*.jpg'))
mask_paths = sorted(glob.glob('masks/*.png'))

for img_path, mask_path in zip(image_paths, mask_paths):
    img = cv2.imread(img_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
    inpainted = inpainter.inpaint(img, mask)
    # Save
    out_path = f'inpainted_{os.path.basename(img_path)}'
    cv2.imwrite(out_path, cv2.cvtColor(inpainted, cv2.COLOR_RGB2BGR))
# Clean up
inpainter.unload_model(empty_cache=True)
6. Precision Settings
from modules.base import BF16_SUPPORTED
# Use bf16 for faster inference if supported
if BF16_SUPPORTED:
    inpainter = INPAINTERS['lama_large_512px'](
        device='cuda',
        precision='bf16'
    )
else:
    # Fall back to fp32
    inpainter = INPAINTERS['lama_large_512px'](
        device='cuda',
        precision='fp32'
    )
Utility Functions
Alpha Channel Handling
from modules.inpaint.base import inpaint_handle_alpha_channel
# Automatically adjusts alpha channel based on surrounding transparency
original_alpha = img[:, :, 3:4] # Extract alpha
result_alpha = inpaint_handle_alpha_channel(original_alpha, mask)
# Combine with inpainted RGB
result = np.concatenate([inpainted_rgb, result_alpha], axis=2)
Registry Usage
Listing Available Inpainters
from modules.base import init_inpainter_registries
from modules.inpaint.base import INPAINTERS
init_inpainter_registries()
print("Available inpainters:")
for name, inpainter_class in INPAINTERS.module_dict.items():
    print(f" - {name}: {inpainter_class}")
Dynamic Selection
def get_best_inpainter(device='cuda', quality='high'):
    """Get the best inpainter for the given requirements."""
    if quality == 'high':
        return INPAINTERS['lama_large_512px'](device=device)
    elif quality == 'medium':
        return INPAINTERS['lama_mpe'](device=device)
    elif quality == 'fast':
        return INPAINTERS['aot'](device=device)
    else:
        return INPAINTERS['patchmatch']()

inpainter = get_best_inpainter(quality='high')