Overview

Inpainters remove text from images by filling text regions with background patterns. BallonTranslator provides multiple inpainting algorithms optimized for manga/comic translation.

InpainterBase Class

Base class for all inpainting modules.

Import

from modules.inpaint.base import InpainterBase, INPAINTERS, register_inpainter

Class Definition

class InpainterBase(BaseModule):
    """
    Base class for inpainting modules.
    
    Removes text from images by filling masked regions.
    Supports both full-image and block-based inpainting.
    """
    
    inpaint_by_block = True      # Process blocks individually
    check_need_inpaint = True    # Skip inpainting for solid backgrounds
    
    _preprocess_hooks = OrderedDict()
    _postprocess_hooks = OrderedDict()

Methods

inpaint

Inpaint text regions in an image.

Parameters:
  • img (np.ndarray): Input image (RGB or RGBA)
  • mask (np.ndarray): Binary mask of text regions (255 = text, 0 = background)
  • textblock_list (List[TextBlock] | None): Text blocks for block-based inpainting
  • check_need_inpaint (bool): Whether to skip solid backgrounds (default: False)

Returns: np.ndarray - Inpainted image (same format as input)
inpainter = INPAINTERS['lama_large_512px']()

# Full image inpainting
inpainted = inpainter.inpaint(img, mask)

# Block-based inpainting
inpainted = inpainter.inpaint(img, mask, text_blocks)
The inpaint method automatically:
  • Loads the model if not already loaded
  • Preserves alpha channel for RGBA images
  • Handles CUDA out-of-memory errors by falling back to CPU
  • Optionally skips inpainting for solid color backgrounds
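The alpha-channel handling above can be pictured as a thin wrapper around the model call. The sketch below is a simplified illustration, not the library's actual implementation; `run_inpaint` is a hypothetical stand-in for the underlying `_inpaint` call:

```python
import numpy as np

def inpaint_with_alpha(img: np.ndarray, mask: np.ndarray, run_inpaint) -> np.ndarray:
    """Illustrative wrapper: split off the alpha channel, inpaint the RGB
    planes, then reattach the untouched alpha to the result.

    run_inpaint is a hypothetical callable (img_rgb, mask) -> inpainted RGB.
    """
    alpha = None
    if img.ndim == 3 and img.shape[2] == 4:
        img, alpha = img[:, :, :3], img[:, :, 3:4]
    result = run_inpaint(img, mask)          # model sees RGB only
    if alpha is not None:
        result = np.concatenate([result, alpha], axis=2)
    return result
```

The CUDA out-of-memory fallback mentioned above amounts to wrapping `run_inpaint` in a try/except that moves the model to CPU and retries.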

_inpaint (Override Required)

Internal inpainting method. Must be implemented by subclasses.

Parameters:
  • img (np.ndarray): Input image (RGB only)
  • mask (np.ndarray): Binary mask
  • textblock_list (List[TextBlock] | None): Text blocks

Returns: np.ndarray - Inpainted image (RGB)
def _inpaint(self, img: np.ndarray, mask: np.ndarray, textblock_list=None) -> np.ndarray:
    # Your inpainting logic
    result = self.model.inpaint(img, mask)
    return result

moveToDevice

Move model to a different device.

Parameters:
  • device (str): Target device (cpu, cuda, etc.)
  • precision (str | None): Model precision (fp32, fp16, bf16)
inpainter.moveToDevice('cuda', precision='fp16')

Properties

  • name (str): Inpainter name from the registry. Automatically set during initialization.
  • inpaint_by_block (bool): Whether to process each text block individually. Block-based inpainting is more memory-efficient.
  • check_need_inpaint (bool): Whether to detect solid backgrounds and skip model inference (fills with the median background color instead).
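The solid-background shortcut behind check_need_inpaint can be approximated as follows. This is a simplified numpy sketch of the idea, with an assumed uniformity threshold, not the project's exact heuristic:

```python
import numpy as np

def try_median_fill(img: np.ndarray, mask: np.ndarray, std_thresh: float = 8.0):
    """If the unmasked background is close to a solid color, skip the model
    and fill the masked region with the background's median color.

    Returns the filled image, or None when the background varies too much
    and the real inpainter should run instead.
    """
    bg = img[mask == 0]                      # pixels outside the text mask
    if bg.size == 0 or bg.std(axis=0).max() >= std_thresh:
        return None                          # background not solid enough
    out = img.copy()
    out[mask > 0] = np.median(bg, axis=0).astype(img.dtype)
    return out
```

A median fill is essentially free compared to a forward pass, which is why skipping the model on flat speech-bubble interiors pays off.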

Available Inpainters

LaMa Large

High-quality deep learning inpainter with a large receptive field.
from modules.inpaint.base import INPAINTERS

inpainter = INPAINTERS['lama_large_512px'](
    device='cuda',
    inpaint_size=1536,
    precision='bf16'
)

Parameters

  • device (str): Computation device. Options: cpu, cuda, mps, xpu (not privateuseone). Default: auto-detected.
  • inpaint_size (int): Maximum inpainting size. Larger values improve quality but use more memory. Options: 512, 768, 1024, 1536, 2048. Default: 1536.
  • precision (str): Model precision for inference. Options: fp32, bf16. Default: bf16 if supported, otherwise fp32.

Features

  • State-of-the-art quality
  • Large architecture for better context
  • Supports bfloat16 for faster inference
  • Optimized for manga/comic inpainting

Model Files

Downloads from HuggingFace: dreMaz/AnimeMangaInpainting/lama_large_512px.ckpt

LaMa MPE

LaMa with Masked Position Encoding - good balance of quality and speed.
inpainter = INPAINTERS['lama_mpe'](
    device='cuda',
    inpaint_size=2048
)

Parameters

  • device (str): Computation device. Default: auto-detected.
  • inpaint_size (int): Maximum inpainting size. Options: 1024, 2048. Default: 2048.

Features

  • Fast inference with MPE
  • Good quality for most use cases
  • Lower memory usage than LaMa Large

AOT Inpainter

Aggregated Contextual Transformations inpainter.
inpainter = INPAINTERS['aot'](
    device='cuda',
    inpaint_size=2048
)

Parameters

  • device (str): Computation device. Default: auto-detected.
  • inpaint_size (int): Maximum inpainting size. Options: 1024, 2048. Default: 2048.

Features

  • Fast and efficient
  • Good for simple backgrounds
  • Lower quality than LaMa models

PatchMatch Inpainter

Traditional algorithm-based inpainter (no deep learning).
inpainter = INPAINTERS['patchmatch']()

Features

  • No GPU required
  • Fast for small regions
  • Good for textured backgrounds
  • Lower quality than deep learning methods

Platform Support

Requires platform-specific libraries:
  • macOS: macos_patchmatch_libs.7z
  • Windows: windows_patchmatch_libs.7z

OpenCV Telea

Simple OpenCV-based inpainter.
inpainter = INPAINTERS['opencv-tela']()

Features

  • Lightweight
  • No model download needed
  • Fastest option
  • Lowest quality

Example Usage

Basic Inpainting

import cv2
from modules.base import init_inpainter_registries
from modules.inpaint.base import INPAINTERS

# Initialize
init_inpainter_registries()

# Load image and mask
img = cv2.imread('page.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
mask = cv2.imread('text_mask.png', cv2.IMREAD_GRAYSCALE)

# Get inpainter
inpainter = INPAINTERS['lama_large_512px'](device='cuda')

# Inpaint
inpainted = inpainter.inpaint(img, mask)

# Save result
cv2.imwrite('result.jpg', cv2.cvtColor(inpainted, cv2.COLOR_RGB2BGR))

Complete Translation Pipeline

from modules.textdetector.base import TEXTDETECTORS
from modules.inpaint.base import INPAINTERS

# Detect text
detector = TEXTDETECTORS['ctd'](device='cuda')
mask, text_blocks = detector.detect(img)

# Inpaint text regions
inpainter = INPAINTERS['lama_large_512px'](device='cuda')
inpainted = inpainter.inpaint(img, mask, text_blocks)

# Now you can render translated text on clean background

RGBA Image Support

# Load RGBA image (with transparency)
img = cv2.imread('page.png', cv2.IMREAD_UNCHANGED)

# Inpaint (alpha channel is preserved)
inpainted = inpainter.inpaint(img, mask)

# Result maintains alpha channel
assert inpainted.shape[2] == 4  # RGBA

Memory-Efficient Block-Based Inpainting

# For large images, process blocks individually
inpainter = INPAINTERS['lama_large_512px'](
    device='cuda',
    inpaint_size=1024  # Smaller size for memory efficiency
)

# Block-based inpainting (default)
inpainted = inpainter.inpaint(img, mask, text_blocks)

# Force full-image inpainting
inpainter.inpaint_by_block = False
inpainted = inpainter.inpaint(img, mask)
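Block-based processing can be pictured as cropping each text block's bounding box (plus some padding), inpainting the crops independently, and pasting the results back. A rough standalone sketch, with plain (x1, y1, x2, y2) tuples standing in for TextBlock boxes and `run_inpaint` as a hypothetical model call:

```python
import numpy as np

def inpaint_by_blocks(img, mask, boxes, run_inpaint, pad=16):
    """Illustrative per-block loop: crop around each box, inpaint, paste back.

    boxes: list of (x1, y1, x2, y2); run_inpaint is a hypothetical
    callable (img_crop, mask_crop) -> inpainted crop.
    """
    out = img.copy()
    h, w = img.shape[:2]
    for x1, y1, x2, y2 in boxes:
        # expand the box by `pad` pixels of context, clamped to the image
        x1, y1 = max(0, x1 - pad), max(0, y1 - pad)
        x2, y2 = min(w, x2 + pad), min(h, y2 + pad)
        crop = run_inpaint(out[y1:y2, x1:x2], mask[y1:y2, x1:x2])
        # paste back only where the mask marks text
        m = mask[y1:y2, x1:x2] > 0
        out[y1:y2, x1:x2][m] = crop[m]
    return out
```

Because each crop is far smaller than the full page, peak memory is bounded by the largest text block rather than the whole image.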

Creating Custom Inpainters

Basic Template

import numpy as np
from typing import List
import torch
from modules.inpaint.base import InpainterBase, register_inpainter, TextBlock, DEVICE_SELECTOR

@register_inpainter('my_inpainter')
class MyInpainter(InpainterBase):
    
    params = {
        'device': DEVICE_SELECTOR(),
        'inpaint_size': {
            'type': 'selector',
            'options': [512, 1024, 2048],
            'value': 1024
        },
        'description': 'My custom inpainter'
    }
    
    _load_model_keys = {'model'}
    
    def __init__(self, **params):
        super().__init__(**params)
        self.device = self.get_param_value('device')
        self.inpaint_size = self.get_param_value('inpaint_size')
        self.model = None
    
    def _load_model(self):
        """Load inpainting model."""
        self.model = load_my_inpainting_model(device=self.device)
    
    def _inpaint(self, img: np.ndarray, mask: np.ndarray, textblock_list=None) -> np.ndarray:
        """Inpaint the image."""
        # Preprocess
        img_tensor = self.preprocess(img, mask)
        
        # Run model
        with torch.no_grad():
            output = self.model(img_tensor)
        
        # Postprocess
        result = self.postprocess(output, img.shape)
        
        return result
    
    def moveToDevice(self, device: str, precision: str = None):
        """Move model to device (precision handling omitted in this sketch)."""
        self.model.to(device)
        self.device = device

Advanced: With Preprocessing

import cv2
from utils.imgproc_utils import resize_keepasp

@register_inpainter('advanced_inpainter')
class AdvancedInpainter(InpainterBase):
    
    def inpaint_preprocess(self, img: np.ndarray, mask: np.ndarray):
        """Preprocess image and mask."""
        img_original = np.copy(img)
        mask_original = np.copy(mask)
        
        # Threshold mask
        mask_original[mask_original < 127] = 0
        mask_original[mask_original >= 127] = 1
        mask_original = mask_original[:, :, None]
        
        # Resize if needed
        new_shape = self.inpaint_size if max(img.shape[:2]) > self.inpaint_size else None
        img_resized = resize_keepasp(img, new_shape, stride=64)
        mask_resized = resize_keepasp(mask, new_shape, stride=64)
        
        # Convert to tensor
        img_tensor = torch.from_numpy(img_resized).permute(2, 0, 1).unsqueeze(0).float() / 255.0
        mask_tensor = torch.from_numpy(mask_resized).unsqueeze(0).unsqueeze(0).float() / 255.0
        
        if self.device != 'cpu':
            img_tensor = img_tensor.to(self.device)
            mask_tensor = mask_tensor.to(self.device)
        
        # Apply mask to image
        img_tensor *= (1 - mask_tensor)
        
        return img_tensor, mask_tensor, img_original, mask_original
    
    @torch.no_grad()
    def _inpaint(self, img: np.ndarray, mask: np.ndarray, textblock_list=None) -> np.ndarray:
        """Inpaint with preprocessing."""
        im_h, im_w = img.shape[:2]
        
        # Preprocess
        img_t, mask_t, img_orig, mask_orig = self.inpaint_preprocess(img, mask)
        
        # Inpaint
        result_t = self.model(img_t, mask_t)
        
        # Postprocess
        result = (result_t.cpu().squeeze(0).permute(1, 2, 0).numpy() * 255).astype(np.uint8)
        
        # Resize back
        if result.shape[:2] != (im_h, im_w):
            result = cv2.resize(result, (im_w, im_h), interpolation=cv2.INTER_LINEAR)
        
        # Blend with original
        result = result * mask_orig + img_orig * (1 - mask_orig)
        
        return result.astype(np.uint8)
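The resize_keepasp call above keeps the aspect ratio while snapping both dimensions to multiples of the stride, which many inpainting backbones require. A standalone sketch of that shape calculation, assuming this is the intended behaviour:

```python
def stride_aligned_shape(h: int, w: int, target: int, stride: int = 64):
    """Scale (h, w) so the longer side is about `target`, then round each
    dimension to the nearest multiple of `stride` (at least one stride)."""
    scale = target / max(h, w)
    nh = max(stride, round(h * scale / stride) * stride)
    nw = max(stride, round(w * scale / stride) * stride)
    return nh, nw
```

For example, a 2048x1024 page resized toward 1024 lands on 1024x512, already divisible by 64, so no further padding is needed before the forward pass.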

Best Practices

1. Choosing the Right Inpainter

# For best quality (slower, more memory)
inpainter = INPAINTERS['lama_large_512px'](
    device='cuda',
    inpaint_size=1536,
    precision='bf16'
)

# For balanced quality/speed
inpainter = INPAINTERS['lama_mpe'](device='cuda')

# For speed (lower quality)
inpainter = INPAINTERS['aot'](device='cuda')

# For CPU-only systems
inpainter = INPAINTERS['patchmatch']()

2. Memory Management

# For large images, use smaller inpaint_size
inpainter = INPAINTERS['lama_large_512px'](
    device='cuda',
    inpaint_size=1024  # Instead of 2048
)

# Process by blocks to save memory
inpainter.inpaint_by_block = True
inpainted = inpainter.inpaint(img, mask, text_blocks)

# Unload when done
inpainter.unload_model(empty_cache=True)

3. Handling Out-of-Memory Errors

# The inpainter automatically falls back to CPU on OOM
try:
    inpainted = inpainter.inpaint(img, mask)
except Exception as e:
    # Manual fallback
    print(f"Failed: {e}")
    inpainter.updateParam('device', 'cpu')
    inpainted = inpainter.inpaint(img, mask)

4. Optimizing for Solid Backgrounds

# Enable automatic solid background detection
inpainter = INPAINTERS['lama_large_512px']()

# Skips inpainting for solid colors, fills with median instead
inpainted = inpainter.inpaint(img, mask, text_blocks, check_need_inpaint=True)

5. Batch Processing

import glob
import os

inpainter = INPAINTERS['lama_large_512px'](device='cuda')

image_paths = sorted(glob.glob('pages/*.jpg'))
mask_paths = sorted(glob.glob('masks/*.png'))

for img_path, mask_path in zip(image_paths, mask_paths):
    img = cv2.imread(img_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
    
    inpainted = inpainter.inpaint(img, mask)
    
    # Save with a prefixed filename
    out_name = 'inpainted_' + os.path.basename(img_path)
    cv2.imwrite(out_name, cv2.cvtColor(inpainted, cv2.COLOR_RGB2BGR))

# Clean up
inpainter.unload_model(empty_cache=True)

6. Precision Settings

from modules.base import BF16_SUPPORTED

# Use bf16 for faster inference if supported
if BF16_SUPPORTED:
    inpainter = INPAINTERS['lama_large_512px'](
        device='cuda',
        precision='bf16'
    )
else:
    # Fall back to fp32
    inpainter = INPAINTERS['lama_large_512px'](
        device='cuda',
        precision='fp32'
    )

Utility Functions

Alpha Channel Handling

from modules.inpaint.base import inpaint_handle_alpha_channel

# Automatically adjusts alpha channel based on surrounding transparency
original_alpha = img[:, :, 3:4]  # Extract alpha
result_alpha = inpaint_handle_alpha_channel(original_alpha, mask)

# Combine with inpainted RGB
result = np.concatenate([inpainted_rgb, result_alpha], axis=2)

Registry Usage

Listing Available Inpainters

from modules.base import init_inpainter_registries
from modules.inpaint.base import INPAINTERS

init_inpainter_registries()

print("Available inpainters:")
for name, inpainter_class in INPAINTERS.module_dict.items():
    print(f"  - {name}: {inpainter_class}")
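The registry follows a familiar register-by-decorator pattern. A minimal standalone sketch of how register_inpainter plausibly wires classes into INPAINTERS (simplified, not the project's exact implementation; the demo names are hypothetical):

```python
class Registry:
    """Toy module registry mirroring the register-by-decorator pattern."""
    def __init__(self):
        self.module_dict = {}

    def register(self, name):
        def decorator(cls):
            cls.name = name              # the `name` property set at registration
            self.module_dict[name] = cls
            return cls
        return decorator

    def __getitem__(self, name):
        return self.module_dict[name]

INPAINTERS_DEMO = Registry()

@INPAINTERS_DEMO.register('demo')
class DemoInpainter:
    pass
```

Registering at import time is what makes init_inpainter_registries() sufficient: importing the module files runs the decorators, and every inpainter becomes reachable by string key.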

Dynamic Selection

def get_best_inpainter(device='cuda', quality='high'):
    """Get best inpainter for requirements."""
    if quality == 'high':
        return INPAINTERS['lama_large_512px'](device=device)
    elif quality == 'medium':
        return INPAINTERS['lama_mpe'](device=device)
    elif quality == 'fast':
        return INPAINTERS['aot'](device=device)
    else:
        return INPAINTERS['patchmatch']()

inpainter = get_best_inpainter(quality='high')