The inpainting module removes text from images and reconstructs the background. BallonTranslator supports both deep learning and traditional computer vision approaches.

Available Inpainting Methods

AOT Inpainter

Overview

Deep learning inpainting model from manga-image-translator. Good balance of speed and quality.

Configuration

from modules.inpaint import AOTInpainter

inpainter = AOTInpainter(
    device='cuda',
    inpaint_size=2048
)

inpainted = inpainter.inpaint(img, mask, text_blocks)

Parameters

Parameter    | Type     | Default | Description
-------------|----------|---------|---------------------------------
device       | selector | auto    | Processing device (cuda/cpu/mps)
inpaint_size | int      | 2048    | Maximum inpainting resolution
precision    | selector | fp32    | Model precision (fp32/fp16/bf16)

Features

  • GPU Accelerated: Fast processing with CUDA
  • High Quality: Good reconstruction of complex backgrounds
  • Context-Aware: Uses surrounding context for inpainting
  • Mixed Precision: Support for fp16/bf16 to save VRAM

Model Files

  • data/models/aot_inpainter.ckpt

Performance Notes

  • Higher inpaint_size = better quality but more VRAM
  • Use fp16 precision to reduce memory usage
  • Automatic CUDA OOM handling with CPU fallback

LAMA Inpainter

Overview

State-of-the-art inpainting model from advimman/lama; the lama_mpe variant is fine-tuned for manga.

Configuration

from modules.inpaint import LamaInpainter

inpainter = LamaInpainter(
    device='cuda',
    inpaint_size=2048,
    precision='fp16'
)

Parameters

Parameter    | Type     | Default | Description
-------------|----------|---------|-------------------
device       | selector | auto    | Processing device
inpaint_size | int      | 2048    | Max resolution
precision    | selector | fp32    | Model precision

Available Variants

  • lama_large_512px: Standard LAMA model
  • lama_mpe: Manga-optimized fine-tuned version

Features

  • Excellent Quality: Best results for complex backgrounds
  • Large Mask Support: Handles large text regions well
  • Fine-tuned: Optimized for manga/comic content
  • FFC Architecture: Uses Fast Fourier Convolution

Model Files

  • data/models/lama_large_512px.ckpt
  • data/models/lama_mpe.ckpt (manga-optimized)

When to Use

  • Complex textured backgrounds
  • Large text regions
  • Maximum quality needed
  • VRAM is not a constraint

PatchMatch Inpainter

Overview

A traditional (non-deep-learning) algorithm based on PyPatchMatch, the same algorithm family that powers Photoshop's Content-Aware Fill.

Configuration

from modules.inpaint import PatchMatchInpainter

inpainter = PatchMatchInpainter(
    patch_size=15
)

inpainted = inpainter.inpaint(img, mask)

Parameters

Parameter  | Type | Default | Description
-----------|------|---------|------------------------
patch_size | int  | 15      | Patch size for matching

Features

  • No GPU Required: Pure CPU algorithm
  • Fast: Quick processing for small regions
  • No Model Files: Doesn’t require downloading models
  • Deterministic: Reproducible results
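To build intuition for patch-based filling, here is a deliberately tiny toy version of the core idea (this is NOT the real PyPatchMatch algorithm, which uses randomized nearest-neighbor search; this brute-force sketch only illustrates "fill each masked pixel from the known pixel whose surrounding patch looks most similar"):

```python
# Toy sketch of patch-based fill on a small grayscale image given as
# nested lists. Illustrative only; real PatchMatch is far more efficient.
def toy_patch_fill(img, mask, patch_size=3):
    h, w = len(img), len(img[0])
    r = patch_size // 2
    out = [row[:] for row in img]

    def patch(y, x):
        # Known pixel values in the patch around (y, x); None if masked
        # or out of bounds.
        vals = []
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w and not mask[yy][xx]:
                    vals.append(img[yy][xx])
                else:
                    vals.append(None)
        return vals

    def dist(p, q):
        # Mean squared difference over positions known in both patches.
        d, n = 0, 0
        for a, b in zip(p, q):
            if a is not None and b is not None:
                d += (a - b) ** 2
                n += 1
        return d / n if n else float("inf")

    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            target = patch(y, x)
            best, best_d = img[y][x], float("inf")
            for sy in range(h):
                for sx in range(w):
                    if mask[sy][sx]:
                        continue
                    d = dist(target, patch(sy, sx))
                    if d < best_d:
                        best_d, best = d, img[sy][sx]
            out[y][x] = best
    return out
```

On a flat background this copies the surrounding value into the hole, which is why the method works well on simple backgrounds and struggles on complex textures.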

Implementation Details

from modules.inpaint.patch_match import inpaint, inpaint_regularity

# Basic inpainting
result = inpaint(image, mask, patch_size=15)

# With regularity guidance
result = inpaint_regularity(
    image, 
    mask, 
    ijmap,  # Coordinate guidance map
    patch_size=15,
    guide_weight=0.25
)

Platform Support

  • Windows: data/libs/patchmatch_inpaint.dll
  • macOS: data/libs/macos_libpatchmatch_inpaint.dylib
  • Linux: data/libs/libpatchmatch.so

When to Use

  • Simple backgrounds
  • Limited GPU memory
  • Small text regions
  • No model downloads desired

Limitations

  • Less effective on complex textures
  • May produce artifacts on large regions
  • Slower than GPU methods for large images

Inpainting Strategies

Block-based Inpainting

Inpaint each text block individually:

inpainter.inpaint_by_block = True
inpainted = inpainter.inpaint(img, mask, text_blocks)

Advantages:
  • More efficient use of resolution
  • Better for scattered text
  • Lower memory usage

Full Image Inpainting

Inpaint the entire image at once:

inpainter.inpaint_by_block = False
inpainted = inpainter.inpaint(img, mask)

Advantages:
  • Better context across regions
  • More coherent results
  • Better for dense text
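Conceptually, the block-based strategy is a crop/inpaint/paste loop. The sketch below shows that loop with stand-in pieces: `inpaint_fn` and the `(x1, y1, x2, y2)` block format are assumptions for illustration, not the library's actual API.

```python
# Sketch of block-based inpainting: crop each text block with padding,
# inpaint the crop, and paste only the masked pixels back.
def inpaint_by_blocks(img, mask, blocks, inpaint_fn, pad=10):
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for (x1, y1, x2, y2) in blocks:
        # Expand the crop so the inpainter sees surrounding context.
        x1, y1 = max(0, x1 - pad), max(0, y1 - pad)
        x2, y2 = min(w, x2 + pad), min(h, y2 + pad)
        crop = [row[x1:x2] for row in img[y1:y2]]
        crop_mask = [row[x1:x2] for row in mask[y1:y2]]
        patched = inpaint_fn(crop, crop_mask)
        # Paste back only the masked pixels of this block.
        for yy in range(y1, y2):
            for xx in range(x1, x2):
                if crop_mask[yy - y1][xx - x1]:
                    out[yy][xx] = patched[yy - y1][xx - x1]
    return out
```

Because each crop is small, the model's fixed inpaint_size is spent on the text region itself rather than the whole page, which is where the resolution and memory advantages come from.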

Usage Example

import cv2
from modules.textdetector import TEXTDETECTORS
from modules.inpaint import INPAINTERS

# Load image
img = cv2.imread('manga_page.jpg')

# Detect text
detector = TEXTDETECTORS.module_dict['ctd']()
mask, text_blocks = detector.detect(img)

# Inpaint
inpainter_class = INPAINTERS.module_dict['lama_large_512px']
inpainter = inpainter_class(device='cuda', inpaint_size=2048)
inpainted = inpainter.inpaint(img, mask, text_blocks)

# Save result
cv2.imwrite('inpainted.jpg', inpainted)

RGBA Image Support

Inpainting automatically handles transparency:

# Load RGBA image
img = cv2.imread('page.png', cv2.IMREAD_UNCHANGED)  # Keep alpha

# Alpha channel is preserved automatically
inpainted = inpainter.inpaint(img, mask)

# Result maintains original transparency
assert inpainted.shape[2] == 4  # Still RGBA

The inpainter:
  1. Separates RGB and alpha channels
  2. Inpaints RGB only
  3. Reconstructs alpha based on surroundings
  4. Returns RGBA result
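The four steps above can be sketched as follows, using plain nested lists and a stand-in `inpaint_rgb` callable (the real module works on numpy arrays and reconstructs alpha from the surroundings; this simplest sketch just carries alpha through):

```python
# Sketch of the RGBA flow: split channels, inpaint RGB only, reassemble.
def inpaint_rgba(img, mask, inpaint_rgb):
    # img: H x W x 4 nested lists (R, G, B, A)
    rgb = [[px[:3] for px in row] for row in img]          # 1) separate
    alpha = [[px[3] for px in row] for row in img]
    out_rgb = inpaint_rgb(rgb, mask)                       # 2) inpaint RGB
    # 3) the real inpainter rebuilds alpha from surroundings;
    #    here it is carried through unchanged for simplicity.
    return [[out_rgb[y][x] + [alpha[y][x]]                 # 4) reassemble
             for x in range(len(img[0]))]
            for y in range(len(img))]
```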

Memory Management

Automatic CUDA OOM Handling

The inpainter handles out-of-memory errors internally:

# No try/except needed: on CUDA OOM the inpainter clears the
# cache and retries, falling back to CPU if necessary.
result = inpainter.inpaint(img, mask)

Automatic behaviors:
  1. First OOM: Clear CUDA cache and retry
  2. Second OOM: Switch to CPU temporarily
  3. Restore GPU after completion
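The three-step fallback above amounts to a retry wrapper. The sketch below captures that control flow against generic callables so it runs without torch; the real module catches torch.cuda.OutOfMemoryError and clears the cache with torch.cuda.empty_cache().

```python
# Sketch of the OOM fallback: retry once on GPU after clearing the
# cache, then fall back to CPU. All callables are stand-ins.
def inpaint_with_fallback(run_gpu, run_cpu, clear_cache, oom_error=MemoryError):
    try:
        return run_gpu()                 # normal GPU path
    except oom_error:
        clear_cache()                    # 1) first OOM: clear cache
        try:
            return run_gpu()             #    ... and retry on GPU
        except oom_error:
            return run_cpu()             # 2) second OOM: CPU fallback
```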

Reducing Memory Usage

# Lower resolution
inpainter = LamaInpainter(
    inpaint_size=1024,  # Default 2048
    precision='fp16'     # vs fp32
)

# Process blocks individually
inpainter.inpaint_by_block = True

Advanced Configuration

Precision Modes

Precision | VRAM Usage | Speed    | Quality   | Notes
----------|------------|----------|-----------|-------------
fp32      | 100%       | Baseline | Best      | Default
fp16      | ~50%       | Faster   | Excellent | CUDA only
bf16      | ~50%       | Faster   | Excellent | Ampere+ GPUs

Custom Inpainting Resolution

# Adjust based on VRAM
if torch.cuda.get_device_properties(0).total_memory < 8e9:  # < 8GB
    inpaint_size = 1024
else:
    inpaint_size = 2048

inpainter = LamaInpainter(
    inpaint_size=inpaint_size,
    precision='fp16'
)

Mask Preprocessing

import cv2
import numpy as np

# Dilate mask for better coverage
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
mask_dilated = cv2.dilate(mask, kernel)

# Inpaint with expanded mask
result = inpainter.inpaint(img, mask_dilated)

Balloon Detection Integration

Automatic balloon detection for better inpainting:

inpainter.check_need_inpaint = True  # Enable balloon detection

# Automatically:
# 1. Detects speech balloons
# 2. Separates balloon from non-balloon regions
# 3. Applies different inpainting strategies
# 4. Blends results
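The region split described above can be sketched as a simple dispatch. Everything here is a stand-in: the real balloon detector and per-region strategies live in the library, and the "flat fill inside balloons" choice is a plausible illustration (balloon interiors are usually flat white), not the confirmed implementation.

```python
# Sketch: route each region to a different fill strategy depending on
# whether it sits inside a detected speech balloon. All callables are
# hypothetical stand-ins for illustration.
def balloon_aware_inpaint(regions, in_balloon, fill_flat, fill_texture):
    # Flat fill is cheap and often sufficient inside balloons;
    # regions outside balloons get the full inpainter.
    return [fill_flat(r) if in_balloon(r) else fill_texture(r)
            for r in regions]
```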

Choosing an Inpainting Method

Scenario           | Recommended Method | Notes
-------------------|--------------------|--------------------
Best quality       | LAMA (lama_mpe)    | Manga-optimized
Balanced           | AOT                | Good speed/quality
No GPU             | PatchMatch         | CPU-only option
Low VRAM           | AOT with fp16      | Memory efficient
Large regions      | LAMA               | Best for big masks
Simple backgrounds | PatchMatch         | Fast and effective
Complex textures   | LAMA               | Best reconstruction
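The table above can be expressed as a small lookup helper (a convenience sketch, not part of the library API; the returned names follow the module keys used elsewhere in this page):

```python
# Map a scenario to a recommended inpainter name, per the table above.
def recommend_inpainter(scenario: str) -> str:
    table = {
        "best quality": "lama_mpe",
        "balanced": "aot",
        "no gpu": "patchmatch",
        "low vram": "aot",            # combine with precision='fp16'
        "large regions": "lama_large_512px",
        "simple backgrounds": "patchmatch",
        "complex textures": "lama_large_512px",
    }
    return table.get(scenario.lower(), "lama_mpe")  # safe default
```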

Custom Inpainting Models

To add custom inpainting models:

from modules.inpaint import InpainterBase, register_inpainter
import torch

@register_inpainter('my_inpainter')
class MyInpainter(InpainterBase):
    
    def _load_model(self):
        self.model = torch.load('my_model.pth')
        self.model.eval()
    
    def _inpaint(self, img, mask, textblock_list=None):
        # Your inpainting logic
        result = self.model(img, mask)
        return result

Performance Benchmarks

Speed (1024x1024 image)

Method     | GPU (RTX 3090) | CPU (i9)
-----------|----------------|---------
AOT        | ~0.5s          | ~8s
LAMA       | ~0.8s          | ~15s
PatchMatch | N/A            | ~3s

Quality Ranking

  1. LAMA (lama_mpe): Best overall
  2. LAMA (lama_large_512px): Excellent
  3. AOT: Very good
  4. PatchMatch: Good for simple cases

Troubleshooting

CUDA Out of Memory

# Reduce resolution
inpaint_size = 1024  # or lower

# Use fp16 precision
precision = 'fp16'

# Enable block-based inpainting
inpaint_by_block = True

Poor Quality Results

  1. Increase resolution: inpaint_size = 2048
  2. Try LAMA: Best quality model
  3. Dilate mask: Expand coverage slightly
  4. Use full image mode: inpaint_by_block = False

Slow Performance

  1. Use GPU: Set device='cuda'
  2. Lower resolution: Reduce inpaint_size
  3. Use fp16: Faster than fp32
  4. Try AOT: Faster than LAMA