Abstract:
Thangka images, revered as a Tibetan encyclopedia for their diverse content, represent an invaluable cultural heritage. However, years of display, veneration, and improper preservation have often led to varying degrees of damage to Thangka paintings. High-quality Thangka image inpainting, in which damaged regions are filled with plausible content according to prior information, remains a significant challenge. Recent text-to-image diffusion models have attracted extensive attention for their excellent performance in generating high-quality natural images; however, they struggle to understand textual descriptions of complex Thangka images. To address this, we propose a new two-stage inpainting framework that employs both text and edge guidance. First, a GAN-based edge prediction network predicts the missing edges. These predicted edges then guide the diffusion model through ControlNet, enhancing the inpainted areas and blending them with the original image. Additionally, we incorporate multiple LoRA models trained on Thangka datasets to integrate text guidance with edge control. Experimental results demonstrate the effectiveness of our method in inpainting Thangka images with plausible visual consistency and preserved details.
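The stage-one edge predictor described above can be sketched as follows. This is a minimal, hypothetical illustration only: the paper does not specify the generator architecture, so the layer layout below (an EdgeConnect-style encoder–decoder taking the masked image, its partial edge map, and the binary mask as input) and all names such as `EdgeGenerator` are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class EdgeGenerator(nn.Module):
    """Hypothetical stage-one generator: completes edges in masked regions.

    Input channels: masked grayscale image, partial edge map, binary mask.
    Output: a full edge map with values in [0, 1].
    """
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 7, padding=3), nn.ReLU(inplace=True),
            # downsample by 2
            nn.Conv2d(ch, ch * 2, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            # bottleneck
            nn.Conv2d(ch * 2, ch * 2, 3, padding=1), nn.ReLU(inplace=True),
            # upsample back to input resolution
            nn.ConvTranspose2d(ch * 2, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 1, 7, padding=3), nn.Sigmoid(),
        )

    def forward(self, masked_img, edge_map, mask):
        x = torch.cat([masked_img, edge_map, mask], dim=1)
        return self.net(x)

gen = EdgeGenerator()
img = torch.rand(1, 1, 256, 256)      # grayscale Thangka crop (toy data)
edges = torch.rand(1, 1, 256, 256)    # e.g. Canny edges of the damaged image
mask = (torch.rand(1, 1, 256, 256) > 0.5).float()  # 1 = damaged pixel
pred = gen(img * (1 - mask), edges * (1 - mask), mask)
print(pred.shape)  # torch.Size([1, 1, 256, 256])
```

In stage two, an edge map like `pred` would be passed as the conditioning image to a ControlNet-guided diffusion inpainting pipeline; in practice the GAN would also need an adversarial discriminator and edge-matching losses during training, which are omitted here for brevity.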
Date of Conference: 15-19 July 2024
Date Added to IEEE Xplore: 30 September 2024