Abstract:
Image matting is an active research area in computer vision, and various trimap-free methods have been proposed to improve its performance. However, these methods do not consider the gap between composited and real-world images, which limits their generalization ability. To address this issue, we propose a domain alignment (DA) module that consists of local region-wise alignment (LRA) and global harmonious alignment (GHA). The LRA aligns the most diverse pixels in the transparent regions of the foreground between composited and real images. The GHA, in turn, aligns the global image harmonization for both composited and real images, which helps the network choose the appropriate semantics for real harmonious images. Additionally, we design a transformer-based network with a dynamic attention pruning (DAP) mechanism to accurately locate domain-sensitive regions, allowing the DA module to work more effectively. Furthermore, we introduce a new dataset, the Real-world Matting Dataset (RM-1k), to advance the real-world matting task. We evaluate the proposed method on two composited benchmarks (Composite-1k and Distinctions-646) and two real-world datasets (AIM-500 and RM-1k), and the results show that it achieves robust performance on both composited and real-world images.
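For context on the composited/real gap mentioned above: composited matting images are typically produced by blending a foreground onto an unrelated background with the standard matting equation I = alpha * F + (1 - alpha) * B. The following is a minimal NumPy sketch of that blending step only, not of the proposed DA module or DAP mechanism; the function name `composite` and the toy arrays are hypothetical and not taken from the paper.

```python
import numpy as np

def composite(fg, bg, alpha):
    """Blend fg over bg with the matting equation I = alpha*F + (1 - alpha)*B.

    fg, bg: float arrays of shape (H, W, 3); alpha: float array of shape (H, W).
    """
    a = alpha[..., None]  # add a channel axis so alpha broadcasts over RGB
    return a * fg + (1.0 - a) * bg

# Toy data (assumed for illustration): white foreground on a black background.
fg = np.ones((2, 2, 3))
bg = np.zeros((2, 2, 3))
alpha = np.array([[1.0, 0.5],
                  [0.25, 0.0]])
image = composite(fg, bg, alpha)
```

Because the foreground and background in such an image come from different sources, their lighting and color statistics need not agree, which is one plausible reading of the domain gap the abstract describes.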
Published in: IEEE Transactions on Circuits and Systems for Video Technology (Volume: 34, Issue: 4, April 2024)