Abstract:
Unsupervised methods have received increasing attention in homography learning due to their promising performance and label-free training. However, existing methods do not explicitly consider plane-induced parallax, which compromises their predictions in scenes with multiple planes. In this work, we propose a novel method, HomoGAN, to guide unsupervised homography estimation to focus on the dominant plane. First, a multi-scale transformer is designed to predict the homography from the feature pyramids of the input images in a coarse-to-fine fashion. Moreover, we propose an unsupervised GAN to impose a coplanarity constraint on the predicted homography: a generator predicts a mask of aligned regions, and a discriminator checks whether the two masked feature maps are induced by a single homography. We further extend the global homography framework to local mesh-grid homography estimation, namely MeshHomoGAN, in which plane constraints are enforced on each mesh cell to go beyond a single dominant plane, so that scenes with multiple depth planes can be better aligned. To validate the effectiveness of our method and its components, we conduct extensive experiments on large-scale datasets. Results show that our matching error is 22% lower than that of previous state-of-the-art (SOTA) methods.
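To make the notion of plane-induced parallax concrete, the following is a minimal NumPy sketch of standard two-view geometry; it is not the HomoGAN method itself. It assumes a pinhole camera with a hypothetical intrinsic matrix `K`, a relative pose `(R, t)`, and a plane written as n^T X = d in the first camera frame. All numeric values and helper names are illustrative assumptions. Under that convention a single homography transfers points on the plane exactly, while points off the plane are misaligned.

```python
import numpy as np

def plane_induced_homography(K, R, t, n, d):
    """Homography from view 1 to view 2 for points on the plane n^T X = d.

    For X on the plane, n^T X / d = 1, so X2 = R X + t = (R + t n^T / d) X.
    """
    return K @ (R + np.outer(t, n) / d) @ np.linalg.inv(K)

def project(K, X):
    """Pinhole projection of a 3D point (camera frame) to pixel coordinates."""
    x = K @ X
    return x[:2] / x[2]

def warp(H, p):
    """Apply a homography to a 2D pixel location."""
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]

# Illustrative values only (assumptions, not taken from the paper).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                      # no rotation between the two views
t = np.array([0.1, 0.0, 0.0])      # small sideways translation
n = np.array([0.0, 0.0, 1.0])      # fronto-parallel plane normal
d = 2.0                            # plane Z = 2 in the first camera frame

H = plane_induced_homography(K, R, t, n, d)

def transfer_error(X):
    """Pixel distance between the homography-warped point and the true projection."""
    p1 = project(K, X)             # where X appears in view 1
    p2 = project(K, R @ X + t)     # where X truly appears in view 2
    return float(np.linalg.norm(warp(H, p1) - p2))

err_on = transfer_error(np.array([0.3, -0.2, 2.0]))   # point on the plane
err_off = transfer_error(np.array([0.3, -0.2, 4.0]))  # point behind the plane
print(err_on, err_off)
```

With these assumed values, the on-plane error is zero up to floating-point rounding, while the off-plane point lands 12.5 pixels away from where the homography sends it. This residual is the parallax that a single global homography cannot explain, and it is the reason for restricting the estimate to a dominant plane or to per-cell planes.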
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence (Volume: 47, Issue: 3, March 2025)