Abstract:
Panoramic depth estimation is crucial for acquiring comprehensive 3D environmental perception information, serving as a foundational basis for numerous panoramic vision tasks. The key challenge in panoramic depth estimation is addressing the various distortions in 360° omnidirectional images. Most panoramic images are displayed as 2D equirectangular projections, which exhibit significant distortion, particularly severe stretching near the polar regions. Traditional depth estimation methods for perspective images are unsuitable for such projections. On the other hand, cubemap projection consists of six distortion-free perspective images, allowing the use of existing depth estimation methods. However, the boundaries between faces of a cubemap projection introduce discontinuities, causing a loss of global information when the cubemap is used alone. In this work, we propose an innovative geometric priors assisted dual-projection fusion network (GADFNet) that leverages geometric priors of panoramic images and the strengths of both projection types to enhance the accuracy of panoramic depth estimation. Specifically, to better focus the network on key areas, we introduce a distortion perception module (DPM) and incorporate geometric information into the loss function. To more effectively extract global information from the equirectangular projection branch, we propose a scene understanding module (SUM), which captures features from different dimensions. Additionally, to achieve effective fusion of the two projections, we design a dual projection adaptive fusion module (DPAFM) that dynamically adjusts the weights of the two branches during fusion. Extensive experiments conducted on four public datasets (covering both virtual and real-world scenarios) demonstrate that our proposed GADFNet outperforms existing methods, achieving superior performance.
Published in: IEEE Transactions on Circuits and Systems for Video Technology (Early Access)
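The abstract mentions incorporating geometric information of the equirectangular projection into the loss function. A common way to do this (a minimal sketch of the general idea, not necessarily GADFNet's exact loss) is to weight per-pixel loss terms by the spherical area each pixel covers, which scales with the cosine of its latitude, so the heavily stretched polar rows contribute less:

```python
import math

def erp_area_weights(height):
    """Per-row area weights for an equirectangular image of the given height.

    Row i maps to a latitude phi in (-pi/2, pi/2); the solid angle covered by
    a pixel in that row scales as cos(phi), so equatorial rows get weight ~1
    and polar rows get weights approaching 0.
    """
    weights = []
    for i in range(height):
        phi = (i + 0.5) / height * math.pi - math.pi / 2  # row-center latitude
        weights.append(math.cos(phi))
    return weights

# Example: an 8-row panorama; top/bottom rows are strongly down-weighted.
w = erp_area_weights(8)
```

In a depth-estimation loss, each pixel's error would be multiplied by its row weight before averaging, so the network is not dominated by the oversampled (and most distorted) polar regions.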