
BroadBEV: Collaborative LiDAR-camera Fusion for Broad-sighted Bird’s Eye View Map Construction


Abstract:

Recent sensor fusion in Bird’s Eye View (BEV) space has shown its utility in various tasks such as 3D detection and map segmentation. However, the approach struggles with inaccurate camera BEV estimation and with perceiving distant areas due to the sparsity of LiDAR points. In this paper, we propose a BEV fusion method (BroadBEV) that enhances camera BEV estimation for broad perception over the pre-defined BEV range, while simultaneously compensating for the sparsity of LiDAR features across the entire BEV space. Toward that end, we devise Point-scattering, which scatters the LiDAR BEV distribution into the camera depth distribution. The method boosts the camera branch's depth estimation and induces accurate placement of dense camera features in BEV space. For effective fusion of the spatially synchronized features, we suggest ColFusion, which applies the self-attention weights of the LiDAR and camera BEV features to each other. Our extensive experiments demonstrate that the suggested methods enable broad BEV perception with remarkable performance gains.
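To make the two ideas in the abstract concrete, below is a minimal sketch (not the authors' code) of (a) scattering projected LiDAR depths into a per-pixel categorical depth prior for the camera branch, and (b) a cross-attentive fusion of spatially aligned LiDAR and camera BEV features in the spirit of ColFusion. All names (`point_scatter`, `CrossAttentiveFusion`), the tensor layouts, and the particular attention form (per-location sigmoid weights exchanged between modalities) are assumptions for illustration only.

```python
# Hypothetical sketch of Point-scattering and ColFusion-style fusion.
# Assumes BEV features shaped (B, C, H, W) and a camera branch that predicts
# a categorical depth distribution over D bins per image pixel.

import torch
import torch.nn as nn


def point_scatter(lidar_points_cam, depth_bins, image_size):
    """Scatter projected LiDAR depths into per-pixel depth bins.

    lidar_points_cam: (N, 3) tensor of (u, v, depth), LiDAR points already
                      projected onto the camera image plane.
    depth_bins:       1D tensor of D+1 monotonically increasing bin edges.
    image_size:       (H, W) of the image feature map.
    Returns a (D, H, W) one-hot-like prior that can guide the camera
    branch's depth distribution where LiDAR evidence exists.
    """
    H, W = image_size
    D = depth_bins.numel() - 1
    prior = torch.zeros(D, H, W)
    u = lidar_points_cam[:, 0].long().clamp(0, W - 1)
    v = lidar_points_cam[:, 1].long().clamp(0, H - 1)
    # Map each point's depth to its bin index in [0, D-1].
    d = torch.bucketize(lidar_points_cam[:, 2], depth_bins).clamp(1, D) - 1
    prior[d, v, u] = 1.0
    return prior


class CrossAttentiveFusion(nn.Module):
    """Fuse spatially aligned LiDAR and camera BEV maps by exchanging
    per-location attention weights between the two modalities."""

    def __init__(self, channels):
        super().__init__()
        self.lidar_attn = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())
        self.cam_attn = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())
        self.out = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, bev_lidar, bev_cam):
        w_lidar = self.lidar_attn(bev_lidar)  # weights from LiDAR features
        w_cam = self.cam_attn(bev_cam)        # weights from camera features
        # Apply each modality's weights to the *other* modality, so dense
        # camera features fill sparse LiDAR regions and vice versa.
        fused = torch.cat([bev_lidar * w_cam, bev_cam * w_lidar], dim=1)
        return self.out(fused)
```

The cross-application of weights is the key design choice suggested by the abstract: each modality's confidence modulates the other, rather than itself, so LiDAR evidence anchors the camera features near the sensor while dense camera features dominate in distant, point-sparse regions.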
Date of Conference: 13-17 May 2024
Conference Location: Yokohama, Japan

I. Introduction

Visual perception and understanding of the surrounding environment are crucial to implementing reliable robotic systems such as Simultaneous Localization and Mapping (SLAM) and Advanced Driver Assistance Systems (ADAS). Because perception provides an ego-frame agent with detailed local features and structural information, various perception approaches have been actively studied, including 3D detection and semantic segmentation. Bird's Eye View (BEV) space is frequently employed as the latent representation for these tasks. BEV space is free from the distortions of homogeneous coordinate systems and reduces object shapes to a few canonical classes. Thus, it provides a robust representation of elements in 3D space, including cars, buildings, pedestrians, and large-scale scenes.

