V-Fusion: 2D Detection-enhanced Multimodal 3D BEV Object Detection

Abstract:

Integrating information from multiple sensors enhances the performance of autonomous vehicle perception systems. However, current multimodal 3D object detection methods focus on unifying modalities into a bird’s-eye view (BEV) representation, which overlooks an inherent advantage of the camera perspective view (PV): there, 2D detection performance significantly surpasses that of state-of-the-art 3D detectors. In this paper, we propose V-Fusion, a multimodal BEV object detection method enhanced by high-quality 2D detection. By leveraging the 2D priors of PV, we construct 3D query proposals that complement BEV 3D queries. To address the modal discrepancy in generating 3D queries from 2D priors, we propose a depth-robust 2D-to-3D query generation strategy. Additionally, we introduce a novel geometry-constrained self-attention mechanism to enhance the interaction of BEV 3D queries and employ an additional set of learnable 3D queries to account for potentially missed objects. Notably, V-Fusion achieves 74.1 NDS on the challenging nuScenes dataset, outperforming SparseFusion by 1.0 NDS while offering comparable inference speed.

Published in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Date of Conference: 06-11 April 2025

Date Added to IEEE Xplore: 07 March 2025

DOI: 10.1109/ICASSP49660.2025.10889489

Conference Location: Hyderabad, India
