Abstract:
Binocular stereo vision is a depth estimation technique that imitates human eyes. It is widely used in fields such as self-driving cars, SLAM, and 3D reconstruction. However, designing a hardware architecture that balances resource utilization, processing speed, and estimation accuracy remains a significant challenge. This paper proposes a compact and efficient hardware design that incorporates linear-fitting-based cost fusion, disparity optimization with subpixel interpolation, and multi-directional occlusion filling. First, a gradient-enhanced pipelined matching-cost architecture with a resource-saving scheme and a re-computation paradigm is proposed to improve the accuracy of edge information. Then, we approximate the nonlinear exponential function by linear fitting to save hardware resources. Moreover, we design the subpixel interpolation with an SRT radix-4 divider to refine the disparity, which significantly enhances the accuracy of the disparity map in real-world situations. Finally, we propose a resource-reusing architecture that performs hole filling and median filtering synchronously in post-processing. The disparity map quality of the proposed architecture is evaluated on the KITTI 2015 dataset, where it delivers leading accuracy compared with other state-of-the-art works. The architecture has been implemented and demonstrated on the Stratix-V FPGA platform, achieving 54 frames per second at 112 MHz at a resolution of 1920 × 1080.
Published in: IEEE Transactions on Circuits and Systems I: Regular Papers ( Volume: 71, Issue: 6, June 2024)
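
The abstract does not state which subpixel interpolation formula the design uses. As a rough illustration of why such a step needs a hardware divider, the sketch below assumes the commonly used parabola fit through the matching costs at the winning disparity and its two neighbours. The function names `winner_take_all` and `subpixel_disparity` are hypothetical and are not taken from the paper.

```python
def winner_take_all(costs):
    """Return the integer disparity with the lowest matching cost."""
    return min(range(len(costs)), key=costs.__getitem__)


def subpixel_disparity(costs, d):
    """Refine integer disparity d by fitting a parabola through the costs
    at d-1, d, and d+1 (an assumed scheme, not confirmed by the paper)."""
    # No refinement is possible at the borders of the disparity range.
    if d <= 0 or d >= len(costs) - 1:
        return float(d)
    c_prev, c_min, c_next = costs[d - 1], costs[d], costs[d + 1]
    denom = c_prev - 2 * c_min + c_next
    # A non-positive curvature means there is no well-defined minimum.
    if denom <= 0:
        return float(d)
    # This single division is the operation a hardware divider would perform.
    return d + (c_prev - c_next) / (2 * denom)


# Synthetic cost curve with its true minimum at disparity 2.25.
costs = [(x - 2.25) ** 2 for x in range(5)]
d_int = winner_take_all(costs)
d_sub = subpixel_disparity(costs, d_int)
```

Under this assumption the refinement costs one division per pixel, which is consistent with the abstract pairing subpixel interpolation with a dedicated SRT radix-4 divider.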