I. Introduction
Depth serves as a realistic prerequisite for sensing the surrounding 3D scene structure in many high-level 3D vision tasks [1], [2], [3]. Image-based passive depth estimation approaches compare favorably to active sensors in terms of cost, working range, and flexibility. Among these methods, the well-posed stereo vision is preferentially chosen due to its straightforward settings, excellent accuracy, and reasonable cost.