We propose a viewpoint-independent object-detection algorithm that detects objects in videos based on their 2-D and 3-D information. Object-specific quasi-3-D templates are proposed and applied to match objects' 2-D contours and to calculate their 3-D sizes. A quasi-3-D template is the contour and the 3-D bounding cube of an object viewed from a certain panning and tilting angle. A total of 2660 pedestrian templates and 1995 vehicle templates, covering 19 tilting and 35 panning angles, are used in this study. To detect objects, we first match the 2-D contours of object candidates against the contours of known objects and identify the object templates with large 2-D contour-matching scores. In this step, we exploit prior knowledge of the viewpoint from which the object is viewed to speed up the template matching, and a viewpoint likelihood is assigned to each contour-matched template. We then calculate the 3-D widths, heights, and lengths of the contour-matched candidates, together with the corresponding 3-D-size-matching scores. The overall matching score combines the viewpoint likelihood with the two matching scores. The major contribution of this paper is the joint use of 2-D and 3-D features in object detection: by considering both 2-D contours and 3-D sizes, one can achieve promising detection rates. The proposed algorithm was evaluated on both pedestrian and vehicle sequences. It yielded significantly better detection results than the best results reported in PETS 2009, showing that it outperforms state-of-the-art pedestrian-detection algorithms.
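The two-stage scoring described above (contour matching with a threshold, then combination with the viewpoint likelihood and the 3-D-size score) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the combination rule shown here (a simple product of the three terms) and the threshold value are assumptions, since the abstract does not give the exact formula.

```python
# Hypothetical sketch of the matching pipeline from the abstract.
# Assumption: the overall score is the product of the three terms;
# the paper's actual combination rule may differ.

def overall_score(contour_score, viewpoint_likelihood, size_score):
    """Combine the 2-D contour-matching score, the viewpoint likelihood,
    and the 3-D-size-matching score into one detection score."""
    return contour_score * viewpoint_likelihood * size_score

def best_template(template_scores, contour_threshold=0.5):
    """Step 1: keep only templates whose 2-D contour-matching score passes
    the threshold. Step 2: among those, pick the highest overall score.
    `template_scores` maps a template id to a (contour, viewpoint, size)
    score triple, each in [0, 1]."""
    best_id, best = None, -1.0
    for tpl_id, (c, v, s) in template_scores.items():
        if c < contour_threshold:  # discard weak contour matches early
            continue
        score = overall_score(c, v, s)
        if score > best:
            best_id, best = tpl_id, score
    return best_id, best
```

For example, a template with a strong contour match but a mediocre 3-D-size score can still be out-ranked by one that scores moderately well on all three terms, which is the point of combining 2-D and 3-D evidence.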