In this paper, we learn a reconfigurable template for detecting vehicles and classifying their types. We adopt a popular design for the part-based model, which has one coarse template covering the entire object window and several small high-resolution templates representing parts. The reconfigurable template can learn part configurations that capture the spatial correlation of features for a deformable part-based model. The template features are Histograms of Gradients (HoG). To better describe the actual dimensions and locations of “parts” (i.e., features with strong spatial correlations), we design a dictionary of rectangular primitives of various sizes, aspect ratios, and positions. A configuration is defined as a subset of non-overlapping primitives from this dictionary. Learning the optimal configuration with an SVM amounts to finding the subset of parts that minimizes the regularized hinge loss, which leads to a non-convex optimization problem. We solve this problem by replacing the hinge loss with a negative sigmoid loss that can be approximately decomposed into losses (or negative sigmoid scores) of individual parts. In our experiments, we compare our method empirically with group lasso and a state-of-the-art method, and demonstrate that models learned with our method outperform the others on two computer vision applications: vehicle localization and vehicle model recognition.
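The ingredients named in the abstract (a dictionary of rectangular primitives, a configuration as a non-overlapping subset, and a sigmoid-based surrogate for the hinge loss) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the grid-cell coordinates, the form `-sigmoid(margin)` for the negative sigmoid loss, and in particular the greedy selection step, which is a simple stand-in and not the optimization procedure used in the paper.

```python
import math


def hinge_loss(margin):
    """Standard SVM hinge loss on a signed margin y * f(x)."""
    return max(0.0, 1.0 - margin)


def neg_sigmoid_loss(margin):
    """Assumed form of a 'negative sigmoid' loss: -sigmoid(margin).

    It is bounded and decreases as the margin grows.
    """
    return -1.0 / (1.0 + math.exp(-margin))


def build_dictionary(grid_w, grid_h, sizes):
    """Enumerate rectangles (x, y, w, h), in grid cells, that fit in the window.

    `sizes` is a list of (w, h) pairs; each is placed at every valid position.
    """
    primitives = []
    for w, h in sizes:
        for x in range(grid_w - w + 1):
            for y in range(grid_h - h + 1):
                primitives.append((x, y, w, h))
    return primitives


def overlaps(a, b):
    """True if two rectangles share at least one grid cell."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def greedy_configuration(primitives, scores):
    """Hypothetical stand-in selection: take primitives in order of decreasing
    score, skipping any that overlap one already chosen."""
    order = sorted(range(len(primitives)), key=lambda i: scores[i], reverse=True)
    chosen = []
    for i in order:
        if all(not overlaps(primitives[i], primitives[j]) for j in chosen):
            chosen.append(i)
    return [primitives[i] for i in chosen]
```

For instance, `build_dictionary(4, 4, [(2, 2)])` enumerates the nine 2×2 primitives that fit in a 4×4 cell grid, and `greedy_configuration` then returns a subset in which no two rectangles overlap. Per-part scores are what make such a selection possible, which is why a loss that decomposes over parts is useful.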
Date of Conference: 9-11 Jan. 2012