We propose a hierarchical association approach to multiple target tracking from a single camera that progressively links detection responses into longer track fragments (i.e., tracklets). Given frame-by-frame detection results, a conservative dual-threshold method that links only very similar detection responses between consecutive frames is adopted to generate initial tracklets with minimal identity switches. Further association of these highly fragmented tracklets at each level of the hierarchy is formulated as a Maximum A Posteriori (MAP) problem that considers the initialization, termination, and transition of tracklets as well as the possibility that they are false alarms; this problem can be solved efficiently by the Hungarian algorithm. The tracklet affinity model, which measures the likelihood that two tracklets belong to the same target, is a linear combination of automatically learned weak nonparametric models over various features. This distinguishes it from most previous work, which relies on heuristic selection of parametric models and manual tuning of their parameters. For this purpose, we develop a novel bag ranking method and train the crucial tracklet affinity models with a boosting algorithm. The bag ranking method uses the softmax function to relax the oversufficient objective function used by the conventional instance ranking method. It provides a tighter upper bound on the empirical error in distinguishing correct associations from incorrect ones, and thus yields more accurate tracklet affinity models for the tracklet association problem. We apply this approach to the challenging multiple pedestrian tracking task. Systematic experiments conducted on two real-life datasets show that the proposed approach outperforms previous state-of-the-art algorithms in tracking accuracy and, in particular, considerably reduces fragmentations and identity switches.
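The conservative dual-threshold linking step can be illustrated with a minimal sketch. The abstract does not give the exact rule, so the code below assumes one common form: two detection responses in consecutive frames are linked only if their affinity exceeds a first threshold and also beats every competing affinity in the same row or column by a second margin. The function name `dual_threshold_links`, the parameters `theta1` and `theta2`, and their default values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def dual_threshold_links(
    affinity: List[List[float]],
    theta1: float = 0.8,   # assumed minimum affinity for a link
    theta2: float = 0.2,   # assumed margin over any competing pair
) -> List[Tuple[int, int]]:
    """Return (i, j) pairs that are safe to link.

    affinity[i][j] is the similarity between detection i in frame t
    and detection j in frame t+1. This is an illustrative rule, not
    necessarily the one used in the paper.
    """
    links = []
    n_rows = len(affinity)
    for i, row in enumerate(affinity):
        for j, a in enumerate(row):
            # First threshold: the pair itself must be very similar.
            if a <= theta1:
                continue
            # Competing pairs share detection i (same row) or
            # detection j (same column).
            rivals = [row[k] for k in range(len(row)) if k != j]
            rivals += [affinity[r][j] for r in range(n_rows) if r != i]
            # Second threshold: the pair must clearly beat every rival.
            if all(a - r > theta2 for r in rivals):
                links.append((i, j))
    return links


# Two well-separated targets: both diagonal pairs are linked.
clear = dual_threshold_links([[0.95, 0.10], [0.20, 0.90]])

# One detection with two near-equal candidates: nothing is linked.
ambiguous = dual_threshold_links([[0.95, 0.90]])
```

Under this assumed rule, ambiguous cases are left unlinked rather than guessed. This is consistent with the aim of producing short initial tracklets with few identity switches and deferring harder decisions to the later association levels.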