Matching the visual appearance of the target over consecutive image frames is the most critical issue in video-based object tracking. The choice of distance metric for matching determines its accuracy and robustness, and thus significantly influences tracking performance. Most existing tracking methods employ fixed, pre-specified distance metrics. However, this simple treatment is problematic and limited in practice, because a pre-specified metric is unlikely to guarantee that the closest match is the true target of interest. This paper presents a new tracking approach that incorporates adaptive metric learning into the framework of visual object tracking. By collecting a set of supervised training samples on the fly from the observed video, the approach automatically learns the optimal distance metric for more accurate matching. The learned metric is designed so that, based on the supervised training, the closest match is very likely to be the true target of interest. Such a learned metric is both discriminative and adaptive. This paper instantiates the approach in a case study of adaptive-metric differential tracking and obtains a closed-form analytical solution to motion estimation and visual tracking. Moreover, it extends the basic linear distance metric learning method to a more powerful nonlinear kernel metric learning method. Extensive experiments validate the effectiveness of the proposed approach and demonstrate the improved performance of the proposed tracking method.
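To illustrate why the choice of metric matters for matching, the following is a minimal sketch rather than the paper's algorithm. It assumes a linear metric of the quadratic form d_A(x, y) = (x - y)^T A (x - y) with a positive semidefinite matrix A, which the abstract does not state. The feature vectors, the matrix `learned`, and the function names are hypothetical and chosen by hand. No learning takes place here. The sketch only shows how a metric that down-weights an uninformative feature dimension can change which candidate is the closest match.

```python
def metric_distance(x, y, A):
    """Quadratic-form distance (x - y)^T A (x - y); A is assumed positive semidefinite."""
    d = [xi - yi for xi, yi in zip(x, y)]
    Ad = [sum(a * dj for a, dj in zip(row, d)) for row in A]
    return sum(di * adi for di, adi in zip(d, Ad))


def closest_match(template, candidates, A):
    """Index of the candidate closest to the template under the metric A."""
    dists = [metric_distance(template, c, A) for c in candidates]
    return min(range(len(dists)), key=dists.__getitem__)


# Toy 2-D features (hypothetical): dimension 0 is informative,
# dimension 1 is a nuisance dimension that varies between frames.
template = [1.0, 0.0]
candidates = [
    [1.1, 3.0],  # true target: close in dimension 0, far in dimension 1
    [3.0, 0.5],  # distractor: far in dimension 0, close in dimension 1
]

euclidean = [[1.0, 0.0], [0.0, 1.0]]   # fixed, pre-specified metric
learned = [[1.0, 0.0], [0.0, 0.01]]    # hand-picked stand-in for a learned metric

idx_euclidean = closest_match(template, candidates, euclidean)
idx_learned = closest_match(template, candidates, learned)
print("Euclidean picks candidate", idx_euclidean)
print("Down-weighted metric picks candidate", idx_learned)
```

In this toy setting the fixed Euclidean metric selects the distractor, while the metric that suppresses the nuisance dimension selects the true target. In the approach described above, the metric would instead be learned from supervised samples collected during tracking.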