Matching the visual appearance of the target over consecutive video frames is a fundamental yet challenging task in visual tracking. Its performance depends largely on the distance metric, which determines the quality of visual matching. Rather than using a fixed, predefined metric, recent attempts to integrate metric learning into trackers have shown more robust and promising results, as the learned metric can be more discriminative. However, these global metric adjustment methods are generally computationally demanding for real-time visual tracking, and they tend to underfit the data when the target exhibits dynamic appearance variation. This paper presents a nonparametric, data-driven local metric adjustment method. The proposed method finds a spatially adaptive metric that exhibits different properties at different locations in the feature space, reflecting differences in the data distribution within each local neighborhood. It obtains the optimal metric by minimizing the deviation of the empirical misclassification probability, so that the asymptotic error, that is, the error that would be obtained with an infinite set of training samples, can be approximated. Integrating this new local metric learning method into target tracking leads to efficient and robust tracking performance. Extensive experiments demonstrate the superiority and effectiveness of the proposed tracking method in various tracking scenarios.