I. Introduction
Multi-target tracking is an important topic in computer vision. It is a challenging problem that aims to extract the trajectories of targets from video sequences. The problem becomes even harder for human tracking, especially in complex and crowded environments, where frequent occlusions and interactions between targets make both detection and tracking far more difficult. Figure 1 shows a typical scenario for multi-target tracking in a crowded scene. In recent years, detection-based tracking methods have achieved impressive progress, mostly owing to improvements in object models, either complex appearance models or detectors for specific kinds of objects [5], [6], [7], [8]. The main idea of these methods is to link target detections or short tracklets gradually into longer ones by optimizing the global linking scores or probabilities between tracklets. Numerous studies have demonstrated the effectiveness of this framework, and current systems can handle long and challenging sequences automatically with high precision.
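To make the linking idea concrete, the following is a minimal illustrative sketch, not the formulation of any cited work. It assumes each tracklet is a time-ordered list of (frame, x, y) points, scores a candidate link by the distance between the tail of one tracklet and the head of a later one, and finds the globally cheapest set of links by exhaustive search with memoization, which is only practical for a handful of tracklets. The names `link_cost`, `best_links`, and `merge_tracklets`, and the constants `NO_LINK_COST` and `MAX_GAP`, are hypothetical choices made for this example.

```python
import math
from functools import lru_cache

NO_LINK_COST = 50.0  # assumed cost of ending a tracklet without a successor
MAX_GAP = 10         # assumed maximum frame gap bridged by a single link


def link_cost(a, b):
    """Cost of linking the tail of tracklet a to the head of tracklet b."""
    end_frame, ex, ey = a[-1]
    start_frame, sx, sy = b[0]
    gap = start_frame - end_frame
    if gap <= 0 or gap > MAX_GAP:
        return math.inf  # b must start after a ends, within MAX_GAP frames
    return math.hypot(sx - ex, sy - ey)


def best_links(tracklets):
    """Return (total_cost, links) minimizing the summed linking cost.

    Each tracklet gets at most one successor and at most one predecessor.
    """
    n = len(tracklets)
    cost = [[math.inf if i == j else link_cost(a, b)
             for j, b in enumerate(tracklets)]
            for i, a in enumerate(tracklets)]

    @lru_cache(maxsize=None)
    def solve(i, used):
        # `used` is a bitmask of tracklets already chosen as successors.
        if i == n:
            return 0.0, ()
        rest_cost, rest_links = solve(i + 1, used)
        best = (NO_LINK_COST + rest_cost, rest_links)  # tracklet i ends here
        for j in range(n):
            if used & (1 << j) or math.isinf(cost[i][j]):
                continue
            rest_cost, rest_links = solve(i + 1, used | (1 << j))
            total = cost[i][j] + rest_cost
            if total < best[0]:
                best = (total, ((i, j),) + rest_links)
        return best

    return solve(0, 0)


def merge_tracklets(tracklets, links):
    """Concatenate linked tracklets into longer trajectories."""
    successor = dict(links)
    has_predecessor = set(successor.values())
    merged = []
    for start in range(len(tracklets)):
        if start in has_predecessor:
            continue  # this tracklet is reached from an earlier chain head
        current = start
        track = list(tracklets[current])
        while current in successor:
            current = successor[current]
            track.extend(tracklets[current])
        merged.append(track)
    return merged
```

Because a link is only allowed when the second tracklet starts after the first one ends, chains always move forward in time and cannot form cycles. For realistic numbers of tracklets, the exhaustive search would be replaced by a polynomial-time assignment or network-flow solver; the sketch only illustrates what optimizing the links jointly, rather than one pair at a time, means.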