
PathTrack: Fast Trajectory Annotation with Path Supervision



Abstract:

Progress in Multiple Object Tracking (MOT) has been historically limited by the size of the available datasets. We present an efficient framework to annotate trajectories and use it to produce a MOT dataset of unprecedented size. In our novel path supervision, the annotator loosely follows the object with the cursor while watching the video, providing a path annotation for each object in the sequence. Our approach turns such weak annotations into dense box trajectories. Our experiments on existing datasets show that our framework produces more accurate annotations than the state of the art, in a fraction of the time. We further validate our approach by crowdsourcing the PathTrack dataset, with more than 15,000 person trajectories in 720 sequences. Tracking approaches can benefit from training on such large-scale datasets, as object recognition has. We demonstrate this by re-training an off-the-shelf person matching network, originally trained on the MOT15 dataset, almost halving its misclassification rate. Additionally, training on our data consistently improves tracking results, both on our dataset and on MOT15. On the latter, we improve the top-performing tracker (NOMT), reducing the number of ID switches by 18% and fragments by 5%.
Date of Conference: 22-29 October 2017
Date Added to IEEE Xplore: 25 December 2017
Electronic ISSN: 2380-7504
Conference Location: Venice, Italy

1. Introduction

Progress in vision has been fueled by the emergence of datasets of ever-increasing scale, an example being the surge of Deep Learning thanks to ImageNet [26], [44]. The scaling up of datasets for Multiple Object Tracking (MOT), however, has been limited by the difficulty and cost of annotating complex video scenes with many objects. As a consequence, MOT datasets consist of only a couple dozen sequences [18], [29], [35] or are restricted to the surveillance scenario [53]. This has hindered the development of fully learned MOT systems that generalize to any scenario. In this paper, we tackle these issues by introducing a fast and intuitive way to annotate trajectories in videos, and we use it to create a large-scale MOT dataset.

Figure 1: This sequence is heavily crowded with similar-looking people. Annotating such sequences is typically time-consuming and tedious. With our path supervision, the user effortlessly follows the object with the cursor while watching the video, collecting path annotations. Our approach produces dense box trajectory annotations from such path annotations.
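To make the annotation format concrete, the sketch below shows one possible way to represent a path annotation and its conversion into boxes. This is a minimal illustration, not the authors' implementation: the names PathAnnotation and path_to_boxes, the fixed box size, and the naive box placement are all assumptions; in the paper, paths are instead combined with object detections to produce the final dense trajectories.

```python
# A minimal sketch, assuming a per-frame cursor sample per object.
# All names and the fixed-size box conversion are illustrative, not the
# paper's method.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class PathAnnotation:
    """Cursor path for one object: one (frame_index, x, y) sample per
    watched frame, recorded while the annotator loosely follows it."""
    object_id: int
    points: List[Tuple[int, float, float]] = field(default_factory=list)

    def add_point(self, frame_index: int, x: float, y: float) -> None:
        # Called once per displayed frame with the current cursor position.
        self.points.append((frame_index, x, y))

def path_to_boxes(path: PathAnnotation,
                  box_w: float = 60.0,
                  box_h: float = 120.0) -> List[Tuple[int, float, float, float, float]]:
    """Placeholder conversion: center a fixed-size box on each cursor point.

    The paper instead links object detections consistent with the path;
    this stand-in only illustrates the input and output of that stage,
    returning (frame, left, top, width, height) tuples.
    """
    return [(f, x - box_w / 2.0, y - box_h / 2.0, box_w, box_h)
            for f, x, y in path.points]

# Example: a short cursor path for object 0 sampled over three frames.
p = PathAnnotation(object_id=0)
for frame, (x, y) in enumerate([(320.0, 240.0), (324.5, 241.0), (330.0, 243.5)]):
    p.add_point(frame, x, y)
print(path_to_boxes(p))  # [(0, 290.0, 180.0, 60.0, 120.0), ...]
```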

