A fundamental requirement for effective automated analysis of object behavior and interactions in video is that each object be consistently identified over time. This is difficult when objects are frequently occluded for long periods: nearly all tracking algorithms terminate a track, and lose the object's identity, when a long gap occurs. The problem is compounded by objects in close proximity and by tracking failures caused by shadows and similar effects. Recent work has addressed these issues with higher-level reasoning, linking tracks from multiple objects across long gaps. However, these efforts have assumed a one-to-one correspondence between the tracks on either side of a gap. This assumption often fails in real scenarios of interest, where objects are closely spaced and dynamically occlude each other, causing trackers to merge several objects into a single track. In this paper, we show how to efficiently handle splitting and merging during track linking. Moreover, we show that we can maintain the identities of objects that merge together and subsequently split. This allows object identities to be maintained throughout long sequences under difficult conditions. We demonstrate our approach on a highly challenging, oblique-view video sequence of dense traffic at a highway interchange. We successfully track the large majority of the hundreds of moving vehicles in the scene, many in close proximity, through long occlusions and shadows.