This paper presents a graphical model approach that fuses thermal infrared (IR) and visible spectrum video for human tracking. The proposed model uses unobserved variables to describe the data in terms of the process that generates them. It is thus able to capture and exploit the statistical structure of the IR and the visible data separately, as well as their mutual dependencies. Model parameters are learned from data using the expectation maximization (EM) algorithm. Automatic calibration is performed as part of this procedure. Tracking is done by Bayesian inference of the object location from the observed data. The effectiveness of the proposed method is demonstrated by experimental results on video clips captured in real-world scenarios.
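
To make the inference step concrete, the following minimal sketch shows one generic way to obtain a posterior over object location from two modalities. It is not the model proposed in the paper. It assumes a discrete set of candidate locations, per-location log-likelihoods for each modality that are already given, and conditional independence of the IR and visible observations given the location. The function names `fuse_location_posterior` and `map_location` are hypothetical and chosen only for illustration.

```python
import math


def fuse_location_posterior(loglik_ir, loglik_vis, log_prior=None):
    """Posterior over candidate locations from two modalities.

    Illustrative assumption (not taken from the paper): the IR and visible
    observations are conditionally independent given the location, so their
    log-likelihoods add.
    """
    n = len(loglik_ir)
    if len(loglik_vis) != n:
        raise ValueError("both modalities must score the same candidate locations")
    if log_prior is None:
        log_prior = [0.0] * n  # uniform prior over locations
    elif len(log_prior) != n:
        raise ValueError("prior must cover the same candidate locations")

    # Unnormalized log-posterior: log p(ir | x) + log p(vis | x) + log p(x)
    log_post = [a + b + c for a, b, c in zip(loglik_ir, loglik_vis, log_prior)]

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]


def map_location(posterior):
    """Index of the most probable candidate location (MAP estimate)."""
    return max(range(len(posterior)), key=posterior.__getitem__)


# Toy example with three candidate locations and made-up likelihoods.
ir = [math.log(p) for p in (0.1, 0.6, 0.3)]
vis = [math.log(p) for p in (0.2, 0.5, 0.3)]
posterior = fuse_location_posterior(ir, vis)
best = map_location(posterior)
```

In this toy setup both modalities favor the second candidate, so the fused posterior concentrates there more sharply than either modality alone. A model with learned mutual dependencies between the modalities, as described in the abstract, would replace the simple additive combination used here.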