Target detection and tracking are two fundamental steps in automatic video-based surveillance systems, whose goal is to provide intelligent recognition capabilities by analyzing target behavior. This paper presents a framework for video-based surveillance in which detection and tracking are addressed simultaneously (i.e., detection results trigger tracking, and tracking reinforces detections) to improve detection results. In contrast to methods that apply target detection and tracking sequentially and independently of each other (i.e., "detect-then-track"), we feed the results of tracking back to the detection stage to adaptively optimize the detection threshold and improve system robustness (i.e., "detect-and-track"). Specifically, the initial locations and representations of the targets are extracted by background subtraction. To model the background, we employ Support Vector Regression (SVR) along with an on-line learning scheme that updates the model efficiently over time. Target detection is performed by thresholding the outputs of the SVR model. Tracking uses shape projection histograms to iteratively localize the targets and achieve a high shape matching confidence level. Feeding the results of tracking back to the detection stage restricts the range of threshold values, suppresses false alarms due to noise, and allows continuous detection of small targets as well as targets undergoing projection distortions. We have validated the proposed framework by detecting vehicles and pedestrians in traffic scenes using both visible and thermal video sequences. Experimental results and comparisons with frame-based detection and kernel-based tracking methods illustrate the robustness of our approach.
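
The feedback from tracking to detection can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the function names, the use of histogram intersection as the shape matching confidence, and the exhaustive search over a short list of candidate thresholds are all assumptions made for illustration. The input `scores` stands in for the per-pixel outputs of the background model, and `template` for the binary shape of a target that is already being tracked.

```python
def detect(scores, threshold):
    """Binary foreground mask: 1 where the model output exceeds the threshold."""
    return [[1 if s > threshold else 0 for s in row] for row in scores]


def projection_histograms(mask):
    """Normalized row and column projections of a binary mask."""
    rows = [sum(r) for r in mask]
    cols = [sum(c) for c in zip(*mask)]
    total = sum(rows)
    if total == 0:
        # Empty mask: return all-zero histograms so the confidence becomes 0.
        return [0.0] * len(rows), [0.0] * len(cols)
    return [v / total for v in rows], [v / total for v in cols]


def match_confidence(mask, template):
    """Shape matching confidence in [0, 1].

    Assumption: histogram intersection, averaged over the row and column
    projections. The abstract does not state which similarity measure is used.
    """
    mask_rows, mask_cols = projection_histograms(mask)
    tmpl_rows, tmpl_cols = projection_histograms(template)
    row_score = sum(min(a, b) for a, b in zip(mask_rows, tmpl_rows))
    col_score = sum(min(a, b) for a, b in zip(mask_cols, tmpl_cols))
    return 0.5 * (row_score + col_score)


def adapt_threshold(scores, template, candidates):
    """Pick the candidate threshold whose detection best matches the tracked shape.

    Assumption: `candidates` is the restricted threshold range supplied by the
    tracking stage; a plain exhaustive search replaces any optimizer.
    """
    return max(
        candidates,
        key=lambda t: match_confidence(detect(scores, t), template),
    )
```

In this sketch, a threshold that is too low lets noise pixels into the mask and lowers the confidence, while a threshold that is too high erases the target altogether, so the tracked shape steers the detector toward the intermediate value.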