Abstract:
Visual surveillance in complex urban environments requires an intelligent system to automatically track and identify multiple objects of interest across a network of distributed cameras. Robust object recognition is critical for compensating for adverse conditions such as visual occlusion and for improving performance in tasks such as multi-object association and data fusion with hybrid sensor modalities. In this paper, we propose an efficient distributed data compression and fusion scheme that encodes and transmits SIFT-based visual histograms over a multi-hop network to perform accurate 3-D object recognition. The method harnesses the emerging theory of (distributed) compressive sensing to encode high-dimensional, nonnegative sparse signals via random projection, which is unsupervised and independent of the sensor modality. A multi-hop protocol then transmits the compressed visual data to a base-station computer while preserving a constant bandwidth regardless of the number of active camera nodes in the network. Finally, the multiple-view object features are simultaneously recovered via ℓ1-minimization as an efficient decoder. The efficacy of the algorithm is validated using up to four Berkeley CITRIC camera motes deployed in a realistic indoor environment. The substantial computational power of the CITRIC mote also enables fast on-board compression of the SIFT-type visual features extracted from object images.
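The encode/decode pipeline described in the abstract can be illustrated with a minimal sketch: a nonnegative sparse vector (standing in for a SIFT-based visual histogram) is compressed by a random projection at the sensor, and the base station recovers it by ℓ1-minimization with a nonnegativity constraint, which reduces to a linear program. The dimensions, the Gaussian sensing matrix, and the use of `scipy.optimize.linprog` are illustrative assumptions, not the paper's actual implementation on the CITRIC mote.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

n, k, m = 200, 5, 60  # signal dimension, sparsity, number of measurements
# Hypothetical nonnegative k-sparse signal (stand-in for a visual histogram)
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.uniform(1.0, 10.0, size=k)

# Encoder (at the camera node): unsupervised random projection,
# independent of the sensor modality
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x  # compressed measurements transmitted over the multi-hop network

# Decoder (at the base station): l1-minimization with nonnegativity,
#   min 1^T z  subject to  A z = y,  z >= 0,
# expressed as a linear program
res = linprog(c=np.ones(n), A_eq=A, b_eq=y, bounds=(0, None), method="highs")
x_hat = res.x

# With m comfortably above k*log(n/k), recovery is exact with high probability
print("max reconstruction error:", float(np.max(np.abs(x_hat - x))))
```

The choice of a linear program exploits the nonnegativity of histogram features; for signed sparse signals the decoder would instead be standard basis pursuit with the usual positive/negative split of the variable.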
Published in: 2009 12th International Conference on Information Fusion
Date of Conference: 06-09 July 2009
Date Added to IEEE Xplore: 18 August 2009
Print ISBN: 978-0-9824-4380-4
Conference Location: Seattle, WA, USA