This paper presents a framework for combining sensor information from audio and video modalities to gain supplementary information beyond what traditional video-based observation systems, or plain CCTV systems, provide. A hierarchical, multi-modal sensor processing architecture for observation and surveillance systems is proposed. It recognizes a set of pre-defined behaviors and learns what constitutes usual behavior. Deviations from “normality” are reported in a form that even staff without special training can understand. The processing architecture, including the physical sensor nodes, is called SENSE (smart embedded network of sensing entities) (Zucker and Frangu, 2007).