In recent years, intelligent video surveillance has sought to provide content analysis tools that understand and predict actions via video sensor networks (VSNs) for automated wide-area surveillance. In this emerging type of network, visual object data is transmitted through different devices to suit the needs of the specific content analysis task. This raises a new challenge for video delivery: how to efficiently transmit visual object data through the network to various devices such as storage devices, content analysis servers, and remote client servers. An object-based video encoder can be used to reduce transmission bandwidth with minor quality loss. However, the motion-compensation technique involved often leads to high computational complexity and consequently increases the cost of the VSN. In this paper, the contextual redundancy associated with background and foreground objects in a scene is explored. A scene analysis method is proposed to classify macroblocks (MBs) by their type of contextual redundancy. Motion search is performed only on the MB context type that actually involves salient motion. To facilitate encoding by MB context, an improved object-based coding architecture, namely a dual-closed-loop encoder, is derived. It encodes each classified MB context in an operational rate-distortion-optimized sense. The experimental results show that the proposed coding framework achieves higher coding efficiency than MPEG-4 coding and related object-based coding approaches, while significantly reducing coding complexity.
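The idea of restricting motion search to macroblocks with salient motion can be illustrated with a minimal sketch. This is not the scene analysis method proposed in the paper. The three labels (`background`, `static`, `moving`), the sum-of-absolute-differences test, the threshold value, and all function names are assumptions made purely for illustration.

```python
MB = 16  # macroblock size in pixels (16x16 luma block, as in MPEG-4)

def mb_sad(a, b, top, left, size=MB):
    """Sum of absolute differences between two frames over one macroblock."""
    return sum(
        abs(a[y][x] - b[y][x])
        for y in range(top, top + size)
        for x in range(left, left + size)
    )

def classify_macroblocks(frame, prev_frame, background, threshold=256, size=MB):
    """Label each macroblock with a hypothetical context type.

    'background': matches the background model, so it is redundant with it.
    'static':     differs from the background but matches the previous frame.
    'moving':     differs from both, so it is a candidate for motion search.
    """
    height, width = len(frame), len(frame[0])
    labels = {}
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            if mb_sad(frame, background, top, left, size) <= threshold:
                label = "background"
            elif mb_sad(frame, prev_frame, top, left, size) <= threshold:
                label = "static"
            else:
                label = "moving"
            labels[(top // size, left // size)] = label
    return labels

def blocks_needing_motion_search(labels):
    """Return the (row, col) positions of macroblocks labeled 'moving'."""
    return sorted(pos for pos, label in labels.items() if label == "moving")

def blank(height, width):
    """Create an all-black grayscale frame as a list of rows."""
    return [[0] * width for _ in range(height)]

def paint(frame, mb_row, mb_col, value, size=MB):
    """Fill one macroblock of the frame with a constant gray value."""
    for y in range(mb_row * size, (mb_row + 1) * size):
        for x in range(mb_col * size, (mb_col + 1) * size):
            frame[y][x] = value

# Toy scene: 32x48 pixels, i.e. 2 rows x 3 columns of macroblocks.
background = blank(32, 48)
prev_frame = blank(32, 48)
paint(prev_frame, 0, 1, 200)  # object that is about to move
paint(prev_frame, 1, 0, 200)  # object that stays in place
frame = blank(32, 48)
paint(frame, 0, 2, 200)       # the first object, now one macroblock to the right
paint(frame, 1, 0, 200)       # the second object, unchanged

labels = classify_macroblocks(frame, prev_frame, background)
print(blocks_needing_motion_search(labels))
```

In this toy scene only one of the six macroblocks is labeled `moving`, so a motion search would run on that block alone; the others would be coded from the background model or the previous frame without a search.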