Object-based coding and description in real time are increasingly important for many image and video applications. We propose a very large scale integration (VLSI) architecture based on a novel segmentation algorithm for extracting objects in video. For each frame, the segmentation architecture consists mainly of two functional phases. In the first phase, pixel-based static and dynamic features in the video are extracted, filtered, and normalized. These multiple features are labeled using a self-organizing feature map (SOFM) neural network architecture to generate initial segmentation labels. In the second phase, an edge fusion module combines the initial segmentation labels with a linked edge map of the frame to generate a more accurate segmentation in which region boundaries lie closer to the actual object boundaries. Computational and hardware complexities of the proposed architecture are estimated in terms of the number of clock cycles, arithmetic components, gates, and memory requirements. The performance of the proposed VLSI architecture demonstrates the possibility of performing object-oriented video segmentation in real time.
Published in: IEEE Transactions on Circuits and Systems for Video Technology (Volume: 13, Issue: 1)
Date of Publication: Jan 2003
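
The first phase described in the abstract (normalize per-pixel features, then label them with a self-organizing feature map) can be illustrated with a small software sketch. This is not the paper's algorithm or its hardware design; it is a minimal one-dimensional SOFM under assumed settings. The function names, the two-unit map, the learning-rate schedule, the half-rate neighbor update, and the sample feature values are all hypothetical choices made for illustration.

```python
def normalize(features):
    """Min-max scale every feature dimension to [0, 1]."""
    dims = range(len(features[0]))
    lo = [min(f[d] for f in features) for d in dims]
    hi = [max(f[d] for f in features) for d in dims]
    return [[(f[d] - lo[d]) / (hi[d] - lo[d]) if hi[d] > lo[d] else 0.0
             for d in dims] for f in features]


def best_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(range(len(weights)),
               key=lambda k: sum((w - v) ** 2 for w, v in zip(weights[k], x)))


def train_sofm(features, n_units=2, epochs=4, lr0=0.5):
    """Train a 1-D self-organizing map (assumed settings, not the paper's)."""
    dims = len(features[0])
    # Deterministic start: units spaced evenly along the feature-space diagonal.
    weights = [[(k + 0.5) / n_units] * dims for k in range(n_units)]
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)             # decaying learning rate
        radius = 1 if epoch < epochs // 2 else 0    # shrinking neighborhood
        for x in features:                          # pixels in scan order
            b = best_unit(weights, x)
            for k in range(max(0, b - radius), min(n_units, b + radius + 1)):
                rate = lr if k == b else 0.5 * lr   # neighbors move at half rate
                weights[k] = [w + rate * (v - w) for w, v in zip(weights[k], x)]
    return weights


def label_pixels(features, weights):
    """Initial segmentation label for each pixel = index of its best unit."""
    return [best_unit(weights, x) for x in features]


# Hypothetical per-pixel feature pairs, e.g. one static and one dynamic value.
pixels = [[0, 0], [10, 10], [1, 1], [9, 9]]
feats = normalize(pixels)
labels = label_pixels(feats, train_sofm(feats))
print(labels)
```

In this sketch, pixels with similar normalized features end up assigned to the same map unit, which plays the role of an initial segmentation label. The edge fusion of the second phase is not modeled here.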