Stereo video, which aims to match what humans see in the real world, offers depth perception of observed scenes. The problem of compressing such a large amount of video data has received considerable attention in the past few years. The challenge comes from a parameter selection process (e.g., mode decision) that is much more complicated than in single-channel video coding. To cope with this problem, the currently developed H.264/AVC-based JMVM platform is adopted here for implementation. The left-view channel is encoded purely with predictions from the temporal domain, while for the right-view channel, combined predictions from the temporal and inter-view domains are exploited, and a hierarchical two-stage neural classifier is designed for fast mode decision. The first-stage neural classifier determines candidate block partitions for each macroblock, while the second-stage classifier chooses the most probable prediction sources among the temporally forward, temporally backward, and inter-view directions. All input features for both stages of neural classifiers are calculated from simple inter-frame and inter-view analyses. Popular fast motion estimation schemes can also be combined with our scheme for further speedup. Experimental results reveal that the proposed algorithm achieves time savings of up to 97% with nearly negligible quality degradation and an acceptable bit-rate increase (up to 5.67%).
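The two-stage decision described above can be illustrated with a minimal sketch. This is not the authors' implementation: the partition and source labels, the `keep_mass` threshold, the candidate-pruning rule, and all function names are assumptions made for illustration, and any callable that returns class probabilities stands in for the trained neural classifiers. The idea shown is only that each stage prunes its own set of options, so the encoder evaluates a reduced set of (partition, prediction source) combinations instead of all of them.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical label sets; the abstract does not enumerate them.
PARTITIONS = ["16x16", "16x8", "8x16", "8x8"]
SOURCES = ["forward", "backward", "inter_view"]

Classifier = Callable[[Sequence[float]], List[float]]


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_classifier(weights: Sequence[Sequence[float]],
                      biases: Sequence[float]) -> Classifier:
    """Stand-in for a trained network: one linear layer followed by softmax."""
    def predict(features: Sequence[float]) -> List[float]:
        scores = [sum(w * x for w, x in zip(row, features)) + b
                  for row, b in zip(weights, biases)]
        return softmax(scores)
    return predict


def select_candidates(probs: Sequence[float], labels: Sequence[str],
                      keep_mass: float) -> List[str]:
    """Keep the most probable labels until their cumulative probability
    reaches keep_mass (an assumed pruning rule, not taken from the paper)."""
    ranked = sorted(zip(probs, labels), reverse=True)
    chosen: List[str] = []
    mass = 0.0
    for p, label in ranked:
        chosen.append(label)
        mass += p
        if mass >= keep_mass:
            break
    return chosen


def fast_mode_candidates(stage1_features: Sequence[float],
                         stage2_features: Sequence[float],
                         stage1: Classifier, stage2: Classifier,
                         keep_mass: float = 0.9) -> List[Tuple[str, str]]:
    """Stage 1 prunes block partitions, stage 2 prunes prediction sources;
    only the surviving combinations would go on to full mode evaluation."""
    partitions = select_candidates(stage1(stage1_features), PARTITIONS, keep_mass)
    sources = select_candidates(stage2(stage2_features), SOURCES, keep_mass)
    return [(p, s) for p in partitions for s in sources]
```

In this sketch, if the first stage assigns probabilities 0.7 and 0.25 to its two most likely partitions and the second stage assigns 0.85 and 0.1 to its two most likely sources, a `keep_mass` of 0.9 leaves 4 of the 12 possible combinations for the encoder to test. The feature vectors would come from the inter-frame and inter-view analyses mentioned in the abstract, which are not specified here.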
IEEE Journal of Selected Topics in Signal Processing (Volume 5, Issue 2)
Date of Publication: April 2011