Skip to Main Content
Texture and depth maps of two neighboring camera viewpoints are typically required for synthesis of an intermediate virtual view via depth-image-based rendering (DIBR). However, the bitrate overhead required for reconstruction of multiple texture and depth maps at decoder can be large. The performance of multiview video encoders such as MVC is limited by the simple fact that the chosen representation is inherently redundant: a texture or depth pixel visible from both camera viewpoints is represented twice. In this paper, we propose an alternative 3D scene representation without such redundancy, yet at decoder, one can still reconstruct texture and depth maps of two camera viewpoints for DIBR-based synthesis of intermediate views. In particular, we propose to first encode texture and depth videos of a single viewpoint, which are used to synthesize the uncoded viewpoint via DIBR at decoder. Then, we encode additional rate-distortion (RD) optimal auxiliary information (AI) to guide an inpainting-based hole-filling algorithm at decoder and complete the missing information due to disocclusion. For a missing pixel patch in the synthesized view, the AI can: i) be skipped and then let the decoder by itself retrieve the missing information, ii) identify a suitable spatial region in the reconstructed view for patch-matching, or iii) explicitly encode missing pixel patch if no satisfactory patch can be found in the reconstructed view. Experimental results show that our alternative representation can achieve up to 41% bit-savings compared to H.264/MVC implementation.
Date of Conference: 15-17 Oct. 2012