In this paper, we propose a novel algorithm to generate multiple virtual views from a video-plus-depth sequence for modern autostereoscopic displays. Synthesizing realistic content in the disocclusion regions of the virtual views is the main challenge of this task, and spatial coherence and temporal consistency are the two key factors in producing perceptually satisfactory virtual images. The proposed algorithm employs a spatio-temporal consistency constraint to handle the uncertain pixels in the disocclusion regions. On the one hand, regarding spatial coherence, we combine the intensity gradient strength with the depth information to determine the filling priority for inpainting the disocclusion regions, so that the continuity of image structures is preserved. On the other hand, temporal consistency is enforced by estimating the intensities in the disocclusion regions across adjacent frames through an optimization process. Specifically, we propose an iteratively re-weighted framework that jointly considers intensity and depth consistency in adjacent frames, which not only imposes temporal consistency but also reduces noise disturbance. Finally, to accelerate multi-view synthesis, we apply the proposed view synthesis algorithm only at the leftmost and rightmost viewpoints to generate their intensity and depth maps; the intermediate views are then efficiently interpolated through image warping, guided by the depth maps associated with the two synthesized views. In the experimental validation, we report quantitative evaluations on synthetic data as well as subjective assessments on real video data, with comparisons to representative methods, to demonstrate the superior performance of the proposed algorithm.
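To make the spatial filling-priority idea concrete, the sketch below ranks the known pixels on a hole's boundary by a weighted sum of gradient strength and depth, so that strong image structures and background (far) pixels are inpainted first. The weight `alpha`, the normalization, the 4-neighbourhood boundary test, and the convention that larger depth values denote farther (background) content are all illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def boundary_pixels(hole_mask):
    """Known pixels 4-adjacent to the hole (the inpainting front)."""
    m = hole_mask.astype(bool)
    nb = np.zeros_like(m)
    nb[1:, :] |= m[:-1, :]   # pixel below a hole pixel
    nb[:-1, :] |= m[1:, :]   # pixel above a hole pixel
    nb[:, 1:] |= m[:, :-1]   # pixel right of a hole pixel
    nb[:, :-1] |= m[:, 1:]   # pixel left of a hole pixel
    return nb & ~m

def filling_priority(intensity, depth, hole_mask, alpha=0.5):
    """Priority map over hole-boundary pixels (higher = fill first).

    Combines normalized gradient strength (structure continuity) with
    normalized depth (background preference); non-boundary pixels get -inf.
    The linear combination is an illustrative choice.
    """
    gy, gx = np.gradient(intensity.astype(float))
    g = np.hypot(gx, gy)
    g = g / (g.max() + 1e-8)               # structure cue in [0, 1]
    d = depth / (depth.max() + 1e-8)        # assumes larger depth = farther
    score = alpha * g + (1.0 - alpha) * d
    return np.where(boundary_pixels(hole_mask), score, -np.inf)
```

With a uniform-intensity frame whose upper rows are background (large depth) and a hole in the middle, the boundary pixel on the background side receives the highest priority, matching the intuition that disocclusions should be filled from the background.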