In this paper, we present a model-based video coding method that uses input from colour and depth cameras, such as the Microsoft Kinect. The model-based approach builds a 3D representation of the scene, which enables applications beyond video playback, including stereoscopic viewing, object insertion for augmented reality, and free-viewpoint viewing. The video encoding step uses computer vision to estimate the camera motion. The scene geometry is represented by key frames, which are encoded as 3D quads using a quad tree, allowing good compression rates. Camera motion between key frames is approximated as linear. The relative camera positions at key frames and the scene geometry are then compressed and transmitted to the decoder. Our experiments demonstrate that the model-based approach delivers a high level of detail at competitively low bit rates.
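To illustrate the idea of encoding key-frame geometry with a quad tree, the following is a minimal sketch and not the encoder described in this paper. It assumes a square depth map whose side is a power of two, and it splits a block whenever the depth range inside it exceeds a tolerance. The function name `build_quads`, the parameter `tol`, the constant-depth leaves, and the splitting criterion are all assumptions made for illustration.

```python
from typing import List, Tuple

# A leaf quad: (x, y, size, mean depth). Hypothetical layout for illustration.
Quad = Tuple[int, int, int, float]


def build_quads(depth: List[List[float]], x: int = 0, y: int = 0,
                size: int = 0, tol: float = 0.05) -> List[Quad]:
    """Recursively split a square depth map into quads of near-constant depth.

    Assumes len(depth) is a power of two and every row has that same length.
    """
    if size == 0:
        size = len(depth)
    block = [depth[r][c]
             for r in range(y, y + size)
             for c in range(x, x + size)]
    # Stop splitting when the block is a single pixel or flat enough.
    if size == 1 or max(block) - min(block) <= tol:
        return [(x, y, size, sum(block) / len(block))]
    half = size // 2
    quads: List[Quad] = []
    # Visit the four children: top-left, top-right, bottom-left, bottom-right.
    for dy in (0, half):
        for dx in (0, half):
            quads += build_quads(depth, x + dx, y + dy, half, tol)
    return quads


depth_map = [
    [1.0, 1.0, 2.0, 2.0],
    [1.0, 1.0, 2.0, 2.0],
    [1.0, 1.0, 2.0, 2.0],
    [1.0, 1.0, 2.0, 3.0],
]
quads = build_quads(depth_map)
print(len(quads), "quads for", len(depth_map) ** 2, "pixels")
```

In this toy input, three of the four quadrants are flat and are each stored as a single quad, while only the quadrant containing the depth discontinuity is subdivided further. This is the property that lets large smooth regions be represented compactly.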