Free viewpoint video enables the visualisation of a scene from arbitrary viewpoints and directions. However, this flexibility in video rendering poses a challenge for 3D media: maintaining spatial synchronicity between the audio and video objects. When the viewpoint changes, its effect on the perceived audio scene must be taken into account to avoid mismatches between the perceived positions of audiovisual objects. Spatial audio coding with such flexibility requires first decomposing the sound scene into audio objects and then synthesising the new scene according to the geometric relations between the A/V capturing setup, the selected viewpoint and the rendering system. This paper proposes a free viewpoint audio coding framework for 3D media systems utilising multiview cameras and a microphone array. A real-time source separation technique is used for object decomposition, followed by spatial audio coding. Binaural systems, multichannel sound systems and wave field synthesis systems are addressed. Subjective test results show that the method consistently achieves spatial synchronicity for various viewpoints, which is not possible with conventional recording techniques.
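The geometric step mentioned above, re-deriving where each audio object should be heard once a viewpoint has been selected, can be illustrated with a minimal 2D sketch. This is not the coding framework proposed in the paper: the function names `relative_direction` and `constant_power_pan`, the coordinate convention, and the use of simple two-channel constant-power panning in place of binaural, multichannel or wave field synthesis rendering are all assumptions made for illustration.

```python
import math


def relative_direction(source_xy, viewpoint_xy, view_yaw_rad):
    """Return (azimuth_rad, distance) of a source as seen from a viewpoint.

    Hypothetical helper. Azimuth is measured counter-clockwise from the
    viewing direction and wrapped to [-pi, pi]; a positive value means the
    source lies to the listener's left.
    """
    dx = source_xy[0] - viewpoint_xy[0]
    dy = source_xy[1] - viewpoint_xy[1]
    distance = math.hypot(dx, dy)
    azimuth = math.atan2(dy, dx) - view_yaw_rad
    # Wrap the angle back into [-pi, pi].
    azimuth = math.atan2(math.sin(azimuth), math.cos(azimuth))
    return azimuth, distance


def constant_power_pan(azimuth_rad):
    """Return (left_gain, right_gain) for a frontal source direction.

    Assumed stand-in renderer: azimuths beyond +/-90 degrees are clamped,
    and the gains satisfy left**2 + right**2 == 1.
    """
    # Map azimuth to a pan position in [-1, 1]: -1 is full left, +1 full right.
    pan = max(-1.0, min(1.0, -azimuth_rad / (math.pi / 2)))
    theta = (pan + 1.0) * math.pi / 4  # pan angle in [0, pi/2]
    return math.cos(theta), math.sin(theta)


if __name__ == "__main__":
    source = (0.0, 1.0)  # separated audio object position, in metres (assumed)
    for yaw_deg in (90.0, 0.0):
        az, dist = relative_direction(source, (0.0, 0.0), math.radians(yaw_deg))
        left, right = constant_power_pan(az)
        print(f"yaw={yaw_deg:5.1f} deg  azimuth={math.degrees(az):6.1f} deg  "
              f"distance={dist:.2f} m  L={left:.3f}  R={right:.3f}")
```

With the viewpoint looking straight at the source, the two gains are equal; turning the viewpoint 90 degrees to the right moves the same object fully into the left channel, which is the kind of viewpoint-dependent re-rendering that keeps audio and video positions aligned.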