Abstract:
In this work, we propose novel joint and sequential multimodal approaches to single-channel audio source separation in videos. The approaches operate within the popular non-negative matrix factorization (NMF) framework and use information about the sounding object's motion. Specifically, we present methods that use a non-negative least squares formulation to couple motion and audio information. The proposed techniques generalize recent work on NMF-based motion-informed source separation and extend easily to video data. Experiments on two distinct multimodal datasets of string instrument performance recordings illustrate their advantages over existing methods.
Published in: 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
Date of Conference: 15-18 October 2017
Date Added to IEEE Xplore: 11 December 2017
Electronic ISSN: 1947-1629