Abstract:
Speaker diarization is the task of estimating “who spoke when” in a meeting. To realize accurate diarization for real meetings, we have to deal with noise, speaker overlap, reverberation, etc. In this work, we propose to model directional statistics of spatial clusters via a dictionary of probabilistic models. The dictionary is trained using spatial features of possible source locations. Observed mixtures of multiple source signals are statistically represented as the weighted sum of the trained models, where each weight defines the activity of a source associated with a spatial location or a cluster. To detect the active clusters and perform the speaker diarization, the weights are estimated by applying Bayes' rule. Furthermore, a Laplace distribution is proposed to model the background noise. The proposed method was evaluated in real meetings, and it achieved high performance compared with a baseline method.
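The weighted-sum representation and the Bayes'-rule step described in the abstract can be sketched in generic mixture-model form. The notation below is assumed for illustration and is not taken from the paper: $z_t$ is the spatial feature at frame $t$, $p(z \mid \theta_k)$ is the $k$-th trained dictionary model, $p_{\mathrm{N}}(z)$ is the background-noise model (a Laplace distribution in the paper), and $w_k$ are the mixture weights. The averaging step in the last line is one common EM-style choice, included as an assumption; the paper's actual estimator may differ.

```latex
% Observed feature modeled as a weighted sum of fixed, pre-trained models
% plus a background-noise component (index 0). Notation is assumed.
\begin{align}
  p(z_t) &= w_0\, p_{\mathrm{N}}(z_t) + \sum_{k=1}^{K} w_k\, p(z_t \mid \theta_k),
  \qquad w_k \ge 0,\quad \sum_{k=0}^{K} w_k = 1, \\
% Bayes' rule: posterior probability that frame t belongs to cluster k >= 1
  \gamma_{t,k} &= \frac{w_k\, p(z_t \mid \theta_k)}{p(z_t)}, \\
% Assumed EM-style weight update over a block of T frames
  w_k &\leftarrow \frac{1}{T} \sum_{t=1}^{T} \gamma_{t,k}.
\end{align}
```

Under this sketch, a cluster $k$ would be declared active in a block of frames when its weight $w_k$ exceeds a chosen threshold, which gives the per-location activity used for diarization.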
Date of Conference: 13-16 September 2016
Date Added to IEEE Xplore: 24 October 2016
IEEE Keywords (Index Terms):
- Speaker Diarization
- Spatial Dictionary
- Background Noise
- Probabilistic Model
- Spatial Features
- Spatial Clustering
- Baseline Methods
- Signal Source
- Active Clusters
- Laplace Distribution
- Mixture Of Signals
- Test Phase
- Source Activity
- Exact Match
- Source Images
- Noise Model
- Short-time Fourier Transform
- Time Difference Of Arrival
- Neighbor Clustering
- Mixture Weights
- Speech Detection
- Empirical Covariance Matrix
- Number Of Utterances