Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion | IEEE Journals & Magazine | IEEE Xplore