Muse: Multi-Modal Target Speaker Extraction with Visual Cues | IEEE Conference Publication | IEEE Xplore