Visually Guided Binaural Audio Generation with Cross-Modal Consistency | IEEE Conference Publication | IEEE Xplore