Audio-Visual Speech Enhancement and Separation by Utilizing Multi-Modal Self-Supervised Embeddings | IEEE Conference Publication | IEEE Xplore