Multimodal One-shot Learning of Speech and Images | IEEE Conference Publication