Diverse Audio-to-Image Generation via Semantics and Feature Consistency | IEEE Conference Publication | IEEE Xplore