Multimodal Learning for Image Caption Generation: A Deep Learning Approach | IEEE Conference Publication | IEEE Xplore