Multimodal Convolutional Neural Networks for Matching Image and Sentence | IEEE Conference Publication | IEEE Xplore