Building A Voice Based Image Caption Generator with Deep Learning | IEEE Conference Publication | IEEE Xplore