Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning | IEEE Conference Publication