From Deterministic to Generative: Multimodal Stochastic RNNs for Video Captioning | IEEE Journals & Magazine | IEEE Xplore