Scaling Up Vision-Language Pretraining for Image Captioning | IEEE Conference Publication | IEEE Xplore