Aligning Where to See and What to Tell: Image Captioning with Region-Based Attention and Scene-Specific Contexts | IEEE Journals & Magazine | IEEE Xplore