Visual Question Answering based on multimodal triplet knowledge accumulation | IEEE Conference Publication | IEEE Xplore