An architecture for incremental information fusion of cross-modal representations | IEEE Conference Publication | IEEE Xplore