Multimodal object recognition from visual and audio sequences | IEEE Conference Publication