First- and Third-Person Video Co-Analysis by Learning Spatial-Temporal Joint Attention