End-to-end Video-level Representation Learning for Action Recognition | IEEE Conference Publication | IEEE Xplore