End-to-End Pre-Training With Hierarchical Matching and Momentum Contrast for Text-Video Retrieval | IEEE Journals & Magazine | IEEE Xplore