Finite Sample Analysis of Minmax Variant of Offline Reinforcement Learning for General MDPs | IEEE Journals & Magazine | IEEE Xplore