Multi Pseudo Q-Learning-Based Deterministic Policy Gradient for Tracking Control of Autonomous Underwater Vehicles