Energy-Efficient Deep Reinforcement Learning Accelerator Designs for Mobile Autonomous Systems


Abstract:

Deep reinforcement learning (DRL) is widely used for autonomous systems, including autonomous driving, robots, and drones. DRL training is essential for human-level control and for adaptation to rapidly changing environments in mobile autonomous systems. However, accelerating DRL training poses three challenges: 1) large memory access, 2) varying data patterns, and 3) complex data dependencies due to the use of multiple DNNs. Two CMOS DRL accelerators have been proposed to support high-speed, energy-efficient DRL training in mobile autonomous systems. One accelerator handles the different data patterns with a transposable PE architecture and reduces the large feature-map memory access with top-3 experience compression. The other supports group-sparse training for weight compression and integrates an on-line DRL task scheduler to support multi-DNN operation.
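
The top-3 experience compression mentioned above can be pictured as storing only the few largest-magnitude activations of each feature-map block before it is written to experience memory. The following is a minimal NumPy sketch of that idea; the block size, encoding, and function names are illustrative assumptions, not the accelerator's actual scheme.

```python
import numpy as np

def compress_top3(fmap_block):
    """Keep only the 3 largest-magnitude activations in a feature-map block.

    Hypothetical sketch of 'top-3 experience compression': stores (index, value)
    pairs instead of the dense block, cutting the memory traffic of experience
    replay. The real block size and encoding are not specified here.
    """
    flat = fmap_block.ravel()
    idx = np.argpartition(np.abs(flat), -3)[-3:]   # indices of the top-3 magnitudes
    return idx.astype(np.int32), flat[idx]         # sparse (index, value) pairs

def decompress_top3(idx, vals, shape):
    """Rebuild a dense block from the stored top-3 (index, value) pairs."""
    flat = np.zeros(int(np.prod(shape)), dtype=vals.dtype)
    flat[idx] = vals
    return flat.reshape(shape)

block = np.random.randn(4, 4).astype(np.float32)
idx, vals = compress_top3(block)
approx = decompress_top3(idx, vals, block.shape)   # lossy reconstruction
```

Under these assumptions, a dense 4x4 block of 16 words shrinks to three (index, value) pairs, i.e., 6 words per block, at the cost of a lossy reconstruction.
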
Date of Conference: 06-09 June 2021
Date Added to IEEE Xplore: 23 June 2021
Conference Location: Washington, DC, USA

I. Introduction

Recently, reinforcement learning (RL) has been widely investigated for autonomous systems, including autonomous driving [1], [2], drones [3], [4], and robots [5], [6]. The main reason for this recent interest is the impressive increase in decision-making performance driven by deep reinforcement learning (DRL). DRL adopts deep neural networks (DNNs) to approximate high-complexity environments, overcoming the limited control performance caused by the restricted state dimensionality of classical RL algorithms. DRL has achieved human-level or even better control performance in many complex environments. The authors of [5] trained a four-legged walking robot through DRL; the trained robot could adapt to sudden environmental changes, including slope variations and new obstacles. The work in [6] demonstrated a robotic curling team with human-level performance, winning against 3 out of 4 expert teams in actual curling matches. In the recent DARPA Air Combat Evolution program, agents trained with state-of-the-art DRL algorithms such as SAC [7], TD3 [8], and PPO [9] beat human pilots.
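
The multi-DNN data dependency noted in the abstract is visible in the structure of these algorithms: in TD3 [8], for instance, the critic's training target is produced by separate target networks, and the actor's gradient flows backward through the critic. The PyTorch sketch below is a generic, simplified illustration of that dependency chain (random placeholder data, arbitrary layer sizes), not the paper's workload or implementation.

```python
import copy
import torch
import torch.nn as nn

# Generic TD3-style update: four networks whose computations depend on one
# another, illustrating the multi-DNN scheduling problem in DRL training.
state_dim, act_dim, gamma = 8, 2, 0.99
actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                      nn.Linear(64, act_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(state_dim + act_dim, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_tgt, critic_tgt = copy.deepcopy(actor), copy.deepcopy(critic)

# Placeholder replay-buffer batch: state, action, reward, next state.
s, a = torch.randn(32, state_dim), torch.randn(32, act_dim)
r, s2 = torch.randn(32, 1), torch.randn(32, state_dim)

# Critic update: its target requires forward passes through two other DNNs.
with torch.no_grad():
    q_tgt = r + gamma * critic_tgt(torch.cat([s2, actor_tgt(s2)], dim=1))
critic_loss = ((critic(torch.cat([s, a], dim=1)) - q_tgt) ** 2).mean()
critic_loss.backward()

# Actor update: its gradient must flow backward through the critic.
# (A real trainer would zero or freeze the critic's gradients here.)
actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
actor_loss.backward()
```

Each update step thus mixes forward passes, transposed backward passes, and cross-network dependencies, which is precisely the workload mix the two accelerators target.
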
