
Low Complexity and Scalable Architecture of Dual-Mode Training and Inference Hardware Accelerator for Deep Q-Network (DQN) Based Edge Computing



Abstract:

Reinforcement Learning (RL) has shown great potential in a wide range of applications, including robotics, autonomous vehicles, and communication systems. However, deploying RL on edge devices poses significant challenges due to limited computational resources and the complexity of real-time decision-making. This paper proposes a novel hardware accelerator architecture for Deep Q-Networks (DQN) tailored to edge computing environments. The design supports both inference and training modes and utilizes shared hardware modules, approximate activation functions, and lightweight multipliers to reduce area and power consumption. The system also enables flexible configuration of neural network parameters and the execution of key RL components, such as policy generation, environment modeling, and reward calculation, via software on the processing system (PS). Quantization techniques, including Piecewise Linear (PWL) approximations and fixed-point representation, are employed to maintain computational accuracy and model convergence. The proposed accelerator, implemented on an Ultra96 FPGA at 70 MHz, achieves 671× and 287× speedups for inference and training, respectively, with only 13 mW power consumption. This architecture provides an efficient and scalable solution for deploying RL-based autonomous systems in resource-constrained edge environments.
The overall process of prediction is shown in the graphical abstract, including data acquisition, data preprocessing, forecasting, and recombining.
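
The abstract mentions piecewise linear (PWL) approximations of activation functions combined with fixed-point quantization. The paper's actual segment breakpoints, slopes, and bit widths are not reported on this page, so the Python sketch below uses assumed, illustrative values (an 8-fractional-bit format and a five-segment sigmoid) purely to show how a PWL fixed-point activation of this kind can be evaluated.

# Illustrative sketch only: the segment breakpoints, slopes, and fixed-point
# format below are assumptions for demonstration, not the paper's parameters.

FRAC_BITS = 8          # assumed Q-format: 8 fractional bits
SCALE = 1 << FRAC_BITS

def to_fixed(x):
    """Quantize a float to signed fixed-point (round to nearest)."""
    return int(round(x * SCALE))

def fixed_mul(a, b):
    """Fixed-point multiply with rescaling back to the Q-format."""
    return (a * b) >> FRAC_BITS

# Coarse PWL sigmoid: each entry is (segment start x0, slope m, intercept c).
# The constants are converted to fixed-point once, as a small table a
# hardware block could store.
_SEGMENTS = [
    (-8.0, 0.0,    0.0),    # x < -4  -> ~0
    (-4.0, 0.0625, 0.25),   # -4 <= x < -2
    (-2.0, 0.1875, 0.5),    # -2 <= x <  2 (central, steeper segment)
    ( 2.0, 0.0625, 0.75),   #  2 <= x <  4
    ( 4.0, 0.0,    1.0),    # x >= 4  -> ~1
]
SEGMENTS_FX = [(to_fixed(x0), to_fixed(m), to_fixed(c)) for x0, m, c in _SEGMENTS]

def pwl_sigmoid_fixed(x_fx):
    """Approximate sigmoid(x) for a fixed-point input using PWL segments."""
    slope, intercept = SEGMENTS_FX[0][1], SEGMENTS_FX[0][2]
    for x0_fx, m_fx, c_fx in SEGMENTS_FX:
        if x_fx >= x0_fx:           # pick the last segment whose start <= x
            slope, intercept = m_fx, c_fx
    return fixed_mul(slope, x_fx) + intercept

if __name__ == "__main__":
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        y = pwl_sigmoid_fixed(to_fixed(x)) / SCALE
        print(f"sigmoid({x:+.1f}) ~= {y:.3f}")

Because each activation then reduces to one multiply and one add against table constants, this style of approximation is consistent with the abstract's emphasis on lightweight multipliers and low area and power.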
Published in: IEEE Access ( Volume: 13)
Page(s): 71705 - 71722
Date of Publication: 21 April 2025
Electronic ISSN: 2169-3536
