Hardware mapping analysis of DDPG algorithm based on FPGA and robot motion skill learning
Affiliation:

(1. School of Information Science and Technology, Beijing University of Technology, Beijing 100039, China; 2. CNNC Hexin Information Technology (Beijing) Co., LTD., Beijing 100091, China; 3. Nuclear Industry X Intelligence Laboratory (Beijing University of Technology), Beijing 100124, China)

CLC Number:

TP242

    Abstract:

    This paper investigates the intrinsic connection among neural networks, reinforcement learning (RL) algorithms, and the evolutionary principles of higher animals by developing an observable and interpretable autonomous control system for a wheel-legged robot. Based on the Deep Deterministic Policy Gradient (DDPG) algorithm, an Actor-Critic neural network is implemented directly on a field-programmable gate array (FPGA). An FPGA-ARM robot control system is further designed to export weight activation signals in real time and generate weight heatmaps, thereby visualizing the strategy evolution process. Experimental results demonstrate that the proposed system reduces the single-step computation latency to 28 μs and converges within 5,000 steps. Moreover, the weight heatmaps reveal the dynamic evolution of strategies across three phases: early, middle, and late. Qualitative analysis indicates that non-salient regions have minimal impact on the overall strategy, which allows more efficient resource utilization. The proposed hardware-algorithm co-design framework establishes a novel paradigm for improving the interpretability of RL and reducing its “black-box” nature. It also showcases the unique advantages of FPGAs in embedded robot control, namely low latency, high parallelism, and low power consumption. This work lays a robust foundation for real-time skill learning and hardware acceleration in scenarios involving multi-agent cooperation and heterogeneous computing platforms.
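
    As a rough illustration of the weight heatmap idea described in the abstract, the sketch below turns one snapshot of a layer's weights into heatmap intensities and measures how much of the layer is non-salient. It is not the authors' implementation: the min-max scaling of absolute weights, the 0.1 threshold, and the names `weight_heatmap` and `non_salient_fraction` are all assumptions made for this example.

```python
from typing import List

Matrix = List[List[float]]


def weight_heatmap(weights: Matrix) -> Matrix:
    """Scale |w| to [0, 1] over the whole layer (assumed min-max scaling)."""
    mags = [[abs(w) for w in row] for row in weights]
    flat = [m for row in mags for m in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0.0:
        # All magnitudes are equal, so no region stands out.
        return [[0.0 for _ in row] for row in mags]
    return [[(m - lo) / span for m in row] for row in mags]


def non_salient_fraction(heatmap: Matrix, threshold: float = 0.1) -> float:
    """Fraction of cells whose intensity is below a hypothetical threshold."""
    flat = [v for row in heatmap for v in row]
    return sum(1 for v in flat if v < threshold) / len(flat)


# Hypothetical 2x2 weight snapshot, e.g. exported at one training step.
snapshot = [[0.5, -2.0], [0.0, 1.0]]
heat = weight_heatmap(snapshot)
print(heat)                        # [[0.25, 1.0], [0.0, 0.5]]
print(non_salient_fraction(heat))  # 0.25
```

    Comparing such maps from early, middle, and late training snapshots is one simple way to see which regions of a layer change; the threshold is a free parameter and would need tuning for a real network.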

History
  • Received: August 17, 2025
  • Online: January 8, 2026