Hardware mapping analysis of DDPG algorithm based on FPGA and robot motion skill learning

doi:10.11918/202508035

Volume 58, Issue 1, 2026, pp. 24-34

Affiliation: 1. School of Information Science and Technology, Beijing University of Technology, Beijing 100039, China; 2. CNNC Hexin Information Technology (Beijing) Co., Ltd., Beijing 100091, China; 3. Nuclear Industry X Intelligence Laboratory (Beijing University of Technology), Beijing 100124, China
CLC Number: TP242

Abstract:

This paper investigates the intrinsic connection between neural networks, reinforcement learning (RL) algorithms, and the evolutionary principles of higher animals by developing an observable and interpretable autonomous control system for a wheel-legged robot. Using the Deep Deterministic Policy Gradient (DDPG) algorithm, an Actor-Critic neural network is implemented directly on a field-programmable gate array (FPGA). An FPGA-ARM robot control system is further designed to export weight activation signals in real time and generate weight heatmaps, thereby visualizing the strategy evolution process. Experimental results demonstrate that the proposed system reduces the single-step computation latency to 28 μs and converges within 5,000 steps. Moreover, the weight heatmaps reveal the dynamic evolution of strategies across three phases: early, middle, and late. Qualitative analysis indicates that non-salient regions have minimal impact on the overall strategy, resulting in more efficient resource utilization. The proposed hardware-algorithm co-design framework establishes a novel paradigm for improving the interpretability of RL and reducing its "black-box" nature. It also showcases the advantages of FPGAs in embedded robot control, namely low latency, high parallelism, and low power consumption. This work lays a foundation for real-time skill learning and hardware acceleration in scenarios involving multi-agent cooperation and heterogeneous computing platforms.

Received: August 17, 2025
Online: January 8, 2026
