A multi-step delay parameter update parallel optimization method for deep neural network

doi:10.11918/202407052

Home > Archive>Volume 57, Issue 9, 2025 >95-108. DOI:10.11918/202407052

A multi-step delay parameter update parallel optimization method for deep neural network
DOI:
                        10.11918/202407052
                    
CSTR:
                        
Author:
                        
Affiliation:(1.School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; 2.School of Computer Science and Technology,Xi′an Jiaotong University, Xi′an 710049, China)
Clc Number:TP391
Fund Project:

Article

Figures

Metrics

Reference

Cited by

Materials

Comments

Abstract:

To address the high communication overhead caused by global gradient parameter updates at aggregation nodes in distributed data parallel training of deep neural network (DNN), a parallel optimization method of multi-step delay parameter updates for deep neural network is proposed. Firstly, an adaptive multi-step update interval selection strategy was designed. After completing multiple local iterative parameter updates, node gradients are aggregated to update the global model parameters, reducing the excessive communication overhead caused by frequent gradient aggregation. At the same time, to prevent the local model from deviating from the global model after several local updates, a parameter correction strategy is proposed to ensure the accuracy of model training. Secondly, during gradient aggregation, the gradient tensor is split into several sub-tensors. By combining sub-tensor priority scheduling, communication and computation during gradient aggregation are maximally overlapped, further accelerating the model training process. Finally, on the CIFAR-100 and ImageNet-mini datasets, the proposed method is compared with SSGD, Local SGD training methods. Results show that the proposed method can significantly reduce communication overhead due to parameter updating on the basis of ensuring model training accuracy. It can maximize the overlap of communication and computing, and make full use of computing resources to improve the speed of parallel training. The results of this study can provide a new resolution to reduce communication costs in the distributed training process of deep neural network.

Reference

Cited by

Get Citation

Copy

Article Metrics

Abstract:
PDF:
HTML:
Cited by:

History

Received:July 17,2024
Revised:
Adopted:
Online: September 15,2025
Published:

Publication Statement

Journal Subscription

Get Citation

Related Videos

Share

Article Metrics

History

Article QR Code