Abstract: To improve the accuracy of image time-series prediction, this paper proposes MA-LSTM, a time-series prediction network based on LSTM and an attention mechanism. The model consists of a multi-scale attention module (MAB), a multi-scale attention layer (MALayer), and a super-resolution reconstruction module (SRRM), which together improve the representation of spatiotemporal features and long-range dependencies. First, the MAB is designed: a spatiotemporal feature enhancement layer (GSTA) improves detail modeling, and a channel feature enhancement layer (GCA), which addresses SwinLSTM's insufficient capture of fine-grained features, enhances cross-dimensional information interaction in the feature map. Second, a simplified LSTM structure is combined with the MAB to construct the MALayer, improving the modeling of time-series information. Finally, the SRRM is applied during feature map reconstruction to improve the prediction output. Experimental results show that MA-LSTM achieves structural similarity index (SSIM) values of 0.9602 and 0.9243 on two datasets from different fields, MovingMNIST and KTH, respectively. Compared with the SwinLSTM, PhyDNET, PredRNN, and ConvLSTM networks, the largest accuracy improvements on the two datasets are 0.337 and 0.212, respectively. The model demonstrates higher efficiency and applicability in time-series prediction tasks and good potential for cross-domain use, and ablation experiments confirm the effectiveness of the proposed modules.
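
For reference, the SSIM metric cited above can be illustrated with a minimal sketch. It assumes a single global window over grayscale images, whereas standard SSIM implementations average the index over local windows, so it is not the evaluation code behind the reported results. The function name `global_ssim` and the `data_range` parameter are illustrative assumptions.

```python
def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM between two equally sized grayscale images.

    x, y: 2D images given as lists of rows of pixel values.
    data_range: difference between the largest and smallest possible pixel value.
    Simplification (assumption): statistics are taken over the whole image
    rather than over sliding local windows.
    """
    xs = [float(v) for row in x for v in row]
    ys = [float(v) for row in y for v in row]
    if not xs or len(xs) != len(ys):
        raise ValueError("images must be non-empty and have the same size")

    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    var_x = sum((a - mu_x) ** 2 for a in xs) / n
    var_y = sum((b - mu_y) ** 2 for b in ys) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / n

    # Stabilizing constants with the commonly used K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    predicted = [[0.0, 0.5], [0.5, 1.0]]
    target = [[0.0, 0.4], [0.6, 1.0]]
    print(global_ssim(predicted, target))
```

The index equals 1 for identical images and decreases as the predicted frame departs from the target in luminance, contrast, or structure, which is why values closer to 1 indicate better predictions.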