| Cite this article: | SHEN Yu, MA Yukun, ZHAO Yonggang, WEI Ziyi, LI Jiangcheng, WANG Ruoxuan, LIU Jiaying, YAN Jiarong. Multi-scale feature modeling for image time-series prediction network[J]. Journal of Harbin Institute of Technology, 2026, 58(1): 119. DOI:10.11918/202503001 |
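| Multi-scale feature modeling for image time-series prediction network |
|
SHEN Yu1, MA Yukun1, ZHAO Yonggang2, WEI Ziyi1, LI Jiangcheng1, WANG Ruoxuan1, LIU Jiaying1, YAN Jiarong1
|
|
(1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; 2. Lanzhou Trying Information Technology Co., Ltd., Lanzhou 730070, China)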
|
|
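| Abstract: |
| To improve the accuracy of image time-series prediction, this study proposes MA-LSTM, a time-series prediction network based on long short-term memory (LSTM) and attention mechanisms. The network consists of a multi-scale attention block (MAB), a multi-scale attention layer (MALayer), and a super-resolution reconstruction module (SRRM). It centers on multi-scale feature modeling and focuses on strengthening spatiotemporal feature representation and long-range dependency modeling. First, MA-LSTM introduces the MAB, which uses a spatiotemporal feature enhancement layer to improve the model's ability to capture detail and a channel feature enhancement layer to strengthen cross-dimensional information interaction within the feature map, addressing SwinLSTM's insufficient capture of fine-grained features. Second, MA-LSTM adopts a simplified LSTM structure and combines it with the MAB to build the MALayer, improving the model's ability to model temporal information. Finally, the SRRM is designed for feature-map reconstruction and effectively improves the detail in the model's predicted output. Experiments show that MA-LSTM reaches structural similarity index (SSIM) values of 0.9602 and 0.9243 on MovingMNIST and KTH, two datasets from different domains. In comparisons with SwinLSTM, PhyDNet, PredRNN, and ConvLSTM, the largest SSIM improvements are 0.337 and 0.212, respectively, demonstrating the network's efficiency and applicability in time-series prediction tasks and its potential for cross-domain generalization. Ablation experiments further confirm the effectiveness of the proposed modules. |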
| Keywords: image time series; prediction network; LSTM; shifted window attention; multi-attention fusion |
| DOI:10.11918/202503001 |
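| CLC number: TP183 |
| Document code: A |
| Funding: National Natural Science Foundation of China, Young Scientists Fund Category A (42325502); Gansu Provincial Key Research and Development Program (Gan Ke Ji [2024] No. 10-24YFGA037); National Natural Science Foundation of China (6,5); Gansu Provincial Science and Technology Commissioner Special Project (Gan Ke Ji [2023] No. 18-23CXGA0008); "Smart Tianlu" Construction Major Special Project-QZzhtlzx (2023QZzhtl1102); Lanzhou Bureau Group Company Science and Technology Research and Development Program (LZJKY2024079-1); China State Railway Group Co., Ltd. Key Project (N2023X050); Lanzhou Jiaotong University Key Research and Development Project (LZJTU-ZDYF2305) |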
|
| Multi-scale feature modeling for image time-series prediction network |
|
SHEN Yu1,MA Yukun1,ZHAO Yonggang2,WEI Ziyi1,LI Jiangcheng1,WANG Ruoxuan1,LIU Jiaying1,YAN Jiarong1
|
|
(1.School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China;2.Lanzhou Trying Information Technology Co., Ltd., Lanzhou 730070, China)
|
| Abstract: |
| To improve the accuracy of image time-series prediction, MA-LSTM, a time-series prediction network based on LSTM and attention mechanisms, is proposed. The model consists of a multi-scale attention block (MAB), a multi-scale attention layer (MALayer), and a super-resolution reconstruction module (SRRM), and it improves the representation of spatiotemporal features and the modeling of long-range dependencies. First, the MAB is designed: a spatiotemporal feature enhancement layer (GSTA) improves detail modeling, and a channel feature enhancement layer (GCA) strengthens cross-dimensional information interaction within the feature map, overcoming SwinLSTM's insufficient capture of fine-grained features. Second, a simplified LSTM structure is combined with the MAB to construct the MALayer, improving the modeling of time-series information. Finally, the SRRM is applied during feature-map reconstruction to improve the detail of the predicted output. Experimental results show that MA-LSTM achieves structural similarity index (SSIM) values of 0.9602 and 0.9243 on MovingMNIST and KTH, two datasets from different domains. Compared with SwinLSTM, PhyDNet, PredRNN, and ConvLSTM, the largest SSIM improvements are 0.337 and 0.212, respectively. The model demonstrates efficiency and applicability in time-series prediction tasks and good potential for cross-domain generalization, and ablation experiments confirm the effectiveness of the proposed modules. |
| Key words: image time series data; prediction network; LSTM; shifted window attention; multi-attention fusion |
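
The abstract reports results as structural similarity index (SSIM) values. As a rough illustration of what that metric measures, the sketch below computes a simplified, whole-frame SSIM and averages it over a predicted sequence. It is not the paper's evaluation code: standard SSIM implementations average the statistic over local sliding windows, whereas this version uses a single window covering the entire frame. The function names `global_ssim` and `mean_sequence_ssim`, the flattened-list frame layout, and the constants `k1` and `k2` (the commonly used defaults) are assumptions made for this example.

```python
import random
from statistics import fmean


def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two frames given as flat pixel lists.

    Simplification (assumption): statistics are taken over the whole frame
    rather than over local sliding windows as in standard SSIM.
    """
    if len(x) != len(y) or not x:
        raise ValueError("frames must be non-empty and the same length")
    c1 = (k1 * data_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * data_range) ** 2  # stabilizes the contrast/structure term
    mu_x, mu_y = fmean(x), fmean(y)
    var_x = fmean((a - mu_x) ** 2 for a in x)
    var_y = fmean((b - mu_y) ** 2 for b in y)
    cov_xy = fmean((a - mu_x) * (b - mu_y) for a, b in zip(x, y))
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def mean_sequence_ssim(pred, target, data_range=1.0):
    """Average the per-frame SSIM over two equally long frame sequences."""
    if len(pred) != len(target):
        raise ValueError("sequences must contain the same number of frames")
    return fmean(global_ssim(p, t, data_range) for p, t in zip(pred, target))


# Synthetic stand-in data: 10 frames of 64 x 64 pixels with values in [0, 1].
rng = random.Random(0)
target = [[rng.random() for _ in range(64 * 64)] for _ in range(10)]
# A hypothetical "prediction": the target plus clipped Gaussian noise.
noisy = [[min(1.0, max(0.0, v + rng.gauss(0.0, 0.1))) for v in frame]
         for frame in target]

same_score = mean_sequence_ssim(target, target)
noisy_score = mean_sequence_ssim(noisy, target)
print(f"identical: {same_score:.4f}, noisy: {noisy_score:.4f}")
```

A perfect prediction scores 1, and the score falls as the predicted frames diverge from the ground truth, which is why values closer to 1 indicate better prediction quality.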