Infrared and visible image fusion guided by Taylor expansion and composite attention
Authors:

YANG Yanchun (杨艳春), LI Yi (李毅)

Affiliation:

(School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China)

About the author:

YANG Yanchun (born 1979), female, associate professor, master's supervisor

Corresponding author:

LI Yi, 1544726016@qq.com

CLC number:

TP391; TN29

Funding:

National Natural Science Foundation of China (3,6); Key Research and Development Program of Gansu Province (25YFGA047); Natural Science Foundation of Gansu Province (23JRRA7,1JR7RA300)

Abstract:

Deep learning fusion algorithms often ignore the correlation between pixels, so the fused result loses important global texture, and they struggle to balance target saliency against scene enhancement. To address these problems, this paper proposes an infrared and visible image fusion algorithm guided by Taylor expansion and a composite attention mechanism. First, a Taylor expansion network decomposes the input image into a mapping layer and a derivative layer, which allows the multi-level feature information of the image to be extracted effectively. Second, a dual-branch feature extraction network is adopted, in which a parallel convolutional network captures local detail features and a Swin Transformer module extracts global context information, so that both local and global features are retained. Third, a composite attention mechanism is introduced to improve the precision of feature fusion: axial attention fuses features along the spatial dimensions, while channel attention strengthens the feature responses across channels, yielding finer feature selection and fusion. Finally, the fused image is obtained by image reconstruction. Experiments on the public datasets MSRS and RoadScene show that the fused images produced by the proposed method preserve texture details and global information more completely and achieve significant advantages on objective metrics. These results may offer new ideas for deep-learning-based image fusion.
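
The composite attention step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' network: the learned query/key/value projections and any channel MLP are omitted (Q = K = V, and the gate is a plain sigmoid of the channel mean), the two attentions are simply applied in sequence, and all function names (`attend`, `axial_attention`, `channel_attention`, `composite_attention`) are hypothetical. Feature maps are nested lists indexed as `x[channel][row][col]`.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attend(tokens):
    """Parameter-free self-attention over feature vectors (assumption: Q = K = V)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        out.append([sum(wj * v[i] for wj, v in zip(w, tokens)) for i in range(d)])
    return out

def axial_attention(x):
    """Attend along the width axis of every row, then along the height axis of every column."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    y = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for h in range(H):  # each row is a sequence of W tokens with C features
        row = attend([[x[c][h][w] for c in range(C)] for w in range(W)])
        for w in range(W):
            for c in range(C):
                y[c][h][w] = row[w][c]
    z = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for w in range(W):  # each column is a sequence of H tokens with C features
        col = attend([[y[c][h][w] for c in range(C)] for h in range(H)])
        for h in range(H):
            for c in range(C):
                z[c][h][w] = col[h][c]
    return z

def channel_attention(x):
    """Scale each channel by the sigmoid of its global average (assumption: no learned MLP)."""
    out = []
    for ch in x:
        mean = sum(sum(r) for r in ch) / (len(ch) * len(ch[0]))
        gate = 1.0 / (1.0 + math.exp(-mean))
        out.append([[gate * v for v in r] for r in ch])
    return out

def composite_attention(ir, vis):
    """Stack infrared and visible features along channels, then apply both attentions.
    The order (axial first, channel second) is an assumption of this sketch."""
    x = ir + vis  # list concatenation along the channel axis
    return channel_attention(axial_attention(x))

# Toy usage: one infrared channel and one visible channel, each 2x2.
ir = [[[1.0, 1.0], [1.0, 1.0]]]
vis = [[[2.0, 2.0], [2.0, 2.0]]]
fused = composite_attention(ir, vis)
```

Axial attention keeps the cost per position proportional to H + W rather than H × W, which is the usual motivation for factorizing spatial attention along rows and columns.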

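Cite this article

YANG Yanchun, LI Yi. Infrared and visible image fusion guided by Taylor expansion and composite attention[J]. Journal of Harbin Institute of Technology, 2026, 58(5): 54. DOI:10.11918/202509025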

History
  • Received: 2025-09-07
  • Published online: 2026-05-28