Abstract: Deep learning fusion algorithms often ignore the correlation between pixels, which causes the loss of important global texture in the fused results, and they struggle to balance target highlighting against scene enhancement. To address these problems, this paper proposes an infrared and visible image fusion algorithm guided by Taylor expansion and a composite attention mechanism. First, a Taylor expansion network decomposes the input image into a mapping layer and a derivative layer to extract multi-level feature information. Second, a dual-branch feature extraction network is used, in which a parallel convolutional network captures local detail features and a Swin Transformer module extracts global context information, so that both local and global features are retained. Then, a composite attention mechanism is introduced to improve the accuracy of feature fusion: axial attention fuses features along the spatial dimensions, and channel attention strengthens the feature response across channels, yielding more refined feature selection and fusion. Finally, the fused image is obtained by image reconstruction. Experiments on the public datasets MSRS and RoadScene show that the proposed method preserves texture details and global information more completely and also achieves significant advantages on objective evaluation metrics. The results may provide new ideas for deep learning-based image fusion.
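The composite attention step can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names (`axial_attention`, `channel_attention`, `composite_attention_fuse`) are hypothetical, and so are the element-wise sum of the infrared and visible features, the omission of learned query/key/value projections, and the squeeze-and-excitation style channel gate. It only shows the general idea of attending along the height and width axes in turn and then reweighting channels.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def axial_attention(feat, axis):
    """Self-attention along one spatial axis of a (C, H, W) feature map.

    axis=1 attends along height, axis=2 along width. Assumption: queries,
    keys and values are the raw features (no learned projections).
    """
    if axis not in (1, 2):
        raise ValueError("axis must be 1 (height) or 2 (width)")
    c = feat.shape[0]
    # Arrange as (other spatial axis, attended axis, channels).
    perm = (2, 1, 0) if axis == 1 else (1, 2, 0)
    inverse = (2, 1, 0) if axis == 1 else (2, 0, 1)
    tokens = feat.transpose(perm)
    # Scaled dot-product scores between positions on the attended axis.
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(c)
    weights = softmax(scores, axis=-1)
    mixed = weights @ tokens
    return mixed.transpose(inverse)  # back to (C, H, W)


def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation style gate (an assumed form of channel attention).

    w1 has shape (C // r, C) and w2 has shape (C, C // r).
    """
    squeeze = feat.mean(axis=(1, 2))             # global average pool, shape (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)       # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid, one weight per channel
    return feat * gate[:, None, None]


def composite_attention_fuse(ir_feat, vis_feat, w1, w2):
    """Hypothetical fusion: sum inputs, apply axial then channel attention."""
    merged = ir_feat + vis_feat
    spatial = axial_attention(axial_attention(merged, axis=1), axis=2)
    return channel_attention(spatial, w1, w2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ir = rng.standard_normal((8, 6, 5))   # stand-in infrared features
    vis = rng.standard_normal((8, 6, 5))  # stand-in visible features
    w1 = rng.standard_normal((2, 8))
    w2 = rng.standard_normal((8, 2))
    print(composite_attention_fuse(ir, vis, w1, w2).shape)
```

Attending along one axis at a time keeps the attention matrices at size H×H and W×W rather than (HW)×(HW), which is the usual motivation for axial attention on image feature maps.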