Objective Infrared time-sensitive targets are infrared targets, such as ships and aircraft, that have high military value and can be attacked only within a limited time window. Infrared time-sensitive target detection technology is widely used in military and civilian fields such as unmanned cruise, precision strike, and battlefield reconnaissance. Deep-learning-based target detection algorithms have made great progress thanks to powerful computing resources, deep network structures, and large amounts of labeled data. However, images of some high-value targets are difficult and costly to acquire. As a result, infrared time-sensitive target image data are scarce, and multi-scene, multi-target training data are lacking, which makes it difficult to guarantee detection performance. To address this, this paper proposes an infrared time-sensitive target detection technology based on cross-modal data enhancement, which generates "new data" by processing existing data, expands the infrared time-sensitive target data set, and improves the detection accuracy and generalization ability of the model.
Methods We propose an infrared time-sensitive target detection technology based on cross-modal data enhancement. The cross-modal data enhancement method is a two-stage model (Fig.1). In the first stage, visible light images containing time-sensitive targets are converted into infrared images by a modality conversion model based on the CUT network. In the second stage, the coordinate attention mechanism is introduced into the generation model, which randomly generates a large number of infrared target images and thereby achieves data enhancement. Finally, an improved Yolov5 target detection architecture based on the SE module and the CBAM module is proposed (Fig.3).
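The channel recalibration performed by an SE (squeeze-and-excitation) module can be illustrated with a minimal pure-Python sketch. This is not the implementation used in the paper: the function name `se_block`, the weight matrices `w_reduce` and `w_expand`, and the toy feature map are hypothetical, biases are omitted, and nothing is assumed about where the module sits inside the CSP network.

```python
import math


def relu(x):
    """Rectified linear unit."""
    return x if x > 0 else 0.0


def sigmoid(x):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def se_block(feature_map, w_reduce, w_expand):
    """Apply squeeze-and-excitation to a feature map.

    feature_map: list of C channels, each an H x W nested list of floats.
    w_reduce:    (C // r) x C weight matrix of the first fully connected layer.
    w_expand:    C x (C // r) weight matrix of the second fully connected layer.
    Returns the rescaled feature map and the per-channel gates.
    All weights here are hypothetical; a trained network would learn them.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: bottleneck FC + ReLU, then FC + sigmoid, one gate per channel.
    hidden = [relu(sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_expand]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy example: 2 channels of size 2 x 2, reduction ratio r = 2.
toy_features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
toy_w_reduce = [[0.2, -0.1]]    # shape 1 x 2
toy_w_expand = [[1.0], [-1.0]]  # shape 2 x 1
toy_scaled, toy_gates = se_block(toy_features, toy_w_reduce, toy_w_expand)
```

Each gate lies strictly between 0 and 1, so the module can only attenuate channels relative to one another. CBAM follows a similar idea for channels and adds a spatial attention map, which this sketch does not cover.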
Results and Discussions The proposed cross-modal infrared time-sensitive target data enhancement method combines a style transfer model with a target generation model and uses a visible light image data set to enhance infrared time-sensitive target data. Remote sensing visible light images can be converted into infrared images while preserving size, structure, and field of view, without introducing distortion, noise, or other artifacts. As shown in Fig.6, the generated infrared time-sensitive targets have good texture details and infrared characteristics and are clearly distinguished from the background. An improved Yolov5 target detection model is also proposed: SE and CBAM attention mechanisms are added to the CSP network to enhance the feature expression of the network and better detect infrared time-sensitive targets. Tab.2 shows that, compared with training the deep learning detection network on the original data, the proposed data enhancement algorithm significantly improves the detection of positive samples: the detection accuracy rate, the recall rate, and the average accuracy increase by 14.57%, 5.99%, and 8.82%, respectively. Tab.3 shows that, compared with SSD, Fast R-CNN, and Yolov5, the proposed algorithm is markedly better in accuracy, average accuracy, and F1 index. Compared with the original Yolov5 network, the accuracy rate, the recall rate, the average accuracy, and the F1 index increase by 7.36%, 5.43%, 2.74%, and 6.45%, respectively. Some test results are shown in Fig.9.
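For readers unfamiliar with the reported indices, the sketch below shows how precision, recall, and the F1 index are commonly computed from detection counts. It assumes that the "accuracy rate" above corresponds to precision and that F1 is the usual harmonic mean of precision and recall; the function name `detection_metrics` and the counts in the example are hypothetical and are not taken from Tab.2 or Tab.3.

```python
def detection_metrics(true_positives, false_positives, false_negatives):
    """Return (precision, recall, f1) for one class from detection counts.

    A metric is reported as 0.0 when its denominator is zero.
    """
    detected = true_positives + false_positives   # all predicted boxes
    actual = true_positives + false_negatives     # all ground-truth targets
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    total = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only.
example_precision, example_recall, example_f1 = detection_metrics(
    true_positives=90, false_positives=10, false_negatives=30
)
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two values, so a gain in precision alone raises F1 less than a balanced gain in both.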
Conclusion To address the scarcity of infrared time-sensitive target data and the resulting poor detection performance, we propose an infrared time-sensitive target detection technology based on cross-modal data enhancement. For the two-stage data enhancement model, visible light remote sensing images containing time-sensitive targets are first converted into target images with infrared characteristics using the modality conversion network, and the coordinate attention mechanism is then introduced into the random sample generation model. Finally, a Yolov5 detection technology based on the improved CSP module is proposed. Multiple sets of experimental results show that the detection accuracy of the proposed algorithm is up to 98.06% on the infrared time-sensitive target data set, which mitigates the lack of infrared time-sensitive target data and demonstrates good target detection ability.