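Model performance comparison experiments were carried out on the FLIR dataset with pedestrians and vehicles as the targets. The algorithm was implemented in the PyTorch framework. The training environment was Ubuntu 18.04 with two NVIDIA RTX 8000 GPUs; the batch size was set to 256, the initial learning rate was 0.01, and the models were trained for 300 epochs with a cosine annealing schedule. The hardware test platform was an Intel Core i7-10750H with an NVIDIA Quadro T2000, and final experimental validation was performed on the NVIDIA Jetson Xavier NX embedded platform. To evaluate model performance, ablation experiments were run on the different models, jointly considering four metrics: mAP (mean Average Precision), number of parameters (Parameters), amount of computation (FLOPS), and inference latency (Delay), in order to verify the performance of the different network structures.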
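The learning-rate schedule described above can be sketched in a few lines. This is a minimal illustration, not the authors' training code: it assumes the standard cosine annealing formula without warm restarts, a per-epoch update, and a minimum learning rate of 0 (the text states only the initial rate of 0.01 and the 300-epoch length). The function name `cosine_annealing_lr` and its parameters are hypothetical.

```python
import math


def cosine_annealing_lr(epoch, total_epochs=300, base_lr=0.01, min_lr=0.0):
    """Learning rate at a given epoch under cosine annealing (no warm restarts).

    total_epochs and base_lr follow the setup in the text;
    min_lr = 0.0 is an assumption, since the text does not state it.
    """
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Print the rate at a few points of the 300-epoch run.
for epoch in (0, 75, 150, 225, 300):
    print(f"epoch {epoch:3d}: lr = {cosine_annealing_lr(epoch):.6f}")
```

Under these assumptions the rate starts at 0.01, passes through half that value at the midpoint of training, and decays smoothly to the minimum at epoch 300.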
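For the feature extraction network, ShuffleNetv2 and CSPNet were chosen for comparison with the PMFPSNet designed in this paper. ShuffleNetv2 performs well among lightweight structures, and CSPNet is widely used in mainstream network models. With the same channel dimensions, models were trained with the different feature extraction networks and compared, as shown in Table 1. In the results, PMFPSNet reached an mAP of 0.817 with only 1.57 M parameters and 4.98 GFLOPS of computation. Compared with the CSPNet structure, its mAP is 3 percentage points higher, while the number of parameters and the amount of computation are reduced by 30% and 28%, respectively. Compared with the ShuffleNet structure, its performance advantage is even more pronounced.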
| Model | mAP | Parameters | GFLOPS | Delay/ms |
| --- | --- | --- | --- | --- |
| CSPNet | 0.787 | 2.24 M | 6.87 | 7.18 |
| ShuffleNet | 0.789 | 2.12 M | 6.26 | 7.86 |
| Maxpool | 0.773 | 1.57 M | 4.98 | 7.28 |
| YOLOv4-tiny | 0.811 | 6.27 M | 17.2 | 84.6 |
| PMFPSNet | 0.817 | 1.57 M | 4.98 | 7.34 |

Table 1. Comparison of model performance
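In lightweight networks, Maxpool can perform downsampling with few parameters. With all other structures kept the same, the Slim-Focus and Maxpool downsampling structures were compared, as shown in Table 1. Slim-Focus preserves the features of infrared targets better and loses less information during downsampling; its mAP is 4.4 percentage points higher than that of the Maxpool variant, which verifies the effectiveness of the Slim-Focus downsampling structure.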
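The internals of Slim-Focus are not given in this section, so the sketch below does not reproduce it. It only illustrates, under the assumption that Slim-Focus builds on Focus-style slicing (space-to-depth, as popularized by YOLOv5), why such a scheme can lose less information than 2×2 max pooling: pooling keeps one value out of every four, while slicing moves all four into separate channels. The helper names `max_pool_2x2` and `focus_slice` are hypothetical.

```python
def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a 2-D list (height and width must be even)."""
    h, w = len(x), len(x[0])
    return [
        [max(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
         for j in range(0, w, 2)]
        for i in range(0, h, 2)
    ]


def focus_slice(x):
    """Space-to-depth slicing: return four half-resolution maps,
    one per pixel offset inside each 2x2 block, so no pixel is discarded."""
    h, w = len(x), len(x[0])
    return [
        [[x[i + di][j + dj] for j in range(0, w, 2)] for i in range(0, h, 2)]
        for di, dj in ((0, 0), (1, 0), (0, 1), (1, 1))
    ]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

pooled = max_pool_2x2(image)  # one 2x2 map: 4 of the 16 input values survive
sliced = focus_slice(image)   # four 2x2 maps: all 16 input values survive

print("max pool:", pooled)
print("focus slices:", sliced)
```

In a real network the four slices would be concatenated along the channel axis and passed through a convolution; that step is omitted here.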
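The proposed anchor-free algorithm PMFPSNet was also compared with an anchor-based lightweight network. YOLOv4-tiny is a strong representative of lightweight networks. As Table 1 shows, PMFPSNet is slightly more accurate, while its number of parameters and amount of computation are only 25% and 29% of those of YOLOv4-tiny, and its inference latency is only about 9%. Although YOLOv4-tiny reduces the network width, its convolutions are computed densely, so its computational cost remains high, and its reduced number of output layers limits its performance. The detection results are shown in Figure 10: YOLOv4-tiny misses many targets in dense infrared scenes, whereas PMFPSNet detects small targets better. In summary, compared with the other structures, the lightweight structure proposed in this paper achieves higher accuracy with less computation and fewer parameters while keeping a good inference speed, giving it better overall performance.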
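The relative figures quoted in the comparisons can be rechecked directly from Table 1. The sketch below copies the table values and recomputes the mAP gain (in percentage points) and the relative reductions in parameters, computation, and delay; the dictionary `TABLE_1` and the helper `compare` are illustrative names, not part of the original work.

```python
# Columns: (mAP, parameters in millions, GFLOPS, delay in ms), copied from Table 1.
TABLE_1 = {
    "CSPNet": (0.787, 2.24, 6.87, 7.18),
    "ShuffleNet": (0.789, 2.12, 6.26, 7.86),
    "Maxpool": (0.773, 1.57, 4.98, 7.28),
    "YOLOv4-tiny": (0.811, 6.27, 17.2, 84.6),
    "PMFPSNet": (0.817, 1.57, 4.98, 7.34),
}


def compare(model, baseline, table=TABLE_1):
    """Return the mAP gain in percentage points and the relative reductions (in %)
    of `model` with respect to `baseline`. Negative reductions mean an increase."""
    m_map, m_par, m_flops, m_delay = table[model]
    b_map, b_par, b_flops, b_delay = table[baseline]
    return {
        "map_gain_points": (m_map - b_map) * 100,
        "params_reduction_pct": (1 - m_par / b_par) * 100,
        "flops_reduction_pct": (1 - m_flops / b_flops) * 100,
        "delay_reduction_pct": (1 - m_delay / b_delay) * 100,
    }


for baseline in ("CSPNet", "Maxpool", "YOLOv4-tiny"):
    result = compare("PMFPSNet", baseline)
    print(baseline, {key: round(value, 1) for key, value in result.items()})
```

With these table values, the comparison against YOLOv4-tiny gives reductions of roughly 75%, 71%, and 91% in parameters, computation, and delay, which agrees with the ratios quoted in the text up to rounding. Against CSPNet, the delay of PMFPSNet is slightly higher (7.34 ms versus 7.18 ms), so that entry comes out negative.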
Anchor-free lightweight infrared object detection method (Invited)
doi: 10.3788/IRLA20220193
- Received Date: 2022-03-17
- Rev Recd Date: 2022-04-11
- Accepted Date: 2022-04-11
- Publish Date: 2022-05-06
Key words:
- infrared target
- lightweight
- object detection
- neural network
- asymmetric convolution
Abstract: Based on the characteristics of infrared targets, an anchor-free lightweight infrared target detection method was proposed to improve detection capability on embedded platforms. For platforms with limited computing resources, a new lightweight convolution structure was proposed, in which asymmetric convolution was introduced to enhance the feature expression ability of standard convolution while effectively reducing the number of parameters and the amount of computation. A lightweight feature extraction unit was constructed from parallel multi-feature paths, which generated rich features through channel concatenation combined with an attention module and channel shuffle. A SkipBranch was added to carry shallow information to the higher layers and further enrich the high-level features. Experiments on the FLIR dataset showed that the designed lightweight network reached an accuracy of 81.7%, exceeding YOLOv4-tiny, while the model parameters and the amount of computation were reduced by 75.0% and 71.1%, respectively, and the inference time was shortened by 91.3%, which meets the real-time detection requirements for infrared objects on embedded platforms.