Anchor-free lightweight infrared object detection method (Invited)

Gao Fan; Yang Xiaogang; Lu Ruitao; Wang Siyu; Gao Jiuan; Xia Hai

doi:10.3788/IRLA20220193

Volume 51 Issue 4

May 2022

Gao Fan, Yang Xiaogang, Lu Ruitao, Wang Siyu, Gao Jiuan, Xia Hai. Anchor-free lightweight infrared object detection method (Invited)[J]. Infrared and Laser Engineering, 2022, 51(4): 20220193. doi: 10.3788/IRLA20220193

Gao Fan¹,², Yang Xiaogang², Lu Ruitao², Wang Siyu², Gao Jiuan², Xia Hai²

1. Beijing Huahang Radio Measurement Institute, Beijing 100013, China
2. Missile Engineering Institute, Rocket Force University of Engineering, Xi’an 710025, China

Funds: National Natural Science Foundation of China (61806209); Natural Science Foundation of Shaanxi Province (2020 JQ-490); Chinese Aeronautical Establishment (201851 U8012)

Received Date: 2022-03-17
Rev Recd Date: 2022-04-11
Accepted Date: 2022-04-11
Publish Date: 2022-05-06

Abstract

Based on the characteristics of infrared targets, an anchor-free lightweight infrared target detection method is proposed to improve detection capability on embedded platforms. For platforms with limited computing resources, a new lightweight convolution structure is proposed: asymmetric convolution is introduced to enhance the feature expression ability of standard convolution while effectively reducing the number of parameters and the amount of computation. A lightweight feature extraction unit is constructed from parallel multi-feature paths, which generate rich features through channel concatenation and are then combined with an attention module and channel shuffle. A SkipBranch is added to carry shallow information to the high levels and further enrich high-level features. Experiments on the FLIR dataset show that the designed lightweight network reaches an accuracy of 81.7%, exceeding YOLOv4-tiny, while the model parameters and computation are reduced by 75.0% and 71.1%, respectively, and the inference time is shortened by 91.3%, which meets the requirements of real-time infrared object detection on embedded platforms.
Key words: infrared target; lightweight; object detection; neural network; asymmetric convolution

References

[1]	Howard A G, Zhu M, Chen B, et al. MobileNets: Efficient convolutional neural networks for mobile vision applications [J]. arXiv preprint, 2017: 1704.04861. doi: 10.48550/arXiv.1704.04861
[2]	Sandler M, Howard A, Zhu M, et al. MobileNetV2: Inverted residuals and linear bottlenecks[C]//IEEE/CVF Conference on Computer Vision & Pattern Recognition, 2018: 4510-4520.
[3]	Howard A, Sandler M, Chen B, et al. Searching for MobileNetV3[C]//IEEE/CVF International Conference on Computer Vision, 2019: 1314-1324.
[4]	Hu Jie, Shen Li, Sun Gang, et al. Squeeze-and-excitation networks[C]//IEEE/CVF Conference on Computer Vision & Pattern Recognition, 2018: 7132-7141.
[5]	Zhang X, Zhou X, Lin M, et al. ShuffleNet: An extremely efficient convolutional neural network for mobile devices[C]//IEEE/CVF Conference on Computer Vision & Pattern Recognition, 2018: 6848-6856.
[6]	Ma N, Zhang X, Zheng H T, et al. ShuffleNetV2: Practical guidelines for efficient CNN architecture design[C]//European Conference on Computer Vision, 2018, 11218: 122-138.
[7]	Iandola F N, Han S, Moskewicz M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size [J]. arXiv preprint, 2016: 1602.07360.
[8]	Han K, Wang Y, Tian Q, et al. GhostNet: More features from cheap operations[C]//IEEE/CVF Conference on Computer Vision & Pattern Recognition, 2020: 1577-1586.
[9]	Tan M X, Le Q V. EfficientNet: Rethinking model scaling for convolutional neural networks [J]. arXiv preprint, 2019: 1905.11946. doi: 10.48550/arXiv.1905.11946
[10]	Tan M X, Le Q V. EfficientNetV2: Smaller models and faster training [J]. arXiv preprint, 2021: 2104.00298. doi: 10.48550/arXiv.2104.00298
[11]	Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 39(6): 1137-1149. doi: 10.1109/TPAMI.2016.2577031
[12]	Wang Chen, Zhang Xiufeng, Liu Chao, et al. Detection method of wheel hub weld defects based on the improved YOLOv3 [J]. Optics and Precision Engineering, 2021, 29(8): 1942-1954. (in Chinese) doi: 10.37188/OPE.20212908.1942
[13]	Cheng Yan, Yu Xuelian, Qian Weixian, et al. Ship wake extraction and detection from infrared remote sensing images [J]. Infrared and Laser Engineering, 2022, 51(2): 20210844. (in Chinese) doi: 10.3788/IRLA20210844
[14]	Wang Chunzhe, An Junshe, Jiang Xiujie, et al. Region proposal optimization algorithm based on convolutional neural networks [J]. Chinese Optics, 2019, 12(6): 1348-1361. (in Chinese) doi: 10.3788/CO.20191206.1348
[15]	Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, inception-ResNet and the impact of residual connections on learning[C]//AAAI Conference on Artificial Intelligence, 2017: 4278-4284.
[16]	Zhang Ruiyan, Jiang Xiujie, An Junshe, et al. Design of global-contextual detection model for optical remote sensing targets [J]. Chinese Optics, 2020, 13(6): 1302-1313. (in Chinese) doi: 10.37188/CO.2020-0057
[17]	Li Weipeng, Yang Xiaogang, Li Chuanxiang, et al. Infrared object detection network compression using Lp normalized weight [J]. Infrared and Laser Engineering, 2021, 50(8): 20200510. (in Chinese) doi: 10.3788/IRLA20200510
[18]	Yang Lingxiao, Zhang Ru-Yuan, Li Lida, et al. SimAM: A simple, parameter-free attention module for convolutional neural networks[C]//International Conference on Machine Learning, 2021, 139: 11863-11874.
[19]	Ju Moran, Luo Haibo, Liu Guangqi, et al. Infrared dim and small target detection network based on spatial attention mechanism [J]. Optics and Precision Engineering, 2021, 29(4): 843-853. (in Chinese) doi: 10.37188/OPE.20212904.0843
[20]	Lin T Y, Dollar P, Girshick R, et al. Feature pyramid networks for object detection[C]//IEEE Computer Society Conference on Computer Vision & Pattern Recognition, 2017: 936-944.
[21]	Liu S, Qi L, Qin H, et al. Path aggregation network for instance segmentation[C]//IEEE Conference on Computer Vision & Pattern Recognition, 2018: 8759-8768.
[22]	Tian Z, Shen C, Chen H, et al. FCOS: Fully convolutional one-stage object detection[C]//IEEE/CVF International Conference on Computer Vision, 2019: 9626-9635.

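0. Introduction

As an important branch of computer vision, object detection has made great progress with the in-depth study of the related theory and the wide application of the technology. However, practical applications of artificial intelligence rely heavily on the computing power of high-performance servers. Because the development of hardware such as memory and compute can hardly keep up with the enormous storage and computation demands of ever-advancing neural network models, making models lightweight has become an urgent problem.

Because the hardware resources of end-side and edge devices are limited, practical requirements can be met only by striking a better balance among parameter count, amount of computation, inference speed, and accuracy. In neural network research, designing a reasonable and effective structure so that each layer obtains rich and diverse features is the key to improving network performance. With fewer parameters, lightweight networks struggle to obtain features as rich as those of complex structures.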

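Lightweight neural network design has produced a series of results. MobileNetv1^[1] replaces standard convolution with depthwise separable convolution, and uses a resolution hyperparameter to control the input image resolution and a width hyperparameter to adjust the network width, effectively reducing the number of model parameters; MobileNetv2^[2] improves on MobileNetv1 through the inverted residual block; MobileNetv3^[3] introduces the lightweight attention mechanism SENet^[4] to remodel channel relationships and uses Neural Architecture Search (NAS) to further improve model performance. ShuffleNetv1^[5] proposes pointwise group convolution to reduce computational complexity and introduces Channel Shuffle to improve information flow between different channel groups; ShuffleNetv2^[6], considering how neural networks actually run on hardware, proposes network design guidelines and a new lightweight structure, further improving inference speed on hardware. SqueezeNet^[7] builds the Fire module from 1×1 and 3×3 convolutions, effectively reducing the number of parameters. GhostNet^[8] expands the feature maps generated by convolution through cheap linear operations, reducing parameters and computation. EfficientNet^[9] studies the influence of network depth, width, and resolution on performance and obtains better models through NAS; EfficientNetv2^[10] builds on this by introducing the Fused-MBConv module and speeds up training through a progressive learning strategy. Most of these efficient models rely on depthwise separable convolution and NAS, which places very high demands on computing resources, and they are not fully suited to specific infrared scenes.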
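Two of the building blocks surveyed above lend themselves to a short illustration: the parameter saving of depthwise separable convolution (MobileNet) and the channel reordering of Channel Shuffle (ShuffleNet). The sketch below is a plain-Python illustration, not code from any of the cited works; the function names and the example layer sizes (64 input channels, 128 output channels, 3×3 kernel) are assumptions chosen for the example, and bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def channel_shuffle(channels, groups):
    """Reorder channels: view as (groups, n), transpose to (n, groups), flatten."""
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = len(channels) // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


if __name__ == "__main__":
    # Hypothetical layer sizes, used only for illustration.
    print(standard_conv_params(64, 128, 3))           # 73728
    print(depthwise_separable_params(64, 128, 3))     # 8768
    # Channels 0-2 belong to group 0, channels 3-5 to group 1.
    print(channel_shuffle(list(range(6)), groups=2))  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, each consecutive block of channels contains members of every group, which is how information can flow between groups in the next grouped convolution.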

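Anchor-based algorithms, represented by Faster R-CNN^[11] and the YOLO^[12] series, require cluster analysis of the data before training to determine the optimal anchor boxes. The anchor settings affect model performance and must be readjusted for each task scenario. Anchor-free algorithms eliminate the prior boxes, which effectively alleviates the hyperparameter interference caused by preset anchors and simplifies training; they also avoid the sample imbalance caused by large numbers of negative samples, greatly reduce IoU computation, and lower memory usage and time consumption, making them suitable for real-time, accurate object detection on end-side devices.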
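To make the anchor-free idea concrete, the following minimal sketch shows how an FCOS-style detector^[22] describes a box without any anchor: each location inside a ground-truth box regresses its distances to the four box sides. The centerness score is the one defined in the original FCOS paper; it is included for illustration only and is not necessarily the quality measure used in this article. The function names and the example coordinates are assumptions.

```python
import math


def ltrb_targets(x, y, box):
    """Distances from point (x, y) to the left, top, right and bottom box sides.

    box is (x_min, y_min, x_max, y_max). Returns None when the point is not
    strictly inside the box, i.e. it would be a negative sample for this box.
    """
    x_min, y_min, x_max, y_max = box
    l, t, r, b = x - x_min, y - y_min, x_max - x, y_max - y
    if min(l, t, r, b) <= 0:
        return None
    return l, t, r, b


def decode_box(x, y, ltrb):
    """Recover the box from a location and its four predicted distances."""
    l, t, r, b = ltrb
    return x - l, y - t, x + r, y + b


def centerness(l, t, r, b):
    """FCOS centerness: 1 at the box centre, approaching 0 near the border."""
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


if __name__ == "__main__":
    box = (10, 20, 110, 80)            # hypothetical ground-truth box
    targets = ltrb_targets(50, 40, box)
    print(targets)                     # (40, 20, 60, 40)
    print(decode_box(50, 40, targets)) # (10, 20, 110, 80)
    print(centerness(*targets))
```

No anchor shapes or clustering step appear anywhere: the only inputs are the location and the ground-truth box, which is why the hyperparameters tied to preset anchors disappear.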

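To address the low resolution of infrared images and the indistinct texture features of targets^[13], this paper designs a lightweight feature extraction network that combines asymmetric convolution with standard convolution, improving the ability to express features of targets at different scales while reducing parameters and computation; convolution kernels of different sizes are set on the feature paths to fuse the detail features of different convolution structures; and an attention mechanism and Channel Shuffle are introduced to strengthen feature acquisition and information flow along the channel dimension. To alleviate the loss of detail during downsampling of infrared images, an improved Focus structure is adopted, which also improves inference speed. The SkipBranch structure directly fuses shallow localization information with high-level semantic information, enriching high-level features and strengthening the feature description of the lightweight structure. Experimental results show that the lightweight model achieves high detection accuracy and realizes real-time infrared object detection on an embedded platform with greatly reduced model parameters and computation.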
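Two of the ideas in this paragraph can be sketched in a few lines. First, a pair of asymmetric kernels (1×3 and 3×1) carries fewer weights than one 3×3 kernel. Second, Focus-style downsampling halves the resolution by moving pixels into channels instead of discarding them. The sketch below is an illustration under assumptions: how the article actually combines asymmetric and standard convolution, and how Slim-Focus differs from the plain Focus slicing shown here, are not specified in this excerpt, and the function names and channel counts are hypothetical.

```python
def conv_weights(k_h, k_w, c_in, c_out):
    """Weights in a k_h x k_w convolution (bias ignored)."""
    return k_h * k_w * c_in * c_out


def focus_slice(image):
    """Space-to-depth slicing of a single-channel H x W image (list of rows).

    Returns four (H/2) x (W/2) maps that together keep every input pixel.
    """
    h, w = len(image), len(image[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    offsets = ((0, 0), (1, 0), (0, 1), (1, 1))  # (row offset, column offset)
    return [[row[dx::2] for row in image[dy::2]] for dy, dx in offsets]


if __name__ == "__main__":
    # Hypothetical 64 -> 64 channel layer.
    full = conv_weights(3, 3, 64, 64)
    pair = conv_weights(1, 3, 64, 64) + conv_weights(3, 1, 64, 64)
    print(full, pair)   # 36864 24576

    image = [[0, 1, 2, 3],
             [4, 5, 6, 7],
             [8, 9, 10, 11],
             [12, 13, 14, 15]]
    for plane in focus_slice(image):
        print(plane)
    # [[0, 2], [8, 10]]
    # [[4, 6], [12, 14]]
    # [[1, 3], [9, 11]]
    # [[5, 7], [13, 15]]
```

The four output planes contain all sixteen input values, so the 2× reduction in resolution loses no pixel information, in contrast to strided convolution or max pooling.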

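4. Conclusion

This paper proposes an anchor-free lightweight infrared object detection method. In the model PMFPSNet, the parallel multi-feature-path lightweight convolution structure PMFP improves the feature extraction unit's ability to capture features of targets at different scales; channel fusion then generates rich features while effectively reducing parameters and computation. The parameter-free attention module SimAM and Channel Shuffle improve model performance without adding parameters, the Slim-Focus structure reduces the loss of infrared features during downsampling, and the SkipBranch branch promotes the flow of shallow information to the deeper network, improving learning efficiency. On the basis of the FCOS algorithm, an IoU branch fuses localization and classification information to improve network accuracy. Experimental results show that PMFPSNet achieves higher detection accuracy with greatly reduced parameters and computation, and can better accomplish real-time infrared object detection on embedded platforms. With a streamlined network structure, the designed lightweight model PMFPSNet reaches an mAP of 81.7%, higher than the other lightweight networks. Compared with the anchor-based model, the parameter count and computation fall by 75.0% and 71.1%, respectively, with faster inference.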
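As an illustration of why SimAM^[18] adds no parameters, the sketch below applies a SimAM-style weighting to a single feature-map channel in plain Python. It follows the closed-form weighting of the published SimAM module as commonly implemented (squared deviation from the channel mean, normalized by the channel variance, passed through a sigmoid); the function name, the list-of-rows input format, and the default `e_lambda = 1e-4` are assumptions of this sketch, and it is not the code used in the article.

```python
import math


def simam_channel(feature_map, e_lambda=1e-4):
    """Reweight one channel (list of rows) with a SimAM-style attention map.

    Nothing is learned: the weights depend only on the channel statistics.
    """
    values = [v for row in feature_map for v in row]
    n = len(values) - 1
    if n < 1:
        raise ValueError("channel needs at least two values")
    mean = sum(values) / len(values)
    sq_dev_sum = sum((v - mean) ** 2 for v in values)
    denom = 4.0 * (sq_dev_sum / n + e_lambda)

    def weight(v):
        # Larger deviation from the mean -> larger attention weight.
        score = (v - mean) ** 2 / denom + 0.5
        return 1.0 / (1.0 + math.exp(-score))

    return [[v * weight(v) for v in row] for row in feature_map]


if __name__ == "__main__":
    # A single bright pixel on a dark background, e.g. a small hot target.
    out = simam_channel([[0.0, 0.0], [0.0, 4.0]])
    print(out)
```

Pixels that stand out from the rest of the channel receive a larger weight than uniform background, which matches the intuition of highlighting a warm target against a flat infrared background.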

Model	mAP	Parameters	GFLOPs	Inference time/ms
CSPNet	0.787	2.24 M	6.87	7.18
ShuffleNet	0.789	2.12 M	6.26	7.86
Maxpool	0.773	1.57 M	4.98	7.28
YOLOv4-tiny	0.811	6.27 M	17.2	84.6
PMFPSNet	0.817	1.57 M	4.98	7.34
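The relative reductions quoted in the abstract can be checked against the YOLOv4-tiny and PMFPSNet rows of the table. The short script below recomputes them from the rounded table values; the dictionary layout and the helper name are assumptions of this sketch. From these rounded values it gives roughly 75.0% for parameters, 71.0% for computation, and 91.3% for inference time; the small gap to the reported 71.1% is presumably due to rounding in the table.

```python
# (parameters in millions, GFLOPs, inference time in ms), copied from the table
TABLE = {
    "YOLOv4-tiny": (6.27, 17.2, 84.6),
    "PMFPSNet": (1.57, 4.98, 7.34),
}


def reduction_percent(baseline, value):
    """Relative reduction of value with respect to baseline, in percent."""
    return 100.0 * (baseline - value) / baseline


if __name__ == "__main__":
    labels = ("parameters", "computation", "inference time")
    for label, base, ours in zip(labels, TABLE["YOLOv4-tiny"], TABLE["PMFPSNet"]):
        print(f"{label}: {reduction_percent(base, ours):.1f}% lower")
```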

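Catalog

1.1. Parallel multi-feature-path lightweight convolution structure
1.2. Attention mechanism and Channel Shuffle
1.3. Slim-Focus downsampling
1.4. SkipBranch structure
2.1. PMFPSNet structure
2.2. Anchor-free algorithm implementation
3.1. Performance comparison of feature extraction networks
3.2. Performance comparison of downsampling structures
3.3. Performance comparison of lightweight networks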
