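Infrared object detection network compression using Lp normalized weight

Li Weipeng; Yang Xiaogang; Li Chuanxiang; Lu Ruitao; Xie Xueli; He Chuan

doi: 10.3788/IRLA20200510

Institute of Missile Engineering, Rocket Force Engineering University, Xi’an 710025, Shaanxi, China

Funding: National Natural Science Foundation of China (61806209, 61773389); Natural Science Foundation of Shaanxi Province (2020JQ-490); Aeronautical Science Foundation (201851U8012)

About the authors:
Li Weipeng, male, PhD candidate; research interests include computer vision, deep learning, and pattern recognition.

Yang Xiaogang, male, professor, doctoral supervisor, PhD; research interests include visual navigation, object detection, and image processing.

CLC number: TP391.4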

Abstract: Infrared images carry less texture than RGB images. To address this characteristic, an infrared object detection network compression method using Lp normalized weights is proposed. It aims to improve the adaptability of convolutional neural network (CNN) based object detection to infrared scenes, compressing the network while improving its generalization ability. First, the phenomenon that the sparsity of Lp normalized weights can be precisely controlled by adjusting p is described. Based on this phenomenon, a sparse training method for object detection networks is proposed: the backbone network is trained with Lp spherical gradient descent and the detector with classical gradient descent, to balance network scale against fitting accuracy. Tests on a simulated infrared dataset show that the sparse models outperform the dense models in both network scale and detection accuracy. In terms of network scale, sparsification reduces the effective parameters of Faster R-CNN, Single Shot multibox Detector (SSD) and YOLOv3 by 52%, 78% and 66%, respectively. In terms of detection accuracy, it raises the mean Average Precision (mAP) of Faster R-CNN, SSD and YOLOv3 by 0.1, 0.3 and 0.2 percentage points, respectively, which verifies the effectiveness of the proposed method.

Keywords:
- infrared object detection
- sparse neural network
- Lp normalization
- constrained gradient descent

Figure 1. Weight distribution of a single neuron with respect to p

Figure 2. Sparsity of convolutional-layer weights with respect to p

Figure 3. Training process of the sparse neural network for object detection

Figure 4. Comparison of infrared object detection results between classical and sparse neural networks

Figure 5. Comparison of the convergence of SGD and LpSGD

Table 1. Simulated infrared dataset

Classification	Training	Test	Total
Class 1	208	28	236
Class 2	210	26	236
Class 3	219	30	249
Class 4	192	29	221
Total	829	113	942


Table 2. Object detection models and results on the simulated infrared dataset

Method	Model	Scale: backbone	Scale: detector	AP (Class 1)	AP (Class 2)	AP (Class 3)	AP (Class 4)	mAP
Faster R-CNN	Dense	26 852 416	14 511 140	0.912	0.885	0.927	0.972	0.925
Faster R-CNN	Sparse	5 337 352	14 511 130	0.910	0.875	0.936	0.982	0.926
SSD300	Dense	22 943 936	1 202 958	0.893	0.879	0.914	0.965	0.914
SSD300	Sparse	4 103 396	1 202 958	0.889	0.867	0.924	0.981	0.917
YOLOv3	Dense	55 294 688	6 245 196	0.914	0.898	0.919	0.972	0.926
YOLOv3	Sparse	14 829 742	6 245 196	0.906	0.895	0.927	0.984	0.928
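The reduction percentages quoted in the abstract can be cross-checked against the parameter counts in Table 2. The short sketch below does this under one assumption that the excerpt does not state explicitly: that "effective parameters" means the backbone and detector counts added together. The names `TABLE2` and `reduction_percent` are illustrative only.

```python
# Parameter counts (backbone, detector) copied from Table 2.
TABLE2 = {
    "Faster R-CNN": {"dense": (26_852_416, 14_511_140), "sparse": (5_337_352, 14_511_130)},
    "SSD300": {"dense": (22_943_936, 1_202_958), "sparse": (4_103_396, 1_202_958)},
    "YOLOv3": {"dense": (55_294_688, 6_245_196), "sparse": (14_829_742, 6_245_196)},
}


def reduction_percent(dense, sparse):
    """Relative reduction of the total (backbone + detector) count, in percent.

    Assumption: 'effective parameters' is the sum of both modules.
    """
    dense_total = sum(dense)
    sparse_total = sum(sparse)
    return 100.0 * (dense_total - sparse_total) / dense_total


reductions = {
    name: reduction_percent(counts["dense"], counts["sparse"])
    for name, counts in TABLE2.items()
}

for name, value in reductions.items():
    print(f"{name}: {value:.1f}% fewer parameters than the dense model")
```

Under that assumption the rounded values come out as 52, 78 and 66 for Faster R-CNN, SSD300 and YOLOv3, which agrees with the figures reported in the abstract.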

Table 3. Object detection models and results on the VOC2007 dataset

Metric	Item	Faster R-CNN (Dense)	Faster R-CNN (Sparse)	SSD300 (Dense)	SSD300 (Sparse)	YOLOv3 (Dense)	YOLOv3 (Sparse)
Nonzero parameters	Backbone	26 852 416	15 756 216	22 943 936	14 995 952	55 294 688	37 291 638
Nonzero parameters	Detector	14 593 140	14 593 140	3 341 550	3 341 550	6 331 357	6 331 357
AP	Aero	0.833	0.826	0.854	0.847	0.801	0.802
AP	Bike	0.781	0.773	0.798	0.795	0.848	0.845
AP	Bird	0.735	0.737	0.702	0.712	0.716	0.726
AP	Boat	0.532	0.528	0.568	0.543	0.652	0.641
AP	Bottle	0.487	0.493	0.457	0.474	0.638	0.647
AP	Bus	0.774	0.765	0.790	0.781	0.861	0.858
AP	Car	0.745	0.748	0.757	0.752	0.858	0.859
AP	Cat	0.887	0.872	0.756	0.765	0.847	0.857
AP	Chair	0.449	0.443	0.871	0.865	0.547	0.541
AP	Cow	0.765	0.771	0.524	0.542	0.715	0.725
AP	Table	0.548	0.536	0.768	0.764	0.690	0.681
AP	Dog	0.865	0.857	0.605	0.612	0.828	0.827
AP	Horse	0.817	0.825	0.868	0.874	0.842	0.846
AP	Mbike	0.804	0.798	0.824	0.846	0.821	0.831
AP	Person	0.794	0.782	0.820	0.811	0.807	0.802
AP	Plant	0.391	0.387	0.458	0.447	0.441	0.437
AP	Sheep	0.723	0.725	0.752	0.747	0.696	0.688
AP	Sofa	0.608	0.595	0.691	0.698	0.699	0.696
AP	Train	0.809	0.814	0.809	0.812	0.825	0.834
AP	Tv	0.612	0.607	0.672	0.667	0.718	0.722
mAP		0.698	0.694	0.717	0.718	0.742	0.743
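As a small worked check of how the mAP row relates to the per-class rows, the sketch below recomputes the Faster R-CNN mAP values of Table 3. It assumes that mAP here is the unweighted mean of the 20 per-class AP values; the excerpt does not spell out the evaluation protocol, so this is an assumption. The names `AP_DENSE`, `AP_SPARSE` and `mean_ap` are illustrative.

```python
# Per-class AP of Faster R-CNN on VOC2007, copied from Table 3
# (row order: Aero, Bike, Bird, ..., Train, Tv).
AP_DENSE = [0.833, 0.781, 0.735, 0.532, 0.487, 0.774, 0.745, 0.887, 0.449, 0.765,
            0.548, 0.865, 0.817, 0.804, 0.794, 0.391, 0.723, 0.608, 0.809, 0.612]
AP_SPARSE = [0.826, 0.773, 0.737, 0.528, 0.493, 0.765, 0.748, 0.872, 0.443, 0.771,
             0.536, 0.857, 0.825, 0.798, 0.782, 0.387, 0.725, 0.595, 0.814, 0.607]


def mean_ap(ap_values):
    """Unweighted mean over classes (assumed definition of mAP)."""
    return sum(ap_values) / len(ap_values)


map_dense = mean_ap(AP_DENSE)
map_sparse = mean_ap(AP_SPARSE)

# Change in mAP expressed in percentage points (sparse minus dense).
delta_points = 100.0 * (map_sparse - map_dense)

print(f"dense mAP  = {map_dense:.4f}")
print(f"sparse mAP = {map_sparse:.4f}")
print(f"change     = {delta_points:+.2f} percentage points")
```

With this definition both recomputed means fall within rounding distance of the tabulated 0.698 and 0.694, and the difference is about −0.4 percentage points.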

[1]	Lienhart R, Maydt J. An extended set of Haar-like features for rapid object detection[C]//International Conference on Image Processing, 2002.
[2]	Dalal N, Triggs B. Histograms of oriented gradients for human detection[C]//IEEE Computer Society Conference on Computer Vision & Pattern Recognition, 2005.
[3]	Lowe D G. Distinctive image features from scale-invariant keypoints [J]. International Journal of Computer Vision, 2004, 60(2): 91-110. doi: 10.1023/B:VISI.0000029664.99615.94
[4]	Bay H, Ess A, Tuytelaars T, et al. Speeded-up robust features [J]. Computer Vision and Image Understanding, 2008, 110(3): 346-359. doi: 10.1016/j.cviu.2007.09.014
[5]	Li X, Wang L, Sung E. AdaBoost with SVM-based component classifiers [J]. Engineering Applications of Artificial Intelligence, 2008, 21(5): 785-795. doi: 10.1016/j.engappai.2007.07.001
[6]	Felzenszwalb P F, Huttenlocher D P. Pictorial structures for object recognition [J]. International Journal of Computer Vision, 2005, 61(1): 55-79. doi: 10.1023/B:VISI.0000042934.15159.49
[7]	Felzenszwalb P F, Girshick R B, McAllester D. Cascade object detection with deformable part models[C]//2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010: 2241-2248.
[8]	Felzenszwalb P F, Girshick R B, McAllester D A. Visual object detection with deformable part models[C]//The Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010.
[9]	Zhang Xiuling, Hou Daibiao, Zhang Chengcheng, et al. Design of MPCANet fire image recognition model for deep learning [J]. Infrared and Laser Engineering, 2018, 47(2): 0203006. (in Chinese) doi: 10.3788/IRLA201847.0203006
[10]	Gong Junliang, He Xin, Wei Zhonghui, et al. Infrared dim and small target detection method using scale-space theory [J]. Infrared and Laser Engineering, 2013, 42(9): 2566-2573. (in Chinese) doi: 10.3969/j.issn.1007-2276.2013.09.048
[11]	Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 39(6): 1137-1149.
[12]	Dai J, Li Y, He K, et al. R-FCN: Object detection via region-based fully convolutional networks[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems. 2016: 379-387.
[13]	Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision & Pattern Recognition, 2016: 779-788.
[14]	Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector[C]//European Conference on Computer Vision, 2016: 21-37.


0. Introduction

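Object detection is a prerequisite for high-level vision tasks such as scene understanding, and it is widely used in intelligent video surveillance, content-based image retrieval, and visual navigation. Traditional object detection relies mainly on hand-crafted features (e.g., Haar^[1], HOG^[2], SIFT^[3], SURF^[4]) and applies a classifier over sliding windows; representative methods are AdaBoost-SVM^[5] and the deformable part model (DPM)^[6-8]. These methods pioneered practical object detection and are widely used in portable devices and robotics. However, limited by the performance of hand-crafted features, their accuracy has remained low, and they usually generalize poorly to new images.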

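Compared with traditional methods, object detection based on convolutional neural networks (CNNs) has a significant advantage in accuracy. A CNN fits a wide variety of cases with a large number of parameters and abstracts object information step by step through its multi-layer architecture, which greatly improves the generalization ability of object detection. However, current research on CNN-based object detection concentrates on multi-channel images such as RGB images, and comparatively little of it addresses infrared object detection. Moreover, existing work on infrared object detection mostly targets specific types of objects (e.g., fire^[9]) or dim and small targets^[10], while multi-class infrared object detection remains under-studied.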

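A CNN-based object detection architecture consists of two modules: the CNN backbone network (Backbone) and the detector network (Detector). The backbone performs multi-layer feature extraction, and the detector outputs object locations and classes. Because the main difference between infrared images and multi-channel images such as RGB images lies at the level of image features, the key to infrared object detection is optimizing the CNN backbone, while the detector can reuse existing methods such as Faster R-CNN^[11], R-FCN^[12], YOLO^[13], and SSD^[14]. Compared with RGB images, infrared images have two characteristics: first, annotated data are scarce, so training samples are relatively few; second, infrared images contain far less texture information than RGB images. As a result, the number of parameters that a CNN can train effectively on infrared images is far lower than on RGB images, so redundant parameters need to be removed through constraints and online pruning to avoid overfitting. Since Lp normalization of the network weights can effectively control the sparsity of a neural network, this paper proposes an infrared object detection network compression method using Lp normalized weights. The method is intended to improve the adaptability of CNN-based detection architectures to infrared object detection, compressing the network while improving its generalization ability. Experimental results show that the method significantly reduces the number of weights in the infrared object detection network and also improves detection accuracy on the test set, verifying its effectiveness.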
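To make the notion of Lp normalized weights concrete, here is a minimal sketch of rescaling one weight vector onto the unit Lp sphere. It only illustrates the constraint itself: the sparsity effect described in the paper arises when a network is trained under this constraint, which this excerpt does not detail and which the sketch does not reproduce. The function names `lp_norm` and `lp_normalize` and the random test vector are assumptions made for illustration.

```python
import random


def lp_norm(weights, p):
    """(sum_i |w_i|^p)^(1/p); a norm for p >= 1, a quasi-norm for 0 < p < 1."""
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(abs(w) ** p for w in weights) ** (1.0 / p)


def lp_normalize(weights, p):
    """Rescale a weight vector so that its Lp norm equals 1."""
    norm = lp_norm(weights, p)
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero weight vector")
    return [w / norm for w in weights]


random.seed(0)
# Hypothetical weights of a single neuron, drawn from a standard normal.
w = [random.gauss(0.0, 1.0) for _ in range(64)]

# The same vector rescaled onto the unit sphere for several values of p.
normalized = {p: lp_normalize(w, p) for p in (0.5, 1.0, 2.0)}

for p, w_p in normalized.items():
    print(f"p = {p}: Lp norm after normalization = {lp_norm(w_p, p):.6f}")
```

Rescaling alone never sets a weight to zero; it only fixes the vector's Lp norm, so any sparsity has to come from the optimization that runs under this constraint.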

4. Conclusion

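This paper proposes an infrared object detection network compression method using Lp normalized weights, intended to improve the adaptability of CNN-based object detection architectures to infrared images, compressing the network while improving its generalization ability. The paper first describes the phenomenon that the sparsity of Lp normalized weights can be precisely controlled through p, and on that basis proposes a sparse training method for object detection networks. The method trains the backbone network with Lp spherical gradient descent and the detector with classical gradient descent, to balance network scale against fitting accuracy. In the experiments on the simulated infrared dataset, the sparse models outperform the dense models in both network scale and detection accuracy. In terms of network scale, sparsification reduces the effective parameters of Faster R-CNN, SSD and YOLOv3 by 52%, 78% and 66%, respectively, which greatly compresses the object detection networks. In terms of detection accuracy, it raises the mAP of Faster R-CNN, SSD and YOLOv3 by 0.1, 0.3 and 0.2 percentage points, respectively. In the experiments on the VOC2007 dataset, sparsification reduces the effective parameters of Faster R-CNN, SSD and YOLOv3 by 27%, 30% and 29%, respectively, and changes their mAP by −0.4, +0.1 and +0.1 percentage points. Future work will study the low-rank property of infrared image features and combine Lp normalization with low-rank decomposition to further compress the effective parameters and improve performance.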
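The split training scheme summarized above (a constrained update for the backbone, a plain update for the detector) can be sketched as follows. This is not the authors' Lp spherical gradient descent algorithm, whose details (Section 2.2) are not part of this excerpt. As a stand-in, the backbone update here takes an ordinary gradient step and then rescales the weights back onto the unit Lp sphere. All names (`constrained_step`, `sgd_step`, `train_step`) and the toy numbers are hypothetical.

```python
def lp_norm(weights, p):
    """(sum_i |w_i|^p)^(1/p) for p > 0."""
    return sum(abs(w) ** p for w in weights) ** (1.0 / p)


def sgd_step(weights, grads, lr):
    """Classical gradient descent update (used for the detector)."""
    return [w - lr * g for w, g in zip(weights, grads)]


def constrained_step(weights, grads, lr, p):
    """Stand-in for a sphere-constrained update (used for the backbone).

    Assumption: a plain gradient step followed by radial rescaling onto the
    unit Lp sphere. The paper's actual update rule may differ.
    """
    stepped = sgd_step(weights, grads, lr)
    norm = lp_norm(stepped, p)
    if norm == 0.0:
        raise ValueError("step landed on the origin; cannot rescale")
    return [w / norm for w in stepped]


def train_step(params, grads, lr, p):
    """Apply a different update rule to each module of the detection network."""
    return {
        "backbone": constrained_step(params["backbone"], grads["backbone"], lr, p),
        "detector": sgd_step(params["detector"], grads["detector"], lr),
    }


# Toy parameters and gradients (hypothetical values, not from the paper).
params = {"backbone": [0.5, 0.3, 0.2], "detector": [1.0, -2.0]}
grads = {"backbone": [0.1, -0.2, 0.05], "detector": [0.5, 0.5]}

new_params = train_step(params, grads, lr=0.1, p=1.0)
print("backbone Lp norm:", lp_norm(new_params["backbone"], 1.0))
print("detector weights:", new_params["detector"])
```

The design point the sketch tries to capture is only the division of labor: the norm constraint applies to the backbone weights, while the detector weights stay unconstrained.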

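Contents

2.1. Training scheme for the object detection network

2.2. Lp spherical gradient descent algorithm

3.1. Comparison of object detection results on the simulated infrared dataset

3.2. Comparison of object detection results on the RGB dataset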
