An improved Capsule and its application in target recognition of SAR images

Zhang Panpan; Luo Haibo; Ju Moran; Hui Bin; Chang Zheng

doi:10.3788/IRLA20201010

Zhang Panpan^1,2,3,4,5, Luo Haibo^1,2,4,5, Ju Moran^1,2,3,4,5, Hui Bin^1,2,4,5, Chang Zheng^1,2,4,5

1. Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
2. Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China
3. University of Chinese Academy of Sciences, Beijing 100049, China
4. Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang 110016, China
5. The Key Lab of Image Understanding and Computer Vision, Liaoning Province, Shenyang 110016, China

About the authors:
Zhang Panpan (1992-), PhD candidate, working mainly on pattern recognition and intelligent systems. Email: zhangpanpan@sia.cn

Luo Haibo (1967-), male, research professor, PhD, working mainly on image processing, pattern recognition, and intelligent systems. Email: luohb@sia.cn

Abstract: To address the sharp growth in computation and parameter count that the Capsule network incurs as the input image grows, the Capsule network is improved and applied to SAR automatic target recognition (SAR-ATR). Based on the mechanism by which the visual cortex of the brain processes information in a hierarchical structure and in columnar form, the idea of complete instantiation is proposed, and brain-inspired computing is used to improve the Capsule network. Specifically, multiple convolutional layers are used to realize hierarchical processing. Fewer convolution kernels are used overall, but the number of kernels per layer increases gradually with depth, so that the extracted features become progressively more abstract. In the PrimaryCaps layer, each Capsule vector is composed of all the feature maps output by the last convolutional layer, so that a Capsule unit contains all the features of a part of the target or of the whole target, which achieves complete instantiation of the target. On SAR-ATR, the improved Capsule network is compared experimentally with the original Capsule network, with traditional target recognition algorithms, and with target recognition algorithms based on classical convolutional neural networks. The experimental results show that the improved Capsule network has far fewer training parameters, requires far less computation, and trains much faster. Its recognition accuracy on the SAR image dataset is 0.37 percentage points higher than that of the Capsule network and 1.96 to 8.96 percentage points higher than that of the other two classes of methods.

Key words: target recognition; Capsule network; complete instantiation; brain-inspired computing; convolutional neural networks
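The complete-instantiation idea described in the abstract can be illustrated with a short NumPy sketch: every spatial position of the last convolutional layer's output becomes one Capsule whose vector holds the values of all feature maps at that position. This is an illustrative sketch only, not the authors' implementation. The helper name `primary_caps_from_all_maps`, the 64-channel 6×6 feature map, and the use of the squashing nonlinearity from the dynamic-routing paper^[4] are assumptions made for the example.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squashing nonlinearity from the dynamic-routing paper [4]:
    keeps the direction of s and maps its length into [0, 1)."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def primary_caps_from_all_maps(feature_maps):
    """Hypothetical helper. feature_maps has shape (C, H, W).
    Returns H*W capsules, each a C-dimensional vector built from
    the values of every feature map at one spatial position."""
    c, h, w = feature_maps.shape
    caps = feature_maps.reshape(c, h * w).T   # shape (H*W, C)
    return squash(caps)

# Assumed sizes, for illustration only (not taken from the paper).
rng = np.random.default_rng(0)
fmap = rng.standard_normal((64, 6, 6))
caps = primary_caps_from_all_maps(fmap)
lengths = np.linalg.norm(caps, axis=1)        # one length per capsule, each below 1
```

By contrast, the PrimaryCaps layer of the original network^[4] splits the channels at each position into several lower-dimensional Capsule types, so more Capsules are passed on to routing for the same feature-map size.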

Figure 1. Structure of a Capsule unit

Figure 2. Reconstruction layers of the Capsule network

Figure 3. Structures of the original Capsule network and the improved Capsule network

Figure 4. Optical images and their corresponding MSTAR SAR images: (a) and (b) BMP2, BTR70, T72, BTR60, and 2S1; (c) and (d) BRDM2, D7, T62, ZIL131, and ZSU23/4

Figure 5. Reconstruction results of the improved Capsule network: (a) original image, (b) target image, (c) reconstructed image

Figure 6. Reconstruction error curve and training loss curve of the improved Capsule network during training: (a) reconstruction error curve, (b) training loss curve

Table 1. Performance comparison of the improved Capsule network and the Capsule network

Model	Model size (parameters)	Training time per epoch	BFLOPs
Capsule	33.73 M	2 min 8 s	33.519
Improved Capsule	21.65 M	1 min 3 s	1.078
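The reduction factors quoted in the conclusion follow directly from Table 1. The short calculation below reproduces them; the variable and function names are illustrative, and the numbers are copied from the table.

```python
# Values copied from Table 1.
params_m = {"Capsule": 33.73, "Improved Capsule": 21.65}           # millions of parameters
bflops = {"Capsule": 33.519, "Improved Capsule": 1.078}            # billions of FLOPs
epoch_s = {"Capsule": 2 * 60 + 8, "Improved Capsule": 1 * 60 + 3}  # seconds per epoch

def ratio(column):
    """Original-to-improved ratio for one column of Table 1."""
    return column["Capsule"] / column["Improved Capsule"]

param_ratio = ratio(params_m)
flop_ratio = ratio(bflops)
speedup = ratio(epoch_s)
print(f"parameters: {param_ratio:.2f}x, BFLOPs: {flop_ratio:.2f}x, epoch time: {speedup:.2f}x")
# prints "parameters: 1.56x, BFLOPs: 31.09x, epoch time: 2.03x"
```

These ratios correspond to the roughly 1.6-fold and 31-fold reductions and the doubling of training speed stated in the conclusion.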

Table 3. Confusion matrix of the 10-class target recognition results of the original Capsule network (recognition rate: 98.48%; entries in %)

Class	BMP2sn-9563	BTR70	T72sn-132	BTR60	2S1	BRDM2	D7	T62	ZIL131	ZSU23/4
BMP2sn-9563	96.92	0.51	2.57	0	0	0	0	0	0	0
BTR70	0	100.00	0	0	0	0	0	0	0	0
T72sn-132	0	0	100.00	0	0	0	0	0	0	0
BTR60	0	0	0	98.46	0	0.51	0	0	0	1.03
2S1	0	0	0	2.92	94.16	1.46	0	0.73	0.73	0
BRDM2	0	0.365	0	0.73	0	97.45	0	0	1.09	0.365
D7	0	0.73	0	0	0	0	99.27	0	0	0
T62	0	0	0	0	0.73	0	0	98.90	0	0.37
ZIL131	0	0	0	0	0	0	0	0	100.00	0
ZSU23/4	0	0	0	0	0	0	0.36	0	0	99.64

Table 4. Confusion matrix of the 10-class target recognition results of the improved Capsule network (recognition rate: 98.85%; entries in %)

Class	BMP2sn-9563	BTR70	T72sn-132	BTR60	2S1	BRDM2	D7	T62	ZIL131	ZSU23/4
BMP2sn-9563	96.41	0	3.59	0	0	0	0	0	0	0
BTR70	0	100.00	0	0	0	0	0	0	0	0
T72sn-132	0	0	100.00	0	0	0	0	0	0	0
BTR60	0	0	0	98.97	0	0	0	0	0	1.03
2S1	0	0	0	2.555	96.35	1.095	0	0	0	0
BRDM2	0	0	0	0.73	0	98.54	0.365	0	0.365	0
D7	0.365	0	0	0	0	0.365	99.27	0	0	0
T62	0	0	0	0	0	0	0	99.63	0	0.37
ZIL131	0	0	0	0	0	0	0.36	0	99.64	0
ZSU23/4	0	0	0	0	0	0	0.36	0	0	99.64
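The overall recognition rates given in the captions of Tables 3 and 4 appear to be the unweighted mean of the per-class rates on each diagonal. The check below is a sketch under that assumption; the list and function names are illustrative, and the values are the diagonal entries copied from the two tables in class order.

```python
# Diagonal entries (per-class recognition rate, %) of Tables 3 and 4, in class order.
diag_capsule = [96.92, 100.00, 100.00, 98.46, 94.16, 97.45, 99.27, 98.90, 100.00, 99.64]
diag_improved = [96.41, 100.00, 100.00, 98.97, 96.35, 98.54, 99.27, 99.63, 99.64, 99.64]

def mean_rate(diag):
    """Unweighted mean of the per-class recognition rates."""
    return sum(diag) / len(diag)

rate_capsule = mean_rate(diag_capsule)      # about 98.48
rate_improved = mean_rate(diag_improved)    # about 98.845

# Per-class change in percentage points (improved minus original).
per_class_gain = [b - a for a, b in zip(diag_capsule, diag_improved)]
```

Under this reading the two means agree with the quoted 98.48% and 98.85% to within rounding, and the largest per-class gain is for 2S1 (94.16% to 96.35%).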

Table 2. Raw SAR dataset used for training and testing in the experiments

Class	BMP2sn-9563	BTR70	T72sn-132	BTR60	2S1	BRDM2	D7	T62	ZIL131	ZSU23/4
Train samples ($17^\circ$)	117	117	116	128	150	149	150	150	150	150
Test samples ($15^\circ$)	195	196	196	195	274	274	274	273	274	274
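Summing the rows of Table 2 gives the sizes of the training and test sets. The training total matches the 1 377 training images listed for CapsNet and Improved CapsNet in Table 5. The variable names below are illustrative; the counts are copied from Table 2 in class order.

```python
# Per-class sample counts copied from Table 2, in class order.
train_counts = [117, 117, 116, 128, 150, 149, 150, 150, 150, 150]
test_counts = [195, 196, 196, 195, 274, 274, 274, 273, 274, 274]

total_train = sum(train_counts)   # 1377
total_test = sum(test_counts)     # 2425
print(f"train: {total_train}, test: {total_test}")
```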

Table 5. Recognition performance of different methods

Methods	Recognition rate (SOC)	Training images
SVM^[7]	90.10%	3 670
AdaBoost^[7]	92.70%	3 670
DCNN^[9]	92.30%	3 671
DCNN^[8]	94.56%	2 747
IGT^[7]	95.00%	3 670
CGM^[10]	97.18%	3 670
2-VDCNN^[11]	97.81%	1 377
CapsNet	98.48%	1 377
Improved CapsNet	98.85%	1 377
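The comparison figures can be recomputed from Table 5: the gain of the improved network over each method in percentage points, and how much smaller its training set is. The sketch below uses illustrative names and the values as they appear in the table.

```python
# (method, recognition rate in %, training images), copied from Table 5.
rows = [
    ("SVM", 90.10, 3670), ("AdaBoost", 92.70, 3670), ("DCNN [9]", 92.30, 3671),
    ("DCNN [8]", 94.56, 2747), ("IGT", 95.00, 3670), ("CGM", 97.18, 3670),
    ("2-VDCNN", 97.81, 1377), ("CapsNet", 98.48, 1377),
]
improved_rate, improved_images = 98.85, 1377

gains = {}        # percentage points gained by the improved network
reductions = {}   # % fewer training images used by the improved network
for name, rate, images in rows:
    gains[name] = improved_rate - rate
    reductions[name] = 100.0 * (1.0 - improved_images / images)
    print(f"{name:10s} gain {gains[name]:5.2f} pts, "
          f"training set smaller by {reductions[name]:4.1f}%")
```

The gains over CapsNet (0.37) and 2-VDCNN (1.04) and the training-set reductions of 62.5% (3 670 images) and 49.9% (2 747 images) match the figures in the conclusion. Computed from the table as it appears here, the gains over the remaining six methods run from 1.67 to 8.75 points, which does not exactly reproduce the quoted 1.96 to 8.96 range; the reason is not recoverable from this page.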

[1]	Yang Nan, Nan Lin, Zhang Dingyi, et al. Research on image interpretation based on deep learning [J]. Infrared and Laser Engineering, 2018, 47(2): 0203002. (in Chinese)
[2]	Cohen T, Welling M. Group equivariant convolutional networks [C]//International Conference on Machine Learning, 2016: 2990-2999.
[3]	Cohen T S, Geiger M, Köhler J, et al. Spherical CNNs [J]. arXiv preprint arXiv: 1801.10130, 2018.
[4]	Sabour S, Frosst N, Hinton G E. Dynamic routing between capsules [J]. arXiv preprint arXiv: 1710.09829, 2017: 1-11.
[5]	Hinton G E, Sabour S, Frosst N. Matrix capsules with EM routing [C]//6th International Conference on Learning Representations (ICLR), 2018: 1-15.
[6]	Hinton G E, Krizhevsky A, Wang S D. Transforming auto-encoders [C]//International Conference on Artificial Neural Networks. Berlin, Heidelberg: Springer, 2011: 44-51.
[7]	Srinivas U, Monga V, Raj R G. SAR automatic target recognition using discriminative graphical models [J]. IEEE Transactions on Aerospace and Electronic Systems, 2014, 50(1): 591-606. doi: 10.1109/TAES.2013.120340
[8]	Ding J, Chen B, Liu H, et al. Convolutional neural network with data augmentation for SAR target recognition [J]. IEEE Geoscience and Remote Sensing Letters, 2016, 13(3): 364-368.
[9]	Morgan D A E. Deep convolutional neural networks for ATR from SAR imagery [C]//Algorithms for Synthetic Aperture Radar Imagery XXII. International Society for Optics and Photonics, 2015, 9475: 94750F.
[10]	O'Sullivan J A, DeVore M D, Kedia V, et al. SAR ATR performance using a conditionally Gaussian model [J]. IEEE Transactions on Aerospace and Electronic Systems, 2001, 37(1): 91-108. doi: 10.1109/7.913670
[11]	Pei J, Huang Y, Huo W, et al. SAR automatic target recognition based on multiview deep learning framework [J]. IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(4): 2196-2210. doi: 10.1109/TGRS.2017.2776357
[12]	Kandel E R, Schwartz J H, Jessell T M, et al. Principles of Neural Science [M]. 5th ed. Beijing: China Machine Press, 2013.
[13]	Zhao Q, Principe J C. Support vector machines for SAR automatic target recognition [J]. IEEE Transactions on Aerospace and Electronic Systems, 2001, 37(2): 643-654. doi: 10.1109/7.937475
[14]	Sun Y, Liu Z, Todorovic S, et al. Adaptive boosting for SAR automatic target recognition [J]. IEEE Transactions on Aerospace and Electronic Systems, 2007, 43(1): 112-125.


0. Introduction

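As a deep learning model, the convolutional neural network has made great progress in image classification, computer vision, and related fields^[1]. However, convolutional neural networks lack the ability to handle the relative positions of targets and the spatial relationships among them. As a result, recognizing targets from different viewpoints requires more training samples or more network structures^[2-3]. To address these problems of classical convolutional neural networks, a new deep learning network, Capsule^[4-5], was proposed. A Capsule network contains many Capsule units, and each Capsule unit is a multidimensional vector composed of multiple features that represents an object as a whole or a part of it. A Capsule outputs both its own activation probability and the instantiation parameters that describe its properties, which include pose, deformation, orientation, and texture^[6]. During propagation, the coupling (coincidence) filtering principle is used to activate higher-level Capsules and to establish part-whole spatial relationships between Capsules. The Capsule network adjusts these part-whole relationships through the dynamic routing mechanism^[4-5], which enables it to handle the relative positions and spatial relationships of targets better. However, because the Capsule network consumes large amounts of computation and storage, the images it processes are generally no larger than 32×32. This severely limits its application in the military domain. In SAR-ATR, for example, target chips in synthetic aperture radar (SAR) images are generally larger than 128×128, and visible-light and infrared images are larger still, so the computation and storage required for target recognition with a Capsule network become a challenge.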
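To make the scaling problem concrete, the following sketch counts the Capsules and routing weights of the original architecture from the dynamic-routing paper^[4] (a 9×9 convolution with stride 1, a 9×9 PrimaryCaps convolution with stride 2, 32 Capsule types of dimension 8, and output Capsules of dimension 16) as the input side length grows. The function name is hypothetical, 10 output classes are assumed to mirror the 10 target classes, and convolution weights, biases, and the reconstruction decoder are ignored. It is a rough estimate, not a measurement from the paper.

```python
def capsnet_routing_cost(side, n_classes=10, kernel=9, caps_types=32,
                         in_dim=8, out_dim=16):
    """Count PrimaryCaps capsules and routing weights for a side x side input,
    assuming the layer sizes of the original dynamic-routing architecture [4]."""
    conv_side = side - kernel + 1                # first conv: stride 1, no padding
    caps_side = (conv_side - kernel) // 2 + 1    # PrimaryCaps conv: stride 2, no padding
    n_caps = caps_side * caps_side * caps_types
    n_weights = n_caps * n_classes * in_dim * out_dim
    return n_caps, n_weights

for side in (28, 32, 64, 128):
    n_caps, n_weights = capsnet_routing_cost(side)
    print(f"{side:3d}x{side:<3d} -> {n_caps:6d} capsules, "
          f"{n_weights / 1e6:7.2f} M routing weights")
```

Under these assumptions a 28×28 input yields 1 152 Capsules and about 1.47 M routing weights, while a 128×128 input yields 100 352 Capsules and about 128.45 M routing weights, roughly 87 times as many.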

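Based on brain mechanisms, this paper proposes an improved algorithm for the Capsule network that greatly reduces the amount of computation and the number of parameters. On SAR-ATR, its accuracy is also clearly higher than that of the Capsule network, SVM^[7], AdaBoost^[7], IGT^[7], DCNN^[8-9], CGM^[10], and 2-VDCNN^[11]. This provides strong evidence that the development of deep learning can benefit from drawing on biological brain mechanisms.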

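4. Conclusion

This paper first analyzes the structure, principles, and characteristics of the Capsule network and, drawing on brain mechanisms, remedies shortcomings of the original Capsule network. The original and improved Capsule networks are then applied to SAR image target recognition and compared with other SAR target recognition algorithms. The experimental results show that, in terms of performance, the improved Capsule network reduces the parameter count and the computation of the original network by factors of about 1.6 and 31, respectively (33.73 M to 21.65 M parameters and 33.519 to 1.078 BFLOPs in Table 1), and doubles the training speed. On SAR-ATR, its recognition rate is 0.37 and 1.04 percentage points higher than those of the original Capsule network and the multi-view 2-VDCNN, respectively. Compared with the other six algorithms, with the training samples reduced by 62.5% and 49.9% respectively, the recognition rate of the improved Capsule network is 1.96 to 8.96 percentage points higher. This verifies that the Capsule network improved on the basis of brain mechanisms performs considerably better than the original Capsule network and achieves better recognition on SAR-ATR. Although the improved Capsule network reduces computation and parameter count to some extent, processing larger images still demands considerable computation and storage. Future work will therefore consider removing features that carry little or no information, further reducing the dimension of the Capsule units and thereby lowering the demand for computing resources and storage.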
