Object point cloud classification and segmentation based on semantic information compensating global features

Lin Sen; Zhao Zhenyu; Ren Xiaokui; Tao Zhiyong

doi:10.3788/IRLA20210702

3D point cloud data processing plays an essential role in object segmentation, medical image segmentation, and virtual reality. However, existing 3D point cloud learning networks extract global features over a small range and cannot obtain local high-level semantic information, which leads to an incomplete point cloud feature representation. To address these problems, a classification and segmentation network for object point clouds based on semantic information compensating global features is proposed. First, the input point cloud is aligned to a canonical space by an input transformation, which serves as preprocessing. Then, a dilated edge convolution module extracts features from each layer of the transformed data and stacks them to generate global features. In local feature extraction, the extracted low-level semantic information is used to describe high-level semantic features and effective geometric information, which compensate for the point cloud features missing from the global features. Finally, the global features and the local high-level semantic information are combined to obtain the overall feature of the point cloud. Experimental results show that the proposed method outperforms both classic and recent algorithms in classification and segmentation performance.

0. Introduction

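In recent years, 3D imaging technology has developed rapidly, and data acquisition has become more convenient. Point clouds are an important representation of 3D model data; they carry rich spatial geometric information, which makes the description of a model more intuitive and accurate. 3D point cloud processing has been widely applied in intelligent vehicles, model reconstruction, medical imaging, and remote sensing and mapping, and has become an important research topic in computer vision and graphics. Processing 3D point cloud data is therefore of great importance.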

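Traditional point cloud processing methods extract features mainly through hand-crafted geometric shape descriptors^[1] or signature descriptors^[2]. Hand-crafted descriptors perform poorly: they yield fewer extracted features, which in turn degrades the final experimental results. In recent years, deep learning algorithms for 3D point clouds^[3-4] have attracted wide attention from researchers because of advantages such as their ability to process large volumes of data. Because point cloud data are sparse, unstructured, and unordered, conventional convolutional neural networks are not suited to direct application to point clouds. To address this, Qi et al.^[5] proposed PointNet, which learns per-point features directly with multilayer perceptrons and handles the unordered nature of point clouds with max pooling, a symmetric function. The algorithm largely solves the problem of applying convolutional neural networks directly, but point cloud features are not extracted completely. Qi et al.^[6] therefore revised PointNet to address its shortcomings and designed PointNet++, which was the first to propose local feature extraction based on farthest point sampling. It improves markedly on PointNet but still lacks a representation of the relationships between points. Li et al.^[7] proposed the Point Convolutional Neural Network (PointCNN) to address the unordered arrangement of point clouds; instead of using max pooling as the symmetric function, the network trains an "X" transformation network, but a large amount of local geometric information is still lost. Wang et al.^[8] proposed the Dynamic Graph Convolutional Neural Network (DGCNN), which obtains local information through K-nearest-neighbor (KNN) sampling, constructs a local graph structure, and extracts features from it; it performs better than PointCNN^[7] and does not lose effective information. Although DGCNN^[8] captures low-level semantic information well, it still cannot describe most high-level semantic information and implicit high-level semantic features.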
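The role of max pooling as a symmetric function can be illustrated with a minimal sketch. The per-point feature values below are hypothetical and stand in for the output of a shared multilayer perceptron; the sketch only shows that a channel-wise maximum is unaffected by the order of the points, and it is not PointNet's actual implementation.

```python
import random

def max_pool(features):
    """Channel-wise maximum over a list of per-point feature vectors."""
    return [max(channel) for channel in zip(*features)]

# Hypothetical per-point features (4 points, 3 channels), illustrative values only.
point_features = [
    [0.1, 0.9, 0.3],
    [0.7, 0.2, 0.5],
    [0.4, 0.4, 0.8],
    [0.6, 0.1, 0.2],
]

# Reorder the points with a fixed seed to imitate an unordered point cloud.
shuffled = point_features[:]
random.Random(0).shuffle(shuffled)

global_a = max_pool(point_features)
global_b = max_pool(shuffled)
```

Both calls return the same global feature vector, which is why a symmetric aggregation of this kind can cope with unordered input.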

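To address the shortcomings of the methods above, this paper proposes a point cloud learning network in which local semantic information compensates global features. Dilated convolution is used to enlarge the range of feature extraction, and edge convolution is used to extract geometric features while keeping the point cloud sequence unchanged. The aim is to take into account, as far as possible, the coordinates of each point and its distances to neighboring points, so as to avoid incomplete extraction of geometric information. For local feature extraction, a KNN module extracts low-level semantic information, and a Vector of Locally Aggregated Descriptors (VLAD) module uses this low-level semantic information to describe high-level semantic information and implicit high-level semantic features, which further compensates for the features and effective information missed during global extraction. Experiments and analysis show that the proposed network significantly improves the extraction of both global and local point cloud features, and that the description of high-level semantic features by low-level semantic information is further improved.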
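How dilation enlarges the neighborhood seen by an edge convolution can be sketched as follows. This section does not give the exact form of the dilated edge convolution module, so the sketch assumes one common reading: the k neighbors are taken as every d-th entry among the k × d nearest points, and each edge is described by the DGCNN-style pair (x_i, x_j − x_i). The function names `dilated_knn` and `edge_features` and the toy points are hypothetical.

```python
def sqdist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def dilated_knn(points, i, k, dilation=1):
    """Indices of k neighbors of point i, taking every `dilation`-th entry
    among its k * dilation nearest neighbors (point i itself is excluded)."""
    order = sorted(
        (j for j in range(len(points)) if j != i),
        key=lambda j: sqdist(points[i], points[j]),
    )
    return order[: k * dilation : dilation]

def edge_features(points, i, neighbors):
    """One edge feature per neighbor j: x_i concatenated with (x_j - x_i)."""
    xi = points[i]
    return [list(xi) + [b - a for a, b in zip(xi, points[j])] for j in neighbors]

# Toy point cloud: five points spaced evenly along the x axis.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
          (3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]

plain = dilated_knn(points, 0, k=2, dilation=1)    # the two nearest points
dilated = dilated_knn(points, 0, k=2, dilation=2)  # same k, wider reach
feats = edge_features(points, 0, dilated)
```

With the same number of neighbors, the dilated selection reaches points farther from the center, which is the sense in which dilation widens the feature extraction range without increasing k.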

3. Conclusion

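To address the incomplete extraction of point cloud features by current object point cloud classification and segmentation networks, this paper proposed a network for object point cloud classification and segmentation based on semantic information compensating global features. First, the data to be processed are fed into the STN module, which transforms them into an aligned canonical space and ensures that the arrangement of the input data remains unchanged. Second, the dilated edge convolution module extracts features from the point cloud data layer by layer and aggregates them into a global feature. For local feature extraction, the KNN-VLAD module uses the extracted low-level geometric information to describe high-level semantic features as fully as possible; these features and effective information compensate for the features missed by the global branch, and together the two modules ensure the completeness of the point cloud features. Finally, the local and global features are fused into an overall feature. Experimental results show that the proposed algorithm not only effectively improves the accuracy of classification and segmentation of object point cloud data, but also maintains stable accuracy on sparse, incomplete, and complex-scene data. Experimental comparison further shows that the network has low structural complexity and short running time, and compares favorably with the other algorithms.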
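The aggregation step behind VLAD^[11] can be sketched in a few lines. The sketch follows the classic hard-assignment formulation (assign each local descriptor to its nearest center, sum the residuals per center, concatenate, and L2-normalize). The centers and descriptors are hypothetical toy values, and the KNN-VLAD module of the paper may differ, for example by learning the centers or using soft assignment.

```python
import math

def sqdist(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def vlad(descriptors, centers):
    """Classic VLAD: per-center sums of residuals, flattened and L2-normalized."""
    dim = len(centers[0])
    residual_sums = [[0.0] * dim for _ in centers]
    for d in descriptors:
        # Hard assignment of the descriptor to its nearest center.
        nearest = min(range(len(centers)), key=lambda c: sqdist(d, centers[c]))
        for t in range(dim):
            residual_sums[nearest][t] += d[t] - centers[nearest][t]
    flat = [v for row in residual_sums for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Hypothetical centers and local descriptors (2-D for readability).
centers = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 1.0), (10.0, 12.0)]
encoded = vlad(descriptors, centers)
```

The output has a fixed length (number of centers times descriptor dimension) regardless of how many local descriptors are aggregated, which makes it convenient to fuse with a global feature vector.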

References

[1]	Osada R, Funkhouser T, Chazelle B, et al. Shape distributions [J]. ACM Transactions on Graphics (TOG), 2002, 21(4): 807-832.
[2]	Sun J, Ovsjanikov M, Guibas L. A concise and provably informative multiscale signature based on heat diffusion[C]//Computer Graphics Forum. Oxford, UK: Blackwell Publishing Ltd, 2009, 28(5): 1383-1392.
[3]	Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks [J]. Communications of the ACM, 2017, 60(6): 84-90.
[4]	Pan X Z, Zhang S Q, Guo W P. Application of multi-mode deep convolutional neural network to video expression recognition [J]. Optics and Precision Engineering, 2019, 27(4): 230-237. (in Chinese)
[5]	Qi C R, Su H, Mo K, et al. PointNet: Deep learning on point sets for 3D classification and segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017: 77-85.
[6]	Qi C R, Yi L, Su H, et al. PointNet++: Deep hierarchical feature learning on point sets in a metric space[C]//Advances in Neural Information Processing Systems, 2017, 30: 5099-5108.
[7]	Li Y, Bu R, Sun M, et al. PointCNN: Convolution on X-transformed points[C]//Advances in Neural Information Processing Systems, 2018, 31: 820-830.
[8]	Wang Y, Sun Y, Liu Z, et al. Dynamic graph CNN for learning on point clouds [J]. ACM Transactions on Graphics (TOG), 2019, 38(5): 1-12.
[9]	Jaderberg M, Simonyan K, Zisserman A. Spatial transformer networks[C]//Advances in Neural Information Processing Systems, 2015, 28: 2017-2025.
[10]	Hastie T, Tibshirani R. Discriminant adaptive nearest neighbor classification and regression[C]//Advances in Neural Information Processing Systems, 1996, 9: 409-415.
[11]	Jégou H, Douze M, Schmid C, et al. Aggregating local descriptors into a compact image representation[C]//2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 2010: 3304-3311.
[12]	Yu F, Koltun V. Multi-scale context aggregation by dilated convolutions [J]. arXiv preprint arXiv:1511.07122, 2015.
[13]	Wu Z, Song S, Khosla A, et al. 3D ShapeNets: A deep representation for volumetric shapes[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015: 1912-1920.
[14]	Yi L, Guibas L, Hertzmann A, et al. Learning hierarchical shape segmentation and labeling from online repositories [J]. ACM Transactions on Graphics, 2017, 36(4): 1-12.
[15]	Zhang D, He F, Tu Z, et al. Pointwise geometric and semantic learning network on 3D point clouds [J]. Integrated Computer-Aided Engineering, 2020, 27(1): 57-75.
[16]	Duan Y, Zheng Y, Lu J, et al. Structural relational reasoning of point clouds[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 949-958.
[17]	Li D, Shen X, Yu Y, et al. GGM-net: Graph geometric moments convolution neural network for point cloud shape classification [J]. IEEE Access, 2020, 8: 124989-124998.
[18]	Zhai Z, Zhang X, Yao L. Multi-scale dynamic graph convolution network for point clouds classification [J]. IEEE Access, 2020, 8: 65591-65598.
[19]	Lyu Y, Huang X, Zhang Z. EllipsoidNet: Ellipsoid representation for point cloud classification and segmentation [J]. arXiv preprint arXiv:2103.02517, 2021.

Environment configuration		Model parameters
Name	Configuration	Name	Value
CPU	Intel i7-10700F	Batch size	32
GPU	RTX 3090	Number of points	1024
RAM	32 GB	Max epochs	250
Operating system	Ubuntu 18.04	Optimizer	Adam
Language	Python 3.7	Learning rate	0.001
Learning framework	TensorFlow GPU 1.15.0	Momentum	0.9

Method	Representation	Input	Eval accuracy	Avg class acc
PointNet^[5]	Points	1024×3	89.2%	86.2%
PointNet++^[6]	Points(+normal)	1024×(3+3)	90.7%	87.8%
DGCNN^[8]	Points	1024×3	92.2%	88.9%
PointCNN^[7]	Points	1024×3	91.7%	88.5%
Pointwise^[15]	Points	1024×3	91.6%	89.1%
SRN-PointNet^[16]	Points	1024×3	91.5%	88.6%
GGM^[17]	Points	1024×3	92.5%	89.0%
MSDGCNN^[18]	Points	1024×3	91.8%	88.3%
EllNet^[19]	Points	1024×2	92.6%	89.0%
Ours	Points	1024×3	92.7%	89.3%
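The two columns of the table above can be read with the following sketch, which assumes the usual definitions: eval accuracy is the fraction of all test shapes classified correctly, and average class accuracy is the unweighted mean of the per-class accuracies. The labels are hypothetical and the function name `accuracy_metrics` is illustrative.

```python
from collections import defaultdict

def accuracy_metrics(y_true, y_pred):
    """Return (overall accuracy, mean of per-class accuracies)."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    overall = sum(correct.values()) / len(y_true)
    per_class = [correct[c] / total[c] for c in total]
    return overall, sum(per_class) / len(per_class)

# Hypothetical, imbalanced toy labels: four chairs and two lamps.
y_true = ["chair", "chair", "chair", "chair", "lamp", "lamp"]
y_pred = ["chair", "chair", "chair", "chair", "lamp", "chair"]
overall, avg_class = accuracy_metrics(y_true, y_pred)
```

Because each class counts equally in the second metric, errors on rare classes lower it more than they lower the overall accuracy, which is consistent with the average class accuracy being the smaller number in every row.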

Method	Size/MB	Time/s	Accuracy
PointNet^[5]	40	78.9	89.2%
PointNet++^[6]	12	163.2	90.7%
DGCNN^[8]	21	89.7	92.2%
PointCNN^[7]	94	117.0	91.7%
Ours	21	86.4	92.7%

Method	mIoU	Shapes IoU
Method	mIoU	Plane	Bag	Cap	Car	Chair	Earphone	Guitar	Knife	Lamp	Laptop	Motor	Mug	Pistol	Rocket	Skate	Table
PointNet^[5]	83.7%	83.4%	78.7%	82.5%	74.9%	89.6%	73.0%	91.5%	85.9%	80.8%	95.3%	65.2%	93.0%	81.2%	57.9%	72.8%	80.6%
PointNet++^[6]	85.1%	82.4%	79.0%	87.7%	77.3%	90.8%	71.8%	91.0%	85.9%	83.7%	95.3%	71.6%	94.1%	81.3%	58.7%	76.4%	82.6%
DGCNN^[8]	85.2%	84.0%	83.4%	86.7%	77.8%	90.6%	74.7%	91.2%	87.5%	82.8%	95.7%	66.3%	94.9%	81.1%	63.5%	74.5%	82.6%
PointCNN^[7]	86.1%	84.1%	86.4%	86.0%	80.8%	90.6%	79.7%	92.3%	88.4%	85.3%	96.1%	77.2%	95.3%	84.8%	64.2%	80.0%	83.0%
Pointwise^[15]	85.1%	82.9%	80.7%	87.8%	76.6%	90.8%	79.2%	91.0%	86.6%	83.3%	95.3%	71.9%	94.4%	80.9%	62.0%	75.1%	82.5%
SRN-PointNet^[16]	85.3%	82.4%	79.8%	88.1%	77.9%	90.7%	69.6%	90.9%	86.3%	84.0%	95.4%	72.2%	94.9%	81.3%	62.1%	75.9%	83.2%
GGM^[17]	85.2%	83.9%	82.8%	88.0%	79.8%	90.7%	76.8%	91.3%	87.6%	82.6%	95.5%	66.6%	94.8%	81.8%	62.6%	73.8%	82.6%
MSDGCNN^[18]	85.4%	83.7%	84.7%	87.5%	77.0%	90.8%	68.2%	91.5%	86.5%	96.0%	95.5%	72.0%	95.1%	83.4%	61.9%	77.4%	82.9%
EllNet^[19]	85.0%	82.8%	81.5%	87.6%	76.8%	90.6%	78.8%	90.8%	86.8%	86.9%	95.1%	71.8%	94.2%	80.8%	61.8%	75.0%	82.2%
Ours	85.5%	84.3%	85.8%	88.1%	80.0%	90.8%	79.5%	91.6%	88.2%	91.6%	95.8%	76.7%	96.1%	82.6%	65.6%	81.2%	82.8%
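The IoU values in the table above can be read with the following sketch of a per-shape part IoU. It assumes the convention commonly used for part segmentation benchmarks: the IoU is computed for every part label of the shape's category, a part that is absent from both the ground truth and the prediction counts as 1, and the per-part values are averaged. The labels are hypothetical, and this section does not state whether the paper's mIoU is averaged over shapes or over categories.

```python
def shape_iou(true_parts, pred_parts, part_ids):
    """Mean IoU over the part labels of one shape."""
    pairs = list(zip(true_parts, pred_parts))
    ious = []
    for part in part_ids:
        inter = sum(t == part and p == part for t, p in pairs)
        union = sum(t == part or p == part for t, p in pairs)
        # A part missing from both label sets is counted as a perfect match.
        ious.append(1.0 if union == 0 else inter / union)
    return sum(ious) / len(ious)

# Hypothetical per-point part labels for one shape with three possible parts.
true_parts = [0, 0, 1, 1, 1, 1]
pred_parts = [0, 1, 1, 1, 1, 1]
iou = shape_iou(true_parts, pred_parts, part_ids=[0, 1, 2])
```

Here part 0 scores 1/2, part 1 scores 4/5, and the unused part 2 scores 1, so a single mislabeled point already moves the shape IoU noticeably on small parts.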

Number of KNN neighbors (k)	Avg class acc	Eval accuracy
16	89.5%	91.4%
20	89.8%	92.7%
25	89.1%	91.3%
30	88.6%	91.1%
32	88.5%	90.8%

Corresponding author: Chen Bin, bchen63@163.com