The experimental validation is based on self-collected data. The sensor suite consists of a Pandar40P LiDAR, a self-developed rolling-shutter front-view camera (resolution 648×1152, single-frame acquisition time 45 ms), a NovAtel PwrPak7D-E1 dual-antenna MEMS integrated navigation system (INS), and a wheel speedometer mounted on the right rear wheel (average accuracy error within $\pm 0.02\ \mathrm{m/s}$). The INS data, post-processed with NovAtel Inertial Explorer, serve as the motion ground truth, with centimeter-level positioning accuracy. The extrinsic parameters must be calibrated before the experiments, e.g., with the reference-plane-constrained extrinsic calibration method for vehicle-mounted mobile mapping systems proposed by Yu et al. [13]. The test computer is configured with an Intel i9-10900X 3.70 GHz CPU, 64 GB RAM, and a GeForce RTX 3090 GPU with 24 GB VRAM. The appearance of the experimental vehicle is shown in Fig. 5.
In the experiments, the visual front end runs at 15 Hz and tracks 300 feature points per frame; new feature points are extracted whenever tracked points are lost. A new visual keyframe is created when the current frame and the last keyframe satisfy both of the following conditions: the distance between them exceeds 0.8 m, and the number of covisible feature points is below 110 or the average parallax exceeds 30 pixels. The LiDAR front end runs at 10 Hz; feature matching uses a distance threshold of $\max(0.5,\ 4.0 - 0.5 \cdot iteration\_time)\ \mathrm{m}$ and an angle threshold of $\max(10.0,\ 30.0 - 5.0 \cdot iteration\_time)$°. A new LiDAR keyframe is created when the distance to the last keyframe exceeds 4.0 m or the rotation exceeds 5.0°. The total length of the back-end sliding window is 40. A minimal sketch of these rules is given below.
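The keyframe rules and the iteration-dependent matching thresholds above can be summarized in a minimal sketch (function and parameter names are our own illustration, not the authors' implementation; iteration_time is read here as the ICP iteration index):

```python
def icp_match_thresholds(iteration_time: int):
    """ICP correspondence gates, loose at the first iteration and tightened per iteration."""
    dist_thresh = max(0.5, 4.0 - 0.5 * iteration_time)     # metres
    angle_thresh = max(10.0, 30.0 - 5.0 * iteration_time)  # degrees
    return dist_thresh, angle_thresh

def is_new_visual_keyframe(dist_m: float, covisible: int, parallax_px: float) -> bool:
    """New visual keyframe: moved > 0.8 m AND (covisible points < 110 OR parallax > 30 px)."""
    return dist_m > 0.8 and (covisible < 110 or parallax_px > 30.0)

def is_new_lidar_keyframe(dist_m: float, rot_deg: float) -> bool:
    """New LiDAR keyframe: moved > 4.0 m OR rotated > 5.0 degrees."""
    return dist_m > 4.0 or rot_deg > 5.0
```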
To verify the effect of the degradation factor in the LiDAR front end on ICP accuracy, multiple pairs of point clouds separated by 4 m and by 8 m were registered on a data segment containing a tunnel scene, and the translation and angle errors of the registration results were evaluated. The experiment comprises two groups: the ICP CUDA algorithm without the degradation factor and the ICP CUDA algorithm with the degradation factor. The initial value of the ICP CUDA algorithm is obtained by integrating the IMU and wheel speedometer measurements (a rough sketch of such dead reckoning follows). The results for pairs 4 m apart are shown in Fig. 6, and those for pairs 8 m apart in Fig. 7.
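As a rough illustration of how such an initial value can be obtained, here is a planar dead-reckoning sketch under simplifying assumptions (2D motion, yaw rate from the gyroscope z-axis, speed from the wheel encoder; this is not the paper's actual integration scheme):

```python
import math

def dead_reckon_init(yaw_rates, wheel_speeds, dt):
    """Integrate gyro yaw rate [rad/s] and wheel speed [m/s], sampled at
    interval dt [s], into a planar pose increment (x, y, yaw) that can
    seed ICP between two LiDAR frames."""
    x = y = yaw = 0.0
    for wz, v in zip(yaw_rates, wheel_speeds):
        yaw += wz * dt               # heading from the gyroscope
        x += v * math.cos(yaw) * dt  # propagate position along the heading
        y += v * math.sin(yaw) * dt
    return x, y, yaw
```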
Figure 6. Translation and angle errors of ICP CUDA when matching point cloud pairs 4 m apart. (a) Using the degradation factor; (b) Calculating but not using the degradation factor
Figure 7. Translation and angle errors of ICP CUDA when matching point cloud pairs 8 m apart. (a) Using the degradation factor; (b) Calculating but not using the degradation factor
From the results of the two separation distances, 4 m and 8 m, the maximum translation error of ICP CUDA is 1.6 m with the degradation factor, versus 4.0 m without it: the degradation factor reduces the maximum translation error to 40% of the latter, demonstrating its benefit. The angle error changes little, because the degenerate degrees of freedom lie mainly in the translation components rather than the rotation components.
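The paper does not spell out the exact form of its degradation factor; a common formulation (e.g., Zhang et al.'s solution remapping for degenerate optimization) eigendecomposes the 6×6 Gauss-Newton Hessian of the ICP cost and suppresses the pose update along directions whose eigenvalues fall below a threshold. A hedged numpy sketch of that idea (the threshold value is illustrative):

```python
import numpy as np

def remap_degenerate_update(JtJ, dx, eig_thresh=100.0):
    """Suppress the ICP update along degenerate directions.

    JtJ: 6x6 Gauss-Newton Hessian of the ICP cost (rotation + translation);
    dx: raw 6-vector pose update; eig_thresh: eigenvalue cutoff (tuning parameter).
    Directions with eigenvalue < eig_thresh are considered unconstrained
    (e.g. along a tunnel axis) and removed from the update.
    """
    eigvals, eigvecs = np.linalg.eigh(JtJ)  # eigenvalues in ascending order
    keep = eigvals >= eig_thresh            # well-constrained directions
    V = eigvecs[:, keep]
    dx_remapped = V @ (V.T @ dx)            # project the update onto them
    degeneracy_factor = eigvals[0]          # smallest eigenvalue as indicator
    return dx_remapped, degeneracy_factor
```

In a straight tunnel the translation along the tunnel axis is typically the weakly constrained direction, which is consistent with the observation above that the degradation mainly affects the translation components.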
To evaluate the accuracy level of the system and to compare the three running modes VIO, LIO, and VLIO, three 5-min data segments collected in urban scenes were used to evaluate the accuracy of the system output after initialization. Following the quantification method of KITTI [14], we compute the relative translation error (RTE) between the current odometry pose and the pose 100 m earlier along the trajectory, as shown in Fig. 8 and Table 1.
Figure 8. (a) Range map of 3 segments of urban scene data with a duration of 5 min; (b1)-(b3) Relative translation error results of the 3 segments excluding the initialization stage; (c1)-(c3) Path results, where GT is the ground truth. The VLIO mode has an average RTE of 0.2%-0.5%, better than the VIO and LIO modes
Scene ID  Mode  Min RTE  Max RTE  Average RTE
1         VIO   0.46     3.02     0.88
          LIO   0.60     2.95     0.91
          VLIO  0.01     1.35     0.16
2         VIO   0.32     3.05     1.10
          LIO   0.20     2.88     1.47
          VLIO  0.02     1.86     0.34
3         VIO   0.19     1.56     0.79
          LIO   0.79     2.97     1.21
          VLIO  0.03     0.93     0.51
Table 1. Relative translation error (RTE, %) statistics of urban scenes
From the urban-scene accuracy results above, the average relative translation error of the VIO mode is between 0.8% and 1.1%, that of the LIO mode between 0.9% and 1.5%, and that of the full VLIO mode between 0.2% and 0.5%, which is clearly better than the VIO mode (LiDAR disabled) and the LIO mode (vision disabled). This demonstrates the benefit of multi-source data fusion for improving system accuracy. A sketch of the RTE metric follows.
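For reference, here is a sketch of the KITTI-style RTE metric as we read it (assuming time-aligned trajectories expressed in a common frame; rotational alignment of each 100 m segment is omitted for brevity):

```python
import numpy as np

def relative_translation_errors(est_xyz, gt_xyz, seg_len=100.0):
    """KITTI-style relative translation error (RTE), in percent.

    est_xyz, gt_xyz: (N, 3) time-aligned trajectories in a common frame.
    For each pose, the estimated displacement over the previous `seg_len`
    metres of ground-truth path is compared with the true displacement.
    """
    step = np.linalg.norm(np.diff(gt_xyz, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(step)])  # distance travelled so far
    errors = []
    for i in range(len(gt_xyz)):
        if cum[i] < seg_len:                        # not enough path behind yet
            continue
        j = int(np.searchsorted(cum, cum[i] - seg_len))  # pose ~seg_len back
        d = cum[i] - cum[j]                         # actual segment length
        gt_disp = gt_xyz[i] - gt_xyz[j]
        est_disp = est_xyz[i] - est_xyz[j]
        errors.append(np.linalg.norm(est_disp - gt_disp) / max(d, 1e-6) * 100.0)
    return np.array(errors)
```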
To verify the accuracy of the system in LiDAR-degenerate scenes, two data segments containing tunnel scenes were collected, and the VIO, LIO, and VLIO modes of the system were each evaluated with the same method as for the urban scenes. The results are shown in Fig. 9 and Table 2.
Scene ID  Mode  Min RTE  Max RTE  Average RTE
1         VIO   0.05     3.06     1.00
          LIO   0.18     1.35     1.11
          VLIO  0.002    2.33     0.61
2         VIO   0.88     2.14     1.52
          LIO   1.76     2.60     2.05
          VLIO  0.19     2.31     1.31
Table 2. Relative translation error (RTE, %) statistics of tunnel scenes
Figure 9. (d1)-(d2) Relative translation error results of 2 segments of tunnel scene data excluding the initialization stage; (e1)-(e2) Path results; the VLIO mode has lower accuracy than in the urban scenes but remains better than the VIO and LIO modes; (a)-(c), (f)-(h) Corresponding images
From the accuracy results of the two tunnel segments above, the average relative translation error of the system degrades in tunnel scenes compared with urban scenes, to roughly 0.6%-1.3%, but it remains better overall than the 1.0%-1.5% of the VIO mode (LiDAR disabled) and the 1.1%-2.1% of the LIO mode (vision disabled). This indicates that multi-source information fusion can improve the accuracy of the system in LiDAR-degenerate scenes.
Research on a real-time odometry system integrating vision, LiDAR and IMU for autonomous driving
doi: 10.3788/IRLA20210651
- Received Date: 2021-09-09
- Rev Recd Date: 2021-10-12
- Accepted Date: 2021-11-02
- Available Online: 2022-08-31
- Publish Date: 2022-08-31
Key words:
- autonomous driving
- LiDAR odometry
- ICP
- state estimation
Abstract: Visual/LiDAR odometry estimates the multi-degree-of-freedom motion of an autonomous driving vehicle from sensor data and is an important part of positioning and mapping systems. In this paper, we propose a real-time tightly coupled odometry system that integrates vision, LiDAR, and IMU for autonomous driving vehicles and supports multiple running modes and initialization methods. The front end of the system applies a modified CUDA-based ICP for point cloud registration and traditional optical flow for visual feature tracking, and uses LiDAR points to provide depth for visual features. The back end applies a factor graph based on a sliding window to optimize the poses, in which the state nodes are related to the poses from the vision and LiDAR front-end subsystems, and the edges are related to the preintegration of the IMU. The experiments show that the system achieves an average relative translation error of 0.2%-0.5% in urban scenes. The system with both LiDAR and visual front-end subsystems is superior to a system that contains only one of them. The method proposed in this paper is of positive significance for improving the accuracy of autonomous driving positioning and mapping systems.