Journal of System Simulation ›› 2020, Vol. 32 ›› Issue (7): 1244-1256. doi: 10.16182/j.issn1004731x.joss.19-VR0466
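Liu Ruijun*, Wang Xiangshang, Zhang Chen, Zhang Bohua
Received: 2019-08-30
Revised: 2019-12-01
Online: 2020-07-25
Published: 2020-07-15
Corresponding author: Liu Ruijun (b. 1982), male, from Beijing, Ph.D., associate professor; research interests: virtual reality and image processing.
About the authors: Liu Ruijun (corresponding author; b. 1982), male, from Beijing, Ph.D., associate professor; research interests: virtual reality and image processing. Wang Xiangshang (b. 1995), male, from Henan, master's student; research interests: virtual reality and image processing.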
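Abstract: With the development of computer vision and robotics, visual simultaneous localization and mapping (SLAM) has become a research focus in the field of unmanned systems, and the strong advantages that deep learning has shown in image processing have created an opportunity to combine the two widely. This paper summarizes notable research results that combine deep learning with visual odometry, loop closure detection, and semantic SLAM, compares traditional algorithms with deep-learning-based methods, and outlines future directions for deep-learning-based visual SLAM.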
Cite this article: Liu Ruijun, Wang Xiangshang, Zhang Chen, et al. A Survey on Visual SLAM based on Deep Learning[J]. Journal of System Simulation, 2020, 32(7): 1244-1256.