Journal of System Simulation ›› 2020, Vol. 32 ›› Issue (7): 1244-1256. doi: 10.16182/j.issn1004731x.joss.19-VR0466
Liu Ruijun*, Wang Xiangshang, Zhang Chen, Zhang Bohua
Received: 2019-08-30
Revised: 2019-12-01
Online: 2020-07-25
Published: 2020-07-15
Corresponding author: Liu Ruijun
About the authors: Liu Ruijun (corresponding author, b. 1982), male, from Beijing, Ph.D., associate professor; research interests: virtual reality and image processing. Wang Xiangshang (b. 1995), male, from Henan, master's student; research interests: virtual reality and image processing.
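Abstract: With the development of computer vision and robotics, visual simultaneous localization and mapping (SLAM) has become a research focus in the field of unmanned systems, and the strengths that deep learning has shown in image processing have created an opportunity to combine the two broadly. This survey summarizes notable research results on combining deep learning with visual odometry, loop closure detection, and semantic SLAM, compares traditional algorithms with deep-learning-based methods, and outlines future directions for deep-learning-based visual SLAM.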
Cite this article: Liu Ruijun, Wang Xiangshang, Zhang Chen, Zhang Bohua. A Survey on Visual SLAM based on Deep Learning[J]. Journal of System Simulation, 2020, 32(7): 1244-1256.