Journal of System Simulation, 2025, Vol. 37, Issue (11): 2701-2713. doi: 10.16182/j.issn1004731x.joss.24-0564

• Papers •

Improvement of SLAM Localization Accuracy in AR by Enhancing YOLOv8

Liu Jia1,2,3, Zhang Zengwei1,2, Chen Dapeng1,2,3, Huang Nanxuan1,2, Wang Bin1,2, Song Hong1,2

  1. Tianchang Research Institute, Nanjing University of Information Science and Technology, Chuzhou 239356, China
  2. School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, China
  3. Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing 210044, China
  • Received: 2024-05-24 Revised: 2024-07-20 Online: 2025-11-18 Published: 2025-11-27
  • Corresponding author: Chen Dapeng
  • About the first author: Liu Jia (1981-), female, professor, Ph.D. Research interests: virtual/augmented reality, computer vision, and image processing.
  • Funding:
    National Natural Science Foundation of China (62003169); Jiangsu Key Project of Industry Foresight and Key Technologies (BE2020006-2); Natural Science Foundation of Jiangsu Province (BK20200823)

Abstract:

When the environment contains dynamic interference, traditional simultaneous localization and mapping (SLAM) methods suffer reduced accuracy and stability in registering virtual objects during three-dimensional (3D) registration in augmented reality (AR). To address these problems, an improved method for dynamic scenes based on semantic segmentation and optical flow tracking is proposed. The convolutional block attention module (CBAM) is added to YOLOv8 to strengthen its focus on dynamic objects in the environment, improving detection performance and accuracy. The semantic segmentation function of the improved YOLOv8 is embedded in the front end of ORB-SLAM3 to segment dynamic objects in the scene and remove the dynamic feature points that affect map construction. The optical flow method is then used to further track moving objects, improving the localization accuracy of the camera. The method is validated on the TUM dataset and in real-world scenes. The results show that, compared with traditional ORB-SLAM3, the proposed method significantly improves localization accuracy in dynamic scenes and enhances the stability of 3D registration in AR.
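The feature-removal step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `filter_dynamic_keypoints`, the toy mask, and the keypoint list are all hypothetical, and the mask here stands in for whatever per-pixel output a segmentation network would produce. The optical flow stage is left out entirely.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


def filter_dynamic_keypoints(
    keypoints: Sequence[Point],
    dynamic_mask: Sequence[Sequence[bool]],
) -> List[Point]:
    """Keep only keypoints that do not fall on a pixel marked as dynamic.

    dynamic_mask[row][col] is True where a (hypothetical) segmentation
    step labelled the pixel as part of a movable object.
    """
    height = len(dynamic_mask)
    width = len(dynamic_mask[0]) if height else 0
    static_points: List[Point] = []
    for x, y in keypoints:
        col, row = int(round(x)), int(round(y))
        # Points outside the image cannot be checked against the mask; drop them.
        if not (0 <= row < height and 0 <= col < width):
            continue
        if not dynamic_mask[row][col]:
            static_points.append((x, y))
    return static_points


# Toy 4x6 mask (assumed data): a "person" occupies columns 2-3 of rows 1-2.
mask = [[False] * 6 for _ in range(4)]
for r in (1, 2):
    for c in (2, 3):
        mask[r][c] = True

keypoints = [(0.0, 0.0), (2.2, 1.0), (3.0, 2.4), (5.0, 3.0), (7.0, 1.0)]
static = filter_dynamic_keypoints(keypoints, mask)
print(static)
```

In a real pipeline the surviving points would be the ones passed on to pose estimation and mapping; how the mask is dilated, and how optical flow is combined with it, are design choices this sketch does not attempt to reproduce.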

Keywords: augmented reality (AR), three-dimensional registration, YOLOv8, ORB-SLAM3, semantic segmentation, dynamic detection

CLC number: