Journal of System Simulation ›› 2026, Vol. 38 ›› Issue (1): 45-57. doi: 10.16182/j.issn1004731x.joss.25-0830

• Article

Visual Relocalization Method Combining Region Classification and Local Feature Enhancement

Wang Yining1, Liu Yanli1, Xing Guanyu2

  1. College of Computer Science, Sichuan University, Chengdu 610065, China
  2. School of Cyber Science and Engineering, Sichuan University, Chengdu 610065, China
  • Received: 2025-09-01 Revised: 2025-10-16 Online: 2026-01-18 Published: 2026-01-28
  • Corresponding author: Liu Yanli
  • About the first author: Wang Yining (born 2000), male, master's student; research interests: augmented reality and computer vision.
  • Funding:
    National Natural Science Foundation of China (U25A20439); National Natural Science Foundation of China (62172290); Sichuan Science and Technology Program (2026NSFSCZY0126)

Abstract:

Visual relocalization has important applications in robotics, digital twins, and augmented reality. Current mainstream methods still face practical problems, notably a mismatch between the scale of coordinate regression and the receptive field, and insufficient attention to local information. To address these problems, a neural-network-based visual relocalization method that combines region classification with local feature enhancement is proposed. The coordinate regression problem over a large-scale scene is decomposed into a multi-region classification problem and a coordinate regression problem within a small scene, which significantly reduces the uncertainty of coordinate regression and gives the network a large receptive field overall. A conditioning layer based on deep feature fusion passes the classification results of the upper level into the lower-level network. A graph attention mechanism performs feature learning and fusion within local regions, so that the network learns global and local feature information together; combined with the hierarchical regression framework, this improves the stability of relocalization. Comparative experiments against mainstream visual relocalization methods on public multi-scene datasets show that the proposed method achieves more precise relocalization results and higher relocalization accuracy.

Key words: visual relocalization, coordinate regression, region classification, space division, graph attention
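
The decomposition described in the abstract, a region classification followed by a coordinate regression inside a small region, can be illustrated with a minimal sketch. It is not the authors' implementation: the regular grid partition, the use of region centers as anchors, and the names `make_grid`, `encode`, and `decode` are all assumptions made for illustration. The sketch only shows how an absolute 3D scene coordinate splits into a discrete region label plus a small bounded offset, so that the regression target no longer spans the whole scene.

```python
import numpy as np

def make_grid(scene_min, scene_max, cells_per_axis):
    """Describe a regular grid over an axis-aligned scene box (assumed partition)."""
    scene_min = np.asarray(scene_min, dtype=float)
    scene_max = np.asarray(scene_max, dtype=float)
    cell_size = (scene_max - scene_min) / cells_per_axis
    return scene_min, cell_size, cells_per_axis

def encode(points, grid):
    """Split absolute coordinates into (region label, offset from region center)."""
    scene_min, cell_size, n = grid
    points = np.asarray(points, dtype=float)
    idx = np.floor((points - scene_min) / cell_size).astype(int)
    idx = np.clip(idx, 0, n - 1)  # points on the upper boundary fall in the last cell
    centers = scene_min + (idx + 0.5) * cell_size
    labels = np.ravel_multi_index(tuple(idx.T), (n, n, n))
    return labels, points - centers

def decode(labels, offsets, grid):
    """Recover absolute coordinates from a region label and a local offset."""
    scene_min, cell_size, n = grid
    idx = np.stack(np.unravel_index(labels, (n, n, n)), axis=-1)
    centers = scene_min + (idx + 0.5) * cell_size
    return centers + offsets

if __name__ == "__main__":
    # Hypothetical 40 m x 40 m x 8 m scene split into 4 x 4 x 4 regions.
    grid = make_grid([0, 0, 0], [40, 40, 8], 4)
    pts = np.random.default_rng(0).uniform([0, 0, 0], [40, 40, 8], size=(5, 3))
    labels, offsets = encode(pts, grid)
    print("largest absolute offset per axis:", np.abs(offsets).max(axis=0))
    print("round trip ok:", np.allclose(decode(labels, offsets, grid), pts))
```

In a learned pipeline the label would come from a classification head and the offset from a regression head conditioned on that label; here both are computed from ground truth purely to show that each offset is bounded by half a cell rather than by the full scene extent.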
