Journal of System Simulation ›› 2023, Vol. 35 ›› Issue (9): 1931-1947. doi: 10.16182/j.issn1004731x.joss.22-1372

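RGB-D Saliency Object Detection Based on Cross-refinement and Circular Attention

Dong Qingqing, Wu Hao, Qian Wenhua, Kong Fengling

  1. School of Information Science and Engineering, Yunnan University, Kunming 650504, Yunnan, China
  • Received: 2022-11-17 Revised: 2023-02-03 Online: 2023-09-25 Published: 2023-09-19
  • Corresponding author: Wu Hao E-mail: 2239967500@qq.com; haowu1982@vip.163.com
  • About the first author: Dong Qingqing (born 1993), female, master's student; research interests: computer vision and image processing. E-mail: 2239967500@qq.com
  • Funding:
    National Natural Science Foundation of China (62061049); Yunnan Provincial Basic Research Project (2018FB100)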

RGB-D Saliency Object Detection Based on Cross-refinement and Circular Attention

Dong Qingqing, Wu Hao, Qian Wenhua, Kong Fengling

  1. School of Information Science and Engineering, Yunnan University, Kunming 650504, China
  • Received: 2022-11-17 Revised: 2023-02-03 Online: 2023-09-25 Published: 2023-09-19
  • Contact: Wu Hao E-mail: 2239967500@qq.com; haowu1982@vip.163.com

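Abstract:

To address blurred region boundaries and imprecise, incomplete detected regions in salient object detection, an RGB-D salient object detection method based on cross-refinement and circular attention is proposed. In the encoder feature-extraction stage, a cross-refinement module is designed so that each modality supplements the other's feature information; this improves feature quality before fusion, suppresses the negative influence of low-quality depth maps, and resolves the blurred edges of salient objects. For the fused features, a circular module that joins an attention mechanism with a convolutional long short-term memory network unit is proposed to simulate the brain's internal generative mechanism. By retrieving past memories to help infer the current decision, it obtains semantic scenes that require long-term memory, comprehensively learns the internal semantic relationships of the fused features, and generates saliency maps with more complete and more accurate detected regions. Experiments on six public datasets show that the proposed method yields saliency maps with clear edges and higher accuracy.

Key words: RGB-D, saliency object detection, cross-refinement, attention mechanism, convolutional long short-term memory network, circular module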

Abstract:

To address blurred boundaries and incomplete or inaccurate detected regions in salient object detection, an RGB-D salient object detection method based on cross-refinement and circular attention is proposed. A cross-refinement module is designed for the encoder feature-extraction stage, in which the RGB and depth features supplement each other's information. This improves feature quality before fusion, suppresses the negative impact of poor-quality depth maps, and addresses the blurred edges of salient objects. For the fused features, a circular module is proposed that combines an attention mechanism with a convolutional long short-term memory (ConvLSTM) unit to simulate the brain's internal generative mechanism: it retrieves past memories to help infer the current decision, and so captures semantic scenes that require long-term memory. The module comprehensively learns the internal semantic relationships of the fused features and generates saliency maps whose detected regions are more complete and more accurate. Experiments on six public datasets show that the proposed method produces saliency maps with clear edges and higher accuracy.

Key words: RGB-D, saliency object detection, cross-refinement, attention mechanism, convolutional long short-term memory network, circular module
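
For readers unfamiliar with the recurrent unit named in the abstract, the standard ConvLSTM cell (written here without peephole connections) can be sketched as follows. This is the generic textbook form and not necessarily the exact variant used in the paper. The notation is assumed for illustration only: $X_t$ is the input feature map at iteration $t$ (in this setting, presumably the attention-weighted fused feature), $H_t$ and $C_t$ are the hidden and cell states, $*$ is convolution, $\odot$ is the element-wise product, and $\sigma$ is the sigmoid function.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + b_f\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + b_o\right) \\
\tilde{C}_t &= \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\
H_t &= o_t \odot \tanh\left(C_t\right)
\end{aligned}
```

The cell state $C_t$ is what carries information from one iteration to the next, which is the sense in which earlier "memories" can inform the current step. Because the gates are computed with convolutions rather than fully connected layers, the spatial layout of the feature map is preserved throughout.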
