Journal of System Simulation ›› 2023, Vol. 35 ›› Issue (6): 1226-1234. doi: 10.16182/j.issn1004731x.joss.22-0169

• Paper

Semantic Segmentation Model Based on Adaptive Fusion and Attention Refinement

Yun Wei, Qi Luo, Yingzhi Zhao

  1. School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
  • Received: 2022-03-06 Revised: 2022-03-21 Online: 2023-06-29 Published: 2023-06-20
  • Contact: Qi Luo E-mail: wy535study@163.com; 895331587@qq.com
  • About the author: Yun Wei (1976-), female, associate professor, Ph.D. Research interests: intelligent traffic control, network information control, and distributed systems. E-mail: wy535study@163.com
  • Supported by:
    National Key Research and Development Program of China (2018YFB1700902)

Abstract:

To address the insufficient use of context information and the loss of detail information in existing semantic segmentation methods, a semantic segmentation model based on adaptive fusion and attention refinement is proposed. In the encoding stage, the model introduces an adaptive fusion module that fuses the feature maps according to their corresponding weights, which addresses the insufficient use of context information. In the decoding stage, an attention refinement module is designed so that low-level and high-level features can guide and optimize each other, which addresses the loss of detail information. Experimental results show that the model achieves a mean intersection over union (mIoU) of 83.7% on the PASCAL VOC 2012 dataset, 1.1% higher than the encoder-decoder-based semantic segmentation model. It also achieves an mIoU of 81.7% on the Cityscapes dataset, which further verifies the generalization ability of the model.
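
The abstract does not say how the fusion weights are obtained. As a minimal sketch only, assuming each feature map gets one learnable score and the weights are a softmax over those scores (an assumption, not the authors' stated design), weighted fusion of equally sized feature maps might look like the following. The names `softmax`, `adaptive_fuse`, and `scores` are hypothetical.

```python
import math
from typing import List

FeatureMap = List[List[float]]  # a single-channel H x W feature map


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(feature_maps: List[FeatureMap], scores: List[float]) -> FeatureMap:
    """Fuse same-sized feature maps as a weighted sum.

    Assumption: `scores` holds one score per feature map; in a real
    network these would be learned or predicted, not fixed by hand.
    """
    weights = softmax(scores)
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, weights):
        for i in range(height):
            for j in range(width):
                fused[i][j] += weight * fmap[i][j]
    return fused


if __name__ == "__main__":
    maps = [[[1.0, 2.0]], [[3.0, 4.0]]]
    # Equal scores give equal weights, so the result is the plain average.
    print(adaptive_fuse(maps, [0.0, 0.0]))
```

With unequal scores, the map with the larger score dominates the fused output, which is the behavior the phrase "according to their corresponding weights" suggests.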

Key words: semantic segmentation, pyramid pooling, attention mechanism, adaptive fusion, encoder-decoder architecture
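
The accuracy figures in the abstract are mean intersection over union scores: for each class, IoU = TP / (TP + FP + FN), and the mean is taken over classes. A minimal sketch of that metric follows. The function names, the toy labels, and the use of 255 as an ignored label are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:  # skip pixels marked as "do not evaluate"
            continue
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average per-class IoU, skipping classes absent from both pred and target."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # class c pixels predicted as other classes
        fp = sum(matrix[r][c] for r in range(n)) - tp   # other pixels predicted as class c
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Flattened toy label maps with three classes; the last pixel is ignored.
    target = [0, 0, 1, 1, 2, 2, 255]
    pred = [0, 1, 1, 1, 2, 0, 2]
    print(mean_iou(confusion_matrix(pred, target, num_classes=3)))
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is about 0.5.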

CLC Number: