Journal of System Simulation, 2023, Vol. 35, Issue (11): 2333-2344. doi: 10.16182/j.issn1004731x.joss.22-0690

Image Semantic Segmentation Algorithm Based on Improved DeepLabv3+

Zhao Weiping1,2, Chen Yu2, Xiang Song1, Liu Yuanqiang1, Wang Chaoyue1

  1. Liaoning General Aviation Academy, Shenyang Aerospace University, Shenyang 110034, China
  2. College of Electronic Information Engineering, Shenyang Aerospace University, Shenyang 110034, China
  • Received: 2022-06-17 Revised: 2022-08-16 Online: 2023-11-25 Published: 2023-11-24
  • Contact: Chen Yu E-mail: 3370477370@qq.com; 1009857106@qq.com
  • About the first author: Zhao Weiping (1968-), male, associate professor, Ph.D.; research interests: aircraft design and image processing. E-mail: 3370477370@qq.com
  • Supported by:
    Key Research Project of the Education Department of Liaoning Province (JYT2020162); Research on Reliability Design Technology for Electric Seaplanes (JYT2020162)

Abstract:

Mainstream image semantic segmentation networks often suffer from mis-segmentation, discontinuous segmentation, and high model complexity, which prevents them from being deployed flexibly and efficiently in practical scenarios. To address this, an image semantic segmentation network that optimizes the DeepLabv3+ model is designed by jointly considering the network's parameter count, prediction time, and accuracy. The backbone is replaced with the lightweight EfficientNetv2 for feature extraction, which improves parameter utilization. In the atrous spatial pyramid pooling (ASPP) module, a mixed strip pooling module replaces global average pooling, and depthwise separable dilated convolution is introduced to reduce the number of parameters and improve the ability to learn multi-scale information. An attention mechanism is used to strengthen the model's representational power, and multiple shallow features are extracted from the backbone to enrich the geometric detail of the image. Experiments show that the algorithm achieves an mIoU of 81.19% with 55.51×10^6 parameters, effectively improving both segmentation accuracy and model complexity while also improving generalization.
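The mIoU figure reported in the abstract is the mean, over classes, of intersection over union. The following is a minimal sketch of that standard definition computed from a confusion matrix. The function name `mean_iou` and the toy matrix are illustrative assumptions; they are not taken from the paper, and the paper's own evaluation code may differ (for example, in how absent classes are handled).

```python
def mean_iou(confusion):
    """Mean intersection-over-union from a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and whose
    predicted class is j. Classes that appear in neither the ground
    truth nor the prediction are skipped (an assumption of this sketch).
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]                              # correctly labelled pixels
        fn = sum(confusion[c]) - tp                       # class c pixels that were missed
        fp = sum(confusion[r][c] for r in range(n)) - tp  # pixels wrongly labelled c
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 3-class confusion matrix (illustrative values only, not from the paper).
toy = [
    [50, 2, 3],
    [4, 40, 6],
    [1, 5, 30],
]
print(f"mIoU = {mean_iou(toy):.4f}")
```

Because each class contributes equally to the mean regardless of how many pixels it covers, mIoU penalizes poor performance on small classes more than plain pixel accuracy does.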

Key words: DeepLabv3+, image semantic segmentation, atrous spatial pyramid pooling, attention mechanism, depthwise separable dilated convolution
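The abstract credits depthwise separable dilated convolution with reducing the parameter count while capturing multi-scale context. A rough sketch of the arithmetic behind that claim follows. The channel widths (256) and kernel size (3) are assumed for illustration and are not taken from the paper; the dilation rates 6, 12, and 18 are those of the original DeepLabv3+ ASPP at output stride 16, and the improved network may use different ones.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv plus a 1 x 1 pointwise conv (bias ignored)."""
    return k * k * c_in + c_in * c_out


def effective_kernel(k, dilation):
    """Spatial extent covered by a k x k kernel at the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


# Assumed sizes for illustration; not taken from the paper.
C_IN, C_OUT, K = 256, 256, 3

standard = conv_params(C_IN, C_OUT, K)
separable = depthwise_separable_params(C_IN, C_OUT, K)
print(f"standard: {standard}, separable: {separable}, "
      f"ratio: {standard / separable:.2f}")

# Dilation enlarges the receptive field without adding weights.
for rate in (6, 12, 18):
    print(f"dilation {rate}: effective kernel {effective_kernel(K, rate)}")
```

Dilation changes only the spacing of the kernel taps, so the parameter savings come entirely from the depthwise/pointwise factorization, while the multi-scale coverage comes from varying the rate.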

CLC Number: