Journal of System Simulation ›› 2025, Vol. 37 ›› Issue (5): 1222-1233. doi: 10.16182/j.issn1004731x.joss.24-0018

• Research Paper •

Adaptive Multi-scale Feature Pyramid Network for Occlusion Pedestrian Detection

Zhou Huaping1,2, Wu Tao2, Sun Kelei1

  1. School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001, China
  2. School of Economics and Management, Anhui University of Science and Technology, Huainan 232001, China
  • Received: 2024-01-05 Revised: 2024-03-04 Online: 2025-05-20 Published: 2025-05-23
  • Contact: Wu Tao
  • About the first author: Zhou Huaping (1977-), female, professor, Ph.D.; research interests include object detection and deep learning.
  • Funding:
    National Natural Science Foundation of China (52374154); Anhui Provincial Key Research and Development Program (202004b11020029)

Abstract:

Existing detectors struggle to extract complete pedestrian features in occluded environments, which leads to low detection accuracy. To address this, an adaptive multi-scale feature pyramid network is proposed. A multi-scale feature enhancement module (MFEM) captures the visible regions of pedestrians at different scales through a multi-branch network with different receptive fields. An adaptive fusion module (AFM) computes the importance of each pixel by optimizing the mean and variance at the spatial level and at the feature level, which enhances the texture and semantic features of pedestrians and fuses features of different scales more efficiently. Together, these two modules can be assembled into a complete feature pyramid network for other downstream tasks. A non-maximum suppression algorithm named Soft-SNMS (soft-set-non maximum suppression) is also proposed. By designing different decay functions, it retains high-quality candidate boxes and decays redundant ones when predicting all candidate boxes of a proposal, and it improves model training efficiency. The proposed method is evaluated on the CrowdHuman and WiderPerson datasets, where it improves AP by 4.04% and 1.51%, respectively, over the original method. The results indicate that the method effectively improves the detection accuracy of pedestrian targets in occluded environments.

Key words: occlusion pedestrian detection, multi-scale features, adaptive fusion, feature pyramid, non-maximum suppression
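
The abstract does not give the decay functions used by Soft-SNMS, so the sketch below is not the authors' algorithm. It only illustrates the general idea of decaying the scores of overlapping candidate boxes instead of discarding them, using the widely known generic Soft-NMS scheme with linear and Gaussian decay. The function names (`iou`, `soft_nms`), the `(x1, y1, x2, y2)` box format, and all default thresholds are assumptions made for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, method="gaussian", iou_thresh=0.5,
             sigma=0.5, score_thresh=0.001):
    """Generic Soft-NMS (illustrative; NOT the paper's Soft-SNMS).

    Repeatedly selects the highest-scoring box and decays the scores of
    the remaining boxes according to their overlap with it. Returns a
    list of (original_index, final_score) in selection order.
    """
    remaining = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best_pos)
        kept.append((best_idx, best_score))

        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            if method == "linear":
                # Linear decay, applied only above the IoU threshold.
                weight = 1.0 - overlap if overlap > iou_thresh else 1.0
            elif method == "gaussian":
                # Gaussian decay, applied smoothly to every overlap.
                weight = math.exp(-(overlap ** 2) / sigma)
            else:
                raise ValueError(f"unknown decay method: {method}")
            new_score = score * weight
            # Drop boxes whose score has decayed below the floor.
            if new_score >= score_thresh:
                decayed.append((idx, new_score))
        remaining = decayed
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    for index, score in soft_nms(boxes, scores):
        print(index, round(score, 4))
```

In this toy input, the second box heavily overlaps the first, so its score is reduced rather than removed. It therefore ranks below the isolated third box while remaining available as a detection, which is the behavior that matters when two pedestrians occlude each other.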

CLC number: