Adaptive Multi-scale Feature Pyramid Network for Occlusion Pedestrian Detection

doi:10.16182/j.issn1004731x.joss.24-0018

Abstract

Existing detectors struggle to extract complete pedestrian features in occluded scenes, which leads to low detection accuracy. To address this, an adaptive multi-scale feature pyramid network is proposed. First, a multi-scale feature enhancement module (MFEM) captures the visible regions of pedestrians at different scales through a multi-branch network with different receptive fields. Second, an adaptive fusion module (AFM) computes the importance of each pixel by optimizing the mean and variance at the spatial and feature levels, which enhances the texture and semantic features of pedestrians and fuses features of different scales more efficiently. Together, these two modules form a complete feature pyramid network that can also serve other downstream tasks. Third, a non-maximum suppression algorithm, Soft-SNMS (soft-set-non maximum suppression), uses different decay functions so that, when all candidate boxes of a proposal are predicted, high-quality candidates are retained and redundant ones are decayed, which also improves training efficiency. Experiments on the CrowdHuman and WiderPerson datasets show AP improvements of 4.04% and 1.51%, respectively, over the original method, indicating that the proposed method effectively improves pedestrian detection accuracy in occluded environments.
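As a rough illustration of the multi-branch idea behind MFEM: one common way to give parallel branches different receptive fields is to use dilated convolutions. The abstract does not say how MFEM achieves this, so the 3×3 kernel and the dilation rates (1, 2, 3) below are assumptions, and the function names are hypothetical. The sketch only computes the effective kernel size of each branch, k_eff = d(k - 1) + 1.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by one dilated convolution kernel."""
    return dilation * (kernel_size - 1) + 1


def branch_receptive_fields(kernel_size=3, dilations=(1, 2, 3)):
    """Map each assumed dilation rate to the extent its branch covers.

    The kernel size and dilation rates are illustrative assumptions,
    not values taken from the paper.
    """
    return {d: effective_kernel_size(kernel_size, d) for d in dilations}


fields = branch_receptive_fields()
```

Under these assumptions the three branches cover 3, 5, and 7 pixels per side, so a larger dilation lets a branch respond to a larger visible region without adding parameters.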

Key words: occlusion pedestrian detection, multi-scale features, adaptive fusion, feature pyramid, non-maximum suppression
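A minimal sketch of the kind of score decay that Soft-SNMS builds on, assuming the Gaussian decay of Soft-NMS [15] and the same-proposal rule of Set NMS [19] (boxes predicted from the same proposal do not suppress each other). The decay functions actually used by Soft-SNMS are not given in this abstract, so `soft_set_nms`, `sigma`, and `score_thr` are hypothetical names and values, not the authors' implementation.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_set_nms(boxes, scores, proposal_ids, sigma=0.5, score_thr=0.05):
    """Gaussian Soft-NMS that skips decay between boxes of one proposal.

    Returns (index, final_score) pairs in the order the boxes were selected.
    """
    remaining = [[i, s] for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the box with the highest current (possibly decayed) score.
        pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(pos)
        if best_score < score_thr:
            break  # every remaining score is lower still
        kept.append((best_idx, best_score))
        for item in remaining:
            j = item[0]
            if proposal_ids[j] == proposal_ids[best_idx]:
                continue  # Set NMS rule: same proposal, no suppression
            overlap = iou(boxes[best_idx], boxes[j])
            item[1] *= math.exp(-(overlap ** 2) / sigma)  # Gaussian decay
    return kept


boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (100, 100, 110, 110)]
scores = [0.9, 0.8, 0.7]
different = soft_set_nms(boxes, scores, proposal_ids=[0, 1, 2])
same = soft_set_nms(boxes, scores, proposal_ids=[0, 0, 2])
```

In `different`, the second box overlaps the first heavily and its score is decayed rather than discarded; in `same`, the first two boxes share a proposal, so the second keeps its score of 0.8.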

CLC number:

TP391

Zhou Huaping, Wu Tao, Sun Kelei. Adaptive Multi-scale Feature Pyramid Network for Occlusion Pedestrian Detection[J]. Journal of System Simulation, 2025, 37(5): 1222-1233.

Figures/Tables (11)

Fig. 1

Fig. 2

Fig. 3

Fig. 4

Table 1

Table 2

Fig. 5

Fig. 6

Table 3

Comparison results on the CrowdHuman dataset (%)

Method	AP	MR^-2	JI^-2
IterDet^[9]	88.08	49.44	-
DMSFLN^[24]	89.18	43.59	-
Li et al.^[18]	89.75	48.28	-
PEDR^[23]	91.60	43.70	83.3
E2EDET^[11]	92.10	41.50	84.00
OAF-Net^[25]	89.80	45.00	-
DDAD^[26]	92.58	39.70	83.58
O2F^[27]	90.90	-	-
OPL^[4]	91.00	44.9	-
MADet^[28]	90.20	47.5	-
Proposed method	92.12	39.63	84.18
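The MR^-2 column is the log-average miss rate: the geometric mean of the miss rate sampled at nine false-positives-per-image (FPPI) points spaced evenly in log space over [10^-2, 10^0]; lower is better. The sketch below shows that calculation under one common sampling convention (take the miss rate at the largest FPPI not exceeding each reference point, or 1.0 if the curve starts later). The convention and the function name are assumptions; the evaluation code used for this table may differ in such details.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of miss rates sampled on a log-spaced FPPI grid.

    `fppi` must be sorted in ascending order, with `miss_rate[i]` the
    miss rate reached at `fppi[i]`.
    """
    # Reference points 10^-2 ... 10^0, evenly spaced in log space.
    refs = [10 ** (-2 + 2 * k / (num_points - 1)) for k in range(num_points)]
    samples = []
    for ref in refs:
        value = 1.0  # assumed fallback when the curve has no point yet
        for f, m in zip(fppi, miss_rate):
            if f <= ref:
                value = m
            else:
                break
        samples.append(max(value, 1e-10))  # guard against log(0)
    return math.exp(sum(math.log(s) for s in samples) / len(samples))


# Toy curve: miss rate 0.8 up to FPPI 0.2, then 0.2 afterwards.
toy = log_average_miss_rate([0.001, 0.2], [0.8, 0.2])
```

With the toy curve, six reference points fall below FPPI 0.2 and three above it, so the result is the geometric mean of six 0.8 values and three 0.2 values.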

Table 4

Comparison results on the WiderPerson dataset (%)

Method	AP	MR^-2
Zou et al.^[12]	85.12	-
Soft NMS^[15]	-	60.05
IterDet 1-iter^[9]	89.49	40.35
IterDet 2-iter^[9]	91.95	40.78
Double Mask R-CNN^[29]	86.80	39.07
DMSFLN^[24]	91.29	40.43
Proposed method	93.46	38.87
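A quick arithmetic check, using only AP values from the two comparison tables, shows where the gains quoted in the abstract (4.04 and 1.51) come from: they equal the gaps to the IterDet [9] rows. That IterDet is the "original method" is an inference from the numbers, not something this page states. The function name is hypothetical.

```python
def ap_gain(ours, baseline):
    """Difference in AP (percentage points), rounded to two decimals."""
    return round(ours - baseline, 2)


# AP values copied from the CrowdHuman and WiderPerson comparison tables.
crowdhuman_gain = ap_gain(92.12, 88.08)   # proposed vs. IterDet
widerperson_gain = ap_gain(93.46, 91.95)  # proposed vs. IterDet 2-iter
```

Because both tables report AP in percent, these differences are percentage points rather than relative improvements.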

References (29)

1	Chen Long, Lin Shaobo, Lu Xiankai, et al. Deep Neural Network Based Vehicle and Pedestrian Detection for Autonomous Driving: A Survey[J]. IEEE Transactions on Intelligent Transportation Systems, 2021, 22(6): 3234-3246.
2	Wang Runmin, Zhu Yu, Zhao Xiangmo, et al. Research Progress on Test Scenario of Autonomous Driving[J]. Journal of Traffic and Transportation Engineering, 2021, 21(2): 21-37. (in Chinese)
3	Shen Huan, Li Shunming, Bo Fangchao, et al. On Road Vehicles Real-time Detection and Tracking Using Vision Based Approach[J]. Acta Optica Sinica, 2010, 30(4): 1076-1083. (in Chinese)
4	Song Xiaolin, Chen Binghui, Li Pengyu, et al. Optimal Proposal Learning for Deployable End-to-end Pedestrian Detection[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2023: 3250-3260.
5	Zhang Yi, Zhang Yicheng, Su Rong. Pedestrian-safety-aware Traffic Light Control Strategy for Urban Traffic Congestion Alleviation[J]. IEEE Transactions on Intelligent Transportation Systems, 2021, 22(1): 178-193.
6	Sun Yingchun, Pan Shuguo, Zhao Tao, et al. Traffic Light Detection Based on Optimized YOLOv3 Algorithm[J]. Acta Optica Sinica, 2020, 40(12): 137-145. (in Chinese)
7	Su Tong, Wang Ying, Deng Qiyang, et al. Improved Foggy Pedestrian and Vehicle Detection Algorithm Based on YOLOv5[J]. Journal of System Simulation, 2024, 36(10): 2413-2422. (in Chinese)
8	Xiang Nan, Wang Lu, Jia Chongliu, et al. Simulation of Occluded Pedestrian Detection Based on Improved YOLO[J]. Journal of System Simulation, 2023, 35(2): 286-299. (in Chinese)
9	Rukhovich Danila, Sofiiuk Konstantin, Galeev Danil, et al. IterDet: Iterative Scheme for Object Detection in Crowded Environments[C]//Structural, Syntactic, and Statistical Pattern Recognition. Cham: Springer International Publishing, 2021: 344-354.
10	Ren Shaoqing, He Kaiming, Girshick R, et al. Faster R-CNN: Towards Real-time Object Detection with Region Proposal Networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
11	Zheng Anlin, Zhang Yuang, Zhang Xiangyu, et al. Progressive End-to-end Object Detection in Crowded Scenes[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2022: 847-856.
12	Zou Ziyin, Ge Shaoyan, Da Feipeng, et al. Occluded Pedestrian Detection Algorithm Based on Attention Mechanism[J]. Acta Optica Sinica, 2021, 41(15): 149-157. (in Chinese)
13	Lu Ruiqi, Ma Huimin, Wang Yu. Semantic Head Enhanced Pedestrian Detection in a Crowd[J]. Neurocomputing, 2020, 400: 343-351.
14	Zhang K, Xiong Feng, Sun Peize, et al. Double Anchor R-CNN for Human Detection in a Crowd[EB/OL]. (2019-09-22) [2022-09-23].
15	Bodla N, Singh B, Chellappa R, et al. Soft-NMS - Improving Object Detection with One Line of Code[C]//2017 IEEE International Conference on Computer Vision (ICCV). Piscataway: IEEE, 2017: 5562-5570.
16	He Yihui, Zhang Xiangyu, Savvides M, et al. Softer-NMS: Rethinking Bounding Box Regression for Accurate Object Detection[EB/OL]. (2018-09-23) [2023-10-06].
17	Liu Songtao, Huang Di, Wang Yunhong. Adaptive NMS: Refining Pedestrian Detection in a Crowd[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2019: 6452-6461.
18	Li Xiang, He Miao, Luo Haibo. Occluded Pedestrian Detection Algorithm Based on Improved YOLOv3[J]. Acta Optica Sinica, 2022, 42(14): 152-161. (in Chinese)
19	Chu Xuangeng, Zheng Anlin, Zhang Xiangyu, et al. Detection in Crowded Scenes: One Proposal, Multiple Predictions[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2020: 12211-12220.
20	Liu Shu, Qi Lu, Qin Haifang, et al. Path Aggregation Network for Instance Segmentation[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 8759-8768.
21	Shao Shuai, Zhao Zijian, Li Boxun, et al. CrowdHuman: A Benchmark for Detecting Human in a Crowd[EB/OL]. (2018-04-30) [2023-11-02].
22	Zhang Shifeng, Xie Yiliang, Wan Jun, et al. WiderPerson: A Diverse Dataset for Dense Pedestrian Detection in the Wild[J]. IEEE Transactions on Multimedia, 2020, 22(2): 380-393.
23	Lin M, Li Chuming, Bu Xingyuan, et al. DETR for Crowd Pedestrian Detection[EB/OL]. (2020-12-12) [2022-11-02].
24	He Ye, Zhu Chao, Yin Xucheng. Occluded Pedestrian Detection via Distribution-based Mutual-supervised Feature Learning[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(8): 10514-10529.
25	Li Qiming, Su Yijing, Gao Yin, et al. OAF-Net: An Occlusion-aware Anchor-free Network for Pedestrian Detection in a Crowd[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(11): 21291-21300.
26	Tang Wenxiao, Liu Kun, Shakeel M S, et al. DDAD: Detachable Crowd Density Estimation Assisted Pedestrian Detection[J]. IEEE Transactions on Intelligent Transportation Systems, 2023, 24(2): 1867-1878.
27	Li S, Li M, Li R, et al. One-to-few Label Assignment for End-to-end Dense Detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. [S.l. : s.n.], 2023: 7350-7359.
28	Xie Xingxing, Lang Chunbo, Miao Shicheng, et al. Mutual-Assistance Learning for Object Detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(12): 15171-15184.
29	Liu Congqiang, Wang Haosen, Liu Chunjian. Double Mask R-CNN for Pedestrian Detection in a Crowd[J]. Mobile Information Systems, 2022, 2022(1): 4012252.

Dataset	CrowdHuman	WiderPerson
persons/img	22.64	28.87
pairs/img (IoU > 0.5)	2.40	2.15

MFEM	AFM	SS	CrowdHuman AP/%	CrowdHuman MR^-2/%	CrowdHuman JI^-2/%	WiderPerson AP/%	WiderPerson MR^-2/%	fps
			90.30	41.28	82.63	92.15	40.23	70
√			90.92	40.86	82.97	92.48	39.76	61
√	√		91.45	40.17	83.45	93.13	39.12	53
√	√	√	92.12	39.62	84.18	93.46	38.87	50
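To read the ablation table more easily, the per-module contribution can be computed as the difference between consecutive rows. The sketch below does this for AP on both datasets and for the speed cost; the values are copied from the table, while the configuration labels and function name are hypothetical.

```python
# (configuration, CrowdHuman AP/%, WiderPerson AP/%, fps), from the ablation table
ablation = [
    ("baseline", 90.30, 92.15, 70),
    ("+MFEM", 90.92, 92.48, 61),
    ("+MFEM+AFM", 91.45, 93.13, 53),
    ("+MFEM+AFM+SS", 92.12, 93.46, 50),
]


def stepwise_gains(rows, column):
    """Change in one column between each row and the row before it."""
    return [round(rows[i][column] - rows[i - 1][column], 2)
            for i in range(1, len(rows))]


crowdhuman_steps = stepwise_gains(ablation, 1)
widerperson_steps = stepwise_gains(ablation, 2)
# Relative speed cost of the full model compared with the baseline.
fps_drop = round(1 - ablation[-1][3] / ablation[0][3], 3)
```

On CrowdHuman the three steps add 0.62, 0.53, and 0.67 AP points; on WiderPerson they add 0.33, 0.65, and 0.33. The full configuration runs at 50 fps against 70 fps for the baseline, a drop of about 28.6%.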
