Journal of System Simulation ›› 2025, Vol. 37 ›› Issue (4): 1025-1040. doi: 10.16182/j.issn1004731x.joss.23-1526
Li Jie¹, Liu Yang¹, Li Liang², Su Bengan², Wei Jialong¹, Zhou Guangda¹, Shi Yanmin³, Zhao Zhen¹
Received: 2023-12-13
Revised: 2024-02-06
Online: 2025-04-17
Published: 2025-04-16
Contact: Zhao Zhen
Li Jie, Liu Yang, Li Liang, Su Bengan, Wei Jialong, Zhou Guangda, Shi Yanmin, Zhao Zhen. Remote Sensing Small Object Detection Based on Cross-stage Two-branch Feature Aggregation[J]. Journal of System Simulation, 2025, 37(4): 1025-1040.
Table 1
Detection accuracy on the DIOR dataset
| Object category | YOLOv8 | Proposed model |
|---|---|---|
| Mean | 74.44 | 75.76 |
| Expressway service area | 64.62 | 64.13 |
| Expressway toll station | 59.92 | 62.03 |
| Airplane | 90.02 | 92.77 |
| Airport | 82.91 | 85.49 |
| Baseball field | 78.58 | 78.47 |
| Basketball court | 90.80 | 91.95 |
| Bridge | 50.30 | 51.98 |
| Chimney | 80.69 | 83.16 |
| Dam | 63.82 | 65.90 |
| Golf field | 81.03 | 80.58 |
| Ground track field | 80.43 | 79.91 |
| Harbor | 65.77 | 67.30 |
| Overpass | 61.81 | 63.88 |
| Ship | 89.94 | 92.32 |
| Stadium | 73.27 | 73.23 |
| Storage tank | 79.11 | 81.72 |
| Tennis court | 91.23 | 90.83 |
| Train station | 65.85 | 65.54 |
| Vehicle | 53.92 | 56.81 |
| Windmill | 84.70 | 87.18 |
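The mean row of Table 1 can be cross-checked against the per-class rows. The sketch below assumes the mean is the unweighted average over the 20 DIOR classes (the table does not state the averaging rule); the variable and function names are illustrative only.

```python
# Per-class accuracy values (%) copied from Table 1, in row order (20 DIOR classes).
yolov8 = [64.62, 59.92, 90.02, 82.91, 78.58, 90.80, 50.30, 80.69, 63.82, 81.03,
          80.43, 65.77, 61.81, 89.94, 73.27, 79.11, 91.23, 65.85, 53.92, 84.70]
proposed = [64.13, 62.03, 92.77, 85.49, 78.47, 91.95, 51.98, 83.16, 65.90, 80.58,
            79.91, 67.30, 63.88, 92.32, 73.23, 81.72, 90.83, 65.54, 56.81, 87.18]


def mean_accuracy(values):
    """Unweighted mean over classes, rounded to two decimals (assumed rule)."""
    return round(sum(values) / len(values), 2)


mean_yolov8 = mean_accuracy(yolov8)
mean_proposed = mean_accuracy(proposed)

# Number of classes on which the proposed model scores strictly higher.
improved = sum(p > y for p, y in zip(proposed, yolov8))

print(mean_yolov8, mean_proposed, improved)
```

Under that assumption the script reproduces the reported means (74.44 and 75.76) and shows that the proposed model scores higher on 13 of the 20 classes.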
Table 3
Comparative experimental results on the DIOR dataset
| Model | Params(M) | Frame rate/(frames/s) | mAP50/% | APS/% | APM/% | APL/% |
|---|---|---|---|---|---|---|
| Faster R-CNN | 28.50 | 6.1 | 63.10 | 6.5 | 32.3 | 57.6 |
| CenterNet | 32.70 | 19.3 | 56.05 | 5.4 | 25.2 | 51.4 |
| YOLOv3 | 5.50 | 69.8 | 57.10 | 6.8 | 25.5 | 48.1 |
| YOLOv4 | 5.90 | 66.9 | 61.01 | 6.7 | 31.3 | 50.5 |
| YOLOv5 | 7.10 | 50.2 | 66.97 | 11.1 | 37.4 | 62.0 |
| YOLOX | 5.04 | 56.1 | 69.79 | 11.3 | 35.3 | 62.7 |
| YOLOv7 | 6.10 | 66.3 | 72.83 | 12.3 | 38.9 | 69.1 |
| CF2PN | 91.60 | 19.7 | 67.25 | 11.3 | 36.0 | 61.4 |
| DEA-Net | 59.90 | 12.5 | 69.64 | 11.9 | 35.5 | 61.7 |
| MSA R-CNN | - | - | 74.37 | 12.8 | 40.6 | 72.4 |
| YOLOv8 | 11.10 | 80.7 | 74.44 | 12.7 | 40.8 | 72.6 | 
| Proposed model | 29.80 | 74.2 | 75.76 | 13.9 | 41.6 | 72.1 |
Table 4
Comparative experimental results on the RSOD dataset
| Model | Params(M) | Frame rate/(frames/s) | mAP50/% | APS/% | APM/% | APL/% |
|---|---|---|---|---|---|---|
| Faster R-CNN | 28.50 | 6.1 | 90.7 | 39.7 | 65.1 | 74.6 |
| YOLOv4 | 5.90 | 66.9 | 86.7 | 38.9 | 63.3 | 73.5 | 
| CenterNet | 32.70 | 19.3 | 85.6 | 37.7 | 62.6 | 72.4 | 
| YOLOv5 | 7.10 | 50.2 | 92.2 | 40.3 | 66.4 | 75.0 | 
| YOLOX | 5.04 | 56.1 | 94.7 | 40.7 | 68.6 | 77.1 | 
| DEA-Net | 59.90 | 12.5 | 93.1 | 40.5 | 67.9 | 76.7 | 
| YOLOv8 | 11.10 | 80.7 | 96.0 | 41.8 | 69.8 | 78.4 | 
| Proposed model | 29.80 | 74.2 | 98.1 | 45.1 | 72.7 | 76.9 |
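The gap between the last two rows of Table 4 is easiest to read as differences in percentage points. A minimal sketch, using only the values in the YOLOv8 row and the final (proposed model) row; the variable names are illustrative and not taken from the paper.

```python
# Metric values (%) copied from the last two rows of Table 4 (RSOD dataset).
metrics = ["mAP50", "APS", "APM", "APL"]
yolov8 = [96.0, 41.8, 69.8, 78.4]
proposed = [98.1, 45.1, 72.7, 76.9]

# Difference in percentage points; rounding to one decimal hides float noise.
delta = {name: round(p - y, 1) for name, p, y in zip(metrics, proposed, yolov8)}

for name, value in delta.items():
    print(f"{name}: {value:+.1f}")
```

The differences come out as +2.1 (mAP50), +3.3 (APS), +2.9 (APM) and -1.5 (APL): the largest gain is on small objects, while accuracy on large objects drops.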
Table 5
Ablation results on the RSOD dataset
| No. | CSCAP | CSEVC | EMCBAM | Params(M) | Frame rate/(frames/s) | mAP50/% | APS/% | APM/% | APL/% |
|---|---|---|---|---|---|---|---|---|---|
| Ⅰ | | | | 11.1 | 80.7 | 96.0 | 41.8 | 69.8 | 78.4 |
| Ⅱ | √ | | | 11.2 | 77.3 | 96.2 | 44.5 | 70.4 | 75.7 |
| Ⅲ | | √ | | 29.2 | 75.2 | 97.1 | 44.1 | 72.2 | 78.1 |
| Ⅳ | | | √ | 11.6 | 78.5 | 97.7 | 42.5 | 70.0 | 79.5 |
| Ⅴ | √ | √ | | 29.3 | 74.7 | 97.2 | 44.5 | 72.4 | 76.6 |
| Ⅵ | √ | | √ | 11.7 | 77.1 | 97.7 | 44.9 | 70.5 | 77.1 |
| Ⅶ | | √ | √ | 29.7 | 74.9 | 97.9 | 44.6 | 72.3 | 78.0 |
| Ⅷ | √ | √ | √ | 29.8 | 74.2 | 98.1 | 45.1 | 72.7 | 76.9 |
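The parameter counts in Table 5 are consistent with each module adding a fixed overhead to the baseline. The sketch below checks this, assuming rows Ⅱ to Ⅳ each enable a single module and rows Ⅴ to Ⅷ combine them as (Ⅱ, Ⅲ), (Ⅱ, Ⅳ), (Ⅲ, Ⅳ) and all three. That composition is an assumption inferred from the numbers, and the dictionary names are hypothetical.

```python
# Params (M) per ablation row, copied from Table 5 (ASCII keys for the row numerals).
params = {"I": 11.1, "II": 11.2, "III": 29.2, "IV": 11.6,
          "V": 29.3, "VI": 11.7, "VII": 29.7, "VIII": 29.8}

base = params["I"]

# Overhead of each single-module row relative to the baseline row I.
overhead = {row: params[row] - base for row in ("II", "III", "IV")}

# Assumed composition of the multi-module rows (not stated explicitly in the table).
combos = {"V": ("II", "III"), "VI": ("II", "IV"),
          "VII": ("III", "IV"), "VIII": ("II", "III", "IV")}

# Predicted params if the overheads simply add up.
predicted = {row: round(base + sum(overhead[r] for r in parts), 1)
             for row, parts in combos.items()}

for row, value in predicted.items():
    print(row, value, params[row])
```

With these assumptions the predicted and reported values agree for all four combined rows, and almost all of the extra parameters (about 18.1 M of 18.7 M) come from the module enabled in row Ⅲ.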
References
| No. | Reference |
|---|---|
| 1 | Dutta Suparna, Das Monidipa. Remote Sensing Scene Classification Under Scarcity of Labelled Samples—A Survey of the State-of-the-arts[J]. Computers & Geosciences, 2023, 171: 105295. |
| 2 | Girshick R, Donahue J, Darrell T, et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation[C]//2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 580-587. | 
| 3 | Ren Shaoqing, He Kaiming, Girshick R, et al. Faster R-CNN: Towards Real-time Object Detection with Region Proposal Networks[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2015: 91-99. | 
| 4 | Redmon J, Divvala S, Girshick R, et al. You Only Look Once: Unified, Real-time Object Detection[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2016: 779-788. | 
| 5 | Redmon J, Farhadi A. YOLO9000: Better, Faster, Stronger[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2017: 6517-6525. | 
| 6 | Redmon J, Farhadi A. YOLOv3: An Incremental Improvement[EB/OL]. (2018-04-08) [2023-12-06]. |
| 7 | Bochkovskiy A, Wang C Y, Liao Hongyuan. YOLOv4: Optimal Speed and Accuracy of Object Detection[EB/OL]. (2020-04-23) [2023-12-06]. |
| 8 | Ge Zheng, Liu Songtao, Wang Feng, et al. YOLOX: Exceeding YOLO Series in 2021[EB/OL]. (2021-08-06) [2023-12-06]. |
| 9 | Li Chuyi, Li Lulu, Jiang Hongliang, et al. YOLOv6: A Single-stage Object Detection Framework for Industrial Applications[EB/OL]. (2022-09-07) [2023-12-06]. |
| 10 | Wang C Y, Bochkovskiy A, Liao Hongyuan. YOLOv7: Trainable Bag-of-freebies Sets New State-of-the-art for Real-time Object Detectors[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2023: 7464-7475. | 
| 11 | Mei Yuan, Wu Kaijun, Xu Zehao, et al. SNG-YOLOX: Non-obvious Remote Sensing Target Detection Based on Enhanced YOLOX[EB/OL]. (2022-04-22) [2023-12-07]. |
| 12 | Li Ronghao, Shen Ying. YOLOSR-IST: A Deep Learning Method for Small Target Detection in Infrared Remote Sensing Images Based on Super-resolution and YOLO[J]. Signal Processing, 2023, 208: 108962. | 
| 13 | Zhao Wenqing, Kang Yijin, Zhao Zhenbing, et al. A Remote Sensing Image Object Detection Algorithm with Improved YOLOv5s[J]. CAAI Transactions on Intelligent Systems, 2023, 18(1): 86-95. (in Chinese) |
| 14 | Fan Qihang, Huang Huaibo, Guan Jiyang, et al. Rethinking Local Perception in Lightweight Vision Transformer[EB/OL]. (2023-06-01) [2023-12-09]. |
| 15 | Quan Yu, Zhang Dong, Zhang Liyan, et al. Centralized Feature Pyramid for Object Detection[J]. IEEE Transactions on Image Processing, 2023, 32: 4341-4354. | 
| 16 | Ouyang Daliang, He Su, Zhang Guozhong, et al. Efficient Multi-scale Attention Module with Cross-spatial Learning[C]//ICASSP 2023—2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Piscataway: IEEE, 2023: 1-5. | 
| 17 | Qu Junsuo, Tang Zongbing, Zhang Le, et al. Remote Sensing Small Object Detection Network Based on Attention Mechanism and Multi-scale Feature Fusion[J]. Remote Sensing, 2023, 15(11): 2728. | 
| 18 | Zhou Liming, Zheng Chang, Yan Haoxin, et al. RepDarkNet: A Multi-branched Detector for Small-target Detection in Remote Sensing Images[J]. ISPRS International Journal of Geo-Information, 2022, 11(3): 158. | 
| 19 | Pei Wenjing, Shi Zhanhao, Gong Kai. Small Target Detection with Remote Sensing Images Based on an Improved YOLOv5 Algorithm[J]. Frontiers in Neurorobotics, 2022, 16: 1074862. | 
| 20 | Tan Mingxing, Le Q V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks[EB/OL]. (2020-09-11) [2023-12-11]. |
| 21 | Qiu Tianheng, Wang Ling, Wang Peng, et al. Research on Object Detection Algorithm Based on Improved YOLOv5[J]. Computer Engineering and Applications, 2022, 58(13): 63-73. (in Chinese) |
| 22 | Qiao Siyuan, Chen L C, Yuille A. DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2021: 10208-10219. | 
| 23 | Zhao Qijie, Sheng Tao, Wang Yongtao, et al. M2Det: A Single-shot Object Detector Based on Multi-level Feature Pyramid Network[C]//Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Conference on Innovative Applications of Artificial Intelligence and Ninth Symposium on Educational Advances in Artificial Intelligence. Palo Alto: AAAI Press, 2019: 9259-9266. | 
| 24 | Li Chao, Wang Kai, Ding Caichang, et al. Improved Feature Fusion Network for Small Object Detection in Remote Sensing Images[J]. Computer Engineering and Applications, 2023, 59(17): 232-241. (in Chinese) |
| 25 | Hu Jie, Shen Li, Sun Gang. Squeeze-and-excitation Networks[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141. | 
| 26 | Woo Sanghyun, Park Jongchan, Lee J Y, et al. CBAM: Convolutional Block Attention Module[C]//Computer Vision—ECCV 2018. Cham: Springer International Publishing, 2018: 3-19. | 
| 27 | Si Chenyang, Yu Weihao, Zhou Pan, et al. Inception Transformer[C]//Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2024: 23495-23509. | 
| 28 | Howard A G, Zhu Menglong, Chen Bo, et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications[EB/OL]. (2017-04-17) [2023-12-12]. |
| 29 | Tolstikhin I, Houlsby N, Kolesnikov A, et al. MLP-Mixer: An All-MLP Architecture for Vision[C]//Proceedings of the 35th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2024: 24261-24272. |
| 30 | Yu Weihao, Luo Mi, Zhou Pan, et al. MetaFormer is Actually What You Need for Vision[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2022: 10809-10819. | 
| 31 | Larsson G, Maire M, Shakhnarovich G. FractalNet: Ultra-deep Neural Networks Without Residuals[EB/OL]. (2017-05-26) [2023-12-13]. |
| 32 | Duan Kaiwen, Bai Song, Xie Lingxi, et al. CenterNet: Keypoint Triplets for Object Detection[C]//2019 IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway: IEEE, 2019: 6568-6577. | 
| 33 | Huang Wei, Li Guanyi, Chen Qiqiang, et al. CF2PN: A Cross-scale Feature Fusion Pyramid Network Based Remote Sensing Target Detection[J]. Remote Sensing, 2021, 13(5): 847. | 
| 34 | Liang Dong, Geng Qixiang, Wei Zongqi, et al. Anchor Retouching via Model Interaction for Robust Object Detection in Aerial Images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 1-13. | 
| 35 | Sharifuzzaman Sagar A S M, Chen Yu, Xie Yakun, et al. MSA R-CNN: A Comprehensive Approach to Remote Sensing Object Detection and Scene Understanding[J]. Expert Systems with Applications, 2024, 241: 122788. | 