Journal of System Simulation, 2025, Vol. 37, Issue (4): 1025-1040. doi: 10.16182/j.issn1004731x.joss.23-1526

• Papers

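Remote Sensing Small Object Detection Based on Cross-stage Two-branch Feature Aggregation

Li Jie1, Liu Yang1, Li Liang2, Su Bengan2, Wei Jialong1, Zhou Guangda1, Shi Yanmin3, Zhao Zhen1

  1. College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, Shandong, China
  2. Qingdao Zichai Boyang Diesel Engine Company, Qingdao 266700, Shandong, China
  3. China Unicom Qingdao Branch, Qingdao 266001, Shandong, China
  • Received: 2023-12-13 Revised: 2024-02-06 Online: 2025-04-17 Published: 2025-04-16
  • Corresponding author: Zhao Zhen
  • About the first author: Li Jie (born 2000), male, master's student; research interest: machine vision.
  • Funding:
    National Natural Science Foundation of China (62201314); Natural Science Foundation of Shandong Province (ZR2020QF007); Strong Chain Program (强链计划, 23-1-2-qdjh-18-gx)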

Abstract:

To address the missed and false detections that YOLOv8 suffers in remote sensing small object detection because of differences in target scale and complex backgrounds, this paper proposes a small object detection method for remote sensing images based on cross-stage two-branch feature aggregation. The globally shared weights of the convolution operator are fused with the token-specific, context-aware weights of attention to obtain both high-frequency local information and low-frequency global information. A lightweight MLP captures global long-range dependencies, and a parallel cross-stage learnable visual center mechanism is designed to capture information from the local corner regions of the input image. A multidimensional residual attention mechanism is designed to aggregate the output features of the two parallel branches, capturing pixel-level pairwise relationships as well as cross-channel and cross-spatial information. Experimental results show that the proposed model achieves mAP of 73.8% and 98.1% on the DIOR and RSOD datasets, respectively, which is 1.3% and 2.1% higher than the methods it is compared against.

Key words: YOLOv8, remote sensing image, small object detection, feature fusion, attention mechanism
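
The contrast the abstract draws between globally shared convolution weights and token-specific attention weights can be illustrated with a toy one-dimensional sketch. This is not the authors' model: the function names, the high-pass kernel, the dot-product attention, and the fixed mixing weight `alpha` are all assumptions made for illustration, and the scalar residual sum is only a stand-in for the multidimensional residual attention described in the paper.

```python
import math


def shared_conv1d(x, kernel):
    """Local branch: one kernel (globally shared weights) slides over every position."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad  # zero padding keeps the length
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention1d(x):
    """Global branch: every token computes its own context-aware weights over all tokens."""
    out = []
    for q in x:
        weights = softmax([q * k for k in x])  # weights depend on the query token
        out.append(sum(w * v for w, v in zip(weights, x)))
    return out


def aggregate(x, kernel, alpha=0.5):
    """Residual sum of both branches; alpha is a hypothetical fixed mixing weight."""
    local = shared_conv1d(x, kernel)
    glob = self_attention1d(x)
    return [xi + alpha * l + (1.0 - alpha) * g for xi, l, g in zip(x, local, glob)]


if __name__ == "__main__":
    signal = [0.0, 0.0, 1.0, 0.0, 0.0]
    high_pass = [-1.0, 2.0, -1.0]  # responds to local changes (high frequency)
    print(shared_conv1d(signal, high_pass))
    print(self_attention1d(signal))
    print(aggregate(signal, high_pass))
```

In this sketch the convolution branch reacts only to local changes in the signal, while the attention branch returns a weighted average over all positions, which is why the two are described as complementary sources of high-frequency and low-frequency information.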
