Journal of System Simulation (系统仿真学报) ›› 2026, Vol. 38 ›› Issue (2): 360-371. doi: 10.16182/j.issn1004731x.joss.25-0595

• Machine Learning Algorithms •

多约束条件下基于强化学习的无人机团队定向优化方法

杨灿, 陈凯, 朱峰   

  1. 国防科技大学 系统工程学院,湖南 长沙 410073
  • 收稿日期:2025-06-24 修回日期:2025-09-07 出版日期:2026-02-18 发布日期:2026-02-11
  • 通讯作者: 朱峰
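  • About the first author: Yang Can (2000-), male, master's student; research interest: intelligent behavior modeling.
  • Funding:
    National Natural Science Foundation of China (61903368)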

Reinforcement Learning Based Method for UAV Team Orienteering Optimization under Multi-constraint Condition

Yang Can, Chen Kai, Zhu Feng   

  1. College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
  • Received: 2025-06-24 Revised: 2025-09-07 Online: 2026-02-18 Published: 2026-02-11
  • Contact: Zhu Feng

摘要:

为解决多重复杂场景下传统优化方法难以高效求解,而现有强化学习方法存在求解质量不高、训练效率低的问题,提出了一种基于注意力机制的强化学习高效求解方法。设计多信息融合的动态注意力策略网络以提升解的质量;结合可视图法简化威胁区约束,加快了训练收敛速度;解码阶段引入顺序重排机制,优化了解的性能。仿真结果表明:该方法可在毫秒级时间内生成高质量解,其总奖励逼近甚至优于Ortools与PyVRP等传统求解器在数秒至数百秒内所得结果,训练效率大幅提升,单轮训练时间由数小时缩短至约30 min。

关键词: 强化学习, 团队定向问题, 多无人机系统, 注意力机制

Abstract:

In scenarios with multiple complex constraints, traditional optimization methods struggle to solve the problem efficiently, while existing reinforcement learning approaches suffer from low solution quality and low training efficiency. To address this, an efficient attention-based reinforcement learning method is proposed. A dynamic attention policy network with multi-information fusion is designed to improve solution quality. The visibility graph method is used to simplify threat zone constraints, which speeds up training convergence, and a sequence reordering mechanism is introduced in the decoding stage to further improve solution performance. Simulation results show that the method generates high-quality solutions within milliseconds, with total rewards that approach or even surpass those obtained by traditional solvers such as Ortools and PyVRP in several seconds to hundreds of seconds. Training efficiency is also greatly improved, with the training time per epoch reduced from several hours to about 30 minutes.

Key words: reinforcement learning, team orienteering problem, multi-UAV systems, attention mechanism

中图分类号: