Journal of System Simulation (系统仿真学报) ›› 2023, Vol. 35 ›› Issue (10): 2249-2261. doi: 10.16182/j.issn1004731x.joss.23-FZ0824

• Papers •

Research on Multi-aircraft Air Combat Behavior Modeling Based on Hierarchical Intelligent Modeling Methods

Wang Yukun (王宇琨), Wang Ze (王泽), Dong Liwei (董力维), Li Ni (李妮)

  1. School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China
  • Received: 2023-07-03 Revised: 2023-09-15 Online: 2023-10-30 Published: 2023-10-26
  • Contact: Li Ni E-mail: wyk_13@foxmail.com; lini@buaa.edu.cn
  • First author: Wang Yukun (born 1999), male, Manchu, PhD student; research interest: system modeling and simulation. E-mail: wyk_13@foxmail.com

Abstract:

In multi-aircraft air combat confrontation scenarios, the high-dimensional state-action space makes force-level game decision-making difficult. To address this problem, a decision-generation strategy for force agents based on deep reinforcement learning is adopted. Situation cognition and reward generation algorithms for intelligent force games are proposed, and a hierarchical behavior modeling framework based on hybrid intelligent modeling methods is constructed. The framework addresses the technical difficulty of sparse rewards in the reinforcement learning process and provides a feasible reinforcement learning training method for air combat problems that are large-scale and involve multiple aircraft types and many elements.

Key words: combat simulation, multi-agent system, deep reinforcement learning (DRL), non-sparse reward function
