Journal of System Simulation, 2025, Vol. 37, Issue (5): 1256-1265. doi: 10.16182/j.issn1004731x.joss.24-0079

• Research Paper •

A Modeling and Simulation Method for Firepower Intelligent Decision-making of Directed Energy System Based on Joint DQN

Qu Changhong, Wang Junjie, Wang Kun, Cui Qingyong, Chen Jiangyang, Wang Xinpeng

  1. China Jiuyuan Hi-tech Equipment Corporation Limited, Beijing 100094, China
  • Received: 2024-01-19 Revised: 2024-04-23 Online: 2025-05-20 Published: 2025-05-23
  • Contact: Wang Junjie
  • About the first author: Qu Changhong (1979-), male, associate research fellow, master's degree. His research interests are the demonstration and effectiveness evaluation of high-tech equipment.

Abstract:

To address the problem of making firepower decisions dynamically and intelligently, while remaining compatible with multiple deployment schemes, when a directed energy system is used in anti-UAV cluster combat, a deep reinforcement learning model is established. Because the multi-agent state and action spaces of this model are high-dimensional, a modeling and simulation method for firepower intelligent decision-making of directed energy systems based on a joint deep Q-network (DQN) is proposed. The state space is constructed from the state of the directed energy system, the state of the UAV cluster, and the state of the deployment area of the directed energy system. A joint mechanism shares the state information of every piece of equipment and the network parameters of equipment of the same type. A threat assessment mechanism is designed to improve generalization, and an action shielding mechanism is established to mask invalid actions. Together, these mechanisms effectively resolve the training divergence and slow convergence that the curse of dimensionality in the multi-agent state and action spaces causes, and they improve the learning efficiency and generalization of the joint DQN. Simulation results show that the method outperforms the traditional rule-based method, which verifies its feasibility and practicality and offers a new approach to firepower intelligent decision-making against UAV clusters for directed energy systems compatible with multiple deployment schemes.

Key words: directed energy system, anti-UAV cluster, deep Q-network (DQN), joint mechanism, threat assessment mechanism, action shielding mechanism
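The action shielding idea named in the abstract can be illustrated with a generic sketch. The abstract does not describe how the authors implement it, so the code below is only one common way to mask invalid actions in a DQN, not the paper's method. The function names, the Boolean mask layout, and the discount factor `gamma` are all assumptions made for illustration.

```python
import numpy as np

def masked_greedy_action(q_values, valid_mask):
    """Return the index of the highest-Q action among the valid ones.

    q_values: one Q estimate per discrete action (hypothetical layout).
    valid_mask: True where the action is currently allowed (assumption:
    the environment can report this, e.g. a target is within range).
    """
    q = np.asarray(q_values, dtype=float)
    mask = np.asarray(valid_mask, dtype=bool)
    if not mask.any():
        raise ValueError("at least one action must be valid")
    # Invalid actions get -inf so argmax can never select them.
    return int(np.argmax(np.where(mask, q, -np.inf)))

def masked_td_target(reward, next_q_values, next_valid_mask,
                     gamma=0.99, done=False):
    """One-step TD target whose max runs only over valid next actions."""
    mask = np.asarray(next_valid_mask, dtype=bool)
    if done or not mask.any():
        # Terminal step, or no valid follow-up action: no bootstrap term.
        return float(reward)
    next_q = np.where(mask, np.asarray(next_q_values, dtype=float), -np.inf)
    return float(reward + gamma * next_q.max())
```

Applying the same mask in both action selection and the bootstrap target keeps invalid actions from being chosen and from inflating value estimates, which is one plausible reason such masking can help training in large action spaces.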

CLC Number: