A Modeling and Simulation Method for Firepower Intelligent Decision-making of Directed Energy System Based on Joint DQN

Journal of System Simulation, 2025, Vol. 37, Issue (5): 1256-1265. doi: 10.16182/j.issn1004731x.joss.24-0079

Qu Changhong, Wang Junjie, Wang Kun, Cui Qingyong, Chen Jiangyang, Wang Xinpeng

China Jiuyuan Hi-tech Equipment Corporation Limited, Beijing 100094, China

Received: 2024-01-19 Revised: 2024-04-23 Online: 2025-05-20 Published: 2025-05-23
Corresponding author: Wang Junjie
About the first author: Qu Changhong (1979-), male, associate research fellow, master's degree. Research interests: demonstration and effectiveness evaluation of high-tech equipment.

Abstract:

To address the problem of making intelligent firepower decisions dynamically, and in a way that remains compatible with multiple deployment schemes, when a directed energy system is used against a UAV cluster, a deep reinforcement learning model is established. Because the multi-agent state and action spaces of this model are high-dimensional, a modeling and simulation method for firepower intelligent decision-making of directed energy systems based on a joint deep Q network (DQN) is proposed. The state space is constructed from the state of the directed energy system, the state of the UAV cluster, and the state of the directed energy system deployment area. A joint mechanism shares the state information of every piece of equipment and the network parameters of equipment of the same type. A threat assessment mechanism is designed to improve generalization, and an action shielding mechanism is established to mask invalid actions. Together these mechanisms address the training divergence and slow convergence that the curse of dimensionality in the multi-agent state and action spaces causes, and they improve the learning efficiency and generalization of the joint DQN. According to the simulation results, the method outperforms the traditional rule-based method, which verifies its feasibility and practicability and provides a new approach to intelligent firepower decision-making for directed energy systems against UAV clusters that is compatible with various deployment schemes.

Key words: directed energy system, anti-UAV cluster, deep Q network (DQN), joint mechanism, threat assessment mechanism, action shielding mechanism
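The joint mechanism and the action shielding mechanism described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the state dimension, the number of targets, the equipment type names used as dictionary keys, the single linear layer standing in for a Q-network, and the helper names `masked_greedy_action` and `select_action` are all assumptions made for illustration. The sketch only shows the two ideas the abstract names: units of the same type share one set of parameters and read one shared state vector, and invalid actions are excluded before the greedy choice.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 12   # hypothetical length of the shared (joint) state vector
N_TARGETS = 5    # hypothetical number of UAVs; action k means "engage UAV k"

# One weight matrix per equipment TYPE, not per unit, so units of the same
# type share parameters. A single linear layer stands in for a Q-network.
shared_q_weights = {
    "laser": rng.normal(size=(STATE_DIM, N_TARGETS)),
    "microwave": rng.normal(size=(STATE_DIM, N_TARGETS)),
}


def masked_greedy_action(q_values, valid_mask):
    """Return the index of the highest Q-value among valid actions."""
    if not valid_mask.any():
        return None  # no valid action is available at this step
    masked = np.where(valid_mask, q_values, -np.inf)
    return int(np.argmax(masked))


def select_action(unit_type, joint_state, valid_mask, epsilon):
    """Epsilon-greedy selection restricted to the valid actions."""
    valid_idx = np.flatnonzero(valid_mask)
    if valid_idx.size == 0:
        return None
    if rng.random() < epsilon:
        # Exploration also samples only from valid actions.
        return int(rng.choice(valid_idx))
    q_values = joint_state @ shared_q_weights[unit_type]
    return masked_greedy_action(q_values, valid_mask)


# Every unit reads the same joint state; the mask differs per unit.
joint_state = rng.normal(size=STATE_DIM)
laser_mask = np.array([True, False, True, False, False])
action = select_action("laser", joint_state, laser_mask, epsilon=0.1)
```

Masking with negative infinity before the argmax guarantees that a masked action is never selected, whatever the network outputs. What makes an action invalid (for example, a target that cannot currently be engaged) is left to the mask and is not specified here.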

CLC number: TP391.9

Cite this article:

Qu Changhong, Wang Junjie, Wang Kun, et al. A Modeling and Simulation Method for Firepower Intelligent Decision-making of Directed Energy System Based on Joint DQN[J]. Journal of System Simulation, 2025, 37(5): 1256-1265.

Figures/Tables (11)

Figures 1-5 (images only, captions not available)

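Table 1  DQN-related hyperparameters

Name	Parameter	Value
Learning rate	$lr$	0.001
Maximum exploration rate	$\varepsilon_{\max}$	0.99
Minimum exploration rate	$\varepsilon_{\min}$	0.01
Annealing coefficient	$\tau$	0.001
Discount factor	$\gamma$	0.99
Learning updates per round	$C$	150
Batch size	$batch\_size$	256
Experience replay buffer size	$replay\_buffer$	30 000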
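As a rough illustration of how the Table 1 values might be wired into a DQN training loop, the sketch below collects them in a config object and shows an exploration schedule, the one-step DQN target, and a bounded replay buffer. Only the numeric values come from Table 1. The exponential form of the annealing schedule, the choice of step counter, the field names, and the helpers `epsilon_at` and `td_target` are assumptions, since the page does not give the formula the authors used.

```python
import math
import random
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class DQNHyperParams:
    lr: float = 0.001               # learning rate
    eps_max: float = 0.99           # maximum exploration rate
    eps_min: float = 0.01           # minimum exploration rate
    tau: float = 0.001              # annealing coefficient
    gamma: float = 0.99             # discount factor
    updates_per_round: int = 150    # learning updates per round (C)
    batch_size: int = 256
    replay_capacity: int = 30_000


def epsilon_at(step, hp):
    """ASSUMED schedule: exponential decay from eps_max toward eps_min."""
    return hp.eps_min + (hp.eps_max - hp.eps_min) * math.exp(-hp.tau * step)


def td_target(reward, next_q_max, done, hp):
    """Standard one-step DQN target: r + gamma * max_a' Q(s', a')."""
    return reward + (0.0 if done else hp.gamma * next_q_max)


hp = DQNHyperParams()

# A deque with maxlen drops the oldest transition once capacity is reached.
replay_buffer = deque(maxlen=hp.replay_capacity)
for t in range(1000):
    # Dummy (state, action, reward, next_state, done) tuples for illustration.
    replay_buffer.append((t, 0, 0.0, t + 1, False))

# Sample one mini-batch of transitions uniformly at random.
batch = random.sample(list(replay_buffer), hp.batch_size)
```

Under the assumed schedule, the exploration rate starts at 0.99 and approaches 0.01 as the step count grows. A different decay law (linear, or per-round rather than per-step) would fit the same three parameters equally well.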

Figures 6-7 (images only, captions not available)

Tables 2-4 (captions not available)


Height/m	Win rate/%
300	99
500	98
800	93
1000	90

Scale	Win rate/%
Radar: 1, Laser: 3, Microwave: 1, Jamming: 2, UAVs: 40	98
Radar: 1, Laser: 2, Microwave: 1, Jamming: 2, UAVs: 20	98
Radar: 1, Laser: 6, Microwave: 3, Jamming: 2, UAVs: 120	99


Speed/(m/s)	Win rate/%
30	99
45	99
50	96