Journal of System Simulation ›› 2023, Vol. 35 ›› Issue (11): 2410-2418. doi: 10.16182/j.issn1004731x.joss.22-0632


  • First author: Jia Zhengxuan (1990-), male, engineer, M.S.; research interests include deep reinforcement learning, modeling and simulation, and intelligent decision-making. E-mail: danny2006_2007@126.com

Imitative Generation of Optimal Guidance Law Based on Reinforcement Learning

Jia Zhengxuan 1, Lin Tingyu 1, Xiao Yingying 1, Shi Guoqiang 1, Wang Hao 2, Zeng Bi 2, Ou Yiming 1, Zhao Pengpeng 1

  1. Beijing Simulation Center, Beijing 100854, China
  2. Beijing Institute of Electronic System Engineering, Beijing 100854, China
  • Received: 2022-06-09 Revised: 2022-08-03 Online: 2023-11-25 Published: 2023-11-23


Abstract:

Against the background of high-speed maneuvering target interception, an optimal guidance law generation method for head-on interception that does not depend on target acceleration estimation is proposed based on deep reinforcement learning, and its effectiveness is verified through simulation experiments. The simulation results show that the proposed method achieves head-on interception of high-speed maneuvering targets in 3D space and greatly relaxes the requirement for target estimates that carry strong uncertainty, making it more widely applicable than the optimal control method.
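The core idea, imitatively generating guidance commands from expert examples so that online use needs no target acceleration estimate, can be sketched as plain behavioral cloning. The sketch below is illustrative only: a proportional-navigation (PN) teacher and a linear-in-features regressor stand in for the paper's optimal head-on law and deep network, neither of which is specified in the abstract; the state ranges and navigation constant are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Randomized engagement states (assumed ranges, for illustration only).
vc = rng.uniform(500.0, 2000.0, 5000)    # closing speed Vc, m/s
lam = rng.uniform(-0.1, 0.1, 5000)       # line-of-sight rate, rad/s

def pn_teacher(vc, lam, N=4.0):
    """Stand-in 'expert': PN lateral acceleration a = N * Vc * LOS-rate."""
    return N * vc * lam

a_cmd = pn_teacher(vc, lam)              # expert commands to imitate

# Behavioral cloning reduces to supervised regression onto expert commands.
# The feature map includes the product term the PN teacher actually uses.
X = np.column_stack([vc, lam, vc * lam, np.ones_like(vc)])
w, *_ = np.linalg.lstsq(X, a_cmd, rcond=None)

mse = float(((X @ w - a_cmd) ** 2).mean())
print(f"imitation MSE: {mse:.3e}")       # near zero: learner matches teacher
```

Once fitted, the learned map from observable state (closing speed, LOS rate) to commanded acceleration is evaluated online with no estimate of the target's own acceleration, which is the property the abstract emphasizes.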

Key words: reinforcement learning, optimal guidance, imitation learning, head-on interception, guidance and control
