Journal of System Simulation ›› 2021, Vol. 33 ›› Issue (8): 1766-1774. doi: 10.16182/j.issn1004731x.joss.21-0432

Self-learning-based Multiple Spacecraft Evasion Decision Making Simulation Under Sparse Reward Condition

Zhao Yu, Guo Jifeng, Yan Peng, Bai Chengchao   

  1. School of Astronautics, Harbin Institute of Technology, Harbin 150001, China
  • Received: 2021-05-14 Revised: 2021-06-05 Online: 2021-08-18 Published: 2021-08-19

Abstract: To improve the ability of a spacecraft formation to evade multiple interceptors, and to address the low success rate of traditional procedural evasion maneuvers, a multi-agent cooperative autonomous decision-making algorithm based on deep reinforcement learning is proposed. A multi-agent reinforcement learning algorithm is designed on the actor-critic architecture, in which a weighted linear fitting method is proposed to solve the credit assignment (reliability allocation) problem of the self-learning system. To address the sparse reward problem in the task scenario, a sparse-reward reinforcement learning method based on the inverse value method is proposed. A space multi-agent countermeasure simulation system is established for the task scenario, and the correctness and effectiveness of the proposed algorithm are verified.

Key words: multi-agent, reinforcement learning, sparse reward, evasion maneuver, autonomous decision making
