[1] 于大腾. 空间飞行器安全防护规避机动方法研究[D]. 长沙: 国防科技大学, 2017. Yu Dateng. Approaches for the Spacecraft Security Defense and Evasion Maneuver Method[D]. Changsha: National University of Defense Technology, 2017.
[2] 司玉洁, 熊华, 李喆. 拦截机动目标的三维自适应神经网络制导律[J]. 系统仿真学报, 2021, 33(2): 453-460. Si Yujie, Xiong Hua, Li Zhe. Three-dimensional Adaptive Neural Network Guidance Law against Maneuvering Targets[J]. Journal of System Simulation, 2021, 33(2): 453-460.
[3] Shinar J, Steinberg D. Analysis of Optimal Evasive Maneuvers Based on a Linearized Two Dimensional Kinematic Model[J]. Journal of Aircraft (S0021-8669), 1977, 14(8): 546-554.
[4] 汪民乐. 战略导弹突防仿真模型[J]. 系统工程与电子技术, 1996, 18(10): 53-58. Wang Minle. Simulating Model of Strategic Missile Penetration[J]. Systems Engineering and Electronics, 1996, 18(10): 53-58.
[5] 张润德, 蔡伟伟, 杨乐平. 基于微分平坦的航天器避障轨迹快速规划[J]. 飞行力学, 2020, 38(4): 65-70. Zhang Runde, Cai Weiwei, Yang Leping. Differential Flatness Based Rapid Trajectory Planning for Spacecraft Obstacle Avoidance[J]. Flight Dynamics, 2020, 38(4): 65-70.
[6] 李翠兰, 欧阳琦, 陈明, 等. 大型低轨航天器与星座卫星的碰撞风险研究[J]. 宇航学报, 2020, 41(9): 1158-1165. Li Cuilan, Ouyang Qi, Chen Ming, et al. Analysis of Collision Risk Between Constellation Satellites and Large Low-Orbit Spacecraft[J]. Journal of Astronautics, 2020, 41(9): 1158-1165.
[7] Gupta J K, Egorov M, Kochenderfer M. Cooperative Multi-agent Control Using Deep Reinforcement Learning[C]// AAMAS 2017. Lecture Notes in Computer Science. São Paulo: Springer, Cham, 2017.
[8] Yu C, Velu A, Vinitsky E, et al. The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games[J/OL]. ArXiv preprint, (2021-03-02) [2021-03-30]. https://arxiv.org/abs/2103.01955.
[9] 周建频, 张姝柳. 基于深度强化学习的动态库存路径优化[J]. 系统仿真学报, 2019, 31(10): 2155-2163. Zhou Jianpin, Zhang Shuliu. Dynamic Inventory Routing Optimization Based on Deep Reinforcement Learning[J]. Journal of System Simulation, 2019, 31(10): 2155-2163.
[10] 杨惟轶, 白辰甲, 蔡超, 等. 深度强化学习中稀疏奖励问题研究综述[J]. 计算机科学, 2020, 47(3): 182-191. Yang Weiyi, Bai Chenjia, Cai Chao, et al. Survey on Sparse Reward in Deep Reinforcement Learning[J]. Computer Science, 2020, 47(3): 182-191.
[11] 杨瑞, 严江鹏, 李秀. 强化学习稀疏奖励算法研究——理论与实验[J]. 智能系统学报, 2020, 15(5): 888-899. Yang Rui, Yan Jiangpeng, Li Xiu. Survey of Sparse Reward Algorithms in Reinforcement Learning: Theory and Experiment[J]. CAAI Transactions on Intelligent Systems, 2020, 15(5): 888-899.
[12] Feinberg V, Wan A, Stoica I, et al. Model-based Value Estimation for Efficient Model-free Reinforcement Learning[J/OL]. ArXiv preprint, (2018-02-28) [2021-04-02]. https://arxiv.org/abs/1803.00101.
[13] Leal M A, Baker T L, Pflibsen K P. Multiple Kill Vehicle Interceptor with Autonomous Kill Vehicles: US, US7494090B2[P]. 2009-02-24.