[1] Zhong Hua. Close to Actual Combat Military Training of Foreign Troops[J]. National Defense Science & Technology, 2014, 35(4): 104-106. (in Chinese)
[2] Kou Yingxin, Li Zhanwu, Li Junbing, et al. Modern Fighter Combat Mission Management and Decision-making[M]. Beijing: National Defense Industry Press, 2017. (in Chinese)
[3] Poli R, Kennedy J, Blackwell T. Particle Swarm Optimization: An Overview[J]. Swarm Intelligence (S1935-3820), 2007(1): 33-57.
[4] Mnih V, Kavukcuoglu K, Silver D, et al. Human-level Control Through Deep Reinforcement Learning[J]. Nature (S1476-4687), 2015, 518(7540): 529-533.
[5] Silver D, Huang A, Maddison C J, et al. Mastering the Game of Go with Deep Neural Networks and Tree Search[J]. Nature (S1476-4687), 2016, 529(7587): 484-489.
[6] Vinyals O, Babuschkin I, Czarnecki W M, et al. Grandmaster Level in StarCraft II Using Multi-agent Reinforcement Learning[J]. Nature (S1476-4687), 2019, 575(7782): 350-354.
[7] Silver D, Lever G, Heess N, et al. Deterministic Policy Gradient Algorithms[C]// International Conference on Machine Learning. PMLR, 2014: 387-395.
[8] Kingma D P, Ba J. Adam: A Method for Stochastic Optimization[J]. arXiv preprint arXiv:1412.6980, 2014.
[9] Schulman J, Levine S, Abbeel P, et al. Trust Region Policy Optimization[C]// International Conference on Machine Learning. PMLR, 2015: 1889-1897.
[10] Lillicrap T P, Hunt J J, Pritzel A, et al. Continuous Control with Deep Reinforcement Learning[J]. arXiv preprint arXiv:1509.02971, 2015.
[11] Tesauro G. Temporal Difference Learning and TD-Gammon[J]. Communications of the ACM (S0001-0782), 1995, 38(3): 58-68.
[12] Bellemare M G, Dabney W, Munos R. A Distributional Perspective on Reinforcement Learning[C]// International Conference on Machine Learning. PMLR, 2017: 449-458.
[13] Barth-Maron G, Hoffman M W, Budden D, et al. Distributed Distributional Deterministic Policy Gradients[J]. arXiv preprint arXiv:1804.08617, 2018.
[14] Sergeev A, Del Balso M. Horovod: Fast and Easy Distributed Deep Learning in TensorFlow[J]. arXiv preprint arXiv:1802.05799, 2018.