1 |
Lowe R, Wu Y I, Tamar A, et al. Multi-agent Actor-critic for Mixed Cooperative-Competitive Environments[C]//Advances in Neural Information Processing Systems. San Francisco: Morgan Kaufmann, 2017.
|
2 |
Lillicrap T P, Hunt J J, Pritzel A, et al. Continuous Control with Deep Reinforcement Learning[C/OL]. International Conference on Learning Representations, 2016. [2022-06-11].
|
3 |
Rashid T, Samvelyan M, Schroeder C, et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-agent Reinforcement Learning[C]//International Conference on Machine Learning. New York: PMLR, 2018: 4295-4304.
|
4 |
Watkins C J C H. Learning from Delayed Rewards[D]. London: King's College, 1989.
|
5 |
Rummery G A, Niranjan M. On-line Q-learning Using Connectionist Systems[M]. Cambridge, England: University of Cambridge, Department of Engineering, 1994.
|
6 |
Sutton R S, McAllester D A, Singh S P, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation[C]//Advances in Neural Information Processing Systems. San Francisco: Morgan Kaufmann, 2000: 1057-1063.
|
7 |
Mnih V, Kavukcuoglu K, Silver D, et al. Playing Atari with Deep Reinforcement Learning[J/OL]. [2022-06-11].
|
8 |
Barto A G, Sutton R S, Anderson C W. Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems[J]. IEEE Transactions on Systems, Man, and Cybernetics (S0018-9472), 1983, 13(5): 834-846.
|
9 |
Hernandez-Leal P, Kartal B, Taylor M E. Is Multiagent Deep Reinforcement Learning the Answer or the Question? A Brief Survey[J/OL]. [2022-06-11].
|
10 |
Tampuu A, Matiisen T, Kodelja D, et al. Multiagent Cooperation and Competition with Deep Reinforcement Learning[J]. PLoS One (S1932-6203), 2017, 12(4): e0172395.
|
11 |
Gupta J K, Egorov M, Kochenderfer M. Cooperative Multi-agent Control Using Deep Reinforcement Learning[C]//International Conference on Autonomous Agents and Multiagent Systems. Cham: Springer, 2017: 66-83.
|
12 |
Foerster J N, Assael Y M, De Freitas N, et al. Learning to Communicate with Deep Multi-agent Reinforcement Learning[J/OL]. [2022-06-11].
|
13 |
Sukhbaatar S, Fergus R. Learning Multi-agent Communication with Backpropagation[J]. Advances in Neural Information Processing Systems (S1049-5258), 2016, 29: 2244-2252.
|
14 |
Sunehag P, Lever G, Gruslys A, et al. Value-decomposition Networks for Cooperative Multi-agent Learning[J/OL]. [2022-06-11].
|
15 |
Foerster J, Nardelli N, Farquhar G, et al. Stabilising Experience Replay for Deep Multi-agent Reinforcement Learning[C]//International Conference on Machine Learning. New York: PMLR, 2017: 1146-1155.
|
16 |
符小卫, 王辉, 徐哲. 基于DE-MADDPG的多无人机协同追捕策略研究[J]. 航空学报, 2022, 43(5): 325311.
|
|
Fu Xiaowei, Wang Hui, Xu Zhe. Cooperative Pursuit Strategy for Multi-UAVs Based on DE-MADDPG Algorithm[J]. Acta Aeronautica et Astronautica Sinica, 2022, 43(5): 325311.
|
17 |
Schaul T, Quan J, Antonoglou I, et al. Prioritized Experience Replay[J/OL]. [2022-06-11].
|
18 |
Vaswani A, Shazeer N, Parmar N, et al. Attention is All You Need[C]//Advances in Neural Information Processing Systems. San Francisco: Morgan Kaufmann, 2017: 5998-6008.
|
19 |
Hu J, Shen L, Sun G. Squeeze-and-Excitation Networks[C]//IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141.
|
20 |
Iqbal S, Sha F. Actor-Attention-Critic for Multi-agent Reinforcement Learning[C]//International Conference on Machine Learning. New York: PMLR, 2019: 2961-2970.
|
21 |
Oh J, Chockalingam V, Lee H. Control of Memory, Active Perception, and Action in Minecraft[C]//International Conference on Machine Learning. New York: PMLR, 2016: 2790-2799.
|