Journal of System Simulation ›› 2023, Vol. 35 ›› Issue (4): 671-694. doi: 10.16182/j.issn1004731x.joss.22-0555

• Review •

Research Progress of Opponent Modeling Based on Deep Reinforcement Learning

Haotian Xu, Long Qin, Junjie Zeng, Yue Hu, Qi Zhang

  1. College of Systems Engineering, National University of Defense Technology, Changsha 410073, Hunan, China
  • Received: 2022-05-25 Revised: 2022-06-26 Online: 2023-04-29 Published: 2023-04-12
  • Corresponding author: Qi Zhang E-mail: xuhaotian@nudt.edu.cn; zhangqiy123@nudt.edu.cn
  • About the author: Haotian Xu (born 1998), male, master's student; research interests include system simulation and multi-agent systems. E-mail: xuhaotian@nudt.edu.cn
  • Funding:
    National Natural Science Foundation of China (61273300); National Social Science Fund of China, Military Science (2020-SKJJ-C-102); Natural Science Foundation of Hunan Province (2021JJ40697)

Abstract:

Deep reinforcement learning is an agent modeling approach that combines the feature extraction capability of deep learning with the sequential decision-making capability of reinforcement learning. It can remedy several shortcomings of traditional opponent modeling methods: poor adaptation to non-stationarity, complex feature selection, and insufficient state-space representation capacity. This review divides deep reinforcement learning-based opponent modeling methods into two categories, explicit modeling and implicit modeling, and surveys the corresponding theories, models, algorithms, and applicable scenarios for each category. It then introduces applications of these techniques in different fields and summarizes the key problems that urgently need to be solved, along with directions for future development, with the aim of providing a fairly comprehensive review of deep reinforcement learning-based opponent modeling methods.

Key words: deep reinforcement learning, opponent modeling, game theory, theory of mind, representation learning, meta-learning

CLC number: