Journal of System Simulation ›› 2021, Vol. 33 ›› Issue (12): 2838-2845. doi: 10.16182/j.issn1004731x.joss.20-FZ0532

• Simulation Modeling Theory and Methods •

A Data-Driven Modeling Method for Game Adversity Agent

Zeng Bi1,2, Fang Xiao1, Kong Deshuai3, Song Xiangxiang1, Jia Zhengxuan1,2, Lin Tingyu1,2

  1. Beijing Simulation Center, Beijing 100854, China;
  2. Beijing Institute of Electronic System Engineering, Beijing 100854, China;
  3. China Aerospace Science and Industry Corporation Limited, Beijing 100048, China
  • Received: 2020-04-01 Revised: 2021-06-08 Online: 2021-12-18 Published: 2022-01-13
  • About the author: Zeng Bi (1994-), male, master's degree, engineer; research interest: deep reinforcement learning. E-mail: duanting18@nudt.edu.cn
  • Supported by:
    National Defense Basic Scientific Research Program (JCKY2018204C004)

Abstract: To address collaborative modeling of formation behavior and intelligent generation of decisions in complex confrontation scenarios, this paper proposes a data-driven method for modeling adversarial game agents. The method builds on a serious-game simulation of complex maritime equipment in confrontation with aerial targets. It combines distributed training over parallel confrontation scenarios with opportunistic decision modeling based on smart targets, and draws on capability models of the aerial targets and the complex maritime equipment, to achieve human-machine hybrid-augmented agent modeling. The work provides support for further research on multi-target collaborative modeling in complex confrontation scenarios. Experimental results show that deep reinforcement learning algorithms can provide a basis for modeling agents' dexterous strategies.

Key words: deep reinforcement learning, data-driven, distributed training, opportunistic decision making

CLC number: