Multi-agent Reinforcement Learning Method for Wargame Simulation Based on Suboptimal Demonstration Guidance
Zhou Zicong (周子聪), Zeng Junjie (曾俊杰), Hu Yue (胡越), Zhu Zhengqiu (朱正秋), Yin Quanjun (尹全军)
Journal of System Simulation (系统仿真学报). 2026, (5): 1277-1289. DOI: 10.16182/j.issn1004731x.joss.25-0743