Tactical Maneuver Strategy Learning from Land Wargame Replay Based on Convolutional Neural Network

Journal of System Simulation, 2022, 34(10): 2181-2193. doi: 10.16182/j.issn1004731x.joss.21-0429

Jiale Xu¹,², Haidong Zhang²,³, Donghai Zhao⁴, Wancheng Ni²,³

1. School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
2. Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
3. Innovation Academy for Artificial Intelligence, Chinese Academy of Sciences, Beijing 100190, China
4. Joint Operations College, National Defense University, Shijiazhuang 050000, China

Received: 2021-05-13 Revised: 2021-07-20 Online: 2022-10-30 Published: 2022-10-18
Corresponding author: Wancheng Ni E-mail: xujiale2020@ia.ac.cn; wancheng.ni@ia.ac.cn
About the first author: Jiale Xu (1999-), female, master's student; research interest: theory and methods of artificial intelligence. E-mail: xujiale2020@ia.ac.cn
Supported by:
National Natural Science Foundation of China (61906197); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA27000000)

Abstract:

To extract high-value knowledge about players' tactical experience from the replay data of "man-in-the-loop" wargames, this paper proposes a method that uses a deep neural network to learn a tactical maneuver strategy model from replay data. The tactical maneuver strategy is modeled as a classification problem: given the features of the current situation, choose the best of the candidate target positions. The key cognitive factors that affect players' decisions are summarized; basic situation features made up of seven attributes, including maneuver range and observation range, are defined; and a situation feature dataset with positive and negative sample labels is built. A classifier based on a convolutional neural network is designed, and its classification probability is used to predict the terminal position of a single piece's tactical maneuver. Experimental results show that the model reaches a prediction accuracy of 78.96%, at least 4.59 percentage points higher than the other models compared.

Key words: wargame, replay data, tactical maneuver strategy, situation feature, convolutional neural network

CLC number: TP391.9

Jiale Xu, Haidong Zhang, Donghai Zhao, et al. Tactical Maneuver Strategy Learning from Land Wargame Replay Based on Convolutional Neural Network[J]. Journal of System Simulation, 2022, 34(10): 2181-2193.
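
The abstract frames maneuver prediction as scoring each candidate target position and taking the most probable one. A minimal sketch of that selection step follows. It assumes the classifier is available as a function returning a positive-class probability. The names `choose_terminal` and `toy_scorer`, the candidate labels, and the feature values are all hypothetical, and the stand-in scorer is not the paper's CNN.

```python
from typing import Callable, Dict, Sequence, Tuple


def choose_terminal(
    candidates: Dict[str, Sequence[float]],
    positive_probability: Callable[[Sequence[float]], float],
) -> Tuple[str, float]:
    """Return the candidate position with the highest positive-class probability."""
    if not candidates:
        raise ValueError("at least one candidate position is required")
    # Score every candidate with the classifier, then keep the best one.
    scored = {pos: positive_probability(feat) for pos, feat in candidates.items()}
    best = max(scored, key=scored.__getitem__)
    return best, scored[best]


def toy_scorer(features: Sequence[float]) -> float:
    """Stand-in for a trained classifier (assumption): the mean of the features."""
    return sum(features) / len(features)


# Hypothetical candidates: keys are opaque position labels, values are made-up features.
candidates = {
    "A": [0.9, 0.8, 0.7],
    "B": [0.2, 0.4, 0.3],
    "C": [0.5, 0.5, 0.5],
}
best_position, best_probability = choose_terminal(candidates, toy_scorer)
print(best_position, round(best_probability, 2))
```

Passing the scorer as a function keeps the selection logic independent of the model, so a trained network could replace `toy_scorer` without touching `choose_terminal`.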

Side of piece	RED	RED
Round start position	90 016	140 042
Round end position	100 026	150 043
Sample label	1	1
Map ID	83	86
Primary control point position	80 048	150 049
Secondary control point position	100 049	130 046
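
One way to hold a row of the sample table in code is a small immutable record. The sketch below is illustrative only: the class name `ManeuverSample` and its field names are hypothetical, positions are kept as the opaque strings shown in the table, and encoding negative samples as 0 is an assumption, since the table shows only label 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ManeuverSample:
    """One labeled maneuver sample; field names are illustrative, not the dataset schema."""

    side: str
    round_start: str
    round_end: str
    label: int
    map_id: int
    primary_control_point: str
    secondary_control_point: str

    def __post_init__(self) -> None:
        # Assumption: positive samples are 1 (as in the table), negative samples are 0.
        if self.label not in (0, 1):
            raise ValueError(f"label must be 0 or 1, got {self.label}")


# The two samples shown in the table above, copied column by column.
sample_a = ManeuverSample("RED", "90 016", "100 026", 1, 83, "80 048", "100 049")
sample_b = ManeuverSample("RED", "140 042", "150 043", 1, 86, "150 049", "130 046")
print(sample_a.map_id, sample_b.map_id)
```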

Map ID	Number of samples
83	10 024
86	6 670
82	3 920
19	3 414
84	52

Model	Accuracy/%
LR	67.48
SVM	69.08
ResNet	72.98
PCA+SVM	73.15
KNN	74.29
VGG	74.37
CNN	78.96
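
The margin quoted in the abstract can be checked directly against the model comparison table. The short calculation below ranks the listed accuracies and takes the difference between the best and second-best model. The variable names are illustrative.

```python
# Accuracies (%) copied from the model comparison table.
accuracy = {
    "LR": 67.48,
    "SVM": 69.08,
    "ResNet": 72.98,
    "PCA+SVM": 73.15,
    "KNN": 74.29,
    "VGG": 74.37,
    "CNN": 78.96,
}

# Rank models from highest to lowest accuracy.
ranked = sorted(accuracy.items(), key=lambda item: item[1], reverse=True)
(best_model, best_acc), (runner_up, runner_acc) = ranked[0], ranked[1]

# Margin in percentage points, rounded to the table's two decimals.
margin = round(best_acc - runner_acc, 2)
print(best_model, runner_up, margin)
```

The smallest gap is to VGG, which is why the abstract states the improvement as "at least" 4.59: every other baseline trails the CNN by more.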

Number of convolutional layers	Test accuracy/%
1	78.21
2	78.61
3	78.75
4	78.53
5	77.72

Number of FC1 neurons	Number of FC2 neurons	Test accuracy/%
1 024	256	78.24
2 048	256	78.57
4 096	256	78.90
8 192	256	78.57
4 096	512	78.88
4 096	1 024	78.76
4 096	2 048	78.75
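
The two sweeps above can be summarized by picking the best-scoring setting in each and measuring how much accuracy varies across the sweep. The sketch below only reads the tabulated values. It makes no claim about how the authors chose their final architecture, and the variable names are illustrative.

```python
# Test accuracy (%) by number of convolutional layers, copied from the table.
conv_sweep = {1: 78.21, 2: 78.61, 3: 78.75, 4: 78.53, 5: 77.72}

# Test accuracy (%) by (FC1 neurons, FC2 neurons), copied from the table.
fc_sweep = {
    (1024, 256): 78.24,
    (2048, 256): 78.57,
    (4096, 256): 78.90,
    (8192, 256): 78.57,
    (4096, 512): 78.88,
    (4096, 1024): 78.76,
    (4096, 2048): 78.75,
}

# Best setting in each sweep.
best_conv = max(conv_sweep, key=conv_sweep.get)
best_fc = max(fc_sweep, key=fc_sweep.get)

# Spread between the best and worst setting, in percentage points.
conv_spread = round(max(conv_sweep.values()) - min(conv_sweep.values()), 2)
fc_spread = round(max(fc_sweep.values()) - min(fc_sweep.values()), 2)
print(best_conv, best_fc, conv_spread, fc_spread)
```

Both spreads are close to one percentage point or less, so within the ranges tested the accuracy is fairly insensitive to these two hyperparameters.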

Sharing scheme	Fusion by addition	Fusion by subtraction	Fusion by element-wise maximum
Scheme I	78.96	78.90	78.88
Scheme II	78.59	78.03	77.68
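
The three fusion operations named in the table are all element-wise combinations of two equally sized feature arrays. The sketch below shows them on plain lists. Which feature maps the paper fuses, and at which layer, is not stated here, so the inputs and the helper names (`fuse_add`, `fuse_subtract`, `fuse_max`) are hypothetical.

```python
from typing import List, Sequence


def _check_same_length(a: Sequence[float], b: Sequence[float]) -> None:
    # Element-wise fusion is only defined for inputs of equal length.
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")


def fuse_add(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Fusion by addition: element-wise sum."""
    _check_same_length(a, b)
    return [x + y for x, y in zip(a, b)]


def fuse_subtract(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Fusion by subtraction: element-wise difference a - b."""
    _check_same_length(a, b)
    return [x - y for x, y in zip(a, b)]


def fuse_max(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Fusion by maximum: element-wise larger value."""
    _check_same_length(a, b)
    return [max(x, y) for x, y in zip(a, b)]


# Made-up feature vectors, for illustration only.
features_a = [1, 4, 2]
features_b = [3, 1, 2]
print(fuse_add(features_a, features_b))
print(fuse_subtract(features_a, features_b))
print(fuse_max(features_a, features_b))
```

Addition and maximum are symmetric in their inputs, while subtraction depends on argument order.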

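Training set (map IDs)	Test set (map IDs)	Cross-map training samples	Cross-map model accuracy/%	Full-domain training samples	Full-domain model accuracy/%
83, 86, 82, 19	84	24 028	71.15	24 064	81.25
83, 86, 82	19, 84	20 614	71.61	23 386	77.66
83, 86, 19	82, 84	20 108	69.51	23 285	73.46
83, 82, 19, 84	86	17 410	66.69	22 746	75.56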
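
The cross-map table can be cross-checked against the per-map sample counts, and the cost of testing on unseen maps can be read off as an accuracy gap. The sketch below does both. It takes the final column to be the accuracy of the full-domain model, and all variable names are illustrative.

```python
# Samples per map ID, copied from the sample count table.
samples_per_map = {83: 10024, 86: 6670, 82: 3920, 19: 3414, 84: 52}

# (training maps, test maps, cross-map training samples, cross-map accuracy %,
#  full-domain training samples, full-domain accuracy %), one tuple per table row.
rows = [
    ((83, 86, 82, 19), (84,), 24028, 71.15, 24064, 81.25),
    ((83, 86, 82), (19, 84), 20614, 71.61, 23386, 77.66),
    ((83, 86, 19), (82, 84), 20108, 69.51, 23285, 73.46),
    ((83, 82, 19, 84), (86,), 17410, 66.69, 22746, 75.56),
]

# Check 1: each cross-map training count equals the sum over its training maps.
count_matches = [
    cross_n == sum(samples_per_map[m] for m in train_maps)
    for train_maps, _, cross_n, _, _, _ in rows
]

# Check 2: accuracy lost when the test maps are unseen, in percentage points.
gaps = [round(full_acc - cross_acc, 2) for _, _, _, cross_acc, _, full_acc in rows]

total_samples = sum(samples_per_map.values())
print(count_matches, gaps, total_samples)
```

All four cross-map counts agree with the per-map totals, and the full-domain model is ahead in every row, by roughly 4 to 10 percentage points.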
