Journal of System Simulation ›› 2026, Vol. 38 ›› Issue (2): 433-446.doi: 10.16182/j.issn1004731x.joss.25-0621

• Wargaming and Simulation-Based Evaluation •

Intelligent Decision-making Method in Imbalanced Air Combat Based on Asymmetric Self-play

Zheng Wei1, Tang Jiahao1, Xiong Xiaoping2, Fan Xin1   

  1. School of Software Engineering, Nanchang Hangkong University, Nanchang 330063, China
    2.Civil Aviation Administration of China Jiangxi Aircraft Airworthiness Certification Center, Nanchang 330038, China
  • Received:2025-06-30 Revised:2025-09-19 Online:2026-02-18 Published:2026-02-11
  • Contact: Tang Jiahao

Abstract:

To address the strategy-convergence problem caused by role homogenization in traditional self-play for imbalanced air combat, an intelligent decision-making method based on asymmetric self-play was proposed. The method decoupled tactics from control through a hierarchical reinforcement learning framework and designed differentiated reward functions for the advantaged and disadvantaged sides. Bidirectional independent policy pools were constructed to promote the co-evolution of strategies, and the proximal policy optimization (PPO) algorithm was used to train the model. Experiments in 1v1 weapon-imbalanced and 2v1 numerically imbalanced scenarios show that, compared with symmetric self-play, the proposed method increases the kill rate of the advantaged side by up to 12% and the survival rate of the disadvantaged side by up to 40%, while overall effectiveness in multi-agent combat is also significantly improved. These results verify that the asymmetric design enhances the specialized combat capability and tactical diversity of intelligent agents in imbalanced air combat.
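The training loop described above can be illustrated with a minimal sketch. This is not the paper's implementation: the `Policy` class, `train_step`, and the snapshot schedule are hypothetical placeholders standing in for PPO-trained hierarchical policies. The sketch only shows the structural idea of bidirectional independent policy pools, where each side trains against opponents sampled from the other side's pool of frozen snapshots.

```python
import random

class Policy:
    """Toy stand-in for a PPO-trained hierarchical policy network."""
    def __init__(self, role, version=0):
        self.role = role          # "advantaged" or "disadvantaged"
        self.version = version

def train_step(policy, opponent):
    """Placeholder for one PPO update against a sampled opponent.

    In the real method each side would optimize its own differentiated
    reward function here; this stub just bumps a version counter.
    """
    return Policy(policy.role, policy.version + 1)

def asymmetric_self_play(iterations=5, snapshot_every=2, seed=0):
    """Asymmetric self-play with bidirectional independent policy pools.

    Each side keeps its own pool of frozen snapshots and trains against
    opponents drawn from the *other* side's pool, so the two roles
    co-evolve without being homogenized into one shared policy.
    """
    random.seed(seed)
    adv, dis = Policy("advantaged"), Policy("disadvantaged")
    adv_pool, dis_pool = [adv], [dis]
    for it in range(1, iterations + 1):
        adv = train_step(adv, random.choice(dis_pool))  # advantaged vs. dis pool
        dis = train_step(dis, random.choice(adv_pool))  # disadvantaged vs. adv pool
        if it % snapshot_every == 0:                    # freeze periodic snapshots
            adv_pool.append(adv)
            dis_pool.append(dis)
    return adv_pool, dis_pool

adv_pool, dis_pool = asymmetric_self_play()
```

Keeping the two pools independent, rather than a single shared pool as in symmetric self-play, is what lets each role specialize against the opposing role's historical strategies.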

Key words: imbalanced air combat, asymmetric self-play, bidirectional policy pool, hierarchical reinforcement learning, proximal policy optimization

CLC Number: