Journal of System Simulation ›› 2026, Vol. 38 ›› Issue (2): 372-386. doi: 10.16182/j.issn1004731x.joss.25-0486

• Machine Learning Algorithms

Strike Strategy Planning Method of Unmanned Ground Vehicles Based on Improved PPO Algorithm

Wang Bingkun1, Wang Yue1, Yang Mei2, Zhang Pengnian1, Fan Bohao1, Tang Jie1   

  1. Northwest Institute of Mechanical & Electrical Engineering, Xianyang 712099, China
  2. College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
  • Received: 2025-05-28 Revised: 2025-07-29 Online: 2026-02-18 Published: 2026-02-11
  • Contact: Wang Yue

Abstract:

Predefined strike rules limit the hitting accuracy that unmanned ground vehicles (UGVs) can achieve, and continuous motion planning is difficult to couple and optimize together with discrete strike decision-making. To address both problems, an improved proximal policy optimization (PPO) algorithm based on a hybrid action space and a gated recurrent unit (GRU) is proposed. An environment model and a target model are built for the UGV strike mission, together with a three-layer UGV model that fuses kinematic constraints, situational awareness, and dynamic decision-making. Two distinct policy networks are employed: a continuous motion planning network for path planning, and a discrete strike decision-making network that handles the selection of strike location and target sequence. A GRU module is introduced to address the partial observability of the environment by inferring the current state from historical observations. Simulation results show that the method couples and jointly optimizes path planning and strike decision-making, improving the ability of UGVs to perform strike missions autonomously.
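The abstract does not give the network details, so the following is only a minimal sketch of how a hybrid action space with a recurrent state could be wired up, not the authors' implementation. All names (`GRUCell`, `HybridPolicy`, `ppo_clip_term`), the layer sizes, the fixed log standard deviation, and the choice to share one GRU between a Gaussian motion head and a categorical strike head are illustrative assumptions. The paper uses two distinct policy networks, which this sketch simplifies to two heads. It also assumes that the joint log-probability is the sum of the two heads' log-probabilities and that this joint value enters the standard PPO clipped ratio.

```python
import math
import random

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def init(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class GRUCell:
    """Textbook GRU cell (biases omitted); hypothetical stand-in for the paper's GRU module."""
    def __init__(self, n_in, n_hid, rng):
        self.Wz, self.Wr, self.Wh = (init(n_hid, n_in + n_hid, rng) for _ in range(3))

    def step(self, x, h):
        xh = x + h  # list concatenation of observation and previous hidden state
        z = [sigmoid(v) for v in matvec(self.Wz, xh)]  # update gate
        r = [sigmoid(v) for v in matvec(self.Wr, xh)]  # reset gate
        gated = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(v) for v in matvec(self.Wh, x + gated)]  # candidate state
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def gauss_logpdf(a, mu, log_std):
    std = math.exp(log_std)
    return -0.5 * ((a - mu) / std) ** 2 - log_std - 0.5 * math.log(2 * math.pi)

class HybridPolicy:
    """Assumed layout: one GRU feeding a continuous head and a discrete head."""
    def __init__(self, n_obs, n_hid, n_move, n_strike, seed=0):
        self.rng = random.Random(seed)
        self.gru = GRUCell(n_obs, n_hid, self.rng)
        self.W_move = init(n_move, n_hid, self.rng)      # Gaussian mean of motion command
        self.W_strike = init(n_strike, n_hid, self.rng)  # logits over strike choices
        self.log_std = [-0.5] * n_move                   # fixed here; usually learned

    def act(self, obs, h):
        h = self.gru.step(obs, h)  # fold the new observation into the history summary
        mean = matvec(self.W_move, h)
        logits = matvec(self.W_strike, h)
        top = max(logits)
        exps = [math.exp(l - top) for l in logits]  # numerically stable softmax
        total = sum(exps)
        probs = [e / total for e in exps]
        move = [self.rng.gauss(mu, math.exp(ls)) for mu, ls in zip(mean, self.log_std)]
        strike = self.rng.choices(range(len(probs)), weights=probs)[0]
        # Assumption: the two action parts are independent given h, so log-probs add.
        logp = math.log(probs[strike]) + sum(
            gauss_logpdf(a, mu, ls) for a, mu, ls in zip(move, mean, self.log_std))
        return move, strike, logp, h

def ppo_clip_term(logp_new, logp_old, advantage, eps=0.2):
    """Standard PPO clipped surrogate for one sample, using the joint log-prob."""
    ratio = math.exp(logp_new - logp_old)
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)

if __name__ == "__main__":
    policy = HybridPolicy(n_obs=4, n_hid=8, n_move=2, n_strike=3)
    h = [0.0] * 8
    for obs in ([0.1, 0.2, 0.3, 0.4], [0.0, 0.1, 0.0, 0.5]):
        move, strike, logp, h = policy.act(obs, h)
        print(move, strike, logp)
```

Carrying the hidden state `h` from step to step is what lets the policy condition on past observations rather than only the current one. Summing the log-probabilities keeps a single importance ratio for both action types, which is one common way to train hybrid actions with PPO; separate ratios per head would be an equally plausible reading of the abstract.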

Key words: DRL, unmanned ground vehicle, path planning, strike decision-making, PPO

CLC Number: