Journal of System Simulation, 2025, Vol. 37, Issue (9): 2335-2351. doi: 10.16182/j.issn1004731x.joss.24-0333

• Papers

Control Strategy for UAV Cluster Formation Rendezvous Based on LDE-MADDPG Algorithm

Xiao Wei1, Gao Jiabo1,2, Ke Xueliang1   

  1. Joint Logistic Support Force Engineering University of PLA, Chongqing 401311, China
  2. PLA 95019 Troops
  • Received: 2024-04-02 Revised: 2024-05-27 Online: 2025-09-18 Published: 2025-09-22

Abstract:

To address the difficulty of UAV cluster formation rendezvous under the MADDPG algorithm, an autonomous collaborative control strategy based on the LDE-MADDPG algorithm is proposed. To overcome the weak generalization, poor scalability, and slow cluster training of MADDPG, the LDE-MADDPG algorithm is built by designing a state feature learning network and a decoupled Critic network. By integrating LDE-MADDPG with strategy generation elements such as the decoupled reward function, cluster state space, and UAV action space, a control strategy for UAV cluster formation rendezvous is developed that adapts to diverse formations and varying numbers of UAVs. Simulation experiments show that, compared with MADDPG, LDE-MADDPG improves the training process by 19.6%; the generated control strategy completes the assembly of six different formations, such as a diamond, within 60 seconds, and achieves formation assembly for clusters of 6 to 21 UAVs within 80 seconds, demonstrating good generalization and scalability.

Key words: LDE-MADDPG algorithm, state feature learning network, decoupled Critic network model, formation rendezvous

CLC Number: