Journal of System Simulation ›› 2025, Vol. 37 ›› Issue (11): 2888-2903. doi: 10.16182/j.issn1004731x.joss.24-0622

• Papers

A USV Path Planning Algorithm under Special Environment Based on TD3-RRT

Chen Jitong1, Zhou Jiajia1, Wu Di2, Jiang Hailong3   

  1. College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150000, China
  2. Qingdao Innovation and Development Center, Harbin Engineering University, Qingdao 266000, China
  3. The 29th Institute of China Electronics Technology Group Corporation, Chengdu 610000, China
  • Received: 2024-06-11 Revised: 2024-08-05 Online: 2025-11-18 Published: 2025-11-27
  • Contact: Zhou Jiajia

Abstract:

When planning paths for unmanned surface vehicles (USVs) in special environments, such as those with multiple obstacles, large obstacles, or narrow passages, the rapidly-exploring random tree (RRT) algorithm suffers from a large sampling base, a low success rate, and zigzagging planned paths. To address these problems, a global path planning algorithm, TD3-RRT, was proposed based on the twin delayed deep deterministic policy gradient (TD3). A USV path search model was established by combining the RRT algorithm with deep reinforcement learning. Forward-looking detection was used to sense the environment and adaptively adjust the step size. The path search direction was output by the policy network, which addresses the blind expansion of the RRT algorithm. An improved hindsight experience replay strategy was proposed, which enhanced the path search capability in complex environments by re-selecting virtual targets and sampling from two experience replay pools. A reward function was designed to improve the quality of the planned path and accelerate the path search. Experimental results show that, across different environments and compared with current mainstream algorithms, TD3-RRT effectively improves the path planning success rate and optimizes the redundant steering angle, path length, and path planning time. This indicates that the improved algorithm speeds up the path search and improves path quality. Furthermore, it adapts well to different environments.
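
The expansion step described in the abstract (a direction supplied by a policy, with the step size adapted from forward-looking detection) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: obstacles are modeled as circles, the forward-looking detection is a simple ray march, the step rule (half of the free distance, capped at `max_step`) is hypothetical, and `greedy_policy` is only a stand-in for a trained TD3 actor network.

```python
import math
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]
Circle = Tuple[float, float, float]  # (cx, cy, radius): hypothetical obstacle model


def free_distance(p: Point, heading: float, obstacles: List[Circle],
                  max_range: float, probe: float = 0.1) -> float:
    """Forward-looking detection (assumed form): march along `heading`
    and return the distance to the first obstacle hit, up to `max_range`."""
    for i in range(int(max_range / probe) + 1):
        d = i * probe
        x = p[0] + d * math.cos(heading)
        y = p[1] + d * math.sin(heading)
        if any(math.hypot(x - cx, y - cy) <= r for cx, cy, r in obstacles):
            return d
    return max_range


def adaptive_step(free: float, min_step: float, max_step: float) -> Optional[float]:
    """Hypothetical step rule: use half the free distance, capped at
    `max_step`; report the direction as blocked below `min_step`."""
    step = min(max_step, 0.5 * free)
    return step if step >= min_step else None


def greedy_policy(node: Point, goal: Point) -> float:
    """Stand-in for the policy network: head straight for the goal."""
    return math.atan2(goal[1] - node[1], goal[0] - node[0])


def extend(node: Point, goal: Point, obstacles: List[Circle],
           policy: Callable[[Point, Point], float] = greedy_policy,
           max_range: float = 4.0, min_step: float = 0.2,
           max_step: float = 2.0) -> Optional[Point]:
    """One tree expansion: the policy picks the direction, forward-looking
    detection sets the step size. Returns the new node, or None if blocked."""
    heading = policy(node, goal)
    free = free_distance(node, heading, obstacles, max_range)
    step = adaptive_step(free, min_step, max_step)
    if step is None:
        return None
    return (node[0] + step * math.cos(heading),
            node[1] + step * math.sin(heading))
```

In open water this sketch takes the full `max_step`, while near an obstacle the step shrinks so the new node stays short of the detected boundary. A learned policy would replace `greedy_policy` and could steer around obstacles rather than stopping in front of them.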

Key words: TD3 algorithm, path planning, special environment, RRT algorithm, USV, hindsight experience replay
