Journal of System Simulation, 2025, Vol. 37, Issue (10): 2578-2593. doi: 10.16182/j.issn1004731x.joss.24-0494


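Path Planning of Improved RRT Algorithm Based on Deep Reinforcement Learning

Liang Xiuman, Liu Ziliang, Liu Zhendong

  1. College of Electrical Engineering, North China University of Science and Technology, Tangshan, Hebei 063210, China
  • Received: 2024-05-08 Revised: 2024-09-12 Online: 2025-10-20 Published: 2025-10-21
  • Corresponding author: Liu Ziliang
  • About the first author: Liang Xiuman (1973-), female, associate professor, master's supervisor, master's degree; research interests include machine learning and reinforcement learning.
  • Funding:
    Natural Science Foundation of Hebei Province (F2018209289)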



Abstract:

The RRT (rapidly-exploring random tree) algorithm plans global paths inefficiently in complex three-dimensional environments, and the resulting paths are not safe or practical enough to meet the flight-safety requirements of UAVs. To address this, an improved SAC-RRT algorithm is proposed that fuses the SAC (soft actor-critic) deep reinforcement learning algorithm with RRT. A target point bias strategy and a dynamic step size strategy, both based on the SAC decision network, are designed to reduce the blindness of RRT. A random point correction process adjusts the position of each random point according to the action output by the decision network, which improves path safety. Path simplification and smoothing steps further improve path safety. Planning results in several 3D scenarios of varying complexity show that the SAC-RRT algorithm reduces path length and planning time while improving path smoothness and safety.

Key words: deep reinforcement learning, SAC algorithm, RRT algorithm, UAV, cubic B-spline

CLC number:
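
The abstract names three mechanisms that are easy to picture in code: a target point bias, a dynamic step size, and a path simplification step. The following is a minimal Python sketch of how they could fit into a basic 3D RRT loop. It is not the authors' implementation. In particular, `heuristic_policy` is a hypothetical hand-written stand-in for the SAC decision network, and the spherical obstacle model, the thresholds, and the function names `segment_is_free`, `rrt`, and `simplify` are all assumptions made for illustration. The random point correction and the cubic B-spline smoothing are left out.

```python
import math
import random


def segment_is_free(p, q, obstacles, samples=20):
    """Sample points along segment p-q and test them against spherical obstacles."""
    for i in range(samples + 1):
        t = i / samples
        x = [a + t * (b - a) for a, b in zip(p, q)]
        if any(math.dist(x, center) <= radius for center, radius in obstacles):
            return False
    return True


def heuristic_policy(node, goal, obstacles):
    """Hypothetical stand-in for the SAC decision network (an assumption).

    Returns (goal_bias_probability, step_size). With plenty of clearance the
    tree is pulled toward the goal in long steps; near obstacles it explores
    more randomly in short steps.
    """
    clearance = min((math.dist(node, c) - r for c, r in obstacles),
                    default=float("inf"))
    return (0.5, 1.0) if clearance > 2.0 else (0.1, 0.3)


def rrt(start, goal, obstacles, bounds, policy=heuristic_policy,
        max_iters=5000, goal_tol=0.5, seed=0):
    """Basic RRT with policy-driven goal bias and step size. Returns a path or None."""
    rng = random.Random(seed)
    start, goal = tuple(start), tuple(goal)
    nodes, parent = [start], {0: None}
    for _ in range(max_iters):
        # Goal bias is chosen from the state of the most recently added node.
        bias, _ = policy(nodes[-1], goal, obstacles)
        if rng.random() < bias:
            target = goal
        else:
            target = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        i_near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], target))
        near = nodes[i_near]
        d = math.dist(near, target)
        if d == 0.0:
            continue
        # Step size is chosen from the state of the node being extended.
        _, step = policy(near, goal, obstacles)
        scale = min(step, d) / d
        new = tuple(a + scale * (b - a) for a, b in zip(near, target))
        if not segment_is_free(near, new, obstacles):
            continue
        nodes.append(new)
        parent[len(nodes) - 1] = i_near
        if math.dist(new, goal) <= goal_tol and segment_is_free(new, goal, obstacles):
            path = [] if new == goal else [goal]
            i = len(nodes) - 1
            while i is not None:  # walk back to the root
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None


def simplify(path, obstacles):
    """Greedy shortcutting: from each kept waypoint, jump to the farthest visible one."""
    out, i = [path[0]], 0
    while i < len(path) - 1:
        j = len(path) - 1
        while j > i + 1 and not segment_is_free(path[i], path[j], obstacles):
            j -= 1
        out.append(path[j])
        i = j
    return out
```

A call such as `simplify(rrt((1, 1, 1), (9, 9, 9), [((5, 5, 5), 2.0)], [(0, 10)] * 3), obstacles)` with the same obstacle list would return a shortened waypoint list when a path is found. Replacing `heuristic_policy` with a trained network that maps the local state to a bias probability and a step length is the part the paper's SAC component would supply.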