UAV Path Planning Based on Improved Deep Deterministic Policy Gradients

doi:10.16182/j.issn1004731x.joss.23-1524

Abstract

To address poor convergence and ineffective exploration when UAVs plan paths in complex environments, an improved deep deterministic policy gradient (DDPG) algorithm is proposed. A dual experience pool mechanism stores successful and failed experiences separately, so the algorithm can use successful experiences to strengthen policy optimization and learn from failed experiences to avoid wrong paths. An artificial potential field (APF) method adds a guidance term to the planning; during random sampling, this term is dynamically combined with the noise-perturbed exploratory action to form the selected action. A composite reward function built from direction, distance, obstacle-avoidance, and time rewards achieves multi-objective optimization of the path and mitigates the sparse-reward problem. Experiments show that the proposed algorithm significantly improves the reward and success rate and converges in less time.
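The dual experience pool idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `DualReplayBuffer`, the transition layout, the default capacity, and the fixed `success_ratio` used to mix the two pools are all assumptions made for illustration, since the abstract does not say how batches are drawn from each pool.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

# (state, action, reward, next_state, done); this layout is an assumption.
Transition = Tuple[list, list, float, list, bool]


class DualReplayBuffer:
    """Two FIFO pools: one for successful episodes, one for failed ones."""

    def __init__(self, capacity: int = 10_000, success_ratio: float = 0.5) -> None:
        self.success: Deque[Transition] = deque(maxlen=capacity)
        self.failure: Deque[Transition] = deque(maxlen=capacity)
        # Hypothetical fixed share of each batch drawn from the success pool.
        self.success_ratio = success_ratio

    def store_episode(self, episode: List[Transition], reached_goal: bool) -> None:
        """Route a finished episode to the pool that matches its outcome."""
        pool = self.success if reached_goal else self.failure
        pool.extend(episode)

    def sample(self, batch_size: int) -> List[Transition]:
        """Draw a mixed batch; it may be smaller than batch_size if the pools are short."""
        n_success = min(int(batch_size * self.success_ratio), len(self.success))
        n_failure = min(batch_size - n_success, len(self.failure))
        batch = random.sample(list(self.success), n_success)
        batch += random.sample(list(self.failure), n_failure)
        random.shuffle(batch)
        return batch

    def __len__(self) -> int:
        return len(self.success) + len(self.failure)
```

In a DDPG training loop, a buffer like this would replace the single replay buffer: episodes are stored once their outcome is known, and each critic/actor update draws one mixed batch. A schedule that changes `success_ratio` over training is equally plausible and is left out here.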

Key words: UAV, deep reinforcement learning (DRL), path planning, deep deterministic policy gradient (DDPG), artificial potential field (APF)

CLC Number: TP273

Zhang Sen, Dai Qiangqiang. UAV Path Planning Based on Improved Deep Deterministic Policy Gradients[J]. Journal of System Simulation, 2025, 37(4): 875-881.

Table 1. Training parameter settings

Parameter	Value	Parameter	Value
Learning rate	0.001	$k_1$	0.3
Discount factor	0.9	$k_2$	10
$E_{\max}$	1 000	$k_3$	0.3
Maximum steps	500	$k_4$	15
$P_s$	100 000	$k_5$	5
$P_d$	100 000	$k_6$	50
$\eta$	0.02	$k_7$	1
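
For readers reproducing the setup, the values in Table 1 could be gathered into a single configuration object. The sketch below is illustrative only: the class name `TrainingConfig` and all field names are hypothetical, and because this page does not define the symbols (E max, P s, P d, η, k 1 to k 7), they are kept as opaque names rather than interpreted.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainingConfig:
    """Values copied from Table 1; field names are hypothetical."""

    learning_rate: float = 0.001
    discount_factor: float = 0.9
    e_max: int = 1000        # symbol E_max in the table; meaning not defined here
    max_steps: int = 500     # "maximum steps" row; per-episode limit is an assumption
    p_s: int = 100_000       # symbol P_s in the table; meaning not defined here
    p_d: int = 100_000       # symbol P_d in the table; meaning not defined here
    eta: float = 0.02        # symbol eta in the table; meaning not defined here
    # k_1 .. k_7 in table order; their roles are not defined on this page.
    k: Tuple[float, ...] = (0.3, 10, 0.3, 15, 5, 50, 1)


config = TrainingConfig()
```

Freezing the dataclass keeps the settings immutable during a run, which makes it harder to change a hyperparameter by accident between experiments.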
