Journal of System Simulation ›› 2025, Vol. 37 ›› Issue (9): 2420-2430.doi: 10.16182/j.issn1004731x.joss.24-0369


Robot Path Planning Based on Improved A-DDQN Algorithm

Ni Peilong, Mao Pengjun, Wang Ning, Yang Mengjie   

  1. School of Mechanical and Electrical Engineering, Henan University of Science and Technology, Luoyang 471003, China
  • Received: 2024-04-09  Revised: 2024-04-22  Online: 2025-09-18  Published: 2025-10-24
  • Contact: Mao Pengjun

Abstract:

An improved A-DDQN algorithm is proposed to address reward sparsity and the inability of the traditional DQN algorithm to distinguish sample importance in robot path planning. First, building on the original DQN, the Double-DQN technique is incorporated: the online Q-network selects the next action, and the target network then evaluates that action's value, rather than the target network's maximum predicted Q-value being used directly, which mitigates Q-value overestimation. Second, the artificial potential field (APF) concept is introduced to design a dense reward for each step of the robot's movement, guiding the robot toward the goal and alleviating the sparse-reward problem. Finally, a prioritized experience replay (PER) mechanism is integrated, which adjusts each experience's sampling probability according to its priority, accelerating learning and improving performance. Comparative path-planning experiments on two-dimensional grid maps before and after the improvements show that the improved A-DDQN algorithm reduces path length, number of iterations, and number of turning points by 11.5%, 23.1%, and 61.5% respectively on small-scale maps; by 19.4%, 50.0%, and 52.9% on large-scale maps with sparse obstacles; and by 29.7%, 48.1%, and 64.3% on large-scale maps with dense obstacles. The simulation results demonstrate that the improved algorithm converges faster and achieves superior path-planning performance.
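The three ingredients described above can be sketched in a few lines. The following is a minimal illustrative sketch, not the paper's implementation: the function names and the coefficients `k_att`, `k_rep`, `rho0`, and `alpha` are our own assumptions chosen to show the standard forms of the Double-DQN target, an APF-shaped step reward, and PER sampling probabilities.

```python
import numpy as np

def double_dqn_target(q_online_next, q_target_next, reward, gamma, done):
    """Double DQN: the online network selects the greedy action,
    while the target network evaluates its value (reduces overestimation)."""
    a_star = int(np.argmax(q_online_next))        # action selection (online net)
    bootstrap = q_target_next[a_star]             # action evaluation (target net)
    return reward + (0.0 if done else gamma * bootstrap)

def apf_step_reward(pos, goal, obstacles, k_att=1.0, k_rep=0.5, rho0=2.0):
    """Dense per-step reward shaped by an artificial potential field:
    a negative attractive term toward the goal plus repulsive penalties
    for obstacles within influence radius rho0 (coefficients are assumptions)."""
    d_goal = np.linalg.norm(np.asarray(goal, float) - np.asarray(pos, float))
    reward = -k_att * d_goal
    for obs in obstacles:
        rho = np.linalg.norm(np.asarray(obs, float) - np.asarray(pos, float))
        if rho < rho0:                            # penalize only nearby obstacles
            reward -= k_rep * (1.0 / max(rho, 1e-6) - 1.0 / rho0)
    return reward

def per_probabilities(td_errors, alpha=0.6, eps=1e-3):
    """PER: sampling probability proportional to (|TD error| + eps)**alpha,
    so high-error transitions are replayed more often."""
    p = (np.abs(np.asarray(td_errors, float)) + eps) ** alpha
    return p / p.sum()
```

In a training loop, `apf_step_reward` would replace the sparse goal-only reward at every step, `double_dqn_target` would supply the regression target for the online network, and `per_probabilities` would drive minibatch sampling from the replay buffer.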

Key words: robots, path planning, deep reinforcement learning, artificial potential field (APF), prioritized experience replay (PER)
