Journal of System Simulation ›› 2024, Vol. 36 ›› Issue (5): 1211-1221.doi: 10.16182/j.issn1004731x.joss.23-0051


Path Planning of Unmanned Delivery Vehicle Based on Improved Q-learning Algorithm

Wang Xiaokang, Ji Jie, Liu Yang, He Qing

  1. College of Engineering and Technology, Southwest University, Chongqing 400715, China
  • Received:2023-01-14 Revised:2023-04-03 Online:2024-05-15 Published:2024-05-21
  • Contact: Ji Jie


To address the low planning efficiency and slow convergence of the traditional Q-learning algorithm in unmanned vehicle path planning, a path planning algorithm for unmanned delivery vehicles based on an improved Q-learning algorithm is proposed. Drawing on the energy iteration principle of the simulated annealing algorithm, the greedy factor ε is adjusted dynamically during training to balance exploration and exploitation and thereby improve planning efficiency. The reward in the reward mechanism is changed from a discrete value to a continuous one that increases as the Euclidean distance between the unmanned delivery vehicle and the target point decreases, so that the target point pulls the vehicle toward it and accelerates the algorithm's convergence. The improved Q-learning algorithm is simulated in two different environments. The simulation results show that it can efficiently plan a path from the starting point to the target point in 34 steps, achieving better path quality than the comparison algorithms. Its adaptability to different environments is verified by changing the road environment, and its planning efficiency and convergence speed remain better than those of the traditional Q-learning algorithm.
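The two improvements described above can be illustrated with a minimal tabular Q-learning sketch on a grid map. This is not the paper's implementation: the annealing schedule (geometric cooling of ε), the shaping reward (decrease in Euclidean distance to the goal per step), and all hyperparameters and grid sizes here are illustrative assumptions.

```python
import math
import random

def train(grid, start, goal, episodes=500, alpha=0.5, gamma=0.9,
          eps0=0.9, decay=0.99, seed=0):
    """Tabular Q-learning with an annealing-style epsilon schedule and a
    continuous, Euclidean-distance-shaped reward (illustrative parameters)."""
    random.seed(seed)
    rows, cols = len(grid), len(grid[0])
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
    Q = {}
    q = lambda s, a: Q.get((s, a), 0.0)
    dist = lambda s: math.hypot(s[0] - goal[0], s[1] - goal[1])
    eps = eps0
    for _ in range(episodes):
        s = start
        for _ in range(200):
            # epsilon-greedy action selection with a decaying greedy factor
            if random.random() < eps:
                a = random.randrange(4)
            else:
                a = max(range(4), key=lambda b: q(s, b))
            nr, nc = s[0] + moves[a][0], s[1] + moves[a][1]
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc]:
                ns, r, done = s, -1.0, False        # wall or obstacle: stay, penalize
            elif (nr, nc) == goal:
                ns, r, done = (nr, nc), 10.0, True  # goal reached
            else:
                ns, done = (nr, nc), False
                # continuous reward: grows as the Euclidean distance shrinks
                r = dist(s) - dist(ns)
            best_next = max(q(ns, b) for b in range(4))
            Q[(s, a)] = q(s, a) + alpha * (r + gamma * best_next - q(s, a))
            s = ns
            if done:
                break
        eps *= decay  # geometric "cooling" of epsilon, as in simulated annealing
    return Q

def greedy_path(Q, grid, start, goal, limit=50):
    """Roll out the learned greedy policy from start; stop at goal or limit."""
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    rows, cols = len(grid), len(grid[0])
    s, path = start, [start]
    for _ in range(limit):
        if s == goal:
            break
        a = max(range(4), key=lambda b: Q.get((s, b), 0.0))
        nr, nc = s[0] + moves[a][0], s[1] + moves[a][1]
        if 0 <= nr < rows and 0 <= nc < cols and not grid[nr][nc]:
            s = (nr, nc)
        path.append(s)
    return path
```

On a small 5×5 grid with a wall of obstacles, the shaped reward steers updates toward the goal from the first episodes, while the decaying ε shifts the agent from exploration to exploitation as training proceeds.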

Key words: Q-learning, path planning, convergence speed, planning efficiency, path quality
