Journal of System Simulation ›› 2024, Vol. 36 ›› Issue (5): 1211-1221.doi: 10.16182/j.issn1004731x.joss.23-0051


Path Planning of Unmanned Delivery Vehicle Based on Improved Q-learning Algorithm

Wang Xiaokang, Ji Jie, Liu Yang, He Qing

  1. College of Engineering and Technology, Southwest University, Chongqing 400715, China
  • Received:2023-01-14 Revised:2023-04-03 Online:2024-05-15 Published:2024-05-21
  • Contact: Ji Jie


To address the low planning efficiency and slow convergence of the traditional Q-learning algorithm in unmanned vehicle path planning, a path planning algorithm for unmanned delivery vehicles based on an improved Q-learning algorithm is proposed. Drawing on the energy iteration principle of the simulated annealing algorithm, the greedy factor ε is adjusted dynamically during training to balance exploration and exploitation and thereby improve planning efficiency. The reward in the reward mechanism is changed from a discrete value to a continuous one that increases as the Euclidean distance between the unmanned delivery vehicle and the target point decreases, so that the target point pulls the vehicle toward it and accelerates the algorithm's convergence. The improved Q-learning algorithm is simulated in two different environments. The simulation results show that it can efficiently plan a path from the starting point to the target point in 34 steps, achieving better path quality than the comparison algorithms. Its adaptability to different environments is verified by changing the road environment, and its planning efficiency and convergence speed remain better than those of the traditional Q-learning algorithm.
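The two improvements described above can be illustrated with a minimal tabular Q-learning sketch on a grid map. This is not the paper's implementation: the annealing schedule (geometric cooling of ε), the shaping reward (decrease in Euclidean distance to the goal per step), and all hyperparameters and grid sizes here are illustrative assumptions.

```python
import math
import random

def train(grid, start, goal, episodes=500, alpha=0.5, gamma=0.9,
          eps0=0.9, decay=0.99, seed=0):
    """Tabular Q-learning with an annealing-style epsilon schedule and a
    continuous, Euclidean-distance-shaped reward (illustrative parameters)."""
    random.seed(seed)
    rows, cols = len(grid), len(grid[0])
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
    Q = {}
    q = lambda s, a: Q.get((s, a), 0.0)
    dist = lambda s: math.hypot(s[0] - goal[0], s[1] - goal[1])
    eps = eps0
    for _ in range(episodes):
        s = start
        for _ in range(200):
            # epsilon-greedy action selection with a decaying greedy factor
            if random.random() < eps:
                a = random.randrange(4)
            else:
                a = max(range(4), key=lambda b: q(s, b))
            nr, nc = s[0] + moves[a][0], s[1] + moves[a][1]
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc]:
                ns, r, done = s, -1.0, False        # wall or obstacle: stay, penalize
            elif (nr, nc) == goal:
                ns, r, done = (nr, nc), 10.0, True  # goal reached
            else:
                ns, done = (nr, nc), False
                # continuous reward: grows as the Euclidean distance shrinks
                r = dist(s) - dist(ns)
            best_next = max(q(ns, b) for b in range(4))
            Q[(s, a)] = q(s, a) + alpha * (r + gamma * best_next - q(s, a))
            s = ns
            if done:
                break
        eps *= decay  # geometric "cooling" of epsilon, as in simulated annealing
    return Q

def greedy_path(Q, grid, start, goal, limit=50):
    """Roll out the learned greedy policy from start; stop at goal or limit."""
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    rows, cols = len(grid), len(grid[0])
    s, path = start, [start]
    for _ in range(limit):
        if s == goal:
            break
        a = max(range(4), key=lambda b: Q.get((s, b), 0.0))
        nr, nc = s[0] + moves[a][0], s[1] + moves[a][1]
        if 0 <= nr < rows and 0 <= nc < cols and not grid[nr][nc]:
            s = (nr, nc)
        path.append(s)
    return path
```

On a small 5×5 grid with a wall of obstacles, the shaped reward steers updates toward the goal from the first episodes, while the decaying ε shifts the agent from exploration to exploitation as training proceeds.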

Key words: Q-learning, path planning, convergence speed, planning efficiency, path quality
