Journal of System Simulation ›› 2024, Vol. 36 ›› Issue (12): 2917-2925. doi: 10.16182/j.issn1004731x.joss.23-1422


Path Planning of Desert Robot Based on Deep Reinforcement Learning

Li Ming, Ye Wangzhong, Yan Jiehua

  1. Energy and Transportation Engineering College, Inner Mongolia Agricultural University, Hohhot 010018, China
  • Received: 2023-11-22 Revised: 2023-12-18 Online: 2024-12-20 Published: 2024-12-20
  • About the first author: Li Ming (1983-), male, associate professor, Ph.D.; research interests: intelligent forestry informatization and desertification control.
  • Funding:
    National Key R&D Program of China (2018YFC0507102); Scientific Research Project of Higher Education Institutions of Inner Mongolia Autonomous Region (NJZY22520)

Abstract:

Because the desert environment is complex and variable, obstacle avoidance and path planning are key to the efficient operation of a mobile robot. To address the poor search efficiency and slow convergence of deep reinforcement learning algorithms in complex environments, an improved deep reinforcement learning path planning algorithm is proposed. The exploration factor is adjusted dynamically according to the degree of convergence of the algorithm, so that it decreases as the agent learns more about the environment, which accelerates convergence. To improve search efficiency, a dynamic reward function is designed that incorporates a quadratic function, so that different actions yield different reward values. Simulation experiments show that, compared with the original algorithm, the improved algorithm reduces the path length, the number of iterations, and the planning time by 11.9%, 32.6%, and 17.4%, respectively, and adapts better to complex environments.
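The abstract does not give the formulas for the exploration factor or the reward function, so the following Python sketch is only an illustration of the two ideas under stated assumptions. It assumes that convergence is summarized as a number in [0, 1] (for example, a recent success rate), that the exploration factor falls linearly with that number, and that the reward is the signed square of the progress made toward the goal. The names `exploration_factor`, `quadratic_reward`, `eps_max`, `eps_min`, and `scale` are hypothetical and do not come from the paper.

```python
import math


def exploration_factor(convergence, eps_max=1.0, eps_min=0.05):
    """Map a convergence estimate in [0, 1] to an exploration probability.

    Assumption: a linear schedule. The paper's actual schedule is not
    stated in the abstract.
    """
    c = min(max(convergence, 0.0), 1.0)  # clamp to [0, 1]
    return eps_min + (eps_max - eps_min) * (1.0 - c)


def quadratic_reward(prev_dist, new_dist, scale=1.0):
    """Reward an action by the signed square of its progress toward the goal.

    Assumption: progress is the drop in distance to the goal. Moving closer
    gives a positive reward and moving away gives a negative one, with a
    magnitude that grows quadratically.
    """
    progress = prev_dist - new_dist
    return math.copysign(scale * progress ** 2, progress)


# Exploration shrinks as the agent's estimate of convergence grows.
early = exploration_factor(0.1)
late = exploration_factor(0.9)

# Different actions produce different rewards.
big_step = quadratic_reward(5.0, 3.0)    # 4.0
small_step = quadratic_reward(5.0, 4.0)  # 1.0
backward = quadratic_reward(4.0, 5.0)    # -1.0
```

With a quadratic shape, an action that closes twice the distance earns four times the reward, which is one plausible way for a reward to separate actions more sharply than a constant per-step value would.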

Key words: path planning, robot, deep reinforcement learning, exploration factor, reward function

CLC number: