Journal of System Simulation ›› 2023, Vol. 35 ›› Issue (7): 1619-1633. doi: 10.16182/j.issn1004731x.joss.22-0334

• Papers

动态环境下基于忆阻强化学习的移动机器人路径规划

杨海兰1, 祁永强1, 吴保磊2, 荣丹1, 洪妙英1, 王军3

  1. 中国矿业大学 数学学院, 江苏 徐州 221116
  2. 中国矿业大学 计算机科学与技术学院, 江苏 徐州 221116
  3. 中国矿业大学 信息与控制工程学院, 江苏 徐州 221116
  • Received: 2022-04-11 Revised: 2022-07-07 Online: 2023-07-29 Published: 2023-07-19
  • Corresponding author: 祁永强 E-mail: yhailan163@163.com; qiyongqiang@163.com
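  • About the author: Hailan Yang (杨海兰, born 1999), female, master's student; research interest: intelligent robot control. E-mail: yhailan163@163.com
  • Funding:
    National Natural Science Foundation of China (61304088); Fundamental Research Funds for the Central Universities (2013QNA37); China Postdoctoral Science Foundation (2015M581886); Hybrid Perception in Unstructured Environments (2020ZDPY0217); China University of Mining and Technology Laboratory Open Fund (2020SYKF42); China University of Mining and Technology Future Outstanding Talents Assistance Program (2022WLJCRCZL134)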

Path Planning of Mobile Robots Based on Memristor Reinforcement Learning in Dynamic Environment

Hailan Yang1, Yongqiang Qi1, Baolei Wu2, Dan Rong1, Miaoying Hong1, Jun Wang3

  1. School of Mathematics, China University of Mining and Technology, Xuzhou 221116, China
  2. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
  3. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
  • Received: 2022-04-11 Revised: 2022-07-07 Online: 2023-07-29 Published: 2023-07-19
  • Contact: Yongqiang Qi E-mail: yhailan163@163.com; qiyongqiang@163.com

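Abstract (translated from the Chinese):

To address path planning for mobile robots in dynamic environments, a two-layer path planning algorithm is proposed, built on an improved ant colony algorithm and a memristor-array-based DQN (deep Q-network) algorithm. Static global path planning is carried out by an ant colony algorithm whose probabilistic transfer function and pheromone update rule have been improved. Exploiting the in-memory computing property of memristors, the memristor is used as the synaptic structure of the neural network, which improves the structure of the traditional DQN algorithm and accomplishes local dynamic obstacle avoidance for the mobile robot. The path planning mechanism is switched according to whether a dynamic obstacle is present within the robot's sensing range, completing the path planning task in a dynamic environment. Simulation results show that the algorithm is effective and feasible, and can plan a feasible path for the mobile robot in real time in a dynamic environment.

Key words (translated from the Chinese): dynamic environment, DQN (deep Q-network), memristor, in-memory computing, path planning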

Abstract:

To solve the path planning problem for mobile robots in dynamic environments, a two-layer path planning algorithm is proposed that combines an improved ant colony algorithm with a memristor-array-based deep Q-network (MA-DQN) algorithm. Static global path planning is performed by an ant colony algorithm with an improved probabilistic transfer function and an improved pheromone updating rule. Using the in-memory computing property of memristors, the memristor serves as the synaptic structure of the neural network, which improves the traditional DQN structure and accomplishes local dynamic obstacle avoidance for the mobile robot. The path planning mechanism is switched according to whether dynamic obstacles are present within the sensing range of the mobile robot, thereby completing the path planning task in the dynamic environment. Simulation results show that the algorithm is effective and feasible, and that it can plan a feasible path for the mobile robot in a dynamic environment in real time.

Key words: dynamic environment, deep Q-network (DQN), memristor, in-memory computing, path planning
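
The switching rule described in the abstract (use the global path while no dynamic obstacle is within sensing range, otherwise hand over to local obstacle avoidance) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the circular sensing region of radius `sensing_radius`, the waypoint-list form of the global path, and the `local_policy` callable standing in for the MA-DQN agent are all assumptions made for illustration.

```python
import math
from typing import Callable, Iterable, List, Tuple

Point = Tuple[float, float]


def dynamic_obstacle_in_range(robot: Point,
                              dynamic_obstacles: Iterable[Point],
                              sensing_radius: float) -> bool:
    """Return True if any dynamic obstacle lies within the sensing radius.

    Assumption: the sensing range is a circle centred on the robot.
    """
    return any(math.dist(robot, obs) <= sensing_radius
               for obs in dynamic_obstacles)


def next_position(robot: Point,
                  global_path: List[Point],
                  path_index: int,
                  dynamic_obstacles: Iterable[Point],
                  sensing_radius: float,
                  local_policy: Callable[[Point, List[Point]], Point],
                  ) -> Tuple[Point, int, str]:
    """Choose the next robot position and report which layer produced it."""
    obstacles = list(dynamic_obstacles)
    if dynamic_obstacle_in_range(robot, obstacles, sensing_radius):
        # Local layer: delegate to the avoidance policy (hypothetical
        # stand-in for the learned agent); the path index is unchanged.
        return local_policy(robot, obstacles), path_index, "local"
    # Global layer: advance to the next waypoint of the precomputed path,
    # staying on the last waypoint once the goal is reached.
    next_index = min(path_index + 1, len(global_path) - 1)
    return global_path[next_index], next_index, "global"
```

The sketch only shows the mode switch. How the robot rejoins the global path after a local detour, and how the two planners are actually built, are left to the full paper.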

CLC number: