Journal of System Simulation, 2022, Vol. 34, Issue (10): 2264-2271. doi: 10.16182/j.issn1004731x.joss.21-0632

• Simulation Support Platform / System Technology •

基于强化学习的连续型机械臂自适应跟踪控制

江达1, 蔡志勤1, 刘忠振1, 彭海军1,2, 吴志刚2

  1. 大连理工大学,辽宁 大连 116024
  2. 工业装备结构分析国家重点实验室,辽宁 大连 116024
  • 收稿日期:2021-07-07 修回日期:2021-09-12 出版日期:2022-10-30 发布日期:2022-10-18
  • 通讯作者: 蔡志勤 E-mail:ziangdar@sina.com;zhqcai@dlut.edu.cn
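  • About the author: Da Jiang (born 1992), male, Ph.D. candidate; research interest: dynamics and control of space robots. E-mail: ziangdar@sina.com
  • Funding:
    National Natural Science Foundation of China, Major Research Plan, Key Program (91748203); National Natural Science Foundation of China, Excellent Young Scientists Fund (11922203)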

Adaptive Tracking Control for a Space Continuum Robot Based on Reinforcement Learning

Da Jiang1, Zhiqin Cai1, Zhongzhen Liu1, Haijun Peng1,2, Zhigang Wu2

  1. Dalian University of Technology, Dalian 116024, China
  2. State Key Laboratory of Structural Analysis for Industrial Equipment, Dalian 116024, China
  • Received:2021-07-07 Revised:2021-09-12 Online:2022-10-30 Published:2022-10-18
  • Contact: Zhiqin Cai E-mail:ziangdar@sina.com;zhqcai@dlut.edu.cn

摘要:

针对空间主动碎片清除操作中连续型三臂节机器人系统跟踪问题,提出一种基于强化学习的自适应滑模控制算法。基于数据驱动的建模方法,采用BP神经网络对三臂节连续型机械臂进行建模,并作为预测模型指导强化学习实时调节所提出滑模控制器的控制参数,从而实现连续型机器人运动的实时跟踪控制。仿真结果表明:提出的数据驱动的预测模型对随机轨迹预测的相对误差保持在±1%以内,能够高精度地反映系统动态特性。对比固定参数的滑模控制器,提出的自适应控制器在保证系统达到控制目标的同时具有更低的超调量和更短的调节时间,表现出更好的控制效果。

关键词: 空间连续型机器人, 强化学习, 预测控制, 滑模控制, 轨迹跟踪

Abstract:

To address the tracking control of a three-segment space continuum robot in active space debris removal, an adaptive sliding mode control algorithm based on deep reinforcement learning is proposed. A data-driven dynamic model of the three-segment continuum manipulator is built with a BP neural network and used as the predictive model that guides reinforcement learning in adjusting the sliding mode controller's parameters online, thereby achieving real-time tracking control of the robot's motion. Simulation results show that the data-driven predictive model keeps the relative prediction error on random trajectories within ±1%, reflecting the system's dynamic characteristics with high accuracy. Compared with a fixed-parameter sliding mode controller, the proposed adaptive controller reaches the control objective with lower overshoot and a shorter settling time, giving better tracking performance.
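
The control structure summarized in the abstract (a sliding mode controller whose parameters are retuned online with the help of a predictive model) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: a double-integrator plant stands in for the continuum robot, its disturbance-free nominal model stands in for the BP neural network, and a greedy search over a few hypothetical candidate gain pairs stands in for the learned reinforcement-learning policy.

```python
import math

DT = 0.01  # integration step in seconds (assumed)

def smc_control(e, de, a_ref, lam, k, phi=0.05):
    """Sliding mode law with surface s = de + lam*e; tanh smooths the switching term."""
    s = de + lam * e
    return a_ref - lam * de - k * math.tanh(s / phi)

def model_step(x, v, u, d=0.0):
    """One semi-implicit Euler step of the stand-in plant x'' = u + d."""
    v_next = v + (u + d) * DT
    x_next = x + v_next * DT
    return x_next, v_next

def predicted_cost(x, v, x_ref, lam, k, horizon=20):
    """Roll the disturbance-free model forward and accumulate a tracking cost."""
    cost = 0.0
    for _ in range(horizon):
        u = smc_control(x - x_ref, v, 0.0, lam, k)
        x, v = model_step(x, v, u)
        cost += (x - x_ref) ** 2 + 1e-4 * u ** 2
    return cost

# Hypothetical candidate (lam, k) pairs; a trained policy would output these instead.
CANDIDATES = [(lam, k) for lam in (2.0, 5.0, 10.0) for k in (1.0, 3.0, 6.0)]

def select_params(x, v, x_ref):
    """Greedy stand-in for the RL policy: pick the pair with the lowest predicted cost."""
    return min(CANDIDATES, key=lambda p: predicted_cost(x, v, x_ref, *p))

def simulate(adaptive, steps=500, x_ref=1.0, fixed=(2.0, 3.0)):
    """Track a constant reference; return the absolute position error at the end."""
    x, v = 0.0, 0.0
    for i in range(steps):
        lam, k = select_params(x, v, x_ref) if adaptive else fixed
        u = smc_control(x - x_ref, v, 0.0, lam, k)
        d = 0.5 * math.sin(0.02 * i)  # disturbance that the predictive model does not see
        x, v = model_step(x, v, u, d)
    return abs(x - x_ref)
```

Calling `simulate(adaptive=False)` and `simulate(adaptive=True)` runs the fixed-gain and the retuned controller on the same disturbed plant. The sketch only shows where a predictive model enters the tuning loop; it does not reproduce the overshoot or settling-time results reported in the paper.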

Key words: space continuum robot, reinforcement learning, predictive control, sliding mode control, trajectory tracking

CLC number: