Journal of System Simulation, 2024, Vol. 36, Issue (11): 2631-2643. doi: 10.16182/j.issn1004731x.joss.23-0939

End-to-end Motion Planning of Unmanned Vehicles Based on Multimodal Deep Reinforcement Learning

Ding Kaiyuan1,2, Hamdulla Askar1,2, Zhu Bin3, Firkat Eksan1, Ma Zhengtang1   

  1. School of Computer Science and Technology, Xinjiang University, Urumqi 830017, China
  2. Xinjiang Key Laboratory of Signal Detection and Processing, Urumqi 830017, China
  3. Department of Automation, Tsinghua University, Beijing 100084, China
  • Received: 2023-07-25  Revised: 2023-10-11  Online: 2024-11-13  Published: 2024-11-19

Abstract:

Reinforcement learning generalizes poorly to robot motion planning in difficult terrain when the agent cannot sense its surroundings and therefore cannot reliably avoid obstacles. To address this, a multimodal deep reinforcement learning approach that learns to fuse proprioceptive states with high-dimensional depth sensor inputs is proposed for the motion planning of unmanned vehicles. Proprioceptive states provide contact measurements for immediate reaction, while the onboard visual sensors let the unmanned vehicle learn to anticipate environmental changes and proactively steer around obstacles and uneven terrain many time steps ahead. TransProAct (transformer-based proactive action), a novel end-to-end multimodal Transformer fusion model, is proposed. Its self-attention mechanism fuses proprioceptive states with visual data, and the deep reinforcement learning algorithm PPO (proximal policy optimization) is then used to train the unmanned vehicle's motion planning policy. In addition, multimodal delay randomization is introduced to reduce the gap between simulation and reality. In tests in difficult simulation environments with a variety of obstacles and uneven ground, the proposed approach shows notable gains over the baseline and a marked improvement in generalization ability.
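
The abstract does not spell out how multimodal delay randomization is implemented, so the following is only a minimal sketch of the general idea: each sensing modality gets its own latency, resampled at the start of every simulated episode, so that the policy does not come to rely on perfectly synchronized observations. The class `DelayedSensor`, the helpers `make_sensors` and `rollout`, and the delay bounds are all hypothetical names and values chosen for illustration; they are not taken from the paper.

```python
import random
from collections import deque


class DelayedSensor:
    """Replays one modality's readings with a latency resampled every episode."""

    def __init__(self, max_delay_steps, rng):
        self.max_delay_steps = max_delay_steps  # hypothetical upper bound on latency
        self.rng = rng
        self.delay = 0
        self.history = deque(maxlen=max_delay_steps + 1)

    def reset(self, first_reading):
        # Draw a new latency for this episode and pre-fill the history so that
        # early steps return the first reading instead of failing.
        self.delay = self.rng.randint(0, self.max_delay_steps)
        self.history.clear()
        self.history.extend([first_reading] * (self.max_delay_steps + 1))

    def observe(self, reading):
        # Store the fresh reading, then return the one from `delay` steps ago.
        self.history.append(reading)
        return self.history[-1 - self.delay]


def make_sensors(seed=0):
    rng = random.Random(seed)
    # Assumed bounds (not from the paper): depth frames are allowed to lag
    # further behind than proprioceptive readings.
    return {
        "proprio": DelayedSensor(max_delay_steps=1, rng=rng),
        "depth": DelayedSensor(max_delay_steps=4, rng=rng),
    }


def rollout(sensors, num_steps):
    # Use the step index itself as the "reading" so the delay is easy to see.
    for sensor in sensors.values():
        sensor.reset(first_reading=0)
    trace = []
    for t in range(1, num_steps + 1):
        trace.append({name: s.observe(t) for name, s in sensors.items()})
    return trace
```

Calling `rollout(make_sensors(seed=0), 6)` returns, for each step, the delayed reading that a policy would receive from each modality. In a training loop, these delayed readings would replace the simulator's instantaneous observations before they are passed to the fusion model.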

Key words: multimodal perception, reinforcement learning, unmanned vehicle, motion planning, neural network
