Journal of System Simulation ›› 2019, Vol. 31 ›› Issue (1): 16-26. doi: 10.16182/j.issn1004731x.joss.16PQS-003

DP-Q(λ): Real-time Path Planning for Multi-agent in Large-scale Web3D Scene

Yan Fengting, Jia Jinyuan   

  1. School of Software Engineering, Shanghai 201804, China
  • Received: 2016-05-31 Revised: 2016-08-04 Online: 2019-01-08 Published: 2019-04-16

Abstract: Path planning for multiple agents in an unknown large-scale scene requires an efficient and stable algorithm that also handles collision avoidance among agents and runs in real time in Web3D. To address these problems, the DP-Q(λ) algorithm is proposed. It uses direction constraints and training with heavily weighted rewards and punishments, and it adjusts the reward or punishment values with a probability p (a random number between 0 and 1). The resulting reward or punishment value determines the agent's path planning strategy for its next step: if the next position is free, the agent can move to it. This strategy is extended to multi-agent path planning and applied in Web3D. Experiments show that the DP-Q(λ) algorithm is efficient and stable for real-time multi-agent path planning in Web3D.
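
The abstract does not give the update rules of DP-Q(λ), so the following Python sketch is only an illustration of the ingredients it names, not the authors' algorithm. It combines a textbook Watkins-style Q(λ) update with a reward that is adjusted with probability p according to the direction of the move, and it lets the agent step only into a free cell. The grid model, the helper names (`shaped_reward`, `q_lambda_episode`), the Manhattan-distance direction test, and all numeric constants are assumptions made for this example.

```python
import random

# 4-connected grid moves; the grid model is an assumption, not from the abstract.
MOVES = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def shaped_reward(pos, nxt, goal, free, p, rng, bonus=10.0, penalty=-10.0):
    """Hypothetical reward rule: heavy punishment for a blocked cell, heavy
    reward for the goal, and with probability p a direction-based term."""
    if nxt not in free:
        return penalty
    if nxt == goal:
        return bonus
    if rng.random() < p:
        # Direction constraint (assumed): reward moves that approach the goal.
        return 1.0 if manhattan(nxt, goal) < manhattan(pos, goal) else -1.0
    return -0.1  # small step cost otherwise


def q_lambda_episode(Q, start, goal, free, rng, p=0.5, alpha=0.5,
                     gamma=0.9, lam=0.8, eps=0.1, max_steps=200):
    """Run one episode of tabular Watkins-style Q(lambda); returns the final cell."""
    traces = {}
    pos = start
    for _ in range(max_steps):
        if pos == goal:
            break
        qs = [Q.get((pos, a), 0.0) for a in range(4)]
        # Epsilon-greedy action selection.
        a = rng.randrange(4) if rng.random() < eps else qs.index(max(qs))
        if qs[a] < max(qs):
            traces.clear()  # exploratory action: cut the eligibility traces
        step = MOVES[a]
        nxt = (pos[0] + step[0], pos[1] + step[1])
        r = shaped_reward(pos, nxt, goal, free, p, rng)
        new_pos = nxt if nxt in free else pos  # walk only if the cell is free
        best_next = 0.0 if new_pos == goal else max(
            Q.get((new_pos, b), 0.0) for b in range(4))
        delta = r + gamma * best_next - Q.get((pos, a), 0.0)
        traces[(pos, a)] = traces.get((pos, a), 0.0) + 1.0
        # Propagate the TD error along the traces, then decay them.
        for key, e in list(traces.items()):
            Q[key] = Q.get(key, 0.0) + alpha * delta * e
            traces[key] = gamma * lam * e
        pos = new_pos
    return pos
```

For several agents, one possible reading of the free-cell rule is to remove the cells currently occupied by the other agents from `free` before each agent's step, so that no two agents enter the same cell. That extension is likewise an assumption and is not spelled out in the abstract.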

Key words: Web3D, large-scale unknown environment, multi-agent, reinforcement learning, dynamic rewards p, path planning
