Journal of System Simulation ›› 2024, Vol. 36 ›› Issue (1): 39-49. doi: 10.16182/j.issn1004731x.joss.22-0886

• Papers

Strategy Optimization Method of Multi-dimension Projection Based on Deep Reinforcement Learning

An Jing1,2,3, Si Guangya3, Zhang Lei1,2,3

  1. Joint Logistics College, PLA National Defense University, Beijing 100858, China
  2. Graduate School, PLA National Defense University, Beijing 100091, China
  3. Joint Operations College, PLA National Defense University, Beijing 100091, China
  • Received: 2022-08-02 Revised: 2022-09-27 Online: 2024-01-20 Published: 2024-01-19
  • Contact: Si Guangya E-mail: anj21_2000@sina.com; sgy863@sina.com

Abstract:

Building on the strong performance of deep reinforcement learning (DRL) in strategy optimization, this paper proposes a strategy optimization method that takes the multi-dimension projection action as its main research object. The method combines simulation experiments with DRL. After reviewing the current state of strategy optimization research, a deep learning framework is selected to suit the research problem, and a DRL multi-dimension projection strategy model based on the asynchronous advantage actor-critic (A3C) algorithm is constructed. Through simulation experiments, interactive learning between the DRL model and the "out-of-the-loop" simulation is realized, and an optimized multi-dimension projection strategy is obtained. Finally, the effectiveness of cooperative strategy optimization between the DRL framework and the simulation experiments is verified.
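For readers unfamiliar with A3C, the quantity each asynchronous worker typically computes from a short rollout is the n-step advantage, which weights the actor's policy-gradient update and serves as the critic's regression error. The following is a minimal sketch of that standard calculation only, not the model constructed in this paper. The function names, the discount factor `gamma`, and the toy rewards and value estimates are all illustrative assumptions.

```python
from typing import List


def n_step_returns(rewards: List[float], bootstrap_value: float, gamma: float) -> List[float]:
    """Discounted returns R_t = r_t + gamma * R_{t+1}, seeded with the critic's
    value estimate for the state reached after the last step of the rollout."""
    returns: List[float] = []
    running = bootstrap_value
    # Walk the rollout backwards so each return reuses the one after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def n_step_advantages(
    rewards: List[float],
    values: List[float],
    bootstrap_value: float,
    gamma: float = 0.99,
) -> List[float]:
    """Advantage A_t = R_t - V(s_t): how much better the rollout turned out
    than the critic predicted at each step."""
    returns = n_step_returns(rewards, bootstrap_value, gamma)
    return [ret - value for ret, value in zip(returns, values)]


# Toy rollout (hypothetical numbers): three simulation steps, terminal state afterwards.
toy_rewards = [1.0, 0.0, 2.0]
toy_values = [1.0, 1.0, 1.0]
toy_advantages = n_step_advantages(toy_rewards, toy_values, bootstrap_value=0.0, gamma=0.5)
```

In a full A3C setup, each worker would obtain the rewards by stepping its own copy of the simulation and would push gradients derived from these advantages to a shared set of network parameters.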

Key words: deep reinforcement learning (DRL), simulation, strategy optimization, multi-dimension projection, asynchronous advantage actor-critic (A3C) algorithm

CLC Number: