Journal of System Simulation ›› 2022, Vol. 34 ›› Issue (6): 1247-1258.doi: 10.16182/j.issn1004731x.joss.21-0099

• Modeling Theory and Methodology •

Application of Improved Q Learning Algorithm in Job Shop Scheduling Problem

Yejian Zhao, Yanhong Wang, Jun Zhang, Hongxia Yu, Zhongda Tian

  1. School of Artificial Intelligence, Shenyang University of Technology, Shenyang 110027, China
  • Received:2021-02-02 Revised:2021-03-14 Online:2022-06-30 Published:2022-06-16
  • Contact: Yanhong Wang E-mail:zhao_yejian@163.com;wangyh_sut@163.com

Abstract:

To address job shop scheduling in a dynamic environment, a dynamic scheduling algorithm based on an improved Q-learning algorithm and dispatching rules is proposed. The state space of the dynamic scheduling algorithm is described with the concept of "the urgency of remaining tasks," and a reward function following the principle of "the higher the slack, the higher the penalty" is designed. To address the problem that the greedy strategy selects sub-optimal actions in the later stage of learning, the traditional Q-learning algorithm is improved by introducing an action selection strategy based on the softmax function, so that in the early stage of learning the improved algorithm selects different actions with more nearly equal probabilities. Simulation results on 6 different test instances show that the performance indicator of the proposed scheduling algorithm improves by an average of about 6.5% over the algorithm before improvement, and by about 38.3% and 38.9% over the IPSO and PSO algorithms, respectively. The indicator is significantly better than that of conventional methods such as single dispatching rules and traditional optimization algorithms.
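The softmax-based action selection described in the abstract can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation: the function name, the temperature parameter, and the roulette-wheel sampling scheme are assumptions. The key property is that a high temperature makes the selection probabilities over actions (e.g., dispatching rules) nearly uniform, encouraging exploration early in learning, while a low temperature approaches greedy selection.

```python
import math
import random

def softmax_action(q_values, temperature=1.0):
    """Select an action index with probability proportional to
    exp(Q / temperature): high temperature -> near-uniform exploration,
    low temperature -> near-greedy exploitation."""
    # Subtract the maximum Q-value before exponentiating for numerical stability.
    m = max(q_values)
    exp_q = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exp_q)
    probs = [e / total for e in exp_q]
    # Roulette-wheel sampling over the resulting action probabilities.
    r, cum = random.random(), 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1
```

In a Q-learning loop, `q_values` would be the row of the Q-table for the current state, and the temperature could be annealed over episodes so that behavior shifts from exploration toward exploitation.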

Key words: reinforcement learning, Q learning, dispatching rules, dynamic scheduling, job shop scheduling
