Journal of System Simulation ›› 2025, Vol. 37 ›› Issue (2): 474-486. doi: 10.16182/j.issn1004731x.joss.23-1164

• Research Paper •

Research on Flexible Integrated Scheduling Under Stochastic Processing Times Based on Improved D3QN Algorithm

Li Xiang, Ren Xiaoyu, Zhou Yongbing, Zhang Jian

  1. School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031, China
  • Received: 2023-09-19 Revised: 2023-11-08 Online: 2025-02-14 Published: 2025-02-10
  • Contact: Zhang Jian
  • About the first author: Li Xiang (born 1998), male, master's student; research interests: intelligent scheduling and decision-making.
  • Supported by:
    Sichuan Province Major Science and Technology Special Project Fund (2020ZDZX001503)

Abstract:

To address processing-time uncertainty in discrete manufacturing workshops, we construct an integrated scheduling mathematical model that accounts for equipment and operation constraints and takes minimization of the maximum completion time (makespan) as its optimization objective. We propose an improved dueling double deep Q-network algorithm (ID3QN) to solve the flexible integrated scheduling problem under stochastic processing times. Three groups of state features are designed at the operation, machine, and overall scheduling levels. The action set consists of eight composite scheduling rules, formed by combining operation rules based on processing times, processing sequences, and the process structure tree with machine rules related to the optimization objective. The immediate reward is set according to the difference in average machine utilization. A self-attention mechanism and a mixed sampling strategy are introduced to further improve the stability and generalization of the algorithm. Simulation results show that, on the flexible integrated scheduling problem under stochastic processing times, ID3QN improves the average deviation by 54.63% on average relative to existing deep reinforcement learning algorithms, which supports the effectiveness of the proposed algorithm.

Key words: flexible integrated scheduling, ID3QN, self-attention mechanism, stochastic processing times
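
For readers unfamiliar with the building blocks named in the abstract, the following is a minimal sketch of the standard dueling and double-DQN computations, together with one possible reading of a reward based on the difference in average machine utilization. It is not the authors' ID3QN: the self-attention mechanism and mixed sampling strategy are omitted, and the function names, the utilization definition (busy time divided by elapsed time), and all numeric values are assumptions chosen for illustration only.

```python
from typing import Sequence


def dueling_q(value: float, advantages: Sequence[float]) -> list[float]:
    """Standard dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    done: bool,
) -> float:
    """Standard double-DQN target: the online network selects the next
    action, and the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


def average_utilization(busy_times: Sequence[float], now: float) -> float:
    """ASSUMED definition: mean over machines of (busy time / elapsed time)."""
    if now <= 0:
        return 0.0
    return sum(b / now for b in busy_times) / len(busy_times)


def utilization_reward(u_prev: float, u_curr: float) -> float:
    """ASSUMED reward shape: change in average utilization between two
    consecutive decision points (positive when utilization rises)."""
    return u_curr - u_prev


# Illustrative values only; each action index would correspond to one of
# the composite scheduling rules in the action set.
q_values = dueling_q(1.0, [0.5, -0.5, 0.0])
print(q_values)  # [1.5, 0.5, 1.0]

target = double_dqn_target(1.0, 0.5, [0.2, 0.8, 0.5], [1.0, 2.0, 3.0], done=False)
print(target)  # 2.0

u_before = average_utilization([4.0, 2.0], now=8.0)   # 0.375
u_after = average_utilization([6.0, 6.0], now=12.0)   # 0.5
print(utilization_reward(u_before, u_after))  # 0.125
```

In the double-DQN target, the action is chosen with the online estimates (index 1 here) but scored with the target estimates, which is the usual way to reduce the overestimation that a single max over one network tends to produce.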

CLC Number: