Job Scheduling and Simulation in Cloud Based on Deep Reinforcement Learning

Journal of System Simulation ›› 2022, Vol. 34 ›› Issue (2): 258-268. doi: 10.16182/j.issn1004731x.joss.21-0337

Qirui Li¹, Xinyi Peng²

^1. College of Computer Science, Guangdong University of Petrochemical Technology, Maoming 525000, China
^2. School of Mathematical Sciences, South China Normal University, Guangzhou 510631, China

Received: 2021-04-20 Revised: 2021-07-01 Online: 2022-02-18 Published: 2022-02-23
Corresponding author: Xinyi Peng. E-mail: liqirui@gdupt.edu.cn; 1742043887@qq.com
About the author: Qirui Li (1982-), male, master's degree, associate professor; research interest: cloud computing resource scheduling. E-mail: liqirui@gdupt.edu.cn
Funding: National Natural Science Foundation of China (61772145); Natural Science Foundation of Guangdong Province (2020A1515010727); Guangdong Provincial Science and Technology Special Fund (mmkj2020008)

Abstract

To address the difficulty of job scheduling in complex, rapidly changing cloud computing environments with multiple users, multiple queues, and multiple data centers, this paper proposes a job scheduling method based on deep reinforcement learning. A system model of cloud job scheduling and its mathematical model are established, together with an optimization objective composed of three parts: transmission time, waiting time, and execution time. A job scheduling algorithm is designed on the basis of deep reinforcement learning, and its state space, action space, and reward function are defined. A simulated cloud job scheduler is designed and developed to carry out simulated scheduling of jobs. Simulation results show that, compared with baseline algorithms such as random scheduling, round-robin scheduling, first fit, and best fit, the proposed algorithm effectively reduces the overall makespan of the jobs.

Key words: cloud computing, job scheduling, deep reinforcement learning, makespan, multi-user, multi-queue, multi-data-center
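
The four baselines named in the abstract are standard dispatch heuristics. The sketch below shows one common way to define them; it is not the paper's implementation. It assumes each job requests a number of cores and each data center reports its free cores. The function names and the choice of "free cores" as the fitted resource are illustrative assumptions.

```python
import random
from typing import Optional, Sequence


def random_choice(free_cores: Sequence[int], rng: random.Random) -> int:
    """Random scheduling: pick any data center uniformly, ignoring load."""
    return rng.randrange(len(free_cores))


def round_robin(free_cores: Sequence[int], last: int) -> int:
    """Round-robin scheduling: pick the data center after the last one used."""
    return (last + 1) % len(free_cores)


def first_fit(free_cores: Sequence[int], need: int) -> Optional[int]:
    """First fit: pick the first data center with enough free cores."""
    for index, free in enumerate(free_cores):
        if free >= need:
            return index
    return None  # no data center can host the job right now


def best_fit(free_cores: Sequence[int], need: int) -> Optional[int]:
    """Best fit: pick the data center that leaves the fewest cores unused."""
    candidates = [(free - need, index)
                  for index, free in enumerate(free_cores) if free >= need]
    return min(candidates)[1] if candidates else None


# Hypothetical snapshot of free cores across six data centers.
FREE_CORES = [4, 8, 12, 6, 10, 8]
```

With this snapshot and a job that needs 5 cores, first fit selects index 1 (the first center with at least 5 free cores), while best fit selects index 3 (only one core left over). Neither heuristic looks at transmission or waiting time, which are the terms the learned scheduler is meant to account for.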

CLC number: TP391.9

Cite this article: Qirui Li, Xinyi Peng. Job Scheduling and Simulation in Cloud Based on Deep Reinforcement Learning[J]. Journal of System Simulation, 2022, 34(2): 258-268.

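Table 2

DQN network hyperparameters

Parameter	Value	Parameter	Value
Number of training episodes	5000	Target network update frequency	300
Learning rate $α$	0.01	Initial $ε$	0.2
Discount factor $β$	0.9	Maximum $ε$	0.9
Replay memory size	500	$ε$ increment per episode	0.002
Batch size	64	Waiting-time coefficient $ζ$	0.5
Number of hidden layers	2	Transmission-time coefficient $ξ$	1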
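
The three $ε$ entries in the table suggest a schedule that starts at 0.2 and grows by 0.002 per episode up to 0.9. The sketch below is one plausible reading, not the paper's code. It assumes the growth is linear and that $ε$ is the probability of taking the greedy action (inferred from the initial value being smaller than the maximum). The function names are illustrative.

```python
import random
from typing import Sequence

EPS_START = 0.2   # initial epsilon (from the table)
EPS_MAX = 0.9     # maximum epsilon (from the table)
EPS_STEP = 0.002  # epsilon increment per episode (from the table)
EPISODES = 5000   # number of training episodes (from the table)


def epsilon_at(episode: int) -> float:
    """Assumed linear schedule: grow by EPS_STEP per episode, capped at EPS_MAX."""
    return min(EPS_MAX, EPS_START + EPS_STEP * episode)


def select_action(q_values: Sequence[float], epsilon: float,
                  rng: random.Random) -> int:
    """Assumed convention: exploit with probability epsilon, otherwise explore."""
    if rng.random() < epsilon:
        # Greedy action: index of the largest estimated Q-value.
        return max(range(len(q_values)), key=q_values.__getitem__)
    # Exploratory action: uniform over all actions.
    return rng.randrange(len(q_values))
```

Under this reading the cap is reached after (0.9 - 0.2) / 0.002 = 350 episodes, so most of the 5000 training episodes would run with a 10% exploration rate.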



No.	Computing capacity (cycles)	Number of cores	Bandwidth (Mbps)
1	650	4	200
2	1850	8	300
3	2500	12	500
4	700	6	250
5	2050	10	400
6	1500	8	200
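
To make the table concrete, the sketch below estimates a weighted completion time for a single job on each listed configuration. This is an illustration under stated assumptions, not the paper's model. It assumes transmission time is data size divided by bandwidth, execution time is required work divided by computing capacity, and the three time components are combined as execution + $ζ$ · waiting + $ξ$ · transmission, using the coefficients $ζ$ = 0.5 and $ξ$ = 1 from the hyperparameter table. The job values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCenter:
    capacity: float        # "Computing capacity (cycles)" column
    cores: int             # "Number of cores" column
    bandwidth_mbps: float  # "Bandwidth (Mbps)" column


# Rows 1-6 of the table above.
DATA_CENTERS = [
    DataCenter(650, 4, 200),
    DataCenter(1850, 8, 300),
    DataCenter(2500, 12, 500),
    DataCenter(700, 6, 250),
    DataCenter(2050, 10, 400),
    DataCenter(1500, 8, 200),
]

ZETA = 0.5  # waiting-time coefficient (from the hyperparameter table)
XI = 1.0    # transmission-time coefficient (from the hyperparameter table)


def weighted_time(dc: DataCenter, size_mbit: float, work: float,
                  waiting: float) -> float:
    """Assumed combination of the three time components for one job."""
    transmission = size_mbit / dc.bandwidth_mbps  # megabits / Mbps = seconds
    execution = work / dc.capacity                # assumed work-per-capacity ratio
    return execution + ZETA * waiting + XI * transmission


def cheapest_center(size_mbit: float, work: float, waiting: float) -> int:
    """Index of the data center with the smallest weighted time."""
    costs = [weighted_time(dc, size_mbit, work, waiting) for dc in DATA_CENTERS]
    return costs.index(min(costs))
```

For a hypothetical job with 1000 megabits of input, 5000 units of work, and no queueing delay, row 3 gives 5000 / 2500 + 1000 / 500 = 4.0 and is the cheapest choice. In a real run the waiting term depends on each center's queue, which is what makes a fixed greedy rule insufficient.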
