Journal of System Simulation ›› 2024, Vol. 36 ›› Issue (9): 2113-2126. doi: 10.16182/j.issn1004731x.joss.23-0576

• Research Paper

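Edge Surveillance Task Offloading and Resource Allocation Algorithm Based on DRL

Li Chao1, Li Jiabao1, Ding Caichang2, Ye Zhiwei1, Zuo Fangwei1

  1. School of Computer Science, Hubei University of Technology, Wuhan 430000, Hubei, China
  2. School of Computer and Information Science, Hubei Engineering University, Xiaogan 432000, Hubei, China
  • Received: 2023-05-16 Revised: 2023-07-16 Online: 2024-09-15 Published: 2024-09-30
  • Corresponding author: Ding Caichang
  • About the first author: Li Chao (1982-), male, lecturer, Ph.D.; research interests: deep learning and edge computing.
  • Supported by:
    National Natural Science Foundation of China (61902116); Science and Technology Research Program of the Hubei Provincial Department of Education, Young and Middle-aged Talent Project (Q20202705); Hubei Provincial College Students' Innovation Training Program (S202210500096)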

Edge Surveillance Task Offloading and Resource Allocation Algorithm Based on DRL

Li Chao1, Li Jiabao1, Ding Caichang2, Ye Zhiwei1, Zuo Fangwei1   

  1. School of Computer Science, Hubei University of Technology, Wuhan 430000, China
  2. School of Computer and Information Science, Hubei Engineering University, Xiaogan 432000, China
  • Received: 2023-05-16 Revised: 2023-07-16 Online: 2024-09-15 Published: 2024-09-30
  • Contact: Ding Caichang

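Abstract:

To address the resource constraints faced by intensive surveillance tasks in edge computing environments, a DRL-based surveillance task offloading and resource allocation algorithm is proposed. Taking surveillance task delay and recognition accuracy as the optimization objectives, the joint decision problem of task offloading, wireless channel allocation, and image compression rate in the surveillance system is modeled as a Markov decision process. Because the dynamics of wireless channels and the randomness of surveillance tasks cause large fluctuations in the training samples, which make convergence slow and unstable, a Transformer attention mechanism is used to jointly encode the channel states and surveillance task information over multi-slot sequences. The encoded state information captures the dependencies among multi-slot state sequences, improves the representation of the network state, and thereby improves the robustness of the algorithm. Experimental results show that, compared with traditional reinforcement learning algorithms and heuristic algorithms, the proposed algorithm effectively improves recognition accuracy while reducing task computation delay.

Key words: surveillance task, mobile edge computing, deep reinforcement learning, task offloading, resource allocation, attention mechanism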

Abstract:

To address the resource constraints of intensive surveillance tasks in edge computing, a surveillance task offloading and resource allocation algorithm based on deep reinforcement learning (DRL) is proposed. With surveillance task delay and recognition accuracy as the optimization objectives, the joint decision problem of task offloading, wireless channel allocation, and image compression rate is modeled as a Markov decision process. The dynamic nature of wireless channels and the randomness of surveillance tasks make the training samples highly volatile, which slows convergence and makes it unstable; to address this, an attention mechanism is used to jointly encode channel states and surveillance task information from multi-slot state sequences. By capturing the dependencies among multi-slot states, the encoding improves the representation of the network state and the robustness of the algorithm. Experimental results show that the proposed algorithm outperforms traditional reinforcement learning algorithms and heuristic algorithms, improving recognition accuracy while reducing task computation delay.

Key words: surveillance task, mobile edge computing, DRL, task offloading, resource allocation, attention mechanism
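
The joint encoding of multi-slot states mentioned in the abstract can be illustrated with a minimal sketch of single-head scaled dot-product self-attention. This is not the authors' network: the feature layout, the dimensions, and the random projection matrices are all assumptions chosen for illustration, and a full Transformer encoder would add positional encoding, multiple heads, and feed-forward layers with learned weights.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(states, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a slot sequence.

    states: T x d_in list of per-slot feature vectors. The layout is
            hypothetical: e.g. channel gains followed by task features.
    Returns (encoded, weights): T x d_k encodings and T x T attention weights.
    """
    q = matmul(states, w_q)
    k = matmul(states, w_k)
    v = matmul(states, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    # Each slot scores every other slot; scaling by sqrt(d_k) keeps scores moderate.
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    # Each encoded slot is a weighted mix of the value vectors of all slots.
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Placeholder values; a trained model would use learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
T, D_IN, D_K = 4, 5, 3  # assumed: 4 time slots, 5 features per slot, 3-dim encoding
states = random_matrix(T, D_IN, rng)  # stands in for observed channel/task features
w_q, w_k, w_v = (random_matrix(D_IN, D_K, rng) for _ in range(3))
encoded, weights = self_attention(states, w_q, w_k, w_v)
```

Each row of `weights` shows how strongly one time slot attends to every slot in the window, which is how an encoding of this kind can reflect dependencies across slots before being passed to a policy network.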

CLC Number: