Edge Surveillance Task Offloading and Resource Allocation Algorithm Based on DRL

doi:10.16182/j.issn1004731x.joss.23-0576

Abstract

To address the resource constraints faced by intensive surveillance tasks in edge computing environments, a surveillance task offloading and resource allocation algorithm based on deep reinforcement learning (DRL) is proposed. Taking surveillance task delay and recognition accuracy as the optimization objectives, the joint decision on task offloading, wireless channel allocation, and image compression rate in the surveillance system is modeled as a Markov decision process. The dynamics of wireless channels and the randomness of surveillance tasks make the training samples highly volatile, which slows convergence and makes it unstable. To counter this, a Transformer attention mechanism jointly encodes the channel states and surveillance task information of multi-slot sequences. The encoded state captures the dependencies among multi-slot state sequences, which improves the representation of the network state and, in turn, the robustness of the algorithm. Experimental results show that, compared with traditional reinforcement learning algorithms and heuristic algorithms, the proposed algorithm reduces task computation delay while effectively improving recognition accuracy.

Key words: surveillance task, mobile edge computing, deep reinforcement learning (DRL), task offloading, resource allocation, attention mechanism
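
The multi-slot encoding described in the abstract can be illustrated with a minimal sketch of single-head scaled dot-product self-attention, the core operation of the Transformer. This is not the authors' network: the window of three slots (assumed here to be the current slot plus the $v = 2$ historical states listed in Table 1), the feature sizes, the random projection matrices, and the names `encode_multislot_state` and `random_matrix` are all assumptions made for illustration.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax of one list of scores."""
    m = max(row)
    exps = [math.exp(s - m) for s in row]
    total = sum(exps)
    return [e / total for e in exps]

def encode_multislot_state(states, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a window of slots.

    states: T rows, one per time slot, each a list of d features
            (for example channel gains followed by task descriptors).
    Returns (encoded, weights): T x d_k encoded rows and T x T attention weights.
    """
    q, k, v = matmul(states, w_q), matmul(states, w_k), matmul(states, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of K
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]  # each slot attends to every slot
    return matmul(weights, v), weights

def random_matrix(rows, cols):
    """Hypothetical stand-in for learned parameters or observed states."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
T, D, D_K = 3, 6, 4  # assumed: 3 slots, 6 raw features, 4 encoded features
states = random_matrix(T, D)
w_q, w_k, w_v = (random_matrix(D, D_K) for _ in range(3))
encoded, weights = encode_multislot_state(states, w_q, w_k, w_v)
print(len(encoded), len(encoded[0]))  # one encoded row per slot
```

Each row of `weights` sums to 1, so every encoded slot is a convex combination of the projected slots in the window; this is how the encoding can carry dependencies between slots into the state seen by the agent.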

CLC number: TP391.9

Li Chao, Li Jiabao, Ding Caichang, et al. Edge Surveillance Task Offloading and Resource Allocation Algorithm Based on DRL[J]. Journal of System Simulation, 2024, 36(9): 2113-2126.

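Table 1 Simulation parameter settings

Parameter	Preset value
Number of relays $K$	10
Number of cameras connected to each relay $C$	3
Transmission power $P$ /dBm	15
Noise power $p_N$ /dBW	-97
Total bandwidth $B$ /MHz	7
Learning rate $\alpha$	$1 \times 10^{-5}$
Discount rate $\gamma$	0.9
Subchannel bandwidth $B_{sub}$ /kHz	700
Bounding parameter $\xi$	1.5
Bounding parameter $G$	0.5
Reward function parameter $\rho$	16
Number of historical states $v$	2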
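
As a rough consistency check on Table 1, the sketch below converts the two power values to watts and derives the number of subchannels and cameras. The dictionary keys, the helper names, the channel gain value, the reading that every relay serves exactly $C$ cameras, and the use of the Shannon capacity formula for a per-subchannel rate are assumptions for illustration; this page does not state the transmission model.

```python
import math

# Values copied from Table 1; the key names are illustrative.
PARAMS = {
    "num_relays": 10,                  # K
    "cameras_per_relay": 3,            # C
    "tx_power_dbm": 15.0,              # P
    "noise_power_dbw": -97.0,          # p_N
    "total_bandwidth_hz": 7e6,         # B
    "subchannel_bandwidth_hz": 700e3,  # B_sub
}

def dbm_to_watts(dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** ((dbm - 30) / 10)

def dbw_to_watts(dbw):
    """Convert a power level in dBW to watts."""
    return 10 ** (dbw / 10)

def shannon_rate(bandwidth_hz, tx_w, gain, noise_w):
    """Shannon capacity in bit/s (an assumed model, not taken from the paper)."""
    return bandwidth_hz * math.log2(1 + tx_w * gain / noise_w)

num_subchannels = round(
    PARAMS["total_bandwidth_hz"] / PARAMS["subchannel_bandwidth_hz"]
)
# Assumes every relay serves exactly C cameras.
num_cameras = PARAMS["num_relays"] * PARAMS["cameras_per_relay"]
tx_w = dbm_to_watts(PARAMS["tx_power_dbm"])
noise_w = dbw_to_watts(PARAMS["noise_power_dbw"])

HYPOTHETICAL_GAIN = 1e-7  # placeholder channel gain, not from the source
rate = shannon_rate(
    PARAMS["subchannel_bandwidth_hz"], tx_w, HYPOTHETICAL_GAIN, noise_w
)

print(num_subchannels, num_cameras)
print(f"P = {tx_w:.4e} W, p_N = {noise_w:.4e} W, rate = {rate / 1e6:.2f} Mbit/s")
```

With these values, ten 700 kHz subchannels exactly fill the 7 MHz band.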

Table 4 Action probabilities for different image size sets

Set	$A_{front}$	$A_{wait}$	$A_{back}$
size1	0.01332	0.04705	0.93961
size2	0.02468	0.04776	0.92754
size3	0.02512	0.04823	0.92663
size4	0.03369	0.04849	0.91780
size5	0.03584	0.05109	0.91305

Table 5 Probability of selecting each compression rate for different image size sets

Set	$m_1$	$m_2$	$m_3$	$m_4$	$m_5$
size1	0.91580	0.03297	0.01786	0.01594	0.01741
size2	0.83159	0.05753	0.03680	0.03434	0.03972
size3	0.81536	0.06758	0.04189	0.03731	0.03784
size4	0.78979	0.07730	0.04619	0.04190	0.04479
size5	0.77831	0.07543	0.05224	0.04326	0.05075

Table 6 Action probabilities for different image confidence sets

Confidence	$A_{front}$	$A_{wait}$	$A_{back}$
$\beta_1$	0.00207	0.04837	0.94955
$\beta_2$	0.00353	0.05377	0.94268
$\beta_3$	0.00482	0.06372	0.93144
$\beta_4$	0.01513	0.05518	0.92967
$\beta_5$	0.03258	0.06693	0.90047


Algorithm	Accuracy	Front-end delay/ms	Back-end delay/ms
SAC-MSE	0.974	50.06	99.60
SAC	0.958	50.24	99.84
Random	0.798	55.83	104.76
DQN	0.942	53.56	99.75
Greedy	0.895	54.62	103.31

Number of heads	Accuracy	Front-end delay/ms	Back-end delay/ms
1	0.9454	50.2756	99.7025
2	0.9476	50.1927	99.6951
7	0.9498	50.1641	99.7812
