Computation Offloading Strategy Based on Stackelberg Game and DRL

Journal of System Simulation, 2023, Vol. 35, Issue (2): 372-385. doi: 10.16182/j.issn1004731x.joss.21-1118

Xianwei Zhou, Qixu Gong, Songsen Yu

School of Software, South China Normal University, Foshan 528225, Guangdong, China

Received: 2021-11-02 Revised: 2022-01-07 Online: 2023-02-28 Published: 2023-02-16
Corresponding author: Songsen Yu E-mail: 20871147@qq.com; yss8109@163.com
First author: Xianwei Zhou (born 1982), male, lecturer, Ph.D.; research interests include robotics and multi-sensor information fusion. E-mail: 20871147@qq.com
Funding:
Guangdong Province Major Special Project for Applied Science and Technology R&D (2016B020244003); Guangdong Basic and Applied Basic Research Foundation (2020B1515120089); Guangdong Province Enterprise Science and Technology Commissioner Project (GDKTP2020014000)

Abstract:

To obtain optimal computation offloading strategies for the two types of users in a 5G hybrid private network, the competition between the two user types for mobile edge computing (MEC) server resources is modeled as a Stackelberg game, and strategies are studied under both the complete-information game and the incomplete-information game. In the complete-information game, it is proved that a unique Nash equilibrium exists. In the incomplete-information game, the environment is modeled as a partially observable Markov decision process (POMDP), and an offloading strategy based on two-stage deep reinforcement learning (TSDRL) is proposed. Simulation results show that, compared with the D-DRL algorithm, the proposed algorithm reduces delay by 20.81% and energy consumption by 3.38%, effectively improving user quality of experience (QoE).

Key words: 5G hybrid private network, computation offloading, Stackelberg game, Nash equilibrium, partially observable Markov decision process (POMDP)

CLC number: TP393.01

Xianwei Zhou, Qixu Gong, Songsen Yu. Computation Offloading Strategy Based on Stackelberg Game and DRL[J]. Journal of System Simulation, 2023, 35(2): 372-385.


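Table 1

Experimental parameter settings

Parameter	Value	Parameter	Value
Total computing resource $R$ (Mbps)	50	Private-network user 2 transmission rate $r_{l,2}^{k}$ (Mb/s)	1/6
Private-network user task size $C_{l,i}$ (Mb)	10	Private-network user 3 transmission rate $r_{l,3}^{k}$ (Mb/s)	1/5.6
Public-network user task size $C_{f,j}$ (Mb)	1	Public-network user 1 transmission rate $r_{f,1}^{k}$ (Mb/s)	1/8.4
Unit transmission time cost $\rho$ (J/s)	1	Public-network user 2 transmission rate $r_{f,2}^{k}$ (Mb/s)	1/8.3
Power cost per unit of data $\nu$ (J/Mb)	0.1	Public-network user 3 transmission rate $r_{f,3}^{k}$ (Mb/s)	1/8.2
Private-network user local computing resource $F_{l,i}^{k}$ (Mb/s)	0.1	Public-network user 4 transmission rate $r_{f,4}^{k}$ (Mb/s)	1/8.1
Public-network user local computing resource $F_{f,j}^{k}$ (Mb/s)	0.01	Public-network user 5 transmission rate $r_{f,5}^{k}$ (Mb/s)	1/8
Private-network user 1 transmission rate $r_{l,1}^{k}$ (Mb/s)	1/6.4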
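
To give a feel for the scale of the Table 1 settings, the following is a minimal sketch that compares how long each user would need to compute a full task locally with how long it would need just to upload it. It assumes the simplest possible model, time = data size / rate, which is an illustrative assumption and not the cost model of the paper (that model is not reproduced on this page). The function name `duration_s` and all variable names are hypothetical, and MEC-side processing time is ignored because it depends on how the total resource $R$ is shared.

```python
from fractions import Fraction

# Values copied from Table 1 (task sizes in Mb, rates in Mb/s).
PRIVATE_TASK_MB = Fraction(10)
PUBLIC_TASK_MB = Fraction(1)
PRIVATE_LOCAL_RATE = Fraction("0.1")
PUBLIC_LOCAL_RATE = Fraction("0.01")
# Transmission rates are listed in the table as 1/6.4, 1/6, ... Mb/s.
PRIVATE_UPLINK_RATES = [1 / Fraction(d) for d in ("6.4", "6", "5.6")]
PUBLIC_UPLINK_RATES = [1 / Fraction(d) for d in ("8.4", "8.3", "8.2", "8.1", "8")]


def duration_s(size_mb, rate_mb_per_s):
    """Assumed model (not from the paper): time = data size / rate."""
    return size_mb / rate_mb_per_s


# Time to process the whole task on the device itself.
private_local_s = duration_s(PRIVATE_TASK_MB, PRIVATE_LOCAL_RATE)
public_local_s = duration_s(PUBLIC_TASK_MB, PUBLIC_LOCAL_RATE)

# Time to upload the whole task to the MEC server (server compute time ignored).
private_upload_s = [duration_s(PRIVATE_TASK_MB, r) for r in PRIVATE_UPLINK_RATES]
public_upload_s = [duration_s(PUBLIC_TASK_MB, r) for r in PUBLIC_UPLINK_RATES]

if __name__ == "__main__":
    print("private local (s):", float(private_local_s))
    print("private upload (s):", [float(t) for t in private_upload_s])
    print("public local (s):", float(public_local_s))
    print("public upload (s):", [float(t) for t in public_upload_s])
```

Under this assumed model, a private-network user's 10 Mb task takes 100 s to compute locally and 64 s, 60 s, and 56 s to upload for users 1 to 3, while a public-network user's 1 Mb task takes 100 s locally and between 8 s and 8.4 s to upload. Exact fractions are used so that rates such as 1/6.4 do not introduce rounding noise.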


