Capacity Market Trading Strategies of Generators Based on PER-MADDPG Algorithm

doi:10.16182/j.issn1004731x.joss.25-0456

Abstract:

To address how power generators should trade off bid quantity against bid price to maximize revenue in different capacity market environments, a capacity market bidding equilibrium model is constructed. Because traditional solution methods rely on a complete-information assumption and make poor use of historical trading strategy information, a capacity market trading simulation method based on prioritized experience replay multi-agent deep deterministic policy gradient (PER-MADDPG) is proposed. The action space is built from the quantity and price bidding strategies, and the state space from historical trading strategies and winning-bid information. Acting on limited state information, each generator uses the prioritized experience replay mechanism to assign sampling probabilities according to each sample's temporal-difference error, so that samples with larger errors are replayed more frequently during training. This effectively addresses the amplification of gradient noise caused by non-stationary interaction among multiple agents, and improves sample utilization efficiency and model convergence speed. Market simulation results show that, compared with the MADDPG, MAPPO, MASAC, MATD3, and QMIX algorithms, the proposed method increases the generators' average reward by 2 853.08, 3 628.74, 2 167.11, 4 260.19, and 5 459.64 yuan, respectively, and shortens the average algorithm runtime by 15.35%, 8.18%, 3.87%, 5.33%, and 31.03%, respectively. The proposed method can help generators formulate optimal capacity market trading strategies under different market environments and increase capacity revenue, and it can also serve as a reference for capacity market designers in China when selecting a clearing price mechanism, reducing grid capacity procurement costs.

Key words: capacity market, generator, bidding equilibrium model, PER-MADDPG algorithm, trading strategy
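
The replay mechanism described in the abstract (sampling probability driven by temporal-difference error) can be sketched as follows. This is a minimal illustration of standard proportional prioritization, not the authors' implementation: the class name, the exponents `alpha` and `beta`, and the offset `eps` are assumptions that do not appear in the source. Only the buffer size (10 000) and batch size (256) are taken from the paper's hyperparameter table.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized experience replay (illustrative sketch)."""

    def __init__(self, capacity=10_000, alpha=0.6, eps=1e-6):
        self.capacity = capacity  # replay pool size from the hyperparameter table
        self.alpha = alpha        # assumed: how strongly TD error shapes sampling
        self.eps = eps            # assumed: keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0

    def add(self, transition):
        # New samples receive the current maximum priority so that each one
        # has a good chance of being replayed before its TD error is known.
        # (max() is O(n); a sum tree would be used at scale.)
        max_p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(max_p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size=256, beta=0.4, rng=random):
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idx = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights offset the bias of non-uniform sampling;
        # they are normalized so the largest weight in the batch is 1.
        weights = [(len(self.data) * probs[i]) ** (-beta) for i in idx]
        w_max = max(weights)
        weights = [w / w_max for w in weights]
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, idx, td_errors):
        # Larger |TD error| -> larger priority -> replayed more often.
        for i, delta in zip(idx, td_errors):
            self.priorities[i] = (abs(delta) + self.eps) ** self.alpha
```

In a MADDPG-style loop, each agent's critic would compute TD errors for the sampled batch, scale its loss by `weights`, and then call `update_priorities` with the fresh errors. Whether the paper uses importance-sampling correction is not stated in this excerpt.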

CLC number:

TM73

Li Yanbin, Pan Zhaolun, Ma Xinyue, et al. Capacity Market Trading Strategies of Generators Based on PER-MADDPG Algorithm[J]. Journal of System Simulation, 2026, 38(5): 1408-1425.

Figures/Tables: 29 (Figs. 1-25, Tables 1-4)


Name	Installed capacity/MW	Unit capacity cost/(10^4 yuan/MW)
G1	1 000	2.8
G2	1 000	3.4
G3	600	3.3
G4	600	3.5
G5	600	3.6
G6	600	4.1
G7	300	3.2
G8	300	3.3
G9	300	3.4
G10	300	3.7
G11	300	4.2
G12	100	4.0

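Parameter	Value
Number of agents	12
Actor network layers	4
Critic network layers	4
Replay buffer size	10 000
Number of iterations	5 000
Learning rate	0.000 1
Reward discount factor	0.90
Experience samples per batch	256
ε-greedy exploration rate	0.15
Target-network soft-update parameter	0.01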
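
The hyperparameters above can be gathered into a single configuration object, and the soft-update parameter can be illustrated with the usual Polyak averaging rule, target ← τ·online + (1 − τ)·target. The sketch below is an assumption-laden illustration: the field names and the `soft_update` helper are hypothetical, and the excerpt does not confirm the exact update formula the authors use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values copied from the hyperparameter table; field names are hypothetical."""
    n_agents: int = 12
    actor_layers: int = 4
    critic_layers: int = 4
    buffer_size: int = 10_000
    iterations: int = 5_000
    learning_rate: float = 1e-4
    gamma: float = 0.90      # reward discount factor
    batch_size: int = 256
    epsilon: float = 0.15    # epsilon-greedy exploration rate
    tau: float = 0.01        # target-network soft-update parameter


def soft_update(target, online, tau):
    """Return tau * online + (1 - tau) * target, element-wise (assumed Polyak rule)."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


cfg = TrainConfig()
target_weights = soft_update([0.0, 1.0], [1.0, 1.0], cfg.tau)
```

Under this rule the gap between target and online weights shrinks by a factor of (1 − τ) per update, so with τ = 0.01 it halves roughly every 69 updates.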

Scenario	Total algorithm runtime	Average runtime per training round
Ⅰ	2 828.09	0.56
Ⅱ	2 552.61	0.51
Ⅲ	2 916.69	0.58
Ⅳ	3 449.42	0.68
Ⅴ	3 814.01	0.76
Ⅵ	3 625.13	0.72
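
The per-round averages in the runtime table appear to equal the total runtime divided by the 5 000 iterations listed in the hyperparameter table, truncated to two decimals. This relationship is inferred here and is not stated in the excerpt, and the table gives no time unit. A short check under that assumption:

```python
import math

ITERATIONS = 5_000  # from the hyperparameter table

# (total runtime, reported per-round average) for Scenarios I to VI
runtime = {
    "I": (2828.09, 0.56),
    "II": (2552.61, 0.51),
    "III": (2916.69, 0.58),
    "IV": (3449.42, 0.68),
    "V": (3814.01, 0.76),
    "VI": (3625.13, 0.72),
}

# Assumed relationship: average = total / iterations, truncated to 2 decimals.
recomputed = {
    scenario: math.floor(total / ITERATIONS * 100) / 100
    for scenario, (total, _) in runtime.items()
}

for scenario, (total, reported) in runtime.items():
    print(scenario, recomputed[scenario], reported)
```

Rounding instead of truncating would give 0.57, 0.69, and 0.73 for Scenarios I, IV, and VI, so truncation is the reading that matches all six rows.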

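Scenario	Scenario Ⅰ		Scenario Ⅱ		Scenario Ⅲ		Scenario Ⅳ		Scenario Ⅴ		Scenario Ⅵ
Generator	Awarded capacity/MW	Clearing price/(10^4 yuan/MW)	Awarded capacity/MW	Clearing price/(10^4 yuan/MW)	Awarded capacity/MW	Clearing price/(10^4 yuan/MW)	Awarded capacity/MW	Clearing price/(10^4 yuan/MW)	Awarded capacity/MW	Clearing price/(10^4 yuan/MW)	Awarded capacity/MW	Clearing price/(10^4 yuan/MW)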
G1	993	3.34	972	3.85	1 000	4.93	993	3.63	997	3.62	984	3.62
G2	986	3.34	953	3.85	1 000	4.93	986	3.86	1 000	3.86	982	3.86
G3	600	3.34	592	3.85	600	4.93	583	3.83	596	3.83	600	3.85
G4	600	3.34	600	3.85	600	4.93	390	3.9	598	3.9	596	4.6
G5	600	3.34	591	3.85	581	4.93	583.63	3.87	540	3.95	600	3.94
G6	89	3.34	600	3.85	599	4.93	475.55	3.59	600	4.17	600	4.14
G7	299	3.34	300	3.85	300	4.93	300	3.78	293.97	4.17	300	3.78
G8	297	3.34	292	3.85	300	4.93	300	3.83	294.64	4.05	297	4.47
G9	298	3.34	226	3.85	298	4.93	232	3.88	295	3.86	298	4.54
G10	298	3.34	292	3.85	297	4.93	294.64	3.67	293	3.99	297	4.72
G11	0	3.34	295	3.85	295	4.93	0	0	294.64	4.11	297	4.18
G12	100	3.34	100	3.85	97	4.93	0	0	0	0	99	4.84
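
In the results table, Scenarios Ⅰ to Ⅲ show one clearing price shared by all generators, while Scenarios Ⅳ to Ⅵ show generator-specific prices. One possible reading, which this excerpt does not confirm, is uniform-price versus pay-as-bid settlement. The sketch below shows how a merit-order auction would settle under each rule. It uses the capacities and unit costs from the generator table, but it assumes every generator bids its full capacity at cost and uses a hypothetical demand of 5 000 MW. The function names are illustrative and this is not the paper's clearing model.

```python
# Capacities (MW) and unit capacity costs (10^4 yuan/MW) from the generator table.
GENERATORS = [
    ("G1", 1000, 2.8), ("G2", 1000, 3.4), ("G3", 600, 3.3), ("G4", 600, 3.5),
    ("G5", 600, 3.6), ("G6", 600, 4.1), ("G7", 300, 3.2), ("G8", 300, 3.3),
    ("G9", 300, 3.4), ("G10", 300, 3.7), ("G11", 300, 4.2), ("G12", 100, 4.0),
]


def clear_market(bids, demand_mw, rule="uniform"):
    """Merit-order clearing; returns {name: (awarded_mw, paid_price)}."""
    accepted, remaining = [], demand_mw
    for name, qty, price in sorted(bids, key=lambda b: b[2]):  # cheapest first
        if remaining <= 0:
            break
        awarded = min(qty, remaining)
        accepted.append((name, awarded, price))
        remaining -= awarded
    if not accepted:
        return {}
    marginal = accepted[-1][2]  # price of the last accepted bid
    return {
        name: (awarded, marginal if rule == "uniform" else price)
        for name, awarded, price in accepted
    }


def procurement_cost(result):
    """Total payment by the buyer, in 10^4 yuan."""
    return sum(mw * price for mw, price in result.values())


DEMAND_MW = 5000  # hypothetical demand, not from the source
uniform = clear_market(GENERATORS, DEMAND_MW, "uniform")
pay_as_bid = clear_market(GENERATORS, DEMAND_MW, "pay_as_bid")
print(round(procurement_cost(uniform), 2), round(procurement_cost(pay_as_bid), 2))
```

With these at-cost bids the marginal unit is G10 at 3.7, so uniform pricing costs the buyer 18 500 and pay-as-bid 16 520 (10^4 yuan). That gap holds only for fixed bids. Under pay-as-bid, generators have an incentive to raise their bids toward the expected marginal price, and this strategic response is what the learning-based simulation is meant to capture.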