Journal of System Simulation ›› 2024, Vol. 36 ›› Issue (12): 2884-2893. doi: 10.16182/j.issn1004731x.joss.24-FZ0761


Research on Scheduling Strategies Simulation for Building Air-conditioning Systems Based on Transfer Imitation Learning

Wang Qiaochu1, Ding Yan1, Liang Chuanzhi2, Zhang Haozheng1, Huang Chen1

  1. School of Environmental Science and Engineering, Tianjin University, Tianjin 300354, China
  2. Technology and Industrialization Development Center of the Ministry of Housing and Urban-Rural Development, Beijing 100835, China
  • Received: 2024-07-16 Revised: 2024-10-20 Online: 2024-12-20 Published: 2024-12-20
  • Contact: Ding Yan
  • About the first author: Wang Qiaochu (1994-), female, PhD candidate; research interest: operation optimization of energy systems.

Abstract:

To address unstable performance and low training efficiency under low-quality data conditions during the initial stage of online deployment of air-conditioning scheduling, this paper proposes a method for formulating air-conditioning simulation scheduling strategies based on transfer imitation learning. Building operation strategies are obtained through reinforcement learning. A standard building simulation model is established as the source domain for transfer learning, and an imitation learning loss term is incorporated into the agent's loss function to enhance algorithm performance. The results show that, compared with the method without transfer learning, the proposed method improves operational efficiency by 16.2%, effectively resolving the operational instability in the early stage of reinforcement learning training. Compared with the method without imitation learning, operational efficiency improves by 11.5%, effectively improving the training efficiency of reinforcement learning.

Key words: transfer learning, reinforcement learning, imitation learning, air conditioning control, room temperature control
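
The abstract says an imitation learning loss is added to the agent's loss function, but this page does not give its form. The sketch below shows one common way such a combination can be written, purely as an illustration: a one-step temporal-difference loss plus a weighted behaviour-cloning term. The function names, the squared-error form of both terms, and the parameters `il_weight` and `gamma` are assumptions for this example and are not taken from the paper.

```python
import numpy as np


def td_loss(q_pred, reward, q_next_max, gamma=0.99):
    """Mean squared one-step temporal-difference error (assumed RL loss form)."""
    target = reward + gamma * q_next_max
    return float(np.mean((target - q_pred) ** 2))


def imitation_loss(agent_action, expert_action):
    """Behaviour-cloning term: mean squared gap to the demonstrated action."""
    return float(np.mean((agent_action - expert_action) ** 2))


def combined_loss(q_pred, reward, q_next_max, agent_action, expert_action,
                  il_weight=0.5, gamma=0.99):
    """RL loss plus a weighted imitation term; il_weight is a hypothetical knob."""
    rl_term = td_loss(q_pred, reward, q_next_max, gamma)
    il_term = imitation_loss(agent_action, expert_action)
    return rl_term + il_weight * il_term


# Toy batch: actions are temperature setpoints in degrees Celsius (illustrative values).
q_pred = np.array([1.0])
reward = np.array([1.0])
q_next_max = np.array([0.0])
agent_setpoint = np.array([24.0])
expert_setpoint = np.array([26.0])

loss = combined_loss(q_pred, reward, q_next_max, agent_setpoint, expert_setpoint)
```

In this toy batch the temporal-difference term is zero, so the whole loss comes from the gap between the agent's setpoint and the demonstrated one. Setting `il_weight` to zero recovers the plain reinforcement learning loss, which is the kind of baseline the abstract compares against.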

CLC number: