Journal of System Simulation, 2022, Vol. 34, Issue (6): 1353-1366. doi: 10.16182/j.issn1004731x.joss.21-0108

• National Economy Simulation •

Research on Fire Emergency Evacuation Simulation Based on Cooperative Deep Reinforcement Learning

Lingjia Ni1,2, Xiaoxia Huang1,3, Hongga Li1, Zibo Zhang1,2

  1. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  3. Key Laboratory of Urban Land Resources Monitoring and Simulation, MNR, Shenzhen 518034, China
  • Received: 2021-02-04 Revised: 2021-05-18 Online: 2022-06-30 Published: 2022-06-16
  • Contact: Xiaoxia Huang E-mail: 13476143753@163.com; huangxx@aircas.ac.cn
  • About the author: Lingjia Ni (1996-), male, master's student; research interest: fire emergency evacuation. E-mail: 13476143753@163.com
  • Supported by:
    National Natural Science Foundation of China (41971363); Open Fund of the Key Laboratory of Urban Land Resources Monitoring and Simulation, MNR (KF-2018-03-032); National Key Research and Development Program of China (2017YFB0503905)

Abstract:

Fire is one of the major disasters threatening public safety, and the high temperatures and toxic, harmful smoke it produces seriously interfere with the selection of evacuation routes. This study introduces deep reinforcement learning into emergency evacuation simulation and proposes a cooperative double deep Q network algorithm for multi-agent environments. A fire scene model that changes dynamically over time is established to provide real-time information on the distribution of dangerous areas during evacuation. The separate neural networks of the individual agents are integrated into a single deep neural network shared by all agents, so that the agents share both the network and their experience, which improves the overall efficiency of cooperative evacuation. Experimental comparisons show that the proposed method is stable and adaptable, improves training and learning efficiency, and has good application value.

Key words: cooperative double deep Q network algorithm, deep reinforcement learning, multi-agent system, emergency evacuation simulation, fire scenario simulation
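
The two ideas the abstract combines, a double-Q learning target and a single value function plus experience pool shared by all agents, can be illustrated with a minimal sketch. It is not the authors' implementation: dictionary-based Q tables stand in for the deep neural networks, and the names `SharedReplayBuffer`, `double_q_target` and `train_step`, the four-action space, and the values of `GAMMA` and `lr` are all assumptions made for illustration.

```python
import random
from collections import deque

GAMMA = 0.9      # discount factor (assumed value, not from the paper)
N_ACTIONS = 4    # e.g. four grid moves (assumed action space)


class SharedReplayBuffer:
    """One experience pool that every agent writes to and samples from."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sample without replacement, capped at the current buffer size.
        k = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), k)


def q_values(table, state):
    """Return Q(state, .) from a dict-based table standing in for a network."""
    return table.setdefault(state, [0.0] * N_ACTIONS)


def double_q_target(online, target, reward, next_state, done):
    """Double-Q target: the online estimate picks the action, the target estimate scores it."""
    if done:
        return reward
    next_q_online = q_values(online, next_state)
    best_action = max(range(N_ACTIONS), key=lambda a: next_q_online[a])
    return reward + GAMMA * q_values(target, next_state)[best_action]


def train_step(online, target, buffer, batch_size=32, lr=0.5):
    """Move the shared online estimate towards the double-Q target on a sampled batch."""
    for state, action, reward, next_state, done in buffer.sample(batch_size):
        y = double_q_target(online, target, reward, next_state, done)
        q = q_values(online, state)
        q[action] += lr * (y - q[action])
```

In this sketch, cooperation amounts to every agent calling `push` on the same `SharedReplayBuffer` and every update going through the same `online` table, so a transition experienced by one agent immediately informs the action values used by all the others. Selecting the next action with `online` but evaluating it with `target` is the step that separates double Q-learning from a standard Q-learning target.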

CLC Number: