Journal of System Simulation ›› 2022, Vol. 34 ›› Issue (6): 1353-1366. DOI: 10.16182/j.issn1004731x.joss.21-0108
Lingjia Ni1,2, Xiaoxia Huang1,3, Hongga Li1, Zibo Zhang1,2
Received: 2021-02-04
Revised: 2021-05-18
Online: 2022-06-30
Published: 2022-06-16
Contact: Xiaoxia Huang
E-mail: 13476143753@163.com; huangxx@aircas.ac.cn
First author: Lingjia Ni (1996-), male, master's student; research interest: fire emergency evacuation.
Abstract:
Fire is one of the major disasters threatening public safety, and the high temperature and toxic smoke it produces strongly constrain the choice of evacuation routes. This study introduces deep reinforcement learning into emergency evacuation simulation and proposes a cooperative double deep Q-network (DDQN) algorithm for multi-agent environments. A fire scene model that evolves dynamically over time is built to provide real-time information on the distribution of hazardous areas for evacuees. The independent neural networks of individual agents are integrated into a single unified deep neural network, so that all agents share both the network and their experience, which improves overall cooperative evacuation efficiency. The results show that the proposed method has good stability and adaptability, improves training and learning efficiency, and has good application value.
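The cooperative scheme described in the abstract (every agent updating one shared network from one shared replay buffer, with double-Q targets) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class name, the tabular Q-function standing in for the deep network, and all hyperparameters are assumptions.

```python
import random
from collections import deque

import numpy as np


class CooperativeDDQN:
    """Minimal cooperative double-DQN sketch: all agents share one online
    network, one target network, and one experience replay buffer.
    A Q-table stands in for the deep network to keep the sketch small."""

    def __init__(self, n_states, n_actions, gamma=0.9, lr=0.1, buffer_size=10000):
        self.gamma, self.lr = gamma, lr
        self.q_online = np.zeros((n_states, n_actions))
        self.q_target = np.zeros((n_states, n_actions))
        self.replay = deque(maxlen=buffer_size)  # shared by every agent

    def store(self, s, a, r, s_next, done):
        # Experience from *any* agent goes into the same shared buffer.
        self.replay.append((s, a, r, s_next, done))

    def update(self, batch_size=32):
        batch = random.sample(self.replay, min(batch_size, len(self.replay)))
        for s, a, r, s_next, done in batch:
            # Double-DQN target: the online net selects the next action,
            # the target net evaluates it.
            a_star = int(np.argmax(self.q_online[s_next]))
            target = r if done else r + self.gamma * self.q_target[s_next, a_star]
            self.q_online[s, a] += self.lr * (target - self.q_online[s, a])

    def sync_target(self):
        # Periodically copy the online network into the target network.
        self.q_target = self.q_online.copy()
```

In use, every evacuating agent calls `store` and `update` on the same instance, so experience gathered by one agent immediately benefits all others; this shared-network design is what distinguishes the cooperative variant from independent DQN (IDQN), where each agent trains its own network in isolation.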
Lingjia Ni, Xiaoxia Huang, Hongga Li, et al. Research on Fire Emergency Evacuation Simulation Based on Cooperative Deep Reinforcement Learning[J]. Journal of System Simulation, 2022, 34(6): 1353-1366.
Table 6 Comparison of evacuation results of the IDQN and cooperative DDQN algorithms on the first floor of Liuxiandong Building, with and without fire (unit: s)

| Metric | Trial | No fire: IDQN | No fire: Cooperative DDQN | Fire: IDQN | Fire: Cooperative DDQN |
|---|---|---|---|---|---|
| Average evacuation time per person | Trial 1 | 14.93 | 14.93 | 15.11 | 14.97 |
| | Trial 2 | 14.93 | 14.92 | 15.05 | 14.96 |
| | Trial 3 | 14.92 | 14.91 | 15.13 | 14.98 |
| | Mean of 3 trials | 14.93 | 14.92 | 15.10 | 14.97 |
| Total evacuation time | Trial 1 | 26.00 | 26.00 | 26.00 | 26.00 |
| | Trial 2 | 26.00 | 26.00 | 27.00 | 26.00 |
| | Trial 3 | 26.00 | 26.00 | 26.33 | 26.00 |
| | Mean of 3 trials | 26.00 | 26.00 | 26.44 | 26.00 |