Journal of System Simulation, 2021, Vol. 33, Issue (1): 24-36. doi: 10.16182/j.issn1004731x.joss.20-0690

• Simulation Modeling Theory and Methods •

Evolutionary Simulation of Online Public Opinion Based on the BERT-LDA Model under COVID-19

Zhuang Muni1,3, Li Yong1,2, Tan Xu1,3, Mao Taitian1, Lan Kaicheng3, Xing Lining4

  1. School of Public Management, Xiangtan University, Xiangtan 411105, China;
  2. School of Economics and Management, Changsha University, Changsha 410022, China;
  3. School of Software Engineering, Shenzhen Institute of Information Technology, Shenzhen 518172, China;
  4. College of Systems Engineering, National University of Defense Technology, Changsha 410022, China
  • Received: 2020-08-31 Revised: 2020-11-04 Published: 2021-01-18
  • About the author: Zhuang Muni (1996-), female, master's student; research interest: online public opinion analysis. E-mail: 997737694@qq.com
  • Funding:
    National Natural Science Foundation of China (72074033); Humanities and Social Sciences Foundation of the Ministry of Education (17YJCZH157); Guangdong Province Innovation Team Project for Public Security Applications of Video and Image Big Data; Key Basic Research Project of the Shenzhen Science and Technology Program (JCYJ20200109141218676)

Abstract: A large-scale simulation model of online public opinion evolution can guide differentiated emergency management and public opinion guidance for Wuhan, the area hit hardest by the COVID-19 outbreak, and for the rest of China. To simulate the evolution of public sentiment at topic-level granularity, the LDA (Latent Dirichlet Allocation) topic model is deeply integrated with BERT (Bidirectional Encoder Representations from Transformers) word vectors, which optimizes the topic vectors and supports topic clustering of texts. In addition, the BERT pre-training tasks are improved and a deep pre-training task is added on top of them to raise the model's accuracy in sentiment classification. The results show that, in topic vector training, the improved BERT-LDA model raises the NPMI (Normalized Pointwise Mutual Information) value by 0.357 over the original LDA model. On the sentiment classification task for epidemic events, the AUC (Area Under the Curve) value exceeds 99.6%, which indicates that the model can be applied effectively to large-scale simulation of online public opinion evolution.

Key words: COVID-19 (coronavirus disease 2019), BERT-LDA model, public opinion evolution simulation, difference comparison
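
For readers unfamiliar with the NPMI score cited in the abstract, the following is a minimal sketch of how NPMI topic coherence is commonly computed. It is not the authors' evaluation code: the toy corpus, the function names `npmi_pair` and `topic_npmi`, the document-level co-occurrence counting, and the edge-case conventions are all assumptions made for illustration.

```python
import math
from itertools import combinations


def npmi_pair(w1, w2, docs):
    """NPMI of two words, using document-level co-occurrence (an assumption)."""
    n = len(docs)
    p1 = sum(w1 in d for d in docs) / n
    p2 = sum(w2 in d for d in docs) / n
    p12 = sum(w1 in d and w2 in d for d in docs) / n
    if p12 == 0.0:
        # The words never co-occur: NPMI takes its minimum value.
        return -1.0
    if p12 == 1.0:
        # Both words appear in every document: PMI is 0 and the normalizer
        # is 0, so 0.0 is returned by convention (a choice made here).
        return 0.0
    pmi = math.log(p12 / (p1 * p2))
    return pmi / -math.log(p12)


def topic_npmi(top_words, docs):
    """Average NPMI over all pairs of a topic's top words."""
    pairs = list(combinations(top_words, 2))
    return sum(npmi_pair(a, b, docs) for a, b in pairs) / len(pairs)


# Hypothetical toy corpus: each document is a set of tokens.
docs = [
    {"mask", "hospital", "wuhan"},
    {"mask", "hospital"},
    {"school", "exam"},
    {"school", "exam", "mask"},
]

coherent = topic_npmi(["mask", "hospital"], docs)  # words that co-occur
mixed = topic_npmi(["hospital", "exam"], docs)     # words that never co-occur
print(f"coherent topic: {coherent:.3f}, mixed topic: {mixed:.3f}")
```

NPMI lies in [-1, 1], and higher values mean that a topic's top words tend to appear together. An improvement of 0.357, as reported in the abstract, is therefore large relative to the range of the score.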

CLC number: