Journal of System Simulation ›› 2026, Vol. 38 ›› Issue (1): 158-173. doi: 10.16182/j.issn1004731x.joss.25-0896

• Papers •

Cross-domain Crowd Counting Model Based on Frequency Domain Enhancement

Zhang De1,2, Liang Zishan1,2, Liu Ningning3

  1. School of Intelligence Science and Technology, Beijing University of Civil Engineering and Architecture, Beijing 102616, China
  2. Beijing Key Laboratory of Super Intelligent Technology for Urban Architecture, Beijing University of Civil Engineering and Architecture, Beijing 102616, China
  3. School of Information Technology and Management, University of International Business and Economics, Beijing 100029, China
  • Received: 2025-09-15 Revised: 2025-11-11 Online: 2026-01-18 Published: 2026-01-28
  • Contact: Liu Ningning
  • About the first author: Zhang De (1979-), male, associate professor, Ph.D.; research interests: computer vision and virtual reality.
  • Funding:
    National Natural Science Foundation of China (62271035); Open Project of the Beijing Key Laboratory of Super Intelligent Technology for Urban Architecture (BKL-SITUA-202502); Artificial Intelligence Interdisciplinary Special Project of the School of Information Technology and Management, University of International Business and Economics

Abstract:

Crowd counting takes video surveillance data as input and can be applied to city digital twin platform construction, virtual city modeling, and smart city management. However, when the data domain of the application scenario differs from that of the training scenario, counting performance often drops significantly. To address this, a cross-domain crowd counting model based on frequency domain enhancement is proposed. To alleviate distribution differences between domains, a frequency domain feature enhancement module and a domain-invariant frequency domain adapter module are constructed: the former uses the discrete cosine transform to extract key statistical features and enhance spatial representation ability, while the latter uses the fast Fourier transform to separate the amplitude and phase spectra and suppresses domain-specific amplitude information through an attention mechanism to extract domain-invariant features. To cope with complex background interference, a parallel dual attention module is designed to focus on foreground regions. To address the challenges posed by drastic scale changes, a multi-scale feature aggregation module is proposed to fuse decoder features at different scales, thereby enhancing model robustness. Cross-domain simulation experiments are conducted on four public crowd counting datasets, and the results show that the proposed model achieves the lowest counting error in most challenging scenarios, outperforming current mainstream models and providing effective support for robust, high-precision crowd dynamic simulation.

Key words: crowd counting, frequency-domain modeling, domain generalization, simulation input modeling, smart city
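
As a rough illustration of the amplitude/phase decomposition mentioned in the abstract, the pure-Python sketch below transforms a small feature map, splits the spectrum into amplitude and phase, rescales the amplitude, and transforms back. It is not the authors' implementation: the function names (`dft2`, `split_spectrum`, `merge_spectrum`, `gate_amplitude`) and the example values are hypothetical, the fixed `gate` weights merely stand in for the learned attention, and a naive DFT is used in place of an FFT (both compute the same transform, the FFT just does so faster).

```python
import cmath
import math


def dft2(grid, inverse=False):
    """Naive 2-D discrete Fourier transform of an H x W grid (list of lists)."""
    h, w = len(grid), len(grid[0])
    sign = 1 if inverse else -1
    out = []
    for u in range(h):
        row = []
        for v in range(w):
            total = 0j
            for m in range(h):
                for n in range(w):
                    angle = sign * 2 * math.pi * (u * m / h + v * n / w)
                    total += grid[m][n] * cmath.exp(1j * angle)
            # The inverse transform carries the 1 / (H * W) normalisation.
            row.append(total / (h * w) if inverse else total)
        out.append(row)
    return out


def split_spectrum(spectrum):
    """Separate a complex spectrum into amplitude and phase grids."""
    amplitude = [[abs(c) for c in row] for row in spectrum]
    phase = [[cmath.phase(c) for c in row] for row in spectrum]
    return amplitude, phase


def merge_spectrum(amplitude, phase):
    """Rebuild the complex spectrum from amplitude and phase grids."""
    return [[cmath.rect(a, p) for a, p in zip(arow, prow)]
            for arow, prow in zip(amplitude, phase)]


def gate_amplitude(feature, gate):
    """Scale each amplitude bin by gate[u][v], keep the phase, and invert.

    Assumption: `gate` is a fixed grid of weights in [0, 1]; in a trained
    model these weights would come from an attention module instead.
    """
    amplitude, phase = split_spectrum(dft2(feature))
    gated = [[a * g for a, g in zip(arow, grow)]
             for arow, grow in zip(amplitude, gate)]
    restored = dft2(merge_spectrum(gated, phase), inverse=True)
    # Taking the real part assumes the gate preserves conjugate symmetry,
    # which holds for the uniform gates used below.
    return [[c.real for c in row] for row in restored]


# Hypothetical 4 x 4 feature map and two uniform gates.
feature = [[1.0, 2.0, 0.0, 1.0],
           [0.0, 3.0, 1.0, 2.0],
           [2.0, 1.0, 4.0, 0.0],
           [1.0, 0.0, 2.0, 3.0]]
ones = [[1.0] * 4 for _ in range(4)]
half = [[0.5] * 4 for _ in range(4)]

identity = gate_amplitude(feature, ones)   # amplitude left untouched
halved = gate_amplitude(feature, half)     # every amplitude bin halved
```

With a gate of all ones the feature map is recovered unchanged (up to floating-point error), and a uniform gate of 0.5 halves every value, because scaling all amplitudes by a constant scales the signal by the same constant. A non-uniform gate would instead reshape the amplitude spectrum while leaving the phase, and hence most of the spatial layout, in place.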

CLC Number: