Journal of System Simulation ›› 2025, Vol. 37 ›› Issue (9): 2409-2419. doi: 10.16182/j.issn1004731x.joss.24-0362

• Papers

A Model Combining Self-attention and Weight Sharing for Human Activity Recognition

Ma Lun1, Yang Yue1, Wang Daihe1, Liao Guisheng2, Li Xing1

  1. School of Information Engineering, Chang'an University, Xi'an 710064, China
  2. National Lab of Radar Signal Processing, Xidian University, Xi'an 710071, China
  • Received: 2024-04-08 Revised: 2024-08-04 Online: 2025-09-18 Published: 2025-09-22
  • Contact: Yang Yue
  • About the first author: Ma Lun (1981-), male, associate professor, Ph.D.; research interests: signal processing and artificial intelligence.
  • Funding:
    China Postdoctoral Science Foundation (2015M582586); Fundamental Research Funds for the Central Universities, Chang'an University (Student Innovation and Practice Ability Improvement Sub-program)

Abstract:

With the spread of wearable devices, human activity recognition based on wearable sensor data has attracted wide attention. The central problem in this field is how to extract effective activity information from raw sensor data and assemble it into feature vectors. Convolutional and recurrent neural networks are widely used for feature extraction from multi-sensor data, but these networks struggle to attend, from a global perspective, to the important features of human activity along the time dimension. To address this, and taking into account the logical correlations among sensors placed on different parts of the body, a multi-branch human activity recognition model based on self-attention and weight sharing, Multi-CNN-BiLSTM-self attention (Multi-CBSA), is proposed. The model uses sub-networks with a uniform architecture and shared weights to extract features from the activity data of different body parts, which simplifies the model structure and reduces the number of training parameters. Within each sub-network, a one-dimensional convolution first converts the raw activity data into a short sequence of high-level features; a bidirectional long short-term memory network then captures the forward and backward temporal features of that short sequence; and self-attention finally assigns dynamic weights to the extracted activity features to obtain representative key features. The outputs of the sub-networks are combined in a fusion layer. Ablation experiments show that, after self-attention is introduced, Multi-CBSA improves in convergence speed, validation-set loss, and single-class activity recognition accuracy. Comparative experiments show that Multi-CBSA achieves recognition accuracies of 99.3% and 96.4% on the MHEALTH and PAMAP2 datasets, respectively, with fewer training parameters. Compared with well-performing models from recent years, recognition accuracy increases by up to 4.2% and 4.4%, respectively.

Key words: human activity recognition, wearable sensor, feature extraction, self-attention, weight sharing
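
The branch structure described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation and makes several simplifying assumptions: each body-part window is a single-channel signal, the BiLSTM stage is omitted, self-attention uses the sequence itself as queries, keys, and values (no learned projections), and the fusion layer is taken to be mean pooling followed by concatenation. All function names (`conv1d`, `self_attention`, `branch`, `multi_branch`) are hypothetical. The point it shows is that every branch reuses the same filters, so the filter parameter count does not grow with the number of sensor sites.

```python
import math
import random


def conv1d(signal, kernel, stride):
    """Valid 1-D convolution (cross-correlation) of one channel, followed by ReLU."""
    k = len(kernel)
    return [
        max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)))
        for i in range(0, len(signal) - k + 1, stride)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention with Q = K = V = seq (assumption: no projections)."""
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, seq)) for c in range(d)])
    return out


def branch(window, kernels, stride=2):
    """One sub-network: shared conv filters -> self-attention -> mean pooling over time."""
    channels = [conv1d(window, k, stride) for k in kernels]  # one feature map per filter
    seq = [list(step) for step in zip(*channels)]            # time-major short sequence
    attended = self_attention(seq)
    steps = len(attended)
    return [sum(step[c] for step in attended) / steps for c in range(len(kernels))]


def multi_branch(windows, kernels):
    """Apply the SAME kernels to every body-part window, then fuse by concatenation."""
    fused = []
    for w in windows:
        fused.extend(branch(w, kernels))
    return fused


random.seed(0)
# Hypothetical sizes: 4 shared filters of width 3, and 3 sensor sites with 16 samples each.
kernels = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
windows = [[random.gauss(0, 1) for _ in range(16)] for _ in range(3)]

features = multi_branch(windows, kernels)
shared_params = sum(len(k) for k in kernels)    # filters stored once for all branches
unshared_params = shared_params * len(windows)  # cost if each branch owned its filters
```

In this toy setting the shared filters hold one third of the parameters that three independent branches would need, which is the kind of saving the abstract attributes to weight sharing; the actual savings in Multi-CBSA depend on layer sizes that the abstract does not state.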

CLC number: