Journal of System Simulation ›› 2024, Vol. 36 ›› Issue (3): 555-563. doi: 10.16182/j.issn1004731x.joss.22-1234


  • First author: Su Benyue (1971-), male, professor, Ph.D.; research interests: machine learning and pattern recognition, graphics and image processing. E-mail: subenyue@sohu.com
  • Funding: Anhui Province Leading Talent Team Project (皖教秘人[2019]16号); Joint Graduate Research Innovation Fund of Anqing Normal University and Tongling University (tlaqsflhy2)

Human Action Recognition Based on Skeleton Edge Information Under Projection Subspace

Su Benyue1,2(), Zhang Peng1,2, Zhu Bangguo1,2, Guo Mengjuan1,2, Sheng Min3   

  1. School of Computer and Information, Anqing Normal University, Anqing 246133, China
    2. School of Mathematics and Computer, Tongling University, Tongling 244061, China
    3. School of Mathematics and Physics, Anqing Normal University, Anqing 246133, China
  • Received:2022-10-17 Revised:2023-01-31 Online:2024-03-15 Published:2024-03-14


Abstract:

In recent years, human action recognition based on skeleton data has attracted extensive attention in computer vision and human-computer interaction. Most existing methods model skeleton points in the original 3D coordinate space. However, skeleton points ignore the physical chain structure of the human body, which makes it difficult to capture the local correlations of human motion. In addition, owing to the diversity of camera views, it is hard to explore a comprehensive representation of actions across different views in the original point-based 3D space. In view of this, this paper proposes an action recognition method based on skeleton edge information in projection subspaces. The method defines skeleton edge information that incorporates the body's own connectivity to capture the spatial characteristics of an action; on top of the edge information, it introduces the direction and magnitude of skeleton edge motion to capture the temporal characteristics; it represents actions under different subspace views via 2D projection subspaces; and it explores a suitable feature fusion strategy, with the above features extracted comprehensively through an improved CNN framework. Experimental results on two challenging large-scale datasets, NTU-RGB+D 60 (cross-subject and cross-view protocols) and NTU-RGB+D 120 (cross-subject and cross-set protocols), show that, compared with the baseline method, the proposed method improves accuracy under the four protocols by 3.2%, 2.4%, 3.1%, and 5.8%, respectively.

Key words: skeleton data, skeleton edges, edge direction, edge magnitude, projection subspace
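As a rough illustration of the pipeline summarized in the abstract (not the authors' code), the sketch below shows with NumPy how skeleton edges, edge-motion direction and magnitude, and 2D projection subspaces might be computed; the toy skeleton, array shapes, and all names are hypothetical:

```python
import numpy as np

# Toy skeleton: 4 joints linked in a chain by 3 (parent, child) edges.
EDGES = [(0, 1), (1, 2), (2, 3)]

def edge_features(joints):
    """joints: (T, J, 3) array of 3D joint positions over T frames."""
    # Spatial feature: edge vectors = child joint minus parent joint, per frame.
    edges = np.stack([joints[:, c] - joints[:, p] for p, c in EDGES], axis=1)  # (T, E, 3)
    # Temporal feature: frame-to-frame edge motion, split into magnitude and direction.
    motion = np.diff(edges, axis=0)                         # (T-1, E, 3)
    magnitude = np.linalg.norm(motion, axis=-1)             # (T-1, E)
    direction = motion / (magnitude[..., None] + 1e-8)      # unit motion vectors
    # 2D projection subspaces: drop one coordinate axis at a time (xy, xz, yz).
    projections = (edges[..., :2], edges[..., [0, 2]], edges[..., 1:])
    return edges, magnitude, direction, projections

T, J = 5, 4
joints = np.random.rand(T, J, 3)
edges, mag, dirn, projs = edge_features(joints)
```

In the paper, features like these are fused and fed to an improved CNN; the fusion strategy and network are beyond the scope of this sketch.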

CLC number: