Multi-view Human Action Recognition Based on Deep Neural Network

Journal of System Simulation, 2021, Vol. 33, Issue (5): 1019-1030. doi: 10.16182/j.issn1004731x.joss.19-0448

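Zhao Ying^1,2, Lu Yao^1, Zhang Jian^3, Liang Qidi^3, Long Wei^1

1. Beijing Laboratory of Intelligent Information Technology, Beijing Institute of Technology, Beijing 100081, China;
2. Teachers College, Beijing Union University, Beijing 100011, China;
3. School of Computer Science and Engineering, Central South University, Changsha 410083, China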

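Received: 2019-08-26  Revised: 2019-10-08  Online: 2021-05-18  Published: 2021-06-09
Corresponding author: Lu Yao (born 1958), male, Ph.D., professor; research interests: neural networks, image and signal processing, pattern recognition. E-mail: vis_yl@bit.edu.cn
About the first author: Zhao Ying (born 1977), female, Ph.D., associate professor; research interests: human behavior analysis, machine learning, smart education. E-mail: sftzhaoying@buu.edu.cn
Supported by:
National Natural Science Foundation of China (61273273); National Key R&D Program of China (2017YFC0112001)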

Abstract

Abstract: To improve the accuracy of multi-view human action recognition (MVHAR), a new deep neural network model, CNN+CA (Convolutional Neural Network plus Context Attention), and a recognition method based on sequence matching are proposed. A convolutional neural network (CNN) automatically learns multi-view fusion features; a context attention (CA) module then highlights the regions of those features that are useful for recognition, further improving their discriminative power; finally, human actions are recognized by sequence matching. Experimental results on the IXMAS and i3DPost datasets show that the proposed method achieves higher recognition accuracy than the state-of-the-art MVHAR methods used for comparison.

Keywords: multi-view, human action recognition, convolutional neural network, context attention, sequence matching
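
The abstract names sequence matching as the recognition step but does not say which matching criterion is used. As a rough illustration only, the sketch below assumes that each video has already been turned into a sequence of per-frame feature vectors, and that a query sequence is labeled by its closest reference sequence under dynamic time warping (DTW) with Euclidean frame distance. DTW, the function names, and the data layout are all assumptions made for this example, not the authors' published algorithm.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two per-frame feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """DTW cost between two sequences of feature vectors.

    Assumption: DTW stands in for the unspecified matching criterion.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat frame of seq_b
                                 cost[i][j - 1],      # repeat frame of seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def classify_by_matching(query, references):
    """Return the label of the reference sequence nearest to the query.

    references: list of (label, sequence) pairs (hypothetical layout).
    """
    best_label, _ = min(references, key=lambda item: dtw_distance(query, item[1]))
    return best_label
```

With toy two-dimensional features, a query that repeats one frame of a reference still matches that reference at zero cost, because DTW tolerates differences in execution speed. In a setting like the one the abstract describes, the feature vectors would instead come from the attention-weighted CNN output.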

CLC number: TP391

Cite this article: Zhao Ying, Lu Yao, Zhang Jian, et al. Multi-view Human Action Recognition Based on Deep Neural Network[J]. Journal of System Simulation, 2021, 33(5): 1019-1030.
