Journal of Image and Signal Processing 2023

A Skeleton-Based Action Recognition with Spatiotemporal Attention Depth Enhance Differential Graph Convolution

DOI: 10.12677/JISP.2023.122019, PP. 188-199

姬仲轩, 周美丽, 白宗文

Keywords: Action Recognition, Depth-Wise Convolution, Spatiotemporal Feature, Spatiotemporal Attention


Abstract:

Spatiotemporal convolutional neural networks are one of the mainstream approaches to action recognition, but traditional spatiotemporal graph convolutional networks suffer from data redundancy in spatial feature aggregation and from insufficient temporal feature extraction. To address these problems, this paper proposes a Spatiotemporal Attention Depth Enhance Difference Graph Convolution Network (ST-DEdGCN). First, in the spatial domain, a Depth Enhance Difference Graph Convolution (DEdGC) dynamically learns joint topology and joint gradient information in different channels, so that joint features across channels are aggregated effectively. Second, a Spatiotemporal Attention Temporal Convolution Network models global temporal information along the time dimension to obtain efficient sequence features. Finally, the model is evaluated on two public skeleton action datasets, NTU RGB+D 60 and NTU RGB+D 120. The results indicate that ST-DEdGCN outperforms current mainstream methods in aggregating spatial features and in extracting spatiotemporal information, and that it offers a new technical approach for action recognition and related research.
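
The abstract does not state the DEdGC formulation, so the following is only a minimal NumPy sketch of one plausible reading, not the authors' implementation. It combines a per-channel joint topology with a difference (gradient-like) term between each joint and its neighbours. The function name `difference_graph_conv`, the mixing factor `theta`, the row-normalized random adjacency, and the tensor layout (channels, frames, joints) are all assumptions made for illustration.

```python
import numpy as np


def difference_graph_conv(x, adj, weight, theta=0.5):
    """Hypothetical channel-wise difference graph convolution.

    x:      (C, T, V) joint features: channels, frames, joints
    adj:    (C, V, V) one adjacency matrix per channel (assumed learnable)
    weight: (C_out, C) pointwise channel-mixing matrix
    theta:  assumed trade-off between plain aggregation (0) and
            pure neighbour-minus-centre differences (1)
    """
    # Plain aggregation: agg[c, t, v] = sum_u adj[c, v, u] * x[c, t, u]
    agg = np.einsum("cvu,ctu->ctv", adj, x)
    # Centre term: sum_u adj[c, v, u] * x[c, t, v]
    center = adj.sum(axis=2)[:, None, :] * x
    # agg - theta * center equals
    # (1 - theta) * agg + theta * sum_u adj[c, v, u] * (x_u - x_v)
    mixed = agg - theta * center
    # Mix channels with a pointwise (1x1) projection
    return np.einsum("oc,ctv->otv", weight, mixed)


rng = np.random.default_rng(0)
C, T, V, C_out = 4, 8, 25, 6          # 25 joints, as in NTU RGB+D skeletons
x = rng.standard_normal((C, T, V))
adj = rng.random((C, V, V))
adj /= adj.sum(axis=2, keepdims=True)  # row-normalize each channel's graph
weight = rng.standard_normal((C_out, C))

out = difference_graph_conv(x, adj, weight)
print(out.shape)  # (6, 8, 25)
```

With `theta=1.0` and a row-normalized adjacency, an input that is identical across all joints yields zero output, which is the expected behaviour of a pure difference operator.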

