{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T18:45:21Z","timestamp":1772131521304,"version":"3.50.1"},"reference-count":26,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2020,11,18]],"date-time":"2020-11-18T00:00:00Z","timestamp":1605657600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61573213"],"award-info":[{"award-number":["61573213"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61803227"],"award-info":[{"award-number":["61803227"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>In view of the difficulty of applying optical flow based human action recognition due to its large computational cost, a human action recognition algorithm, the I3D-shufflenet model, is proposed, combining the advantages of the I3D neural network and the lightweight shufflenet model. The 5 \u00d7 5 convolution kernel of I3D is replaced by two 3 \u00d7 3 convolution kernels, which reduces the amount of calculation. The shuffle layer is adopted to achieve feature exchange. The recognition and classification of human actions are performed based on the trained I3D-shufflenet model. The experimental results show that the shuffle layer improves the composition of features in each channel, which promotes the utilization of useful information.
The Histogram of Oriented Gradients (HOG) spatial-temporal features of the object are extracted for training, which significantly improves the expression of human actions and reduces the computational cost of feature extraction. The I3D-shufflenet model is tested on the UCF101 dataset and compared with other models. The final results show that I3D-shufflenet achieves higher accuracy than the original I3D, reaching 96.4%.<\/jats:p>","DOI":"10.3390\/a13110301","type":"journal-article","created":{"date-parts":[[2020,11,18]],"date-time":"2020-11-18T07:41:00Z","timestamp":1605685260000},"page":"301","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":24,"title":["I3D-Shufflenet Based Human Action Recognition"],"prefix":"10.3390","volume":"13","author":[{"given":"Guocheng","family":"Liu","sequence":"first","affiliation":[{"name":"School of Mechanical, Electrical &amp; Information Engineering, Shandong University, Weihai 264209, China"}]},{"given":"Caixia","family":"Zhang","sequence":"additional","affiliation":[{"name":"Mechanical &amp; Electrical Engineering Department, Weihai Vocational College, Weihai 264210, China"}]},{"given":"Qingyang","family":"Xu","sequence":"additional","affiliation":[{"name":"School of Mechanical, Electrical &amp; Information Engineering, Shandong University, Weihai 264209, China"}]},{"given":"Ruoshi","family":"Cheng","sequence":"additional","affiliation":[{"name":"School of Mechanical, Electrical &amp; Information Engineering, Shandong University, Weihai 264209, China"}]},{"given":"Yong","family":"Song","sequence":"additional","affiliation":[{"name":"School of Mechanical, Electrical &amp; Information Engineering, Shandong University, Weihai 264209, China"}]},{"given":"Xianfeng","family":"Yuan","sequence":"additional","affiliation":[{"name":"School of Mechanical, Electrical &amp; Information Engineering, Shandong University, Weihai 264209, 
China"}]},{"given":"Jie","family":"Sun","sequence":"additional","affiliation":[{"name":"School of Mechanical, Electrical &amp; Information Engineering, Shandong University, Weihai 264209, China"}]}],"member":"1968","published-online":{"date-parts":[[2020,11,18]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1038\/scientificamerican0675-76","article-title":"Visual motion perception","volume":"232","author":"Johansson","year":"1975","journal-title":"Sci. Am."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"953","DOI":"10.1016\/j.procs.2018.04.095","article-title":"Recognition of basketball referee signals from videos using Histogram of Oriented Gradients (HOG) and Support Vector Machine (SVM)","volume":"130","author":"Raudonis","year":"2018","journal-title":"Procedia Comput. Sci."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1109\/TCSVT.2014.2358029","article-title":"Crowded Scene Analysis: A Survey","volume":"25","author":"Li","year":"2015","journal-title":"IEEE Trans. Circ. Syst. Vid."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Wang, H., Klaser, A., Schmid, C., and Liu, C. (2011, January 20\u201325). Action Recognition by Dense Trajectories. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPR.2011.5995407"},{"key":"ref_5","unstructured":"Kipf, T.N., and Welling, M. (2016). Semi-Supervised Classification with Graph Convolutional Networks. arXiv."},{"key":"ref_6","unstructured":"Gori, M., Monfardini, G., and Scarselli, F. (August, January 31). A New Model for Learning in Graph Domains. 
Proceedings of the 2005 IEEE International Joint Conference on Neural Networks, Montreal, QC, Canada."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1109\/TNN.2008.2005605","article-title":"The Graph Neural Network Model","volume":"20","author":"Scarselli","year":"2009","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1109\/TPAMI.2012.59","article-title":"3D Convolutional Neural Networks for Human Action Recognition","volume":"35","author":"Ji","year":"2013","journal-title":"IEEE Trans. Pattern Anal."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_10","unstructured":"Simonyan, K., and Zisserman, A. (2014, January 8\u201313). Two-Stream Convolutional Networks for Action Recognition in Videos. Proceedings of the 27th International Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Feichtenhofer, C., Pinz, A., and Wildes, R.P.B.I. (2017, January 21\u201326). Spatiotemporal Multiplier Networks for Video Action Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.787"},{"key":"ref_12","unstructured":"Xie, S., Sun, C., Huang, J., Tu, Z., and Murphy, K. (November, January 29). Rethinking spatiotemporal feature learning: Speed-accuracy trade-offs in video classification. Proceedings of the 15th European Conference. Proceedings: Lecture Notes in Computer Science (LNCS 11219), Tokyo, Japan."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Chen, Y., Kalantidis, Y., Li, J., Yan, S., and Feng, J. (2018). 
Multi-Fiber Networks for Video Recognition, Springer. Lecture Notes in Computer Science.","DOI":"10.1007\/978-3-030-01246-5_22"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"2799","DOI":"10.1109\/TPAMI.2017.2769085","article-title":"Action Recognition with Dynamic Image Networks","volume":"40","author":"Bilen","year":"2018","journal-title":"IEEE T Pattern Anal."},{"key":"ref_15","unstructured":"Zhu, L., Tran, D., Sevilla-Lara, L., Yang, Y., Feiszli, M., and Wang, H. (2020, January 7\u201312). FASTER Recurrent Networks for Efficient Video Classification. Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), New York, NY, USA."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Tran, D., Bourdev, L., Fergus, R., Torresani, L., and Paluri, M. (2015, January 7\u201313). Learning Spatiotemporal Features with 3D Convolutional Networks. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.510"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Zhang, X., Zhou, X., Lin, M., and Sun, J. (2018, January 18\u201322). ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00716"},{"key":"ref_18","unstructured":"Soomro, K., Zamir, A.R., and Shah, M. (2012). UCF101: A Dataset of 101 Human Actions Classes from Videos in the Wild. arXiv."},{"key":"ref_19","first-page":"1113913","article-title":"Support Vector Machine and Convolutional Neural Network Based Approaches for Defect Detection in Fused Filament Fabrication","volume":"11139","author":"Narayanan","year":"2019","journal-title":"Int. Soc. Opt. 
Photonic"},{"key":"ref_20","first-page":"111390W","article-title":"Performance Analysis of Machine Learning and Deep Learning Architectures for Malaria Detection on Cell Images","volume":"11139","author":"Narayanan","year":"2019","journal-title":"Int. Soc. Opt. Photonic"},{"key":"ref_21","unstructured":"Narayanan, B.N., De Silva, M.S., Hardie, R.C., Kueterman, N.K., and Ali, R. (2019). Understanding Deep Neural Network Predictions for Medical Imaging Applications. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Feichtenhofer, C., Pinz, A., and Zisserman, A. (2016, January 27\u201330). Convolutional Two-Stream Network Fusion for Video Action Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.213"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Wang, L., Li, W., and Van Gool, L. (2018, January 18\u201322). Appearance-and-Relation Networks for Video Classification. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00155"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"8599","DOI":"10.1007\/s11042-018-6396-4","article-title":"I3D: A new dataset for testing denoising and demosaicing algorithms","volume":"79","author":"Bonanomi","year":"2018","journal-title":"Multimed. Tools Appl."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Wang, L., Xiong, Y., Wang, Z., Qiao, Y., Lin, D., Tang, X., and van Gool, L. (2016, January 8\u201316). Temporal segment networks: Towards good practices for deep action recognition. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46484-8_2"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Kalfaoglu, M.E., Alkan, S., and Alatan, A.A. (2020). Late Temporal Modeling in 3D CNN Architectures with BERT for Action Recognition. 
arXiv.","DOI":"10.1007\/978-3-030-68238-5_48"}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/13\/11\/301\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:33:36Z","timestamp":1760178816000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/13\/11\/301"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,11,18]]},"references-count":26,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2020,11]]}},"alternative-id":["a13110301"],"URL":"https:\/\/doi.org\/10.3390\/a13110301","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,11,18]]}}}