{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,14]],"date-time":"2025-05-14T12:00:49Z","timestamp":1747224049300,"version":"3.40.5"},"reference-count":27,"publisher":"IGI Global","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,4,1]]},"abstract":"<p>In this work, the authors propose several techniques for accelerating a modern action recognition pipeline. The article reviews several recent and popular action recognition works and selects two of them as tools for achieving this acceleration: temporal segment networks (TSN), a convolutional neural network (CNN) framework that uses a small number of video frames to obtain robust predictions and that won first place in the 2016 ActivityNet challenge, and MotionNet, a convolutional-transposed CNN capable of inferring optical flow from RGB frames. Together with the latter, the article integrates new software for decoding videos that takes advantage of NVIDIA GPUs. 
This article shows a proof of concept for this approach by training the RGB stream of the TSN network on videos of a subset of daily actions from the University of Central Florida 101 dataset, loaded with NVIDIA Video Loader (NVVL).<\/p>","DOI":"10.4018\/ijcvip.2019040102","type":"journal-article","created":{"date-parts":[[2019,3,27]],"date-time":"2019-03-27T18:33:04Z","timestamp":1553711584000},"page":"16-31","source":"Crossref","is-referenced-by-count":0,"title":["Accelerating Deep Action Recognition Networks for Real-Time Applications"],"prefix":"10.4018","volume":"9","author":[{"given":"David","family":"Ivorra-Piqueres","sequence":"first","affiliation":[{"name":"University of Alicante, Alicante, Spain"}]},{"given":"John Alejandro Castro","family":"Vargas","sequence":"additional","affiliation":[{"name":"University of Alicante, Alicante, Spain"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6037-9815","authenticated-orcid":true,"given":"Pablo","family":"Martinez-Gonzalez","sequence":"additional","affiliation":[{"name":"University of Alicante, Alicante, Spain"}]}],"member":"2432","reference":[{"key":"IJCVIP.2019040102-0","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2868668"},{"key":"IJCVIP.2019040102-1","doi-asserted-by":"publisher","DOI":"10.1109\/TAC.1959.1104847"},{"key":"IJCVIP.2019040102-2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.675"},{"key":"IJCVIP.2019040102-3","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.1986.4767851"},{"key":"IJCVIP.2019040102-4","doi-asserted-by":"crossref","first-page":"4724","DOI":"10.1109\/CVPR.2017.502","article-title":"Quo vadis, action recognition? a new model and the kinetics dataset.","author":"J.Carreira","year":"2017","journal-title":"2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)"},{"key":"IJCVIP.2019040102-5","unstructured":"Casper, J., Barker, J., & Catanzaro, B. (2018). 
NVVL: NVIDIA Video Loader."},{"key":"IJCVIP.2019040102-6","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.368"},{"key":"IJCVIP.2019040102-7","article-title":"Predictive corrective networks for action detection.","author":"A.Dave","year":"2017","journal-title":"Proceedings of the Computer Vision and Pattern Recognition"},{"key":"IJCVIP.2019040102-8","unstructured":"Duke, B. (2018). Lintel: Python video decoding."},{"key":"IJCVIP.2019040102-9","doi-asserted-by":"crossref","unstructured":"Efros, A. A., Berg, A. C., Mori, G., & Malik, J. (2003, October). Recognizing action at a distance. In null (p. 726). IEEE.","DOI":"10.1109\/ICCV.2003.1238420"},{"key":"IJCVIP.2019040102-10","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.787"},{"key":"IJCVIP.2019040102-11","doi-asserted-by":"publisher","DOI":"10.1006\/cviu.1998.0716"},{"key":"IJCVIP.2019040102-12","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(81)90024-2"},{"key":"IJCVIP.2019040102-13","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2012.59"},{"key":"IJCVIP.2019040102-14","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.390"},{"key":"IJCVIP.2019040102-15","doi-asserted-by":"crossref","first-page":"1003","DOI":"10.1109\/CVPR.2017.113","article-title":"Temporal convolutional networks for action segmentation and detection.","author":"C.Lea","year":"2017","journal-title":"2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)"},{"key":"IJCVIP.2019040102-16","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.616"},{"issue":"1","key":"IJCVIP.2019040102-17","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1109\/MASSP.1986.1165342","article-title":"An introduction to hidden Markov models.","volume":"3","author":"L. 
R.Rabiner","year":"1986","journal-title":"IEEE ASSP Magazine"},{"key":"IJCVIP.2019040102-18","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.399"},{"key":"IJCVIP.2019040102-19","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1109\/ICPR.2004.1334462","article-title":"Recognizing human actions: a local SVM approach.","volume":"Vol. 3","author":"C.Schuldt","year":"2004","journal-title":"Proceedings of the 17th International Conference on Pattern Recognition ICPR 2004"},{"key":"IJCVIP.2019040102-20","first-page":"8","volume":"Vol. 6","author":"G. A.Sigurdsson","year":"2017","journal-title":"Asynchronous Temporal Fields for Action Recognition"},{"key":"IJCVIP.2019040102-21","unstructured":"Simonyan, K., & Zisserman, A. (2014). Two-stream convolutional networks for action recognition in videos. In Advances in neural information processing systems (pp. 568-576)."},{"key":"IJCVIP.2019040102-22","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.393"},{"key":"IJCVIP.2019040102-23","unstructured":"Soomro, K., Zamir, A. R., & Shah, M. (2012). UCF101: A dataset of 101 human actions classes from videos in the wild. arXiv:1212.0402"},{"key":"IJCVIP.2019040102-24","article-title":"Temporal segment networks for action recognition in videos.","author":"L.Wang","year":"2018","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"IJCVIP.2019040102-25","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.326"},{"key":"IJCVIP.2019040102-26","unstructured":"Zhu, Y., Lan, Z., Newsam, S., & Hauptmann, A. G. (2017). Hidden two-stream convolutional networks for action recognition. 
arXiv:1704.00389"}],"container-title":["International Journal of Computer Vision and Image Processing"],"original-title":[],"language":"ng","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=226242","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,6]],"date-time":"2022-05-06T21:02:15Z","timestamp":1651870935000},"score":1,"resource":{"primary":{"URL":"https:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/IJCVIP.2019040102"}},"subtitle":[""],"short-title":[],"issued":{"date-parts":[[2019,4,1]]},"references-count":27,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2019,4]]}},"URL":"https:\/\/doi.org\/10.4018\/ijcvip.2019040102","relation":{},"ISSN":["2155-6997","2155-6989"],"issn-type":[{"type":"print","value":"2155-6997"},{"type":"electronic","value":"2155-6989"}],"subject":[],"published":{"date-parts":[[2019,4,1]]}}}