{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T08:15:32Z","timestamp":1773908132439,"version":"3.50.1"},"reference-count":19,"publisher":"World Scientific Pub Co Pte Ltd","issue":"08","funder":[{"DOI":"10.13039\/501100006378","name":"Universitas Indonesia","doi-asserted-by":"publisher","award":["NKB-0209\/UN2.R3.1\/HKP.05.00\/2019"],"award-info":[{"award-number":["NKB-0209\/UN2.R3.1\/HKP.05.00\/2019"]}],"id":[{"id":"10.13039\/501100006378","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Patt. Recogn. Artif. Intell."],"published-print":{"date-parts":[[2022,6,30]]},"abstract":"<jats:p> In this paper, we propose an end-to-end multi-resolution three-dimensional (3D) capsule network for detecting actions of multiple actors in a video scene. Unlike previous capsule network-based action recognition, which does not specifically address the individual actions of multiple actors in a single scene, our 3D capsule network takes advantage of a multi-resolution technique to detect different actions of multiple actors that have different sizes, scales, and aspect ratios. Our 3D capsule network is built on top of a 3D convolutional neural network (3DCNN) that extracts spatio-temporal features from video frames inside regions of interest generated by Faster RCNN object detection. We first apply our method to the problem of detecting illegal cheating activities in a classroom examination scene with multiple subjects involved. Second, we test our system on the publicly available and extensively studied UCF-101 dataset. We compare our method with several state-of-the-art 3DCNN-based methods: the multi-resolution 3DCNN, the single-resolution 3D capsule network, and a combination of both these models. 
We show that models containing 3D capsule networks have a slight advantage over the conventional 3DCNN and multi-resolution 3DCNN. Our 3D capsule networks not only classify these actions but also generate videos of single actions. Our experimental results show that the use of multi-resolution pathways in the 3D capsule networks further improves the results. These findings also hold when we use pre-trained C3D (convolutional 3D) features to train these networks. We believe that the multiple resolutions capture lower-level features at different scales, while the 3D capsule layers combine these features in more complex ways than conventional convolutional models. <\/jats:p>","DOI":"10.1142\/s0218001422550151","type":"journal-article","created":{"date-parts":[[2022,5,15]],"date-time":"2022-05-15T03:49:20Z","timestamp":1652586560000},"source":"Crossref","is-referenced-by-count":2,"title":["End-to-End Multi-Resolution 3D Capsule Network for People Action Detection"],"prefix":"10.1142","volume":"36","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6050-664X","authenticated-orcid":false,"given":"Mohamad Ivan","family":"Fanany","sequence":"first","affiliation":[{"name":"Machine Learning and Computer Vision Laboratory, Faculty of Computer Science, Universitas Indonesia Jl. Margonda Raya, Pondok Cina, Kecamatan Beji, Kota Depok, Jawa Barat 16424, Indonesia"}]},{"given":"Ahmad","family":"Arinaldi","sequence":"additional","affiliation":[{"name":"Machine Learning and Computer Vision Laboratory, Faculty of Computer Science, Universitas Indonesia Jl. Margonda Raya, Pondok Cina, Kecamatan Beji, Kota Depok, Jawa Barat 16424, Indonesia"}]}],"member":"219","published-online":{"date-parts":[[2022,5,13]]},"reference":[{"key":"S0218001422550151BIB003","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2003.11.001"},{"key":"S0218001422550151BIB005","volume-title":"Int. Joint Conf. Artificial Intelligence","volume":"2","author":"Hinton G. 
E.","year":"1981"},{"key":"S0218001422550151BIB006","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-21735-7_6"},{"key":"S0218001422550151BIB007","volume-title":"Int. Conf. Learning Representations","author":"Hinton G. E.","year":"2018"},{"key":"S0218001422550151BIB008","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2012.59"},{"key":"S0218001422550151BIB009","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2016.10.004"},{"key":"S0218001422550151BIB010","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.223"},{"key":"S0218001422550151BIB011","first-page":"1097","volume-title":"Advances in Neural Information Processing Systems","author":"Krizhevsky A.","year":"2012"},{"key":"S0218001422550151BIB012","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"S0218001422550151BIB014","volume":"34","author":"Patrick K.","year":"2022","journal-title":"J. King Saud Univ. - Comput. Inf. Sci."},{"key":"S0218001422550151BIB015","volume-title":"Types of Eye Movements and Their Functions - Neuroscience","author":"Purves D.","year":"2004","edition":"3"},{"key":"S0218001422550151BIB017","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2577031"},{"key":"S0218001422550151BIB018","first-page":"3859","volume-title":"Advances in Neural Information Processing Systems","author":"Sabour S.","year":"2017"},{"key":"S0218001422550151BIB019","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2012.12.013"},{"key":"S0218001422550151BIB020","first-page":"568","volume-title":"Advances in Neural Information Processing Systems","author":"Simonyan K.","year":"2014"},{"key":"S0218001422550151BIB022","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.510"},{"key":"S0218001422550151BIB023","first-page":"613","volume-title":"Advances in Neural Information Processing Systems","author":"Vondrick 
C.","year":"2016"},{"key":"S0218001422550151BIB024","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2017.04.004"},{"key":"S0218001422550151BIB025","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46484-8_2"}],"container-title":["International Journal of Pattern Recognition and Artificial Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0218001422550151","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,7,8]],"date-time":"2022-07-08T03:54:05Z","timestamp":1657252445000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S0218001422550151"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,13]]},"references-count":19,"journal-issue":{"issue":"08","published-print":{"date-parts":[[2022,6,30]]}},"alternative-id":["10.1142\/S0218001422550151"],"URL":"https:\/\/doi.org\/10.1142\/s0218001422550151","relation":{},"ISSN":["0218-0014","1793-6381"],"issn-type":[{"value":"0218-0014","type":"print"},{"value":"1793-6381","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,5,13]]},"article-number":"2255015"}}