{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,31]],"date-time":"2025-12-31T14:06:31Z","timestamp":1767189991534,"version":"3.48.0"},"reference-count":57,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2025,12,29]],"date-time":"2025-12-29T00:00:00Z","timestamp":1766966400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>Real-time fine-grained human activity recognition (HAR) remains a challenging problem due to rapid spatial\u2013temporal variations, subtle motion differences, and dynamic environmental conditions. Addressing this difficulty, we propose NovAc-DL, a unified deep learning framework designed to accurately classify short human-like actions, specifically, \u201cpour\u201d and \u201cstir\u201d from sequential video data. The framework integrates adaptive time-distributed convolutional encoding with temporal reasoning modules to enable robust recognition under realistic robotic-interaction conditions. A balanced dataset of 2000 videos was curated and processed through a consistent spatiotemporal pipeline. Three architectures, LRCN, CNN-TD, and ConvLSTM, were systematically evaluated. CNN-TD achieved the best performance, reaching 98.68% accuracy with the lowest test loss (0.0236), outperforming the other models in convergence speed, generalization, and computational efficiency. Grad-CAM visualizations further confirm that NovAc-DL reliably attends to motion-salient regions relevant to pouring and stirring gestures. 
These results establish NovAc-DL as a high-precision real-time-capable solution for deployment in healthcare monitoring, industrial automation, and collaborative robotics.<\/jats:p>","DOI":"10.3390\/bdcc10010011","type":"journal-article","created":{"date-parts":[[2025,12,31]],"date-time":"2025-12-31T13:30:52Z","timestamp":1767187852000},"page":"11","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["NovAc-DL: Novel Activity Recognition Based on Deep Learning in the Real-Time Environment"],"prefix":"10.3390","volume":"10","author":[{"given":"Saksham","family":"Singla","sequence":"first","affiliation":[{"name":"Computer Science Engineering Department, Thapar Institute of Engineering Technology, Patiala 147004, Punjab, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sheral","family":"Singla","sequence":"additional","affiliation":[{"name":"Computer Science Engineering Department, Thapar Institute of Engineering Technology, Patiala 147004, Punjab, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Karan","family":"Singla","sequence":"additional","affiliation":[{"name":"Chemical Engineering Department, Visvesvaraya National Institute of Technology (VNIT), Nagpur 440010, Maharashtra, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Priya","family":"Kansal","sequence":"additional","affiliation":[{"name":"Computer Science Engineering Department, Thapar Institute of Engineering Technology, Patiala 147004, Punjab, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0748-4039","authenticated-orcid":false,"given":"Sachin","family":"Kansal","sequence":"additional","affiliation":[{"name":"Computer Science Engineering Department, Thapar Institute of Engineering Technology, Patiala 147004, Punjab, 
India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6298-4963","authenticated-orcid":false,"given":"Alka","family":"Bishnoi","sequence":"additional","affiliation":[{"name":"Department of Physical Therapy, College of Health Professions and Human Services, Kean University, Union, NJ 07083, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2499-6039","authenticated-orcid":false,"given":"Jyotindra","family":"Narayan","sequence":"additional","affiliation":[{"name":"Department of Mechanical Engineering, Indian Institute of Technology, Patna 801106, Bihar, India"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,12,29]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1527","DOI":"10.1162\/neco.2006.18.7.1527","article-title":"A fast learning algorithm for deep belief nets","volume":"18","author":"Hinton","year":"2006","journal-title":"Neural Comput."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., and Fei-Fei, L. (2014, January 23\u201328). Large-scale video classification with convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2014), Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.223"},{"key":"ref_4","unstructured":"Simonyan, K., and Zisserman, A. (2014, January 8\u201313). Two-stream convolutional networks for action recognition in videos. 
Proceedings of the Advances in Neural Information Processing Systems (NIPS 2014), Montreal, QC, Canada."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"101869","DOI":"10.1016\/j.inffus.2023.101869","article-title":"A review of deep learning techniques for speech processing","volume":"99","author":"Mehrish","year":"2023","journal-title":"Inf. Fusion"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Do, T.T.T., Huynh, Q.T., Kim, K., and Nguyen, V.Q. (2025). A Survey on Video Big Data Analytics: Architecture, Technologies, and Open Research Challenges. Appl. Sci., 15.","DOI":"10.3390\/app15148089"},{"key":"ref_7","unstructured":"Liu, X., Xiang, X., Li, Z., Wang, Y., Li, Z., Liu, Z., Zhang, W., Ye, W., and Zhang, J. (2024). A survey of ai-generated video evaluation. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"12029","DOI":"10.1007\/s00521-023-08337-y","article-title":"DL-DARE: Deep learning-based different activity recognition for the human\u2013robot interaction environment","volume":"35","author":"Kansal","year":"2023","journal-title":"Neural Comput. Appl."},{"key":"ref_9","first-page":"8273546","article-title":"Human Activity Recognition Based on a Modified Capsule Network","volume":"2023","author":"Zhu","year":"2023","journal-title":"Mob. Inf. Syst."},{"key":"ref_10","first-page":"259","article-title":"Deep learning-based human activity recognition using CNN, ConvLSTM, and LRCN","volume":"5","author":"Uddin","year":"2024","journal-title":"Int. J. Cogn. Comput. Eng."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Donahue, J., Hendricks, L.A., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., and Darrell, T. (2015, January 7\u201312). Long-term Recurrent Convolutional Networks for Visual Recognition and Description. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.21236\/ADA623249"},{"key":"ref_12","unstructured":"Ljajic, A. (2024). 
Deep Learning-Based Body Action Classification and Ergonomic Assessment. [Ph.D. Thesis, Technische Universit\u00e4t Wien]."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Lee, J.W., and Kang, H.S. (2024). Three-stage deep learning framework for video surveillance. Appl. Sci., 14.","DOI":"10.3390\/app14010408"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Yoshidomi, T., Kume, S., Aizawa, H., and Furui, A. (2024). Classification of Carotid Plaque with Jellyfish Sign Through Convolutional and Recurrent Neural Networks Utilizing Plaque Surface Edges. arXiv.","DOI":"10.1109\/EMBC53108.2024.10782813"},{"key":"ref_15","unstructured":"Arulalan, V. (2023). Deep Learning Based Methods for Improving Object Detection, Classification and Tracking in Video Surveillance. [Ph.D. Thesis, Anna University]."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Tabassum, I. (2024). A Hybrid Deep-Learning Approach for Multi-Class Cyberbullying Classification of Cyberbullying Using Social Medias\u2019 Multi-Modal Data. [Master\u2019s Thesis, University of South-Eastern Norway].","DOI":"10.20944\/preprints202411.0392.v1"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Das, A., Mistry, D., Kamal, R., Ganguly, S., and Chakraborty, S. (2025). Facemask and Hand Gloves Detection Using Hybrid Deep Learning Model. Smart Medical Imaging for Diagnosis and Treatment Planning, Chapman and Hall\/CRC.","DOI":"10.1201\/9781003464884-12"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Lei, J., Sun, W., Fang, Y., Ye, N., Yang, S., and Wu, J. (2024). A Model for Detecting Abnormal Elevator Passenger Behavior Based on Video Classification. Electronics, 13.","DOI":"10.3390\/electronics13132472"},{"key":"ref_19","first-page":"100461","article-title":"Human activity classification using deep learning based on 3D motion feature","volume":"12","author":"Rahayu","year":"2023","journal-title":"Mach. Learn. 
Appl."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"340","DOI":"10.1016\/j.future.2023.01.006","article-title":"Human activity recognition using marine predators algorithm with deep learning","volume":"142","author":"Helmi","year":"2023","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1016\/j.inffus.2023.01.015","article-title":"Multi-level feature fusion for multimodal human activity recognition in Internet of Healthcare Things","volume":"94","author":"Islam","year":"2023","journal-title":"Inf. Fusion"},{"key":"ref_22","first-page":"23","article-title":"Human Activity Recognition From Sensorised Patient\u2019s Data in Healthcare: A Streaming Deep Learning-Based Approach","volume":"8","author":"Hurtado","year":"2023","journal-title":"Int. J. Interact. Multimed. Artif. Intell."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"139489","DOI":"10.1109\/ACCESS.2021.3118541","article-title":"Video processing using deep learning techniques: A systematic literature review","volume":"9","author":"Sharma","year":"2021","journal-title":"IEEE Access"},{"key":"ref_24","first-page":"24513","article-title":"A novel keyframe extraction method for video classification using deep neural networks","volume":"35","author":"Gan","year":"2021","journal-title":"Neural Comput. Appl."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Naik, K.J., and Soni, A. (2021). Video classification using 3D convolutional neural network. Advancements in Security and Privacy Initiatives for Multimedia Images, IGI Global.","DOI":"10.4018\/978-1-7998-2795-5.ch001"},{"key":"ref_26","first-page":"7252896","article-title":"A sports training video classification model based on deep learning","volume":"2021","author":"Xu","year":"2021","journal-title":"Sci. Program."},{"key":"ref_27","unstructured":"Pentyala, S., Dowsley, R., and De Cock, M. (2021, January 18\u201324). 
Privacy-preserving video classification with convolutional neural networks. Proceedings of the International Conference on Machine Learning, PMLR, Virtual."},{"key":"ref_28","first-page":"20","article-title":"Deep Learning for Video Classification: A Review","volume":"1","author":"Rehman","year":"2021","journal-title":"TechRxiv"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"4","DOI":"10.21037\/jmai.2019.10.03","article-title":"Improving ultrasound video classification: An evaluation of novel deep learning methods in echocardiography","volume":"3","author":"Howard","year":"2020","journal-title":"J. Med. Artif. Intell."},{"key":"ref_30","first-page":"20","article-title":"Temporal Segment Networks: Towards Good Practices for Deep Action Recognition","volume":"Volume 9912","author":"Wang","year":"2016","journal-title":"Proceedings of the European Conference on Computer Vision"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1140611","DOI":"10.1155\/2021\/1140611","article-title":"Big data and deep learning-based video classification model for sports","volume":"2021","author":"Wang","year":"2021","journal-title":"Wirel. Commun. Mob. Comput."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"29267","DOI":"10.1007\/s11042-021-10889-x","article-title":"Prediction of diseased rice plant using video processing and LSTM-simple recurrent neural network with comparative study","volume":"80","author":"Verma","year":"2021","journal-title":"Multimed. Tools Appl."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Kwong, S., Xu, L., and Zhao, T. (2022). Advances in Deep-Learning-Based Sensing, Imaging, and Video Processing. Sensors, 22.","DOI":"10.3390\/s22166192"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Uchiyama, T., Sogi, N., Niinuma, K., and Fukui, K. (2023, January 2\u20137). Visually explaining 3D-CNN predictions for video classification with an adaptive occlusion sensitivity analysis. 
Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA.","DOI":"10.1109\/WACV56688.2023.00156"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Shea, D.E., Kulhare, S., Millin, R., Laverriere, Z., Mehanian, C., Delahunt, C.B., Banik, D., Zheng, X., Zhu, M., and Ji, Y. (2023, January 17\u201324). Deep Learning Video Classification of Lung Ultrasound Features Associated with Pneumonia. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPRW59228.2023.00312"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"3218431","DOI":"10.1155\/2022\/3218431","article-title":"Sports Video Classification Framework Using Enhanced Threshold-Based Keyframe Selection Algorithm and Customized CNN on UCF101 and Sports1-M Dataset","volume":"2022","author":"Ramesh","year":"2022","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3597434","article-title":"Deep unsupervised key frame extraction for efficient video classification","volume":"19","author":"Tang","year":"2023","journal-title":"ACM Trans. Multimed. Comput. Commun. Appl."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"107137","DOI":"10.1016\/j.cie.2021.107137","article-title":"A classification proposal of digital twin applications in the safety domain","volume":"154","author":"Agnusdei","year":"2021","journal-title":"Comput. Ind. Eng."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"325","DOI":"10.1016\/j.psep.2019.10.021","article-title":"Improving process safety: What roles for Digitalization and Industry 4.0?","volume":"132","author":"Lee","year":"2019","journal-title":"Process Saf. Environ. Prot."},{"key":"ref_40","unstructured":"Bakshi, R. (2021). Hand hygiene video classification based on deep learning. 
arXiv."},{"key":"ref_41","first-page":"19","article-title":"Adaptive and effective spatio-temporal modelling for offensive video classification using deep neural network","volume":"11","author":"Chelliah","year":"2023","journal-title":"Int. J. Intell. Eng. Inform."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1016\/j.biosystemseng.2020.08.016","article-title":"Video and image classification using atomisation spray image patterns and deep learning","volume":"200","author":"Li","year":"2020","journal-title":"Biosyst. Eng."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Sabzi, S., Pourdarbani, R., Kalantari, D., and Panagopoulos, T. (2020). Designing a fruit identification algorithm in orchard conditions to develop robots using video processing and majority voting based on hybrid artificial neural network. Appl. Sci., 10.","DOI":"10.3390\/app10010383"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Li, G.Y., Chen, L., Zahiri, M., Balaraju, N., Patil, S., Mehanian, C., Gregory, C., Gregory, K., Raju, B., and Kruecker, J. (2023, January 1\u20136). Weakly Semi-Supervised Detector-Based Video Classification with Temporal Context for Lung Ultrasound. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCVW60793.2023.00262"},{"key":"ref_45","unstructured":"Qian, R., Li, Y., Xu, Z., Yang, M.H., Belongie, S., and Cui, Y. (2022). Multimodal open-vocabulary video classification via pre-trained vision and language models. arXiv."},{"key":"ref_46","unstructured":"Kansal, S., and Kansal, P. (2025, September 09). Robotic Hand Pour & Stir Video Dataset. Available online: https:\/\/www.kaggle.com\/datasets\/8baa9574ce5ae310af601d342765670b61246e37140a6d190270f4601424a058."},{"key":"ref_47","unstructured":"Soomro, K., Zamir, A.R., and Shah, M. (2012). UCF101: A dataset of 101 human actions classes from videos in the wild. 
arXiv."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., and Serre, T. (2011, January 6\u201313). HMDB: A large video database for human motion recognition. Proceedings of the 2011 International Conference on Computer Vision (ICCV 2011), Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126543"},{"key":"ref_49","unstructured":"Simonyan, K., and Zisserman, A. (2015, January 7\u20139). Very Deep Convolutional Networks for Large-Scale Image Recognition. Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA."},{"key":"ref_50","unstructured":"Nair, V., and Hinton, G.E. (2010, January 21\u201324). Rectified Linear Units Improve Restricted Boltzmann Machines. Proceedings of the 27th International Conference on Machine Learning (ICML), Haifa, Israel."},{"key":"ref_51","unstructured":"Kingma, D.P., and Ba, J. (2015, January 7\u20139). Adam: A Method for Stochastic Optimization. Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA."},{"key":"ref_52","unstructured":"Shi, X., Chen, Z., Wang, H., Yeung, D.Y., Wong, W.K., and Woo, W.C. (2015, January 7\u201312). Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Proceedings of the Advances in Neural Information Processing Systems (NIPS 2015), Montreal, QC, Canada."},{"key":"ref_53","first-page":"2825","article-title":"Scikit-learn: Machine Learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. 
Res."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"111819","DOI":"10.1016\/j.patcog.2025.111819","article-title":"Dynamic bound adaptive gradient methods with belief in observed gradients","volume":"168","author":"Xiang","year":"2025","journal-title":"Pattern Recognit."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"121182","DOI":"10.1016\/j.eswa.2023.121182","article-title":"Quadruplet depth-wise separable fusion convolution neural network for ballistic target recognition with limited samples","volume":"235","author":"Xiang","year":"2024","journal-title":"Expert Syst. Appl."},{"key":"ref_56","unstructured":"Bradski, G. (2000). The OpenCV Library. Dr. Dobb\u2019s J. Softw. Tools, Available online: https:\/\/jacobfilipp.com\/DrDobbs\/articles\/DDJ\/2000\/0011\/0011k\/0011k.htm."},{"key":"ref_57","unstructured":"Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., and Batra, D. (, January 22\u201329). Grad-CAM: Visual explanations from deep networks via gradient-based localization. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy."}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/10\/1\/11\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,31]],"date-time":"2025-12-31T14:04:41Z","timestamp":1767189881000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/10\/1\/11"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,29]]},"references-count":57,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2026,1]]}},"alternative-id":["bdcc10010011"],"URL":"https:\/\/doi.org\/10.3390\/bdcc10010011","relation":{},"ISSN":["2504-2289"],"issn-type":[{"value":"2504-2289","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12,29]]}}}