{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T07:30:29Z","timestamp":1774942229576,"version":"3.50.1"},"reference-count":38,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2023,5,19]],"date-time":"2023-05-19T00:00:00Z","timestamp":1684454400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>The presence of occlusion in human activity recognition (HAR) tasks hinders the performance of recognition algorithms, as it is responsible for the loss of crucial motion data. Although it is intuitive that it may occur in almost any real-life environment, it is often underestimated in most research works, which tend to rely on datasets that have been collected under ideal conditions, i.e., without any occlusion. In this work, we present an approach that aimed to deal with occlusion in an HAR task. We relied on previous work on HAR and artificially created occluded data samples, assuming that occlusion may prevent the recognition of one or two body parts. The HAR approach we used is based on a Convolutional Neural Network (CNN) that has been trained using 2D representations of 3D skeletal motion. We considered cases in which the network was trained with and without occluded samples and evaluated our approach in single-view, cross-view, and cross-subject cases and using two large scale human motion datasets. 
Our experimental results indicate that the proposed training strategy provides a significant performance boost in the presence of occlusion.<\/jats:p>","DOI":"10.3390\/s23104899","type":"journal-article","created":{"date-parts":[[2023,5,19]],"date-time":"2023-05-19T10:08:55Z","timestamp":1684490935000},"page":"4899","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Human Activity Recognition in the Presence of Occlusion"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1554-0539","authenticated-orcid":false,"given":"Ioannis","family":"Vernikos","sequence":"first","affiliation":[{"name":"Department of Informatics and Telecommunications, University of Thessaly, 35131 Lamia, Greece"}]},{"given":"Theodoros","family":"Spyropoulos","sequence":"additional","affiliation":[{"name":"Department of Digital Systems, University of Piraeus, 18534 Piraeus, Greece"}]},{"given":"Evaggelos","family":"Spyrou","sequence":"additional","affiliation":[{"name":"Department of Informatics and Telecommunications, University of Thessaly, 35131 Lamia, Greece"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6916-3129","authenticated-orcid":false,"given":"Phivos","family":"Mylonas","sequence":"additional","affiliation":[{"name":"Department of Informatics and Computer Engineering, University of West Attica, Egaleo Park, 12243 Athens, Greece"}]}],"member":"1968","published-online":{"date-parts":[[2023,5,19]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"28","DOI":"10.3389\/frobt.2015.00028","article-title":"A review of human activity recognition methods","volume":"2","author":"Vrigkas","year":"2015","journal-title":"Front. Robot. 
AI"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1016\/j.cviu.2018.04.007","article-title":"RGB-D-based human motion recognition with deep learning: A survey","volume":"171","author":"Wang","year":"2018","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"2259","DOI":"10.1007\/s10462-020-09904-8","article-title":"A survey on video-based human action recognition: Recent updates, datasets, challenges, and applications","volume":"54","author":"Pareek","year":"2021","journal-title":"Artif. Intell. Rev."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1109\/MSP.2015.2503881","article-title":"Monitoring activities of daily living in smart homes: Understanding human behavior","volume":"33","author":"Debes","year":"2016","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"e15704","DOI":"10.2196\/15704","article-title":"Comparing the usability and acceptability of wearable sensors among older irish adults in a real-world context: Observational study","volume":"8","author":"Keogh","year":"2020","journal-title":"JMIR mHealth uHealth"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Majumder, S., Mondal, T., and Deen, M.J. (2017). Wearable sensors for remote health monitoring. Sensors, 17.","DOI":"10.3390\/s17010130"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Papadakis, A., Mathe, E., Spyrou, E., and Mylonas, P. (2019, January 23\u201325). A geometric approach for cross-view human action recognition using deep learning. Proceedings of the 11th International Symposium on Image and Signal Processing and Analysis (ISPA), Dubrovnik, Croatia.","DOI":"10.1109\/ISPA.2019.8868717"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Liu, C., Hu, Y., Li, Y., Song, S., and Liu, J. (2017). Pku-mmd: A large scale benchmark for continuous multi-modal human action understanding. 
arXiv.","DOI":"10.1145\/3132734.3132739"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Shahroudy, A., Liu, J., Ng, T.T., and Wang, G. (2016, January 27\u201330). Ntu rgb+ d: A large scale dataset for 3d human activity analysis. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.115"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Gu, R., Wang, G., and Hwang, J.N. (2021, January 10\u201315). Exploring severe occlusion: Multi-person 3d pose estimation with gated convolution. Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy.","DOI":"10.1109\/ICPR48806.2021.9412107"},{"key":"ref_11","unstructured":"Giannakos, I., Mathe, E., Spyrou, E., and Mylonas, P. (July, January 29). A study on the Effect of Occlusion in Human Activity Recognition. Proceedings of the 14th PErvasive Technologies Related to Assistive Environments Conference, Corfu, Greece."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1093\/geront\/9.3_Part_1.179","article-title":"Assessment of older people: Self-maintaining and instrumental activities of daily living","volume":"9","author":"Lawton","year":"1969","journal-title":"Gerontologist"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Du, Y., Fu, Y., and Wang, L. (2015, January 3\u20136). Skeleton based action recognition with convolutional neural network. Proceedings of the 2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR), Kuala Lumpur, Malaysia.","DOI":"10.1109\/ACPR.2015.7486569"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"807","DOI":"10.1109\/TCSVT.2016.2628339","article-title":"Skeleton optical spectra-based action recognition using convolutional neural networks","volume":"28","author":"Hou","year":"2016","journal-title":"IEEE Trans. Circuits Syst. 
Video Technol."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"731","DOI":"10.1109\/LSP.2017.2690339","article-title":"Skeletonnet: Mining deep part features for 3-d action recognition","volume":"24","author":"Ke","year":"2017","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"624","DOI":"10.1109\/LSP.2017.2678539","article-title":"Joint distance maps based action recognition with convolutional neural networks","volume":"24","author":"Li","year":"2017","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.patcog.2017.02.030","article-title":"Enhanced skeleton visualization for view invariant human action recognition","volume":"68","author":"Liu","year":"2017","journal-title":"Pattern Recognit."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1016\/j.knosys.2018.05.029","article-title":"Action recognition based on joint trajectory maps with convolutional neural networks","volume":"158","author":"Wang","year":"2018","journal-title":"Knowl.-Based Syst."},{"key":"ref_19","unstructured":"Iosifidis, A., Tefas, A., and Pitas, I. (2012, January 27\u201331). Multi-view human action recognition under occlusion based on fuzzy distances and neural networks. Proceedings of the 20th European Signal Processing Conference (EUSIPCO), Bucharest, Romania."},{"key":"ref_20","unstructured":"Papadakis, A., Mathe, E., Vernikos, I., Maniatis, A., Spyrou, E., and Mylonas, P. (2019). Engineering Applications of Neural Networks, Proceedings of the 20th International Conference, EANN 2019, Xersonisos, Crete, Greece, 24\u201326 May 2019, Springer."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1433","DOI":"10.1109\/TMM.2019.2944745","article-title":"2D pose-based real-time human action recognition with occlusion-handling","volume":"22","author":"Angelini","year":"2019","journal-title":"IEEE Trans. 
Multimed."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Cao, Z., Simon, T., Wei, S.E., and Sheikh, Y. (2017, January 21\u201326). Realtime multi-person 2d pose estimation using part affinity fields. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.143"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Gkalelis, N., Kim, H., Hilton, A., Nikolaidis, N., and Pitas, I. (2009, January 12\u201313). The i3dpost multi-view and 3d human action\/interaction database. Proceedings of the 2009 Conference for Visual Media Production, London, UK.","DOI":"10.1109\/CVMP.2009.19"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"538","DOI":"10.1109\/JSTSP.2012.2196975","article-title":"Human pose estimation and activity recognition from multi-view videos: Comparative explorations of recent developments","volume":"6","author":"Holte","year":"2012","journal-title":"IEEE J. Sel. Top. Signal Process."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1109\/ICPR.2004.1334462","article-title":"Recognizing human actions: A local SVM approach","volume":"Volume 3","author":"Schuldt","year":"2004","journal-title":"Proceedings of the 17th International Conference on Pattern Recognition, ICPR 2004"},{"key":"ref_26","first-page":"254","article-title":"The MOBISERV-AIIA eating and drinking multi-view database for vision-based assisted living","volume":"6","author":"Iosifidis","year":"2015","journal-title":"J. Inf. Hiding Multimed. Signal Process."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"103352","DOI":"10.1016\/j.jobe.2021.103352","article-title":"Action recognition of construction workers under occlusion","volume":"45","author":"Li","year":"2022","journal-title":"J. Build. Eng."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Li, W., Zhang, Z., and Liu, Z. (2010, January 13\u201318). 
Action recognition based on a bag of 3d points. Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition-Workshops, San Francisco, CA, USA.","DOI":"10.1109\/CVPRW.2010.5543273"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1325","DOI":"10.1109\/TPAMI.2013.248","article-title":"Human3.6M: Large scale datasets and predictive methods for 3d human sensing in natural environments","volume":"36","author":"Ionescu","year":"2013","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Yang, D., Wang, Y., Dantcheva, A., Garattoni, L., Francesca, G., and Br\u00e9mond, F. (2021, January 15\u201318). Self-Supervised Video Pose Representation Learning for Occlusion-Robust Action Recognition. Proceedings of the 2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021), Jodhpur, India.","DOI":"10.1109\/FG52635.2021.9667032"},{"key":"ref_31","unstructured":"Das, S., Dai, R., Koperski, M., Minciullo, L., Garattoni, L., Bremond, F., and Francesca, G. (November, January 27). Toyota smarthome: Real-world activities of daily living. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Wang, J., Nie, X., Xia, Y., Wu, Y., and Zhu, S.C. (2014, January 23\u201328). Cross-view action modeling, learning and recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.339"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhang, W., Zhu, M., and Derpanis, K.G. (2013, January 1\u20138). From actemes to action: A strongly-supervised representation for detailed action understanding. 
Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia.","DOI":"10.1109\/ICCV.2013.280"},{"key":"ref_34","unstructured":"Vernikos, I., Mathe, E., Papadakis, A., Spyrou, E., and Mylonas, P. (2019, January 5\u20137). An image representation of skeletal data for action recognition using convolutional neural networks. Proceedings of the 12th ACM International Conference on PErvasive Technologies Related to Assistive Environments, Rhodes, Greece."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1007\/s11263-021-01529-w","article-title":"View-invariant, occlusion-robust probabilistic embedding for human pose","volume":"130","author":"Liu","year":"2022","journal-title":"Int. J. Comput. Vision"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"2350002","DOI":"10.1142\/S0129065723500028","article-title":"A Multimodal Fusion Approach for Human Activity Recognition","volume":"33","author":"Koutrintzes","year":"2022","journal-title":"Int. J. Neural Syst."},{"key":"ref_37","unstructured":"Chollet, F. (2023, March 20). Keras. Available online: https:\/\/github.com\/fchollet\/keras."},{"key":"ref_38","unstructured":"Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., and Isard, M. (2016, January 2\u20134). TensorFlow: A System for Large-Scale Machine Learning. 
Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, USA."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/10\/4899\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:38:26Z","timestamp":1760125106000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/10\/4899"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,19]]},"references-count":38,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2023,5]]}},"alternative-id":["s23104899"],"URL":"https:\/\/doi.org\/10.3390\/s23104899","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,19]]}}}