{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T15:56:45Z","timestamp":1774540605160,"version":"3.50.1"},"reference-count":75,"publisher":"Springer Science and Business Media LLC","issue":"22","license":[{"start":{"date-parts":[[2024,8,27]],"date-time":"2024-08-27T00:00:00Z","timestamp":1724716800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,8,27]],"date-time":"2024-08-27T00:00:00Z","timestamp":1724716800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100023561","name":"Ministerio de Universidades","doi-asserted-by":"publisher","award":["FPU21\/00414"],"award-info":[{"award-number":["FPU21\/00414"]}],"id":[{"id":"10.13039\/501100023561","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100016386","name":"Conselleria de Innovaci\u00f3n, Universidades, Ciencia y Sociedad Digital, Generalitat Valenciana","doi-asserted-by":"publisher","award":["CIACIF\/2021\/430"],"award-info":[{"award-number":["CIACIF\/2021\/430"]}],"id":[{"id":"10.13039\/501100016386","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004837","name":"Ministerio de Ciencia e Innovaci\u00f3n","doi-asserted-by":"publisher","award":["MCIN\/AEI\/10.13039\/501100011033"],"award-info":[{"award-number":["MCIN\/AEI\/10.13039\/501100011033"]}],"id":[{"id":"10.13039\/501100004837","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100018707","name":"HORIZON EUROPE Reforming and enhancing the European Research and Innovation system","doi-asserted-by":"publisher","award":["101086387"],"award-info":[{"award-number":["101086387"]}],"id":[{"id":"10.13039\/100018707","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100018707","name":"HORIZON EUROPE Reforming and enhancing the European Research and Innovation system","doi-asserted-by":"publisher","award":["CIAICO\/2022\/132"],"award-info":[{"award-number":["CIAICO\/2022\/132"]}],"id":[{"id":"10.13039\/100018707","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Multimed Tools Appl"],"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>The evolution of virtual and augmented reality devices in recent years has encouraged researchers to develop new systems for different fields. This paper introduces Holo4Care, a context-aware mixed reality framework designed for assisting in activities of daily living (ADL) using the HoloLens 2. By leveraging egocentric cameras embedded in these devices, which offer a close-to-wearer perspective, our framework establishes a congruent relationship, facilitating a deeper understanding of user actions and enabling effective assistance. In our approach, we extend a previously established action estimation architecture after conducting a thorough review of state-of-the-art methods. The proposed architecture utilizes YOLO for hand and object detection, enabling action estimation based on these identified elements. We have trained new models on well-known datasets for object detection, incorporating action recognition annotations. The achieved mean Average Precision (mAP) is 33.2% in the EpicKitchens dataset and 26.4% on the ADL dataset. Leveraging the capabilities of the HoloLens 2, including spatial mapping and 3D hologram display, our system seamlessly presents the output of the action recognition architecture to the user. Unlike previous systems that focus primarily on user evaluation, Holo4Care emphasizes assistance by providing a set of global actions based on the user\u2019s field of view and hand positions that reflect their intentions. Experimental results demonstrate Holo4Care\u2019s ability to assist users in activities of daily living and other domains.<\/jats:p>","DOI":"10.1007\/s11042-024-20107-z","type":"journal-article","created":{"date-parts":[[2024,8,26]],"date-time":"2024-08-26T22:02:01Z","timestamp":1724709721000},"page":"24983-25007","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Holo4Care: a MR framework for assisting in activities of daily living by context-aware action recognition"],"prefix":"10.1007","volume":"84","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8809-8476","authenticated-orcid":false,"given":"Manuel","family":"Benavent-Lledo","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Mulero-P\u00e9rez","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jose","family":"Garcia-Rodriguez","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ester","family":"Martinez-Martin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Flores","family":"Vizcaya-Moreno","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,8,27]]},"reference":[{"key":"20107_CR1","doi-asserted-by":"publisher","unstructured":"Palumbo A (2022) Microsoft hololens 2 in medical and healthcare context: State of the art and future prospects. Sensors 22(20) https:\/\/doi.org\/10.3390\/s22207709","DOI":"10.3390\/s22207709"},{"issue":"8","key":"20107_CR2","doi-asserted-by":"publisher","first-page":"838","DOI":"10.1111\/iju.14907","volume":"29","author":"L Wang","year":"2022","unstructured":"Wang L, Zhao Z, Wang G, Zhou J, Zhu H, Guo H, Huang H, Yu M, Zhu G, Li N, Na Y (2022) Application of a three-dimensional visualization model in intraoperative guidance of percutaneous nephrolithotomy. Int J Urol 29(8):838\u2013844. https:\/\/doi.org\/10.1111\/iju.14907","journal-title":"Int J Urol"},{"issue":"4","key":"20107_CR3","doi-asserted-by":"publisher","first-page":"1006","DOI":"10.1016\/j.surg.2021.10.004","volume":"171","author":"M Kitagawa","year":"2022","unstructured":"Kitagawa M, Sugimoto M, Haruta H, Umezawa A, Kurokawa Y (2022) Intraoperative holography navigation using a mixed-reality wearable computer during laparoscopic cholecystectomy. Surgery 171(4):1006\u20131013. https:\/\/doi.org\/10.1016\/j.surg.2021.10.004","journal-title":"Surgery"},{"issue":"23","key":"20107_CR4","doi-asserted-by":"publisher","first-page":"7824","DOI":"10.3390\/s21237824","volume":"21","author":"M Garc\u00eda-Sevilla","year":"2021","unstructured":"Garc\u00eda-Sevilla M, Moreta-Martinez R, Garc\u00eda-Mato D, Pose-Diez-de-la-Lastra A, P\u00e9rez-Ma\u00f1anes R, Calvo-Haro JA, Pascau J (2021) Augmented reality as a tool to guide PSI placement in pelvic tumor resections. Sensors. 21(23):7824. https:\/\/doi.org\/10.3390\/s21237824","journal-title":"Sensors."},{"key":"20107_CR5","doi-asserted-by":"publisher","unstructured":"Wolf J, Lohmeyer Q, Holz C, Meboldt M (2021) Gaze comes in handy: Predicting and preventing erroneous hand actions in ar-supported manual tasks. In: 2021 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), pp. 166\u2013175. https:\/\/doi.org\/10.1109\/ISMAR52148.2021.00031","DOI":"10.1109\/ISMAR52148.2021.00031"},{"key":"20107_CR6","doi-asserted-by":"publisher","unstructured":"Wolf E, Fiedler ML, D\u00f6llinger N, Wienrich C, Latoschik ME (2022) Exploring presence, avatar embodiment, and body perception with a holographic augmented reality mirror. In: 2022 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), pp. 350\u2013359. https:\/\/doi.org\/10.1109\/VR51125.2022.00054","DOI":"10.1109\/VR51125.2022.00054"},{"key":"20107_CR7","doi-asserted-by":"crossref","unstructured":"Mulero-P\u00e9rez D, Benavent-Lledo M, Garcia-Rodriguez J, Azorin-Lopez J, Vizcaya-Moreno F (2023) Holodemtect: A mixed reality framework for cognitive stimulation through interaction with objects. In: 18th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2023, Garc\u00eda Bringas P, P\u00e9rez Garc\u00eda H, Pis\u00f3n FJ, Mart\u00ednez \u00c1lvarez F, Troncoso Lora A, Herrero \u00c1, Calvo Rolle JL, Quinti\u00e1n H, Corchado E (eds) Springer, Cham, pp 226\u2013235","DOI":"10.1007\/978-3-031-42536-3_22"},{"key":"20107_CR8","doi-asserted-by":"publisher","DOI":"10.1007\/s11548-021-02408-y","author":"J Wolf","year":"2021","unstructured":"Wolf J, Wolfer V, Halbe M, Maisano F, Lohmeyer Q, Meboldt M (2021) Comparing the effectiveness of augmented reality-based and conventional instructions during single ECMO cannulation training. Int J Comput Assist Radiol Surg. https:\/\/doi.org\/10.1007\/s11548-021-02408-y","journal-title":"Int J Comput Assist Radiol Surg"},{"issue":"1","key":"20107_CR9","doi-asserted-by":"publisher","first-page":"127","DOI":"10.7861\/fhj.2020-0146","volume":"8","author":"JB Levy","year":"2021","unstructured":"Levy JB, Kong E, Johnson N, Khetarpal A, Tomlinson J, Martin GF, Tanna A (2021) The mixed reality medical ward round with the MS HoloLens 2: Innovation in reducing COVID-19 transmission and PPE usage. Future Healthcare Journal 8(1):127\u2013130. https:\/\/doi.org\/10.7861\/fhj.2020-0146","journal-title":"Future Healthcare Journal"},{"key":"20107_CR10","doi-asserted-by":"publisher","unstructured":"Dolega-Dolegowski D, Proniewska K, Dolega-Dolegowska M, Pregowska A, Hajto-Bryk J, Trojak M, Chmiel J, Walecki P, Fudalej PS (2022) Application of holography and augmented reality based technology to visualize the internal structure of the dental root \u2013 a proof of concept. Head & Face Medicine 18(1) https:\/\/doi.org\/10.1186\/s13005-022-00307-4","DOI":"10.1186\/s13005-022-00307-4"},{"issue":"7","key":"20107_CR11","doi-asserted-by":"publisher","first-page":"344","DOI":"10.1080\/01691864.2021.2017342","volume":"36","author":"R Kurazume","year":"2022","unstructured":"Kurazume R, Hiramatsu T, Kamei M, Inoue D, Kawamura A, Miyauchi S, An Q (2022) Development of AR training systems for humanitude dementia care. Adv Robot 36(7):344\u2013358. https:\/\/doi.org\/10.1080\/01691864.2021.2017342","journal-title":"Adv Robot"},{"key":"20107_CR12","unstructured":"Ulhaq A, Akhtar N, Pogrebna G, Mian A (2022) Vision Transformers for Action Recognition: A Survey"},{"key":"20107_CR13","doi-asserted-by":"crossref","unstructured":"Girdhar R, Grauman K (2021) Anticipative Video Transformer","DOI":"10.1109\/ICCV48922.2021.01325"},{"key":"20107_CR14","doi-asserted-by":"crossref","unstructured":"Xing Z, Dai Q, Hu H, Chen J, Wu Z, Jiang YG (2023) Svformer: Semi-supervised video transformer for action recognition. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 18816\u201318826","DOI":"10.1109\/CVPR52729.2023.01804"},{"key":"20107_CR15","doi-asserted-by":"publisher","unstructured":"Benavent-Lledo M, Oprea S, Castro-Vargas JA, Mulero-Perez D, Garcia-Rodriguez J (2022) Predicting human-object interactions in egocentric videos. In: 2022 International Joint Conference on Neural Networks (IJCNN), pp. 1\u20137. https:\/\/doi.org\/10.1109\/IJCNN55064.2022.9892910","DOI":"10.1109\/IJCNN55064.2022.9892910"},{"key":"20107_CR16","doi-asserted-by":"crossref","unstructured":"Alfaro-Viquez D, Zamora-Hernandez MA, Benavent-Lledo M, Garcia-Rodriguez J, Azor\u00edn-L\u00f3pez J (2023) Monitoring human performance through deep learning and computer vision in industry 4.0. In: 17th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2022, Garc\u00eda Bringas P, P\u00e9rez Garc\u00eda H, Martinez-de-Pison FJ, Villar Flecha JR, Troncoso Lora A, Cal EA, Herrero \u00c1, Mart\u00ednez \u00c1lvarez F, Psaila G, Quinti\u00e1n H, Corchado Rodriguez ES (eds) Springer, Cham, pp 309\u2013318","DOI":"10.1007\/978-3-031-18050-7_30"},{"key":"20107_CR17","doi-asserted-by":"crossref","unstructured":"Gu C, Sun C, Ross DA, Vondrick C, Pantofaru C, Li Y, Vijayanarasimhan S, Toderici G, Ricco S, Sukthankar R, Schmid C, Malik J (2018) AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions","DOI":"10.1109\/CVPR.2018.00633"},{"key":"20107_CR18","unstructured":"Li A, Thotakuri M, Ross DA, Carreira J, Vostrikov A, Zisserman A (2020) The AVA-Kinetics Localized Human Actions Video Dataset"},{"key":"20107_CR19","doi-asserted-by":"publisher","unstructured":"Gomez-Donoso F, Orts-Escolano S, Garcia-Garcia A, Garcia-Rodriguez J, Castro-Vargas JA, Ovidiu-Oprea S, Cazorla M (2017) A robotic platform for customized and interactive rehabilitation of persons with disabilities. Pattern Recognition Letters. 99, 105\u2013113 https:\/\/doi.org\/10.1016\/j.patrec.2017.05.027 . User Profiling and Behavior Adaptation for Human-Robot Interaction","DOI":"10.1016\/j.patrec.2017.05.027"},{"key":"20107_CR20","doi-asserted-by":"publisher","first-page":"42","DOI":"10.1016\/j.physbeh.2017.01.034","volume":"173","author":"JM Fernandez Montenegro","year":"2017","unstructured":"Fernandez Montenegro JM, Argyriou V (2017) Cognitive evaluation for the diagnosis of alzheimer\u2019s disease based on turing test and virtual environments. Physiology & Behavior 173:42\u201351. https:\/\/doi.org\/10.1016\/j.physbeh.2017.01.034","journal-title":"Physiology & Behavior"},{"issue":"1","key":"20107_CR21","doi-asserted-by":"publisher","first-page":"16","DOI":"10.22543\/7674.91.P1627","volume":"9","author":"EM Merlo","year":"2022","unstructured":"Merlo EM, Myles LAM, Pappalardo SM (2022) The vespa project: Virtual reality interventions for neurocognitive and developmental disorders. Journal of Mind and Medical Sciences 9(1):16\u201327","journal-title":"Journal of Mind and Medical Sciences"},{"key":"20107_CR22","doi-asserted-by":"publisher","first-page":"42","DOI":"10.1016\/j.physbeh.2017.01.034","volume":"173","author":"JM Fernandez Montenegro","year":"2017","unstructured":"Fernandez Montenegro JM, Argyriou V (2017) Cognitive evaluation for the diagnosis of alzheimer\u2019s disease based on turing test and virtual environments. Physiology & Behavior 173:42\u201351. https:\/\/doi.org\/10.1016\/j.physbeh.2017.01.034","journal-title":"Physiology & Behavior"},{"key":"20107_CR23","doi-asserted-by":"publisher","unstructured":"Fern\u00e1ndez\u00a0Montenegro JM, Villarini B, Angelopoulou A, Kapetanios E, Garcia-Rodriguez J, Argyriou V (2020) A survey of alzheimer\u2019s disease early diagnosis methods for cognitive assessment. Sensors 20(24) https:\/\/doi.org\/10.3390\/s20247292","DOI":"10.3390\/s20247292"},{"issue":"1","key":"20107_CR24","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12984-019-0530-z","volume":"16","author":"N Rohrbach","year":"2019","unstructured":"Rohrbach N, Gulde P, Armstrong AR, Hartig L, Abdelrazeq A, Schr\u00f6der S, Neuse J, Grimmer T, Diehl-Schmid J, Hermsd\u00f6rfer J (2019) An augmented reality approach for adl support in alzheimer\u2019s disease: a crossover trial. J Neuroeng Rehabil 16(1):1\u201311","journal-title":"J Neuroeng Rehabil"},{"key":"20107_CR25","doi-asserted-by":"publisher","unstructured":"De\u00a0Cecco M, Luchetti A, Butaslac I, Pilla F, Guandalini GMA, Bonavita J, Mazzucato M, Hirokazu K (2023) Sharing augmented reality between a patient and a clinician for assessment and rehabilitation in daily living activities. Information 14(4) https:\/\/doi.org\/10.3390\/info14040204","DOI":"10.3390\/info14040204"},{"issue":"1","key":"20107_CR26","doi-asserted-by":"publisher","first-page":"234","DOI":"10.1038\/s41746-023-00978-6","volume":"6","author":"M Muurling","year":"2023","unstructured":"Muurling M, Boer C, Vairavan S, Harms RL, Chadha AS, Tarnanas I, Luis EV, Religa D, Gjestsen MT, Galluzzi S et al (2023) Augmented reality versus standard tests to assess cognition and function in early alzheimer\u2019s disease. NPJ digital medicine 6(1):234","journal-title":"NPJ digital medicine"},{"key":"20107_CR27","doi-asserted-by":"publisher","first-page":"325","DOI":"10.1016\/j.jmsy.2020.04.018","volume":"55","author":"C Chen","year":"2020","unstructured":"Chen C, Wang T, Li D, Hong J (2020) Repetitive assembly action recognition based on object detection and pose estimation. J Manuf Syst 55:325\u2013333. https:\/\/doi.org\/10.1016\/j.jmsy.2020.04.018","journal-title":"J Manuf Syst"},{"key":"20107_CR28","doi-asserted-by":"publisher","first-page":"395","DOI":"10.1007\/978-3-642-42054-2_49","volume-title":"Neural Information Processing","author":"S Kim","year":"2013","unstructured":"Kim S, Jung J, Kavuri S, Lee M (2013) Intention estimation and recommendation system based on attention sharing. In: Lee M, Hirose A, Hou Z-G, Kil RM (eds) Neural Information Processing. Springer, Berlin, Heidelberg, pp 395\u2013402"},{"key":"20107_CR29","doi-asserted-by":"publisher","unstructured":"Reza S, Zhang Y, Camps O, Moghaddam M (2023) Towards seamless egocentric hand action recognition in mixed reality. In: 2023 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), pp. 411\u2013416. https:\/\/doi.org\/10.1109\/ISMAR-Adjunct60411.2023.00088","DOI":"10.1109\/ISMAR-Adjunct60411.2023.00088"},{"issue":"3","key":"20107_CR30","first-page":"3200","volume":"45","author":"Z Sun","year":"2022","unstructured":"Sun Z, Ke Q, Rahmani H, Bennamoun M, Wang G, Liu J (2022) Human action recognition from various data modalities: A review. IEEE Trans Pattern Anal Mach Intell 45(3):3200\u20133225","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"20107_CR31","doi-asserted-by":"crossref","unstructured":"Kazakos E, Nagrani A, Zisserman A, Damen D (2019) Epic-fusion: Audio-visual temporal binding for egocentric action recognition. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision, pp. 5492\u20135501","DOI":"10.1109\/ICCV.2019.00559"},{"key":"20107_CR32","doi-asserted-by":"crossref","unstructured":"Zhang C, Cui Z, Zhang Y, Zeng B, Pollefeys M, Liu S (2021) Holistic 3D Scene Understanding from a Single Image with Implicit Representation","DOI":"10.1109\/CVPR46437.2021.00872"},{"key":"20107_CR33","doi-asserted-by":"publisher","unstructured":"Vaca-Castano G, Das S, Sousa JP, Lobo ND, Shah M (2017) Improved scene identification and object detection on egocentric vision of daily activities. Computer Vision and Image Understanding 156, 92\u2013103 https:\/\/doi.org\/10.1016\/j.cviu.2016.10.016 . Image and Video Understanding in Big Data","DOI":"10.1016\/j.cviu.2016.10.016"},{"key":"20107_CR34","unstructured":"Grauman K, Westbury A, et al. (2022) Ego4D: Around the World in 3,000 Hours of Egocentric Video"},{"key":"20107_CR35","doi-asserted-by":"crossref","unstructured":"Wang H, Singh MK, Torresani L (2023) Ego-only: Egocentric action detection without exocentric transferring. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp. 5250\u20135261","DOI":"10.1109\/ICCV51070.2023.00484"},{"key":"20107_CR36","doi-asserted-by":"crossref","unstructured":"Gong X, Mohan S, Dhingra N, Bazin JC, Li Y, Wang Z, Ranjan R (2023) Mmg-ego4d: Multimodal generalization in egocentric action recognition. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6481\u20136491","DOI":"10.1109\/CVPR52729.2023.00627"},{"key":"20107_CR37","doi-asserted-by":"crossref","unstructured":"Li Y, Nagarajan T, Xiong B, Grauman K (2021) Ego-exo: Transferring visual representations from third-person to first-person videos. In: CVPR","DOI":"10.1109\/CVPR46437.2021.00687"},{"issue":"4","key":"20107_CR38","doi-asserted-by":"publisher","first-page":"2333","DOI":"10.1109\/LRA.2023.3251843","volume":"8","author":"G Goletto","year":"2023","unstructured":"Goletto G, Planamente M, Caputo B, Averta G (2023) Bringing online egocentric action recognition into the wild. IEEE Robotics and Automation Letters 8(4):2333\u20132340. https:\/\/doi.org\/10.1109\/LRA.2023.3251843","journal-title":"IEEE Robotics and Automation Letters"},{"key":"20107_CR39","doi-asserted-by":"crossref","unstructured":"Kapidis G, Poppe R, Dam E, Noldus LPJJ, Veltkamp RC (2019) Egocentric Hand Track and Object-based Human Action Recognition","DOI":"10.1109\/SmartWorld-UIC-ATC-SCALCOM-IOP-SCI.2019.00185"},{"key":"20107_CR40","doi-asserted-by":"publisher","unstructured":"Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 580\u2013587. https:\/\/doi.org\/10.1109\/CVPR.2014.81","DOI":"10.1109\/CVPR.2014.81"},{"key":"20107_CR41","doi-asserted-by":"publisher","unstructured":"Girshick R (2015) Fast R-CNN. arXiv. https:\/\/doi.org\/10.48550\/ARXIV.1504.08083","DOI":"10.48550\/ARXIV.1504.08083"},{"issue":"6","key":"20107_CR42","doi-asserted-by":"publisher","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","volume":"39","author":"S Ren","year":"2017","unstructured":"Ren S, He K, Girshick R, Sun J (2017) Faster r-cnn: Towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137\u20131149. https:\/\/doi.org\/10.1109\/TPAMI.2016.2577031","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"20107_CR43","doi-asserted-by":"publisher","unstructured":"Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: Unified, real-time object detection. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779\u2013788. https:\/\/doi.org\/10.1109\/CVPR.2016.91","DOI":"10.1109\/CVPR.2016.91"},{"key":"20107_CR44","doi-asserted-by":"publisher","unstructured":"Redmon J, Farhadi A (2017) Yolo9000: Better, faster, stronger. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6517\u20136525. https:\/\/doi.org\/10.1109\/CVPR.2017.690","DOI":"10.1109\/CVPR.2017.690"},{"key":"20107_CR45","doi-asserted-by":"crossref","unstructured":"Lin TY, Goyal P, Girshick R, He K, Doll\u00e1r P (2018) Focal Loss for Dense Object Detection","DOI":"10.1109\/ICCV.2017.324"},{"key":"20107_CR46","unstructured":"Wu Y, Kirillov A, Massa F, Lo WY, Girshick R (2019) Detectron2. https:\/\/github.com\/facebookresearch\/detectron2"},{"key":"20107_CR47","doi-asserted-by":"publisher","unstructured":"Redmon J, Farhadi A (2018) YOLOv3: An Incremental Improvement. https:\/\/doi.org\/10.48550\/arXiv.1804.02767","DOI":"10.48550\/arXiv.1804.02767"},{"key":"20107_CR48","unstructured":"Bochkovskiy A, Wang CY, Liao HYM (2020) YOLOv4: Optimal Speed and Accuracy of Object Detection"},{"key":"20107_CR49","doi-asserted-by":"publisher","unstructured":"Wang CY, Bochkovskiy A, Liao HYM (2021) Scaled-yolov4: Scaling cross stage partial network. In: 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13024\u201313033. https:\/\/doi.org\/10.1109\/CVPR46437.2021.01283","DOI":"10.1109\/CVPR46437.2021.01283"},{"key":"20107_CR50","unstructured":"Wang CY, Yeh IH, Liao HYM (2021) You Only Learn One Representation: Unified Network for Multiple Tasks"},{"key":"20107_CR51","unstructured":"Ge Z, Liu S, Wang F, Li Z, Sun J (2021) YOLOX: Exceeding YOLO Series in 2021"},{"key":"20107_CR52","doi-asserted-by":"crossref","unstructured":"Carion N, Massa F, Synnaeve G, Usunier N, Kirillov A, Zagoruyko S (2020) End-to-End Object Detection with Transformers","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"20107_CR53","unstructured":"Zhu X, Su W, Lu L, Li B, Wang X, Dai J (2021) Deformable DETR: Deformable Transformers for End-to-End Object Detection"},{"key":"20107_CR54","unstructured":"Song H, Sun D, Chun S, Jampani V, Han D, Heo B, Kim W, Yang MH (2022) Vidt: An efficient and effective fully transformer-based object detector. In: International Conference on Learning Representation"},{"key":"20107_CR55","unstructured":"Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N (2021) An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale"},{"key":"20107_CR56","doi-asserted-by":"crossref","unstructured":"Fern\u00e1ndez ISM, Oprea S, Castro-Vargas JA, Martinez-Gonzalez P, Garcia-Rodriguez J (2022) Estimating context aware human-object interaction using deep learning-based object recognition architectures. In: 16th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2021, Sanjurjo Gonz\u00e1lez H, Pastor L\u00f3pez I, Garc\u00eda Bringas P, Quinti\u00e1n H, Corchado E (eds) Springer, Cham, pp 429\u2013438","DOI":"10.1007\/978-3-030-87869-6_41"},{"key":"20107_CR57","doi-asserted-by":"crossref","unstructured":"Benavent-Lled\u00f3 M, Oprea S, Castro-Vargas JA, Martinez-Gonzalez P, Garcia-Rodriguez J (2022) Interaction estimation in egocentric videos via simultaneous hand-object recognition. In: 16th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2021, Sanjurjo Gonz\u00e1lez H, Pastor L\u00f3pez I, Garc\u00eda Bringas P, Quinti\u00e1n H, Corchado E (eds) Springer, Cham, pp 439\u2013448","DOI":"10.1007\/978-3-030-87869-6_42"},{"key":"20107_CR58","doi-asserted-by":"publisher","unstructured":"Dewi C, Chen APS, Christanto HJ (2023) Deep learning for highly accurate hand recognition based on yolov7 model. Big Data and Cognitive Computing 7(1) https:\/\/doi.org\/10.3390\/bdcc7010053","DOI":"10.3390\/bdcc7010053"},{"key":"20107_CR59","doi-asserted-by":"crossref","unstructured":"\u0141ysakowski M, \u017bywanowski K, Banaszczyk A, Nowicki MR, Skrzypczy\u0144ski P, Tadeja SK (2023) Real-Time Onboard Object Detection for Augmented Reality: Enhancing Head-Mounted Display with YOLOv8","DOI":"10.1109\/EDGE60047.2023.00059"},{"key":"20107_CR60","doi-asserted-by":"publisher","unstructured":"Mahurkar S (2018) Integrating yolo object detection with augmented reality for ios apps. In: 2018 9th IEEE Annual Ubiquitous Computing, Electronics and Mobile Communication Conference (UEMCON), pp. 585\u2013589. https:\/\/doi.org\/10.1109\/UEMCON.2018.8796579","DOI":"10.1109\/UEMCON.2018.8796579"},{"key":"20107_CR61","doi-asserted-by":"publisher","DOI":"10.1117\/12.2660940","author":"Y Qin","year":"2023","unstructured":"Qin Y, Wang S, Zhang Q, Cheng Y, Huang J, He W (2023). Assembly training system on hololens using embedded algorithm. https:\/\/doi.org\/10.1117\/12.2660940","journal-title":"Assembly training system on hololens using embedded algorithm."},{"key":"20107_CR62","doi-asserted-by":"publisher","unstructured":"Gupta S, Malik J (2015) Visual Semantic Role Labeling. https:\/\/doi.org\/10.48550\/arXiv.1505.04474","DOI":"10.48550\/arXiv.1505.04474"},{"key":"20107_CR63","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1016\/j.imavis.2019.06.002","volume":"89","author":"S Cruz","year":"2019","unstructured":"Cruz S, Chan A (2019) Is that my hand? an egocentric dataset for hand disambiguation. Image Vis Comput 89:131\u2013143. https:\/\/doi.org\/10.1016\/j.imavis.2019.06.002","journal-title":"Image Vis Comput"},{"key":"20107_CR64","doi-asserted-by":"crossref","unstructured":"Damen D, Doughty H, Farinella GM, Fidler S, Furnari A, Kazakos E, Moltisanti D, Munro J, Perrett T, Price W, Wray M (2018) Scaling egocentric vision: The epic-kitchens dataset. In: Proceedings of the European Conference on Computer Vision (ECCV)","DOI":"10.1007\/978-3-030-01225-0_44"},{"key":"20107_CR65","doi-asserted-by":"publisher","unstructured":"Pirsiavash H, Ramanan D (2012) Detecting activities of daily living in first-person camera views. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2847\u20132854. https:\/\/doi.org\/10.1109\/CVPR.2012.6248010","DOI":"10.1109\/CVPR.2012.6248010"},{"issue":"1","key":"20107_CR66","doi-asserted-by":"publisher","first-page":"98","DOI":"10.1007\/s11263-014-0733-5","volume":"111","author":"M Everingham","year":"2014","unstructured":"Everingham M, Eslami SMA, Van Gool L, Williams CKI, Winn J, Zisserman A (2014) The pascal visual object classes challenge: A retrospective. Int J Comput Vision 111(1):98\u2013136. https:\/\/doi.org\/10.1007\/s11263-014-0733-5","journal-title":"Int J Comput Vision"},{"issue":"6","key":"20107_CR67","doi-asserted-by":"publisher","first-page":"1452","DOI":"10.1109\/TPAMI.2017.2723009","volume":"40","author":"B Zhou","year":"2018","unstructured":"Zhou B, Lapedriza A, Khosla A, Oliva A, Torralba A (2018) Places: A 10 million image database for scene recognition. IEEE Trans Pattern Anal Mach Intell 40(6):1452\u20131464. https:\/\/doi.org\/10.1109\/TPAMI.2017.2723009","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"20107_CR68","unstructured":"Simonyan K, Zisserman A (2015) Very Deep Convolutional Networks for Large-Scale Image Recognition"},{"key":"20107_CR69","unstructured":"Kalliatakis G (2017) Keras-VGG16-Places365. GitHub. https:\/\/github.com\/GKalliatakis\/Keras-VGG16-places365"},{"key":"20107_CR70","doi-asserted-by":"publisher","unstructured":"Quattoni A, Torralba A (2009) Recognizing indoor scenes. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 413\u2013420. https:\/\/doi.org\/10.1109\/CVPR.2009.5206537","DOI":"10.1109\/CVPR.2009.5206537"},{"key":"20107_CR71","doi-asserted-by":"publisher","unstructured":"Gonzalez-Franco M, Peck TC (2018) Avatar embodiment. towards a standardized questionnaire. Frontiers in Robotics and AI. 5 https:\/\/doi.org\/10.3389\/frobt.2018.00074","DOI":"10.3389\/frobt.2018.00074"},{"key":"20107_CR72","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1016\/j.cag.2019.07.003","volume":"83","author":"S Oprea","year":"2019","unstructured":"Oprea S, Martinez-Gonzalez P, Garcia-Garcia A, Castro-Vargas JA, Orts-Escolano S, Garcia-Rodriguez J (2019) A visually realistic grasping system for object manipulation and interaction in virtual reality environments. Comput Graph 83:77\u201386. https:\/\/doi.org\/10.1016\/j.cag.2019.07.003","journal-title":"Comput Graph"},{"key":"20107_CR73","doi-asserted-by":"publisher","unstructured":"Salagean A, Crellin E, Parsons M, Cosker D, Stanton\u00a0Fraser D (2023) Meeting your virtual twin: Effects of photorealism and personalization on embodiment, self-identification and perception of self-avatars in virtual reality. In: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA. https:\/\/doi.org\/10.1145\/3544548.3581182","DOI":"10.1145\/3544548.3581182"},{"key":"20107_CR74","doi-asserted-by":"publisher","unstructured":"Zaccardi S, Frantz T, Beckw\u00e9e D, Swinnen E, Jansen B (2023) On-device execution of deep learning models on hololens2 for real-time augmented reality medical applications. Sensors 23(21) https:\/\/doi.org\/10.3390\/s23218698","DOI":"10.3390\/s23218698"},{"key":"20107_CR75","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2023.101945","volume":"100","author":"JM G\u00f3rriz","year":"2023","unstructured":"G\u00f3rriz JM, \u00c1lvarez-Ill\u00e1n I et al (2023) Computational approaches to explainable artificial intelligence: Advances in theory, applications and trends. Information Fusion. 100:101945. https:\/\/doi.org\/10.1016\/j.inffus.2023.101945","journal-title":"Information Fusion."}],"container-title":["Multimedia Tools and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-024-20107-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11042-024-20107-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-024-20107-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,5]],"date-time":"2025-09-05T21:12:52Z","timestamp":1757106772000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11042-024-20107-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,27]]},"references-count":75,"journal-issue":{"issue":"22","published-online":{"date-parts":[[2025,7]]}},"alternative-id":["20107"],"URL":"https:\/\/doi.org\/10.1007\/s11042-024-20107-z","relation":{},"ISSN":["1573-7721"],"issn-type":[{"value":"1573-7721","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,8,27]]},"assertion":[{"value":"21 November 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 July 2024","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 August 2024","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 August 2024","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}