{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,12]],"date-time":"2026-06-12T16:45:59Z","timestamp":1781282759180,"version":"3.54.1"},"reference-count":31,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2023,9,5]],"date-time":"2023-09-05T00:00:00Z","timestamp":1693872000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,9,5]],"date-time":"2023-09-05T00:00:00Z","timestamp":1693872000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100004837","name":"Ministerio de Ciencia e Innovaci\u00f3n","doi-asserted-by":"publisher","award":["RTC2019-007350-1"],"award-info":[{"award-number":["RTC2019-007350-1"]}],"id":[{"id":"10.13039\/501100004837","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100007515","name":"Universidad de Valladolid","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100007515","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Virtual Reality"],"published-print":{"date-parts":[[2023,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Real-time hand segmentation is a key process in applications that require human\u2013computer interaction, such as gesture recognition or augmented reality systems. However, the infinite shapes and orientations that hands can adopt, their variability in skin pigmentation and the self-occlusions that continuously appear in images make hand segmentation a truly complex problem, especially with uncontrolled lighting conditions and backgrounds. The development of robust, real-time hand segmentation algorithms is essential to achieve immersive augmented reality and mixed reality experiences by correctly interpreting collisions and occlusions. In this paper, we present a simple but powerful algorithm based on the MediaPipe Hands solution, a highly optimized neural network. The algorithm processes the landmarks provided by MediaPipe using morphological and logical operators to obtain the masks that allow dynamic updating of the skin color model. Different experiments were carried out comparing the influence of the color space on skin segmentation, with the CIELab color space chosen as the best option. An average intersection over union of 0.869 was achieved on the demanding Ego2Hands dataset running at 90 frames per second on a conventional computer without any hardware acceleration. Finally, the proposed segmentation procedure was implemented in an augmented reality application to add hand occlusion for improved user immersion. An open-source implementation of the algorithm is publicly available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/itap-robotica-medica\/lightweight-hand-segmentation\">https:\/\/github.com\/itap-robotica-medica\/lightweight-hand-segmentation<\/jats:ext-link>.<\/jats:p>","DOI":"10.1007\/s10055-023-00858-0","type":"journal-article","created":{"date-parts":[[2023,9,5]],"date-time":"2023-09-05T13:02:22Z","timestamp":1693918942000},"page":"3125-3132","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":34,"title":["Lightweight real-time hand segmentation leveraging MediaPipe landmark detection"],"prefix":"10.1007","volume":"27","author":[{"given":"Guillermo","family":"S\u00e1nchez-Brizuela","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1556-7179","authenticated-orcid":false,"given":"Ana","family":"Cisnal","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Eusebio","family":"de la Fuente-L\u00f3pez","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Juan-Carlos","family":"Fraile","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Javier","family":"P\u00e9rez-Turiel","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2023,9,5]]},"reference":[{"issue":"11","key":"858_CR1","doi-asserted-by":"publisher","first-page":"2274","DOI":"10.1109\/TPAMI.2012.120","volume":"34","author":"R Achanta","year":"2012","unstructured":"Achanta R, Shaji A, Smith K, Lucchi A, Fua P, S\u00fcsstrunk S (2012) Slic superpixels compared to state-of-the-art superpixel methods. IEEE Trans Pattern Anal Mach Intell 34(11):2274\u20132282. https:\/\/doi.org\/10.1109\/TPAMI.2012.120","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"858_CR2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2019.112922","volume":"141","author":"M Arsalan","year":"2020","unstructured":"Arsalan M, Kim DS, Owais M, Park KR (2020) Or-skip-net: outer residual skip network for skin segmentation in non-ideal situations. Expert Syst Appl 141:112922. https:\/\/doi.org\/10.1016\/j.eswa.2019.112922","journal-title":"Expert Syst Appl"},{"key":"858_CR3","doi-asserted-by":"publisher","unstructured":"Bambach S, Lee S, Crandall DJ, Yu C (2015) Lending a hand: Detecting hands and recognizing activities in complex egocentric interactions. In: 2015 IEEE International conference on computer vision (ICCV), pp. 1949\u20131957. https:\/\/doi.org\/10.1109\/ICCV.2015.226","DOI":"10.1109\/ICCV.2015.226"},{"issue":"5","key":"858_CR4","doi-asserted-by":"publisher","first-page":"2705","DOI":"10.1109\/JSEN.2015.2411994","volume":"15","author":"L Baraldi","year":"2015","unstructured":"Baraldi L, Paci F, Serra G, Benini L, Cucchiara R (2015) Gesture recognition using wearable vision sensors to enhance visitors\u2019 museum experiences. IEEE Sens J 15(5):2705\u20132714. https:\/\/doi.org\/10.1109\/JSEN.2015.2411994","journal-title":"IEEE Sens J"},{"key":"858_CR5","doi-asserted-by":"publisher","unstructured":"Cai M, Lu F, Sato Y (2020) Generalizing hand segmentation in egocentric videos with uncertainty-guided model adaptation. In: 2020 IEEE\/CVF conference on computer vision and pattern recognition (CVPR), pp. 14380\u201314389. https:\/\/doi.org\/10.1109\/CVPR42600.2020.01440","DOI":"10.1109\/CVPR42600.2020.01440"},{"issue":"1","key":"858_CR6","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1049\/iet-cvi.2017.0052","volume":"12","author":"BK Chakraborty","year":"2018","unstructured":"Chakraborty BK, Sarma D, Bhuyan MK, MacDorman KF (2018) Review of constraints on vision-based gesture recognition for human-computer interaction. IET Comput Vis 12(1):3\u201315. https:\/\/doi.org\/10.1049\/iet-cvi.2017.0052","journal-title":"IET Comput Vis"},{"key":"858_CR7","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1155\/2020\/8953670","volume":"2020","author":"J Cheng","year":"2020","unstructured":"Cheng J, Wei F, Liu Y, Li C, Chen Q, Chen X (2020) Chinese sign language recognition based on dtw-distance-mapping features. Math Probl Eng 2020:1\u201313. https:\/\/doi.org\/10.1155\/2020\/8953670","journal-title":"Math Probl Eng"},{"issue":"1145\/3306346","key":"858_CR8","first-page":"3322957","volume":"10","author":"O Glauser","year":"2019","unstructured":"Glauser O, Wu S, Panozzo D, Hilliges O, Sorkine-Hornung O (2019) Interactive hand pose estimation using a stretch-sensing soft glove. ACM Trans Graph 10(1145\/3306346):3322957","journal-title":"ACM Trans Graph"},{"key":"858_CR9","doi-asserted-by":"publisher","unstructured":"Kang B, Tan K-H, Jiang N, Tai H-S, Tretter D, Nguyen T (2017) Hand segmentation for hand-object interaction from depth map. In: 2017 IEEE global conference on signal and information processing (GlobalSIP), pp. 259\u2013263. https:\/\/doi.org\/10.1109\/GlobalSIP.2017.8308644","DOI":"10.1109\/GlobalSIP.2017.8308644"},{"issue":"4","key":"858_CR10","first-page":"30","volume":"3","author":"A Kaur","year":"2012","unstructured":"Kaur A, Kranthi B (2012) Comparison between ycbcr color space and cielab color space for skin color segmentation. Int J Appl Inf Syst 3(4):30\u201333","journal-title":"Int J Appl Inf Syst"},{"key":"858_CR11","doi-asserted-by":"publisher","unstructured":"Khan AU, Borji A (2018) Analysis of hand segmentation in the wild. In: 2018 IEEE\/CVF conference on computer vision and pattern recognition, pp. 4710\u20134719. https:\/\/doi.org\/10.1109\/CVPR.2018.00495","DOI":"10.1109\/CVPR.2018.00495"},{"key":"858_CR12","doi-asserted-by":"publisher","unstructured":"Li C, Kitani KM (2013) Pixel-level hand detection in ego-centric videos. In: 2013 IEEE conference on computer vision and pattern recognition, pp. 3570\u20133577. https:\/\/doi.org\/10.1109\/CVPR.2013.458","DOI":"10.1109\/CVPR.2013.458"},{"key":"858_CR13","doi-asserted-by":"publisher","unstructured":"Lim G, Jatesiktat P, Ang W (2020) MobileHand: real-time 3D hand shape and pose estimation from color image, pp. 450\u2013459. https:\/\/doi.org\/10.1007\/978-3-030-63820-7_52","DOI":"10.1007\/978-3-030-63820-7_52"},{"key":"858_CR14","unstructured":"Lin F, Martinez TR (2020) Ego2hands: a dataset for egocentric two-hand segmentation and detection. ArXiv arXiv:2011.07252"},{"issue":"5","key":"858_CR15","doi-asserted-by":"publisher","first-page":"1191","DOI":"10.1109\/TPAMI.2019.2892416","volume":"42","author":"X Liu","year":"2020","unstructured":"Liu X, Zhu X, Li M, Wang L, Zhu E, Liu T, Kloft M, Shen D, Yin J, Gao W (2020) Multiple kernel kk-means with incomplete kernels. IEEE Trans Pattern Anal Mach Intell 42(5):1191\u20131204. https:\/\/doi.org\/10.1109\/TPAMI.2019.2892416","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"1","key":"858_CR16","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1016\/j.jid.2019.11.003","volume":"140","author":"BCK Ly","year":"2020","unstructured":"Ly BCK, Dyer EB, Feig JL, Chien AL, Del Bino S (2020) Research techniques made simple: cutaneous colorimetry: a reliable technique for objective skin color measurement. J Investig Dermatol 140(1):3\u2013121. https:\/\/doi.org\/10.1016\/j.jid.2019.11.003","journal-title":"J Investig Dermatol"},{"key":"858_CR17","doi-asserted-by":"publisher","unstructured":"Montenegro J, G\u00f3mez W, S\u00e1nchez-Orellana P (2013) A comparative study of color spaces in skin-based face segmentation. In: 2013 10th International conference on electrical engineering, computing science and automatic control (CCE), pp. 313\u2013317. https:\/\/doi.org\/10.1109\/ICEEE.2013.6676048","DOI":"10.1109\/ICEEE.2013.6676048"},{"key":"858_CR18","doi-asserted-by":"crossref","unstructured":"Seeber M, Oswald MR, Poranne R (2021) Realistichands: a hybrid model for 3d hand reconstruction. In: 2021 International conference on 3D vision (3DV). 22\u201331","DOI":"10.1109\/3DV53792.2021.00013"},{"key":"858_CR19","doi-asserted-by":"publisher","unstructured":"Shilkrot R, Narasimhaswamy S, Vazir S, Nguyen MH (2019) Working hands: a hand-tool assembly dataset for image segmentation and activity mining. In: Proceedings of the British machine vision conference (BMVC), pp. 1\u201312. https:\/\/doi.org\/10.5244\/C.33.171","DOI":"10.5244\/C.33.171"},{"key":"858_CR20","doi-asserted-by":"publisher","DOI":"10.3390\/s21175856","author":"J Shin","year":"2021","unstructured":"Shin J, Matsuoka A, Hasan MAM, Srizon AY (2021) American sign language alphabet recognition by extracting feature from hand pose estimation. Sensors. https:\/\/doi.org\/10.3390\/s21175856","journal-title":"Sensors"},{"issue":"9","key":"858_CR21","doi-asserted-by":"publisher","first-page":"25","DOI":"10.5815\/ijigsp.2019.09.03","volume":"11","author":"PM Thwe","year":"2019","unstructured":"Thwe PM, Yu MT (2019) Analysis on skin colour model using adaptive threshold values for hand segmentation. Int J Image Graph Signal Process 11(9):25\u201333. https:\/\/doi.org\/10.5815\/ijigsp.2019.09.03","journal-title":"Int J Image Graph Signal Process"},{"key":"858_CR22","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.neucom.2022.04.079","volume":"495","author":"T-H Tsai","year":"2022","unstructured":"Tsai T-H, Huang S-A (2022) Refined u-net: a new semantic technique on hand segmentation. Neurocomputing 495:1\u201310. https:\/\/doi.org\/10.1016\/j.neucom.2022.04.079","journal-title":"Neurocomputing"},{"key":"858_CR23","doi-asserted-by":"publisher","unstructured":"Wang W, Yu K, Hugonot J, Fua P, Salzmann M (2019) Recurrent u-net for resource-constrained segmentation. In: 2019 IEEE\/CVF International conference on computer vision (ICCV), pp. 2142\u20132151. IEEE Computer Society, Los Alamitos, CA, USA. https:\/\/doi.org\/10.1109\/ICCV.2019.00223","DOI":"10.1109\/ICCV.2019.00223"},{"key":"858_CR24","doi-asserted-by":"publisher","DOI":"10.1016\/j.bspc.2022.104089","volume":"79","author":"F Xiao","year":"2023","unstructured":"Xiao F, Zhang Z, Liu C, Wang Y (2023) Human motion intention recognition method with visual, audio, and surface electromyography modalities for a mechanical hand in different environments. Biomed Signal Process Control 79:104089. https:\/\/doi.org\/10.1016\/j.bspc.2022.104089","journal-title":"Biomed Signal Process Control"},{"key":"858_CR25","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijpvp.2020.104249","volume":"189","author":"X Yu","year":"2021","unstructured":"Yu X, Lu Y, Gao Q (2021) Pipeline image diagnosis algorithm based on neural immune ensemble learning. Int J Press Vessels Pip 189:104249. https:\/\/doi.org\/10.1016\/j.ijpvp.2020.104249","journal-title":"Int J Press Vessels Pip"},{"key":"858_CR26","doi-asserted-by":"publisher","DOI":"10.1016\/j.dsp.2022.103442","author":"X Yu","year":"2022","unstructured":"Yu X, Ye X, Zhang S (2022) Floating pollutant image target extraction algorithm based on immune extremum region. Digit Signal Process. https:\/\/doi.org\/10.1016\/j.dsp.2022.103442","journal-title":"Digit Signal Process"},{"issue":"3","key":"858_CR27","first-page":"435","volume":"45","author":"Q Zhang","year":"2018","unstructured":"Zhang Q, Yang M, Kpalma K, Zheng Q, Zhang X (2018) Segmentation of hand posture against complex backgrounds based on saliency and skin colour detection. IAENG Int J Comput Sci 45(3):435\u2013444","journal-title":"IAENG Int J Comput Sci"},{"key":"858_CR28","unstructured":"Zhang F, Bazarevsky V, Vakunov A, Tkachenka A, Sung G, Chang C-L, Grundmann M (2020) Mediapipe hands: on-device real-time hand tracking. arXiv arXiv:2006.10214"},{"key":"858_CR29","doi-asserted-by":"publisher","DOI":"10.1186\/s13640-018-0262-1","author":"YL Zhao","year":"2018","unstructured":"Zhao YL, Quan C (2018) Coarse-to-fine online learning for hand segmentation in egocentric video. J Image Video Proc. https:\/\/doi.org\/10.1186\/s13640-018-0262-1","journal-title":"J Image Video Proc"},{"key":"858_CR30","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1186\/s13640-018-0262-1","volume":"2018","author":"Y Zhao","year":"2018","unstructured":"Zhao Y, Luo Z, Quan C (2018) Coarse-to-fine online learning for hand segmentation in egocentric video. EURASIP J Image Video Process 2018:20. https:\/\/doi.org\/10.1186\/s13640-018-0262-1","journal-title":"EURASIP J Image Video Process"},{"key":"858_CR31","doi-asserted-by":"publisher","first-page":"95","DOI":"10.1016\/j.cviu.2015.07.008","volume":"141","author":"X Zhu","year":"2015","unstructured":"Zhu X, Jia X, Wong K-YK (2015) Structured forests for pixel-level hand detection and hand part labelling. Comput Vis Image Underst 141:95\u2013107. https:\/\/doi.org\/10.1016\/j.cviu.2015.07.008","journal-title":"Comput Vis Image Underst"}],"container-title":["Virtual Reality"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10055-023-00858-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10055-023-00858-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10055-023-00858-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,29]],"date-time":"2023-11-29T10:14:20Z","timestamp":1701252860000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10055-023-00858-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,5]]},"references-count":31,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,12]]}},"alternative-id":["858"],"URL":"https:\/\/doi.org\/10.1007\/s10055-023-00858-0","relation":{},"ISSN":["1359-4338","1434-9957"],"issn-type":[{"value":"1359-4338","type":"print"},{"value":"1434-9957","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,5]]},"assertion":[{"value":"19 December 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 August 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 September 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no competing interests to declare that are relevant to the content of this article.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}