{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T15:25:31Z","timestamp":1774365931043,"version":"3.50.1"},"reference-count":40,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2021,4,30]],"date-time":"2021-04-30T00:00:00Z","timestamp":1619740800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Robotics"],"abstract":"<jats:p>In gaze-based Human-Robot Interaction (HRI), it is important to determine human visual intention for interacting with robots. One typical HRI interaction scenario is that a human selects an object by gaze and a robotic manipulator will pick up the object. In this work, we propose an approach, GazeEMD, that can be used to detect whether a human is looking at an object for HRI application. We use Earth Mover\u2019s Distance (EMD) to measure the similarity between the hypothetical gazes at objects and the actual gazes. Then, the similarity score is used to determine if the human visual intention is on the object. We compare our approach with a fixation-based method and HitScan with a run length in the scenario of selecting daily objects by gaze. Our experimental results indicate that the GazeEMD approach has higher accuracy and is more robust to noises than the other approaches. Hence, the users can lessen cognitive load by using our approach in the real-world HRI scenario.<\/jats:p>","DOI":"10.3390\/robotics10020068","type":"journal-article","created":{"date-parts":[[2021,4,30]],"date-time":"2021-04-30T05:10:55Z","timestamp":1619759455000},"page":"68","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":34,"title":["GazeEMD: Detecting Visual Intention in Gaze-Based Human-Robot Interaction"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1628-1559","authenticated-orcid":false,"given":"Lei","family":"Shi","sequence":"first","affiliation":[{"name":"Department of Electromechanics, Faculty of Applied Engineering, Campus Groenenborger, University of Antwerp, 2020 Antwerp, Belgium"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5880-6275","authenticated-orcid":false,"given":"Cosmin","family":"Copot","sequence":"additional","affiliation":[{"name":"Department of Electromechanics, Faculty of Applied Engineering, Campus Groenenborger, University of Antwerp, 2020 Antwerp, Belgium"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7975-1338","authenticated-orcid":false,"given":"Steve","family":"Vanlanduit","sequence":"additional","affiliation":[{"name":"Department of Electromechanics, Faculty of Applied Engineering, Campus Groenenborger, University of Antwerp, 2020 Antwerp, Belgium"}]}],"member":"1968","published-online":{"date-parts":[[2021,4,30]]},"reference":[{"key":"ref_1","unstructured":"Holmqvist, K., Nystr\u00f6m, M., Andersson, R., Dewhurst, R., Jarodzka, H., and Van de Weijer, J. (2011). Eye Tracking: A Comprehensive Guide to Methods and Measures, OUP."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Salvucci, D.D., and Goldberg, J.H. (2000, January 6\u20138). Identifying fixations and saccades in eye-tracking protocols. Proceedings of the 2000 Symposium on Eye Tracking Research & Applications, Palm Beach Gardens, FL, USA.","DOI":"10.1145\/355017.355028"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Santini, T., Fuhl, W., K\u00fcbler, T., and Kasneci, E. (2016, January 14\u201317). Bayesian identification of fixations, saccades, and smooth pursuits. Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research & Applications, Charleston, SC, USA.","DOI":"10.1145\/2857491.2857512"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"160","DOI":"10.3758\/s13428-017-0860-3","article-title":"Using machine learning to detect events in eye-tracking data","volume":"50","author":"Zemblys","year":"2018","journal-title":"Behav. Res. Methods"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1343","DOI":"10.1109\/LRA.2019.2895419","article-title":"Human gaze-driven spatial tasking of an autonomous MAV","volume":"4","author":"Yuan","year":"2019","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Chanel, C.P., Roy, R.N., Dehais, F., and Drougard, N. (2020). Towards Mixed-Initiative Human-Robot Interaction: Assessment of Discriminative Physiological and Behavioral Features for Performance Prediction. Sensors, 20.","DOI":"10.3390\/s20010296"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"2824","DOI":"10.1109\/TBME.2017.2677902","article-title":"3-D-gaze-based robotic grasping through mimicking human visuomotor function for people with motion impairments","volume":"64","author":"Li","year":"2017","journal-title":"IEEE Trans. Biomed. Eng."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Wang, M.Y., Kogkas, A.A., Darzi, A., and Mylonas, G.P. (2018, January 1\u20135). Free-View, 3D Gaze-Guided, Assistive Robotic System for Activities of Daily Living. Proceedings of the 2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain.","DOI":"10.1109\/IROS.2018.8594045"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Shafti, A., Orlov, P., and Faisal, A.A. (2019, January 20\u201324). Gaze-based, context-aware robotic system for assisted reaching and grasping. Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8793804"},{"key":"ref_10","first-page":"449","article-title":"A system for three-dimensional gaze fixation analysis using eye tracking glasses","volume":"5","author":"Takahashi","year":"2018","journal-title":"J. Comput. Des. Eng."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1143","DOI":"10.1109\/TCDS.2018.2821566","article-title":"Quantifying gaze behavior during real-world interactions using automated object, face, and fixation detection","volume":"10","author":"Chukoskie","year":"2018","journal-title":"IEEE Trans. Cogn. Dev. Syst."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Venuprasad, P., Dobhal, T., Paul, A., Nguyen, T.N., Gilman, A., Cosman, P., and Chukoskie, L. (2019, January 25\u201328). Characterizing joint attention behavior during real world interactions using automated object and gaze detection. Proceedings of the 11th ACM Symposium on Eye Tracking Research & Applications, Denver, CO, USA.","DOI":"10.1145\/3314111.3319843"},{"key":"ref_13","unstructured":"Jacob, R.J. (, January April). What you look at is what you get: Eye movement-based interaction techniques. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Seattle, WA, USA."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Blattgerste, J., Renner, P., and Pfeiffer, T. (2018, January 14\u201317). Advantages of eye-gaze over head-gaze-based selection in virtual and augmented reality under varying field of views. Proceedings of the Workshop on Communication by Gaze Interaction, Warsaw, Poland.","DOI":"10.1145\/3206343.3206349"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Tanriverdi, V., and Jacob, R.J. (2000, January 1\u20136). Interacting with eye movements in virtual environments. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, The Hague, The Netherlands.","DOI":"10.1145\/332040.332443"},{"key":"ref_16","unstructured":"Stellmach, S., and Dachselt, R. (May, January 27). Still looking: Investigating seamless gaze-supported selection, positioning, and manipulation of distant targets. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Paris, France."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Meena, Y.K., Cecotti, H., Wong-Lin, K., and Prasad, G. (2017, January 11\u201315). A multimodal interface to resolve the Midas-Touch problem in gaze controlled wheelchair. Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Jeju, Korea.","DOI":"10.1109\/EMBC.2017.8036971"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Chatterjee, I., Xiao, R., and Harrison, C. (2015, January 9\u201313). Gaze+ gesture: Expressive, precise and targeted free-space interactions. Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, Seattle, WA, USA.","DOI":"10.1145\/2818346.2820752"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Pfeuffer, K., Mayer, B., Mardanbegi, D., and Gellersen, H. (2017, January 16\u201317). Gaze+ pinch interaction in virtual reality. Proceedings of the 5th Symposium on Spatial User Interaction, Brighton, UK.","DOI":"10.1145\/3131277.3132180"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Istance, H., Bates, R., Hyrskykari, A., and Vickers, S. (2008, January 26\u201328). Snap clutch, a moded approach to solving the Midas touch problem. Proceedings of the 2008 Symposium on Eye Tracking Research & Applications, Savannah, GA, USA.","DOI":"10.1145\/1344471.1344523"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1","DOI":"10.16910\/jemr.7.4.4","article-title":"Human-robot interaction based on gaze gestures for the drone teleoperation","volume":"7","author":"Yu","year":"2014","journal-title":"J. Eye Mov. Res."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1016\/j.procs.2014.11.012","article-title":"New Solution to the Midas Touch Problem: Identification of Visual Commands Via Extraction of Focal Fixations","volume":"39","author":"Velichkovsky","year":"2014","journal-title":"Procedia Comput. Sci."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Krishna Sharma, V., Saluja, K., Mollyn, V., and Biswas, P. (2020, January 2\u20135). Eye gaze controlled robotic arm for persons with severe speech and motor impairment. Proceedings of the ACM Symposium on Eye Tracking Research and Applications, Stuttgart, Germany.","DOI":"10.1145\/3379155.3391324"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Araujo, J.M., Zhang, G., Hansen, J.P.P., and Puthusserypady, S. (2020, January 2\u20135). Exploring Eye-Gaze Wheelchair Control. Proceedings of the ACM Symposium on Eye Tracking Research and Applications, Stuttgart, Germany.","DOI":"10.1145\/3379157.3388933"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1131","DOI":"10.1007\/s11548-017-1580-y","article-title":"Gaze-contingent perceptually enabled interactions in the operating theatre","volume":"12","author":"Kogkas","year":"2017","journal-title":"Int. J. Comput. Assist. Radiol. Surg."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"102414","DOI":"10.1016\/j.ijhcs.2020.102414","article-title":"Comparing selection mechanisms for gaze input techniques in head-mounted displays","volume":"139","author":"Esteves","year":"2020","journal-title":"Int. J. Hum. Comput. Stud."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1023\/A:1026543900054","article-title":"The earth mover\u2019s distance as a metric for image retrieval","volume":"40","author":"Rubner","year":"2000","journal-title":"Int. J. Comput. Vis."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"739","DOI":"10.1109\/34.192468","article-title":"A unified approach to the change of resolution: Space and gray-level","volume":"11","author":"Peleg","year":"1989","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_29","unstructured":"Bazan, E., Dokl\u00e1dal, P., and Dokladalova, E. (2019, January 9\u201312). Quantitative Analysis of Similarity Measures of Distributions. Proceedings of the British Machine Vision Conferences, Cardiff, UK."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"186","DOI":"10.1109\/TCDS.2016.2558516","article-title":"Evolutionary fuzzy integral-based gaze control with preference of human gaze","volume":"8","author":"Yoo","year":"2016","journal-title":"IEEE Trans. Cogn. Dev. Syst."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2016). YOLO9000: Better, Faster, Stronger. arXiv.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014). Microsoft coco: Common objects in context. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Kassner, M., Patera, W., and Bulling, A. (2014, January 13\u201317). Pupil: An Open Source Platform for Pervasive Eye Tracking and Mobile Gaze-based Interaction. Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing, Seattle, WA, USA.","DOI":"10.1145\/2638728.2641695"},{"key":"ref_34","unstructured":"Bjelonic, M. (2019, July 06). YOLO ROS: Real-Time Object Detection for ROS. Available online: https:\/\/github.com\/leggedrobotics\/darknet_ros."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1457","DOI":"10.1080\/17470210902816461","article-title":"The 35th Sir Frederick Bartlett Lecture: Eye movements and attention in reading, scene perception, and visual search","volume":"62","author":"Rayner","year":"2009","journal-title":"Q. J. Exp. Psychol."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Ward, J.A., Lukowicz, P., and Tr\u00f6ster, G. (2006, January 10\u201311). Evaluating performance in continuous context recognition using event-driven error characterisation. Proceedings of the International Symposium on Location-and Context-Awareness, Dublin, Ireland.","DOI":"10.1007\/11752967_16"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Ward, J.A., Lukowicz, P., and Gellersen, H.W. (2011). Performance Metrics for Activity Recognition. ACM Trans. Intell. Syst. Technol., 2.","DOI":"10.1145\/1889681.1889687"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Shojaeizadeh, M., Djamasbi, S., and Trapp, A.C. (2016, January 17\u201322). Density of gaze points within a fixation and information processing behavior. Proceedings of the International Conference on Universal Access in Human-Computer Interaction, Toronto, ON, Canada.","DOI":"10.1007\/978-3-319-40250-5_44"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Wang, H., and Shi, B.E. (2019, January 25\u201328). Gaze awareness improves collaboration efficiency in a collaborative assembly task. Proceedings of the 11th ACM Symposium on Eye Tracking Research & Applications, Denver, CO, USA.","DOI":"10.1145\/3317959.3321492"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Moon, A., Troniak, D.M., Gleeson, B., Pan, M.K., Zheng, M., Blumer, B.A., MacLean, K., and Croft, E.A. (2014, January 3\u20136). Meet me where i\u2019m gazing: How shared attention gaze affects human-robot handover timing. Proceedings of the 2014 ACM\/IEEE International Conference on Human-Robot Interaction, Bielefeld, Germany.","DOI":"10.1145\/2559636.2559656"}],"container-title":["Robotics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2218-6581\/10\/2\/68\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:55:53Z","timestamp":1760162153000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2218-6581\/10\/2\/68"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4,30]]},"references-count":40,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2021,6]]}},"alternative-id":["robotics10020068"],"URL":"https:\/\/doi.org\/10.3390\/robotics10020068","relation":{},"ISSN":["2218-6581"],"issn-type":[{"value":"2218-6581","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,4,30]]}}}