{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T14:30:03Z","timestamp":1766068203589,"version":"build-2065373602"},"reference-count":67,"publisher":"Association for Computing Machinery (ACM)","issue":"1","funder":[{"name":"project Fit for Medical Robotics","award":["PNRR MUR Cod. PNC0000007\u2014CUP: B53C22006960001"],"award-info":[{"award-number":["PNRR MUR Cod. PNC0000007\u2014CUP: B53C22006960001"]}]},{"name":"Future Artificial Intelligence Research","award":["PNRR MUR Cod. PE0000013\u2014CUP: E63C22001940006"],"award-info":[{"award-number":["PNRR MUR Cod. PE0000013\u2014CUP: E63C22001940006"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. Hum.-Robot Interact."],"published-print":{"date-parts":[[2026,1,31]]},"abstract":"<jats:p>Gaze is a crucial social cue in any interacting scenario and drives many mechanisms of social cognition (joint and shared attention, predicting human intention and coordinating tasks). Gaze is an indication of social and emotional functions affecting the way the emotions are perceived. Evidence shows that embodied humanoid robots endowed with social abilities can be seen as sophisticated stimuli to study several mechanisms of human social cognition while increasing engagement and ecological validity. In this context, building a robotic perception system to automatically estimate the human gaze only relying on robot\u2019s sensors is still demanding. Main goal of the article is to propose a learning robotic architecture estimating the human gaze direction in table-top scenarios without any external hardware. Table-top tasks are largely used in experimental psychology because they are suitable to implement numerous face-to-face collaborative scenarios. Such an architecture can provide a valuable support in studies where external hardware might represent an obstacle to spontaneous human behaviour, especially in environments less controlled than the laboratory (e.g., in clinical settings). A novel dataset was also collected with the humanoid robot iCub, including images annotated from 24 participants in different gaze conditions.<\/jats:p>","DOI":"10.1145\/3758104","type":"journal-article","created":{"date-parts":[[2025,8,4]],"date-time":"2025-08-04T15:01:06Z","timestamp":1754319666000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Gaze Estimation Learning Architecture as Support to Affective, Social and Cognitive Studies in Natural Human\u2013Robot Interaction"],"prefix":"10.1145","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5792-5889","authenticated-orcid":false,"given":"Maria","family":"Lombardi","sequence":"first","affiliation":[{"name":"Humanoid Sensing and Perception, Istituto Italiano di Tecnologia, Genova, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0127-3014","authenticated-orcid":false,"given":"Elisa","family":"Maiettini","sequence":"additional","affiliation":[{"name":"Humanoid Sensing and Perception, Istituto Italiano di Tecnologia, Genova, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3323-7357","authenticated-orcid":false,"given":"Agnieszka","family":"Wykowska","sequence":"additional","affiliation":[{"name":"Social Cognition in Human-Robot Interaction, Istituto Italiano di Tecnologia, Genova, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8777-5233","authenticated-orcid":false,"given":"Lorenzo","family":"Natale","sequence":"additional","affiliation":[{"name":"Humanoid Sensing and Perception, Istituto Italiano di Tecnologia, Genova, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,10,28]]},"reference":[{"issue":"6","key":"e_1_3_8_2_2","doi-asserted-by":"crossref","first-page":"644","DOI":"10.1046\/j.0956-7976.2003.psci_1479.x","article-title":"Perceived gaze direction and the processing of facial displays of emotion","volume":"14","author":"Reginald B. Adams","year":"2003","unstructured":"Reginald B. Adams, Jr. and Robert E. Kleck. 2003. Perceived gaze direction and the processing of facial displays of emotion. Psychological Science 14, 6 (2003), 644\u2013647.","journal-title":"Psychological Science"},{"issue":"1","key":"e_1_3_8_3_2","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1037\/1528-3542.5.1.3","article-title":"Effects of direct and averted gaze on the perception of facially communicated emotion","volume":"5","author":"Reginald B. Adams","year":"2005","unstructured":"Reginald B. Adams, Jr. and Robert E. Kleck. 2005. Effects of direct and averted gaze on the perception of facially communicated emotion. Emotion 5, 1 (2005), 3.","journal-title":"Emotion"},{"issue":"2","key":"e_1_3_8_4_2","first-page":"20","article-title":"OpenFace: A general-purpose face recognition library with mobile applications","volume":"6","author":"Amos Brandon","year":"2016","unstructured":"Brandon Amos, Bartosz Ludwiczuk, and Mahadev Satyanarayanan. 2016. OpenFace: A general-purpose face recognition library with mobile applications. CMU School of Computer Science 6, 2 (2016), 20.","journal-title":"CMU School of Computer Science"},{"key":"e_1_3_8_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/3568294.3580069"},{"key":"e_1_3_8_6_2","unstructured":"Rishi Athavale Lakshmi Sritan Motati and Rohan Kalahasty. 2022. One eye is all you need: Lightweight ensembles for gaze estimation with single encoders. arXiv:2211.11936. Retrieved from https:\/\/arxiv.org\/abs\/2211.11936"},{"key":"e_1_3_8_7_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.asej.2022.101731"},{"key":"e_1_3_8_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00417"},{"key":"e_1_3_8_9_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cognition.2006.07.012"},{"key":"e_1_3_8_10_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.actpsy.2019.02.002"},{"key":"e_1_3_8_11_2","doi-asserted-by":"publisher","DOI":"10.3389\/fpsyg.2019.00560"},{"issue":"1","key":"e_1_3_8_12_2","first-page":"172","article-title":"OpenPose: Realtime multi-person 2D pose estimation using part affinity fields","volume":"43","author":"Cao Zhe","year":"2019","unstructured":"Zhe Cao, Gines Hidalgo, Tomas Simon, Shih-En Wei, and Yaser Sheikh. 2019. OpenPose: Realtime multi-person 2D pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence 43, 1 (2019), 172\u2013186.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_8_13_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v36i1.19921"},{"key":"e_1_3_8_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR56361.2022.9956687"},{"key":"e_1_3_8_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2024.3393571"},{"key":"e_1_3_8_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2020.2982828"},{"key":"e_1_3_8_17_2","article-title":"Connecting gaze, scene, and attention: Generalized attention estimation via joint modeling of gaze and scene saliency","author":"Chong Eunji","year":"2018","unstructured":"Eunji Chong, Nataniel Ruiz, Yongxin Wang, Yun Zhang, Agata Rozga, and James M. Rehg. 2018. Connecting gaze, scene, and attention: Generalized attention estimation via joint modeling of gaze and scene saliency. In Proceedings of the European Conference on Computer Vision (ECCV).","journal-title":"European Conference on Computer Vision (ECCV)"},{"key":"e_1_3_8_18_2","doi-asserted-by":"publisher","DOI":"10.3758\/s13414-014-0780-6"},{"key":"e_1_3_8_19_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.concog.2016.08.016"},{"key":"e_1_3_8_20_2","doi-asserted-by":"publisher","DOI":"10.1098\/rstb.2010.0319"},{"key":"e_1_3_8_21_2","doi-asserted-by":"publisher","DOI":"10.3758\/s13423-020-01730-x"},{"key":"e_1_3_8_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/WACV45572.2020.9093439"},{"key":"e_1_3_8_23_2","doi-asserted-by":"publisher","DOI":"10.1002\/icd.434"},{"key":"e_1_3_8_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.01413"},{"key":"e_1_3_8_25_2","doi-asserted-by":"publisher","DOI":"10.3758\/BF03208827"},{"key":"e_1_3_8_26_2","doi-asserted-by":"publisher","DOI":"10.1037\/0033-2909.133.4.694"},{"key":"e_1_3_8_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/2578153.2578190"},{"key":"e_1_3_8_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00633"},{"key":"e_1_3_8_29_2","doi-asserted-by":"publisher","DOI":"10.1146\/annurev-psych-010418-103145"},{"key":"e_1_3_8_30_2","doi-asserted-by":"publisher","DOI":"10.3389\/frobt.2024.1346714"},{"key":"e_1_3_8_31_2","doi-asserted-by":"publisher","DOI":"10.3389\/fpsyg.2018.01587"},{"key":"e_1_3_8_32_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neuropsychologia.2008.02.029"},{"issue":"5","key":"e_1_3_8_33_2","doi-asserted-by":"crossref","first-page":"e0249554","DOI":"10.1371\/journal.pone.0249554","article-title":"The persuasive power of robot touch. Behavioral and evaluative consequences of non-functional touch from a robot","volume":"16","author":"Hoffmann Laura","year":"2021","unstructured":"Laura Hoffmann and Nicole C. Kr\u00e4mer. 2021. The persuasive power of robot touch. Behavioral and evaluative consequences of non-functional touch from a robot. PLoS One 16, 5 (2021), e0249554.","journal-title":"PLoS One"},{"key":"e_1_3_8_34_2","doi-asserted-by":"publisher","DOI":"10.1111\/1467-9280.00024"},{"key":"e_1_3_8_35_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10919-020-00333-3"},{"key":"e_1_3_8_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.381"},{"key":"e_1_3_8_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR.2018.8545162"},{"key":"e_1_3_8_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00453"},{"key":"e_1_3_8_39_2","doi-asserted-by":"publisher","DOI":"10.3389\/frobt.2020.599581"},{"key":"e_1_3_8_40_2","first-page":"6","volume-title":"Proceedings of the International Conference on Learning Representations (ICLR)","volume":"5","author":"Kinga D.","year":"2015","unstructured":"D. Kinga and Jimmy Ba Adam. 2015. A method for stochastic optimization. In Proceedings of the International Conference on Learning Representations (ICLR), Vol. 5, 6."},{"key":"e_1_3_8_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCVW.2011.6130513"},{"key":"e_1_3_8_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.239"},{"key":"e_1_3_8_43_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.optlaseng.2011.12.001"},{"key":"e_1_3_8_44_2","first-page":"35","volume-title":"Proceedings of the Asian Conference on Computer Vision","author":"Lian Dongze","year":"2018","unstructured":"Dongze Lian, Zehao Yu, and Shenghua Gao. 2018. Believe it or not, we know what you are looking at! In Proceedings of the Asian Conference on Computer Vision. Springer, 35\u201350."},{"key":"e_1_3_8_45_2","doi-asserted-by":"publisher","DOI":"10.3389\/fpsyg.2021.684357"},{"key":"e_1_3_8_46_2","doi-asserted-by":"publisher","DOI":"10.3389\/frobt.2022.770165"},{"key":"e_1_3_8_47_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-023-36864-0"},{"key":"e_1_3_8_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/2857491.2857530"},{"key":"e_1_3_8_49_2","first-page":"54","volume-title":"Proceedings of the 2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)","author":"Marchesi Serena","year":"2020","unstructured":"Serena Marchesi, Jairo P\u00e9rez-Osorio, Davide De Tommaso, and Agnieszka Wykowska. 2020. Don\u2019t overthink: Fast decision making combined with behavior variability perceived as more human-like. In Proceedings of the 2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN). IEEE, 54\u201359."},{"key":"e_1_3_8_50_2","doi-asserted-by":"publisher","DOI":"10.1037\/bul0000353"},{"key":"e_1_3_8_51_2","doi-asserted-by":"publisher","DOI":"10.5772\/5761"},{"key":"e_1_3_8_52_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2010.08.010"},{"key":"e_1_3_8_53_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-8624.2007.01042.x"},{"key":"e_1_3_8_54_2","unstructured":"Cristina Palmero Javier Selva Mohammad Ali Bagheri and Sergio Escalera. 2018. Recurrent CNN for 3D gaze estimation using appearance and shape cues. arXiv:1805.03064. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.1805.03064"},{"key":"e_1_3_8_55_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01261-8_44"},{"key":"e_1_3_8_56_2","first-page":"2825","article-title":"Scikit-learn: Machine learning in Python","volume":"12","author":"Pedregosa F.","year":"2011","unstructured":"F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, et al. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12 (2011), 2825\u20132830.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_8_57_2","doi-asserted-by":"publisher","DOI":"10.1155\/2024\/2797320"},{"key":"e_1_3_8_58_2","article-title":"Where are they looking","author":"Recasens Adria","year":"2015","unstructured":"Adria Recasens, Aditya Khosla, Carl Vondrick, and Antonio Torralba. 2015. Where are they looking? In Proceedings of the Advances in Neural Information Processing Systems, Vol. 28.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_8_59_2","doi-asserted-by":"publisher","DOI":"10.4324\/9781410604194-7"},{"key":"e_1_3_8_60_2","doi-asserted-by":"publisher","DOI":"10.1038\/nn1150"},{"issue":"2","key":"e_1_3_8_61_2","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1109\/TBME.2010.2087330","article-title":"Iris center corneal reflection method for gaze tracking using visible light","volume":"58","author":"Sigut Jose","year":"2010","unstructured":"Jose Sigut and Sid-Ahmed Sidha. 2010. Iris center corneal reflection method for gaze tracking using visible light. IEEE Transactions on Bio-Medical Engineering 58, 2 (2010), 411\u2013419.","journal-title":"IEEE Transactions on Bio-Medical Engineering"},{"key":"e_1_3_8_62_2","unstructured":"Evangelos Ververas Polydefkis Gkagkos Jiankang Deng Michail Christos Doukas Jia Guo and Stefanos Zafeiriou. 2022. 3DGazeNet: Generalizing gaze estimation with weak-supervision from synthetic views. arXiv:2212.02997. Retrieved from https:\/\/arxiv.org\/abs\/2212.02997"},{"key":"e_1_3_8_63_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01877"},{"key":"e_1_3_8_64_2","doi-asserted-by":"publisher","DOI":"10.1007\/s12369-020-00674-5"},{"key":"e_1_3_8_65_2","doi-asserted-by":"publisher","DOI":"10.1177\/0963721420978609"},{"key":"e_1_3_8_66_2","doi-asserted-by":"publisher","DOI":"10.3389\/fpsyg.2022.1036530"},{"key":"e_1_3_8_67_2","first-page":"51","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops","author":"Zhang Xucong","year":"2017","unstructured":"Xucong Zhang, Yusuke Sugano, Mario Fritz, and Andreas Bulling. 2017. It\u2019s written all over your face: Full-face appearance-based gaze estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 51\u201360."},{"key":"e_1_3_8_68_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-019-01263-4"}],"container-title":["ACM Transactions on Human-Robot Interaction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3758104","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,29]],"date-time":"2025-10-29T15:16:46Z","timestamp":1761751006000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3758104"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,28]]},"references-count":67,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2026,1,31]]}},"alternative-id":["10.1145\/3758104"],"URL":"https:\/\/doi.org\/10.1145\/3758104","relation":{},"ISSN":["2573-9522"],"issn-type":[{"type":"electronic","value":"2573-9522"}],"subject":[],"published":{"date-parts":[[2025,10,28]]},"assertion":[{"value":"2024-08-19","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-07-21","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-10-28","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}