{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T11:16:08Z","timestamp":1773141368598,"version":"3.50.1"},"reference-count":86,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2025,4,18]],"date-time":"2025-04-18T00:00:00Z","timestamp":1744934400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/"}],"funder":[{"name":"National Science Foundation","award":["IIS-1924802, IIS-2143109, and IIS-2106690"],"award-info":[{"award-number":["IIS-1924802, IIS-2143109, and IIS-2106690"]}]},{"DOI":"10.13039\/100006785","name":"Google","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100006785","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. Hum.-Robot Interact."],"published-print":{"date-parts":[[2025,6,30]]},"abstract":"<jats:p>\n            Understanding human perceptions of robot performance is crucial for designing socially intelligent robots that can adapt to human expectations. Current approaches often rely on surveys, which can disrupt ongoing human\u2013robot interactions. As an alternative, we explore predicting people\u2019s perceptions of robot performance using non-verbal behavioral cues and machine learning techniques. We contribute the SEAN TOGETHER Dataset consisting of observations of an interaction between a person and a mobile robot in Virtual Reality, together with perceptions of robot performance provided by users on a 5-point scale. We then analyze how well humans and supervised learning techniques can predict perceived robot performance based on different observation types (like facial expression and spatial behavior features). Our results suggest that facial expressions alone provide useful information, but in the navigation scenarios that we considered, reasoning about spatial features in context is critical for the prediction task. Also, supervised learning techniques outperformed humans\u2019 predictions in most cases. Further, when predicting robot performance as a binary classification task on unseen users\u2019 data, the\n            <jats:inline-formula content-type=\"math\/tex\">\n              <jats:tex-math notation=\"LaTeX\" version=\"MathJax\">\\(F_{1}\\)<\/jats:tex-math>\n            <\/jats:inline-formula>\n            -Score of machine learning models more than doubled that of predictions on a 5-point scale. This suggested good generalization capabilities, particularly in identifying performance directionality over exact ratings. Based on these findings, we conducted a real-world demonstration where a mobile robot uses a machine learning model to predict how a human who follows it perceives it. Finally, we discuss the implications of our results for implementing these supervised learning models in real-world navigation. Our work paves the path to automatically enhancing robot behavior based on observations of users and inferences about their perceptions of a robot.\n          <\/jats:p>","DOI":"10.1145\/3719020","type":"journal-article","created":{"date-parts":[[2025,2,27]],"date-time":"2025-02-27T14:50:06Z","timestamp":1740667806000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Predicting Human Perceptions of Robot Performance during Navigation Tasks"],"prefix":"10.1145","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8535-2771","authenticated-orcid":false,"given":"Qiping","family":"Zhang","sequence":"first","affiliation":[{"name":"Yale University, New Haven, CT, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0823-4859","authenticated-orcid":false,"given":"Nathan","family":"Tsoi","sequence":"additional","affiliation":[{"name":"Yale University, New Haven, CT, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-5127-1051","authenticated-orcid":false,"given":"Mofeed","family":"Nagib","sequence":"additional","affiliation":[{"name":"Yale University, New Haven, CT, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-7609-7062","authenticated-orcid":false,"given":"Booyeon","family":"Choi","sequence":"additional","affiliation":[{"name":"Yale University, New Haven, CT, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7947-0333","authenticated-orcid":false,"given":"Jie","family":"Tan","sequence":"additional","affiliation":[{"name":"Google DeepMind, Google Inc, Mountain View, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-3698-642X","authenticated-orcid":false,"given":"Hao-Tien Lewis","family":"Chiang","sequence":"additional","affiliation":[{"name":"Google DeepMind, Google Inc, Mountain View, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0698-5472","authenticated-orcid":false,"given":"Marynel","family":"V\u00e1zquez","sequence":"additional","affiliation":[{"name":"Yale University, New Haven, CT, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,4,18]]},"reference":[{"key":"e_1_3_4_2_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijhcs.2021.102744"},{"key":"e_1_3_4_3_2","first-page":"671","volume-title":"Conference on Robot Learning","author":"Anderson Peter","year":"2021","unstructured":"Peter Anderson, Ayush Shrivastava, Joanne Truong, Arjun Majumdar, Devi Parikh, Dhruv Batra, and Stefan Lee. 2021. Sim-to-real transfer for vision-and-language navigation. In Conference on Robot Learning. PMLR, 671\u2013681."},{"key":"e_1_3_4_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/IROS47612.2022.9981754"},{"key":"e_1_3_4_5_2","volume-title":"Proceedings of RSS '18 Towards a Framework for Joint Action Workshop","author":"Aronson Reuben M.","year":"2018","unstructured":"Reuben M. Aronson and Henny Admoni. 2018. Gaze for error detection during human-robot shared manipulation. In Proceedings of RSS '18 Towards a Framework for Joint Action Workshop."},{"key":"e_1_3_4_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/RO-MAN50785.2021.9515472"},{"key":"e_1_3_4_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/ROMAN.2014.6926389"},{"key":"e_1_3_4_8_2","unstructured":"Yuntao Bai Andy Jones Kamal Ndousse Amanda Askell Anna Chen Nova DasSarma Dawn Drain Stanislav Fort Deep Ganguli Tom Henighan et al. 2022. Training a helpful and harmless assistant with reinforcement learning from human feedback. arXiv:2204.05862. Retrieved from https:\/\/arxiv.org\/abs\/2204.05862"},{"key":"e_1_3_4_9_2","doi-asserted-by":"publisher","DOI":"10.1017\/9781108676649"},{"key":"e_1_3_4_10_2","unstructured":"Peter W. Battaglia Jessica B. Hamrick Victor Bapst Alvaro Sanchez-Gonzalez Vinicius Zambaldi Mateusz Malinowski Andrea Tacchetti David Raposo Adam Santoro Ryan Faulkner et al. 2018. Relational inductive biases deep learning and graph networks. arXiv:1806.01261. Retrieved from https:\/\/arxiv.org\/abs\/1806.01261"},{"key":"e_1_3_4_11_2","first-page":"21","volume-title":"Computer Vision and Pattern Recognition Workshops (CVPR)","author":"Bera Aniket","year":"2019","unstructured":"Aniket Bera, Tanmay Randhavane, and Dinesh Manocha. 2019. Improving socially-aware multi-channel human emotion prediction for robot navigation. In Computer Vision and Pattern Recognition Workshops (CVPR), 21\u201327."},{"key":"e_1_3_4_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8794310"},{"key":"e_1_3_4_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/HRI53351.2022.9889650"},{"key":"e_1_3_4_14_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-66435-4_5"},{"key":"e_1_3_4_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/RO-MAN50785.2021.9515407"},{"key":"e_1_3_4_16_2","doi-asserted-by":"publisher","DOI":"10.5898\/JHRI.2.1.Breazeal"},{"key":"e_1_3_4_17_2","doi-asserted-by":"publisher","DOI":"10.5555\/3545946.3598652"},{"key":"e_1_3_4_18_2","doi-asserted-by":"crossref","unstructured":"Kate Candon Nicholas C. Georgiou Helen Zhou Sidney Richardson Qiping Zhang Brian Scassellati and Marynel V\u00e1zquez. 2024. REACT: Two datasets for analyzing both human reactions and evaluative feedback to robots over time. arXiv:2402.00190. Retrieved from https:\/\/arxiv.org\/abs\/2402.00190","DOI":"10.1145\/3610977.3637480"},{"key":"e_1_3_4_19_2","first-page":"254","volume-title":"2017 ACM\/IEEE International Conference on Human-Robot Interaction","author":"Carpinella Colleen M.","year":"2017","unstructured":"Colleen M. Carpinella, Alisa B. Wyman, Michael A. Perez, and Steven J. Stroessner. 2017. The robotic social attributes scale (RoSAS) development and validation. In 2017 ACM\/IEEE International Conference on Human-Robot Interaction, 254\u2013262."},{"key":"e_1_3_4_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2020.2964824"},{"key":"e_1_3_4_21_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-24349-3_9"},{"key":"e_1_3_4_22_2","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1907856118"},{"key":"e_1_3_4_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2021.3068769"},{"key":"e_1_3_4_24_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2021\/599"},{"key":"e_1_3_4_25_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i18.17998"},{"key":"e_1_3_4_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/2696454.2696473"},{"key":"e_1_3_4_27_2","doi-asserted-by":"publisher","DOI":"10.5555\/2447556.2447672"},{"key":"e_1_3_4_28_2","unstructured":"Anthony Francis Claudia P\u00e9rez-d\u2019Arpino Chengshu Li Fei Xia Alexandre Alahi Rachid Alami Aniket Bera Abhijat Biswas Joydeep Biswas Rohan Chandra et al. 2023. Principles and guidelines for evaluating social robot navigation algorithms. arXiv:2306.16740. Retrieved from https:\/\/arxiv.org\/abs\/2306.16740"},{"key":"e_1_3_4_29_2","doi-asserted-by":"publisher","DOI":"10.3389\/frobt.2021.721317"},{"key":"e_1_3_4_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/1228716.1228720"},{"key":"e_1_3_4_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2006.889486"},{"key":"e_1_3_4_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/IROS45743.2020.9341607"},{"key":"e_1_3_4_33_2","doi-asserted-by":"publisher","DOI":"10.5772\/6180"},{"key":"e_1_3_4_34_2","volume":"609","author":"Hall Edmund T.","year":"1966","unstructured":"Edmund T. Hall and Edward T. Hall. 1966. The Hidden Dimension (Vol. 609). Anchor.","journal-title":"The Hidden Dimension"},{"key":"e_1_3_4_35_2","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1977.10480998"},{"key":"e_1_3_4_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_4_37_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-63596-0_2"},{"key":"e_1_3_4_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3133612"},{"key":"e_1_3_4_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3411764.3445270"},{"key":"e_1_3_4_40_2","doi-asserted-by":"publisher","DOI":"10.1080\/01691864.2021.1928551"},{"key":"e_1_3_4_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/3173574.3173939"},{"key":"e_1_3_4_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2022.3184025"},{"key":"e_1_3_4_43_2","volume-title":"Erving Goffman: Exploring the Interaction Order","author":"Kendon Adam","year":"1988","unstructured":"Adam Kendon. 1988. Goffman\u2019s approach to face-to-face interaction. In Erving Goffman: Exploring the Interaction Order. Paul Drew and Anthony J. Wootton (Eds.), Polity Press."},{"key":"e_1_3_4_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/HRI.2013.6483597"},{"key":"e_1_3_4_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/2909824.3020226"},{"key":"e_1_3_4_46_2","doi-asserted-by":"publisher","DOI":"10.1145\/1597735.1597738"},{"key":"e_1_3_4_47_2","unstructured":"Alexander Lew Sydney Thompson Nathan Tsoi and Marynel V\u00e1zquez. 2023. Shutter the robot photographer: Leveraging behavior trees for public in-the-wild human-robot interactions. arXiv:2302.00191. Retrieved from https:\/\/arxiv.org\/abs\/2302.00191"},{"key":"e_1_3_4_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/HRI.2019.8673116"},{"key":"e_1_3_4_49_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.3006254"},{"key":"e_1_3_4_50_2","doi-asserted-by":"publisher","DOI":"10.1109\/IROS40897.2019.8968191"},{"key":"e_1_3_4_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2014.6942636"},{"key":"e_1_3_4_52_2","first-page":"2285","volume-title":"International Conference on Machine Learning. PMLR","author":"MacGlashan James","year":"2017","unstructured":"James MacGlashan, Mark K. Ho, Robert Loftin, Bei Peng, Guan Wang, David L. Roberts, Matthew E. Taylor, and Michael L. Littman. 2017. Interactive learning from policy-dependent human feedback. In International Conference on Machine Learning. PMLR, 2285\u20132294."},{"key":"e_1_3_4_53_2","unstructured":"Roberto Mart\u00edn-Mart\u00edn Hamid Rezatofighi Abhijeet Shenoi Mihir Patel J. Gwak Nathan Dass Alan Federman Patrick Goebel and Silvio Savarese. 2019. JRDB: A dataset and benchmark for visual perception for navigation in human environments. arXiv:1910.11792. Retrieved from https:\/\/arxiv.org\/abs\/1910.11792"},{"key":"e_1_3_4_54_2","doi-asserted-by":"publisher","DOI":"10.1145\/3495244"},{"key":"e_1_3_4_55_2","doi-asserted-by":"publisher","DOI":"10.1145\/3583741"},{"key":"e_1_3_4_56_2","doi-asserted-by":"publisher","DOI":"10.5555\/3523760.3523831"},{"key":"e_1_3_4_57_2","doi-asserted-by":"publisher","DOI":"10.1177\/1071181320641506"},{"key":"e_1_3_4_58_2","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2008.926867"},{"key":"e_1_3_4_59_2","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2016.7759741"},{"key":"e_1_3_4_60_2","unstructured":"S\u00f6ren Pirk Edward Lee Xuesu Xiao Leila Takayama Anthony Francis and Alexander Toshev. 2022. A protocol for validating social navigation policies. arXiv:2204.05443. Retrieved from https:\/\/arxiv.org\/abs\/2204.05443"},{"key":"e_1_3_4_61_2","first-page":"55","volume-title":"Neural Networks: Tricks of the Trade","author":"Prechelt Lutz","year":"2002","unstructured":"Lutz Prechelt. 2002. Early stopping-but when? In Neural Networks: Tricks of the Trade. Genevieve B. Orr and Klaus-Robert M\u00fcller (Eds.), Springer, 55\u201369."},{"key":"e_1_3_4_62_2","volume-title":"ICRA Workshop on Open Source Software","volume":"3","author":"Quigley Morgan","year":"2009","unstructured":"Morgan Quigley, Ken Conley, Brian Gerkey, Josh Faust, Tully Foote, Jeremy Leibs, Rob Wheeler, and Andrew Ng. 2009. ROS: An open-source Robot Operating System. ICRA Workshop on Open Source Software, 3."},{"key":"e_1_3_4_63_2","unstructured":"Claire Rivoire and Angelica Lim. 2016. The delicate balance of boring and annoying: Learning proactive timing in long-term human robot interaction."},{"key":"e_1_3_4_64_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.robot.2022.104047"},{"key":"e_1_3_4_65_2","first-page":"1","article-title":"Planning for autonomous cars that leverage effects on human actions","volume":"2","author":"Sadigh Dorsa","year":"2016","unstructured":"Dorsa Sadigh, Shankar Sastry, Sanjit A. Seshia, and Anca D. Dragan. 2016. Planning for autonomous cars that leverage effects on human actions. In Robotics: Science and Systems (Vol. 2). Ann Arbor, MI, USA, 1\u20139.","journal-title":"Robotics: Science and Systems"},{"key":"e_1_3_4_66_2","doi-asserted-by":"publisher","DOI":"10.1109\/MTS.2018.2795095"},{"key":"e_1_3_4_67_2","doi-asserted-by":"publisher","DOI":"10.1109\/HRI.2010.5453270"},{"key":"e_1_3_4_68_2","doi-asserted-by":"publisher","DOI":"10.1145\/3536221.3557028"},{"key":"e_1_3_4_69_2","doi-asserted-by":"publisher","DOI":"10.1109\/IROS47612.2022.9981726"},{"key":"e_1_3_4_70_2","doi-asserted-by":"publisher","DOI":"10.1145\/3568162.3576990"},{"key":"e_1_3_4_71_2","volume-title":"Generalized Linear Mixed Models: Modern Concepts, Methods and Applications","author":"Stroup Walter W.","year":"2012","unstructured":"Walter W. Stroup. 2012. Generalized Linear Mixed Models: Modern Concepts, Methods and Applications. CRC Press."},{"key":"e_1_3_4_72_2","unstructured":"Aamodh Suresh Angelique Taylor Laurel D. Riek and Sonia Martinez. 2023. Robot navigation in risky crowded environments: Understanding human preferences. arXiv:2303.08284. Retrieved from https:\/\/arxiv.org\/abs\/2303.08284"},{"key":"e_1_3_4_73_2","doi-asserted-by":"publisher","DOI":"10.1109\/HRI.2019.8673304"},{"key":"e_1_3_4_74_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2007.09.009"},{"key":"e_1_3_4_75_2","doi-asserted-by":"publisher","DOI":"10.1145\/3439720"},{"key":"e_1_3_4_76_2","doi-asserted-by":"publisher","DOI":"10.5555\/3109829.3109831"},{"key":"e_1_3_4_77_2","doi-asserted-by":"publisher","DOI":"10.1177\/0278364914557874"},{"key":"e_1_3_4_78_2","doi-asserted-by":"publisher","DOI":"10.1109\/IROS51168.2021.9636319"},{"key":"e_1_3_4_79_2","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2022.3196783"},{"key":"e_1_3_4_80_2","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295349"},{"key":"e_1_3_4_81_2","doi-asserted-by":"publisher","DOI":"10.1109\/IROS45743.2020.9341209"},{"key":"e_1_3_4_82_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2008.11.007"},{"key":"e_1_3_4_83_2","doi-asserted-by":"publisher","DOI":"10.1109\/RO-MAN53752.2022.9900589"},{"key":"e_1_3_4_84_2","doi-asserted-by":"publisher","DOI":"10.1145\/3313831.3376810"},{"key":"e_1_3_4_85_2","doi-asserted-by":"publisher","DOI":"10.1145\/3568162.3576986"},{"key":"e_1_3_4_86_2","doi-asserted-by":"publisher","DOI":"10.1145\/3568294.3580039"},{"key":"e_1_3_4_87_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00783"}],"container-title":["ACM Transactions on Human-Robot Interaction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3719020","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3719020","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:19:08Z","timestamp":1750295948000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3719020"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,18]]},"references-count":86,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,6,30]]}},"alternative-id":["10.1145\/3719020"],"URL":"https:\/\/doi.org\/10.1145\/3719020","relation":{},"ISSN":["2573-9522"],"issn-type":[{"value":"2573-9522","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,4,18]]},"assertion":[{"value":"2024-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-02-06","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-04-18","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}