{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,8]],"date-time":"2026-05-08T20:46:39Z","timestamp":1778273199680,"version":"3.51.4"},"publisher-location":"New York, NY, USA","reference-count":43,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,11,9]],"date-time":"2021-11-09T00:00:00Z","timestamp":1636416000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,11,9]]},"DOI":"10.1145\/3472307.3484167","type":"proceedings-article","created":{"date-parts":[[2021,11,9]],"date-time":"2021-11-09T17:50:07Z","timestamp":1636480207000},"page":"31-38","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":25,"title":["Speech-based Gesture Generation for Robots and Embodied Agents: A Scoping Review"],"prefix":"10.1145","author":[{"given":"Yu","family":"Liu","sequence":"first","affiliation":[{"name":"University of New South Wales, Australia"}]},{"given":"Gelareh","family":"Mohammadi","sequence":"additional","affiliation":[{"name":"School of Computer Science & Engineering, UNSW, Australia"}]},{"given":"Yang","family":"Song","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, University of New South Wales, Australia"}]},{"given":"Wafa","family":"Johal","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, University of New South Wales, Australia"}]}],"member":"320","published-online":{"date-parts":[[2021,11,9]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58523-5_15"},{"key":"e_1_3_2_1_2_1","volume-title":"Computer Graphics Forum, Vol.\u00a039","author":"Alexanderson Simon","unstructured":"Simon Alexanderson , 
Gustav\u00a0Eje Henter , Taras Kucherenko , and Jonas Beskow . 2020. Style-Controllable Speech-Driven Gesture Synthesis Using Normalising Flows . In Computer Graphics Forum, Vol.\u00a039 . Wiley Online Library , 487\u2013496. Simon Alexanderson, Gustav\u00a0Eje Henter, Taras Kucherenko, and Jonas Beskow. 2020. Style-Controllable Speech-Driven Gesture Synthesis Using Normalising Flows. In Computer Graphics Forum, Vol.\u00a039. Wiley Online Library, 487\u2013496."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROMAN.2009.5326136"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/Humanoids.2011.6100810"},{"key":"e_1_3_2_1_5_1","unstructured":"Kyunghyun Cho Bart Van\u00a0Merri\u00ebnboer Caglar Gulcehre Dzmitry Bahdanau Fethi Bougares Holger Schwenk and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078(2014).  Kyunghyun Cho Bart Van\u00a0Merri\u00ebnboer Caglar Gulcehre Dzmitry Bahdanau Fethi Bougares Holger Schwenk and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078(2014)."},{"key":"e_1_3_2_1_6_1","volume-title":"The repertoire of nonverbal behavior: Categories, origins, usage, and coding. Nonverbal communication, interaction, and gesture","author":"Ekman Paul","year":"1969","unstructured":"Paul Ekman and Wallace\u00a0 V Friesen . 1969. The repertoire of nonverbal behavior: Categories, origins, usage, and coding. Nonverbal communication, interaction, and gesture ( 1969 ), 57\u2013106. Paul Ekman and Wallace\u00a0V Friesen. 1969. The repertoire of nonverbal behavior: Categories, origins, usage, and coding. 
Nonverbal communication, interaction, and gesture (1969), 57\u2013106."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3267851.3267898"},{"key":"e_1_3_2_1_8_1","first-page":"1","article-title":"Multi-objective adversarial gesture generation","author":"Ferstl Ylva","year":"2019","unstructured":"Ylva Ferstl , Michael Neff , and Rachel McDonnell . 2019 . Multi-objective adversarial gesture generation . In Motion, Interaction and Games. 1 \u2013 10 . Ylva Ferstl, Michael Neff, and Rachel McDonnell. 2019. Multi-objective adversarial gesture generation. In Motion, Interaction and Games. 1\u201310.","journal-title":"Motion, Interaction and Games."},{"key":"e_1_3_2_1_9_1","unstructured":"Ian\u00a0J Goodfellow Jean Pouget-Abadie Mehdi Mirza Bing Xu David Warde-Farley Sherjil Ozair Aaron Courville and Yoshua Bengio. 2014. Generative adversarial networks. arXiv preprint arXiv:1406.2661(2014).  Ian\u00a0J Goodfellow Jean Pouget-Abadie Mehdi Mirza Bing Xu David Warde-Farley Sherjil Ozair Aaron Courville and Yoshua Bengio. 2014. Generative adversarial networks. arXiv preprint arXiv:1406.2661(2014)."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3267851.3267878"},{"key":"e_1_3_2_1_11_1","unstructured":"Martin Heusel Hubert Ramsauer Thomas Unterthiner Bernhard Nessler and Sepp Hochreiter. 2017. Gans trained by a two time-scale update rule converge to a local nash equilibrium. arXiv preprint arXiv:1706.08500(2017).  Martin Heusel Hubert Ramsauer Thomas Unterthiner Bernhard Nessler and Sepp Hochreiter. 2017. Gans trained by a two time-scale update rule converge to a local nash equilibrium. arXiv preprint arXiv:1706.08500(2017)."},{"key":"e_1_3_2_1_12_1","volume-title":"Long short-term memory. Neural computation 9, 8","author":"Hochreiter Sepp","year":"1997","unstructured":"Sepp Hochreiter and J\u00fcrgen Schmidhuber . 1997. Long short-term memory. Neural computation 9, 8 ( 1997 ), 1735\u20131780. 
Sepp Hochreiter and J\u00fcrgen Schmidhuber. 1997. Long short-term memory. Neural computation 9, 8 (1997), 1735\u20131780."},{"key":"e_1_3_2_1_13_1","volume-title":"Perspectives on Sim2Real Transfer for Robotics: A Summary of the R:SS 2020 Workshop. (Dec","author":"H\u00f6fer Sebastian","year":"2020","unstructured":"Sebastian H\u00f6fer , Kostas Bekris , Ankur Handa , Juan\u00a0Camilo Gamboa , Florian Golemo , Melissa Mozifian , Chris Atkeson , Dieter Fox , Ken Goldberg , John Leonard , C.\u00a0 Karen Liu , Jan Peters , Shuran Song , Peter Welinder , and Martha White . 2020. Perspectives on Sim2Real Transfer for Robotics: A Summary of the R:SS 2020 Workshop. (Dec . 2020 ). https:\/\/www.arxiv-vanity.com\/papers\/2012.03806\/ Sebastian H\u00f6fer, Kostas Bekris, Ankur Handa, Juan\u00a0Camilo Gamboa, Florian Golemo, Melissa Mozifian, Chris Atkeson, Dieter Fox, Ken Goldberg, John Leonard, C.\u00a0Karen Liu, Jan Peters, Shuran Song, Peter Welinder, and Martha White. 2020. Perspectives on Sim2Real Transfer for Robotics: A Summary of the R:SS 2020 Workshop. (Dec. 2020). https:\/\/www.arxiv-vanity.com\/papers\/2012.03806\/"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2018.2856281"},{"key":"e_1_3_2_1_15_1","volume-title":"Non-verbal Signals in HRI: Interference in Human Perception. In International Conference on Social Robotics. Springer, 275\u2013284","author":"Johal Wafa","year":"2015","unstructured":"Wafa Johal , Ga\u00eblle Calvary , and Sylvie Pesty . 2015 . Non-verbal Signals in HRI: Interference in Human Perception. In International Conference on Social Robotics. Springer, 275\u2013284 . Wafa Johal, Ga\u00eblle Calvary, and Sylvie Pesty. 2015. Non-verbal Signals in HRI: Interference in Human Perception. In International Conference on Social Robotics. 
Springer, 275\u2013284."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/HRI.2016.7451799"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511807572"},{"key":"e_1_3_2_1_18_1","unstructured":"Heon-Hui Kim Yun-Su Ha Zeungnam Bien and Kwang-Hyun Park. 2012. Gesture encoding and reproduction for human-robot interaction in text-to-gesture systems. Industrial Robot: An International Journal(2012).  Heon-Hui Kim Yun-Su Ha Zeungnam Bien and Kwang-Hyun Park. 2012. Gesture encoding and reproduction for human-robot interaction in text-to-gesture systems. Industrial Robot: An International Journal(2012)."},{"key":"e_1_3_2_1_19_1","volume-title":"2012 IEEE\/SICE International Symposium on System Integration (SII). IEEE, 645\u2013647","author":"Kim Jaewoo","year":"2012","unstructured":"Jaewoo Kim , Woo\u00a0Hyun Kim , Won\u00a0Hyong Lee , Ju-Hwan Seo , Myung\u00a0Jin Chung , and Dong-Soo Kwon . 2012 . Automated robot speech gesture generation system based on dialog sentence punctuation mark extraction . In 2012 IEEE\/SICE International Symposium on System Integration (SII). IEEE, 645\u2013647 . Jaewoo Kim, Woo\u00a0Hyun Kim, Won\u00a0Hyong Lee, Ju-Hwan Seo, Myung\u00a0Jin Chung, and Dong-Soo Kwon. 2012. Automated robot speech gesture generation system based on dialog sentence punctuation mark extraction. In 2012 IEEE\/SICE International Symposium on System Integration (SII). IEEE, 645\u2013647."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1080\/01690960802586188"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3308532.3329472"},{"key":"e_1_3_2_1_22_1","volume-title":"Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems. 2072\u20132074","author":"Kucherenko Taras","year":"2019","unstructured":"Taras Kucherenko , Dai Hasegawa , Naoshi Kaneko , Gustav\u00a0Eje Henter , and Hedvig Kjellstr\u00f6m . 2019 . 
On the importance of representations for speech-driven gesture generation . In Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems. 2072\u20132074 . Taras Kucherenko, Dai Hasegawa, Naoshi Kaneko, Gustav\u00a0Eje Henter, and Hedvig Kjellstr\u00f6m. 2019. On the importance of representations for speech-driven gesture generation. In Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems. 2072\u20132074."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3382507.3418815"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/Humanoids.2011.6100857"},{"key":"e_1_3_2_1_25_1","volume-title":"Hand and mind: What gestures reveal about thought","author":"McNeill David","unstructured":"David McNeill . 1992. Hand and mind: What gestures reveal about thought . University of Chicago press. David McNeill. 1992. Hand and mind: What gestures reveal about thought. University of Chicago press."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2663204.2663275"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.5772\/56870"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3406499.3415076"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/CogInfoCom.2017.8268286"},{"key":"e_1_3_2_1_30_1","first-page":"141","article-title":"Guidance for conducting systematic scoping reviews","volume":"13","author":"Peters DJ","year":"2015","unstructured":"Micah\u00a0 DJ Peters , Christina\u00a0 M Godfrey , Hanan Khalil , Patricia McInerney , Deborah Parker , and Cassia\u00a0Baldini Soares . 2015 . Guidance for conducting systematic scoping reviews . JBI Evidence Implementation 13 , 3 (2015), 141 \u2013 146 . Micah\u00a0DJ Peters, Christina\u00a0M Godfrey, Hanan Khalil, Patricia McInerney, Deborah Parker, and Cassia\u00a0Baldini Soares. 2015. Guidance for conducting systematic scoping reviews. 
JBI Evidence Implementation 13, 3 (2015), 141\u2013146.","journal-title":"JBI Evidence Implementation"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"crossref","unstructured":"Razieh Rastgoo Kourosh Kiani and Sergio Escalera. 2020. Sign language recognition: A deep survey. Expert Systems with Applications(2020) 113794.  Razieh Rastgoo Kourosh Kiani and Sergio Escalera. 2020. Sign language recognition: A deep survey. Expert Systems with Applications(2020) 113794.","DOI":"10.1016\/j.eswa.2020.113794"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-25504-5_4"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/s12369-013-0196-9"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROMAN.2018.8525621"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROMAN.2017.8172338"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1080\/03637759309376314"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3125739.3132594"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-58750-9_28"},{"key":"e_1_3_2_1_39_1","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan\u00a0N Gomez \u0141ukasz Kaiser and Illia Polosukhin. 2017. Attention is all you need. In Advances in neural information processing systems. 5998\u20136008.  Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan\u00a0N Gomez \u0141ukasz Kaiser and Illia Polosukhin. 2017. Attention is all you need. In Advances in neural information processing systems. 5998\u20136008."},{"key":"e_1_3_2_1_40_1","unstructured":"Pieter Wolfert Nicole Robinson and Tony Belpaeme. 2021. A Review of Evaluation Practices of Gesture Generation in Embodied Conversational Agents. arxiv:2101.03769\u00a0[cs.HC]  Pieter Wolfert Nicole Robinson and Tony Belpaeme. 2021. 
A Review of Evaluation Practices of Gesture Generation in Embodied Conversational Agents. arxiv:2101.03769\u00a0[cs.HC]"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.3390\/electronics10030228"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3414685.3417838"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8793720"}],"event":{"name":"HAI '21: International Conference on Human-Agent Interaction","location":"Virtual Event Japan","acronym":"HAI '21","sponsor":["SIGCHI ACM Special Interest Group on Computer-Human Interaction"]},"container-title":["Proceedings of the 9th International Conference on Human-Agent Interaction"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3472307.3484167","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3472307.3484167","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:17:37Z","timestamp":1750191457000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3472307.3484167"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,9]]},"references-count":43,"alternative-id":["10.1145\/3472307.3484167","10.1145\/3472307"],"URL":"https:\/\/doi.org\/10.1145\/3472307.3484167","relation":{},"subject":[],"published":{"date-parts":[[2021,11,9]]},"assertion":[{"value":"2021-11-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}