{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,15]],"date-time":"2025-10-15T18:13:38Z","timestamp":1760552018313,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":28,"publisher":"ACM","license":[{"start":{"date-parts":[[2023,10,9]],"date-time":"2023-10-09T00:00:00Z","timestamp":1696809600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,10,9]]},"DOI":"10.1145\/3577190.3616116","type":"proceedings-article","created":{"date-parts":[[2023,10,7]],"date-time":"2023-10-07T22:30:48Z","timestamp":1696717848000},"page":"802-810","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["The UEA Digital Humans entry to the GENEA Challenge 2023"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4985-2282","authenticated-orcid":false,"given":"Jonathan","family":"Windle","sequence":"first","affiliation":[{"name":"University of East Anglia, England"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-5004-2397","authenticated-orcid":false,"given":"Iain","family":"Matthews","sequence":"additional","affiliation":[{"name":"University of East Anglia, England"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6315-3475","authenticated-orcid":false,"given":"Ben","family":"Milner","sequence":"additional","affiliation":[{"name":"University of East Anglia, England"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1299-0446","authenticated-orcid":false,"given":"Sarah","family":"Taylor","sequence":"additional","affiliation":[{"name":"Independent Researcher, 
England"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,10,9]]},"reference":[{"volume-title":"Text2gestures: A transformer-based network for generating emotive body gestures for virtual agents. In 2021 IEEE virtual reality and 3D user interfaces (VR)","author":"Bhattacharya Uttaran","key":"e_1_3_2_1_1_1","unstructured":"Uttaran Bhattacharya, Nicholas Rewkowski, Abhishek Banerjee, Pooja Guhan, Aniket Bera, and Dinesh Manocha. 2021. Text2gestures: A transformer-based network for generating emotive body gestures for virtual agents. In 2021 IEEE virtual reality and 3D user interfaces (VR). IEEE, 1\u201310."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00051"},{"key":"e_1_3_2_1_3_1","volume-title":"Transformer-xl: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860","author":"Dai Zihang","year":"2019","unstructured":"Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc\u00a0V Le, and Ruslan Salakhutdinov. 2019. Transformer-xl: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860 (2019)."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1756-8765.2012.01183.x"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3267851.3267898"},
{"key":"e_1_3_2_1_6_1","volume-title":"ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech. arXiv preprint arXiv:2209.07556","author":"Ghorbani Saeed","year":"2022","unstructured":"Saeed Ghorbani, Ylva Ferstl, Daniel Holden, Nikolaus\u00a0F Troje, and Marc-Andr\u00e9 Carbonneau. 2022. ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech. arXiv preprint arXiv:2209.07556 (2022)."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3267851.3267878"},{"key":"e_1_3_2_1_8_1","first-page":"6840","article-title":"Denoising diffusion probabilistic models","volume":"33","author":"Ho Jonathan","year":"2020","unstructured":"Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems 33 (2020), 6840\u20136851.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_9_1","volume-title":"Do gestures communicate? A review. Research on language and social interaction 27, 3","author":"Kendon Adam","year":"1994","unstructured":"Adam Kendon. 1994. Do gestures communicate? A review. Research on language and social interaction 27, 3 (1994), 175\u2013200."},
{"key":"e_1_3_2_1_10_1","volume-title":"Flame: Free-form language-based motion synthesis & editing. arXiv preprint arXiv:2209.00349","author":"Kim Jihoon","year":"2022","unstructured":"Jihoon Kim, Jiseob Kim, and Sungjoon Choi. 2022. Flame: Free-form language-based motion synthesis & editing. arXiv preprint arXiv:2209.00349 (2022)."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3577190.3616120"},{"key":"e_1_3_2_1_12_1","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision. 763\u2013772","author":"Lee Gilwoo","year":"2019","unstructured":"Gilwoo Lee, Zhiwei Deng, Shugao Ma, Takaaki Shiratori, Siddhartha\u00a0S Srinivasa, and Yaser Sheikh. 2019. Talking with hands 16.2 m: A large-scale dataset of synchronized body-finger motion and audio for conversational motion analysis and synthesis. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 763\u2013772."},
{"key":"e_1_3_2_1_13_1","volume-title":"Proceedings, Part VII. Springer, 612\u2013630","author":"Liu Haiyang","year":"2022","unstructured":"Haiyang Liu, Zihao Zhu, Naoya Iwamoto, Yichen Peng, Zhengqing Li, You Zhou, Elif Bozkurt, and Bo Zheng. 2022. BEAT: A Large-Scale Semantic and Emotional Multi-Modal Dataset for Conversational Gestures Synthesis. In Computer Vision\u2013ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23\u201327, 2022, Proceedings, Part VII. Springer, 612\u2013630."},{"key":"e_1_3_2_1_14_1","volume-title":"Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101","author":"Loshchilov Ilya","year":"2017","unstructured":"Ilya Loshchilov and Frank Hutter. 2017. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2017)."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3536221.3558059"},{"key":"e_1_3_2_1_16_1","volume-title":"So you think gestures are nonverbal? Psychological review 92, 3","author":"McNeill David","year":"1985","unstructured":"David McNeill. 1985. So you think gestures are nonverbal? Psychological review 92, 3 (1985), 350."},{"key":"e_1_3_2_1_17_1","volume-title":"Proceedings of the International Conference on Language Resources and Evaluation (LREC","author":"Mikolov Tomas","year":"2018","unstructured":"Tomas Mikolov, Edouard Grave, Piotr Bojanowski, Christian Puhrsch, and Armand Joulin. 2018. Advances in Pre-Training Distributed Word Representations. In Proceedings of the International Conference on Language Resources and Evaluation (LREC 2018)."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-23974-8_43"},
{"key":"e_1_3_2_1_19_1","volume-title":"International Conference on Machine Learning. PMLR, 8162\u20138171","author":"Nichol Alexander\u00a0Quinn","year":"2021","unstructured":"Alexander\u00a0Quinn Nichol and Prafulla Dhariwal. 2021. Improved denoising diffusion probabilistic models. In International Conference on Machine Learning. PMLR, 8162\u20138171."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1121\/10.0001730"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP40776.2020.9053569"},{"key":"e_1_3_2_1_22_1","first-page":"203","article-title":"Hand and Mind","volume":"37","author":"Studdert-Kennedy Michael","year":"1994","unstructured":"Michael Studdert-Kennedy. 1994. Hand and Mind: What Gestures Reveal About Thought. Language and Speech 37, 2 (1994), 203\u2013209.","journal-title":"What Gestures Reveal About Thought. Language and Speech"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3125739.3132594"},{"key":"e_1_3_2_1_24_1","volume-title":"Human motion diffusion model. arXiv preprint arXiv:2209.14916","author":"Tevet Guy","year":"2022","unstructured":"Guy Tevet, Sigal Raab, Brian Gordon, Yonatan Shafir, Daniel Cohen-Or, and Amit\u00a0H Bermano. 2022. Human motion diffusion model. arXiv preprint arXiv:2209.14916 (2022)."},
{"key":"e_1_3_2_1_25_1","volume-title":"Attention is all you need. Advances in neural information processing systems 30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan\u00a0N Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in neural information processing systems 30 (2017)."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3536221.3558065"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2022.08.001"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00589"}],"event":{"name":"ICMI '23: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION","sponsor":["SIGCHI ACM Special Interest Group on Computer-Human Interaction"],"location":"Paris France","acronym":"ICMI '23"},"container-title":["INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3577190.3616116","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3577190.3616116","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:37:02Z","timestamp":1750178222000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3577190.3616116"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,9]]},"references-count":28,"alternative-id":["10.1145\/3577190.3616116","10.1145\/3577190"],"URL":"https:\/\/doi.org\/10.1145\/3577190.3616116","relation":{},"subject":[],"published":{"date-parts":[[2023,10,9]]},"assertion":[{"value":"2023-10-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}