{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,28]],"date-time":"2025-08-28T00:02:08Z","timestamp":1756339328382,"version":"3.44.0"},"reference-count":28,"publisher":"Wiley","issue":"4","license":[{"start":{"date-parts":[[2025,8,7]],"date-time":"2025-08-07T00:00:00Z","timestamp":1754524800000},"content-version":"vor","delay-in-days":37,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Computer Animation & Virtual"],"published-print":{"date-parts":[[2025,7]]},"abstract":"<jats:title>ABSTRACT<\/jats:title><jats:p>Co\u2010speech gestures are essential for natural human communication, yet existing synthesis methods fall short in delivering semantically aligned and contextually appropriate motions. In this paper, we present <jats:bold>RIDGE<\/jats:bold>, a hybrid system that combines rule\u2010based and deep learning approaches to generate realistic gestures for virtual avatars and human\u2010computer interaction. RIDGE employs a high\u2010fidelity rule base, generated from motion capture data with the assistance of large language models, to select reliable gesture mappings. When a high\u2010confidence match is not available, a contrastively trained deep learning model steps in to produce semantically appropriate gestures. Evaluated using a novel Gesture Cluster Affinity (GCA) metric, our system outperforms existing baselines, achieving a GCA score of 0.73, compared with 0.6 for a rule\u2010based baseline and 0.52 for an end\u2010to\u2010end baseline, while the ground truth score was 0.90. Detailed analyses of system architecture, data preprocessing, and evaluation methodologies demonstrate RIDGE's potential to enhance gesture synthesis. 
Project Url: \n<jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"https:\/\/www.mrlab.co.kr\/research\/ridge\">https:\/\/www.mrlab.co.kr\/research\/ridge<\/jats:ext-link>.<\/jats:p>","DOI":"10.1002\/cav.70034","type":"journal-article","created":{"date-parts":[[2025,8,7]],"date-time":"2025-08-07T11:24:42Z","timestamp":1754565882000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["RIDGE: Rule\u2010Infused Deep Learning for Realistic Co\u2010Speech Gesture Generation"],"prefix":"10.1002","volume":"36","author":[{"given":"Ghazanfar","family":"Ali","sequence":"first","affiliation":[{"name":"Intelligence and Interaction Research Center Korea Institute of Science and Technology (KIST)  Seoul Republic of Korea"}]},{"given":"HwangYoun","family":"Kim","sequence":"additional","affiliation":[{"name":"AI\u2010Robotics, KIST School University of Science and Technology (UST)  Seoul Republic of Korea"}]},{"given":"Jae\u2010In","family":"Hwang","sequence":"additional","affiliation":[{"name":"Intelligence and Interaction Research Center Korea Institute of Science and Technology (KIST)  Seoul Republic of Korea"}]}],"member":"311","published-online":{"date-parts":[[2025,8,7]]},"reference":[{"key":"e_1_2_10_2_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.findings-emnlp.170"},{"key":"e_1_2_10_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/VR50410.2021.00037"},{"key":"e_1_2_10_4_1","doi-asserted-by":"publisher","DOI":"10.1002\/cav.1944"},{"key":"e_1_2_10_5_1","unstructured":"S.Ghazanfar W.Kim M. S.Anwar J.\u2010I.Hwang andA.Choi \u201cExpanding Multilingual Co\u2010Speech Interaction: The Impact of Enhanced Gesture Units in Text\u2010To\u2010Gesture Synthesis for Digital Humans \u201d2023 https:\/\/doi.org\/10.21203\/rs.3.rs\u20103350470\/v1."},{"key":"e_1_2_10_6_1","doi-asserted-by":"crossref","unstructured":"S.Nyatsanga T.Kucherenko C.Ahuja G. 
E.Henter andM.Neff \u201cA Comprehensive Review of Data\u2010Driven Co\u2010Speech Gesture Generation \u201d2023.","DOI":"10.1111\/cgf.14776"},{"key":"e_1_2_10_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3536221.3558058"},{"key":"e_1_2_10_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/11821830_20"},{"key":"e_1_2_10_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00361"},{"key":"e_1_2_10_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-08373-4_8"},{"key":"e_1_2_10_11_1","unstructured":"VHTOOLKIT \u2014 Homepage https:\/\/vhtoolkit.ict.usc.edu\/."},{"key":"e_1_2_10_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-024-19904-3"},{"key":"e_1_2_10_13_1","doi-asserted-by":"publisher","DOI":"10.1002\/cav.2236"},{"key":"e_1_2_10_14_1","doi-asserted-by":"crossref","unstructured":"H.Pang T.Ding L.He M.Tao L.Zhang andQ.Gan \u201cLLM Gesticulator: Leveraging Large Language Models for Scalable and Controllable Co\u2010Speech Gesture Synthesis \u201d arXiv preprint arXiv:2410.10851 (2024).","DOI":"10.1117\/12.3060395"},{"key":"e_1_2_10_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3658134"},{"key":"e_1_2_10_16_1","unstructured":"A.Radford J. W.Kim C.Hallacy et al. 
\u201cLearning Transferable Visual Models From Natural Language Supervision \u201d2021."},{"key":"e_1_2_10_17_1","unstructured":"T.Chen S.Kornblith M.Norouzi andG.Hinton \u201cA Simple Framework for Contrastive Learning of Visual Representations \u201d arXiv preprint arXiv:2002.05709 (2020)."},{"key":"e_1_2_10_18_1","first-page":"1575","volume-title":"SIGGRAPH Asia 2022 Posters","author":"Ghazanfar A.","year":"2022"},{"key":"e_1_2_10_19_1","unstructured":"C.Saund A.Birladeanu andS.Marsella \u201cCMCF: An Architecture for Realtime Gesture Generation by Clustering Gestures by Motion and Communicative Function \u201d2021."},{"key":"e_1_2_10_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-20071-7_36"},{"key":"e_1_2_10_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/192161.192272"},{"key":"e_1_2_10_22_1","first-page":"79","volume-title":"Proceedings of the 18th International Conference on Intelligent Virtual Agents (IVA)","author":"Dai H.","year":"2018"},{"key":"e_1_2_10_23_1","doi-asserted-by":"crossref","unstructured":"J.Li D.Kang W.Pei et al. 
\u201cAudio2Gestures: Generating Diverse Gestures From Speech Audio With Conditional VAE \u201d arXiv preprint arXiv:2108.06720 (2021).","DOI":"10.1109\/ICCV48922.2021.01110"},{"key":"e_1_2_10_24_1","first-page":"11478","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Liu X.","year":"2022"},{"key":"e_1_2_10_25_1","first-page":"2321","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Sun M.","year":"2023"},{"key":"e_1_2_10_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3414685.3417838"},{"volume-title":"Proceedings of the International Conference on Multimodal Interaction (ICMI)","year":"2022","author":"Zhou C.","key":"e_1_2_10_27_1"},{"key":"e_1_2_10_28_1","first-page":"205","volume-title":"Proceedings of the 2021 International Conference on Multimodal Interaction (ICMI)","author":"Wolfert P.","year":"2021"},{"key":"e_1_2_10_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2019.2929257"}],"container-title":["Computer Animation and Virtual 
Worlds"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cav.70034","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,27]],"date-time":"2025-08-27T04:44:06Z","timestamp":1756269846000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/cav.70034"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7]]},"references-count":28,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,7]]}},"alternative-id":["10.1002\/cav.70034"],"URL":"https:\/\/doi.org\/10.1002\/cav.70034","archive":["Portico"],"relation":{},"ISSN":["1546-4261","1546-427X"],"issn-type":[{"type":"print","value":"1546-4261"},{"type":"electronic","value":"1546-427X"}],"subject":[],"published":{"date-parts":[[2025,7]]},"assertion":[{"value":"2025-04-25","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-05-11","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-08-07","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"e70034"}}