{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T08:53:37Z","timestamp":1773392017189,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":55,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,9,14]],"date-time":"2021-09-14T00:00:00Z","timestamp":1631577600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["409792180"],"award-info":[{"award-number":["409792180"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000781","name":"European Research Council","doi-asserted-by":"publisher","award":["770784"],"award-info":[{"award-number":["770784"]}],"id":[{"id":"10.13039\/501100000781","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,9,14]]},"DOI":"10.1145\/3472306.3478335","type":"proceedings-article","created":{"date-parts":[[2021,9,10]],"date-time":"2021-09-10T10:15:37Z","timestamp":1631268937000},"page":"101-108","source":"Crossref","is-referenced-by-count":95,"title":["Learning Speech-driven 3D Conversational Gestures from Video"],"prefix":"10.1145","author":[{"given":"Ikhsanul","family":"Habibie","sequence":"first","affiliation":[{"name":"Max Planck Institute for Informatics"}]},{"given":"Weipeng","family":"Xu","sequence":"additional","affiliation":[{"name":"Facebook Reality Labs"}]},{"given":"Dushyant","family":"Mehta","sequence":"additional","affiliation":[{"name":"Max Planck Institute for Informatics"}]},{"given":"Lingjie","family":"Liu","sequence":"additional","affiliation":[{"name":"Max Planck Institute for 
Informatics"}]},{"given":"Hans-Peter","family":"Seidel","sequence":"additional","affiliation":[{"name":"Max Planck Institute for Informatics"}]},{"given":"Gerard","family":"Pons-Moll","sequence":"additional","affiliation":[{"name":"University of T\u00fcbingen"}]},{"given":"Mohamed","family":"Elgharib","sequence":"additional","affiliation":[{"name":"Max Planck Institute for Informatics"}]},{"given":"Christian","family":"Theobalt","sequence":"additional","affiliation":[{"name":"Max Planck Institute for Informatics"}]}],"member":"320","published-online":{"date-parts":[[2021,9,14]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Yukiko I. Nakano, and Louis-Philippe Morency.","author":"Ahuja Chaitanya","year":"2020"},{"key":"e_1_3_2_1_2_1","volume-title":"Taras Kucherenko, and Jonas Beskow.","author":"Alexanderson Simon","year":"2020"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"crossref","unstructured":"Zhe Cao Tomas Simon Shih-En Wei and Yaser Sheikh. 2017. Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. In CVPR.","DOI":"10.1109\/CVPR.2017.143"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/2697.001.0001"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/192161.192272"},{"key":"e_1_3_2_1_6_1","volume-title":"Hannes H\u00f6gni Vilhj\u00e1lmsson, and Timothy Bickmore","author":"Cassell Justine","year":"2004"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"crossref","unstructured":"Y. Cha T. Price Z. Wei X. Lu N. Rewkowski R. Chabra Z. Qin H. Kim Z. Su Y. Liu A. Ilie A. State Z. Xu J. Frahm and H. Fuchs. 2018. Towards Fully Mobile 3D Face Body and Environment Capture Using Only Head-worn Cameras. IEEE Transactions on Visualization and Computer Graphics (TVCG) (2018).","DOI":"10.1109\/TVCG.2018.2868527"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"crossref","unstructured":"Chung-Cheng Chiu and Stacy Marsella. 2011. How to Train Your Avatar: A Data Driven Approach to Gesture Generation. In IVA Hannes H\u00f6gni Vilhj\u00e1lmsson Stefan Kopp Stacy Marsella and Kristinn R. Th\u00f3risson (Eds.).","DOI":"10.1007\/978-3-642-23974-8_14"},{"key":"e_1_3_2_1_9_1","unstructured":"Chung-Cheng Chiu and Stacy Marsella. 2014. Gesture Generation with Low-dimensional Embeddings. In AAMAS."},{"key":"e_1_3_2_1_10_1","volume-title":"Learning, and Synthesis of 3D Speaking Styles. CVPR","author":"Cudeiro Daniel","year":"2019"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"crossref","unstructured":"Ylva Ferstl and Rachel McDonnell. 2018. Investigating the Use of Recurrent Motion Modelling for Speech Gesture Generation. In IVA.","DOI":"10.1145\/3267851.3267898"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"crossref","unstructured":"Ylva Ferstl Michael Neff and Rachel McDonnell. 2019. Multi-Objective Adversarial Gesture Generation. In Motion Interaction and Games.","DOI":"10.1145\/3359566.3360053"},{"key":"e_1_3_2_1_13_1","unstructured":"FFmpeg Developers. 2016. FFMPEG.ffmpeg.org. ffmpeg.org"},{"key":"e_1_3_2_1_14_1","author":"Garrido Pablo","journal-title":"Trans. Graph. ([n. d.])."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"crossref","unstructured":"S. Ginosar A. Bar G. Kohavi C. Chan A. Owens and J. Malik. 2019. Learning Individual Styles of Conversational Gesture. In CVPR.","DOI":"10.1109\/CVPR.2019.00361"},{"key":"e_1_3_2_1_16_1","volume-title":"The role of gesture in communication and thinking. Trends in cognitive sciences","author":"Goldin-Meadow Susan","year":"1999"},{"key":"e_1_3_2_1_17_1","unstructured":"Ian Goodfellow Jean Pouget-Abadie Mehdi Mirza Bing Xu David Warde-Farley Sherjil Ozair Aaron Courville and Yoshua Bengio. 2014. Generative Adversarial Nets. In Advances in Neural Information Processing Systems."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Kathrin Haag and Hiroshi Shimodaira. 2016. Bidirectional LSTM Networks Employing Stacked Bottleneck Features for Expressive Speech-Driven Head Motion Synthesis. In IVA.","DOI":"10.1007\/978-3-319-47665-0_18"},{"key":"e_1_3_2_1_19_1","volume-title":"Ng","author":"Hannun Awni","year":"2014"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"crossref","unstructured":"Dai Hasegawa Naoshi Kaneko Shinichi Shirakawa Hiroshi Sakuta and Kazuhiko Sumi. 2018. Evaluation of Speech-to-Gesture Generation Using Bi-Directional LSTM Network. In IVA.","DOI":"10.1145\/3267851.3267878"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"crossref","unstructured":"Daniel Holden Jun Saito and Taku Komura. 2016. A Deep Learning Framework for Character Motion Synthesis and Editing. ACM Trans. Graph. (2016).","DOI":"10.1145\/2897824.2925975"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"crossref","unstructured":"Daniel Holden Jun Saito Taku Komura and Thomas Joyce. 2015. Learning Motion Manifolds with Convolutional Autoencoders. In SIGGRAPH Asia 2015 Technical Briefs.","DOI":"10.1145\/2820903.2820918"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"crossref","unstructured":"Tero Karras Timo Aila Samuli Laine Antti Herva and Jaakko Lehtinen. 2017. Audio-driven Facial Animation by Joint End-to-end Learning of Pose and Emotion. ACM Trans. Graph. (2017).","DOI":"10.1145\/3072959.3073658"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511807572"},{"key":"e_1_3_2_1_25_1","volume-title":"Naoshi Kaneko, and Hedvig Kjellstr\u00f6m.","author":"Kucherenko Taras","year":"2019"},{"key":"e_1_3_2_1_26_1","volume-title":"Simon Alexandersson, Iolanda Leite, and Hedvig Kjellstr\u00f6m.","author":"Kucherenko Taras","year":"2020"},{"key":"e_1_3_2_1_27_1","unstructured":"Paul Lamere Philip Kwok Evandro Gouv\u00eaa Bhiksha Raj Rita Singh William Walker Manfred Warmuth and Peter Wolf. 2003. The CMU SPHINX-4 Speech Recognition System."},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"crossref","unstructured":"G. Lee Z. Deng S. Ma T. Shiratori S.S. Srinivasa and Y. Sheikh. 2019. Talking With Hands 16.2M: A Large-Scale Dataset of Synchronized Body-Finger Motion and Audio for Conversational Motion Analysis and Synthesis. In ICCV.","DOI":"10.1109\/ICCV.2019.00085"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"crossref","unstructured":"Sergey Levine Philipp Kr\u00e4henb\u00fchl Sebastian Thrun and Vladlen Koltun. 2010. Gesture controllers. In ACM Trans. Graph.","DOI":"10.1145\/1833349.1778861"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"crossref","unstructured":"Sergey Levine Christian Theobalt and Vladlen Koltun. 2009. Real-time Prosody-driven Synthesis of Body Language. ACM Trans. Graph. (2009).","DOI":"10.1145\/1661412.1618518"},{"key":"e_1_3_2_1_31_1","volume-title":"Learning a model of facial shape and expression from 4D scans. ACM Transactions on Graphics","author":"Li Tianye","year":"2017"},{"key":"e_1_3_2_1_32_1","unstructured":"Yilong Liu Feng Xu Jinxiang Chai Xin Tong Lijuan Wang and Qiang Huo. 2015. Video-audio Driven Real-time Facial Animation. ACM Trans. Graph. (2015)."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"crossref","unstructured":"S. Mariooryad and C. Busso. 2012. Generating Human-Like Behaviors Using Joint Speech-Driven Models for Conversational Agents. IEEE Transactions on Audio Speech and Language Processing (2012).","DOI":"10.1109\/TASL.2012.2201476"},{"key":"e_1_3_2_1_34_1","unstructured":"Stacy Marsella Yuyu Xu Margaux Lhommet Andrew Feng Stefan Scherer and Ari Shapiro. [n.d.]. Virtual Character Performance from Speech. In SCA."},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"crossref","unstructured":"David McNeill. 2000. Language and Gesture. Cambridge University Press.","DOI":"10.1017\/CBO9780511620850"},{"key":"e_1_3_2_1_36_1","unstructured":"Dushyant Mehta Oleksandr Sotnychenko Franziska Mueller Weipeng Xu Mohamed Elgharib Pascal Fua Hans-Peter Seidel Helge Rhodin Gerard Pons-Moll and Christian Theobalt. 2020. XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera. ACM Trans. Graph. (2020)."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"crossref","unstructured":"Michael Neff Michael Kipp Irene Albrecht and Hans-Peter Seidel. 2008. Gesture Modeling and Animation Based on a Probabilistic Re-creation of Speaker Style. ACM Trans. Graph. (2008).","DOI":"10.1145\/1330511.1330516"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"crossref","unstructured":"Hai Xuan Pham Yuting Wang and Vladimir Pavlovic. 2018. End-to-end Learning for 3D Facial Animation from Speech. In ICMI.","DOI":"10.1145\/3242969.3243017"},{"key":"e_1_3_2_1_39_1","unstructured":"Werner Robitza. 2019. ffmpeg-normalize. github.com\/slhck\/ffmpeg-normalize. https:\/\/github.com\/slhck\/ffmpeg-normalize"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"crossref","unstructured":"Najmeh Sadoughi Yang Liu and Carlos Busso. 2014. Speech-Driven Animation Constrained by Appropriate Discourse Functions. In ICMI.","DOI":"10.1145\/2663204.2663252"},{"key":"e_1_3_2_1_42_1","volume-title":"Meaningful head movements driven by emotional synthetic speech. 
Speech Communication","author":"Sadoughi Najmeh","year":"2017"},{"key":"e_1_3_2_1_43_1","unstructured":"Tim Salimans Ian Goodfellow Wojciech Zaremba Vicki Cheung Alec Radford Xi Chen and Xi Chen. 2016. Improved Techniques for Training GANs. In Advances in Neural Information Processing Systems."},{"key":"e_1_3_2_1_44_1","volume-title":"Cohn","author":"Saragih Jason M.","year":"2011"},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"crossref","unstructured":"Eli Shlizerman Lucio Dery Hayden Schoen and Ira Kemelmacher-Shlizerman. 2018. Audio to body dynamics. In CVPR.","DOI":"10.1109\/CVPR.2018.00790"},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3072959.3073640"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3125739.3132594"},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"crossref","unstructured":"Kenta Takeuchi Souichirou Kubota Keisuke Suzuki Dai Hasegawa and Hiroshi Sakuta. 2017. Creating a Gesture-Speech Dataset for Speech-Based Automatic Gesture Generation. In HCI International - Posters' Extended Abstracts Constantine Stephanidis (Ed.).","DOI":"10.1007\/978-3-319-58750-9_28"},{"key":"e_1_3_2_1_49_1","volume-title":"Jessica Hodgins, and Iain Matthews.","author":"Taylor Sarah","year":"2017"},{"key":"e_1_3_2_1_50_1","volume-title":"Synthesising 3D Facial Motion from \"In-the-Wild\" Speech. 
CoRR abs\/1904.07002","author":"Tzirakis Panagiotis","year":"2019"},{"key":"e_1_3_2_1_51_1","unstructured":"Naoto Usuyama. 2018. github.com\/usuyama\/pytorch-unet. github.com\/usuyama\/pytorch-unet"},{"key":"e_1_3_2_1_52_1","volume-title":"The persona effect: How substantial is it? In People and computers XIII","author":"Mulken Susanne Van"},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"crossref","unstructured":"Youngwoo Yoon Bok Cha Joo-Haeng Lee Minsu Jang Jaeyeon Lee Jaehong Kim and Geehyuk Lee. 2020. Speech Gesture Generation from the Trimodal Context of Text Audio and Speaker Identity. ACM Trans. Graph. (2020).","DOI":"10.1145\/3414685.3417838"},{"key":"e_1_3_2_1_54_1","doi-asserted-by":"crossref","unstructured":"Y. Yoon W. Ko M. Jang J. Lee J. Kim and G. Lee. 2019. Robots Learn Social Skills: End-to-End Learning of Co-Speech Gesture Generation for Humanoid Robots. In ICRA.","DOI":"10.1109\/ICRA.2019.8793720"},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"crossref","unstructured":"Yuxiao Zhou Marc Habermann Weipeng Xu Ikhsanul Habibie Christian Theobalt and Feng Xu. 2020. Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data. In CVPR.","DOI":"10.1109\/CVPR42600.2020.00539"}],"event":{"name":"IVA '21: ACM International Conference on Intelligent Virtual Agents","location":"Virtual Event Japan","acronym":"IVA '21","sponsor":["SIGAI ACM Special Interest Group on Artificial Intelligence"]},"container-title":["Proceedings of the 21th ACM International Conference on Intelligent Virtual Agents"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3472306.3478335","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3472306.3478335","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:17:36Z","timestamp":1750191456000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3472306.3478335"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,14]]},"references-count":55,"alternative-id":["10.1145\/3472306.3478335","10.1145\/3472306"],"URL":"https:\/\/doi.org\/10.1145\/3472306.3478335","relation":{},"subject":[],"published":{"date-parts":[[2021,9,14]]}}}