{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:23:42Z","timestamp":1750220622606,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":29,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,10,12]],"date-time":"2020-10-12T00:00:00Z","timestamp":1602460800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Natural Science Foundation of China","award":["61831022"],"award-info":[{"award-number":["61831022"]}]},{"DOI":"10.13039\/100014718","name":"Innovative Research Group Project of the National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61521002"],"award-info":[{"award-number":["61521002"]}],"id":[{"id":"10.13039\/100014718","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,10,12]]},"DOI":"10.1145\/3394171.3414444","type":"proceedings-article","created":{"date-parts":[[2020,10,12]],"date-time":"2020-10-12T12:27:35Z","timestamp":1602505655000},"page":"4521-4523","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Visual-speech Synthesis of Exaggerated Corrective Feedback"],"prefix":"10.1145","author":[{"given":"Yaohua","family":"Bu","sequence":"first","affiliation":[{"name":"Tsinghua University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Weijun","family":"Li","sequence":"additional","affiliation":[{"name":"Northeast Normal University, Changchun, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tianyi","family":"Ma","sequence":"additional","affiliation":[{"name":"Tsinghua University &amp; Ministry of Education, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shengqi","family":"Chen","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jia","family":"Jia","sequence":"additional","affiliation":[{"name":"Tsinghua University &amp; Ministry of Education &amp; Beijing National Research Center for Information Science and Technology, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kun","family":"Li","sequence":"additional","affiliation":[{"name":"SpeechX Ltd., Shenzhen, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaobo","family":"Lu","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,10,12]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10639-019-09955-7"},{"key":"e_1_3_2_2_2_1","volume-title":"G\u00e9rard Bailly, Fr\u00e9d\u00e9ric Elisei, and Thomas Hueber.","author":"Badin Pierre","year":"2010","unstructured":"Pierre Badin , Atef Ben Youssef , G\u00e9rard Bailly, Fr\u00e9d\u00e9ric Elisei, and Thomas Hueber. 2010 . Visual articulatory feedback for phonetic correction in second language learning. In Second Language Studies: Acquisition, Learning, Education and Technology . Pierre Badin, Atef Ben Youssef, G\u00e9rard Bailly, Fr\u00e9d\u00e9ric Elisei, and Thomas Hueber. 2010. Visual articulatory feedback for phonetic correction in second language learning. In Second Language Studies: Acquisition, Learning, Education and Technology."},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1159\/000261913"},{"volume-title":"English phonetics and phonology: An introduction","author":"Carr Philip","key":"e_1_3_2_2_4_1","unstructured":"Philip Carr . 2019. English phonetics and phonology: An introduction . John Wiley & Sons . Philip Carr. 2019. English phonetics and phonology: An introduction .John Wiley & Sons."},{"key":"e_1_3_2_2_5_1","first-page":"237","article-title":"Negative language transfer when learning Spanish as a foreign language","volume":"16","author":"Cort\u00e9s Nuria Calvo","year":"2005","unstructured":"Nuria Calvo Cort\u00e9s . 2005 . Negative language transfer when learning Spanish as a foreign language . Interling\u00fc'istica 16 (2005), 237 -- 248 . Nuria Calvo Cort\u00e9s. 2005. Negative language transfer when learning Spanish as a foreign language. Interling\u00fc'istica 16 (2005), 237--248.","journal-title":"Interling\u00fc'istica"},{"key":"e_1_3_2_2_6_1","volume-title":"Second language accent and pronunciation teaching: A research-based approach. TESOL quarterly","author":"Derwing Tracey M","year":"2005","unstructured":"Tracey M Derwing and Murray J Munro . 2005. Second language accent and pronunciation teaching: A research-based approach. TESOL quarterly , Vol. 39 , 3 ( 2005 ), 379--397. Tracey M Derwing and Murray J Munro. 2005. Second language accent and pronunciation teaching: A research-based approach. TESOL quarterly, Vol. 39, 3 (2005), 379--397."},{"key":"e_1_3_2_2_7_1","volume-title":"Second language speech learning: Theory, findings, and problems. Speech perception and linguistic experience: Issues in cross-language research","author":"Flege James E","year":"1995","unstructured":"James E Flege . 1995. Second language speech learning: Theory, findings, and problems. Speech perception and linguistic experience: Issues in cross-language research , Vol. 92 ( 1995 ), 233--277. James E Flege. 1995. Second language speech learning: Theory, findings, and problems. Speech perception and linguistic experience: Issues in cross-language research, Vol. 92 (1995), 233--277."},{"key":"e_1_3_2_2_8_1","volume-title":"Articulatory strengthening at edges of prosodic domains. The journal of the acoustical society of America","author":"Fougeron C\u00e9cile","year":"1997","unstructured":"C\u00e9cile Fougeron and Patricia A Keating . 1997. Articulatory strengthening at edges of prosodic domains. The journal of the acoustical society of America , Vol. 101 , 6 ( 1997 ), 3728--3740. C\u00e9cile Fougeron and Patricia A Keating. 1997. Articulatory strengthening at edges of prosodic domains. The journal of the acoustical society of America, Vol. 101, 6 (1997), 3728--3740."},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11390-014-1465-2"},{"key":"e_1_3_2_2_10_1","unstructured":"Daniel Jones. 1922. An outline of English phonetics .BG Teubner. Daniel Jones. 1922. An outline of English phonetics .BG Teubner."},{"key":"e_1_3_2_2_11_1","volume-title":"Rating Algorithm for Pronunciation of English Based on Audio Feature Pattern Matching. In MATEC Web of Conferences","volume":"22","author":"Li Kun","year":"2015","unstructured":"Kun Li , Jing Li , Yufang Song , and Hewei Fu . 2015 a. Rating Algorithm for Pronunciation of English Based on Audio Feature Pattern Matching. In MATEC Web of Conferences , Vol. 22 . EDP Sciences, 01032. Kun Li, Jing Li, Yufang Song, and Hewei Fu. 2015a. Rating Algorithm for Pronunciation of English Based on Audio Feature Pattern Matching. In MATEC Web of Conferences, Vol. 22. EDP Sciences, 01032."},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"crossref","unstructured":"Kun Li Xiaojun Qian Shiyin Kang Pengfei Liu and Helen Meng. 2015b. Integrating acoustic and state-transition models for free phone recognition in L2 English speech using multi-distribution deep neural networks.. In SLaTE. 119--124. Kun Li Xiaojun Qian Shiyin Kang Pengfei Liu and Helen Meng. 2015b. Integrating acoustic and state-transition models for free phone recognition in L2 English speech using multi-distribution deep neural networks.. In SLaTE. 119--124.","DOI":"10.21437\/SLaTE.2015-21"},{"key":"e_1_3_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2016.2621675"},{"key":"e_1_3_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCSLP.2012.6423507"},{"key":"e_1_3_2_2_15_1","volume-title":"Negotiation of form, recasts, and explicit correction in relation to error types and learner repair in immersion classrooms. Language learning","author":"Lyster Roy","year":"1998","unstructured":"Roy Lyster . 1998. Negotiation of form, recasts, and explicit correction in relation to error types and learner repair in immersion classrooms. Language learning , Vol. 48 , 2 ( 1998 ), 183--218. Roy Lyster. 1998. Negotiation of form, recasts, and explicit correction in relation to error types and learner repair in immersion classrooms. Language learning, Vol. 48, 2 (1998), 183--218."},{"key":"e_1_3_2_2_16_1","volume-title":"A contrastive phonetic study between Cantonese and English to predict salient mispronunciations by Cantonese learners of English. Unpublished article","author":"Meng Helen","year":"2007","unstructured":"Helen Meng , Eric Zee , and Wai Sum Lee . 2007. A contrastive phonetic study between Cantonese and English to predict salient mispronunciations by Cantonese learners of English. Unpublished article . The Chinese University of Hong Kong ( 2007 ). Helen Meng, Eric Zee, and Wai Sum Lee. 2007. A contrastive phonetic study between Cantonese and English to predict salient mispronunciations by Cantonese learners of English. Unpublished article. The Chinese University of Hong Kong (2007)."},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2015.7178909"},{"volume-title":"Language transfer","author":"Odlin Terence","key":"e_1_3_2_2_18_1","unstructured":"Terence Odlin . 1989. Language transfer . Vol. 27 . Cambridge University Press Cambridge . Terence Odlin. 1989. Language transfer. Vol. 27. Cambridge University Press Cambridge."},{"key":"e_1_3_2_2_19_1","volume-title":"Wavenet: A generative model for raw audio. arXiv preprint arXiv:1609.03499","author":"van den Oord Aaron","year":"2016","unstructured":"Aaron van den Oord , Sander Dieleman , Heiga Zen , Karen Simonyan , Oriol Vinyals , Alex Graves , Nal Kalchbrenner , Andrew Senior , and Koray Kavukcuoglu . 2016 . Wavenet: A generative model for raw audio. arXiv preprint arXiv:1609.03499 (2016). Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, and Koray Kavukcuoglu. 2016. Wavenet: A generative model for raw audio. arXiv preprint arXiv:1609.03499 (2016)."},{"key":"e_1_3_2_2_20_1","unstructured":"Aaron van den Oord Yazhe Li Igor Babuschkin Karen Simonyan Oriol Vinyals Koray Kavukcuoglu George van den Driessche Edward Lockhart Luis C Cobo Florian Stimberg etal 2017. Parallel wavenet: Fast high-fidelity speech synthesis. arXiv preprint arXiv:1711.10433 (2017). Aaron van den Oord Yazhe Li Igor Babuschkin Karen Simonyan Oriol Vinyals Koray Kavukcuoglu George van den Driessche Edward Lockhart Luis C Cobo Florian Stimberg et al. 2017. Parallel wavenet: Fast high-fidelity speech synthesis. arXiv preprint arXiv:1711.10433 (2017)."},{"key":"e_1_3_2_2_21_1","volume-title":"Beyond Fossilization: A Course in Strategies and Techniques in Pronunciation for Advanced Adult Learners. TESL Canada Journal","author":"Ricard Ellen","year":"1986","unstructured":"Ellen Ricard . 1986 . Beyond Fossilization: A Course in Strategies and Techniques in Pronunciation for Advanced Adult Learners. TESL Canada Journal (1986), 243--253. Ellen Ricard. 1986. Beyond Fossilization: A Course in Strategies and Techniques in Pronunciation for Advanced Adult Learners. TESL Canada Journal (1986), 243--253."},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2018.8461368"},{"key":"e_1_3_2_2_23_1","unstructured":"Winifred Strange. 1995. Speech perception and linguistic experience: Theoretical and methodological issues. Winifred Strange. 1995. Speech perception and linguistic experience: Theoretical and methodological issues."},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00530-014-0446-1"},{"key":"e_1_3_2_2_25_1","volume-title":"Tacotron: Towards end-to-end speech synthesis. arXiv preprint arXiv:1703.10135","author":"Wang Yuxuan","year":"2017","unstructured":"Yuxuan Wang , RJ Skerry-Ryan , Daisy Stanton , Yonghui Wu , Ron J Weiss , Navdeep Jaitly , Zongheng Yang , Ying Xiao , Zhifeng Chen , Samy Bengio , 2017 . Tacotron: Towards end-to-end speech synthesis. arXiv preprint arXiv:1703.10135 (2017). Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J Weiss, Navdeep Jaitly, Zongheng Yang, Ying Xiao, Zhifeng Chen, Samy Bengio, et al. 2017. Tacotron: Towards end-to-end speech synthesis. arXiv preprint arXiv:1703.10135 (2017)."},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2011.5947656"},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSDA.2011.6085985"},{"key":"e_1_3_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133956.3133962"},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2013.6639267"}],"event":{"name":"MM '20: The 28th ACM International Conference on Multimedia","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Seattle WA USA","acronym":"MM '20"},"container-title":["Proceedings of the 28th ACM International Conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3394171.3414444","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3394171.3414444","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:01:24Z","timestamp":1750197684000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3394171.3414444"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,12]]},"references-count":29,"alternative-id":["10.1145\/3394171.3414444","10.1145\/3394171"],"URL":"https:\/\/doi.org\/10.1145\/3394171.3414444","relation":{},"subject":[],"published":{"date-parts":[[2020,10,12]]},"assertion":[{"value":"2020-10-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}