{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:24:24Z","timestamp":1750220664553,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":17,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,10,25]],"date-time":"2020-10-25T00:00:00Z","timestamp":1603584000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,10,25]]},"DOI":"10.1145\/3395035.3425640","type":"proceedings-article","created":{"date-parts":[[2020,12,28]],"date-time":"2020-12-28T05:36:27Z","timestamp":1609133787000},"page":"130-134","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["See me Speaking? Differentiating on Whether Words are Spoken On Screen or Off to Optimize Machine Dubbing"],"prefix":"10.1145","author":[{"given":"Shravan","family":"Nayak","sequence":"first","affiliation":[{"name":"Indian Institute of Technology (BHU), Varanasi, India"}]},{"given":"Timo","family":"Baumann","sequence":"additional","affiliation":[{"name":"Universit\u00e4t Hamburg, Hamburg, Germany"}]},{"given":"Supratik","family":"Bhattacharya","sequence":"additional","affiliation":[{"name":"Birla Institute of Technology and Science, Pilani, India"}]},{"given":"Alina","family":"Karakanta","sequence":"additional","affiliation":[{"name":"Fondazione Bruno Kessler & University of Trento, Trento, Italy"}]},{"given":"Matteo","family":"Negri","sequence":"additional","affiliation":[{"name":"Fondazione Bruno Kessler, Trento, Italy"}]},{"given":"Marco","family":"Turchi","sequence":"additional","affiliation":[{"name":"Fondazione Bruno Kessler, Trento, 
Italy"}]}],"member":"320","published-online":{"date-parts":[[2020,12,27]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Proceedings of ICLR","author":"Bahdanau Dzmitry","year":"2015","unstructured":"Dzmitry Bahdanau, Kyunghyun Cho, and Y. Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. Proceedings of ICLR 2015."},{"key":"e_1_3_2_1_2_1","volume-title":"2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG)","volume":"6","author":"Baltru\u0161aitis Tadas","year":"2015","unstructured":"Tadas Baltru\u0161aitis, Marwa Mahmoud, and Peter Robinson. 2015. Cross-dataset learning and person-specific normalisation for automatic action unit detection. In 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), Vol. 6. IEEE, 1--6."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/FG.2018.00019"},{"volume-title":"Encyclopedia of Language & Linguistics","author":"Chaume-Varela Frederic","key":"e_1_3_2_1_4_1","unstructured":"Frederic Chaume-Varela. 2006. Dubbing. In Encyclopedia of Language & Linguistics (Second Edition), Keith Brown (Ed.). Elsevier, Oxford, 6 -- 9. 
https:\/\/doi.org\/10.1016\/B0-08-044854--2\/00471--5"},{"key":"e_1_3_2_1_5_1","volume-title":"Proceedings of the 17th International Conference on Spoken Language Translation. Association for Computational Linguistics, Online, 257--264","author":"Federico Marcello","year":"2020","unstructured":"Marcello Federico, Robert Enyedi, Roberto Barra-Chicote, Ritwik Giri, Umut Isik, Arvindh Krishnaswamy, and Hassan Sawaf. 2020. From Speech-to-Speech Translation to Automatic Dubbing. In Proceedings of the 17th International Conference on Spoken Language Translation. Association for Computational Linguistics, Online, 257--264. https:\/\/www.aclweb.org\/anthology\/2020.iwslt-1.31"},{"key":"e_1_3_2_1_6_1","volume-title":"Film Dubbing: Phonetic Semiotic, Esthetic & Psychological Aspects","author":"Fodor I.","year":"1976","unstructured":"I. Fodor. 1976. Film Dubbing: Phonetic Semiotic, Esthetic & Psychological Aspects. Buske Helmut Verlag Gmb."},{"key":"e_1_3_2_1_7_1","volume-title":"Adam: A Method for Stochastic Optimization. In 3rd International Conference on Learning Representations, ICLR","author":"Kingma D. P.","year":"2015","unstructured":"D. P. Kingma and J. Ba. 2015. Adam: A Method for Stochastic Optimization. In 3rd International Conference on Learning Representations, ICLR 2015. 
http:\/\/arxiv.org\/abs\/1412.6980"},{"key":"e_1_3_2_1_8_1","volume-title":"Digital Humanities Conference","author":"Kisler Thomas","year":"2012","unstructured":"Thomas Kisler, Florian Schiel, and Han Sloetjes. 2012. Signal processing via web services: the use case WebMAUS. In Digital Humanities Conference 2012."},{"key":"e_1_3_2_1_9_1","volume-title":"Manning","author":"Luong Minh-Thang","year":"2015","unstructured":"Minh-Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective Approaches to Attention-based Neural Machine Translation. CoRR, Vol. abs\/1508.04025 (2015). arxiv: 1508.04025 http:\/\/arxiv.org\/abs\/1508.04025"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.21437\/IberSPEECH.2018-5"},{"volume-title":"Topics in audiovisual translation","author":"Orero Pilar","key":"e_1_3_2_1_11_1","unstructured":"Pilar Orero. 2004. Topics in audiovisual translation. Vol. 56. John Benjamins Publishing."},{"volume-title":"PyTorch: An Imperative Style","author":"Paszke Adam","key":"e_1_3_2_1_12_1","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. 
In Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alch\u00e9-Buc, E. Fox, and R. Garnett (Eds.). Curran Associates, Inc., 8024--8035. http:\/\/papers.neurips.cc\/paper\/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf"},{"key":"e_1_3_2_1_13_1","volume-title":"AVA-Active Speaker: An Audio-Visual Dataset for Active Speaker Detection. CoRR","author":"Roth Joseph","year":"2019","unstructured":"Joseph Roth, Sourish Chaudhuri, Ondrej Klejch, Radhika Marvin, Andrew C. Gallagher, Liat Kaver, Sharadh Ramaswamy, Arkadiusz Stopczynski, Cordelia Schmid, Zhonghua Xi, and Caroline Pantofaru. 2019. AVA-Active Speaker: An Audio-Visual Dataset for Active Speaker Detection. CoRR, Vol. abs\/1901.01342 (2019). 
arxiv: 1901.01342 http:\/\/arxiv.org\/abs\/1901.01342"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-5210"},{"key":"e_1_3_2_1_15_1","volume-title":"Proceedings of the LREC.","author":"Schiel Florian","year":"2004","unstructured":"Florian Schiel. 2004. MAUS goes iterative. In Proceedings of the LREC."},{"key":"e_1_3_2_1_16_1","unstructured":"Dong Yi, Zhen Lei, S. Liao, and S. Li. 2014. Learning Face Representation from Scratch. ArXiv Vol. abs\/1411.7923 (2014)."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2016.2603342"}],"event":{"name":"ICMI '20: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION","sponsor":["SIGCHI ACM Special Interest Group on Computer-Human Interaction"],"location":"Virtual Event Netherlands","acronym":"ICMI '20"},"container-title":["Companion Publication of the 2020 International Conference on Multimodal 
Interaction"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3395035.3425640","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3395035.3425640","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:02:47Z","timestamp":1750197767000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3395035.3425640"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,25]]},"references-count":17,"alternative-id":["10.1145\/3395035.3425640","10.1145\/3395035"],"URL":"https:\/\/doi.org\/10.1145\/3395035.3425640","relation":{},"subject":[],"published":{"date-parts":[[2020,10,25]]},"assertion":[{"value":"2020-12-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}