{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,2]],"date-time":"2026-03-02T22:57:32Z","timestamp":1772492252736,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":22,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,8,18]],"date-time":"2021-08-18T00:00:00Z","timestamp":1629244800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Singapore Ministry of Education (MOE) Academic Research Fund (AcRF)","award":["19-C220-SMU-002"],"award-info":[{"award-number":["19-C220-SMU-002"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,8,20]]},"DOI":"10.1145\/3468264.3473124","type":"proceedings-article","created":{"date-parts":[[2021,8,19]],"date-time":"2021-08-19T01:40:37Z","timestamp":1629337237000},"page":"1575-1579","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":19,"title":["CrossASR++: a modular differential testing framework for automatic speech recognition"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0862-2579","authenticated-orcid":false,"given":"Muhammad Hilmi","family":"Asyrofi","sequence":"first","affiliation":[{"name":"Singapore Management University, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5938-1918","authenticated-orcid":false,"given":"Zhou","family":"Yang","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4367-7201","authenticated-orcid":false,"given":"David","family":"Lo","sequence":"additional","affiliation":[{"name":"Singapore Management University, 
Singapore"}]}],"member":"320","published-online":{"date-parts":[[2021,8,18]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/SP40001.2021.00009"},{"key":"e_1_3_2_1_2_1","volume-title":"Proceedings of The 33rd International Conference on Machine Learning, Maria Florina Balcan and Kilian Q. Weinberger (Eds.) (Proceedings of Machine Learning Research","volume":"182","author":"Amodei Dario","year":"2016","unstructured":"Dario Amodei, Sundaram Ananthanarayanan, Rishita Anubhai, Jingliang Bai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Qiang Cheng, Guoliang Chen, Jie Chen, Jingdong Chen, Zhijie Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Ke Ding, Niandong Du, Erich Elsen, Jesse Engel, Weiwei Fang, Linxi Fan, Christopher Fougner, Liang Gao, Caixia Gong, Awni Hannun, Tony Han, Lappi Johannes, Bing Jiang, Cai Ju, Billy Jun, Patrick LeGresley, Libby Lin, Junjie Liu, Yang Liu, Weigao Li, Xiangang Li, Dongpeng Ma, Sharan Narang, Andrew Ng, Sherjil Ozair, Yiping Peng, Ryan Prenger, Sheng Qian, Zongfeng Quan, Jonathan Raiman, Vinay Rao, Sanjeev Satheesh, David Seetapun, Shubho Sengupta, Kavya Srinet, Anuroop Sriram, Haiyuan Tang, Liliang Tang, Chong Wang, Jidong Wang, Kaifu Wang, Yi Wang, Zhijian Wang, Zhiqian Wang, Shuang Wu, Likai Wei, Bo Xiao, Wen Xie, Yan Xie, Dani Yogatama, Bin Yuan, Jun Zhan, and Zhenyao Zhu. 2016. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin. In Proceedings of The 33rd International Conference on Machine Learning, Maria Florina Balcan and Kilian Q. Weinberger (Eds.) (Proceedings of Machine Learning Research, Vol. 48). PMLR, New York, New York, USA. 173\u2013182. 
http:\/\/proceedings.mlr.press\/v48\/amodei16.html"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSME46990.2020.00066"},{"key":"e_1_3_2_1_4_1","unstructured":"Alexei Baevski Henry Zhou Abdelrahman Mohamed and Michael Auli. 2020. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations. arxiv:2006.11477."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/SPW.2018.00009"},{"key":"e_1_3_2_1_6_1","unstructured":"Xiaoning Du Xiaofei Xie Yi Li Lei Ma Jianjun Zhao and Yang Liu. 2018. 
DeepCruiser: Automated Guided Testing for Stateful Deep Learning Systems. arxiv:1812.05339."},{"key":"e_1_3_2_1_7_1","volume-title":"eSpeak TTS","author":"Duddington Jonathan","unstructured":"Jonathan Duddington. [n.d.]. eSpeak TTS. http:\/\/espeak.sourceforge.net Accessed: 2021-04-30."},{"key":"e_1_3_2_1_8_1","unstructured":"Pierre Nicolas Durette. [n.d.]. Google Translate\u2019s Text-to-Speech. https:\/\/pypi.org\/project\/gTTS\/ Accessed: 2021-04-30."},{"key":"e_1_3_2_1_9_1","unstructured":"The Centre for Speech Technology Research. [n.d.]. The Festival Speech Synthesis System. https:\/\/www.cstr.ed.ac.uk\/projects\/festival\/ Accessed: 2021-04-30."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE-SEIP.2019.00016"},{"key":"e_1_3_2_1_11_1","unstructured":"Jianmin Guo Yue Zhao Quan Zhang and Yu Jiang. 2021. RNN-Test: Towards Adversarial Testing for Recurrent Neural Network Systems. arxiv:1911.06155."},{"key":"e_1_3_2_1_12_1","volume-title":"Ng","author":"Hannun Awni","year":"2014","unstructured":"
Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubho Sengupta, Adam Coates, and Andrew Y. Ng. 2014. Deep Speech: Scaling up end-to-end speech recognition. arxiv:1412.5567."},{"key":"e_1_3_2_1_13_1","unstructured":"HuggingFace. [n.d.]. HuggingFace. https:\/\/huggingface.co Accessed: 2021-04-30."},{"key":"e_1_3_2_1_14_1","unstructured":"Wit.ai Inc. [n.d.]. Wit. https:\/\/wit.ai Accessed: 2021-04-30."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"crossref","unstructured":"Shreya Khare Rahul Aralikatte and Senthil Mani. 2019. Adversarial Black-Box Attacks on Automatic Speech Recognition Systems using Multi-Objective Evolutionary Optimization. arxiv:1811.01312.","DOI":"10.21437\/Interspeech.2019-2420"},{"key":"e_1_3_2_1_16_1","unstructured":"Tino Khong. [n.d.]. ResponsiveVoice TTS. https:\/\/pypi.org\/project\/rvtts\/ Accessed: 2021-04-30."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.1990.115546"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2019.8683535"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/icassp.2019.8683535"},{"key":"e_1_3_2_1_20_1","volume-title":"ICML","author":"Qin Yao","unstructured":"Yao Qin, Nicholas Carlini, Ian J. 
Goodfellow, Garrison W. Cottrell, and Colin Raffel. 2019. Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition. In ICML. http:\/\/proceedings.mlr.press\/v97\/qin19a\/qin19a.pdf"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"crossref","unstructured":"Lea Sch\u00f6nherr Katharina Kohls Steffen Zeiler Thorsten Holz and Dorothea Kolossa. 2018. Adversarial Attacks Against Automatic Speech Recognition Systems via Psychoacoustic Hiding. arxiv:1808.05665.","DOI":"10.14722\/ndss.2019.23288"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/SPW.2019.00016"}],"event":{"name":"ESEC\/FSE '21: 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering","location":"Athens Greece","acronym":"ESEC\/FSE '21","sponsor":["SIGSOFT ACM Special Interest Group on Software Engineering"]},"container-title":["Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software 
Engineering"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3468264.3473124","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3468264.3473124","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:17:22Z","timestamp":1750191442000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3468264.3473124"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,18]]},"references-count":22,"alternative-id":["10.1145\/3468264.3473124","10.1145\/3468264"],"URL":"https:\/\/doi.org\/10.1145\/3468264.3473124","relation":{},"subject":[],"published":{"date-parts":[[2021,8,18]]},"assertion":[{"value":"2021-08-18","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}