{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,24]],"date-time":"2026-07-24T14:53:56Z","timestamp":1784904836180,"version":"3.55.0"},"reference-count":80,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2023,5,9]],"date-time":"2023-05-09T00:00:00Z","timestamp":1683590400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2023,5,31]]},"abstract":"<jats:p>Modern Automatic Speech Recognition (ASR) systems can achieve high performance in terms of recognition accuracy. However, a perfectly accurate transcript still can be challenging to read due to grammatical errors, disfluency, and other noises common in spoken communication. These readable issues introduced by speakers and ASR systems will impair the performance of downstream tasks and the understanding of human readers. In this work, we present a task called ASR post-processing for readability (APR) and formulate it as a sequence-to-sequence text generation problem. The APR task aims to transform the noisy ASR output into a readable text for humans and downstream tasks while maintaining the semantic meaning of speakers. We further study the APR task from the benchmark dataset, evaluation metrics, and baseline models: First, to address the lack of task-specific data, we propose a method to construct a dataset for the APR task by using the data collected for grammatical error correction. Second, we utilize metrics adapted or borrowed from similar tasks to evaluate model performance on the APR task. Lastly, we use several typical or adapted pre-trained models as the baseline models for the APR task. 
Furthermore, we fine-tune the baseline models on the constructed dataset and compare their performance with a traditional pipeline method in terms of proposed evaluation metrics. Experimental results show that all the fine-tuned baseline models perform better than the traditional pipeline method, and our adapted RoBERTa model outperforms the pipeline method by 4.95 and 6.63 BLEU points on two test sets, respectively. The human evaluation and case study further reveal the ability of the proposed model to improve the readability of ASR transcripts.<\/jats:p>","DOI":"10.1145\/3557894","type":"journal-article","created":{"date-parts":[[2022,8,22]],"date-time":"2022-08-22T11:41:58Z","timestamp":1661168518000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":32,"title":["Improving Readability for Automatic Speech Recognition Transcription"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7321-7583","authenticated-orcid":false,"given":"Junwei","family":"Liao","sequence":"first","affiliation":[{"name":"University of Electronic Science and Technology of China, Chengdu, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6259-5925","authenticated-orcid":false,"given":"Sefik","family":"Eskimez","sequence":"additional","affiliation":[{"name":"Microsoft Speech and Dialogue Research Group, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5003-7459","authenticated-orcid":false,"given":"Liyang","family":"Lu","sequence":"additional","affiliation":[{"name":"Microsoft Speech and Dialogue Research Group, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1872-3429","authenticated-orcid":false,"given":"Yu","family":"Shi","sequence":"additional","affiliation":[{"name":"Microsoft Speech and Dialogue Research Group, 
USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6140-7187","authenticated-orcid":false,"given":"Ming","family":"Gong","sequence":"additional","affiliation":[{"name":"Microsoft STCA NLP Group, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1050-7708","authenticated-orcid":false,"given":"Linjun","family":"Shou","sequence":"additional","affiliation":[{"name":"Microsoft STCA NLP Group, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6114-3441","authenticated-orcid":false,"given":"Hong","family":"Qu","sequence":"additional","affiliation":[{"name":"University of Electronic Science and Technology of China, Chengdu, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5302-5883","authenticated-orcid":false,"given":"Michael","family":"Zeng","sequence":"additional","affiliation":[{"name":"Microsoft Speech and Dialogue Research Group, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2023,5,9]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/3373266"},{"key":"e_1_3_2_3_2","first-page":"4234","volume-title":"Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI\u201916)","author":"Anantaram C.","year":"2016","unstructured":"C. Anantaram, Sunil Kumar Kopparapu, Chiragkumar Patel, and Aditya Mittal. 2016. Repairing general-purpose ASR output to improve accuracy of spoken sentences in specific domains using artificial development approach. In Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI\u201916), Subbarao Kambhampati (Ed.). IJCAI\/AAAI Press, 4234\u20134235. 
http:\/\/www.ijcai.org\/Abstract\/16\/637."},{"key":"e_1_3_2_4_2","article-title":"Post-editing error correction algorithm for speech recognition using bing spelling suggestion","volume":"1203","author":"Bassil Youssef","year":"2012","unstructured":"Youssef Bassil and Mohammad Alwani. 2012. Post-editing error correction algorithm for speech recognition using bing spelling suggestion. ArXiv Preprint abs\/1203.5255 (2012). https:\/\/arxiv.org\/abs\/1203.5255.","journal-title":"ArXiv Preprint"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2008.05.008"},{"key":"e_1_3_2_6_2","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1109\/TSP.2012.6256332","volume-title":"2012 35th International Conference on Telecommunications and Signal Processing (TSP\u201912)","author":"Bohac Marek","year":"2012","unstructured":"Marek Bohac, Karel Blavka, Michaela Kucharova, and Svatava Skodova. 2012. Post-processing of the recognized speech for web presentation of large audio archive. In 2012 35th International Conference on Telecommunications and Signal Processing (TSP\u201912). IEEE, 441\u2013445."},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W16-2301"},{"key":"e_1_3_2_8_2","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language models are few-shot learners. 
Advances in Neural Information Processing Systems 33 (2020), 1877\u20131901.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-4406"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W17-4773"},{"key":"e_1_3_2_11_2","volume-title":"Proceedings of the 12th International Workshop on Spoken Language Translation (IWSLT\u201915)","author":"Cho Eunah","year":"2015","unstructured":"Eunah Cho, Jan Niehues, Kevin Kilgour, and Alex Waibel. 2015. Punctuation insertion for real-time spoken language translation. In Proceedings of the 12th International Workshop on Spoken Language Translation (IWSLT\u201915). https:\/\/aclanthology.org\/2015.iwslt-papers.8."},{"key":"e_1_3_2_12_2","volume-title":"International Workshop on Spoken Language Translation 2012 (IWSLT\u201912)","author":"Cho Eunah","year":"2012","unstructured":"Eunah Cho, Jan Niehues, and Alex Waibel. 2012. Segmentation and punctuation prediction in speech language translation using a monolingual translation system. 
In International Workshop on Spoken Language Translation 2012 (IWSLT\u201912)."},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-4423"},{"key":"e_1_3_2_14_2","article-title":"Palm: Scaling language modeling with pathways","author":"Chowdhery Aakanksha","year":"2022","unstructured":"Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov, and Noah Fiedel. 2022. Palm: Scaling language modeling with pathways. arXiv preprint arXiv:2204.02311 (2022).","journal-title":"arXiv preprint arXiv:2204.02311"},{"key":"e_1_3_2_15_2","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1007\/978-3-642-39593-2_7","volume-title":"International Conference on Statistical Language and Speech Processing (SLSP\u201913)","author":"Cucu Horia","year":"2013","unstructured":"Horia Cucu, Andi Buzo, Laurent Besacier, and Corneliu Burileanu. 2013. Statistical error correction methods for domain-specific ASR systems. In International Conference on Statistical Language and Speech Processing (SLSP\u201913). 
Springer, 83\u201392."},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.5555\/2382029.2382118"},{"key":"e_1_3_2_17_2","first-page":"3079","volume-title":"Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015 (NeurIPS\u201915).","author":"Dai Andrew M.","year":"2015","unstructured":"Andrew M. Dai and Quoc V. Le. 2015. Semi-supervised sequence learning. In Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015 (NeurIPS\u201915)., Corinna Cortes, Neil D. Lawrence, Daniel D. Lee, Masashi Sugiyama, and Roman Garnett (Eds.). 3079\u20133087. https:\/\/proceedings.neurips.cc\/paper\/2015\/hash\/7137debd45ae4d0ab9aa953017286b20-Abstract.html."},{"key":"e_1_3_2_18_2","article-title":"Modeling multi-speaker latent space to improve neural TTS: Quick enrolling new speaker and enhancing premium voice","volume":"1812","author":"Deng Yan","year":"2018","unstructured":"Yan Deng, Lei He, and Frank Soong. 2018. Modeling multi-speaker latent space to improve neural TTS: Quick enrolling new speaker and enhancing premium voice. ArXiv Preprint abs\/1812.05253 (2018). https:\/\/arxiv.org\/abs\/1812.05253.","journal-title":"ArXiv Preprint"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1423"},{"key":"e_1_3_2_20_2","first-page":"13042","volume-title":"Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019 (NeurIPS\u201919)","author":"Dong Li","year":"2019","unstructured":"Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou, and Hsiao-Wuen Hon. 2019. Unified language model pre-training for natural language understanding and generation. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019 (NeurIPS\u201919), Hanna M. 
Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d\u2019Alch\u00e9-Buc, Emily B. Fox, and Roman Garnett (Eds.). 13042\u201313054. https:\/\/proceedings.neurips.cc\/paper\/2019\/hash\/c20bb2d9a50d5ac1f713f8b34d9aac5a-Abstract.html."},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W18-2706"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1177\/001316447303300309"},{"key":"e_1_3_2_23_2","article-title":"Reaching human-level performance in automatic grammatical error correction: An empirical study","volume":"1807","author":"Ge Tao","year":"2018","unstructured":"Tao Ge, Furu Wei, and Ming Zhou. 2018. Reaching human-level performance in automatic grammatical error correction: An empirical study. ArXiv Preprint abs\/1807.01270 (2018). https:\/\/arxiv.org\/abs\/1807.01270.","journal-title":"ArXiv Preprint"},{"key":"e_1_3_2_24_2","first-page":"927","article-title":"Switchboard-1 Release 2","volume":"926","author":"Godfrey John J.","year":"1997","unstructured":"John J. Godfrey and Edward Holliman. 1997. Switchboard-1 Release 2. Linguistic Data Consortium 926 (1997), 927.","journal-title":"Linguistic Data Consortium"},{"key":"e_1_3_2_25_2","doi-asserted-by":"crossref","first-page":"25","DOI":"10.4324\/9781315841342","volume-title":"Learner English on Computer","author":"Granger Sylviane","year":"2014","unstructured":"Sylviane Granger. 2014. The computer learner corpus: A versatile new source of data for SLA research: Sylviane Granger. In Learner English on Computer. Routledge, 25\u201340."},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2009.4960690"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-4427"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2019.8683745"},{"key":"e_1_3_2_29_2","article-title":"Gaussian error linear units (GELUs)","volume":"1606","author":"Hendrycks Dan","year":"2016","unstructured":"Dan Hendrycks and Kevin Gimpel. 2016. 
Gaussian error linear units (GELUs). ArXiv Preprint abs\/1606.08415 (2016). https:\/\/arxiv.org\/abs\/1606.08415.","journal-title":"ArXiv Preprint"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W17-4775"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1031"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP40776.2020.9053051"},{"key":"e_1_3_2_33_2","volume-title":"3rd International Conference on Learning Representations (ICLR\u201915), Conference Track Proceedings","author":"Kingma Diederik P.","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations (ICLR\u201915), Conference Track Proceedings, Yoshua Bengio and Yann LeCun (Eds.). http:\/\/arxiv.org\/abs\/1412.6980."},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1119"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.703"},{"key":"e_1_3_2_36_2","first-page":"7699","volume-title":"2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201920)","author":"Li Jinyu","year":"2020","unstructured":"Jinyu Li, Rui Zhao, Eric Sun, Jeremy H. M. Wong, Amit Das, Zhong Meng, and Yifan Gong. 2020. High-accuracy and low-latency speech recognition with two-head contextual layer trajectory LSTM model. In 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201920). IEEE, 7699\u20137703."},{"key":"e_1_3_2_37_2","article-title":"Jurassic-1: Technical details and evaluation","author":"Lieber Opher","year":"2021","unstructured":"Opher Lieber, Or Sharir, Barak Lenz, and Yoav Shoham. 2021. Jurassic-1: Technical details and evaluation. White Paper. AI21 Labs (2021).","journal-title":"White Paper. 
AI21 Labs"},{"key":"e_1_3_2_38_2","first-page":"501","volume-title":"Proceedings of the 20th International Conference on Computational Linguistics (COLING\u201904)","author":"Lin Chin-Yew","year":"2004","unstructured":"Chin-Yew Lin and Franz Josef Och. 2004. ORANGE: A method for evaluating automatic evaluation metrics for machine translation. In Proceedings of the 20th International Conference on Computational Linguistics (COLING\u201904). COLING, 501\u2013507. https:\/\/aclanthology.org\/C04-1072."},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3440993"},{"key":"e_1_3_2_40_2","article-title":"Roberta: A robustly optimized BERT pretraining approach","volume":"1907","author":"Liu Yinhan","year":"2019","unstructured":"Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Roberta: A robustly optimized BERT pretraining approach. ArXiv Preprint abs\/1907.11692 (2019). https:\/\/arxiv.org\/abs\/1907.11692.","journal-title":"ArXiv Preprint"},{"key":"e_1_3_2_41_2","first-page":"2232","volume-title":"Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC\u201916)","author":"Liyanapathirana Jeevanthi","year":"2016","unstructured":"Jeevanthi Liyanapathirana and Andrei Popescu-Belis. 2016. Using the TED talks to evaluate spoken post-editing of machine translation. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC\u201916). European Language Resources Association (ELRA), 2232\u20132239. https:\/\/aclanthology.org\/L16-1355."},{"key":"e_1_3_2_42_2","first-page":"6294","volume-title":"Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017 (NeurIPS\u201917)","author":"McCann Bryan","year":"2017","unstructured":"Bryan McCann, James Bradbury, Caiming Xiong, and Richard Socher. 2017. 
Learned in translation: Contextualized word vectors. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017 (NeurIPS\u201917). Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (Eds.). 6294\u20136305. https:\/\/proceedings.neurips.cc\/paper\/2017\/hash\/20c86a628232a67e7bd46f76fba7ce12-Abstract.html."},{"key":"e_1_3_2_43_2","doi-asserted-by":"crossref","first-page":"2284","DOI":"10.1109\/ICASSP.2016.7472084","volume-title":"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201916)","author":"Miao Yajie","year":"2016","unstructured":"Yajie Miao, Jinyu Li, Yongqiang Wang, Shi-Xiong Zhang, and Yifan Gong. 2016. Simplifying long short-term memory acoustic models for fast training and decoding. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201916). IEEE, 2284\u20132288."},{"key":"e_1_3_2_44_2","first-page":"147","volume-title":"Proceedings of 5th International Joint Conference on Natural Language Processing (IJCNLP\u201911)","author":"Mizumoto Tomoya","year":"2011","unstructured":"Tomoya Mizumoto, Mamoru Komachi, Masaaki Nagata, and Yuji Matsumoto. 2011. Mining revision log of language learning SNS for automated japanese error correction of second language learners. In Proceedings of 5th International Joint Conference on Natural Language Processing (IJCNLP\u201911). Asian Federation of Natural Language Processing, 147\u2013155. 
https:\/\/aclanthology.org\/I11-1017."},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2022.3152001"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00282"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/p15-2097"},{"key":"e_1_3_2_48_2","first-page":"229","volume-title":"Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL\u201917)","author":"Napoles Courtney","year":"2017","unstructured":"Courtney Napoles, Keisuke Sakaguchi, and Joel Tetreault. 2017. JFLEG: A fluency corpus and benchmark for grammatical error correction. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL\u201917). Association for Computational Linguistics, 229\u2013234. https:\/\/aclanthology.org\/E17-2037."},{"key":"e_1_3_2_49_2","first-page":"1","volume-title":"Proceedings of the 17th Conference on Computational Natural Language Learning (CoNLL\u201913)","author":"Ng Hwee Tou","year":"2013","unstructured":"Hwee Tou Ng, Siew Mei Wu, Yuanbin Wu, Christian Hadiwinoto, and Joel Tetreault. 2013. The CoNLL-2013 shared task on grammatical error correction. In Proceedings of the 17th Conference on Computational Natural Language Learning (CoNLL\u201913). Association for Computational Linguistics, 1\u201312. https:\/\/aclanthology.org\/W13-3601."},{"key":"e_1_3_2_50_2","first-page":"349","volume-title":"Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL\u201917)","author":"Pal Santanu","year":"2017","unstructured":"Santanu Pal, Sudip Kumar Naskar, Mihaela Vela, Qun Liu, and Josef van Genabith. 2017. Neural automatic post-editing using prior alignment and reranking. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL\u201917). 
Association for Computational Linguistics, 349\u2013355. https:\/\/aclanthology.org\/E17-2056."},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-2046"},{"key":"e_1_3_2_52_2","volume-title":"Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC\u201902)","author":"Papineni Kishore","year":"2002","unstructured":"Kishore Papineni. 2002. Machine translation evaluation: N-grams to the rescue. In Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC\u201902). European Language Resources Association (ELRA). http:\/\/www.lrec-conf.org\/proceedings\/lrec2002\/pdf\/347.pdf."},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073135"},{"key":"e_1_3_2_54_2","doi-asserted-by":"crossref","first-page":"5105","DOI":"10.1109\/ICASSP.2008.4518807","volume-title":"2008 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201908)","author":"Paulik Matthias","year":"2008","unstructured":"Matthias Paulik, Sharath Rao, Ian Lane, Stephan Vogel, and Tanja Schultz. 2008. Sentence segmentation and punctuation recovery for spoken language translation. In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201908). IEEE, 5105\u20135108."},{"key":"e_1_3_2_55_2","volume-title":"6th International Conference on Learning Representations (ICLR\u201918), Conference Track Proceedings","author":"Paulus Romain","year":"2018","unstructured":"Romain Paulus, Caiming Xiong, and Richard Socher. 2018. A deep reinforced model for abstractive summarization. In 6th International Conference on Learning Representations (ICLR\u201918), Conference Track Proceedings. OpenReview.net. https:\/\/openreview.net\/forum?id=HkAClQgA-."},{"key":"e_1_3_2_56_2","unstructured":"Alec Radford Karthik Narasimhan Tim Salimans and Ilya Sutskever. 2018. Improving language understanding by generative pre-training. 
https:\/\/s3-us-west-2.amazonaws.com\/openai-assets\/researchcovers\/languageunsupervised\/languageunderstandingpaper.pdf."},{"issue":"8","key":"e_1_3_2_57_2","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford Alec","year":"2019","unstructured":"Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. OpenAI Blog 1, 8 (2019).","journal-title":"OpenAI Blog"},{"key":"e_1_3_2_58_2","article-title":"Scaling language models: Methods, analysis & insights from training gopher","author":"Rae Jack W.","year":"2021","unstructured":"Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, H. Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese, Amy Wu, Erich Elsen, Siddhant M. Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan, Michela Paganini, Laurent Sifre, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d.Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew Johnson, Blake A. Hechtman, Laura Weidinger, Iason Gabriel, William S. Isaac, Edward Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis, Koray Kavukcuoglu, and Geoffrey Irving. 2021. 
Scaling language models: Methods, analysis & insights from training gopher. arXiv preprint arXiv:2112.11446 (2021).","journal-title":"arXiv preprint arXiv:2112.11446"},{"key":"e_1_3_2_59_2","article-title":"Exploring the limits of transfer learning with a unified text-to-text transformer","volume":"1910","author":"Raffel Colin","year":"2019","unstructured":"Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2019. Exploring the limits of transfer learning with a unified text-to-text transformer. ArXiv Preprint abs\/1910.10683 (2019). https:\/\/arxiv.org\/abs\/1910.10683.","journal-title":"ArXiv Preprint"},{"key":"e_1_3_2_60_2","first-page":"702","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL\/IJCNLP\u201921)","author":"Rothe Sascha","year":"2021","unstructured":"Sascha Rothe, Jonathan Mallinson, Eric Malmi, Sebastian Krause, and Aliaksei Severyn. 2021. A simple recipe for multilingual grammatical error correction. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL\/IJCNLP\u201921). 702\u2013707."},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1162"},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2018.8461368"},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1145\/3465383"},{"key":"e_1_3_2_64_2","article-title":"Learning from past mistakes: Improving automatic speech recognition output via noisy-clean phrase context modeling","volume":"8","author":"Shivakumar Prashanth Gurunath","year":"2019","unstructured":"Prashanth Gurunath Shivakumar, Haoqi Li, Kevin Knight, and Panayiotis Georgiou. 2019. 
Learning from past mistakes: Improving automatic speech recognition output via noisy-clean phrase context modeling. APSIPA Transactions on Signal and Information Processing 8 (2019).","journal-title":"APSIPA Transactions on Signal and Information Processing"},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.5555\/1857999.1858022"},{"key":"e_1_3_2_66_2","doi-asserted-by":"crossref","first-page":"446","DOI":"10.1007\/978-3-642-32790-2_54","volume-title":"International Conference on Text, Speech and Dialogue (TSD\u201912)","author":"\u0160kodov\u00e1 Svatava","year":"2012","unstructured":"Svatava \u0160kodov\u00e1, Michaela Kucha\u0159ov\u00e1, and Ladislav \u0160eps. 2012. Discretion of speech units for the text post-processing phase of automatic transcription (in the Czech language). In International Conference on Text, Speech and Dialogue (TSD\u201912). Springer, 446\u2013455."},{"key":"e_1_3_2_67_2","article-title":"Using deepspeed and megatron to train megatron-turing nlg 530b, a large-scale generative language model","author":"Smith Shaden","year":"2022","unstructured":"Shaden Smith, Mostofa Patwary, Brandon Norick, Patrick LeGresley, Samyam Rajbhandari, Jared Casper, Zhun Liu, Shrimai Prabhumoye, George Zerveas, Vijay Korthikanti, Elton Zheng, Rewon Child, Reza Yazdani Aminabadi, Julie Bernauer, Xia Song, Mohammad Shoeybi, Yuxiong He, Michael Houston, Saurabh Tiwary, and Bryan Catanzaro. 2022. Using deepspeed and megatron to train megatron-turing nlg 530b, a large-scale generative language model. arXiv preprint arXiv:2201.11990 (2022).","journal-title":"arXiv preprint arXiv:2201.11990"},{"key":"e_1_3_2_68_2","series-title":"Proceedings of the 36th International Conference on Machine Learning (ICML\u201919)","first-page":"5926","volume":"97","author":"Song Kaitao","year":"2019","unstructured":"Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, and Tie-Yan Liu. 2019. MASS: Masked sequence to sequence pre-training for language generation. 
In Proceedings of the 36th International Conference on Machine Learning (ICML\u201919) (Proceedings of Machine Learning Research, Vol. 97), Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.). PMLR, 5926\u20135936. http:\/\/proceedings.mlr.press\/v97\/song19d.html."},{"key":"e_1_3_2_69_2","first-page":"194","volume-title":"13th Annual Conference of the International Speech Communication Association (INTERSPEECH\u201912 Portland, Oregon, USA, September 9-13, 2012)","author":"Sundermeyer Martin","year":"2012","unstructured":"Martin Sundermeyer, Ralf Schl\u00fcter, and Hermann Ney. 2012. LSTM neural networks for language modeling. In 13th Annual Conference of the International Speech Communication Association (INTERSPEECH\u201912 Portland, Oregon, USA, September 9-13, 2012), ISCA, 194\u2013197. http:\/\/www.isca-speech.org\/archive\/interspeech_2012\/i12_0194.html."},{"key":"e_1_3_2_70_2","first-page":"3104","volume-title":"Advances in Neural Information Processing Systems (NeurIPS\u201914)","author":"Sutskever Ilya","year":"2014","unstructured":"Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems (NeurIPS\u201914). 3104\u20133112."},{"key":"e_1_3_2_71_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.308"},{"key":"e_1_3_2_72_2","first-page":"198","volume-title":"Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL\u201912)","author":"Tajiri Toshikazu","year":"2012","unstructured":"Toshikazu Tajiri, Mamoru Komachi, and Yuji Matsumoto. 2012. Tense and aspect error correction for ESL learners using global context. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL\u201912). Association for Computational Linguistics, 198\u2013202. 
https:\/\/aclanthology.org\/P12-2039."},{"key":"e_1_3_2_73_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W17-4776"},{"key":"e_1_3_2_74_2","first-page":"5998","volume-title":"Annual Conference on Neural Information Processing Systems 2017 (NeurIPS\u201917)","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Annual Conference on Neural Information Processing Systems 2017 (NeurIPS\u201917). Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (Eds.). 5998\u20136008. https:\/\/proceedings.neurips.cc\/paper\/2017\/hash\/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html."},{"key":"e_1_3_2_75_2","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2016-911"},{"key":"e_1_3_2_76_2","doi-asserted-by":"publisher","DOI":"10.1006\/csla.2001.0182"},{"key":"e_1_3_2_77_2","article-title":"Google\u2019s neural machine translation system: Bridging the gap between human and machine translation","volume":"1609","author":"Wu Yonghui","year":"2016","unstructured":"Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Lukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Gregory S. Corrado, Macduff Hughes, and Jeffrey Dean. 2016. Google\u2019s neural machine translation system: Bridging the gap between human and machine translation. ArXiv Preprint abs\/1609.08144 (2016). 
https:\/\/arxiv.org\/abs\/1609.08144.","journal-title":"ArXiv Preprint"},{"key":"e_1_3_2_78_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2018.8461870"},{"key":"e_1_3_2_79_2","first-page":"5754","volume-title":"Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019 (NeurIPS\u201919)","author":"Yang Zhilin","year":"2019","unstructured":"Zhilin Yang, Zihang Dai, Yiming Yang, Jaime G. Carbonell, Ruslan Salakhutdinov, and Quoc V. Le. 2019. XLNet: Generalized autoregressive pretraining for language understanding. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019 (NeurIPS\u201919), Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d\u2019Alch\u00e9-Buc, Emily B. Fox, and Roman Garnett (Eds.). 5754\u20135764. https:\/\/proceedings.neurips.cc\/paper\/2019\/hash\/dc6a7e655d7e5840e66733e9ee67cc69-Abstract.html."},{"key":"e_1_3_2_80_2","first-page":"180","volume-title":"Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL\u201911)","author":"Yannakoudakis Helen","year":"2011","unstructured":"Helen Yannakoudakis, Ted Briscoe, and Ben Medlock. 2011. A new dataset and method for automatically grading ESOL texts. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL\u201911). Association for Computational Linguistics, 180\u2013189. https:\/\/aclanthology.org\/P11-1019."},{"key":"e_1_3_2_81_2","article-title":"Sequence-to-sequence Pre-training with data augmentation for sentence rewriting","volume":"1909","author":"Zhang Yi","year":"2019","unstructured":"Yi Zhang, Tao Ge, Furu Wei, Ming Zhou, and Xu Sun. 2019. Sequence-to-sequence Pre-training with data augmentation for sentence rewriting. ArXiv Preprint abs\/1909.06002 (2019). 
https:\/\/arxiv.org\/abs\/1909.06002.","journal-title":"ArXiv Preprint"}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3557894","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3557894","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:02:40Z","timestamp":1750186960000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3557894"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,9]]},"references-count":80,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2023,5,31]]}},"alternative-id":["10.1145\/3557894"],"URL":"https:\/\/doi.org\/10.1145\/3557894","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"value":"2375-4699","type":"print"},{"value":"2375-4702","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,9]]},"assertion":[{"value":"2022-03-08","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-08-09","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-05-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}