{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,21]],"date-time":"2026-01-21T14:09:55Z","timestamp":1769004595611,"version":"3.49.0"},"reference-count":94,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2025,1,21]],"date-time":"2025-01-21T00:00:00Z","timestamp":1737417600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100018537","name":"National Science and Technology Major Project","doi-asserted-by":"crossref","award":["2021ZD0112903"],"award-info":[{"award-number":["2021ZD0112903"]}],"id":[{"id":"10.13039\/501100018537","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62372225"],"award-info":[{"award-number":["62372225"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2025,2,28]]},"abstract":"<jats:p>\n            With the development of Deep Learning, Natural Language Processing (NLP) applications have reached or even exceeded human-level capabilities in certain tasks. Although NLP applications have shown good performance, they can still have bugs like traditional software and even lead to serious consequences. Inspired by Lego blocks and syntax structure analysis, we propose an assembling test generation method for NLP applications or models and implement it in\n            <jats:monospace>NLPLego<\/jats:monospace>\n            . 
The key idea of\n            <jats:monospace>NLPLego<\/jats:monospace>\n            is to assemble the sentence skeleton and adjuncts in order by simulating the building of Lego blocks to generate multiple grammatically and semantically correct sentences based on one seed sentence. The sentences generated by\n            <jats:monospace>NLPLego<\/jats:monospace>\n            have derivation relations and different degrees of variation. These characteristics make it well-suited for integration with metamorphic testing theory, addressing the challenge of test oracle absence in NLP application testing. To validate\n            <jats:monospace>NLPLego<\/jats:monospace>\n            , we conduct experiments on three commonly used NLP tasks (i.e., machine reading comprehension, sentiment analysis, and semantic similarity measures), focusing on the efficiency of test generation and the quality and effectiveness of generated tests. We select five advanced NLP models and one popular industrial NLP software as the tested subjects. Given seed tests from SQuAD 2.0, SST, and QQP,\n            <jats:monospace>NLPLego<\/jats:monospace>\n            successfully detects 1,732, 3,140, and 261,879 incorrect behaviors with around 93.1% precision in three tasks, respectively. The experiment results show that\n            <jats:monospace>NLPLego<\/jats:monospace>\n            can efficiently generate high-quality tests for multiple NLP tasks to detect erroneous behaviors effectively. In the case study, we analyze the testing results provided by\n            <jats:monospace>NLPLego<\/jats:monospace>\n            to obtain intuitive representations of the different NLP capabilities of the tested subjects. 
The case study confirms that\n            <jats:monospace>NLPLego<\/jats:monospace>\n            can provide developers with clarity on the direction to improve NLP models or applications, laying the foundation for enhancing performance.\n          <\/jats:p>","DOI":"10.1145\/3691631","type":"journal-article","created":{"date-parts":[[2024,10,5]],"date-time":"2024-10-05T13:15:29Z","timestamp":1728134129000},"page":"1-36","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["NLPLego: Assembling Test Generation for Natural Language Processing Applications"],"prefix":"10.1145","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-3154-4409","authenticated-orcid":false,"given":"Pin","family":"Ji","sequence":"first","affiliation":[{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7477-3642","authenticated-orcid":false,"given":"Yang","family":"Feng","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-1107-5736","authenticated-orcid":false,"given":"Ruohao","family":"Zhang","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-8381-1281","authenticated-orcid":false,"given":"Ruichen","family":"Xue","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-6536-0885","authenticated-orcid":false,"given":"Yichi","family":"Zhang","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-9832-0124","authenticated-orcid":false,"given":"Weitao","family":"Huang","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8368-4898","authenticated-orcid":false,"given":"Jia","family":"Liu","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-8850-5583","authenticated-orcid":false,"given":"Zhihong","family":"Zhao","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China"}]}],"member":"320","published-online":{"date-parts":[[2025,1,21]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"Google Translate. 2024. The Feedback Channel of Google Translate. Retrieved from https:\/\/translate.google.com\/intl\/en\/about\/contribute\/"},{"key":"e_1_3_1_3_2","unstructured":"GitHub. 2024. The GitHub Repository of NLPLego. Retrieved from https:\/\/github.com\/SSCT-Lab\/NLPLego"},{"key":"e_1_3_1_4_2","unstructured":"ChatGPT. 2024. The Website of ChatGPT. Retrieved from https:\/\/chat.openai.com\/"},{"key":"e_1_3_1_5_2","unstructured":"OpenAI. 2024. The Website of OpenAI. Retrieved from https:\/\/openai.com\/"},{"key":"e_1_3_1_6_2","unstructured":"spaCy. 2024. The Website of spaCy. Retrieved from https:\/\/spacy.io\/"},{"key":"e_1_3_1_7_2","unstructured":"Stanford CoreNLP. 2024. The Website of Stanford CoreNLP. Retrieved from https:\/\/stanfordnlp.github.io\/CoreNLP\/"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2008.12.005"},{"key":"e_1_3_1_9_2","volume-title":"The Comparative Evaluation of Dependency Parsers in Parsing Estonian","author":"Alam Nusaeb Nur","year":"2017","unstructured":"Nusaeb Nur Alam. 2017. 
The Comparative Evaluation of Dependency Parsers in Parsing Estonian. Master\u2019s thesis. University of Tartu."},{"key":"e_1_3_1_10_2","unstructured":"Gina Cherelus Amy Tennery. 2016. Microsoft\u2019s AI Twitter Bot Goes Dark after Racist Sexist Tweets. Retrieved from https:\/\/www.reuters.com\/article\/us-microsoft-twitter-bot-idUSKCN0WQ2LA"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/TII.2017.2788019"},{"issue":"12","key":"e_1_3_1_12_2","first-page":"5087","article-title":"Biasfinder: Metamorphic test generation to uncover bias for sentiment analysis systems","volume":"48","author":"Asyrofi Muhammad Hilmi","year":"2021","unstructured":"Muhammad Hilmi Asyrofi, Zhou Yang, Imam Nur Bani Yusuf, Hong Jin Kang, Ferdian Thung, and David Lo. 2021. Biasfinder: Metamorphic test generation to uncover bias for sentiment analysis systems. IEEE Transactions on Software Engineering 48, 12 (2021), 5087\u20135101.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2021.107584"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2019.06.012"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324921000395"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2014.2372785"},{"key":"e_1_3_1_17_2","first-page":"63","volume-title":"Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics","author":"Loper Edward","year":"2002","unstructured":"Edward Loper and Steven Bird. 2002. NLTK: The natural language toolkit. 
In Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics, 63\u201370."},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.4324\/9781003070702"},{"key":"e_1_3_1_19_2","volume-title":"Syntax: A Generative Introduction","author":"Carnie Andrew","year":"2021","unstructured":"Andrew Carnie. 2021. Syntax: A Generative Introduction. John Wiley & Sons."},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/3440755"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/3468264.3468569"},{"key":"e_1_3_1_22_2","unstructured":"Tsong Y. Chen Shing C. Cheung and Shiu Ming Yiu. 2020. Metamorphic testing: A new approach for generating next test cases. arXiv:2002.12543. Retrieved from https:\/\/arxiv.org\/abs\/2002.12543"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/3143561"},{"key":"e_1_3_1_24_2","first-page":"13","volume-title":"Proceedings of the 2019 IEEE\/ACM 12th International Workshop on Search-Based Software Testing (SBST)","author":"Cohen Myra B.","year":"2019","unstructured":"Myra B. Cohen. 2019. The maturation of search-based software testing: Successes and challenges. In Proceedings of the 2019 IEEE\/ACM 12th International Workshop on Search-Based Software Testing (SBST). IEEE, 13\u201314."},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/3406095"},{"key":"e_1_3_1_26_2","first-page":"95","volume-title":"Proceedings of the Canadian Linguistics Society Meeting, Cahiers Linguistiques d\u2019Ottawa","author":"DeArmond Richard C.","year":"1998","unstructured":"Richard C. DeArmond and Nancy Hedberg. 1998. On complements and adjuncts. 
In Proceedings of the Canadian Linguistics Society Meeting, Cahiers Linguistiques d\u2019Ottawa, 95\u2013106."},{"key":"e_1_3_1_27_2","first-page":"1","article-title":"Simulated annealing: From basics to applications","author":"Delahaye Daniel","year":"2019","unstructured":"Daniel Delahaye, Supatcha Chaimatanan, and Marcel Mongeau. 2019. Simulated annealing: From basics to applications. Handbook of Metaheuristics (2019), 1\u201335.","journal-title":"Handbook of Metaheuristics"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-020-09866-x"},{"key":"e_1_3_1_29_2","first-page":"103","volume-title":"Proceedings of the 2021 IEEE International Conference on Artificial Intelligence Testing (AITest)","author":"Ebadi Hamid","year":"2021","unstructured":"Hamid Ebadi, Mahshid Helali Moghadam, Markus Borg, Gregory Gay, Afonso Fontes, and Kasper Socha. 2021. Efficient and effective generation of test cases for pedestrian detection-search-based software testing of Baidu Apollo in SVL. In Proceedings of the 2021 IEEE International Conference on Artificial Intelligence Testing (AITest). IEEE, 103\u2013110."},{"issue":"3","key":"e_1_3_1_30_2","first-page":"389","article-title":"A survey on semantic similarity measure","volume":"2","author":"Elavarasi S. Anitha","year":"2014","unstructured":"S. Anitha Elavarasi, J. Akilandeswari, and K. Menaga. 2014. A survey on semantic similarity measure. International Journal of Research in Advent Technology 2, 3 (2014), 389\u2013398.","journal-title":"International Journal of Research in Advent Technology"},{"key":"e_1_3_1_31_2","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511486258","volume":"96","author":"Ernst Thomas","year":"2001","unstructured":"Thomas Ernst. 2001. The Syntax of Adjuncts, Vol. 96. 
Cambridge University Press.","journal-title":"The Syntax of Adjuncts"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","unstructured":"Matt Gardner Yoav Artzi Victoria Basmova Jonathan Berant Ben Bogin Sihao Chen Pradeep Dasigi Dheeru Dua Yanai Elazar Ananth Gottumukkala et al. 2020. Evaluating models\u2019 local decision boundaries via contrast sets. arXiv:2004.02709. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.2004.02709","DOI":"10.48550\/arXiv.2004.02709"},{"issue":"12","key":"e_1_3_1_33_2","first-page":"1","article-title":"Prompt engineering with ChatGPT: A guide for academic writers","volume":"51","author":"Giray Louie","year":"2023","unstructured":"Louie Giray. 2023. Prompt engineering with ChatGPT: A guide for academic writers. Annals of Biomedical Engineering 51, 12 (2023), 1\u20135.","journal-title":"Annals of Biomedical Engineering"},{"key":"e_1_3_1_34_2","unstructured":"NIST Multimodal Information Group. 2024. NIST 2002 Open Machine Translation (OpenMT) Evaluation. Retrieved from https:\/\/catalog.ldc.upenn.edu\/LDC2010T10"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/3368089.3409756"},{"key":"e_1_3_1_36_2","first-page":"1097","volume-title":"Proceedings of COLING 2012","author":"Harashima Jun","year":"2012","unstructured":"Jun Harashima and Sadao Kurohashi. 2012. Flexible Japanese sentence compression by relaxing unit constraints. In Proceedings of COLING 2012, 1097\u20131112."},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","unstructured":"Hany Hassan Anthony Aue Chang Chen Vishal Chowdhary Jonathan Clark Christian Federmann Xuedong Huang Marcin Junczys-Dowmunt William Lewis Mu Li et al. 2018. Achieving human parity on automatic Chinese to English news translation. arXiv:1803.05567. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.1803.05567","DOI":"10.48550\/arXiv.1803.05567"},{"key":"e_1_3_1_38_2","unstructured":"Pengcheng He Jianfeng Gao and Weizhu Chen. 2021. 
Debertav3: Improving Deberta using electra-style pre-training with gradient-disentangled embedding sharing. arXiv:2111.09543."},{"key":"e_1_3_1_39_2","first-page":"961","volume-title":"Proceedings of the 2020 IEEE\/ACM 42nd International Conference on Software Engineering (ICSE)","author":"He Pinjia","year":"2020","unstructured":"Pinjia He, Clara Meister, and Zhendong Su. 2020. Structure-invariant testing for machine translation. In Proceedings of the 2020 IEEE\/ACM 42nd International Conference on Software Engineering (ICSE), 961\u2013973."},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE43902.2021.00047"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/ITSC45102.2020.9294489"},{"key":"e_1_3_1_42_2","first-page":"1","article-title":"Prompting meaning: A hermeneutic approach to optimising prompt engineering with ChatGPT","author":"Henrickson Leah","year":"2023","unstructured":"Leah Henrickson and Albert Mero\u00f1o-Pe\u00f1uela. 2023. Prompting meaning: A hermeneutic approach to optimising prompt engineering with ChatGPT. AI & SOCIETY (2023), 1\u201316.","journal-title":"AI & SOCIETY"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.aaa8685"},{"key":"e_1_3_1_44_2","unstructured":"Shankar Iyer Nikhil Dandekar and Kornel Csernai. 2017. First Quora Dataset Release: Question Pairs. Retrieved from https:\/\/quoradata.quora.com\/First-Quora-Dataset-Release-Question-Pairs"},{"key":"e_1_3_1_45_2","first-page":"468","volume-title":"Proceedings of the 2021 36th IEEE\/ACM International Conference on Automated Software Engineering (ASE)","author":"Ji Pin","year":"2021","unstructured":"Pin Ji, Yang Feng, Jia Liu, Zhihong Zhao, and Baowen Xu. 2021. Automated testing for machine translation via constituency invariance. In Proceedings of the 2021 36th IEEE\/ACM International Conference on Automated Software Engineering (ASE). 
IEEE, 468\u2013479."},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2022.106966"},{"key":"e_1_3_1_47_2","first-page":"8050","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"34","author":"Kamigaito Hidetaka","year":"2020","unstructured":"Hidetaka Kamigaito and Manabu Okumura. 2020. Syntactically look-ahead attention network for sentence compression. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34, 8050\u20138057."},{"key":"e_1_3_1_48_2","doi-asserted-by":"crossref","first-page":"148","DOI":"10.18653\/v1\/2022.nlp4convai-1.13","volume-title":"Proceedings of the 4th Workshop on NLP for Conversational AI","author":"Kann Katharina","year":"2022","unstructured":"Katharina Kann, Abteen Ebrahimi, Joewie Koh, Shiran Dudy, and Alessandro Roncone. 2022. Open-domain dialogue generation: What we can do, cannot do, and should do next. In Proceedings of the 4th Workshop on NLP for Conversational AI, 148\u2013165."},{"key":"e_1_3_1_49_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-020-10139-6"},{"key":"e_1_3_1_50_2","first-page":"4171","volume-title":"Proceedings of NAACL-HLT","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT, 4171\u20134186."},{"key":"e_1_3_1_51_2","unstructured":"Zhenzhong Lan Mingda Chen Sebastian Goodman Kevin Gimpel Piyush Sharma and Radu Soricut. 2019. ALBERT: A lite Bert for self-supervised learning of language representations. arXiv:1909.11942."},{"issue":"1","key":"e_1_3_1_52_2","first-page":"1","article-title":"Exploring strategies for training deep neural networks","volume":"10","author":"Larochelle Hugo","year":"2009","unstructured":"Hugo Larochelle, Yoshua Bengio, J\u00e9r\u00f4me Louradour, and Pascal Lamblin. 2009. 
Exploring strategies for training deep neural networks. Journal of Machine Learning Research 10, 1 (2009), 1\u201340.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_1_53_2","unstructured":"Dave Lee. 2018. Amazon Promises Fix for Creepy Alexa Laugh. Retrieved from https:\/\/www.bbc.com\/news\/technology-43325230"},{"key":"e_1_3_1_54_2","first-page":"74","article-title":"Rouge: A package for automatic evaluation of summaries","author":"Lin Chin-Yew","year":"2004","unstructured":"Chin-Yew Lin. 2004. Rouge: A package for automatic evaluation of summaries. In Text Summarization Branches Out, 74\u201381.","journal-title":"Text Summarization Branches Out"},{"key":"e_1_3_1_55_2","first-page":"2287","volume-title":"Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI)","author":"Liu Bo","year":"2017","unstructured":"Bo Liu, Ying Wei, Yu Zhang, and Qiang Yang. 2017. Deep neural networks for high dimension, low sample size data. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI), 2287\u20132293."},{"issue":"3","key":"e_1_3_1_56_2","first-page":"491","article-title":"Search-based algorithm with scatter search strategy for automated test case generation of NLP toolkit","volume":"5","author":"Liu Fangqing","year":"2019","unstructured":"Fangqing Liu, Han Huang, Zhongming Yang, Zhifeng Hao, and Jiangping Wang. 2019. Search-based algorithm with scatter search strategy for automated test case generation of NLP toolkit. IEEE Transactions on Emerging Topics in Computational Intelligence 5, 3 (2019), 491\u2013503.","journal-title":"IEEE Transactions on Emerging Topics in Computational Intelligence"},{"key":"e_1_3_1_57_2","unstructured":"Yinhan Liu Myle Ott Naman Goyal Jingfei Du Mandar Joshi Danqi Chen Omer Levy Mike Lewis Luke Zettlemoyer and Veselin Stoyanov. 2019. Roberta: A robustly optimized Bert pretraining approach. 
arXiv:1907.11692."},{"key":"e_1_3_1_58_2","doi-asserted-by":"publisher","DOI":"10.1145\/3460319.3464829"},{"key":"e_1_3_1_59_2","doi-asserted-by":"publisher","DOI":"10.1145\/3551349.3556929"},{"key":"e_1_3_1_60_2","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1007\/978-3-319-91086-4_5","article-title":"Iterated local search: Framework and applications","author":"Louren\u00e7o Helena Ramalhinho","year":"2019","unstructured":"Helena Ramalhinho Louren\u00e7o, Olivier C. Martin, and Thomas St\u00fctzle. 2019. Iterated local search: Framework and applications. Handbook of Metaheuristics (2019), 129\u2013168.","journal-title":"Handbook of Metaheuristics"},{"key":"e_1_3_1_61_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2015.2510633"},{"key":"e_1_3_1_62_2","first-page":"116","volume-title":"Proceedings of the 4th International Conference on Dependency Linguistics (Depling \u201917)","author":"Mazziotta Nicolas","year":"2017","unstructured":"Nicolas Mazziotta and Sylvain Kahane. 2017. To what extent is immediate constituency analysis dependency-based? A survey of foundational texts. In Proceedings of the 4th International Conference on Dependency Linguistics (Depling \u201917). Link\u00f6ping University Electronic Press, Pisa, Italy, 116\u2013126. Retrieved from https:\/\/www.aclweb.org\/anthology\/W17-6515"},{"key":"e_1_3_1_63_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.asej.2014.04.011"},{"key":"e_1_3_1_64_2","doi-asserted-by":"publisher","DOI":"10.1145\/219717.219748"},{"key":"e_1_3_1_65_2","doi-asserted-by":"crossref","first-page":"371","DOI":"10.1007\/978-981-13-0761-4_36","volume-title":"Harmony Search and Nature Inspired Optimization Algorithms: Theory and Applications, ICHSA 2018","author":"Mishra Deepti Bala","year":"2019","unstructured":"Deepti Bala Mishra, Rajashree Mishra, Arup Abhinna Acharya, and Kedar Nath Das. 2019. Test case optimization and prioritization based on multi-objective genetic algorithm. 
In Harmony Search and Nature Inspired Optimization Algorithms: Theory and Applications, ICHSA 2018. Springer, 371\u2013381."},{"key":"e_1_3_1_66_2","doi-asserted-by":"crossref","DOI":"10.5040\/9781350934009","volume-title":"An Introduction to Syntax: Fundamentals of Syntactic Analysis","author":"Moravcsik Edith A.","year":"2006","unstructured":"Edith A. Moravcsik. 2006. An Introduction to Syntax: Fundamentals of Syntactic Analysis. A & C Black."},{"key":"e_1_3_1_67_2","unstructured":"Thuy Ong. 2017. Facebook Apologizes after Wrong Translation Sees Palestinian Man Arrested for Posting \u201cGood Morning\u201d. Retrieved from https:\/\/www.theverge.com\/us-world\/2017\/10\/24\/16533496\/facebook-apology-wrong-translation-palestinian-arrested-post-good-morning"},{"key":"e_1_3_1_68_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2020.2979670"},{"key":"e_1_3_1_69_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2017.2663435"},{"key":"e_1_3_1_70_2","first-page":"311","volume-title":"Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics","author":"Papineni Kishore","year":"2002","unstructured":"Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 311\u2013318."},{"issue":"21","key":"e_1_3_1_71_2","doi-asserted-by":"crossref","first-page":"9910","DOI":"10.3390\/app11219910","article-title":"Sentence compression using BERT and graph convolutional networks","volume":"11","author":"Park Yo-Han","year":"2021","unstructured":"Yo-Han Park, Gyong-Ho Lee, Yong-Seok Choi, and Kong-Joo Lee. 2021. Sentence compression using BERT and graph convolutional networks. 
Applied Sciences 11, 21 (2021), 9910.","journal-title":"Applied Sciences"},{"key":"e_1_3_1_72_2","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511800924","volume-title":"An Introduction to English Sentence Structure","author":"Radford Andrew","year":"2009","unstructured":"Andrew Radford. 2009. An Introduction to English Sentence Structure. Cambridge University Press."},{"key":"e_1_3_1_73_2","first-page":"784","article-title":"Know what you don\u2019t know: Unanswerable questions for SQuAD","author":"Rajpurkar P.","year":"2018","unstructured":"P. Rajpurkar, R. Jia, and P. Liang. 2018. Know what you don\u2019t know: Unanswerable questions for SQuAD. In Meeting of the Association for Computational Linguistics, 784\u2013789.","journal-title":"Meeting of the Association for Computational Linguistics"},{"key":"e_1_3_1_74_2","doi-asserted-by":"crossref","first-page":"2383","DOI":"10.18653\/v1\/D16-1264","volume-title":"Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Rajpurkar Pranav","year":"2016","unstructured":"Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. 2016. SQuAD: 100,000+ questions for machine comprehension of text. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2383\u20132392."},{"key":"e_1_3_1_75_2","doi-asserted-by":"crossref","first-page":"4902","DOI":"10.18653\/v1\/2020.acl-main.442","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL)","author":"Ribeiro Marco Tulio","year":"2020","unstructured":"Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin, and Sameer Singh. 2020. Beyond accuracy: Behavioral testing of NLP models with CheckList. 
In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), 4902\u20134912."},{"key":"e_1_3_1_76_2","doi-asserted-by":"publisher","unstructured":"Marco Tulio Ribeiro Tongshuang Wu Carlos Guestrin and Sameer Singh. 2020. Beyond accuracy: Behavioral testing of NLP models with CheckList. arXiv:2005.04118. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.2005.04118","DOI":"10.48550\/arXiv.2005.04118"},{"key":"e_1_3_1_77_2","doi-asserted-by":"publisher","DOI":"10.1145\/3551349.3556953"},{"key":"e_1_3_1_78_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2013.08.015"},{"key":"e_1_3_1_79_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D13-1170"},{"key":"e_1_3_1_80_2","first-page":"8968","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"34","author":"Sun Yu","year":"2020","unstructured":"Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Hao Tian, Hua Wu, and Haifeng Wang. 2020. Ernie 2.0: A continual pre-training framework for language understanding. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34, 8968\u20138975."},{"key":"e_1_3_1_81_2","doi-asserted-by":"publisher","DOI":"10.1145\/3377811.3380420"},{"key":"e_1_3_1_82_2","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510206"},{"key":"e_1_3_1_83_2","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1007\/978-3-030-85248-1_15","volume-title":"Proceedings of the Formal Methods for Industrial Critical Systems: 26th International Conference, FMICS \u201921","author":"Thibeault Quinn","year":"2021","unstructured":"Quinn Thibeault, Jacob Anderson, Aniruddh Chandratre, Giulia Pedrielli, and Georgios Fainekos. 2021. Psy-taliro: A Python toolbox for search-based test generation for cyber-physical systems. In Proceedings of the Formal Methods for Industrial Critical Systems: 26th International Conference, FMICS \u201921. 
Springer, 223\u2013231."},{"key":"e_1_3_1_84_2","doi-asserted-by":"publisher","DOI":"10.1162\/089120103321337458"},{"key":"e_1_3_1_85_2","volume-title":"Proceedings of the 7th International Conference on Learning Representations (ICLR \u201919)","author":"Wang Alex","year":"2019","unstructured":"Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. 2019. Glue: A multi-task benchmark and analysis platform for natural language understanding. In Proceedings of the 7th International Conference on Learning Representations (ICLR \u201919)."},{"key":"e_1_3_1_86_2","doi-asserted-by":"publisher","DOI":"10.1145\/3597926.3598081"},{"key":"e_1_3_1_87_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-022-10144-1"},{"key":"e_1_3_1_88_2","doi-asserted-by":"publisher","unstructured":"Yonghui Wu Mike Schuster Zhifeng Chen Quoc V. Le Mohammad Norouzi Wolfgang Macherey Maxim Krikun Yuan Cao Qin Gao Klaus Macherey et al. 2016. Google\u2019s neural machine translation system: Bridging the gap between human and machine translation. arXiv:1609.08144. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.1609.08144","DOI":"10.48550\/arXiv.1609.08144"},{"key":"e_1_3_1_89_2","article-title":"Xlnet: Generalized autoregressive pretraining for language understanding","volume":"32","author":"Yang Zhilin","year":"2019","unstructured":"Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Russ R. Salakhutdinov, and Quoc V. Le. 2019. Xlnet: Generalized autoregressive pretraining for language understanding. 
Advances in Neural Information Processing Systems 32 (2019).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_90_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-021-10024-2"},{"issue":"1","key":"e_1_3_1_91_2","doi-asserted-by":"crossref","first-page":"307","DOI":"10.32604\/iasc.2021.017239","article-title":"Optimizing the software testing problem using search-based software engineering techniques","volume":"29","author":"Zayed Hissah A. Ben","year":"2021","unstructured":"Hissah A. Ben Zayed and Mashael S. Maashi. 2021. Optimizing the software testing problem using search-based software engineering techniques. Intelligent Automation & Soft Computing 29, 1 (2021), 307\u2013318.","journal-title":"Intelligent Automation & Soft Computing"},{"key":"e_1_3_1_92_2","doi-asserted-by":"publisher","DOI":"10.3390\/app10217640"},{"key":"e_1_3_1_93_2","doi-asserted-by":"publisher","DOI":"10.1145\/3062341.3062379"},{"key":"e_1_3_1_94_2","doi-asserted-by":"publisher","unstructured":"Zhuosheng Zhang Hai Zhao and Rui Wang. 2020. Machine reading comprehension: The role of contextualized language models and beyond. arXiv:2005.06249. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.0708.0794","DOI":"10.48550\/arXiv.0708.0794"},{"key":"e_1_3_1_95_2","first-page":"403","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Zi Kangli","year":"2021","unstructured":"Kangli Zi, Shi Wang, Yu Liu, Jicun Li, Yanan Cao, and Cungen Cao. 2021. SOM-NCSCM: An efficient neural Chinese sentence compression model enhanced with self-organizing map. 
In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 403\u2013415."}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3691631","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3691631","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:09:40Z","timestamp":1750295380000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3691631"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1,21]]},"references-count":94,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,2,28]]}},"alternative-id":["10.1145\/3691631"],"URL":"https:\/\/doi.org\/10.1145\/3691631","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1,21]]},"assertion":[{"value":"2023-12-30","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-07-29","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-01-21","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}