{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T02:38:30Z","timestamp":1774579110584,"version":"3.50.1"},"reference-count":77,"publisher":"Association for Computing Machinery (ACM)","issue":"FSE","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Softw. Eng."],"published-print":{"date-parts":[[2025,6,19]]},"abstract":"<jats:p>Unit testing is crucial for software development and maintenance. Effective unit testing ensures and improves software quality, but writing unit tests is time-consuming and labor-intensive. Recent studies have proposed deep learning (DL) techniques or large language models (LLMs) to automate unit test generation. These models are usually trained or fine-tuned on large-scale datasets. Despite growing awareness of the importance of data quality, there has been limited research on the quality of datasets used for test generation. To bridge this gap, we systematically examine the impact of noise on the performance of learning-based test generation models. We first apply the open card sorting method to analyze the most popular and largest test generation dataset, Methods2Test, to categorize eight distinct types of noise. Further, we conduct detailed interviews with 17 domain experts to validate and assess the importance, reasonableness, and correctness of the noise taxonomy. Then, we propose CleanTest, an automated noise-cleaning framework designed to improve the quality of test generation datasets. CleanTest comprises three filters: a rule-based syntax filter, a rule-based relevance filter, and a model-based coverage filter. To evaluate its effectiveness, we apply CleanTest on two widely-used test generation datasets, i.e., Methods2Test and Atlas. Our findings indicate that 43.52% and 29.65% of datasets contain noise, highlighting its prevalence. 
Finally, we conduct comparative experiments using four LLMs (i.e., CodeBERT, AthenaTest, StarCoder, and CodeLlama7B) to assess the impact of noise on test generation performance. The results show that filtering noise positively influences the test generation ability of the models. Fine-tuning the four LLMs with the filtered Methods2Test dataset, on average, improves its performance by 67% in branch coverage, using the Defects4J benchmark. For the Atlas dataset, the four LLMs improve branch coverage by 39%. Additionally, filtering noise improves bug detection performance, resulting in a 21.42% increase in bugs detected by the generated tests.<\/jats:p>","DOI":"10.1145\/3715778","type":"journal-article","created":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T15:15:34Z","timestamp":1750346134000},"page":"1293-1316","source":"Crossref","is-referenced-by-count":3,"title":["Less Is More: On the Importance of Data Quality for Unit Test Generation"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4323-8951","authenticated-orcid":false,"given":"Junwei","family":"Zhang","sequence":"first","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0093-3292","authenticated-orcid":false,"given":"Xing","family":"Hu","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-2695-2968","authenticated-orcid":false,"given":"Shan","family":"Gao","sequence":"additional","affiliation":[{"name":"Huawei, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6302-3256","authenticated-orcid":false,"given":"Xin","family":"Xia","sequence":"additional","affiliation":[{"name":"Huawei, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4367-7201","authenticated-orcid":false,"given":"David","family":"Lo","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore, 
Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2615-9792","authenticated-orcid":false,"given":"Shanping","family":"Li","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}]}],"member":"320","published-online":{"date-parts":[[2025,6,19]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.INFSOF.2024.107565"},{"key":"e_1_2_1_2_1","volume-title":"Likert scales and data analyses. Quality progress, 40, 7","author":"Elaine Allen I","year":"2007","unstructured":"I Elaine Allen and Christopher A Seaman. 2007. Likert scales and data analyses. Quality progress, 40, 7 (2007), 64\u201365."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE-SEIP.2017.27"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1108\/10662241211199960"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2014.2372785"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.3390\/JIMAGING6060041"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3551349.3556961"},{"key":"e_1_2_1_8_1","volume-title":"MacDonell","author":"Bosu Michael Franklin","year":"2021","unstructured":"Michael Franklin Bosu and Stephen G. MacDonell. 2021. A Taxonomy of Data Quality Challenges in Empirical Software Engineering. CoRR, abs\/2106.06141 (2021), arxiv:2106.06141"},{"key":"e_1_2_1_9_1","unstructured":"Cobertura.. 2025. Cobertura is a free java tool that calculates the percentage of code accessed by tests. https:\/\/cobertura.github.io\/cobertura\/"},{"key":"e_1_2_1_10_1","unstructured":"Jackson Core.. 2025. Jackson Core. https:\/\/github.com\/google\/Closure"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE48619.2023.00022"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1368088.1368127"},{"key":"e_1_2_1_13_1","unstructured":"Jackson Dataformat-xml.. 2025. Jackson Dataformat-xml. 
https:\/\/github.com\/FasterXML\/jackson-dataformat-xml"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2022.3227418"},{"key":"e_1_2_1_15_1","volume-title":"Neural Unit Test Suggestions. CoRR, abs\/2109.09262","author":"Dinella Elizabeth","year":"2021","unstructured":"Elizabeth Dinella, Shuvendu K. Lahiri, Todd Mytkowicz, and Gabriel Ryan. 2021. Neural Unit Test Suggestions. CoRR, abs\/2109.09262 (2021), arxiv:2109.09262"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510141"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.SCICO.2007.01.015"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.18653\/V1"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2025113.2025179"},{"key":"e_1_2_1_20_1","volume-title":"InCoder: A Generative Model for Code Infilling and Synthesis. In The Eleventh International Conference on Learning Representations, ICLR 2023","author":"Fried Daniel","year":"2023","unstructured":"Daniel Fried, Armen Aghajanyan, Jessy Lin, Sida Wang, Eric Wallace, Freda Shi, Ruiqi Zhong, Scott Yih, Luke Zettlemoyer, and Mike Lewis. 2023. InCoder: A Generative Model for Code Infilling and Synthesis. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net. https:\/\/openreview.net\/forum?id=hQwb-lbM6EL"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-59762-7_19"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3597503.3608134"},{"key":"e_1_2_1_23_1","volume-title":"Goodrich and Roberto Tamassia","author":"Michael","year":"2002","unstructured":"Michael T. Goodrich and Roberto Tamassia. 2002. Algorithm design - foundations, analysis and internet examples. Wiley. 
isbn:978-0-471-38365-9"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3624032.3624035"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2009.71"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2308.08033"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3238147.3238183"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3611643.3616265"},{"key":"e_1_2_1_29_1","volume-title":"LoRA: Low-Rank Adaptation of Large Language Models. In The Tenth International Conference on Learning Representations, ICLR 2022","author":"Hu Edward J.","year":"2022","unstructured":"Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2022. LoRA: Low-Rank Adaptation of Large Language Models. In The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25-29, 2022. OpenReview.net. https:\/\/openreview.net\/forum?id=nZeVKeeFYf9"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3196321.3196334"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3338906.3340459"},{"key":"e_1_2_1_32_1","unstructured":"Jacoco.. 2025. JaCoCo is a free code coverage library for Java which has been created by the EclEmma team based on the lessons learned from using and integration existing libraries for many years.. https:\/\/www.jacoco.org\/jacoco\/"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2931037.2931062"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2610384.2628055"},{"key":"e_1_2_1_35_1","unstructured":"Commons jxpath.. 2025. Commons jxpath. 
https:\/\/github.com\/apache\/commons-jxpath"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.18653\/V1"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE48619.2023.00085"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","unstructured":"Raymond Li Loubna Ben Allal Yangtian Zi Niklas Muennighoff Denis Kocetkov Chenghao Mou Marc Marone Christopher Akiki Jia Li Jenny Chim Qian Liu Evgenii Zheltonozhskii Terry Yue Zhuo Thomas Wang Olivier Dehaene Mishig Davaadorj Joel Lamy-Poirier Jo\u00e3o Monteiro Oleh Shliazhko Nicolas Gontier Nicholas Meade Armel Zebaze Ming-Ho Yee Logesh Kumar Umapathi Jian Zhu Benjamin Lipkin Muhtasham Oblokulov Zhiruo Wang Rudra Murthy V Jason Stillerman Siva Sankalp Patel Dmitry Abulkhanov Marco Zocca Manan Dey Zhihan Zhang Nour Moustafa-Fahmy Urvashi Bhattacharyya Wenhao Yu Swayam Singh Sasha Luccioni Paulo Villegas Maxim Kunakov Fedor Zhdanov Manuel Romero Tony Lee Nadav Timor Jennifer Ding Claire Schlesinger Hailey Schoelkopf Jan Ebert Tri Dao Mayank Mishra Alex Gu Jennifer Robinson Carolyn Jane Anderson Brendan Dolan-Gavitt Danish Contractor Siva Reddy Daniel Fried Dzmitry Bahdanau Yacine Jernite Carlos Mu\u00f1oz Ferrandis Sean Hughes Thomas Wolf Arjun Guha Leandro von Werra and Harm de Vries. 2023. StarCoder: may the source be with you!. CoRR abs\/2305.06161 (2023) https:\/\/doi.org\/10.48550\/ARXIV.2305.06161 10.48550\/ARXIV.2305.06161","DOI":"10.48550\/ARXIV.2305.06161"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2972958.2972967"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3597926.3598080"},{"key":"e_1_2_1_41_1","unstructured":"Openclover.. 2025. Openclover code coverage platform for java and groovy. 
https:\/\/openclover.org\/"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/1297846.1297902"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2007.37"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","unstructured":"Replication Package.. 2025. Replication. https:\/\/doi.org\/10.5281\/zenodo.13767074 10.5281\/zenodo.13767074","DOI":"10.5281\/zenodo.13767074"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073135"},{"key":"e_1_2_1_46_1","unstructured":"The parser generator tool.. 2025. Tree Sitter. https:\/\/tree-sitter.github.io\/tree-sitter"},{"key":"e_1_2_1_47_1","unstructured":"Math Project.. 2025. apache\/commons-math. https:\/\/github.com\/apache\/commons-math"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/IWAST.2012.6228988"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASE56229.2023.00193"},{"key":"e_1_2_1_50_1","volume-title":"CodeBLEU: a Method for Automatic Evaluation of Code Synthesis. CoRR, abs\/2009.10297","author":"Ren Shuo","year":"2020","unstructured":"Shuo Ren, Daya Guo, Shuai Lu, Long Zhou, Shujie Liu, Duyu Tang, Neel Sundaresan, Ming Zhou, Ambrosio Blanco, and Shuai Ma. 2020. CodeBLEU: a Method for Automatic Evaluation of Code Synthesis. CoRR, abs\/2009.10297 (2020), arxiv:2009.10297"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","unstructured":"Baptiste Rozi\u00e8re Jonas Gehring Fabian Gloeckle Sten Sootla Itai Gat Xiaoqing Ellen Tan Yossi Adi Jingyu Liu Tal Remez J\u00e9r\u00e9my Rapin Artyom Kozhevnikov Ivan Evtimov Joanna Bitton Manish Bhatt Cristian Canton-Ferrer Aaron Grattafiori Wenhan Xiong Alexandre D\u00e9fossez Jade Copet Faisal Azhar Hugo Touvron Louis Martin Nicolas Usunier Thomas Scialom and Gabriel Synnaeve. 2023. Code Llama: Open Foundation Models for Code. 
CoRR abs\/2308.12950 (2023) https:\/\/doi.org\/10.48550\/ARXIV.2308.12950 10.48550\/ARXIV.2308.12950","DOI":"10.48550\/ARXIV.2308.12950"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1111\/J.1468-0394.2005.00300.X"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2302.06527"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2023.3334955"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/3540250.3549145"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2412.14308"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-32381-3_16"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510160"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2015.139"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/3524842.3528009"},{"key":"e_1_2_1_61_1","volume-title":"Shao Kun Deng, and Neel Sundaresan","author":"Tufano Michele","year":"2020","unstructured":"Michele Tufano, Dawn Drain, Alexey Svyatkovskiy, Shao Kun Deng, and Neel Sundaresan. 2020. Unit test case generation with transformers and focal context. arXiv preprint arXiv:2009.05617."},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/3524481.3527220"},{"key":"e_1_2_1_63_1","volume-title":"IOP conference series: materials science and engineering. 324","author":"Wang Weijie","year":"2018","unstructured":"Weijie Wang and Yanmin Lu. 2018. Analysis of the mean absolute error (MAE) and the root mean square error (RMSE) in assessing rounding model. In IOP conference series: materials science and engineering. 
324, 012049."},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.18653\/V1"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/3377811.3380429"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-25231-0_5"},{"key":"e_1_2_1_67_1","volume-title":"ReAssert: Deep Learning for Assert Generation. CoRR, abs\/2011.09784","author":"White Robert","year":"2020","unstructured":"Robert White and Jens Krinke. 2020. ReAssert: Deep Learning for Assert Generation. CoRR, abs\/2011.09784 (2020), arxiv:2011.09784"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2021.3063727"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/1289971.1289983"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE48619.2023.00129"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2301.13246"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASE.2013.6693084"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1145\/3520312.3534862"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2406.18181"},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510149"},{"key":"e_1_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2305.04207"},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.24963\/IJCAI.2022"}],"container-title":["Proceedings of the ACM on Software 
Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3715778","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T15:17:07Z","timestamp":1750346227000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3715778"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,19]]},"references-count":77,"journal-issue":{"issue":"FSE","published-print":{"date-parts":[[2025,6,19]]}},"alternative-id":["10.1145\/3715778"],"URL":"https:\/\/doi.org\/10.1145\/3715778","relation":{},"ISSN":["2994-970X"],"issn-type":[{"value":"2994-970X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,19]]}}}