{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,12]],"date-time":"2026-05-12T19:21:41Z","timestamp":1778613701297,"version":"3.51.4"},"reference-count":73,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2025,3,24]],"date-time":"2025-03-24T00:00:00Z","timestamp":1742774400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,3,24]],"date-time":"2025-03-24T00:00:00Z","timestamp":1742774400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001866","name":"Fonds National de la Recherche Luxembourg","doi-asserted-by":"publisher","award":["17185670"],"award-info":[{"award-number":["17185670"]}],"id":[{"id":"10.13039\/501100001866","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Empir Software Eng"],"published-print":{"date-parts":[[2025,5]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    The quality of software is closely tied to the effectiveness of the tests it undergoes. Manual test writing, though crucial for bug detection, is time-consuming, which has driven significant research into automated test case generation. However, current methods often struggle to generate relevant inputs, limiting the effectiveness of the tests produced. To address this, we introduce\n                    <jats:sc>BRMiner<\/jats:sc>\n                    , a novel approach that leverages Large Language Models (LLMs) in combination with traditional techniques to extract relevant inputs from bug reports, thereby enhancing automated test generation tools. 
In this study, we evaluate\n                    <jats:sc>BRMiner<\/jats:sc>\n                    using the Defects4J benchmark and test generation tools such as EvoSuite and Randoop. Our results demonstrate that\n                    <jats:sc>BRMiner<\/jats:sc>\n                    achieves a Relevant Input Rate (RIR) of 60.03% and a Relevant Input Extraction Accuracy Rate (RIEAR) of 31.71%, significantly outperforming methods that rely on LLMs alone. The integration of BRMiner\u2019s input enhances EvoSuite\u2019s ability to generate more effective tests, leading to increased code coverage, with gains observed in branch, instruction, method, and line coverage across multiple projects. Furthermore,\n                    <jats:sc>BRMiner<\/jats:sc>\n                    facilitated the detection of 58 unique bugs, including those that were missed by traditional baseline approaches. Overall,\n                    <jats:sc>BRMiner<\/jats:sc>\n                    \u2019s combination of LLM filtering with traditional input extraction techniques significantly improves the relevance and effectiveness of automated test generation, advancing the detection of bugs and enhancing code coverage, thereby contributing to higher-quality software development.\n                  <\/jats:p>","DOI":"10.1007\/s10664-025-10635-z","type":"journal-article","created":{"date-parts":[[2025,3,26]],"date-time":"2025-03-26T01:48:47Z","timestamp":1742953727000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Enriching automatic test case generation by extracting relevant test inputs from bug reports"],"prefix":"10.1007","volume":"30","author":[{"ORCID":"https:\/\/orcid.org\/0009-0009-7312-6273","authenticated-orcid":false,"given":"Wendk\u00fbuni 
C.","family":"Ou\u00e9draogo","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Laura","family":"Plein","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kader","family":"Kabor\u00e9","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andrew","family":"Habib","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jacques","family":"Klein","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Lo","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tegawend\u00e9 F.","family":"Bissyand\u00e9","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2025,3,24]]},"reference":[{"key":"10635_CR1","doi-asserted-by":"crossref","unstructured":"Almasi MM, Hemmati H, Fraser G, Arcuri A, Benefelds J (2017) An industrial evaluation of unit test generation: Finding real faults in a financial application. In: 2017 IEEE\/ACM 39th international conference on software engineering: software engineering in practice track (ICSE-SEIP), IEEE, pp 263\u2013272","DOI":"10.1109\/ICSE-SEIP.2017.27"},{"key":"10635_CR2","unstructured":"Amatriain X (2024) Prompt design and engineering: Introduction and advanced methods. arXiv:2401.14423"},{"key":"10635_CR3","doi-asserted-by":"publisher","first-page":"594","DOI":"10.1007\/s10664-013-9249-9","volume":"18","author":"A Arcuri","year":"2013","unstructured":"Arcuri A, Fraser G (2013) Parameter tuning or default values? an empirical investigation in search-based software engineering. 
Empirical Softw Eng 18:594\u2013623","journal-title":"Empirical Softw Eng"},{"key":"10635_CR4","doi-asserted-by":"crossref","unstructured":"Artzi S, Dolby J, Jensen SH, M\u00f8ller A, Tip F (2011) A framework for automated testing of javascript web applications. In: Proceedings of the 33rd international conference on software engineering, pp 571\u2013580","DOI":"10.1145\/1985793.1985871"},{"key":"10635_CR5","unstructured":"Bai Y, Jones A, Ndousse K, Askell A, Chen A, DasSarma N, Drain D, Fort S, Ganguli D, Henighan T, et\u00a0al (2022) Training a helpful and harmless assistant with reinforcement learning from human feedback. arXiv:2204.05862"},{"issue":"3","key":"10635_CR6","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3182657","volume":"51","author":"R Baldoni","year":"2018","unstructured":"Baldoni R, Coppa E, D\u2019elia DC, Demetrescu C, Finocchi I (2018) A survey of symbolic execution techniques. ACM Comput Surv (CSUR) 51(3):1\u201339","journal-title":"ACM Comput Surv (CSUR)"},{"key":"10635_CR7","doi-asserted-by":"crossref","unstructured":"Bettenburg N, Premraj R, Zimmermann T, Kim S (2008) Extracting structural information from bug reports. In: Proceedings of the 2008 international working conference on Mining software repositories, pp 27\u201330","DOI":"10.1145\/1370750.1370757"},{"key":"10635_CR8","doi-asserted-by":"crossref","unstructured":"Bozkurt M, Harman M (2011) Automatically generating realistic test input from web services. In: Proceedings of 2011 IEEE 6th international symposium on service oriented system (SOSE), IEEE, pp 13\u201324","DOI":"10.1109\/SOSE.2011.6139088"},{"key":"10635_CR9","first-page":"1877","volume":"33","author":"T Brown","year":"2020","unstructured":"Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A et al (2020) Language models are few-shot learners. 
Adv Neural Inf Process Syst 33:1877\u20131901","journal-title":"Adv Neural Inf Process Syst"},{"issue":"2","key":"10635_CR10","doi-asserted-by":"publisher","first-page":"82","DOI":"10.1145\/2408776.2408795","volume":"56","author":"C Cadar","year":"2013","unstructured":"Cadar C, Sen K (2013) Symbolic execution for software testing: three decades later. Commun ACM 56(2):82\u201390","journal-title":"Commun ACM"},{"issue":"2","key":"10635_CR11","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/1455518.1455522","volume":"12","author":"C Cadar","year":"2008","unstructured":"Cadar C, Ganesh V, Pawlowski PM, Dill DL, Engler DR (2008) Exe: Automatically generating inputs of death. ACM Trans Inf Syst Sec (TISSEC) 12(2):1\u201338","journal-title":"ACM Trans Inf Syst Sec (TISSEC)"},{"key":"10635_CR12","doi-asserted-by":"crossref","unstructured":"Cedric\u00a0Richter HW (2022) Tssb-3m: Mining single statement bugs at massive scale. In: MSR","DOI":"10.1145\/3524842.3528505"},{"key":"10635_CR13","doi-asserted-by":"crossref","unstructured":"Chen Y, Hu Z, Zhi C, Han J, Deng S, Yin J (2023) Chatunitest: A framework for llm-based test generation. arXiv e-prints pp arXiv\u20132305","DOI":"10.1145\/3663529.3663801"},{"key":"10635_CR14","unstructured":"Devlin J, Chang MW, Lee K, Toutanova K (2018) Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805"},{"key":"10635_CR15","doi-asserted-by":"crossref","unstructured":"Elbaum S, Karre S, Rothermel G (2003) Improving web application testing with user session data. In: 25th International conference on software engineering, 2003. Proceedings., IEEE, pp 49\u201359","DOI":"10.1109\/ICSE.2003.1201187"},{"key":"10635_CR16","doi-asserted-by":"crossref","unstructured":"Fan A, Gokkaya B, Harman M, Lyubarskiy M, Sengupta S, Yoo S, Zhang JM (2023) Large language models for software engineering: Survey and open problems. 
In: 2023 IEEE\/ACM international conference on software engineering: future of software engineering (ICSE-FoSE), IEEE, pp 31\u201353","DOI":"10.1109\/ICSE-FoSE59343.2023.00008"},{"key":"10635_CR17","doi-asserted-by":"crossref","unstructured":"Fazzini M, Prammer M, d\u2019Amorim M, Orso A (2018) Automatically translating bug reports into test cases for mobile apps. In: Proceedings of the 27th ACM SIGSOFT international symposium on software testing and analysis, pp 141\u2013152","DOI":"10.1145\/3213846.3213869"},{"key":"10635_CR18","doi-asserted-by":"crossref","unstructured":"Fraser G, Arcuri A (2011) Evosuite: automatic test suite generation for object-oriented software. In: Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering, pp 416\u2013419","DOI":"10.1145\/2025113.2025179"},{"issue":"4","key":"10635_CR19","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2699688","volume":"24","author":"G Fraser","year":"2015","unstructured":"Fraser G, Staats M, McMinn P, Arcuri A, Padberg F (2015) Does automated unit test generation really help software testers? a controlled empirical study. ACM Trans Softw Eng Methodol (TOSEM) 24(4):1\u201349","journal-title":"ACM Trans Softw Eng Methodol (TOSEM)"},{"key":"10635_CR20","doi-asserted-by":"crossref","unstructured":"Galeotti JP, Fraser G, Arcuri A (2013) Improving search-based test suite generation with dynamic symbolic execution. In: 2013 ieee 24th international symposium on software reliability engineering (issre), IEEE, pp 360\u2013369","DOI":"10.1109\/ISSRE.2013.6698889"},{"key":"10635_CR21","doi-asserted-by":"crossref","unstructured":"Galeotti JP, Fraser G, Arcuri A (2014) Extending a search-based test generator with adaptive dynamic symbolic execution. 
In: Proceedings of the 2014 international symposium on software testing and analysis, pp 421\u2013424","DOI":"10.1145\/2610384.2628049"},{"key":"10635_CR22","doi-asserted-by":"crossref","unstructured":"Godefroid P, Klarlund N, Sen K (2005) Dart: Directed automated random testing. In: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation, pp 213\u2013223","DOI":"10.1145\/1065010.1065036"},{"key":"10635_CR23","doi-asserted-by":"crossref","unstructured":"Han X, Yu T, Lo D (2018) Perflearner: Learning from bug reports to understand and generate performance test frames. In: Proceedings of the 33rd ACM\/IEEE international conference on automated software engineering, pp 17\u201328","DOI":"10.1145\/3238147.3238204"},{"issue":"2","key":"10635_CR24","doi-asserted-by":"publisher","first-page":"226","DOI":"10.1109\/TSE.2009.71","volume":"36","author":"M Harman","year":"2010","unstructured":"Harman M, McMinn P (2010) A theoretical and empirical study of search-based testing: Local, global, and hybrid search. IEEE Trans Softw Eng 36(2):226\u2013247. https:\/\/doi.org\/10.1109\/TSE.2009.71","journal-title":"IEEE Trans Softw Eng"},{"key":"10635_CR25","doi-asserted-by":"crossref","unstructured":"Just R, Jalali D, Ernst MD (2014) Defects4j: A database of existing faults to enable controlled testing studies for java programs. In: Proceedings of the 2014 international symposium on software testing and analysis, pp 437\u2013440","DOI":"10.1145\/2610384.2628055"},{"issue":"7","key":"10635_CR26","doi-asserted-by":"publisher","first-page":"385","DOI":"10.1145\/360248.360252","volume":"19","author":"JC King","year":"1976","unstructured":"King JC (1976) Symbolic execution and program testing. 
Commun ACM 19(7):385\u2013394","journal-title":"Commun ACM"},{"key":"10635_CR27","doi-asserted-by":"crossref","unstructured":"Kochhar PS, Bissyand\u00e9 TF, Lo D, Jiang L (2013a) Adoption of software testing in open source projects\u2013a preliminary study on 50,000 projects. In: 2013 17th european conference on software maintenance and reengineering, IEEE, pp 353\u2013356","DOI":"10.1109\/CSMR.2013.48"},{"key":"10635_CR28","doi-asserted-by":"crossref","unstructured":"Kochhar PS, Bissyand\u00e9 TF, Lo D, Jiang L (2013b) An empirical study of adoption of software testing in open source projects. In: 2013 13th International conference on quality software, IEEE, pp 103\u2013112","DOI":"10.1109\/QSIC.2013.57"},{"key":"10635_CR29","doi-asserted-by":"crossref","unstructured":"Kochhar PS, Thung F, Nagappan N, Zimmermann T, Lo D (2015) Understanding the test automation culture of app developers. In: 2015 IEEE 8th International conference on software testing, verification and validation (ICST), IEEE, pp 1\u201310","DOI":"10.1109\/ICST.2015.7102609"},{"key":"10635_CR30","first-page":"22199","volume":"35","author":"T Kojima","year":"2022","unstructured":"Kojima T, Gu SS, Reid M, Matsuo Y, Iwasawa Y (2022) Large language models are zero-shot reasoners. Adv Neural Inf Process Syst 35:22199\u201322213","journal-title":"Adv Neural Inf Process Syst"},{"key":"10635_CR31","doi-asserted-by":"crossref","unstructured":"Liu P, Zhang X, Pistoia M, Zheng Y, Marques M, Zeng L (2017) Automatic text input generation for mobile testing. In: 2017 IEEE\/ACM 39th International conference on software engineering (ICSE), IEEE, pp 643\u2013653","DOI":"10.1109\/ICSE.2017.65"},{"key":"10635_CR32","doi-asserted-by":"crossref","unstructured":"Liu Z, Chen C, Wang J, Chen M, Wu B, Tian Z, Huang Y, Hu J, Wang Q (2024) Testing the limits: Unusual text inputs generation for mobile app crash detection with large language model. 
In: Proceedings of the IEEE\/ACM 46th International conference on software engineering, pp 1\u201312","DOI":"10.1145\/3597503.3639118"},{"key":"10635_CR33","unstructured":"Long J (2023) Large language model guided tree-of-thought. arXiv:2305.08291"},{"key":"10635_CR34","doi-asserted-by":"crossref","unstructured":"Macedo M, Tian Y, Cogo FR, Adams B (2024) Exploring the impact of the output format on the evaluation of large language models for code translation. arXiv:2403.17214","DOI":"10.1145\/3650105.3652301"},{"key":"10635_CR35","doi-asserted-by":"crossref","unstructured":"Majumdar R, Xu RG (2007) Directed test generation using symbolic grammars. In: Proceedings of the 22nd IEEE\/ACM international conference on automated software engineering, pp 134\u2013143","DOI":"10.1145\/1321631.1321653"},{"key":"10635_CR36","doi-asserted-by":"crossref","unstructured":"Mariani L, Pezz\u00e8 M, Riganelli O, Santoro M (2014) Link: exploiting the web of data to generate test inputs. In: Proceedings of the 2014 international symposium on software testing and analysis, pp 373\u2013384","DOI":"10.1145\/2610384.2610397"},{"key":"10635_CR37","doi-asserted-by":"crossref","unstructured":"McMinn P, Shahbaz M, Stevenson M (2012) Search-based test input generation for string data types using the results of web queries. In: 2012 IEEE Fifth International conference on software testing, verification and validation, IEEE, pp 141\u2013150","DOI":"10.1109\/ICST.2012.94"},{"key":"10635_CR38","doi-asserted-by":"crossref","unstructured":"Milani\u00a0Fard A, Mirzaaghaei M, Mesbah A (2014) Leveraging existing tests in automated test generation for web applications. In: Proceedings of the 29th ACM\/IEEE international conference on Automated software engineering, pp 67\u201378","DOI":"10.1145\/2642937.2642991"},{"key":"10635_CR39","unstructured":"Naveed H, Khan AU, Qiu S, Saqib M, Anwar S, Usman M, Barnes N, Mian A (2023) A comprehensive overview of large language models. 
arXiv:2307.06435"},{"key":"10635_CR40","doi-asserted-by":"crossref","unstructured":"Pacheco C, Lahiri SK, Ernst MD, Ball T (2007) Feedback-directed random test generation. In: 29th International conference on software engineering (ICSE\u201907), IEEE, pp 75\u201384","DOI":"10.1109\/ICSE.2007.37"},{"key":"10635_CR41","doi-asserted-by":"crossref","unstructured":"Panichella A, Kifetew FM, Tonella P (2015) Reformulating branch coverage as a many-objective optimization problem. In: 2015 IEEE 8th international conference on software testing, verification and validation (ICST), IEEE, pp 1\u201310","DOI":"10.1109\/ICST.2015.7102604"},{"issue":"2","key":"10635_CR42","doi-asserted-by":"publisher","first-page":"122","DOI":"10.1109\/TSE.2017.2663435","volume":"44","author":"A Panichella","year":"2017","unstructured":"Panichella A, Kifetew FM, Tonella P (2017) Automated test case generation as a many-objective optimisation problem with dynamic selection of the targets. IEEE Trans Softw Eng 44(2):122\u2013158","journal-title":"IEEE Trans Softw Eng"},{"key":"10635_CR43","doi-asserted-by":"crossref","unstructured":"Perera A, Aleti A, B\u00f6hme M, Turhan B (2020) Defect prediction guided search-based software testing. In: Proceedings of the 35th IEEE\/ACM international conference on automated software engineering, pp 448\u2013460","DOI":"10.1145\/3324884.3416612"},{"key":"10635_CR44","doi-asserted-by":"crossref","unstructured":"Pradel M, Gross TR (2012) Fully automatic and precise detection of thread safety violations. In: Proceedings of the 33rd ACM SIGPLAN conference on programming language design and implementation, pp 521\u2013530","DOI":"10.1145\/2254064.2254126"},{"key":"10635_CR45","doi-asserted-by":"crossref","unstructured":"Rabin MRI, Alipour MA (2021) Configuring test generators using bug reports: a case study of gcc compiler and csmith. 
In: Proceedings of the 36th annual ACM symposium on applied computing, pp 1750\u20131758","DOI":"10.1145\/3412841.3442047"},{"issue":"140","key":"10635_CR46","first-page":"1","volume":"21","author":"C Raffel","year":"2020","unstructured":"Raffel C, Shazeer N, Roberts A, Lee K, Narang S, Matena M, Zhou Y, Li W, Liu PJ (2020) Exploring the limits of transfer learning with a unified text-to-text transformer. J Mach Learn Res 21(140):1\u201367","journal-title":"J Mach Learn Res"},{"key":"10635_CR47","doi-asserted-by":"crossref","unstructured":"Reynolds L, McDonell K (2021) Prompt programming for large language models: Beyond the few-shot paradigm. In: Extended abstracts of the 2021 CHI conference on human factors in computing systems, pp 1\u20137","DOI":"10.1145\/3411763.3451760"},{"key":"10635_CR48","doi-asserted-by":"crossref","unstructured":"Sahoo P, Singh AK, Saha S, Jain V, Mondal S, Chadha A (2024) A systematic survey of prompt engineering in large language models: Techniques and applications. arXiv:2402.07927","DOI":"10.1007\/979-8-8688-0569-1_4"},{"key":"10635_CR49","doi-asserted-by":"crossref","unstructured":"Sen K, Agha G (2006) Cute and jcute: Concolic unit testing and explicit path model-checking tools: (tool paper). In: Computer Aided Verification: 18th International Conference, CAV 2006, Seattle, WA, USA, August 17-20, 2006. Proceedings 18, Springer, pp 419\u2013423","DOI":"10.1007\/11817963_38"},{"issue":"5","key":"10635_CR50","doi-asserted-by":"publisher","first-page":"263","DOI":"10.1145\/1095430.1081750","volume":"30","author":"K Sen","year":"2005","unstructured":"Sen K, Marinov D, Agha G (2005) Cute: A concolic unit testing engine for c. 
ACM SIGSOFT Softw Eng Notes 30(5):263\u2013272","journal-title":"ACM SIGSOFT Softw Eng Notes"},{"key":"10635_CR51","doi-asserted-by":"crossref","unstructured":"Shahbaz M, McMinn P, Stevenson M (2012) Automated discovery of valid test strings from the web using dynamic regular expressions collation and natural language processing. In: 2012 12th International conference on quality software, IEEE, pp 79\u201388","DOI":"10.1109\/QSIC.2012.15"},{"key":"10635_CR52","doi-asserted-by":"crossref","unstructured":"Shamshiri S, Just R, Rojas JM, Fraser G, McMinn P, Arcuri A (2015) Do automatically generated unit tests find real faults? an empirical study of effectiveness and challenges (t). In: 2015 30th IEEE\/ACM International conference on automated software engineering (ASE), IEEE, pp 201\u2013211","DOI":"10.1109\/ASE.2015.86"},{"issue":"2","key":"10635_CR53","doi-asserted-by":"publisher","first-page":"68","DOI":"10.1145\/3624724","volume":"67","author":"M Shanahan","year":"2024","unstructured":"Shanahan M (2024) Talking about large language models. Commun ACM 67(2):68\u201379","journal-title":"Commun ACM"},{"key":"10635_CR54","first-page":"8887","volume":"975","author":"S Shelke","year":"2014","unstructured":"Shelke S, Nagpure S (2014) Generation of string test input from web using regular expression. Int J Comput Appl 975:8887","journal-title":"Int J Comput Appl"},{"key":"10635_CR55","unstructured":"Si C, Gan Z, Yang Z, Wang S, Wang J, Boyd-Graber J, Wang L (2022) Prompting gpt-3 to be reliable. arXiv:2210.09150"},{"key":"10635_CR56","unstructured":"Siddiq ML, Dristi S, Saha J, Santos J (2024a) Quality assessment of prompts used in code generation. 
arXiv:2404.10155"},{"key":"10635_CR57","doi-asserted-by":"crossref","unstructured":"Siddiq ML, Santos JC, Tanvir RH, Ulfat N, Al\u00a0Rifat F, Lopes VC (2024b) Using large language models to generate junit tests: An empirical study","DOI":"10.1145\/3661167.3661216"},{"key":"10635_CR58","doi-asserted-by":"crossref","unstructured":"Tang Y, Liu Z, Zhou Z, Luo X (2024) Chatgpt vs sbst: A comparative assessment of unit test suite generation. IEEE Trans Softw Eng","DOI":"10.1109\/TSE.2024.3382365"},{"key":"10635_CR59","doi-asserted-by":"publisher","unstructured":"Toffola LD, Staicu CA, Pradel M (2017) Saying \u2019hi!\u2019 is not enough: mining inputs for effective test generation. In: Proceedings of the 32nd International Conference on Automated Software Engineering, IEEE Computer Society, pp 44\u201349. https:\/\/doi.org\/10.1109\/ASE.2017.8115617","DOI":"10.1109\/ASE.2017.8115617"},{"key":"10635_CR60","doi-asserted-by":"crossref","unstructured":"Tsigkanos C, Rani P, M\u00fcller S, Kehrer T (2023) Variable discovery with large language models for metamorphic testing of scientific software. In: International conference on computational science, Springer, pp 321\u2013335","DOI":"10.1007\/978-3-031-35995-8_23"},{"issue":"5","key":"10635_CR61","doi-asserted-by":"publisher","first-page":"478","DOI":"10.1049\/sfw2.12063","volume":"16","author":"KJ Valle-G\u00f3mez","year":"2022","unstructured":"Valle-G\u00f3mez KJ, Garc\u00eda-Dom\u00ednguez A, Delgado-P\u00e9rez P, Medina-Bulo I (2022) Mutation-inspired symbolic execution for software testing. IET Softw 16(5):478\u2013492","journal-title":"IET Softw"},{"key":"10635_CR62","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser \u0141, Polosukhin I (2017) Attention is all you need. 
Adv Neural Inf Process Syst 30"},{"key":"10635_CR63","doi-asserted-by":"crossref","unstructured":"Vogelsang A, Fischbach J (2024) Using large language models for natural language processing tasks in requirements engineering: A systematic guideline. arXiv:2402.13823","DOI":"10.1007\/978-3-031-73143-3_16"},{"key":"10635_CR64","doi-asserted-by":"crossref","unstructured":"Wang J, Huang Y, Chen C, Liu Z, Wang S, Wang Q (2024) Software testing with large language models: Survey, landscape, and vision. IEEE Trans Softw Eng","DOI":"10.1109\/TSE.2024.3368208"},{"key":"10635_CR65","doi-asserted-by":"crossref","unstructured":"Weeratunge D, Zhang X, Jagannathan S (2010) Analyzing multicore dumps to facilitate concurrency bug reproduction. In: Proceedings of the fifteenth International conference on architectural support for programming languages and operating systems, pp 155\u2013166","DOI":"10.1145\/1736020.1736039"},{"key":"10635_CR66","first-page":"24824","volume":"35","author":"J Wei","year":"2022","unstructured":"Wei J, Wang X, Schuurmans D, Bosma M, Xia F, Chi E, Le QV, Zhou D et al (2022) Chain-of-thought prompting elicits reasoning in large language models. Adv Neural Inf Process Syst 35:24824\u201324837","journal-title":"Adv Neural Inf Process Syst"},{"key":"10635_CR67","doi-asserted-by":"crossref","unstructured":"Xie T, Marinov D, Schulte W, Notkin D (2005) Symstra: A framework for generating object-oriented unit tests using symbolic execution. In: Tools and Algorithms for the Construction and Analysis of Systems: 11th International Conference, TACAS 2005, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2005, Edinburgh, UK, April 4-8, 2005. Proceedings 11, Springer, pp 365\u2013381","DOI":"10.1007\/978-3-540-31980-1_24"},{"key":"10635_CR68","doi-asserted-by":"crossref","unstructured":"Yang C, Deng Y, Lu R, Yao J, Liu J, Jabbarvand R, Zhang L (2023) White-box compiler fuzzing empowered by large language models. 
arXiv:2310.15991","DOI":"10.1145\/3689736"},{"key":"10635_CR69","unstructured":"Yao S, Yu D, Zhao J, Shafran I, Griffiths T, Cao Y, Narasimhan K (2024) Tree of thoughts: Deliberate problem solving with large language models. Adv Neural Inf Process Syst 36"},{"key":"10635_CR70","doi-asserted-by":"crossref","unstructured":"Yenduri G, Ramalingam M, Selvi GC, Supriya Y, Srivastava G, Maddikunta PKR, Raj GD, Jhaveri RH, Prabadevi B, Wang W, et\u00a0al (2024) Gpt (generative pre-trained transformer)\u2013a comprehensive review on enabling technologies, potential applications, emerging challenges, and future directions. IEEE Access","DOI":"10.1109\/ACCESS.2024.3389497"},{"key":"10635_CR71","doi-asserted-by":"crossref","unstructured":"Yu T, Zaman TS, Wang C (2017) Descry: reproducing system-level concurrency failures. In: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, pp 694\u2013704","DOI":"10.1145\/3106237.3106266"},{"key":"10635_CR72","doi-asserted-by":"crossref","unstructured":"Zhong H (2022) Enriching compiler testing with real program from bug report. In: Proceedings of the 37th IEEE\/ACM International conference on automated software engineering, pp 1\u201312","DOI":"10.1145\/3551349.3556894"},{"key":"10635_CR73","unstructured":"Ziegler DM, Stiennon N, Wu J, Brown TB, Radford A, Amodei D, Christiano P, Irving G (2019) Fine-tuning language models from human preferences. 
arXiv:1909.08593"}],"container-title":["Empirical Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-025-10635-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10664-025-10635-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-025-10635-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,20]],"date-time":"2025-11-20T13:29:48Z","timestamp":1763645388000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10664-025-10635-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,24]]},"references-count":73,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,5]]}},"alternative-id":["10635"],"URL":"https:\/\/doi.org\/10.1007\/s10664-025-10635-z","relation":{},"ISSN":["1382-3256","1573-7616"],"issn-type":[{"value":"1382-3256","type":"print"},{"value":"1573-7616","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,3,24]]},"assertion":[{"value":"4 March 2025","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 March 2025","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflicts of interest or competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of Interest\/Competing Interests"}},{"value":"This article does not contain any studies with human participants or animals performed by any of the 
authors.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical Approval"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Informed Consent"}}],"article-number":"85"}}