{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,30]],"date-time":"2026-01-30T02:29:56Z","timestamp":1769740196457,"version":"3.49.0"},"reference-count":45,"publisher":"Association for Computing Machinery (ACM)","issue":"FSE","funder":[{"name":"National Natural Science Foundation of China","award":["62232003,62172037"],"award-info":[{"award-number":["62232003,62172037"]}]},{"name":"China Postdoctoral Science Foundation","award":["2023M740078"],"award-info":[{"award-number":["2023M740078"]}]},{"name":"China National Postdoctoral Program for Innovative Talents","award":["BX20240008"],"award-info":[{"award-number":["BX20240008"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Softw. Eng."],"published-print":{"date-parts":[[2025,6,19]]},"abstract":"<jats:p>\n            It is often valuable to know whether a given piece of source code has or hasn\u2019t been used to train a given deep learning model. On the one hand, it helps avoid data contamination problems that may exaggerate the performance of evaluated models. On the other hand, it facilitates copyright protection by identifying private or protected code leveraged for model training without permission. To this end, automated approaches, collectively known as data contamination detection, have been proposed. Such approaches often rely heavily on the confidence of the involved models, assuming that the models should be more confident in handling contaminated data than clean data. However, such approaches do not consider the nature of the given data item, i.e., how difficult it is to predict the given item. Consequently, difficult-to-predict contaminated data and easy-to-predict clean data are often misclassified. 
As an initial attempt to solve this problem, this paper presents a naturalness-based approach, called\n            <jats:italic toggle=\"yes\">Natural-DaCoDe<\/jats:italic>\n            , for code-completion models to distinguish contaminated source code from clean code.\n            <jats:italic toggle=\"yes\">Natural-DaCoDe<\/jats:italic>\n            leverages code naturalness to quantitatively measure the difficulty of a given piece of source code for code-completion models. It then trains a classifier to distinguish contaminated source code according to both code naturalness and the performance of the code-completion models on the given source code. We evaluate\n            <jats:italic toggle=\"yes\">Natural-DaCoDe<\/jats:italic>\n            with two pre-trained large language models (i.e.,\n            <jats:italic toggle=\"yes\">ChatGPT<\/jats:italic>\n            and\n            <jats:italic toggle=\"yes\">Claude<\/jats:italic>\n            ) and two code-completion models that we trained from scratch for detecting contaminated data. Our evaluation results suggest that\n            <jats:italic toggle=\"yes\">Natural-DaCoDe<\/jats:italic>\n            substantially outperformed the state-of-the-art approaches in detecting contaminated data, improving the average accuracy by 61.78%. We also evaluate\n            <jats:italic toggle=\"yes\">Natural-DaCoDe<\/jats:italic>\n            on the method name suggestion task, where it remains more accurate than the state-of-the-art approaches, improving the accuracy by 54.39%. Furthermore,\n            <jats:italic toggle=\"yes\">Natural-DaCoDe<\/jats:italic>\n            was tested on a natural language text benchmark, significantly outperforming the state-of-the-art approaches by 22%. 
This may suggest that\n            <jats:italic toggle=\"yes\">Natural-DaCoDe<\/jats:italic>\n            could be applied to various source-code-related tasks besides code completion.\n          <\/jats:p>","DOI":"10.1145\/3715765","type":"journal-article","created":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T15:15:34Z","timestamp":1750346134000},"page":"1046-1067","source":"Crossref","is-referenced-by-count":1,"title":["Has My Code Been Stolen for Model Training? A Naturalness Based Approach to Code Contamination Detection"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-5423-2827","authenticated-orcid":false,"given":"Haris Ali","family":"Khan","sequence":"first","affiliation":[{"name":"Beijing Institute of Technology, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6404-9143","authenticated-orcid":false,"given":"Yanjie","family":"Jiang","sequence":"additional","affiliation":[{"name":"Peking University, Beijing, China"},{"name":"Beijing Institute of Technology, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0237-3025","authenticated-orcid":false,"given":"Qasim","family":"Umer","sequence":"additional","affiliation":[{"name":"King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9371-5931","authenticated-orcid":false,"given":"Yuxia","family":"Zhang","sequence":"additional","affiliation":[{"name":"Beijing Institute of Technology, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-9987-1238","authenticated-orcid":false,"given":"Waseem","family":"Akram","sequence":"additional","affiliation":[{"name":"Beijing Institute of Technology, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3267-6801","authenticated-orcid":false,"given":"Hui","family":"Liu","sequence":"additional","affiliation":[{"name":"Beijing Institute of Technology, Beijing, 
China"}]}],"member":"320","published-online":{"date-parts":[[2025,6,19]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2024. Google Drive File. https:\/\/drive.google.com\/file\/d\/11DtSrfFRFrOZ3pAKm4V3dpteo62g-eIw\/view?usp=drive_link"},{"key":"e_1_2_1_2_1","unstructured":"2024. Google Drive File. https:\/\/drive.google.com\/file\/d\/1PGWJ5mSyxT1m29ytXK8GhFNyjMMoobFq\/view?usp=drive_link"},{"key":"e_1_2_1_3_1","unstructured":"2024. Replication Package. https:\/\/github.com\/naturalnessbasedappraoch\/Natural-DaCode"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the 2019 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (Onward!","author":"Allamanis Miltiadis","year":"2019","unstructured":"Miltiadis Allamanis. 2019. The Adverse Effects of Code Duplication in Machine Learning Models of Code. In Proceedings of the 2019 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (Onward! 2019). Association for Computing Machinery, New York, NY, USA. 143\u2013153. isbn:9781450369954 https:\/\/doi.org\/10.1145\/3359591.3359735 10.1145\/3359591.3359735"},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering. 281\u2013293","author":"Allamanis Miltiadis","year":"2014","unstructured":"Miltiadis Allamanis, Earl T Barr, and Charles Sutton. 2014. Learning natural coding conventions. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering. 281\u2013293. https:\/\/doi.org\/10.1145\/2635868.2635883 10.1145\/2635868.2635883"},{"key":"e_1_2_1_6_1","unstructured":"Anthropic. 2023. How Up-to-Date Is Claude\u2019s Training Data? https:\/\/support.anthropic.com\/en\/articles\/8114494-how-up-to-date-is-claude-s-training-data"},{"key":"e_1_2_1_7_1","unstructured":"Anthropic. 2024. Claude LLM: An Overview. Anthropic. 
https:\/\/www.anthropic.com\/claude"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","unstructured":"Tom B. Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell Sandhini Agarwal Ariel Herbert-Voss Gretchen Krueger Tom Henighan Rewon Child Aditya Ramesh Daniel M. Ziegler Jeffrey Wu Clemens Winter Christopher Hesse Mark Chen Eric Sigler Mateusz Litwin Scott Gray Benjamin Chess Jack Clark Christopher Berner Sam McCandlish Alec Radford Ilya Sutskever and Dario Amodei. 2020. Language models are few-shot learners. Article 159 25 pages. isbn:9781713829546 https:\/\/doi.org\/10.5555\/3495724.3495883","DOI":"10.5555\/3495724.3495883"},{"key":"e_1_2_1_9_1","volume-title":"2019 10th International Workshop on Empirical Software Engineering in Practice (IWESEP). 7\u201375","author":"Bunkerd Thanadon","year":"2019","unstructured":"Thanadon Bunkerd, Dong Wang, Raula Kula, Chaiyong Ragkhitwetsagul, Morakot Choetkiertikul, Thanakorn Sunetnanta, Takashi Ishio, and Ken-ichi Matsumoto. 2019. How do contributors impact code naturalness? An exploratory study of 50 python projects. In 2019 10th International Workshop on Empirical Software Engineering in Practice (IWESEP). 7\u201375. https:\/\/doi.org\/10.1109\/IWESEP49350.2019.00010 10.1109\/IWESEP49350.2019.00010"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11432-023-4127-5"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses (RAID \u201923)","author":"Chen Yizheng","year":"2023","unstructured":"Yizheng Chen, Zhoujie Ding, Lamya Alowain, Xinyun Chen, and David Wagner. 2023. DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection. In Proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses (RAID \u201923). Association for Computing Machinery, New York, NY, USA. 
654\u2013668. isbn:9798400707650 https:\/\/doi.org\/10.1145\/3607199.3607242 10.1145\/3607199.3607242"},{"key":"e_1_2_1_12_1","unstructured":"Common Crawl. 2024. Open Repository of Web Crawl Data. https:\/\/commoncrawl.org\/"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2402.02823"},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 37th IEEE\/ACM International Conference on Software Engineering. 2, 543\u2013546","author":"Devanbu Premkumar","year":"2015","unstructured":"Premkumar Devanbu. 2015. New Initiative: The Naturalness of Software. In Proceedings of the 37th IEEE\/ACM International Conference on Software Engineering. 2, 543\u2013546. https:\/\/doi.org\/10.5555\/2819009.2819097"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2303.06808"},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Online. 1325\u20131335","author":"Elangovan Aparna","year":"2021","unstructured":"Aparna Elangovan, Jiayuan He, and Karin Verspoor. 2021. Memorization vs. Generalization: Quantifying Data Leakage in NLP Performance Evaluation. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Online. 1325\u20131335. https:\/\/doi.org\/10.18653\/v1\/2021.eacl-main.113 10.18653\/v1\/2021.eacl-main.113"},{"key":"e_1_2_1_17_1","unstructured":"Hugging Face. 2021. CodeParrot: An Open-Source Code Generation Model. https:\/\/huggingface.co\/codeparrot"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","unstructured":"Shahriar Golchin and M. Surdeanu. 2023. Data Contamination Quiz: A Tool to Detect and Estimate Contamination in Large Language Models. arXiv https:\/\/doi.org\/10.48550\/arXiv.2311.06233 arxiv:arXiv:2311.06233. 
10.48550\/arXiv.2311.06233","DOI":"10.48550\/arXiv.2311.06233"},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the Twelfth International Conference on Learning Representations (ICLR \u201924)","author":"Golchin Shahriar","year":"2024","unstructured":"Shahriar Golchin and Mihai Surdeanu. 2024. Time Travel in LLMs: Tracing Data Contamination in Large Language Models. In Proceedings of the Twelfth International Conference on Learning Representations (ICLR \u201924). ACM, New York, NY, USA. https:\/\/openreview.net\/forum?id=2Rwq6c3tvr"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","unstructured":"Daya Guo Shuai Lu Nan Duan Yanlin Wang Ming Zhou and Jian Yin. 2022. UniXcoder: Unified Cross-Modal Pre-training for Code Representation. May 7212\u20137225. https:\/\/doi.org\/10.18653\/v1\/2022.acl-long.499 10.18653\/v1\/2022.acl-long.499","DOI":"10.18653\/v1"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2902362"},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the 32nd IEEE\/ACM International Conference on Program Comprehension (ICPC \u201924)","author":"Huang Tao","year":"2024","unstructured":"Tao Huang, Zhihong Sun, Zhi Jin, Ge Li, and Chen Lyu. 2024. Knowledge-Aware Code Generation with Large Language Models. In Proceedings of the 32nd IEEE\/ACM International Conference on Program Comprehension (ICPC \u201924). Association for Computing Machinery, New York, NY, USA. 52\u201363. isbn:9798400705861 https:\/\/doi.org\/10.1145\/3643916.3644418 10.1145\/3643916.3644418"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","unstructured":"Alon Jacovi Avi Caciularu Omer Goldman and Yoav Goldberg. 2023. Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination by Evaluation Benchmarks. arXiv 5075\u20135084. 
https:\/\/doi.org\/10.48550\/arXiv.2305.10160 10.48550\/arXiv.2305.10160","DOI":"10.48550\/arXiv.2305.10160"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2023.3267028"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","unstructured":"Yucheng Li. 2023. An Open Source Data Contamination Report for Llama Series Models. arXiv https:\/\/doi.org\/10.48550\/arXiv.2310.17589 arxiv:arXiv:2310.17589. 10.48550\/arXiv.2310.17589","DOI":"10.48550\/arXiv.2310.17589"},{"key":"e_1_2_1_26_1","volume-title":"2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER). 594\u2013598","author":"Lin Baoyuan","year":"2019","unstructured":"Baoyuan Lin, Csaba Nagy, Gabriele Bavota, and Michele Lanza. 2019. On the impact of refactoring operations on code naturalness. In 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER). 594\u2013598. https:\/\/doi.org\/10.1109\/SANER.2019.8667992 10.1109\/SANER.2019.8667992"},{"key":"e_1_2_1_27_1","volume-title":"Findings of the Association for Computational Linguistics: ACL 2023","author":"Mattern Justus","year":"2023","unstructured":"Justus Mattern, Fatemehsadat Mireshghallah, Zhijing Jin, Bernhard Schoelkopf, Mrinmaya Sachan, and Taylor Berg-Kirkpatrick. 2023. Membership Inference Attacks against Language Models via Neighbourhood Comparison. In Findings of the Association for Computational Linguistics: ACL 2023, Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki (Eds.). Association for Computational Linguistics, Toronto, Canada. 11330\u201311343. https:\/\/doi.org\/10.18653\/v1\/2023.findings-acl.719 10.18653\/v1\/2023.findings-acl.719"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2022.3180184"},{"key":"e_1_2_1_29_1","unstructured":"OpenAI. 2023. OpenAI Data Partnerships. https:\/\/openai.com\/index\/data-partnerships\/"},{"key":"e_1_2_1_30_1","unstructured":"OpenAI. 2024. GPT-3.5 Turbo Updates. 
https:\/\/help.openai.com\/en\/articles\/8555514-gpt-3-5-turbo-updates"},{"key":"e_1_2_1_31_1","unstructured":"OpenAI. 2024. New Embedding Models and API Updates. https:\/\/openai.com\/blog\/new-embedding-models-and-api-updates\/"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2078195"},{"key":"e_1_2_1_33_1","volume-title":"2019 IEEE\/ACM 41st International Conference on Software Engineering (ICSE). 37\u201348","author":"Rahman Musfiqur","year":"2019","unstructured":"Musfiqur Rahman, Dharani Palani, and Peter C Rigby. 2019. Natural software revisited. In 2019 IEEE\/ACM 41st International Conference on Software Engineering (ICSE). 37\u201348. https:\/\/doi.org\/10.1109\/ICSE.2019.00022 10.1109\/ICSE.2019.00022"},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 38th International Conference on Software Engineering. 428\u2013439","author":"Ray Baishakhi","year":"2016","unstructured":"Baishakhi Ray, Vincent Hellendoorn, Saheel Godhane, Zhaopeng Tu, Alberto Bacchelli, and Premkumar Devanbu. 2016. On the \"Naturalness\" of Buggy Code. In Proceedings of the 38th International Conference on Software Engineering. 428\u2013439. https:\/\/doi.org\/10.1145\/2884781.2884848 10.1145\/2884781.2884848"},{"key":"e_1_2_1_35_1","unstructured":"Microsoft Research. 2021. UnixCoder: A Pre-trained Model for Code Understanding and Generation. https:\/\/github.com\/microsoft\/CodeBERT"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","unstructured":"Weijia Shi Anirudh Ajith Mengzhou Xia Yangsibo Huang Daogao Liu Terra Blevins Danqi Chen and Luke Zettlemoyer. 2023. Detecting Pretraining Data from Large Language Models. arXiv https:\/\/doi.org\/10.48550\/arXiv.2310.16789 arxiv:arXiv:2310.16789. 10.48550\/arXiv.2310.16789","DOI":"10.48550\/arXiv.2310.16789"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the Twelfth International Conference on Learning Representations. 
https:\/\/openreview.net\/forum?id=zWqr3MQuNs","author":"Shi Weijia","year":"2024","unstructured":"Weijia Shi, Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, and Luke Zettlemoyer. 2024. Detecting Pretraining Data from Large Language Models. In Proceedings of the Twelfth International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=zWqr3MQuNs"},{"key":"e_1_2_1_38_1","volume-title":"Proceedings of the 2023 10th International Conference on Dependable Systems and Their Applications (DSA). 831\u2013838","author":"Su Haoran","year":"2023","unstructured":"Haoran Su, Jun Ai, Dan Yu, and Hong Zhang. 2023. An Evaluation Method for Large Language Models\u2019 Code Generation Capability. In Proceedings of the 2023 10th International Conference on Dependable Systems and Their Applications (DSA). 831\u2013838. https:\/\/doi.org\/10.1109\/DSA59317.2023.00118 10.1109\/DSA59317.2023.00118"},{"key":"e_1_2_1_39_1","unstructured":"SWJ0419. 2024. WikiMIA. https:\/\/huggingface.co\/datasets\/swj0419\/WikiMIA"},{"key":"e_1_2_1_40_1","volume-title":"Proceedings of the International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=rkZvSe-RZ","author":"Tram\u00e8r Florian","year":"2018","unstructured":"Florian Tram\u00e8r, Alexey Kurakin, Nicolas Papernot, Ian Goodfellow, Dan Boneh, and Patrick McDaniel. 2018. Ensemble Adversarial Training: Attacks and Defenses. In Proceedings of the International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=rkZvSe-RZ"},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering. 269\u2013280","author":"Tu Zhaopeng","year":"2014","unstructured":"Zhaopeng Tu, Zhendong Su, and Premkumar Devanbu. 2014. On the Localness of Software. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering. 269\u2013280. 
https:\/\/doi.org\/10.1145\/2635868.2635875 10.1145\/2635868.2635875"},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22)","author":"Wu Shuang","year":"2022","unstructured":"Shuang Wu, Jingyu Zhao, and Guangjian Tian. 2022. Understanding and Mitigating Data Contamination in Deep Anomaly Detection: A Kernel-based Approach. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22). 2319\u20132325. https:\/\/doi.org\/10.24963\/ijcai.2022\/322 10.24963\/ijcai.2022\/322"},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the 6th ACM SIGPLAN International Symposium on Machine Programming (MAPS 2022","author":"Xu Frank F.","year":"2022","unstructured":"Frank F. Xu, Uri Alon, Graham Neubig, and Vincent Josua Hellendoorn. 2022. A Systematic Evaluation of Large Language Models of Code. In Proceedings of the 6th ACM SIGPLAN International Symposium on Machine Programming (MAPS 2022). Association for Computing Machinery, New York, NY, USA. 1\u201310. isbn:9781450392730 https:\/\/doi.org\/10.1145\/3520312.3534862 10.1145\/3520312.3534862"},{"key":"e_1_2_1_44_1","volume-title":"Proceedings of the 2018 IEEE 31st Computer Security Foundations Symposium (CSF). 268\u2013282","author":"Yeom Samuel","year":"2018","unstructured":"Samuel Yeom, Irene Giacomelli, Matt Fredrikson, and Somesh Jha. 2018. Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting. In Proceedings of the 2018 IEEE 31st Computer Security Foundations Symposium (CSF). 268\u2013282. https:\/\/doi.org\/10.1109\/CSF.2018.00027 10.1109\/CSF.2018.00027"},{"key":"e_1_2_1_45_1","volume-title":"Proceedings of the 5th International Conference on Learning Representations (ICLR 2017), Toulon, France, April 24\u201326, 2017, Conference Track Proceedings. 
https:\/\/openreview.net\/forum?id=Sy8gdB9xx","author":"Zhang Chiyuan","year":"2017","unstructured":"Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. 2017. Understanding Deep Learning Requires Rethinking Generalization. In Proceedings of the 5th International Conference on Learning Representations (ICLR 2017), Toulon, France, April 24\u201326, 2017, Conference Track Proceedings. https:\/\/openreview.net\/forum?id=Sy8gdB9xx"}],"container-title":["Proceedings of the ACM on Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3715765","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T15:30:00Z","timestamp":1750347000000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3715765"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,19]]},"references-count":45,"journal-issue":{"issue":"FSE","published-print":{"date-parts":[[2025,6,19]]}},"alternative-id":["10.1145\/3715765"],"URL":"https:\/\/doi.org\/10.1145\/3715765","relation":{},"ISSN":["2994-970X"],"issn-type":[{"value":"2994-970X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,19]]}}}