{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,4]],"date-time":"2026-02-04T21:49:46Z","timestamp":1770241786741,"version":"3.49.0"},"reference-count":94,"publisher":"Association for Computing Machinery (ACM)","issue":"OOPSLA2","license":[{"start":{"date-parts":[[2024,10,8]],"date-time":"2024-10-08T00:00:00Z","timestamp":1728345600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Program. Lang."],"published-print":{"date-parts":[[2024,10,8]]},"abstract":"<jats:p>Compiler correctness is crucial, as miscompilation can falsify program behaviors, leading to serious consequences over the software supply chain. In the literature, fuzzing has been extensively studied to uncover compiler defects. However, compiler fuzzing remains challenging: Existing arts focus on black- and grey-box fuzzing, which generates test programs without sufficient understanding of internal compiler behaviors. As such, they often fail to construct test programs to exercise intricate optimizations. Meanwhile, traditional white-box techniques, such as symbolic execution, are computationally inapplicable to the giant codebase of compiler systems. Recent advances demonstrate that Large Language Models (LLMs) excel in code generation\/understanding tasks and even have achieved state-of-the-art performance in black-box fuzzing. 
Nonetheless, guiding LLMs with compiler source-code information remains a missing piece of research in compiler testing.<\/jats:p>\n                  <jats:p>\n                    To this end, we propose W\n                    <jats:sc>hite<\/jats:sc>\n                    F\n                    <jats:sc>ox<\/jats:sc>\n                    , the first white-box compiler fuzzer using LLMs with source-code information to test compiler optimization, with a spotlight on detecting deep logic bugs in the emerging deep learning (DL) compilers. W\n                    <jats:sc>hite<\/jats:sc>\n                    F\n                    <jats:sc>ox<\/jats:sc>\n                    adopts a multi-agent framework: (i) an LLM-based analysis agent examines the low-level optimization source code and produces requirements on the high-level test programs that can trigger the optimization; (ii) an LLM-based generation agent produces test programs based on the summarized requirements. Additionally, optimization-triggering tests are also used as feedback to further enhance the test generation prompt on the fly. Our evaluation on the three most popular DL compilers (\n                    <jats:italic toggle=\"yes\">i.e<\/jats:italic>\n                    ., PyTorch Inductor, TensorFlow-XLA, and TensorFlow Lite) shows that W\n                    <jats:sc>hite<\/jats:sc>\n                    F\n                    <jats:sc>ox<\/jats:sc>\n                    can generate high-quality test programs to exercise deep optimizations requiring intricate conditions, practicing up to 8 times more optimizations than state-of-the-art fuzzers. To date, W\n                    <jats:sc>hite<\/jats:sc>\n                    F\n                    <jats:sc>ox<\/jats:sc>\n                    has found in total 101 bugs for the compilers under test, with 92 confirmed as previously unknown and 70 already fixed. 
Notably, W\n                    <jats:sc>hite<\/jats:sc>\n                    F\n                    <jats:sc>ox<\/jats:sc>\n                    has been recently acknowledged by the PyTorch team, and is in the process of being incorporated into its development workflow. Finally, beyond DL compilers, W\n                    <jats:sc>hite<\/jats:sc>\n                    F\n                    <jats:sc>ox<\/jats:sc>\n                    can also be adapted for compilers in different domains, such as LLVM, where WhiteFox has already found multiple bugs.\n                  <\/jats:p>","DOI":"10.1145\/3689736","type":"journal-article","created":{"date-parts":[[2024,10,8]],"date-time":"2024-10-08T03:23:04Z","timestamp":1728357784000},"page":"709-735","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":44,"title":["WhiteFox: White-Box Compiler Fuzzing Empowered by Large Language Models"],"prefix":"10.1145","volume":"8","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7976-5086","authenticated-orcid":false,"given":"Chenyuan","family":"Yang","sequence":"first","affiliation":[{"name":"University of Illinois at Urbana-Champaign, Champaign, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4628-4219","authenticated-orcid":false,"given":"Yinlin","family":"Deng","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign, Champaign, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-5261-6147","authenticated-orcid":false,"given":"Runyu","family":"Lu","sequence":"additional","affiliation":[{"name":"Huazhong University of Science and Technology, Wuhan, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8588-4356","authenticated-orcid":false,"given":"Jiayi","family":"Yao","sequence":"additional","affiliation":[{"name":"Chinese University of Hong Kong, Shenzhen, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7122-8625","authenticated-orcid":false,"given":"Jiawei","family":"Liu","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign, Champaign, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0668-8526","authenticated-orcid":false,"given":"Reyhaneh","family":"Jabbarvand","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign, Champaign, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5175-2702","authenticated-orcid":false,"given":"Lingming","family":"Zhang","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign, Champaign, USA"}]}],"member":"320","published-online":{"date-parts":[[2024,10,8]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"2021. News. https:\/\/www.vice.com\/en%us\/article\/9kga85\/uber-is-giving-up-on-self-driving-cars-in-california-after-deadly-crash."},{"key":"e_1_3_1_3_2","unstructured":"2022. Coverage.py. https:\/\/github.com\/nedbat\/coveragepy."},{"key":"e_1_3_1_4_2","unstructured":"2022. GCOV. https:\/\/gcc.gnu.org\/onlinedocs\/gcc\/Gcov.html."},{"key":"e_1_3_1_5_2","unstructured":"Abien Fred Agarap. 2018. Deep learning using rectified linear units (relu). arXiv preprint arXiv:1803.08375 (2018)."},{"key":"e_1_3_1_6_2","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. Advances in neural information processing systems 33 (2020), 1877\u20131901.","journal-title":"Advances in neural information processing systems"},{"key":"e_1_3_1_7_2","unstructured":"S\u00e9bastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke Eric Horvitz Ece Kamar Peter Lee Yin Tat Lee Yuanzhi Li Scott Lundberg et al. 
2023. Sparks of artificial general intelligence: Early experiments with gpt-4. arXiv preprint arXiv:2303.12712 (2023)."},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/3363562"},{"key":"e_1_3_1_9_2","unstructured":"Mark Chen Jerry Tworek Heewoo Jun Qiming Yuan Henrique Ponde de Oliveira Pinto Jared Kaplan Harri Edwards Yuri Burda Nicholas Joseph Greg Brockman et al. 2021. Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374 (2021)."},{"key":"e_1_3_1_10_2","unstructured":"Tianqi Chen Thierry Moreau Ziheng Jiang Lianmin Zheng Eddie Yan Haichen Shen Meghan Cowan Leyuan Wang Yuwei Hu Luis Ceze et al. 2018. TVM: An automated End-to-End optimizing compiler for deep learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). 578\u2013594."},{"key":"e_1_3_1_11_2","doi-asserted-by":"crossref","unstructured":"Yinghao Chen Zehao Hu Chen Zhi Junxiao Han Shuiguang Deng and Jianwei Yin. 2024. ChatUniTest: A Framework for LLM-Based Test Generation. In Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering. 572\u2013576.","DOI":"10.1145\/3663529.3663801"},{"key":"e_1_3_1_12_2","doi-asserted-by":"crossref","first-page":"642","DOI":"10.1109\/SP40001.2021.00071","volume-title":"2021 IEEE Symposium on Security and Privacy (SP)","author":"Chen Yongheng","year":"2021","unstructured":"Yongheng Chen, Rui Zhong, Hong Hu, Hangfan Zhang, Yupeng Yang, Dinghao Wu, and Wenke Lee. 2021. One engine to fuzz\u2019em all: Generic language processor testing with semantic validation. In 2021 IEEE Symposium on Security and Privacy (SP). IEEE, 642\u2013658."},{"key":"e_1_3_1_13_2","doi-asserted-by":"crossref","unstructured":"Mingi Cho Seoyoung Kim and Taekyoung Kwon. 2019. Intriguer: Field-level constraint solving for hybrid fuzzing. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. 
515\u2013530.","DOI":"10.1145\/3319535.3354249"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2019.00082"},{"key":"e_1_3_1_15_2","unstructured":"Aakanksha Chowdhery Sharan Narang Jacob Devlin Maarten Bosma Gaurav Mishra Adam Roberts Paul Barham Hyung Won Chung Charles Sutton Sebastian Gehrmann et al. 2022. Palm: Scaling language modeling with pathways. arXiv preprint arXiv:2204.02311 (2022)."},{"key":"e_1_3_1_16_2","unstructured":"Neophytos Christou Di Jin Vaggelis Atlidakis Baishakhi Ray and Vasileios P Kemerlis. 2023. IvySyn: Automated Vulnerability Discovery in Deep Learning Frameworks. In 32nd USENIX Security Symposium (USENIX Security 23). 2383\u20132400."},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2024.107468"},{"key":"e_1_3_1_18_2","doi-asserted-by":"crossref","unstructured":"Yinlin Deng Chunqiu Steven Xia Haoran Peng Chenyuan Yang and Lingming Zhang. 2023. Large Language Models are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models. In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2023).","DOI":"10.1145\/3597926.3598067"},{"key":"e_1_3_1_19_2","doi-asserted-by":"crossref","unstructured":"Yinlin Deng Chunqiu Steven Xia Chenyuan Yang Shizhuo Dylan Zhang Shujing Yang and Lingming Zhang. 2023. Large language models are edge-case fuzzers: Testing deep learning libraries via fuzzgpt. arXiv preprint arXiv:2304.02014 (2023).","DOI":"10.1145\/3597926.3598067"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/3133917"},{"key":"e_1_3_1_21_2","doi-asserted-by":"crossref","unstructured":"Alastair F Donaldson Paul Thomson Vasyl Teliman Stefano Milizia Andr\u00e9 Perez Maselco and Antoni Karpinski. 2021. Test-case reduction and deduplication almost for free with transformation-based compiler testing. 
In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation. 1017\u20131032.","DOI":"10.1145\/3453483.3454092"},{"key":"e_1_3_1_22_2","doi-asserted-by":"crossref","unstructured":"Karine Even-Mendoza Arindam Sharma Alastair F Donaldson and Cristian Cadar. 2023. GrayC: Greybox Fuzzing of Compilers and Analysers for C. (2023).","DOI":"10.1145\/3597926.3598130"},{"key":"e_1_3_1_23_2","doi-asserted-by":"crossref","unstructured":"Zhangyin Feng Daya Guo Duyu Tang Nan Duan Xiaocheng Feng Ming Gong Linjun Shou Bing Qin Ting Liu Daxin Jiang et al. 2020. Codebert: A pre-trained model for programming and natural languages. arXiv preprint arXiv:2002.08155 (2020).","DOI":"10.18653\/v1\/2020.findings-emnlp.139"},{"key":"e_1_3_1_24_2","doi-asserted-by":"crossref","unstructured":"Gordon Fraser and Andrea Arcuri. 2011. Evosuite: automatic test suite generation for object-oriented software. In Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering. 416\u2013419.","DOI":"10.1145\/2025113.2025179"},{"key":"e_1_3_1_25_2","unstructured":"GCC 2023. GCC. https:\/\/gcc.gnu.org\/."},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/1065010.1065036"},{"key":"e_1_3_1_27_2","unstructured":"Rahul Gopinath Bachir Bendrissou Bj\u00f6rn Mathis and Andreas Zeller. 2020. Fuzzing with fast failure feedback. arXiv preprint arXiv:2012.13516 (2020)."},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510092"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/3324884.3416571"},{"key":"e_1_3_1_30_2","first-page":"630","volume-title":"Computer Vision-ECCV 2016:14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part IV 14","author":"He Kaiming","year":"2016","unstructured":"Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Identity mappings in deep residual networks. 
In Computer Vision-ECCV 2016:14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part IV 14. Springer, 630\u2013645."},{"key":"e_1_3_1_31_2","unstructured":"Christian Holler Kim Herzig and Andreas Zeller. 2012. Fuzzing with code fragments. In 21st USENIX Security Symposium (USENIX Security 12). 445\u2013458."},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/SP40000.2020.00063"},{"key":"e_1_3_1_33_2","unstructured":"HuggingFace 2023. Hugging Face. https:\/\/huggingface.co."},{"key":"e_1_3_1_34_2","doi-asserted-by":"crossref","unstructured":"Laura Inozemtseva and Reid Holmes. 2014. Coverage is not strongly correlated with test suite effectiveness. In Proceedings of the 36th international conference on software engineering. 435\u2013445.","DOI":"10.1145\/2568225.2568271"},{"key":"e_1_3_1_35_2","doi-asserted-by":"crossref","first-page":"410","DOI":"10.1109\/ICSE48619.2023.00045","volume-title":"2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE)","author":"Jiang Ling","year":"2023","unstructured":"Ling Jiang, Hengchen Yuan, Mingyuan Wu, Lingming Zhang, and Yuqun Zhang. 2023. Evaluating and improving hybrid fuzzing. In 2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE). IEEE, 410\u2013422."},{"key":"e_1_3_1_36_2","unstructured":"Kyungtae Kim Dae R Jeong Chung Hwan Kim Yeongjin Jang Insik Shin and Byoungyoung Lee. 2020. HFL: Hybrid Fuzzing on the Linux Kernel.. In NDSS."},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/360248.360252"},{"key":"e_1_3_1_38_2","doi-asserted-by":"crossref","unstructured":"George Klees Andrew Ruef Benji Cooper Shiyi Wei and Michael Hicks. 2018. Evaluating fuzz testing. In Proceedings of the 2018 ACM SIGSAC conference on computer and communications security. 
2123\u20132138.","DOI":"10.1145\/3243734.3243804"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/CGO.2004.1281665"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/2666356.2594334"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/2858965.2814319"},{"key":"e_1_3_1_42_2","doi-asserted-by":"crossref","unstructured":"Caroline Lemieux Jeevana Priya Inala Shuvendu K Lahiri and Siddhartha Sen. 2023. CODAMOSA: Escaping coverage plateaus in test generation with pre-trained large language models. In International conference on software engineering (ICSE).","DOI":"10.1109\/ICSE48619.2023.00085"},{"key":"e_1_3_1_43_2","unstructured":"Raymond Li Loubna Ben Allal Yangtian Zi Niklas Muennighoff Denis Kocetkov Chenghao Mou Marc Marone Christopher Akiki Jia Li Jenny Chim et al. 2023. StarCoder: may the source be with you! arXiv preprint arXiv:2305.06161 (2023)."},{"key":"e_1_3_1_44_2","doi-asserted-by":"crossref","unstructured":"Wen Li Haoran Yang Xiapu Luo Long Cheng and Haipeng Cai. 2023. PyRTFuzz: Detecting Bugs in Python Runtimes via Two-Level Collaborative Fuzzing. In Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security. 1645\u20131659.","DOI":"10.1145\/3576915.3623166"},{"key":"e_1_3_1_45_2","unstructured":"libFuzzer 2023. libFuzzer \u2013 a library for coverage-guided fuzz testing. https:\/\/llvm.org\/docs\/LibFuzzer.html."},{"key":"e_1_3_1_46_2","doi-asserted-by":"crossref","unstructured":"Jiawei Liu Jinkun Lin Fabian Ruffy Cheng Tan Jinyang Li Aurojit Panda and Lingming Zhang. 2023. NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers. In ASPLOS. 
530\u2013543.","DOI":"10.1145\/3575693.3575707"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1145\/3611643.3616337"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/3527317"},{"key":"e_1_3_1_49_2","unstructured":"Nelson F Liu Kevin Lin John Hewitt Ashwin Paranjape Michele Bevilacqua Fabio Petroni and Percy Liang. 2023. Lost in the middle: How language models use long contexts. arXiv preprint arXiv:2307.03172 (2023)."},{"key":"e_1_3_1_50_2","doi-asserted-by":"crossref","unstructured":"Yinxi Liu and Wei Meng. 2023. DSFuzz: Detecting Deep State Bugs with Dependent State Exploration. In Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security. 1242\u20131256.","DOI":"10.1145\/3576915.3616594"},{"key":"e_1_3_1_51_2","doi-asserted-by":"publisher","DOI":"10.1145\/3428264"},{"key":"e_1_3_1_52_2","unstructured":"LLVM 2023. LLVM\u2019s Analysis and Transform Passes. https:\/\/llvm.org\/docs\/Passes.html."},{"key":"e_1_3_1_53_2","doi-asserted-by":"crossref","unstructured":"Bj\u00f6rn Mathis Rahul Gopinath and Andreas Zeller. 2020. Learning input tokens for effective fuzzing. In Proceedings of the 29th ACM SIGSOFT international symposium on software testing and analysis. 27\u201337.","DOI":"10.1145\/3395363.3397348"},{"issue":"1","key":"e_1_3_1_54_2","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1016\/0304-4076(94)01612-4","article-title":"A generalization of the beta distribution with applications","volume":"66","author":"McDonald James B","year":"1995","unstructured":"James B McDonald and Yexiao J Xu. 1995. A generalization of the beta distribution with applications. Journal of Econometrics 66, 1-2 (1995), 133\u20131528.","journal-title":"Journal of Econometrics"},{"issue":"1","key":"e_1_3_1_55_2","first-page":"100","article-title":"Differential testing for software","volume":"10","author":"McKeeman William M","year":"1998","unstructured":"William M McKeeman. 1998. 
Differential testing for software. Digital Technical Journal 10, 1 (1998), 100\u2013107.","journal-title":"Digital Technical Journal"},{"key":"e_1_3_1_56_2","unstructured":"MKLDNN 2024. MKL-DNN. https:\/\/github.com\/rsdubtso\/mkl-dnn."},{"key":"e_1_3_1_57_2","unstructured":"Pengyu Nie Rahul Banerjee Junyi Jessy Li Raymond J Mooney and Milos Gligoric. 2023. Learning Deep Semantics for Test Completion. arXiv preprint arXiv:2302.10166 (2023)."},{"key":"e_1_3_1_58_2","unstructured":"oneDNN 2024. oneDNN. https:\/\/github.com\/oneapi-src\/oneDNN."},{"key":"e_1_3_1_59_2","unstructured":"OpenAI. 2023. ChatGPT. (2023). https:\/\/openai.com\/blog\/chatgpt."},{"key":"e_1_3_1_60_2","unstructured":"OpenAI. 2023. GPT-4 Technical Report. arXiv:2303.08774 [cs.CL]"},{"key":"e_1_3_1_61_2","doi-asserted-by":"crossref","unstructured":"Kexin Pei Yinzhi Cao Junfeng Yang and Suman Jana. 2017. Deepxplore: Automated whitebox testing of deep learning systems. In Proceedings of the 26th Symposium on Operating Systems Principles. 1\u201318.","DOI":"10.1145\/3132747.3132785"},{"key":"e_1_3_1_62_2","doi-asserted-by":"publisher","unstructured":"Hung Viet Pham Thibaud Lutellier Weizhen Qi and Lin Tan. 2019. CRADLE: Cross-Backend Validation to Detect and Localize Bugs in Deep Learning Libraries. In 2019 IEEE\/ACM 41st International Conference on Software Engineering (ICSE). 1027\u20131038. https:\/\/doi.org\/10.1109\/ICSE.2019.00107","DOI":"10.1109\/ICSE.2019.00107"},{"key":"e_1_3_1_63_2","unstructured":"PyTorch 2023. PyTorch. http:\/\/pytorch.org."},{"key":"e_1_3_1_64_2","unstructured":"PyTorch 2023. PyTorch 2.0. https:\/\/pytorch.org\/get-started\/pytorch-2.0."},{"key":"e_1_3_1_65_2","unstructured":"Max Sch\u00e4fer Sarah Nadi Aryaz Eghbali and Frank Tip. 2023. Adaptive test generation using a large language model. arXiv preprint arXiv:2302.06527 (2023)."},{"key":"e_1_3_1_66_2","unstructured":"Mozilla Security. 2007. jsfunfuzz. 
https:\/\/github.com\/MozillaSecurity\/funfuzz."},{"key":"e_1_3_1_67_2","doi-asserted-by":"crossref","unstructured":"Koushik Sen. 2007. Concolic testing. In Proceedings of the 22nd IEEE\/ACM international conference on Automated software engineering. 571\u2013572.","DOI":"10.1145\/1321631.1321746"},{"key":"e_1_3_1_68_2","doi-asserted-by":"publisher","DOI":"10.1145\/1081706.1081750"},{"key":"e_1_3_1_69_2","unstructured":"Weijie Shao Yuyang Gao Fu Song Sen Chen and Lingling Fan. 2023. An Empirical Study of Bugs in Open-Source Federated Learning Framework. ArXiv abs\/2308.05014 (2023). https:\/\/api.semanticscholar.org\/CorpusID:265221980"},{"key":"e_1_3_1_70_2","volume-title":"OpenGL programming guide: the official guide to learning OpenGL, versions 3.0 and 3.1","author":"Dave Shreiner","year":"2009","unstructured":"Dave Shreiner et al. 2009. OpenGL programming guide: the official guide to learning OpenGL, versions 3.0 and 3.1. Pearson Education."},{"key":"e_1_3_1_71_2","doi-asserted-by":"crossref","unstructured":"Ting Su Jue Wang and Zhendong Su. 2021. Benchmarking automated GUI testing for Android against real-world bugs. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 119\u2013130.","DOI":"10.1145\/3468264.3468620"},{"key":"e_1_3_1_72_2","doi-asserted-by":"crossref","unstructured":"Chengnian Sun Vu Le Qirun Zhang and Zhendong Su. 2016. Toward understanding compiler bugs in GCC and LLVM. In Proceedings of the 25th international symposium on software testing and analysis. 294\u2013305.","DOI":"10.1145\/2931037.2931074"},{"key":"e_1_3_1_73_2","unstructured":"Maolin Sun Yibiao Yang Yang Wang Ming Wen Haoxiang Jia and Yuming Zhou. 2023. SMT Solver Validation Empowered by Large Pre-trained Language Models. 
In ASE."},{"key":"e_1_3_1_74_2","volume-title":"Fuzzing: brute force vulnerability discovery","author":"Sutton Michael","year":"2007","unstructured":"Michael Sutton, Adam Greene, and Pedram Amini. 2007. Fuzzing: brute force vulnerability discovery. Pearson Education."},{"key":"e_1_3_1_75_2","doi-asserted-by":"crossref","unstructured":"Yutian Tang Zhijie Liu Zhichao Zhou and Xiapu Luo. 2024. Chatgpt vs sbst: A comparative assessment of unit test suite generation. IEEE Transactions on Software Engineering (2024).","DOI":"10.1109\/TSE.2024.3382365"},{"key":"e_1_3_1_76_2","unstructured":"TensorFlow 2023. TensorFlow. https:\/\/www.tensorflow.org."},{"key":"e_1_3_1_77_2","unstructured":"TensorFlowLite 2023. TensorFlow Lite. https:\/\/www.tensorflow.org\/lite."},{"key":"e_1_3_1_78_2","unstructured":"TensorFlowXLA 2023. TensorFlow XLA. https:\/\/www.tensorflow.org\/xla."},{"key":"e_1_3_1_79_2","doi-asserted-by":"publisher","DOI":"10.1093\/biomet\/25.3-4.285"},{"key":"e_1_3_1_80_2","unstructured":"Triton 2024. Triton. https:\/\/github.com\/openai\/triton."},{"key":"e_1_3_1_81_2","article-title":"Attention is all you need","volume":"30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in neural information processing systems 30 (2017).","journal-title":"Advances in neural information processing systems"},{"key":"e_1_3_1_82_2","unstructured":"Jiannan Wang Thibaud Lutellier Shangshu Qian Hung Viet Pham and Lin Tan. 2022. EAGLE: Creating Equivalent Graphs to Test Deep Learning Libraries. (2022)."},{"key":"e_1_3_1_83_2","doi-asserted-by":"crossref","unstructured":"Zan Wang Ming Yan Junjie Chen Shuang Liu and Dongdi Zhang. 2020. Deep learning library testing via effective model generation. 
In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 788\u2013799.","DOI":"10.1145\/3368089.3409761"},{"key":"e_1_3_1_84_2","doi-asserted-by":"publisher","unstructured":"Anjiang Wei Yinlin Deng Chenyuan Yang and Lingming Zhang. 2022. Free Lunch for Testing: Fuzzing Deep-Learning Libraries from Open Source. In 2022 IEEE\/ACM 44th International Conference on Software Engineering (ICSE). 995\u20131007. https:\/\/doi.org\/10.1145\/3510003.3510041","DOI":"10.1145\/3510003.3510041"},{"key":"e_1_3_1_85_2","unstructured":"Chunqiu Steven Xia Matteo Paltenghi Jia Le Tian Michael Pradel and Lingming Zhang. 2023. Universal fuzzing via large language models. arXiv preprint arXiv:2308.04748 (2023)."},{"key":"e_1_3_1_86_2","doi-asserted-by":"crossref","unstructured":"Danning Xie Yitong Li Mijung Kim Hung Viet Pham Lin Tan Xiangyu Zhang and Michael W Godfrey. 2022. DocTer: documentation-guided fuzzing for testing deep learning API functions. In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis. 176\u2013188.","DOI":"10.1145\/3533767.3534220"},{"key":"e_1_3_1_87_2","unstructured":"Jianhao Xu Kangjie Lu Zhengjie Du Zhu Ding Linke Li Qiushi Wu Mathias Payer and Bing Mao. 2023. Silent Bugs Matter: A Study of Compiler-Introduced Security Bugs. In 32nd USENIX Security Symposium (USENIX Security 23). 3655\u20133672."},{"key":"e_1_3_1_88_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE48619.2023.00105"},{"key":"e_1_3_1_89_2","doi-asserted-by":"crossref","unstructured":"Xuejun Yang Yang Chen Eric Eide and John Regehr. 2011. Finding and understanding bugs in C compilers. In Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation. 
283\u2013294.","DOI":"10.1145\/1993498.1993532"},{"key":"e_1_3_1_90_2","unstructured":"Zhaomo Yang Brian Johannesmeyer Anders Trier Olesen Sorin Lerner and Kirill Levchenko. 2017. Dead store elimination (still) considered harmful. In 26th USENIX Security Symposium (USENIX Security 17). 1025\u20131040."},{"key":"e_1_3_1_91_2","unstructured":"Shafiq Joty Yue Wang Weishi Wang and Steven C.H. Hoi. 2021. CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation. In EMNLP 2021."},{"key":"e_1_3_1_92_2","unstructured":"Insu Yun Sangho Lee Meng Xu Yeongjin Jang and Taesoo Kim. 2018. QSYM: A practical concolic execution engine tailored for hybrid fuzzing. In 27th USENIX Security Symposium (USENIX Security 18). 745\u2013761."},{"key":"e_1_3_1_93_2","unstructured":"Andreas Zeller Rahul Gopinath Marcel B\u00f6hme Gordon Fraser and Christian Holler. 2019. The fuzzing book."},{"key":"e_1_3_1_94_2","unstructured":"Quanjun Zhang Chunrong Fang Yang Xie Yaxin Zhang Yun Yang Weisong Sun Shengcheng Yu and Zhenyu Chen. 2023. A survey on large language models for software engineering. arXiv preprint arXiv:2312.15223 (2023)."},{"key":"e_1_3_1_95_2","doi-asserted-by":"crossref","unstructured":"Qirun Zhang Chengnian Sun and Zhendong Su. 2017. Skeletal program enumeration for rigorous compiler testing. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation. 
347\u2013361.","DOI":"10.1145\/3062341.3062379"}],"container-title":["Proceedings of the ACM on Programming Languages"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3689736","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3689736","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,4]],"date-time":"2026-02-04T09:05:50Z","timestamp":1770195950000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3689736"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,8]]},"references-count":94,"journal-issue":{"issue":"OOPSLA2","published-print":{"date-parts":[[2024,10,8]]}},"alternative-id":["10.1145\/3689736"],"URL":"https:\/\/doi.org\/10.1145\/3689736","relation":{},"ISSN":["2475-1421"],"issn-type":[{"value":"2475-1421","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10,8]]},"assertion":[{"value":"2024-04-06","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-08-18","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-10-08","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}