{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,21]],"date-time":"2026-01-21T02:03:46Z","timestamp":1768961026106,"version":"3.49.0"},"reference-count":88,"publisher":"Association for Computing Machinery (ACM)","issue":"2","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2026,2,28]]},"abstract":"<jats:p>In recent years, the practice of fuzzing Deep Learning (DL) APIs has received significant attention in the software engineering community. Many API-level DL fuzzers have been proposed to test individual DL APIs by generating malformed input. Although these fuzzers have been effective in detecting bugs and outperforming prior work, there remains a gap in benchmarking them against ground-truth, real-world bugs in DL libraries. Existing comparisons among these API-level DL fuzzers primarily focus on the bugs detected but do not offer a comprehensive, in-depth evaluation of the fuzzers\u2019 effectiveness.<\/jats:p>\n                  <jats:p>\n                    In this work, we perform the first in-depth evaluation of state-of-the-art API-level DL fuzzers that generate tests for single DL APIs, focusing on their effectiveness against real-world bugs. We manually created an extensive benchmark dataset, including 517 real-world DL bugs collected from PyTorch and TensorFlow libraries that can be triggered by malformed inputs. 
We then apply seven state-of-the-art DL fuzzers\u2014\n                    <jats:monospace>FreeFuzz<\/jats:monospace>\n                    ,\n                    <jats:monospace>DeepRel<\/jats:monospace>\n                    ,\n                    <jats:monospace>NablaFuzz<\/jats:monospace>\n                    ,\n                    <jats:monospace>DocTer<\/jats:monospace>\n                    ,\n                    <jats:monospace>ACETest<\/jats:monospace>\n                    ,\n                    <jats:monospace>TitanFuzz<\/jats:monospace>\n                    , and\n                    <jats:monospace>FuzzGPT<\/jats:monospace>\n                    \u2014to our benchmark dataset, following their respective instructions. Our results show that these fuzzers detect only 6.5% (34 out of 517) of the unique real-world bugs in the dataset. Our analysis identifies two dominant factors that impact the effectiveness of these fuzzers in detecting real-world bugs. These findings suggest opportunities for improving the performance of fuzzers in future work. 
Overall, this study extends previous work on DL fuzzers by providing an extensive evaluation and benchmarking platform for fuzzing DL libraries.\n                  <\/jats:p>","DOI":"10.1145\/3729533","type":"journal-article","created":{"date-parts":[[2025,4,15]],"date-time":"2025-04-15T13:20:40Z","timestamp":1744723240000},"page":"1-34","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Evaluating API-Level Deep Learning Fuzzers: A Comprehensive Benchmarking Study"],"prefix":"10.1145","volume":"35","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0484-3972","authenticated-orcid":false,"given":"Nima Shiri","family":"Harzevili","sequence":"first","affiliation":[{"name":"York University, Toronto, Ontario, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1659-1960","authenticated-orcid":false,"given":"Moshi","family":"Wei","sequence":"additional","affiliation":[{"name":"York University, Toronto, Ontario, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-8192-0164","authenticated-orcid":false,"given":"Mohammad Mahdi","family":"Mohajer","sequence":"additional","affiliation":[{"name":"York University, Toronto, Ontario, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0617-2877","authenticated-orcid":false,"given":"Hung Viet","family":"Pham","sequence":"additional","affiliation":[{"name":"York University, Toronto, Ontario, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0861-8326","authenticated-orcid":false,"given":"Song","family":"Wang","sequence":"additional","affiliation":[{"name":"York University, Toronto, Ontario, 
Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2026,1,20]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"NVIDIA Corporation. 2025. NVIDIA TensorRT: an SDK for high-performance deep learning inference. Retrieved May 12 2025 from https:\/\/developer.nvidia.com\/tensorrt"},{"key":"e_1_3_2_3_2","unstructured":"Microsoft Corporation. 2025. ONNX Runtime: Production-grade AI engine to speed up training and inferencing in your existing technology stack. Retrieved May 12 2025 from https:\/\/onnxruntime.ai\/docs\/performance\/graph-optimizations.html"},{"key":"e_1_3_2_4_2","unstructured":"Mart\u00edn Abadi Ashish Agarwal Paul Barham Eugene Brevdo Zhifeng Chen Craig Citro Greg S. Corrado Andy Davis Jeffrey Dean Matthieu Devin et al. 2015. TensorFlow: Large-scale machine learning on heterogeneous systems. Retrieved from https:\/\/www.tensorflow.org\/Software available from tensorflow.org"},{"key":"e_1_3_2_5_2","first-page":"265","volume-title":"Proceedings of the 12th USENIX conference on Operating Systems Design and Implementation","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. Tensorflow: A system for large-scale machine learning. In Proceedings of the 12th USENIX conference on Operating Systems Design and Implementation, Vol. 16, Savannah, GA, 265\u2013283."},{"key":"e_1_3_2_6_2","doi-asserted-by":"crossref","first-page":"106771","DOI":"10.1016\/j.knosys.2021.106771","article-title":"Image classification with deep learning in the presence of noisy labels: A survey","volume":"215","author":"Algan G\u00f6rkem","year":"2021","unstructured":"G\u00f6rkem Algan and Ilkay Ulusoy. 2021. Image classification with deep learning in the presence of noisy labels: A survey. 
Knowledge-Based Systems 215 (2021), 106771.","journal-title":"Knowledge-Based Systems"},{"key":"e_1_3_2_7_2","first-page":"195","volume-title":"Proceedings of the 2021 IEEE 15th International Conference on Semantic Computing (ICSC)","author":"Athreya Ram G.","year":"2021","unstructured":"Ram G. Athreya, Srividya K. Bansal, Axel-Cyrille Ngonga Ngomo, and Ricardo Usbeck. 2021. Template-based question answering using recursive neural networks. In Proceedings of the 2021 IEEE 15th International Conference on Semantic Computing (ICSC). IEEE, 195\u2013198."},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/3551349.3561161"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510230"},{"key":"e_1_3_2_10_2","unstructured":"Sicong Cao Xiaobing Sun Lili Bo Rongxin Wu Bin Li and Chuanqi Tao. 2022. MVD: Memory-related vulnerability detection based on flow-sensitive graph neural networks. arXiv:2203.02660. Retrieved from https:\/\/arxiv.org\/abs\/2203.02660"},{"key":"e_1_3_2_11_2","first-page":"96","volume-title":"Proceedings of the 2011 IEEE 35th Annual Computer Software and Applications Conference Workshops","author":"Chatzieleftheriou George","year":"2011","unstructured":"George Chatzieleftheriou and Panagiotis Katsaros. 2011. Test-driving static analysis tools in search of C code vulnerabilities. In Proceedings of the 2011 IEEE 35th Annual Computer Software and Applications Conference Workshops. IEEE, 96\u2013103."},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/3587155"},{"key":"e_1_3_2_13_2","first-page":"578","volume-title":"Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI \u201918)","author":"Chen Tianqi","year":"2018","unstructured":"Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, et al. 2018. 
TVM: An automated End-to-End optimizing compiler for deep learning. In Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI \u201918), 578\u2013594."},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.5555\/1406221"},{"key":"e_1_3_2_15_2","first-page":"2383","volume-title":"Proceedings of the 32nd USENIX Security Symposium (USENIX Security \u201923)","author":"Christou Neophytos","year":"2023","unstructured":"Neophytos Christou, Di Jin, Vaggelis Atlidakis, Baishakhi Ray, and Vasileios P. Kemerlis. 2023. IvySyn: Automated vulnerability discovery in deep learning frameworks. In Proceedings of the 32nd USENIX Security Symposium (USENIX Security \u201923), 2383\u20132400."},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/3597926.3598067"},{"key":"e_1_3_2_17_2","doi-asserted-by":"crossref","unstructured":"Yinlin Deng Chunqiu Steven Xia Chenyuan Yang Shizhuo Dylan Zhang Shujing Yang and Lingming Zhang. 2023. Large language models are edge-case fuzzers: Testing deep learning libraries via fuzzgpt. arXiv:2304.02014. Retrieved from https:\/\/arxiv.org\/abs\/2304.02014","DOI":"10.1145\/3597926.3598067"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3540250.3549085"},{"key":"e_1_3_2_19_2","doi-asserted-by":"crossref","first-page":"755","DOI":"10.1109\/UBMK.2017.8093521","volume-title":"Proceedings of the 2017 International Conference on Computer Science and Engineering (UBMK)","author":"Ertam Fatih","year":"2017","unstructured":"Fatih Ertam and Galip Ayd\u0131n. 2017. Data classification with deep learning using Tensorflow. In Proceedings of the 2017 International Conference on Computer Science and Engineering (UBMK). IEEE, 755\u2013758."},{"key":"e_1_3_2_20_2","unstructured":"Facebook. 2019. American Fuzzy Lop. 
Retrieved from https:\/\/github.com\/google\/AFL\/releases"},{"key":"e_1_3_2_21_2","doi-asserted-by":"crossref","unstructured":"Emily First Markus N. Rabe Talia Ringer and Yuriy Brun. 2023. Baldur: Whole-proof generation and repair with large language models. arXiv:2303.04910. Retrieved from https:\/\/arxiv.org\/abs\/2303.04910","DOI":"10.1145\/3611643.3616243"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3551349.3560431"},{"key":"e_1_3_2_23_2","first-page":"30","volume-title":"Proceedings of the 2021 36th IEEE\/ACM International Conference on Automated Software Engineering (ASE)","author":"Gauthier Ian X.","year":"2021","unstructured":"Ian X. Gauthier, Maxime Lamothe, Gunter Mussbacher, and Shane McIntosh. 2021. Is historical data an appropriate benchmark for reviewer recommendation systems?: A case study of the gerrit community. In Proceedings of the 2021 36th IEEE\/ACM International Conference on Automated Software Engineering (ASE). IEEE, 30\u201341."},{"key":"e_1_3_2_24_2","unstructured":"Samuel Gro\u00df. 2019. Coverage-guided fuzzer for dynamic language interpreters based on a custom intermediate language. Retrieved from https:\/\/github.com\/googleprojectzero\/fuzzilli"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510092"},{"key":"e_1_3_2_26_2","first-page":"486","volume-title":"Proceedings of the 35th IEEE\/ACM International Conference on Automated Software Engineering","author":"Guo Qianyu","year":"2020","unstructured":"Qianyu Guo, Xiaofei Xie, Yi Li, Xiaoyu Zhang, Yang Liu, Xiaohong Li, and Chao Shen. 2020. Audee: Automated testing for deep learning frameworks. 
In Proceedings of the 35th IEEE\/ACM International Conference on Automated Software Engineering, 486\u2013498."},{"key":"e_1_3_2_27_2","first-page":"317","volume-title":"Proceedings of the 2018 33rd IEEE\/ACM International Conference on Automated Software Engineering (ASE)","author":"Habib Andrew","year":"2018","unstructured":"Andrew Habib and Michael Pradel. 2018. How many of all bugs do we find? A study of static bug detectors. In Proceedings of the 2018 33rd IEEE\/ACM International Conference on Automated Software Engineering (ASE). IEEE, 317\u2013328."},{"key":"e_1_3_2_28_2","first-page":"1","volume-title":"Proceedings of the 10th ACM\/IEEE International Symposium on Empirical Software Engineering and Measurement","author":"Han Xue","year":"2016","unstructured":"Xue Han and Tingting Yu. 2016. An empirical study on performance bugs for highly configurable software systems. In Proceedings of the 10th ACM\/IEEE International Symposium on Empirical Software Engineering and Measurement, 1\u201310."},{"key":"e_1_3_2_29_2","first-page":"795","volume-title":"Proceedings of the 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE)","author":"Harzevili Nima Shiri","year":"2023","unstructured":"Nima Shiri Harzevili, Jiho Shin, Junjie Wang, Song Wang, and Nachiappan Nagappan. 2023. Automatic static vulnerability detection for machine learning libraries: Are we there yet? In Proceedings of the 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE). IEEE, 795\u2013806."},{"key":"e_1_3_2_30_2","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1109\/MSR59073.2023.00018","volume-title":"Proceedings of the 2023 IEEE\/ACM 20th International Conference on Mining Software Repositories (MSR)","author":"Harzevili Nima Shiri","year":"2023","unstructured":"Nima Shiri Harzevili, Jiho Shin, Junjie Wang, Song Wang, and Nachiappan Nagappan. 2023. 
Characterizing and understanding software security vulnerabilities in machine learning libraries. In Proceedings of the 2023 IEEE\/ACM 20th International Conference on Mining Software Repositories (MSR). IEEE, 27\u201338."},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3428334"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00799-020-00279-3"},{"key":"e_1_3_2_33_2","doi-asserted-by":"crossref","first-page":"510","DOI":"10.1145\/3338906.3338955","volume-title":"Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering","author":"Islam Md Johirul","year":"2019","unstructured":"Md Johirul Islam, Giang Nguyen, Rangeet Pan, and Hridesh Rajan. 2019. A comprehensive study on deep learning bug characteristics. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 510\u2013520."},{"key":"e_1_3_2_34_2","doi-asserted-by":"crossref","first-page":"110935","DOI":"10.1016\/j.jss.2021.110935","article-title":"The symptoms, causes, and repairs of bugs inside a deep learning library","volume":"177","author":"Jia Li","year":"2021","unstructured":"Li Jia, Hao Zhong, Xiaoyin Wang, Linpeng Huang, and Xuansheng Lu. 2021. The symptoms, causes, and repairs of bugs inside a deep learning library. Journal of Systems and Software 177 (2021), 110935.","journal-title":"Journal of Systems and Software"},{"key":"e_1_3_2_35_2","doi-asserted-by":"crossref","first-page":"3318","DOI":"10.1145\/3460120.3485364","volume-title":"Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security","author":"Jiang Zhiyuan","year":"2021","unstructured":"Zhiyuan Jiang, Xiyue Jiang, Ahmad Hazimeh, Chaojing Tang, Chao Zhang, and Mathias Payer. 2021. Igor: Crash deduplication through root-cause clustering. 
In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, 3318\u20133336."},{"key":"e_1_3_2_36_2","first-page":"1","volume-title":"Proceedings of the 37th IEEE\/ACM International Conference on Automated Software Engineering","author":"Joshy Ashwin Kallingal","year":"2022","unstructured":"Ashwin Kallingal Joshy and Wei Le. 2022. FuzzerAid: Grouping fuzzed crashes based on fault signatures. In Proceedings of the 37th IEEE\/ACM International Conference on Automated Software Engineering, 1\u201312."},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/3243734.3243804"},{"key":"e_1_3_2_38_2","first-page":"1","volume-title":"Proceedings of the 2018 4th International Conference on Computing Communication Control and Automation (ICCUBEA)","author":"Kulkarni Ruturaj","year":"2018","unstructured":"Ruturaj Kulkarni, Shruti Dhavalikar, and Sonal Bangar. 2018. Traffic light detection and recognition for self driving cars using deep learning. In Proceedings of the 2018 4th International Conference on Computing Communication Control and Automation (ICCUBEA). IEEE, 1\u20134."},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3551349.3556908"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510177"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/3628159"},{"key":"e_1_3_2_42_2","unstructured":"Jiawei Liu Jinkun Lin Fabian Ruffy Cheng Tan Jinyang Li Aurojit Panda and Lingming Zhang. 2022. Finding deep-learning compilation bugs with NNSmith. arXiv:2207.13066. Retrieved from https:\/\/arxiv.org\/abs\/2207.13066"},{"key":"e_1_3_2_43_2","first-page":"530","volume-title":"Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems","volume":"2","author":"Liu Jiawei","year":"2023","unstructured":"Jiawei Liu, Jinkun Lin, Fabian Ruffy, Cheng Tan, Jinyang Li, Aurojit Panda, and Lingming Zhang. 2023. 
NNSmith: Generating diverse and valid test cases for deep learning compilers. In Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Vol. 2, 530\u2013543."},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/3527317"},{"key":"e_1_3_2_45_2","first-page":"288","volume-title":"Proceedings of the 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE)","author":"Luo Weisi","unstructured":"Weisi Luo, Dong Chai, Xiaoyue Ruan, Jiang Wang, Chunrong Fang, and Zhenyu Chen. 2021. Graph-based fuzz testing for deep learning inference engines. In Proceedings of the 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE). IEEE, 288\u2013299."},{"key":"e_1_3_2_46_2","doi-asserted-by":"crossref","first-page":"108364","DOI":"10.1016\/j.patcog.2021.108364","article-title":"Semi-supervised active salient object detection","volume":"123","author":"Lv Yunqiu","year":"2022","unstructured":"Yunqiu Lv, Bowen Liu, Jing Zhang, Yuchao Dai, Aixuan Li, and Tong Zhang. 2022. Semi-supervised active salient object detection. Pattern Recognition 123 (2022), 108364.","journal-title":"Pattern Recognition"},{"key":"e_1_3_2_47_2","first-page":"1949","volume-title":"USENIX Security Symposium","author":"Lyu Chenyang","year":"2019","unstructured":"Chenyang Lyu, Shouling Ji, Chao Zhang, Yuwei Li, Wei-Han Lee, Yu Song, and Raheem Beyah. 2019. MOPT: Optimized mutation scheduling for fuzzers. In USENIX Security Symposium, 1949\u20131966."},{"key":"e_1_3_2_48_2","first-page":"24","volume-title":"Proceedings of the 29th Annual Network and Distributed System Security Symposium (NDSS)","author":"Lyu Chenyang","year":"2022","unstructured":"Chenyang Lyu, Shouling Ji, Xuhong Zhang, Hong Liang, Binbin Zhao, Kangjie Lu, and Raheem Beyah. 2022. EMS: History-driven mutation for coverage-based fuzzing. 
In Proceedings of the 29th Annual Network and Distributed System Security Symposium (NDSS), 24\u201328."},{"key":"e_1_3_2_49_2","unstructured":"Farzaneh Mahdisoltani Guillaume Berger Waseem Gharbieh David Fleet and Roland Memisevic. 2018. Fine-grained video classification and captioning. arXiv:1804.09235. Retrieved from https:\/\/arxiv.org\/abs\/1804.09235"},{"key":"e_1_3_2_50_2","doi-asserted-by":"crossref","first-page":"1024","DOI":"10.1145\/3377811.3380421","volume-title":"Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering","author":"Man\u00e8s Valentin J. M.","year":"2020","unstructured":"Valentin J. M. Man\u00e8s, Soomin Kim, and Sang Kil Cha. 2020. Ankou: Guiding grey-box fuzzing towards combinatorial difference. In Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering, 1024\u20131036."},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1145\/3468264.3473932"},{"key":"e_1_3_2_52_2","first-page":"923","volume-title":"Proceedings of the 2017 IEEE Global Conference on Signal and Information Processing (GlobalSIP)","author":"Minaee Shervin","year":"2017","unstructured":"Shervin Minaee and Zhu Liu. 2017. Automatic question-answering using a deep similarity neural network. In Proceedings of the 2017 IEEE Global Conference on Signal and Information Processing (GlobalSIP). IEEE, 923\u2013927."},{"key":"e_1_3_2_53_2","doi-asserted-by":"crossref","first-page":"662","DOI":"10.1145\/3460319.3469077","volume-title":"Proceedings of the 30th ACM SIGSOFT international Symposium on Software Testing and Analysis","author":"Natella Roberto","year":"2021","unstructured":"Roberto Natella and Van-Thuan Pham. 2021. Profuzzbench: A benchmark for stateful protocol fuzzing. 
In Proceedings of the 30th ACM SIGSOFT international Symposium on Software Testing and Analysis, 662\u2013665."},{"key":"e_1_3_2_54_2","first-page":"1050","volume-title":"Proceedings of the 2019 IEEE\/ACM 41st International Conference on Software Engineering (ICSE)","author":"Nguyen Phuong T.","year":"2019","unstructured":"Phuong T. Nguyen, Juri Di Rocco, Davide Di Ruscio, Lina Ochoa, Thomas Degueule, and Massimiliano Di Penta. 2019. Focus: A recommender system for mining API function calls and usage patterns. In Proceedings of the 2019 IEEE\/ACM 41st International Conference on Software Engineering (ICSE). IEEE, 1050\u20131060."},{"key":"e_1_3_2_55_2","author":"Paszke Adam","year":"2017","unstructured":"Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. Automatic differentiation in PyTorch. Retrieved from https:\/\/openreview.net\/forum?id=BJJsrmfCZ","journal-title":"Automatic differentiation in PyTorch"},{"key":"e_1_3_2_56_2","first-page":"8024","volume-title":"Advances in Neural Information Processing Systems","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems. Vol. 32, Curran Associates, Inc., 8024\u20138035. 
Retrieved from http:\/\/papers.neurips.cc\/paper\/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf"},{"issue":"4","key":"e_1_3_2_57_2","doi-asserted-by":"crossref","first-page":"1876","DOI":"10.1109\/TSE.2022.3197063","article-title":"Revisiting, benchmarking and exploring API recommendation: How far are we","volume":"49","author":"Peng Yun","year":"2022","unstructured":"Yun Peng, Shuqing Li, Wenwei Gu, Yichen Li, Wenxuan Wang, Cuiyun Gao, and Michael R. Lyu. 2022. Revisiting, benchmarking and exploring API recommendation: How far are we? IEEE Transactions on Software Engineering 49, 4 (2022), 1876\u20131897.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_3_2_58_2","first-page":"1027","volume-title":"Proceedings of the 2019 IEEE\/ACM 41st International Conference on Software Engineering (ICSE)","author":"Viet Pham Hung","year":"2019","unstructured":"Hung Viet Pham, Thibaud Lutellier, Weizhen Qi, and Lin Tan. 2019. CRADLE: Cross-backend validation to detect and localize bugs in deep learning libraries. In Proceedings of the 2019 IEEE\/ACM 41st International Conference on Software Engineering (ICSE). IEEE, 1027\u20131038."},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.1109\/IVS.2017.7995849"},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11063-021-10470-5"},{"key":"e_1_3_2_61_2","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1109\/ISSRE.2004.1","volume-title":"Proceedings of the 15th International Symposium on Software Reliability Engineering","author":"Rutar Nick","year":"2004","unstructured":"Nick Rutar, Christian B. Almazan, and Jeffrey S. Foster. 2004. A comparison of bug finding tools for java. In Proceedings of the 15th International Symposium on Software Reliability Engineering. IEEE, 245\u2013256."},{"key":"e_1_3_2_62_2","unstructured":"K. Serebryany. 2015. libFuzzer a library for coverage-guided fuzz testing. 
Retrieved from https:\/\/llvm.org\/docs\/LibFuzzer.html"},{"key":"e_1_3_2_63_2","doi-asserted-by":"crossref","first-page":"968","DOI":"10.1145\/3468264.3468591","volume-title":"Proceedings of the 29th ACM Joint meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering","author":"Shen Qingchao","year":"2021","unstructured":"Qingchao Shen, Haoyang Ma, Junjie Chen, Yongqiang Tian, Shing-Chi Cheung, and Xiang Chen. 2021. A comprehensive study of deep learning compiler bugs. In Proceedings of the 29th ACM Joint meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 968\u2013980."},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1145\/3597926.3598088"},{"issue":"1","key":"e_1_3_2_65_2","first-page":"23","article-title":"Self-driving cars: Evaluation of deep learning techniques for object detection in different driving conditions","volume":"2","author":"Simhambhatla Ramesh","year":"2019","unstructured":"Ramesh Simhambhatla, Kevin Okiah, Shravan Kuchkula, and Robert Slater. 2019. Self-driving cars: Evaluation of deep learning techniques for object detection in different driving conditions. SMU Data Science Review 2, 1 (2019), 23.","journal-title":"SMU Data Science Review"},{"key":"e_1_3_2_66_2","first-page":"1","volume-title":"Proceedings of the Symposium on Network and Distributed System Security","author":"Stephens Nick","year":"2016","unstructured":"Nick Stephens, John Grosen, Christopher Salls, Andrew Dutcher, Ruoyu Wang, Jacopo Corbetta, Yan Shoshitaishvili, Christopher Kruegel, and Giovanni Vigna. 2016. Driller: Augmenting fuzzing through selective symbolic execution. In Proceedings of the Symposium on Network and Distributed System Security, Vol. 16, 1\u201316."},{"key":"e_1_3_2_67_2","unstructured":"Qidong Su Chuqin Geng Gennady Pekhimenko and Xujie Si. 2023. TorchProbe: Fuzzing dynamic deep learning compilers. arXiv:2310.20078. 
Retrieved from https:\/\/arxiv.org\/abs\/2310.20078"},{"key":"e_1_3_2_68_2","doi-asserted-by":"publisher","DOI":"10.5555\/3275309"},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-013-9258-8"},{"key":"e_1_3_2_70_2","doi-asserted-by":"publisher","DOI":"10.1145\/3236024.3275439"},{"key":"e_1_3_2_71_2","first-page":"292","volume-title":"Proceedings of the 2021 36th IEEE\/ACM International Conference on Automated Software Engineering (ASE)","author":"Tomassi David A.","year":"2021","unstructured":"David A. Tomassi and Cindy Rubio-Gonz\u00e1lez. 2021. On the real-world effectiveness of static bug detectors at finding null pointer exceptions. In Proceedings of the 2021 36th IEEE\/ACM International Conference on Automated Software Engineering (ASE). IEEE, 292\u2013303."},{"key":"e_1_3_2_72_2","article-title":"Automatic differentiation in ML: Where we are and where we should be going","volume":"31","author":"Van Merri\u00ebnboer Bart","year":"2018","unstructured":"Bart Van Merri\u00ebnboer, Olivier Breuleux, Arnaud Bergeron, and Pascal Lamblin. 2018. Automatic differentiation in ML: Where we are and where we should be going. In Advances in Neural Information Processing Systems, Vol. 31.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_73_2","first-page":"437","volume-title":"European Conference on Machine Learning","author":"Vermorel Joannes","year":"2005","unstructured":"Joannes Vermorel and Mehryar Mohri. 2005. Multi-armed bandit algorithms and empirical evaluation. In European Conference on Machine Learning. Springer, 437\u2013448."},{"key":"e_1_3_2_74_2","doi-asserted-by":"crossref","first-page":"319","DOI":"10.1109\/MSR.2013.6624045","volume-title":"Proceedings of the 2013 10th Working Conference on Mining Software Repositories (MSR)","author":"Wang Jue","year":"2013","unstructured":"Jue Wang, Yingnong Dang, Hongyu Zhang, Kai Chen, Tao Xie, and Dongmei Zhang. 2013. 
Mining succinct and high-coverage API usage patterns from source code. In Proceedings of the 2013 10th Working Conference on Mining Software Repositories (MSR). IEEE, 319\u2013328."},{"key":"e_1_3_2_75_2","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510165"},{"key":"e_1_3_2_76_2","first-page":"1548","volume-title":"Proceedings of the 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE)","author":"Wang Song","year":"2021","unstructured":"Song Wang, Nishtha Shrestha, Abarna Kucheri Subburaman, Junjie Wang, Moshi Wei, and Nachiappan Nagappan. 2021. Automatic unit test generation for machine learning libraries: How far are we? In Proceedings of the 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE). IEEE, 1548\u20131560."},{"key":"e_1_3_2_77_2","doi-asserted-by":"publisher","DOI":"10.1145\/3368089.3409761"},{"key":"e_1_3_2_78_2","unstructured":"Anjiang Wei Yinlin Deng Chenyuan Yang and Lingming Zhang. 2022. Free lunch for testing: Fuzzing deep-learning libraries from open source. arXiv:2201.06589. Retrieved from https:\/\/arxiv.org\/abs\/2201.06589"},{"key":"e_1_3_2_79_2","doi-asserted-by":"publisher","DOI":"10.1145\/3540250.3549124"},{"key":"e_1_3_2_80_2","unstructured":"Yuxiang Wei Chunqiu Steven Xia and Lingming Zhang. 2023. Copiloting the copilots: Fusing large language models with completion engines for automated program repair. arXiv:2309.00608. Retrieved from https:\/\/arxiv.org\/abs\/2309.00608"},{"key":"e_1_3_2_81_2","doi-asserted-by":"publisher","DOI":"10.1145\/3377811.3380396"},{"key":"e_1_3_2_82_2","doi-asserted-by":"publisher","DOI":"10.1145\/3533767.3534220"},{"key":"e_1_3_2_83_2","doi-asserted-by":"publisher","DOI":"10.1145\/3520312.3534862"},{"key":"e_1_3_2_84_2","unstructured":"Chenyuan Yang Yinlin Deng Jiayi Yao Yuxing Tu Hanchi Li and Lingming Zhang. 2023. Fuzzing automatic differentiation in deep-learning libraries. arXiv:2302.04351. 
Retrieved from https:\/\/arxiv.org\/abs\/2302.04351"},{"key":"e_1_3_2_85_2","doi-asserted-by":"publisher","DOI":"10.1145\/3551349.3560415"},{"key":"e_1_3_2_86_2","doi-asserted-by":"crossref","unstructured":"Jiyang Zhang Pengyu Nie Junyi Jessy Li and Milos Gligoric. 2023. Multilingual code co-evolution using large language models. arXiv:2307.14991. Retrieved from https:\/\/arxiv.org\/abs\/2307.14991","DOI":"10.1145\/3611643.3616350"},{"key":"e_1_3_2_87_2","first-page":"318","volume-title":"Proceedings of the 23rd European Conference on Object-Oriented Programming (ECOOP \u201909)","author":"Zhong Hao","year":"2009","unstructured":"Hao Zhong, Tao Xie, Lu Zhang, Jian Pei, and Hong Mei. 2009. MAPO: Mining and recommending API usage patterns. In Proceedings of the 23rd European Conference on Object-Oriented Programming (ECOOP \u201909). Springer, 318\u2013343."},{"key":"e_1_3_2_88_2","doi-asserted-by":"publisher","DOI":"10.1145\/3106237.3117771"},{"key":"e_1_3_2_89_2","doi-asserted-by":"publisher","DOI":"10.1145\/1029894.1029911"}],"container-title":["ACM Transactions on Software Engineering and 
Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3729533","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,20]],"date-time":"2026-01-20T14:03:02Z","timestamp":1768917782000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3729533"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,20]]},"references-count":88,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2026,2,28]]}},"alternative-id":["10.1145\/3729533"],"URL":"https:\/\/doi.org\/10.1145\/3729533","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1,20]]},"assertion":[{"value":"2024-06-19","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-04-07","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-01-20","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}