{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,4]],"date-time":"2026-01-04T09:46:54Z","timestamp":1767520014502,"version":"3.48.0"},"reference-count":44,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2026,1,4]],"date-time":"2026-01-04T00:00:00Z","timestamp":1767484800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,1,4]],"date-time":"2026-01-04T00:00:00Z","timestamp":1767484800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key R&D Program of China","doi-asserted-by":"crossref","award":["2023YFB2704903"],"award-info":[{"award-number":["2023YFB2704903"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100012166","name":"National Key R&D Program of China","doi-asserted-by":"crossref","award":["2020YFB1807504"],"award-info":[{"award-number":["2020YFB1807504"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Cybersecurity"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    The widespread adoption of deep learning (DL) libraries has raised concerns about their reliability and security. While prior works leveraged large language models (LLMs) to generate test programs for DL library APIs, the hardcoded program behaviors and low code validity rates render them impractical for real-world testing. To address these challenges, we propose FD-FACTORY, a fully automated framework that leverages LLMs to generate fuzz drivers for DL API testing. 
The fuzz driver programs accept mutated inputs from fuzzing engines to achieve effective code analysis. Inspired by the modular design of industrial production lines, FD-FACTORY decomposes the generation process into eight distinct stages:\n                    <jats:italic>Preparation, Initial Fuzz Driver Generation, Early Stop Checks, Verification, Issue Diagnosis, Decision Making, Repair Loop, and Deployment<\/jats:italic>. Each stage is handled by dedicated agents or tools to enhance construction efficiency. Experimental results demonstrate that FD-FACTORY achieves 73.67% and 65.33% success rates in generating fuzz drivers for PyTorch and TensorFlow, an improvement of 34.66 to 54.66% over existing approaches. In addition, FD-FACTORY provides more comprehensive coverage tracking by supporting both Python and native C\/C++ code. It achieves a total coverage of 308,351 lines on PyTorch and 528,427 lines on TensorFlow, substantially surpassing the results reported by previous approaches. Unlike prior approaches, which rely on repeated interactions with LLM servers throughout the entire testing process, our framework confines the use of LLMs strictly to the fuzz driver generation stages before deployment. 
Once generated, the fuzz drivers can be reused without further LLM involvement, thereby enhancing the practicality and sustainability of LLM-assisted fuzzing in real-world scenarios.\n                  <\/jats:p>","DOI":"10.1186\/s42400-025-00532-9","type":"journal-article","created":{"date-parts":[[2026,1,4]],"date-time":"2026-01-04T09:43:44Z","timestamp":1767519824000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Automating fuzz driver generation for deep learning libraries with large language models"],"prefix":"10.1186","volume":"9","author":[{"given":"Tianming","family":"Zheng","sequence":"first","affiliation":[]},{"given":"Fanchao","family":"Meng","sequence":"additional","affiliation":[]},{"given":"Ping","family":"Yi","sequence":"additional","affiliation":[]},{"given":"Yue","family":"Wu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2026,1,4]]},"reference":[{"issue":"1","key":"532_CR1","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1007\/s10462-023-10631-z","volume":"57","author":"R Archana","year":"2024","unstructured":"Archana R, Jeevaraj PE (2024) Deep learning models for digital image processing: a review. Artif Intell Rev 57(1):11","journal-title":"Artif Intell Rev"},{"key":"532_CR2","doi-asserted-by":"crossref","unstructured":"Choi S, Alkinoon A, Alghuried A, Alghamdi A, Mohaisen D (2025) Attributing chatgpt-transformed synthetic code. In: 2025 IEEE 45th international conference on distributed computing systems (ICDCS), pp 89\u201399. IEEE","DOI":"10.1109\/ICDCS63083.2025.00018"},{"key":"532_CR3","unstructured":"Christou N, Jin D, Atlidakis V, Ray B, Kemerlis VP (2023) Ivysyn: Automated vulnerability discovery in deep learning frameworks. In: Calandrino, J.A., Troncoso, C. (eds) 32nd USENIX security symposium, USENIX security 2023, Anaheim, CA, USA, pp 2383\u20132400. USENIX Association. 
https:\/\/www.usenix.org\/conference\/usenixsecurity23\/presentation\/christou"},{"key":"532_CR4","doi-asserted-by":"publisher","unstructured":"Deng Y, Xia CS, Peng H, Yang C, Zhang L (2023) Large language models are zero-shot fuzzers: Fuzzing deep-learning libraries via large language models. In: Just R., Fraser G. (eds) Proceedings of the 32nd ACM SIGSOFT international symposium on software testing and analysis, ISSTA 2023, Seattle, pp 423\u2013435. ACM. https:\/\/doi.org\/10.1145\/3597926.3598067","DOI":"10.1145\/3597926.3598067"},{"key":"532_CR5","doi-asserted-by":"publisher","unstructured":"Deng Y, Xia CS, Yang C, Zhang SD, Yang S, Zhang L (2024) Large language models are edge-case generators: crafting unusual programs for fuzzing deep learning libraries. In: Proceedings of the 46th IEEE\/ACM international conference on software engineering, ICSE 2024, Lisbon, Portugal, pp 70\u201317013. ACM. https:\/\/doi.org\/10.1145\/3597503.3623343","DOI":"10.1145\/3597503.3623343"},{"key":"532_CR6","doi-asserted-by":"publisher","unstructured":"Deng Y, Yang C, Wei A, Zhang L (2022) Fuzzing deep-learning libraries via automated relational API inference. In: Roychoudhury A, Cadar C, Kim M (eds) Proceedings of the 30th ACM joint European software engineering conference and symposium on the foundations of software engineering, ESEC\/FSE 2022, pp 44\u201356. ACM. https:\/\/doi.org\/10.1145\/3540250.3549085","DOI":"10.1145\/3540250.3549085"},{"key":"532_CR7","doi-asserted-by":"crossref","unstructured":"Fakhoury S, Naik A, Sakkas G, Chakraborty S, Lahiri SK (2024) Llm-based test-driven interactive code generation: user study and empirical evaluation. IEEE Trans Softw Eng","DOI":"10.1109\/TSE.2024.3428972"},{"key":"532_CR8","doi-asserted-by":"publisher","DOI":"10.1016\/J.COSE.2022.102948","volume":"124","author":"K Filus","year":"2023","unstructured":"Filus K, Domanska J (2023) Software vulnerabilities in tensorflow-based deep learning applications. Comput. Secur. 124:102948. 
https:\/\/doi.org\/10.1016\/J.COSE.2022.102948","journal-title":"Comput. Secur."},{"key":"532_CR9","volume-title":"Security and privacy controls for information systems and organizations","author":"JT Force","year":"2017","unstructured":"Force JT (2017) Security and privacy controls for information systems and organizations. Technical report, National Institute of Standards and Technology"},{"key":"532_CR10","unstructured":"Google: google\/atheris. [Online; accessed 2025-06-04] (2025). https:\/\/github.com\/google\/atheris"},{"key":"532_CR11","unstructured":"GoogleCloud: applications of artificial intelligence (AI) | Google Cloud. [Online; accessed 2025-03-22] (2025). https:\/\/cloud.google.com\/discover\/ai-applications"},{"key":"532_CR12","doi-asserted-by":"publisher","unstructured":"Gu J, Luo X, Zhou Y, Wang X (2022) Muffin: Testing deep learning libraries via neural architecture fuzzing. In: 44th IEEE\/ACM international conference on software engineering, ICSE 2022, Pittsburgh, pp 1418\u20131430. ACM. https:\/\/doi.org\/10.1145\/3510003.3510092","DOI":"10.1145\/3510003.3510092"},{"key":"532_CR13","doi-asserted-by":"publisher","unstructured":"Guo Q, Xie X, Li Y, Zhang X, Liu Y, Li X, Shen C (2020) Audee: Automated testing for deep learning frameworks. In: 35th IEEE\/ACM international conference on automated software engineering, ASE 2020, Melbourne, Australia, pp 486\u2013498. IEEE. https:\/\/doi.org\/10.1145\/3324884.3416571","DOI":"10.1145\/3324884.3416571"},{"key":"532_CR14","doi-asserted-by":"crossref","unstructured":"Huang D, Zhang JM, Bu Q, Xie X, Chen J, Cui H (2024) Bias testing and mitigation in llm-based code generation. ACM Trans Softw Eng Methodol","DOI":"10.1145\/3724117"},{"key":"532_CR15","unstructured":"Huang D, Zhang JM, Luck M, Bu Q, Qing Y, Cui H (2023) Agentcoder: Multi-agent-based code generation with iterative testing and optimisation. 
arXiv preprint arXiv:2312.13010"},{"key":"532_CR16","doi-asserted-by":"publisher","unstructured":"Islam MA, Ali ME, Parvez MR (2024) Mapcoder: multi-agent code generation for competitive problem solving. In: Ku, L., Martins, A., Srikumar, V. (eds.) Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2024, Bangkok, Thailand, pp 4912\u20134944. Association for Computational Linguistics. https:\/\/doi.org\/10.18653\/V1\/2024.ACL-LONG.269","DOI":"10.18653\/V1\/2024.ACL-LONG.269"},{"issue":"9","key":"532_CR17","doi-asserted-by":"publisher","first-page":"216","DOI":"10.3390\/ai6090216","volume":"6","author":"NO Jaffal","year":"2025","unstructured":"Jaffal NO, Alkhanafseh M, Mohaisen D (2025) Large language models in cybersecurity: a survey of applications, vulnerabilities, and defense techniques. AI 6(9):216","journal-title":"AI"},{"issue":"6","key":"532_CR18","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3712060","volume":"57","author":"A Joshi","year":"2025","unstructured":"Joshi A, Dabre R, Kanojia D, Li Z, Zhan H, Haffari G, Dippold D (2025) Natural language processing for dialects of a language: A survey. ACM Comput Surv 57(6):1\u201337","journal-title":"ACM Comput Surv"},{"key":"532_CR19","doi-asserted-by":"crossref","unstructured":"Kheddar H, Hemis M, Himeur Y (2024) Automatic speech recognition using advanced deep learning approaches: A survey. Information Fusion, 102422","DOI":"10.1016\/j.inffus.2024.102422"},{"key":"532_CR20","unstructured":"LazyProgrammer: PyTorch vs. TensorFlow: Full Overview 2025 Guide. [Online; accessed 2025-03-23] (2025). https:\/\/lazyprogrammer.me\/pytorch-vs-tensorflow\/"},{"key":"532_CR21","unstructured":"Lin F, Kim DJ et al (2024) When llm-based code generation meets the software development process. 
arXiv e-prints, 2403"},{"key":"532_CR22","doi-asserted-by":"crossref","unstructured":"Lin J, Mohaisen D (2025) From large to mammoth: a comparative evaluation of large language models in vulnerability detection. In: Proceedings of the 2025 network and distributed system security symposium (NDSS)","DOI":"10.14722\/ndss.2025.241491"},{"key":"532_CR23","unstructured":"Li J, Tao C, Li J, Li G, Jin Z, Zhang H, Fang Z, Liu F (2023) Large language model-aware in-context learning for code generation. ACM Trans Softw Eng Methodol"},{"issue":"2","key":"532_CR24","doi-asserted-by":"publisher","first-page":"50","DOI":"10.1145\/3628159","volume":"33","author":"J Liu","year":"2024","unstructured":"Liu J, Huang Y, Wang Z, Ma L, Fang C, Gu M, Zhang X, Chen Z (2024) Generation-based differential fuzzing for deep learning libraries. ACM Trans Softw Eng Methodol 33(2):50\u201315028. https:\/\/doi.org\/10.1145\/3628159","journal-title":"ACM Trans Softw Eng Methodol"},{"key":"532_CR25","unstructured":"Liu F, Liu Y, Shi L, Huang H, Wang R, Yang Z, Zhang L, Li Z, Ma Y (2024) Exploring and evaluating hallucinations in llm-powered code generation. arXiv preprint arXiv:2404.00971"},{"key":"532_CR26","unstructured":"Nunez A, Islam NT, Jha SK, Najafirad P (2024) Autosafecoder: A multi-agent framework for securing llm code generation through static analysis and fuzz testing. arXiv preprint arXiv:2409.10737"},{"key":"532_CR27","doi-asserted-by":"publisher","unstructured":"Pham HV, Lutellier T, Qi W, Tan L (2019) CRADLE: cross-backend validation to detect and localize bugs in deep learning libraries. In: Atlee, J.M., Bultan T, Whittle J (eds) Proceedings of the 41st international conference on software engineering, ICSE 2019, Montreal, QC, Canada, pp 1027\u20131038. IEEE\/ACM. https:\/\/doi.org\/10.1109\/ICSE.2019.00107","DOI":"10.1109\/ICSE.2019.00107"},{"key":"532_CR28","unstructured":"PyTorch: PyTorch. [Online; accessed 2025-03-23] (2025). 
https:\/\/pytorch.org\/"},{"key":"532_CR29","unstructured":"Ruff: Ruff-An extremely fast Python linter and code formatter, written in Rust. [Online; accessed 2025-05-26] (2025). https:\/\/docs.astral.sh\/ruff\/"},{"key":"532_CR30","unstructured":"Tableau: everyday examples and applications of artificial intelligence (AI) | Tableau. [Online; accessed 2025-03-22] (2025). https:\/\/www.tableau.com\/data-insights\/ai\/examples"},{"key":"532_CR31","unstructured":"TensorFlow: TensorFlow. [Online; accessed 2025-03-23] (2025). https:\/\/www.tensorflow.org\/"},{"key":"532_CR32","unstructured":"Vidhya A (2023) Which are Common Applications of Deep Learning in AI? [Online; accessed 2025-03-22] . https:\/\/www.analyticsvidhya.com\/blog\/2023\/06\/common-applications-of-deep-learning-in-artificial-intelligence\/"},{"issue":"1","key":"532_CR33","doi-asserted-by":"crossref","first-page":"105","DOI":"10.53941\/tai.2025.100014","volume":"1","author":"J Wang","year":"2025","unstructured":"Wang J, Ni T, Lee W-B, Zhao Q (2025) A contemporary survey of large language model assisted program analysis. Trans Artif Intell 1(1):105\u2013129","journal-title":"Trans Artif Intell"},{"key":"532_CR34","doi-asserted-by":"publisher","unstructured":"Wang Z, Yan M, Chen J, Liu S, Zhang D (2020) Deep learning library testing via effective model generation. In: Devanbu P, Cohen MB, Zimmermann T (eds) ESEC\/FSE \u201920: 28th ACM joint European software engineering conference and symposium on the foundations of software engineering, Virtual Event, pp 788\u2013799. ACM. https:\/\/doi.org\/10.1145\/3368089.3409761","DOI":"10.1145\/3368089.3409761"},{"key":"532_CR35","first-page":"24824","volume":"35","author":"J Wei","year":"2022","unstructured":"Wei J, Wang X, Schuurmans D, Bosma M, Xia F, Chi E, Le QV, Zhou D et al (2022) Chain-of-thought prompting elicits reasoning in large language models. 
Adv Neural Inf Process Syst 35:24824\u201324837","journal-title":"Adv Neural Inf Process Syst"},{"key":"532_CR36","doi-asserted-by":"publisher","unstructured":"Wei A, Deng Y, Yang C, Zhang L (2022) Free lunch for testing: Fuzzing deep-learning libraries from open source. In: 44th IEEE\/ACM international conference on software engineering, ICSE 2022, Pittsburgh, pp 995\u20131007. ACM. https:\/\/doi.org\/10.1145\/3510003.3510041","DOI":"10.1145\/3510003.3510041"},{"key":"532_CR37","doi-asserted-by":"publisher","unstructured":"Xie D, Li Y, Kim M, Pham HV, Tan L, Zhang X, Godfrey MW (2022) Docter: documentation-guided fuzzing for testing deep learning API functions. In: Ryu S, Smaragdakis Y (eds) ISSTA \u201922: 31st ACM SIGSOFT international symposium on software testing and analysis, virtual event, South Korea, pp 176\u2013188. ACM. https:\/\/doi.org\/10.1145\/3533767.3534220","DOI":"10.1145\/3533767.3534220"},{"key":"532_CR38","doi-asserted-by":"publisher","unstructured":"Yang C, Deng Y, Yao J, Tu Y, Li H, Zhang L (2023) Fuzzing automatic differentiation in deep-learning libraries. In: 45th IEEE\/ACM international conference on software engineering, ICSE 2023, Melbourne, Australia, pp 1174\u20131186. IEEE. https:\/\/doi.org\/10.1109\/ICSE48619.2023.00105","DOI":"10.1109\/ICSE48619.2023.00105"},{"issue":"7","key":"532_CR39","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3716497","volume":"57","author":"X Zhang","year":"2025","unstructured":"Zhang X, Jiang W, Shen C, Li Q, Wang Q, Lin C, Guan X (2025) Deep learning library testing: definition, methods and challenges. ACM Comput Surv 57(7):1\u201337","journal-title":"ACM Comput Surv"},{"key":"532_CR40","doi-asserted-by":"publisher","unstructured":"Zhang C, Zheng Y, Bai M, Li Y, Ma W, Xie X, Li Y, Sun L, Liu Y (2024) How effective are they? exploring large language model based fuzz driver generation. 
In: Christakis M, Pradel M (eds) Proceedings of the 33rd ACM SIGSOFT international symposium on software testing and analysis, ISSTA 2024, Vienna, Austria, pp 1223\u20131235. ACM. https:\/\/doi.org\/10.1145\/3650212.3680355","DOI":"10.1145\/3650212.3680355"},{"key":"532_CR41","unstructured":"Zhao Z, Shi Y, Wu S, Yang F, Song W, Liu N (2023) Interpretation of time-series deep models: a survey. arXiv preprint arXiv:2305.14582"},{"key":"532_CR42","doi-asserted-by":"crossref","unstructured":"Zhong L, Wang Z (2024) Can llm replace stack overflow? A study on robustness and reliability of large language model code generation. In: Proceedings of the AAAI conference on artificial intelligence vol 38, pp 21841\u201321849","DOI":"10.1609\/aaai.v38i19.30185"},{"key":"532_CR43","doi-asserted-by":"crossref","unstructured":"Zhou Y, Ni T, Lee W-B, Zhao Q (2025) A survey on backdoor threats in large language models (llms): attacks, defenses, and evaluation methods. Trans Artif Intell 3\u20133","DOI":"10.53941\/tai.2025.100003"},{"key":"532_CR44","unstructured":"Zhou H, Wan X, Sun R, Palangi H, Iqbal S, Vuli\u0107 I, Korhonen A, Ar\u0131k S \u00d6 (2025) Multi-agent design: Optimizing agents with better prompts and topologies. 
arXiv preprint arXiv:2502.02533"}],"container-title":["Cybersecurity"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-025-00532-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s42400-025-00532-9","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-025-00532-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,4]],"date-time":"2026-01-04T09:43:49Z","timestamp":1767519829000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1186\/s42400-025-00532-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,4]]},"references-count":44,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2026,12]]}},"alternative-id":["532"],"URL":"https:\/\/doi.org\/10.1186\/s42400-025-00532-9","relation":{},"ISSN":["2523-3246"],"issn-type":[{"value":"2523-3246","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1,4]]},"assertion":[{"value":"10 October 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 November 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"4 January 2026","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare the following financial interests\/personal relationships which may be considered as potential Conflict of interest: Yue Wu reports financial support was provided by National Key R&D Program of China. 
If there are other authors, they declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interest"}}],"article-number":"7"}}