{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,17]],"date-time":"2026-05-17T04:26:35Z","timestamp":1778991995254,"version":"3.51.4"},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,10,29]],"date-time":"2025-10-29T00:00:00Z","timestamp":1761696000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,10,29]],"date-time":"2025-10-29T00:00:00Z","timestamp":1761696000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Empir Software Eng"],"published-print":{"date-parts":[[2026,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>The growing prevalence of software vulnerabilities has increased the need for effective detection methods, particularly in cross-project settings where domain differences create significant challenges. Existing vulnerability detection models often struggle to generalise across projects due to variations in coding styles, feature distributions, and the absence of labelled target data. This paper presents ZSVulD, a zero-shot, cross-project vulnerability detection framework designed to operate without target-domain labels. ZSVulD uses domain-agnostic CodeBERT embeddings to capture both syntactic and semantic features of source code, enabling knowledge transfer between projects. The framework applies an iterative pseudo-labelling process in which a neural network and XGBoost classifier collaboratively refine predictions for the target domain. Feature alignment is incorporated as a diagnostic technique to assess and visualise distributional differences between source and target datasets. Experiments on the Devign and REVEAL datasets show that ZSVulD achieves higher recall, F1, and F2 scores compared to existing methods, with an emphasis on reducing false negatives. These findings indicate that ZSVulD can support automated vulnerability detection pipelines, contributing to more reliable security assessments across different software projects.<\/jats:p>","DOI":"10.1007\/s10664-025-10749-4","type":"journal-article","created":{"date-parts":[[2025,10,29]],"date-time":"2025-10-29T05:02:14Z","timestamp":1761714134000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["A zero-shot framework for cross-project vulnerability detection in source code"],"prefix":"10.1007","volume":"31","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-6928-1598","authenticated-orcid":false,"given":"Radowanul","family":"Haque","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4578-7631","authenticated-orcid":false,"given":"Aftab","family":"Ali","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6871-3504","authenticated-orcid":false,"given":"Sally","family":"McClean","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9301-5855","authenticated-orcid":false,"given":"Naveed","family":"Khan","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2025,10,29]]},"reference":[{"key":"10749_CR1","doi-asserted-by":"publisher","first-page":"83","DOI":"10.1007\/978-981-99-0272-9_6","volume":"1768 CCIS","author":"MS Akter","year":"2023","unstructured":"Akter MS, Shahriar H, Bhuiya ZA (2023) Automated vulnerability detection in source code using quantum natural Language processing. Commun Comput Inform Sci 1768 CCIS:83\u2013102. https:\/\/doi.org\/10.1007\/978-981-99-0272-9_6","journal-title":"Commun Comput Inform Sci"},{"key":"10749_CR2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2024.112038","volume":"213","author":"Z Cai","year":"2024","unstructured":"Cai Z, Cai Y, Chen X et al (2024) CSVD-TF: Cross-project software vulnerability detection with tradaboost by fusing expert metrics and semantic metrics. J Syst Softw 213:112038. https:\/\/doi.org\/10.1016\/j.jss.2024.112038","journal-title":"J Syst Softw"},{"key":"10749_CR3","doi-asserted-by":"publisher","first-page":"3280","DOI":"10.1109\/TSE.2021.3087402","volume":"48","author":"S Chakraborty","year":"2022","unstructured":"Chakraborty S, Krishna R, Ding Y, Ray B (2022) Deep learning based vulnerability detection: are we there yet? IEEE Trans Software Eng 48:3280\u20133296. https:\/\/doi.org\/10.1109\/TSE.2021.3087402","journal-title":"IEEE Trans Software Eng"},{"key":"10749_CR4","doi-asserted-by":"publisher","unstructured":"Chen T, Guestrin C (2016) XGBoost: A scalable tree boosting system. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 13-17-August-2016:785\u2013794. https:\/\/doi.org\/10.1145\/2939672.2939785\/SUPPL_FILE\/KDD2016_CHEN_BOOSTING_SYSTEM_01-ACM.MP4","DOI":"10.1145\/2939672.2939785\/SUPPL_FILE\/KDD2016_CHEN_BOOSTING_SYSTEM_01-ACM.MP4"},{"key":"10749_CR5","doi-asserted-by":"publisher","first-page":"654","DOI":"10.1145\/3607199.3607242","volume":"15","author":"Y Chen","year":"2023","unstructured":"Chen Y, Ding Z, Alowain L et al (2023) DiverseVul: A new vulnerable source code dataset for deep learning based vulnerability detection. ACM Int Conf Proceeding Ser 15:654\u2013668. https:\/\/doi.org\/10.1145\/3607199.3607242","journal-title":"ACM Int Conf Proceeding Ser"},{"key":"10749_CR6","unstructured":"David A, Wheeler (2017) Flawfinder Home Page. https:\/\/dwheeler.com\/flawfinder\/. Accessed 18 Jan 2025"},{"key":"10749_CR7","doi-asserted-by":"publisher","unstructured":"Devlin J, Chang M-W, Lee K et al (2019) BERT: Pre-training of deep bidirectional Transformers for Language Understanding. Proc 2019 Conf North 4171\u20134186. https:\/\/doi.org\/10.18653\/V1\/N19-1423","DOI":"10.18653\/V1\/N19-1423"},{"key":"10749_CR8","doi-asserted-by":"publisher","unstructured":"Ding Y, Fu Y, Ibrahim O et al (2024) Vulnerability Detection with Code Language Models: How Far Are We? 1729\u20131741. https:\/\/doi.org\/10.1109\/icse55347.2025.00038","DOI":"10.1109\/icse55347.2025.00038"},{"key":"10749_CR9","doi-asserted-by":"publisher","first-page":"1536","DOI":"10.18653\/v1\/2020.findings-emnlp.139","volume-title":"Findings of the association for computational linguistics: EMNLP 2020","author":"Z Feng","year":"2020","unstructured":"Feng Z, Guo D, Tang D et al (2020) CodeBERT: A Pre-Trained model for programming and natural languages. Findings of the association for computational linguistics: EMNLP 2020. Association for Computational Linguistics, Stroudsburg, PA, USA, pp 1536\u20131547"},{"key":"10749_CR10","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1109\/JAS.2022.105860","volume":"10","author":"X Feng","year":"2023","unstructured":"Feng X, Zhu X, Han QL (2023) Detecting vulnerability on IoT device firmware: a survey. IEEE\/CAA Journal of Automatica Sinica 10:25\u201341. https:\/\/doi.org\/10.1109\/JAS.2022.105860","journal-title":"IEEE\/CAA Journal of Automatica Sinica"},{"key":"10749_CR11","doi-asserted-by":"publisher","first-page":"7212","DOI":"10.18653\/v1\/2022.acl-long.499","volume":"1","author":"D Guo","year":"2022","unstructured":"Guo D, Lu S, Duan N et al (2022) UniXcoder: unified Cross-Modal Pre-training for code representation. Proc Annual Meeting Association Comput Linguistics 1:7212\u20137225. https:\/\/doi.org\/10.18653\/v1\/2022.acl-long.499","journal-title":"Proc Annual Meeting Association Comput Linguistics"},{"key":"10749_CR12","doi-asserted-by":"publisher","DOI":"10.3390\/APP14219697","volume":"14","author":"R G\u00fcrfidan","year":"2024","unstructured":"G\u00fcrfidan R (2024) Vulrem: Fine-tuned BERT-based source-code potential vulnerability scanning system to mitigate attacks in web applications. Appl Sci 14:9697. https:\/\/doi.org\/10.3390\/APP14219697","journal-title":"Appl Sci"},{"key":"10749_CR13","doi-asserted-by":"publisher","unstructured":"Hanif H, Maffeis S (2022) VulBERTa: Simplified Source Code Pre-Training for Vulnerability Detection. Proceedings of the International Joint Conference on Neural Networks 2022-July: https:\/\/doi.org\/10.1109\/IJCNN55064.2022.9892280","DOI":"10.1109\/IJCNN55064.2022.9892280"},{"key":"10749_CR14","doi-asserted-by":"publisher","first-page":"409","DOI":"10.1109\/ACCESS.2023.3343329","volume":"12","author":"R Haque","year":"2024","unstructured":"Haque R, Ali A, Mcclean S et al (2024) Heterogeneous cross-project defect prediction using encoder networks and transfer learning. IEEE Access 12:409\u2013419. https:\/\/doi.org\/10.1109\/ACCESS.2023.3343329","journal-title":"IEEE Access"},{"key":"10749_CR15","doi-asserted-by":"publisher","unstructured":"Harzevili NS, Belle AB, Wang J et al (2023) A survey on automated software vulnerability detection using machine learning and deep learning. ACM Comput Surv 37. https:\/\/doi.org\/10.1145\/3699711\/ASSET\/CF5E99B1-65E5-4D5E-B788-FFDFB5725C05\/ASSETS\/GRAPHIC\/CSUR-2023-0542-U07.JPG","DOI":"10.1145\/3699711\/ASSET\/CF5E99B1-65E5-4D5E-B788-FFDFB5725C05\/ASSETS\/GRAPHIC\/CSUR-2023-0542-U07.JPG"},{"key":"10749_CR16","unstructured":"IBM (2024) Cost of a data breach 2024 | IBM. https:\/\/www.ibm.com\/reports\/data-breach. Accessed 29 Dec 2024"},{"key":"10749_CR17","doi-asserted-by":"publisher","first-page":"2023","DOI":"10.1016\/j.procs.2020.04.217","volume":"171","author":"A Kaur","year":"2020","unstructured":"Kaur A, Nayyar R (2020) A comparative study of static code analysis tools for vulnerability detection in C\/C\u2009+\u2009+\u2009and JAVA source code. Procedia Comput Sci 171:2023\u20132029. https:\/\/doi.org\/10.1016\/j.procs.2020.04.217","journal-title":"Procedia Comput Sci"},{"key":"10749_CR18","doi-asserted-by":"publisher","DOI":"10.14722\/ndss.2018.23158","author":"Z Li","year":"2018","unstructured":"Li Z, Zou D, Xu S et al (2018) VulDeePecker: A deep Learning-Based system for vulnerability detection. 25th Annual Netw Distrib Syst Secur Symp NDSS 2018. https:\/\/doi.org\/10.14722\/ndss.2018.23158","journal-title":"25th Annual Netw Distrib Syst Secur Symp NDSS 2018"},{"key":"10749_CR19","doi-asserted-by":"publisher","first-page":"2244","DOI":"10.1109\/TDSC.2021.3051525","volume":"19","author":"Z Li","year":"2022","unstructured":"Li Z, Zou D, Xu S et al (2022) Sysevr: a framework for using deep learning to detect software vulnerabilities. IEEE Trans Depend Secure Comput 19:2244\u20132258. https:\/\/doi.org\/10.1109\/TDSC.2021.3051525","journal-title":"IEEE Trans Depend Secure Comput"},{"key":"10749_CR20","doi-asserted-by":"publisher","DOI":"10.1016\/J.COSE.2022.103017","volume":"125","author":"X Li","year":"2023","unstructured":"Li X, Xin Y, Zhu H (2023a) Cross-domain vulnerability detection using graph embedding and domain adaptation. Computers & Secur 125:103017. https:\/\/doi.org\/10.1016\/J.COSE.2022.103017","journal-title":"Computers & Secur"},{"key":"10749_CR21","doi-asserted-by":"publisher","unstructured":"Li Y, Hui B, Xia X et al (2023b) One-Shot Learning as Instruction Data Prospector for Large Language Models. Proceedings of the Annual Meeting of the Association for Computational Linguistics 1:4586\u20134601. https:\/\/doi.org\/10.18653\/v1\/2024.acl-long.252","DOI":"10.18653\/v1\/2024.acl-long.252"},{"key":"10749_CR22","doi-asserted-by":"publisher","first-page":"3289","DOI":"10.1109\/TII.2018.2821768","volume":"14","author":"G Lin","year":"2018","unstructured":"Lin G, Zhang J, Luo W et al (2018) Cross-project transfer representation learning for vulnerable function discovery. IEEE Trans Ind Inform 14:3289\u20133297. https:\/\/doi.org\/10.1109\/TII.2018.2821768","journal-title":"IEEE Trans Ind Inform"},{"key":"10749_CR23","doi-asserted-by":"publisher","first-page":"1825","DOI":"10.1109\/JPROC.2020.2993293","volume":"108","author":"G Lin","year":"2020","unstructured":"Lin G, Wen S, Han QL et al (2020) Software vulnerability detection using deep neural networks: a survey. Proc IEEE 108:1825\u20131848. https:\/\/doi.org\/10.1109\/JPROC.2020.2993293","journal-title":"Proc IEEE"},{"key":"10749_CR24","doi-asserted-by":"publisher","first-page":"438","DOI":"10.1109\/TDSC.2020.2984505","volume":"19","author":"S Liu","year":"2022","unstructured":"Liu S, Lin G, Qu L et al (2022) CD-VulD: Cross-domain vulnerability discovery based on deep domain adaptation. IEEE Trans Depend Secure Comput 19:438\u2013451. https:\/\/doi.org\/10.1109\/TDSC.2020.2984505","journal-title":"IEEE Trans Depend Secure Comput"},{"key":"10749_CR25","doi-asserted-by":"publisher","unstructured":"Ma M, Zhao Z, Soremekun E et al (2022) GraphCode2Vec: generic code embedding via lexical and program dependence analyses. Proc \u2013 2022 Min Softw Repositories Conf MSR 2022 524\u2013536. https:\/\/doi.org\/10.1145\/3524842.3528456","DOI":"10.1145\/3524842.3528456"},{"key":"10749_CR26","doi-asserted-by":"publisher","first-page":"60","DOI":"10.1109\/MSEC.2022.3176058","volume":"20","author":"T Marjanov","year":"2022","unstructured":"Marjanov T, Pashchenko I, Massacci F (2022) Machine learning for source code vulnerability detection: what works and what isn\u2019t there yet. IEEE Security & Privacy 20:60\u201376. https:\/\/doi.org\/10.1109\/MSEC.2022.3176058","journal-title":"IEEE Security & Privacy"},{"key":"10749_CR27","volume-title":"Dual-Component deep domain adaptation: A new approach for cross project software vulnerability Detection. In: advances in knowledge discovery and data mining","author":"V Nguyen","year":"2020","unstructured":"Nguyen V, Le T, de Vel O et al (2020) Dual-Component deep domain adaptation: A new approach for cross project software vulnerability Detection. In: advances in knowledge discovery and data mining. Springer, Cham"},{"key":"10749_CR28","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3664602","volume":"33","author":"V Nguyen","year":"2024","unstructured":"Nguyen V, Le T, Tantithamthavorn C et al (2024) Deep domain adaptation with max-margin principle for cross-project imbalanced software vulnerability detection. ACM Trans Softw Eng Methodol 33:1\u201334. https:\/\/doi.org\/10.1145\/3664602","journal-title":"ACM Trans Softw Eng Methodol"},{"key":"10749_CR29","unstructured":"Rothenberger JC, Diochnos DI (2023) Meta Co-Training. Two Views are Better than One"},{"key":"10749_CR30","doi-asserted-by":"publisher","first-page":"197158","DOI":"10.1109\/ACCESS.2020.3034766","volume":"8","author":"P Zeng","year":"2020","unstructured":"Zeng P, Lin G, Pan L (2020) Software vulnerability analysis and discovery using deep learning techniques: a survey. IEEE Access 8:197158\u2013197172. https:\/\/doi.org\/10.1109\/ACCESS.2020.3034766","journal-title":"IEEE Access"},{"key":"10749_CR31","doi-asserted-by":"publisher","first-page":"4152","DOI":"10.1109\/TSE.2023.3285910","volume":"49","author":"C Zhang","year":"2023","unstructured":"Zhang C, Liu B, Xin Y, Yao L (2023) CPVD: cross project vulnerability detection based on graph attention network and domain adaptation. IEEE Trans Software Eng 49:4152\u20134168. https:\/\/doi.org\/10.1109\/TSE.2023.3285910","journal-title":"IEEE Trans Software Eng"},{"key":"10749_CR32","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/JAS.2024.124983","volume":"12","author":"W Zhou","year":"2025","unstructured":"Zhou W, Zhu X, Han QL et al (2025) The security of using large language models: a survey with emphasis on ChatGPT. IEEE\/CAA Journal of Automatica Sinica 12:1\u201326. https:\/\/doi.org\/10.1109\/JAS.2024.124983","journal-title":"IEEE\/CAA Journal of Automatica Sinica"},{"key":"10749_CR33","doi-asserted-by":"publisher","unstructured":"Zhou Y, Liu S, Siow J et al (n.d.) Devign: effective vulnerability identification by learning comprehensive program semantics via Graph Neural Networks. https:\/\/doi.org\/10.5555\/3454287.3455202","DOI":"10.5555\/3454287.3455202"},{"key":"10749_CR34","doi-asserted-by":"publisher","DOI":"10.1145\/3512345","author":"X Zhu","year":"2022","unstructured":"Zhu X, Wen S, Camtepe S, Xiang Y (2022) Fuzzing: a survey for roadmap. ACM Comput Surv. https:\/\/doi.org\/10.1145\/3512345","journal-title":"ACM Comput Surv"},{"key":"10749_CR35","doi-asserted-by":"publisher","first-page":"317","DOI":"10.1109\/JAS.2024.124971","volume":"12","author":"X Zhu","year":"2025","unstructured":"Zhu X, Zhou W, Han QL et al (2025) When software security meets large language models: a survey. IEEE\/CAA Journal of Automatica Sinica 12:317\u2013334. https:\/\/doi.org\/10.1109\/JAS.2024.124971","journal-title":"IEEE\/CAA Journal of Automatica Sinica"}],"container-title":["Empirical Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-025-10749-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10664-025-10749-4","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-025-10749-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T11:25:55Z","timestamp":1770809155000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10664-025-10749-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,29]]},"references-count":35,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2026,1]]}},"alternative-id":["10749"],"URL":"https:\/\/doi.org\/10.1007\/s10664-025-10749-4","relation":{},"ISSN":["1382-3256","1573-7616"],"issn-type":[{"value":"1382-3256","type":"print"},{"value":"1573-7616","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,29]]},"assertion":[{"value":"18 February 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 October 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 October 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"3"}}