{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T13:12:44Z","timestamp":1773839564186,"version":"3.50.1"},"reference-count":49,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T00:00:00Z","timestamp":1773792000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T00:00:00Z","timestamp":1773792000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Cybersecurity"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Source code vulnerability detection is critical to software security. Existing methods either rely on external tools to construct program graphs, which are constrained by environmental limitations, incur high computational costs, and are prone to failure, or adopt lightweight sequence models that overlook code structural features. To address this, we propose LiteVul, a lightweight framework that constructs compact directed graphs directly from raw source code without relying on external tools, capturing multi-scale lexical co-occurrence relationships and fine-grained structural dependencies. By leveraging structure-aware graph neural networks, LiteVul aggregates multi-layer features to achieve precise vulnerability prediction. Experiments conducted on the Devign, Reveal, and DiverseVul benchmark datasets show that LiteVul substantially surpasses nine representative baseline methods in performance. 
Additionally, its graph construction efficiency is substantially improved, providing a feasible technical pathway for vulnerability detection in large-scale codebases.<\/jats:p>","DOI":"10.1186\/s42400-026-00570-x","type":"journal-article","created":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T09:37:14Z","timestamp":1773826634000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Litevul: a lightweight vulnerability detection via directed PPMI-weighted token graphs"],"prefix":"10.1186","volume":"9","author":[{"given":"Zhaohui","family":"Liu","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0009-0006-5393-0887","authenticated-orcid":false,"given":"Wenjie","family":"Xie","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2026,3,18]]},"reference":[{"key":"570_CR1","first-page":"1877","volume":"33","author":"T Brown","year":"2020","unstructured":"Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A et al (2020) Language models are few-shot learners. Adv Neural Inf Process Syst 33:1877\u20131901","journal-title":"Adv Neural Inf Process Syst"},{"issue":"9","key":"570_CR2","doi-asserted-by":"publisher","first-page":"3280","DOI":"10.1109\/TSE.2021.3087402","volume":"48","author":"S Chakraborty","year":"2021","unstructured":"Chakraborty S, Krishna R, Ding Y, Ray B (2021) Deep learning based vulnerability detection: Are we there yet? IEEE Trans Software Eng 48(9):3280\u20133296","journal-title":"IEEE Trans Software Eng"},{"key":"570_CR3","doi-asserted-by":"crossref","unstructured":"Chen Y, Ding Z, Alowain L, Chen X, Wagner D (2023) Diversevul: A new vulnerable source code dataset for deep learning based vulnerability detection. In: Proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses, pp. 
654\u2013668","DOI":"10.1145\/3607199.3607242"},{"key":"570_CR4","doi-asserted-by":"crossref","unstructured":"Cho K, Van\u00a0Merri\u00ebnboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078","DOI":"10.3115\/v1\/D14-1179"},{"key":"570_CR5","doi-asserted-by":"crossref","unstructured":"Xu Z, Chen B, Chandramohan M, Liu Y, Song F (2017) Spain: security patch analysis for binaries towards understanding the pain and pills. In: 2017 IEEE\/ACM 39th International Conference on Software Engineering (ICSE), pp. 462\u2013472 (2017). IEEE","DOI":"10.1109\/ICSE.2017.49"},{"key":"570_CR6","unstructured":"Dam HK, Tran T, Pham T, Ng SW, Grundy J, Ghose A (2017) Automatic feature learning for vulnerability prediction. arXiv preprint arXiv:1708.02368"},{"key":"570_CR7","doi-asserted-by":"crossref","unstructured":"Devlin J, Chang M-W, Lee K, Toutanova K (2019) Bert: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (long and Short Papers), pp. 4171\u20134186","DOI":"10.18653\/v1\/N19-1423"},{"issue":"5","key":"570_CR8","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1145\/502059.502041","volume":"35","author":"D Engler","year":"2001","unstructured":"Engler D, Chen DY, Hallem S, Chou A, Chelf B (2001) Bugs as deviant behavior: a general approach to inferring errors in systems code. ACM SIGOPS Op Syst Rev 35(5):57\u201372","journal-title":"ACM SIGOPS Op Syst Rev"},{"key":"570_CR9","doi-asserted-by":"crossref","unstructured":"Feng Z, Guo D, Tang D, Duan N, Feng X, Gong M, Shou L, Qin B, Liu T, Jiang D, et\u00a0al (2020) Codebert: A pre-trained model for programming and natural languages. 
arXiv preprint arXiv:2002.08155","DOI":"10.18653\/v1\/2020.findings-emnlp.139"},{"key":"570_CR10","doi-asserted-by":"crossref","unstructured":"Fu M, Tantithamthavorn C (2022) Linevul: a transformer-based line-level vulnerability prediction. In: Proceedings of the 19th International Conference on Mining Software Repositories, pp. 608\u2013620","DOI":"10.1145\/3524842.3528452"},{"key":"570_CR11","doi-asserted-by":"crossref","unstructured":"Graves A (2012) Long short-term memory. Supervised sequence labelling with recurrent neural networks, 37\u201345","DOI":"10.1007\/978-3-642-24797-2_4"},{"key":"570_CR12","unstructured":"Guo D, Ren S, Lu S, Feng Z, Tang D, Liu S, Zhou L, Duan N, Svyatkovskiy A, Fu S, et\u00a0al (2020) Graphcodebert: Pre-training code representations with data flow. arXiv preprint arXiv:2009.08366"},{"key":"570_CR13","doi-asserted-by":"crossref","unstructured":"Guo D, Lu S, Duan N, Wang Y, Zhou M, Yin J (2022) Unixcoder: Unified cross-modal pre-training for code representation. arXiv preprint arXiv:2203.03850","DOI":"10.18653\/v1\/2022.acl-long.499"},{"key":"570_CR14","first-page":"15908","volume":"34","author":"K Han","year":"2021","unstructured":"Han K, Xiao A, Wu E, Guo J, Xu C, Wang Y (2021) Transformer in transformer. Adv Neural Inf Process Syst 34:15908\u201315919","journal-title":"Adv Neural Inf Process Syst"},{"key":"570_CR15","doi-asserted-by":"crossref","unstructured":"Jiang B, Zhang Z, Lin D, Tang J, Luo B (2019) Semi-supervised learning with graph learning-convolutional networks. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 11313\u201311320","DOI":"10.1109\/CVPR.2019.01157"},{"key":"570_CR16","unstructured":"Larochelle D, Evans D (2001) Statically detecting likely buffer overflow vulnerabilities. In: 10th USENIX Security Symposium (USENIX Security 01)"},{"key":"570_CR17","unstructured":"Levy O, Goldberg Y (2014) Neural word embedding as implicit matrix factorization. 
Adv Neural Inf Process Syst 27"},{"key":"570_CR18","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1162\/tacl_a_00134","volume":"3","author":"O Levy","year":"2015","unstructured":"Levy O, Goldberg Y, Dagan I (2015) Improving distributional similarity with lessons learned from word embeddings. Trans Assoc Comput Linguist 3:211\u2013225","journal-title":"Trans Assoc Comput Linguist"},{"key":"570_CR19","unstructured":"Li Y, Tarlow D, Brockschmidt M, Zemel R (2015) Gated graph sequence neural networks. arXiv preprint arXiv:1511.05493"},{"key":"570_CR20","doi-asserted-by":"crossref","unstructured":"Li Z, Zou D, Xu S, Ou X, Jin H, Wang S, Deng Z, Zhong Y (2018) Vuldeepecker: A deep learning-based system for vulnerability detection. arXiv preprint arXiv:1801.01681","DOI":"10.14722\/ndss.2018.23158"},{"key":"570_CR21","doi-asserted-by":"crossref","unstructured":"Li Y, Xue Y, Chen H, Wu X, Zhang C, Xie X, Wang H, Liu Y (2019) Cerebro: context-aware adaptive fuzzing for effective vulnerability detection. In: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 533\u2013544","DOI":"10.1145\/3338906.3338975"},{"issue":"4","key":"570_CR22","doi-asserted-by":"publisher","first-page":"2244","DOI":"10.1109\/TDSC.2021.3051525","volume":"19","author":"Z Li","year":"2021","unstructured":"Li Z, Zou D, Xu S, Jin H, Zhu Y, Chen Z (2021) Sysevr: a framework for using deep learning to detect software vulnerabilities. IEEE Trans Dependable Secure Comput 19(4):2244\u20132258","journal-title":"IEEE Trans Dependable Secure Comput"},{"key":"570_CR23","doi-asserted-by":"crossref","unstructured":"Liang H, Wang L, Wu D, Xu J (2016) Mlsa: A static bugs analysis tool based on llvm ir. In: 2016 17th IEEE\/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel\/Distributed Computing (SNPD), pp. 407\u2013412 . 
IEEE","DOI":"10.1109\/SNPD.2016.7515932"},{"issue":"5","key":"570_CR24","doi-asserted-by":"publisher","first-page":"2469","DOI":"10.1109\/TDSC.2019.2954088","volume":"18","author":"G Lin","year":"2019","unstructured":"Lin G, Zhang J, Luo W, Pan L, De Vel O, Montague P, Xiang Y (2019) Software vulnerability discovery via learning multi-domain knowledge bases. IEEE Trans Dependable Secure Comput 18(5):2469\u20132485","journal-title":"IEEE Trans Dependable Secure Comput"},{"key":"570_CR25","doi-asserted-by":"crossref","unstructured":"Lipp S, Banescu S, Pretschner A (2022) An empirical study on the effectiveness of static c code analyzers for vulnerability detection. In: Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 544\u2013555","DOI":"10.1145\/3533767.3534380"},{"key":"570_CR26","unstructured":"Liu A, Feng B, Xue B, Wang B, Wu B, Lu C, Zhao C, Deng C, Zhang C, Ruan C, et\u00a0al (2024) Deepseek-v3 technical report. arXiv preprint arXiv:2412.19437"},{"key":"570_CR27","doi-asserted-by":"crossref","unstructured":"Neuhaus S, Zimmermann T, Holler C, Zeller A (2007) Predicting vulnerable software components. In: Proceedings of the 14th ACM Conference on Computer and Communications Security, pp. 529\u2013540","DOI":"10.1145\/1315245.1315311"},{"key":"570_CR28","doi-asserted-by":"crossref","unstructured":"Nguyen V-A, Nguyen DQ, Nguyen V, Le T, Tran QH, Phung D (2022) Regvd: Revisiting graph neural networks for vulnerability detection. In: Proceedings of the ACM\/IEEE 44th International Conference on Software Engineering: Companion Proceedings, pp. 178\u2013182","DOI":"10.1145\/3510454.3516865"},{"key":"570_CR29","doi-asserted-by":"crossref","unstructured":"Peng X, Wang S, Qin Y, Lin B, Chen L, Cheng J, Mao X (2025) Keep it simple: Self-adaptive code graph simplification for accurate vulnerability detection. 
IEEE Trans Softw Eng","DOI":"10.1109\/TSE.2025.3593515"},{"key":"570_CR30","doi-asserted-by":"crossref","unstructured":"Cho K, Van\u00a0Merri\u00ebnboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078","DOI":"10.3115\/v1\/D14-1179"},{"key":"570_CR31","doi-asserted-by":"crossref","unstructured":"Qiu F, Liu Z, Hu X, Xia X, Chen G, Wang X (2024) Vulnerability detection via multiple-graph-based code representation. IEEE Trans Softw Eng","DOI":"10.1109\/TSE.2024.3427815"},{"issue":"8","key":"570_CR32","first-page":"9","volume":"1","author":"A Radford","year":"2019","unstructured":"Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I et al (2019) Language models are unsupervised multitask learners. OpenAI Blog 1(8):9","journal-title":"OpenAI Blog"},{"key":"570_CR33","doi-asserted-by":"crossref","unstructured":"Reid D, Jahanshahi M, Mockus A (2022) The extent of orphan vulnerabilities from code reuse in open source software. In: Proceedings of the 44th International Conference on Software Engineering, pp. 2104\u20132115","DOI":"10.1145\/3510003.3510216"},{"key":"570_CR34","doi-asserted-by":"crossref","unstructured":"Russell R, Kim L, Hamilton L, Lazovich T, Harer J, Ozdemir O, Ellingwood P, McConley M (2018) Automated vulnerability detection in source code using deep representation learning. In: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 757\u2013762 . IEEE","DOI":"10.1109\/ICMLA.2018.00120"},{"issue":"10","key":"570_CR35","doi-asserted-by":"publisher","first-page":"993","DOI":"10.1109\/TSE.2014.2340398","volume":"40","author":"R Scandariato","year":"2014","unstructured":"Scandariato R, Walden J, Hovsepyan A, Joosen W (2014) Predicting vulnerable software components via text mining. 
IEEE Trans Software Eng 40(10):993\u20131006","journal-title":"IEEE Trans Software Eng"},{"key":"570_CR36","unstructured":"Shao M, Ding Y (2024) FVD-DPM: Fine-grained vulnerability detection via conditional diffusion probabilistic models. In: 33rd USENIX Security Symposium (USENIX Security 24), pp. 7375\u20137392"},{"key":"570_CR37","doi-asserted-by":"crossref","unstructured":"Steenhoek B, Gao H, Le W (2024) Dataflow analysis-inspired deep learning for efficient vulnerability detection. In: Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering, pp. 1\u201313","DOI":"10.1145\/3597503.3623345"},{"key":"570_CR38","unstructured":"Veli\u010dkovi\u0107 P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y (2017) Graph attention networks. arXiv preprint arXiv:1710.10903"},{"key":"570_CR39","doi-asserted-by":"publisher","first-page":"1943","DOI":"10.1109\/TIFS.2020.3044773","volume":"16","author":"H Wang","year":"2020","unstructured":"Wang H, Ye G, Tang Z, Tan SH, Huang S, Fang D, Feng Y, Bian L, Wang Z (2020) Combining graph-based learning with automated data collection for code vulnerability detection. IEEE Trans Inf Forensics Secur 16:1943\u20131958","journal-title":"IEEE Trans Inf Forensics Secur"},{"key":"570_CR40","doi-asserted-by":"crossref","unstructured":"Wang Y, Wang W, Joty S, Hoi SC (2021) Codet5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation. arXiv preprint arXiv:2109.00859","DOI":"10.18653\/v1\/2021.emnlp-main.685"},{"key":"570_CR41","doi-asserted-by":"crossref","unstructured":"Wang W, Zhwng P, Wei G, Ge Z, Qin Z, Sun X (2023) Buffer overflow vulnerability detection based on static analysis-assisted symbolic execution. In: 2023 4th International Symposium on Computer Engineering and Intelligent Communications (ISCEIC), pp. 546\u2013550 . 
IEEE","DOI":"10.1109\/ISCEIC59030.2023.10271194"},{"key":"570_CR42","first-page":"1176898","volume":"1","author":"B Wu","year":"2022","unstructured":"Wu B, Zou F (2022) Code vulnerability detection based on deep sequence and graph models: a survey. Secur Commun Netw 1:1176898","journal-title":"Secur Commun Netw"},{"key":"570_CR43","doi-asserted-by":"publisher","first-page":"3986","DOI":"10.1109\/TIFS.2024.3374219","volume":"19","author":"T Wu","year":"2024","unstructured":"Wu T, Chen L, Du G, Meng D, Shi G (2024) Ultravcs: ultra-fine-grained variable-based code slicing for automated vulnerability detection. IEEE Trans Inf Forensics Secur 19:3986\u20134000","journal-title":"IEEE Trans Inf Forensics Secur"},{"key":"570_CR44","doi-asserted-by":"crossref","unstructured":"Xu Z, Chen B, Chandramohan M, Liu Y, Song F (2017) Spain: security patch analysis for binaries towards understanding the pain and pills. In: 2017 IEEE\/ACM 39th International Conference on Software Engineering (ICSE), pp. 462\u2013472 (2017). IEEE","DOI":"10.1109\/ICSE.2017.49"},{"key":"570_CR45","doi-asserted-by":"crossref","unstructured":"Yamaguchi F, Golde N, Arp D, Rieck K (2014) Modeling and discovering vulnerabilities with code property graphs. In: 2014 IEEE Symposium on Security and Privacy, pp. 590\u2013604 . IEEE","DOI":"10.1109\/SP.2014.44"},{"key":"570_CR46","unstructured":"Yang A, Li A, Yang B, Zhang B, Hui B, Zheng B, Yu B, Gao C, Huang C, Lv C, et\u00a0al (2025) Qwen3 technical report. arXiv preprint arXiv:2505.09388"},{"key":"570_CR47","unstructured":"Zhou Y, Liu S, Siow J, Du X, Liu Y (2019) Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks. 
Advances in neural information processing systems 32"},{"issue":"5","key":"570_CR48","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3708522","volume":"34","author":"X Zhou","year":"2025","unstructured":"Zhou X, Cao S, Sun X, Lo D (2025) Large language model for vulnerability detection and repair: Literature review and the road ahead. ACM Trans Softw Eng Methodol 34(5):1\u201331","journal-title":"ACM Trans Softw Eng Methodol"},{"issue":"5","key":"570_CR49","first-page":"2224","volume":"18","author":"D Zou","year":"2019","unstructured":"Zou D, Wang S, Xu S, Li Z, Jin H (2019) $$\\mu$$vuldeepecker: a deep learning-based system for multiclass vulnerability detection. IEEE Trans Dependable Secure Comput 18(5):2224\u20132236","journal-title":"IEEE Trans Dependable Secure Comput"}],"container-title":["Cybersecurity"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-026-00570-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s42400-026-00570-x","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-026-00570-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T09:37:30Z","timestamp":1773826650000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1186\/s42400-026-00570-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,18]]},"references-count":49,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2026,12]]}},"alternative-id":["570"],"URL":"https:\/\/doi.org\/10.1186\/s42400-026-00570-x","relation":{},"ISSN":["2523-3246"],"issn-type":[{"value":"2523-3246","type":"electronic"}],"subject":[],"published":{"date-parts":[[202
6,3,18]]},"assertion":[{"value":"21 October 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 March 2026","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 March 2026","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"140"}}