{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T14:30:22Z","timestamp":1775745022765,"version":"3.50.1"},"reference-count":65,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2022,4,9]],"date-time":"2022-04-09T00:00:00Z","timestamp":1649462400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"The National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62102284, 61872262"],"award-info":[{"award-number":["62102284, 61872262"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2022,7,31]]},"abstract":"<jats:p>\n            Security vulnerabilities have been continually disclosed and documented. For the effective understanding, management, and mitigation of the fast-growing number of vulnerabilities, an important practice in documenting vulnerabilities is to describe the key vulnerability aspects, such as vulnerability type, root cause, affected product, impact, attacker type, and attack vector. In this article, we first investigate 133,639 vulnerability reports in the\n            <jats:bold>Common Vulnerabilities and Exposures (CVE)<\/jats:bold>\n            database over the past 20 years. We find that 56%, 85%, 38%, and 28% of CVEs miss vulnerability type, root cause, attack vector, and attacker type, respectively. 
By comparing the differences of the latest updated CVE reports across different databases, we observe that 1,476 missing key aspects in 1,320 CVE descriptions were augmented manually in the\n            <jats:bold>National Vulnerability Database (NVD)<\/jats:bold>\n            , which indicates that the vulnerability database maintainers try to complete the vulnerability descriptions in practice to mitigate such a problem.\n          <\/jats:p>\n          <jats:p>\n            To help complete the missing information of key vulnerability aspects and reduce human efforts, we propose a neural-network-based approach called\n            <jats:bold>PMA<\/jats:bold>\n            to predict the missing key aspects of a vulnerability based on its known aspects. We systematically explore the design space of the neural network models and empirically identify the most effective model design in the scenario. Our ablation study reveals the prominent correlations among vulnerability aspects when predicting. Trained with historical CVEs, our model achieves 88%, 71%, 61%, and 81% in F1 for predicting the missing vulnerability type, root cause, attacker type, and attack vector of 8,623 \u201cfuture\u201d CVEs across 3 years, respectively. Furthermore, we validate the predicting performance of key aspect augmentation of CVEs based on the manually augmented CVE data collected from NVD, which confirms the practicality of our approach. 
We finally highlight that PMA has the ability to reduce human efforts by recommending and augmenting missing key aspects for vulnerability databases, and to facilitate other research works such as severity level prediction of CVEs based on the vulnerability descriptions.\n          <\/jats:p>","DOI":"10.1145\/3498537","type":"journal-article","created":{"date-parts":[[2022,1,31]],"date-time":"2022-01-31T17:28:25Z","timestamp":1643650105000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":39,"title":["Detecting and Augmenting Missing Key Aspects in Vulnerability Descriptions"],"prefix":"10.1145","volume":"31","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1594-8422","authenticated-orcid":false,"given":"Hao","family":"Guo","sequence":"first","affiliation":[{"name":"College of Intelligence and Computing, Tianjin University, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9477-4100","authenticated-orcid":false,"given":"Sen","family":"Chen","sequence":"additional","affiliation":[{"name":"College of Intelligence and Computing, Tianjin University, China"}]},{"given":"Zhenchang","family":"Xing","sequence":"additional","affiliation":[{"name":"Research School of Computer Science, Australian National University, Australia"}]},{"given":"Xiaohong","family":"Li","sequence":"additional","affiliation":[{"name":"College of Intelligence and Computing, Tianjin University, China"}]},{"given":"Yude","family":"Bai","sequence":"additional","affiliation":[{"name":"College of Intelligence and Computing, Tianjin University, China"}]},{"given":"Jiamou","family":"Sun","sequence":"additional","affiliation":[{"name":"Research School of Computer Science, Australian National University, Australia"}]}],"member":"320","published-online":{"date-parts":[[2022,4,9]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"Mart\u00edn Abadi Paul Barham Jianmin Chen Zhifeng Chen Andy Davis Jeffrey Dean Matthieu Devin Sanjay 
Ghemawat Geoffrey Irving Michael Isard Manjunath Kudlur Josh Levenberg Rajat Monga Sherry Moore Derek Murray Benoit Steiner Paul Tucker Vijay Vasudevan Pete Warden and Xiaoqiang Zhang. 2016. TensorFlow: A system for large-scale machine learning. In Proceedings of the 12th USENIX symposium on operating systems design and implementation (OSDI\u201916) . 265\u2013283."},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/PST47121.2019.8949047"},{"key":"e_1_3_2_4_2","article-title":"Cleaning the NVD: Comprehensive quality assessment, improvements, and analyses","author":"Anwar Afsah","year":"2020","unstructured":"Afsah Anwar, Ahmed Abusnaina, Songqing Chen, Frank Li, and David Mohaisen. 2020. Cleaning the NVD: Comprehensive quality assessment, improvements, and analyses. arXiv preprint arXiv:2006.15074 (2020).","journal-title":"arXiv preprint arXiv:2006.15074"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2016.07.318"},{"key":"e_1_3_2_6_2","doi-asserted-by":"crossref","unstructured":"H. Binyamini R. Bitton M. Inokuchi T. Yagyu Y. Elovici and A. Shabtai. 2020. An automated end-to-end framework for modeling attacks from vulnerability descriptions. arXiv preprint arXiv:2008.04377 .","DOI":"10.1145\/3447548.3467159"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/1835804.1835821"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/1835804.1835821"},{"key":"e_1_3_2_9_2","unstructured":"Brompwnie. 2017. cve-2020-5260. https:\/\/github.com\/brompwnie\/cve-2020-5260\/. [Online; accessed 21-January-2017]."},{"key":"e_1_3_2_10_2","article-title":"Common Attack Pattern Enumeration and Classification","year":"2019","unstructured":"CAPEC. 2019. Common Attack Pattern Enumeration and Classification. http:\/\/cwe.mitre.org\/. 
[Online; accessed 30-June-2019].","journal-title":"http:\/\/cwe.mitre.org\/"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/2970276.2970317"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/3377813.3381360"},{"key":"e_1_3_2_13_2","first-page":"294","volume-title":"Journal of Systems Architecture","author":"Chowdhury Istehad","year":"2010","unstructured":"Istehad Chowdhury and Mohammad Zulkernine. 2010. Using complexity, coupling, and cohesion metrics as early indicators of vulnerabilities. Journal of Systems Architecture, 57, 294\u2013313."},{"key":"e_1_3_2_14_2","unstructured":"CWE. 2019. Common weakness enumeration (CWE). http:\/\/capec.mitre.org\/. [Online; accessed 30-June-2019]."},{"key":"e_1_3_2_15_2","unstructured":"The MITRE Corporation. 2019. CveForm: Submit a CVE request. https:\/\/cveform.mitre.org\/. [Online; accessed 30-June-2019]."},{"key":"e_1_3_2_16_2","first-page":"869","volume-title":"Proceedings of the 28th USENIX Security Symposium (USENIX Security 19)","author":"Dong Ying","year":"2019","unstructured":"Ying Dong, Wenbo Guo, Yueqi Chen, Xinyu Xing, Yuqing Zhang, and Gang Wang. 2019. Towards the detection of inconsistencies in public security vulnerability reports. In Proceedings of the 28th USENIX Security Symposium (USENIX Security 19). 869\u2013885."},{"key":"e_1_3_2_17_2","first-page":"82","volume-title":"Commun. ACM","author":"Feldman Ronen","year":"2013","unstructured":"Ronen Feldman. 2013. Techniques and applications for sentiment analysis. Commun. ACM, 56, 82\u201389."},{"key":"e_1_3_2_18_2","unstructured":"FIRST. 2019. Common Vulnerability Scoring System (CVSS). https:\/\/www.first.org\/cvss. 
[Online; accessed 30-June-2019]."},{"key":"e_1_3_2_19_2","first-page":"315","volume-title":"Proceedings of the 14th International Conference on Artificial Intelligence and Statistics","author":"Glorot Xavier","year":"2011","unstructured":"Xavier Glorot, Antoine Bordes, and Yoshua Bengio. 2011. Deep sparse rectifier neural networks. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. 315\u2013323."},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICECCS.2019.00011"},{"key":"e_1_3_2_21_2","unstructured":"Google. 2019. Word2vec. https:\/\/code.google.com\/archive\/p\/word2vec\/. [Online; accessed 30-June-2019]."},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/COMPSAC51774.2021.00138"},{"key":"e_1_3_2_23_2","first-page":"885\u2014892","volume-title":"J. Biomed. Informat.","author":"Gurulingappa Harsha","year":"2012","unstructured":"Harsha Gurulingappa, Abdul Mateen Rajput, Angus Roberts, Juliane Fluck, Martin Hofmann-Apitius, and Luca Toldo. 2012. Development of a benchmark corpus to support the automatic extraction of drug-related adverse effects from medical case reports. J. Biomed. Informat. 45, 885\u2014892."},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSME.2017.52"},{"key":"e_1_3_2_25_2","first-page":"13949","volume-title":"IEEE Access","author":"Hassan A.","year":"2018","unstructured":"A. Hassan and A. Mahmood. 2018. Convolutional recurrent deep learning model for sentence classification. IEEE Access, 6, 13949\u201313957."},{"key":"e_1_3_2_26_2","article-title":"BRON\u2013Linking attack tactics, techniques, and patterns with defensive weaknesses, vulnerabilities and affected platform configurations","author":"Hemberg Erik","year":"2020","unstructured":"Erik Hemberg, Jonathan Kelly, Michal Shlapentokh-Rothman, Bryn Reinstadler, Katherine Xu, Nick Rutar, and Una-May O\u2019Reilly. 2020. 
BRON\u2013Linking attack tactics, techniques, and patterns with defensive weaknesses, vulnerabilities and affected platform configurations. arXiv preprint arXiv:2010.00533 (2020).","journal-title":"arXiv preprint arXiv:2010.00533"},{"key":"e_1_3_2_27_2","volume-title":"CoRR","author":"Howard Jeremy","year":"2018","unstructured":"Jeremy Howard and Sebastian Ruder. 2018. Fine-tuned language models for text classification. In CoRR, Vol. abs\/1801.06146. arxiv:1801.06146."},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1085"},{"key":"e_1_3_2_29_2","unstructured":"IBM. 2019. IBM X-Force Exchange. https:\/\/exchange.xforce.ibmcloud.com\/. [Online; accessed 30-June-2019]."},{"key":"e_1_3_2_30_2","article-title":"MITRE key details phrasing","author":"Evans Jonathan","year":"2020","unstructured":"Jonathan Evans. 2020. MITRE key details phrasing. http:\/\/cveproject.github.io\/docs\/content\/key-details-phrasing.pdf. [Online; accessed February-2020].","journal-title":"http:\/\/cveproject.github.io\/docs\/content\/key-details-phrasing.pdf"},{"key":"e_1_3_2_31_2","article-title":"whatsapp-rce-patched","author":"Dekel Kasif","year":"2017","unstructured":"Kasif Dekel. 2017. whatsapp-rce-patched. https:\/\/github.com\/kasif-dekel\/whatsapp-rce-patched\/. [Online; accessed 21-January-2017].","journal-title":"https:\/\/github.com\/kasif-dekel\/whatsapp-rce-patched\/"},{"key":"e_1_3_2_32_2","volume-title":"CoRR","author":"Kim Yoon","year":"2014","unstructured":"Yoon Kim. 2014. Convolutional neural networks for sentence classification. In CoRR, abs\/1408.5882. arxiv:1408.5882."},{"key":"e_1_3_2_33_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Kingma Diederik","year":"2014","unstructured":"Diederik Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. 
In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_34_2","first-page":"36","volume-title":"Information Sciences","author":"Li Jing","year":"2018","unstructured":"Jing Li, Aixin Sun, and Zhenchang Xing. 2018. Learning to answer programming questions with software documentation through social context embedding. In Information Sciences, Vol. 448-449. 36\u201352."},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W17-4421"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-5010"},{"key":"e_1_3_2_37_2","first-page":"3111","volume-title":"Advances in Neural Information Processing Systems","author":"Mikolov Tomas","year":"2013","unstructured":"Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems. 3111\u20133119."},{"key":"e_1_3_2_38_2","article-title":"National Vulnerability Database (NVD)","author":"MITRE Corporation","year":"2017","unstructured":"Corporation MITRE. 2017. National Vulnerability Database (NVD). https:\/\/nvd.nist.gov\/. [Online; accessed 21-January-2017].","journal-title":"https:\/\/nvd.nist.gov\/"},{"key":"e_1_3_2_39_2","volume-title":"Common Attack Pattern Enumeration and Classification Submission.https:\/\/cveform.mitre.org","author":"MITRE Corporation","year":"2019","unstructured":"Corporation MITRE. 2019. Common Attack Pattern Enumeration and Classification Submission.https:\/\/cveform.mitre.org. [Online; accessed 30-June-2019]."},{"key":"e_1_3_2_40_2","article-title":"Common Vulnerabilities and Exposures (CVE)","author":"MITRE Corporation","year":"2019","unstructured":"Corporation MITRE. 2019. Common Vulnerabilities and Exposures (CVE). https:\/\/cve.mitre.org\/. 
[Online; accessed 30-June-2019].","journal-title":"https:\/\/cve.mitre.org\/"},{"key":"e_1_3_2_41_2","volume-title":"CoRR","author":"Mou Lili","year":"2014","unstructured":"Lili Mou, Ge Li, Zhi Jin, Lu Zhang, and Tao Wang. 2014. TBCNN: A tree-based convolutional neural network for programming language processing. CoRR, abs\/1409.5718. arxiv:1409.5718."},{"key":"e_1_3_2_42_2","first-page":"919","volume-title":"Proceedings of the 27th USENIX Security Symposium (USENIX Security 18)","author":"Mu Dongliang","year":"2018","unstructured":"Dongliang Mu, Alejandro Cuevas, Limin Yang, Hang Hu, Xinyu Xing, Bing Mao, and Gang Wang. 2018. Understanding the reproducibility of crowd-reported security vulnerabilities. In Proceedings of the 27th USENIX Security Symposium (USENIX Security 18). USENIX Association, Baltimore, MD, 919\u2013936."},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/2872518.2889361"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/1315245.1315311"},{"key":"e_1_3_2_45_2","article-title":"National Institute of Standards and Technology (NIST)","year":"2017","unstructured":"NIST. 2017. National Institute of Standards and Technology (NIST). https:\/\/www.nist.gov\/. [Online; accessed 21-January-2017].","journal-title":"https:\/\/www.nist.gov\/"},{"key":"e_1_3_2_46_2","volume-title":"CoRR","author":"Peters Matthew E.","year":"2018","unstructured":"Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. CoRR, abs\/1802.05365. arxiv:1802.05365."},{"key":"e_1_3_2_47_2","unstructured":"Scott Reed and Nando Freitas. 2015. Neural programmer-interpreters. 
arXiv preprint arXiv:1511.06279 ."},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.5555\/1870457.1870476"},{"key":"e_1_3_2_49_2","first-page":"993","volume-title":"IEEE Transactions on Software Engineering","author":"Scandariato R.","year":"2014","unstructured":"R. Scandariato, J. Walden, A. Hovsepyan, and W. Joosen. 2014. Predicting vulnerable software components via text mining. IEEE Transactions on Software Engineering 40, 993\u20131006."},{"key":"e_1_3_2_50_2","first-page":"772","volume-title":"IEEE Transactions on Software Engineering","author":"Shin Yonghee","year":"2011","unstructured":"Yonghee Shin, Andrew Meneely, Laurie Williams, and Jason A. Osborne. 2011. Evaluating complexity, code churn, and developer activity metrics as indicators of software vulnerabilities. IEEE Transactions on Software Engineering, 37, 772\u2013787."},{"key":"e_1_3_2_51_2","unstructured":"Ravindra Singh and Naurang Mangat. 2013. Elements of survey sampling Vol. 15. Springer Science & Business Media."},{"key":"e_1_3_2_52_2","article-title":"securityFocus","year":"2019","unstructured":"Symantec. 2019. securityFocus. https:\/\/www.securityfocus.com\/. [Online; accessed 30-June-2019].","journal-title":"https:\/\/www.securityfocus.com\/"},{"key":"e_1_3_2_53_2","first-page":"283","volume-title":"Lecture Notes in Computer Science","author":"Wang Lingyu","year":"2008","unstructured":"Lingyu Wang, Tania Islam, Long Tao, Anoop Singhal, and Sushil Jajodia. 2008. An attack graph based probabilistic security metric. In Lecture Notes in Computer Science, Vol. 5094, 283\u2013296."},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1145\/2884781.2884804"},{"key":"e_1_3_2_55_2","first-page":"1","volume-title":"Wiley Encyclopedia of Clinical Trials","author":"Woolson R. F.","year":"2007","unstructured":"R. F. Woolson. 2007. Wilcoxon signed-rank test. In Wiley Encyclopedia of Clinical Trials. 
Wiley Online Library, 1\u20133."},{"key":"e_1_3_2_56_2","first-page":"977","volume-title":"IEEE Transactions on Software Engineering","author":"Xia Xin","year":"2016","unstructured":"Xin Xia, David Lo, Sinno Jialin Pan, Nachiappan Nagappan, and Xinyu Wang. 2016. Hydra: Massively compositional model for cross-project defect prediction. IEEE Transactions on Software Engineering 42, 977\u2013998."},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-36718-3_5"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1145\/2970276.2970357"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.1145\/2970276.2970357"},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N16-1174"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.1145\/2884781.2884862"},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1109\/COMPSAC51774.2021.00116"},{"key":"e_1_3_2_63_2","volume-title":"Proceedings of the 2020 IEEE\/ACM 42nd International Conference on Software Engineering (ICSE \u201920)","author":"Yude Bai","year":"2020","unstructured":"Bai Yude, Xing Zhenchang, Li Xiaohong, Feng Zhiyong, and Ma Duoyuan. 2020. Unsuccessful story about few shot malware family classification and Siamese network to the rescue. In Proceedings of the 2020 IEEE\/ACM 42nd International Conference on Software Engineering (ICSE \u201920)."},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE43902.2021.00150"},{"key":"e_1_3_2_65_2","volume-title":"CoRR","author":"Zhang Ye","year":"2015","unstructured":"Ye Zhang and Byron C. Wallace. 2015. A sensitivity analysis of (and practitioners\u2019 guide to) convolutional neural networks for sentence classification. In CoRR, Vol. abs\/1510.03820. arxiv:1510.03820."},{"key":"e_1_3_2_66_2","volume-title":"CoRR","author":"Zhou Yaqin","year":"2019","unstructured":"Yaqin Zhou, Shangqing Liu, Jingkai Siow, Xiaoning Du, and Yang Liu. 2019. 
Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks. In CoRR. arxiv:1909.03496."}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3498537","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3498537","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:27Z","timestamp":1750188627000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3498537"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,9]]},"references-count":65,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,7,31]]}},"alternative-id":["10.1145\/3498537"],"URL":"https:\/\/doi.org\/10.1145\/3498537","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,4,9]]},"assertion":[{"value":"2021-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-04-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}