{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T16:12:29Z","timestamp":1774627949394,"version":"3.50.1"},"reference-count":42,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2021,7,26]],"date-time":"2021-07-26T00:00:00Z","timestamp":1627257600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"SCAT Research Grant","award":["Ecosystem for Security, Privacy, and Trust in Network Software Systems"],"award-info":[{"award-number":["Ecosystem for Security, Privacy, and Trust in Network Software Systems"]}]},{"name":"MEXT enPiT-Pro","award":["Smart SE: Smart Systems and Services innovative professional Education program"],"award-info":[{"award-number":["Smart SE: Smart Systems and Services innovative professional Education program"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>For effective vulnerability management, vulnerability and attack information must be collected quickly and efficiently. A security knowledge repository can collect such information. The Common Vulnerabilities and Exposures (CVE) provides known vulnerabilities of products, while the Common Attack Pattern Enumeration and Classification (CAPEC) stores attack patterns, which are descriptions of common attributes and approaches employed by adversaries to exploit known weaknesses. Due to the fact that the information in these two repositories are not linked, identifying related CAPEC attack information from CVE vulnerability information is challenging. Currently, the related CAPEC-ID can be traced from the CVE-ID using Common Weakness Enumeration (CWE) in some but not all cases. Here, we propose a method to automatically trace the related CAPEC-IDs from CVE-ID using three similarity measures: TF\u2013IDF, Universal Sentence Encoder (USE), and Sentence-BERT (SBERT). We prepared and used 58 CVE-IDs as test input data. Then, we tested whether we could trace CAPEC-IDs related to each of the 58 CVE-IDs. Additionally, we experimentally confirm that TF\u2013IDF is the best similarity measure, as it traced 48 of the 58 CVE-IDs to the related CAPEC-ID.<\/jats:p>","DOI":"10.3390\/info12080298","type":"journal-article","created":{"date-parts":[[2021,7,26]],"date-time":"2021-07-26T22:22:46Z","timestamp":1627338166000},"page":"298","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":35,"title":["Tracing CVE Vulnerability Information to CAPEC Attack Patterns Using Natural Language Processing Techniques"],"prefix":"10.3390","volume":"12","author":[{"given":"Kenta","family":"Kanakogi","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, Waseda University, Shinjuku-ku, Tokyo 169-8555, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1417-9879","authenticated-orcid":false,"given":"Hironori","family":"Washizaki","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Waseda University, Shinjuku-ku, Tokyo 169-8555, Japan"}]},{"given":"Yoshiaki","family":"Fukazawa","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Waseda University, Shinjuku-ku, Tokyo 169-8555, Japan"}]},{"given":"Shinpei","family":"Ogata","sequence":"additional","affiliation":[{"name":"Institute of Engineering, Academic Assembly, Shinshu University, Nagano City, Nagano 380-8553, Japan"}]},{"given":"Takao","family":"Okubo","sequence":"additional","affiliation":[{"name":"Institute of Information Security, Yokohama, Kanagawa 221-0835, Japan"}]},{"given":"Takehisa","family":"Kato","sequence":"additional","affiliation":[{"name":"Hitachi, Ltd., Chiyoda-ku, Tokyo 100-8280, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8560-8714","authenticated-orcid":false,"given":"Hideyuki","family":"Kanuka","sequence":"additional","affiliation":[{"name":"Hitachi, Ltd., Chiyoda-ku, Tokyo 100-8280, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6583-1521","authenticated-orcid":false,"given":"Atsuo","family":"Hazeyama","sequence":"additional","affiliation":[{"name":"Department of Information Science, Tokyo Gakugei University, Koganei-shi, Tokyo 184-8501, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1986-5675","authenticated-orcid":false,"given":"Nobukazu","family":"Yoshioka","sequence":"additional","affiliation":[{"name":"Research Institute for Science and Engineering, Waseda University, Shinjuku-ku, Tokyo 169-8555, Japan"}]}],"member":"1968","published-online":{"date-parts":[[2021,7,26]]},"reference":[{"key":"ref_1","unstructured":"(2021, June 16). Common Vulnerabilities and Exploits. Available online: https:\/\/cve.mitre.org\/."},{"key":"ref_2","unstructured":"(2021, June 16). Common Attack Pattern Enumeration and Classification. Available online: https:\/\/capec.mitre.org\/."},{"key":"ref_3","unstructured":"(2021, June 16). Common Weakness Enumeration. Available online: https:\/\/cwe.mitre.org\/."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Dang, Q., and Fran\u00e7ois, J. (2018, January 25\u201329). Utilizing attack enumerations to study SDN\/NFV vulnerabilities. Proceedings of the 2018 4th IEEE Conference on Network Softwarization and Workshops (NetSoft), Montreal, QC, Canada.","DOI":"10.1109\/NETSOFT.2018.8459961"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Navarro, J., Legrand, V., Lagraa, S., Francois, J., Lahmadi, A., Santis, G.D., Festor, O., Lammari, N., Hamdi, F., and Deruyver, A. (2017, January 23\u201325). HuMa: A multi-layer framework for threat analysis in a heterogeneous log environment. Proceedings of the 10th International Symposium on Foundations & Practice of Security, Nancy, France.","DOI":"10.1007\/978-3-319-75650-9_10"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Aghaei, E., and Shaer, E.A. (2019, January 1\u20133). ThreatZoom: Neural Network for Automated Vulnerability Mitigation. Proceedings of the 6th Annual Symposium on Hot Topics in the Science of Security, New York, NY, USA.","DOI":"10.1145\/3314058.3318167"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"e25","DOI":"10.7717\/peerj-cs.25","article-title":"Mining known attack patterns from security-related events","volume":"1","author":"Scarabeo","year":"2015","journal-title":"PeerJ Comput. Sci."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Ma, X., Davoodi, E., Kosseim, L., and Scarabeo, N. (2018, January 13\u201315). Semantic Mapping of Security Events to Known Attack Patterns. Proceedings of the 23rd International Conference on Natural Language and Information Systems, Paris, France.","DOI":"10.1007\/978-3-319-91947-8_10"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Kanakogi, K., Washizaki, H., Fukazawa, Y., Ogata, S., Okubo, T., Kato, T., Kanuka, H., Hazeyama, A., and Yoshioka, N. (2021, January 4\u20138). Tracing CAPEC Attack Patterns from CVE Vulnerability Information using Natural Language Processing Technique. Proceedings of the 54th Hawaii International Conference on System Sciences, Kauai, HI, USA.","DOI":"10.24251\/HICSS.2021.841"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Miller, D., Leek, T., and Schwartz, R. (1999, January 15\u201319). A hidden Markov model information retrieval system. Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New York, NY, USA.","DOI":"10.1145\/312624.312680"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Cer, D., Yang, Y., Kong, S.Y., Hua, N., Limtiaco, N., St. John, R., Constant, N., Guajardo-Cespedes, M., Yuan, S., and Tar, C. (November, January 31). Universal sentence encoder for English. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Brussels, Belgium.","DOI":"10.18653\/v1\/D18-2029"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Reimers, N., and Gurevych, I. (2019, January 3\u20137). Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Hong Kong, China.","DOI":"10.18653\/v1\/D19-1410"},{"key":"ref_13","unstructured":"Ouchn, J.N. (2018). Method and System for Automated Computer Vulnerability Tracking. The United States Patent and Trademark Office. (9,871,815), U.S. Patent."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Adams, S., Carter, B., Fleming, C., and Beling, P.A. (2018\u20133, January 31). Selecting system specific cybersecurity attack patterns using topic modeling. Proceedings of the 17th IEEE International Conference on Trust, Security and Privacy in Computing and Communications and 12th IEEE International Conference on Big Data Science and Engineering, Trustcom, New York, NY, USA.","DOI":"10.1109\/TrustCom\/BigDataSE.2018.00076"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Mounika, V., Yuan, X., and Bandaru, K. (2019, January 5\u20137). Analyzing CVE Database Using Unsupervised Topic Modelling. Proceedings of the 6th Annual Conference on Computational Science and Computational Intelligence, Las Vegas, NV, USA.","DOI":"10.1109\/CSCI49370.2019.00019"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Ou, S., and Kim, H. (2018, January 25\u201328). Unsupervised Citation Sentence Identification Based on Similarity Measurement. Proceedings of the 13th International Conference on Transforming Digital Worlds, Sheffield, UK.","DOI":"10.1007\/978-3-319-78105-1_42"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1016\/j.ins.2018.10.006","article-title":"Multi-co-training for document classification using various document representations: TF\u2013IDF, LDA, and Doc2Vec","volume":"477","author":"Kim","year":"2019","journal-title":"J. Inf. Sci."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zhu, L., Zhang, Z., Xia, G., and Jiang, C. (2019, January 24\u201326). Research on Vulnerability Ontology Model. Proceedings of the 2019 IEEE 8th Joint International Information Technology and Artificial Intelligence Conference, Chongqing, China.","DOI":"10.1109\/ITAIC.2019.8785783"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"554","DOI":"10.1007\/s12204-013-1439-5","article-title":"Ontology-based model of network and computer attacks for security assessment","volume":"18","author":"Gao","year":"2013","journal-title":"J. Shanghai Jiaotong Univ. Sci."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Ansarinia, M., Asghari, S.A., Souzani, A., and Ghaznavi, A. (2012, January 6\u20138). Ontology-based modeling of DDoS attacks for attack plan detection. Proceedings of the 6th International Symposium on Telecommunications, Tehran, Iran.","DOI":"10.1109\/ISTEL.2012.6483131"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Wang, J.A., Wang, H., Guo, M., Zhou, L., and Camargo, J. (2010, January 5\u20138). Ranking attacks based on vulnerability analysis. Proceedings of the 43rd Hawaii International Conference on System Sciences, Kauai, HI, USA.","DOI":"10.1109\/HICSS.2010.313"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Wita, R., Jiamnapanon, N., and Teng-Amnuay, Y. (2010, January 2\u20134). An ontology for vulnerability lifecycle. Proceedings of the 2010 Third International Symposium on Intelligent Information Technology and Security Informatics, Jinggangshan, China.","DOI":"10.1109\/IITSI.2010.141"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"120009","DOI":"10.1109\/ACCESS.2020.3004661","article-title":"Practical Vulnerability-Information-Sharing Architecture for Automotive Security-Risk Analysis","volume":"8","author":"Lee","year":"2020","journal-title":"IEEE Access"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"102316","DOI":"10.1016\/j.cose.2021.102316","article-title":"Assessing IoT enabled cyber-physical attack paths against critical system","volume":"107","author":"Stellios","year":"2021","journal-title":"J. Comput. Secur."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Rostami, S., Kleszcz, A., Dimanov, D., and Katos, V. (2020, January 8\u20139). A Machine Learning Approach to Dataset Imputation for Software Vulnerabilities. Proceedings of the 10th International Conference on Multimedia Communications, Services and Security, Krakow, Poland.","DOI":"10.1007\/978-3-030-59000-0_3"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Sion, L., Tuma, K., Scandariato, R., Yskout, K., and Joosen, W. (2019, January 10\u201315). Towards Automated Security Design Flaw Detection. Proceedings of the 2019 34th IEEE\/ACM International Conference on Automated Software Engineering Workshops, San Diego, CA, USA.","DOI":"10.1109\/ASEW.2019.00028"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Almorsy, M., Grundy, J., and Ibrahim, A.S. (2011, January 4\u20139). Collaboration-based cloud computing security management framework. Proceedings of the 2011 IEEE 4th International Conference on Cloud Computing, Washington, DC, USA.","DOI":"10.1109\/CLOUD.2011.9"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Kotenko, I., and Doynikova, E. (2015, January 24\u201326). The CAPEC based generator of attack scenarios for network security evaluation. Proceedings of the 2015 IEEE 8th International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, Warsaw, Poland.","DOI":"10.1109\/IDAACS.2015.7340774"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Xianghui, Z., Yong, P., Zan, Z., Yi, J., and Yuangang, Y. (2015, January 23\u201325). Research on parallel vulnerabilities discovery based on open source database and text mining. Proceedings of the 2015 International Conference on Intelligent Information Hiding and Multimedia Signal Processing, Adelaide, Australia.","DOI":"10.1109\/IIH-MSP.2015.84"},{"key":"ref_30","first-page":"265","article-title":"Toward Validation of Textual Information Retrieval Techniques for Software Weaknesses","volume":"903","author":"Ruohonen","year":"2018","journal-title":"Commun. Comput. Inf. Sci."},{"key":"ref_31","unstructured":"Guo, M., and Wang, J. (2009, January 5\u20137). An ontology-based approach to model common vulnerabilities and exposures in information security. Proceedings of the ASEE Southest Section Conference, Marietta, GA, USA."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1007\/s11416-014-0231-x","article-title":"An overview of vulnerability assessment and penetration testing techniques","volume":"11","author":"Shah","year":"2015","journal-title":"J. Comput. Virol. Hacking Tech."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Khera, Y., Kumar, D., Sujay, S., and Garg, N. (2019, January 14\u201316). Analysis and Impact of Vulnerability Assessment and Penetration Testing. Proceedings of the International Conference on Machine Learning, Big Data, Cloud and Parallel Computing: Trends, Prespectives and Prospects, Faridabad, India.","DOI":"10.1109\/COMITCon.2019.8862224"},{"key":"ref_34","unstructured":"Grigoriadis, C. (2019). Identification and Assessment of Security Attacks and Vulnerabilities, Utilizing CVE, CWE and CAPEC. [Master\u2019s Thesis, University of Piraeus]."},{"key":"ref_35","unstructured":"(2021, June 16). Scikit-Learn. Available online: https:\/\/scikit-learn.org\/stable\/."},{"key":"ref_36","unstructured":"(2021, June 16). Tensorflow Hub. Available online: https:\/\/tfhub.dev\/."},{"key":"ref_37","unstructured":"(2021, June 16). Sentence Transformers Documentation. Available online: https:\/\/www.sbert.net\/."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1016\/j.ins.2015.02.024","article-title":"Learning similarity with cosine similarity ensemble","volume":"307","author":"Xia","year":"2015","journal-title":"Inf. Sci."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Hafiz, M., Adamczyk, P., and Johnson, R. (2012, January 19\u201326). Growing a pattern language (for security). Proceedings of the ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, Tucson, AZ, USA.","DOI":"10.1145\/2384592.2384607"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Biswas, B., Mukhopadhyay, A., and Gupta, G. (2018, January 2\u20136). \u201cLeadership in Action: How Top Hackers Behave\u201d A Big-Data Approach with Text-Mining and Sentiment Analysis. Proceedings of the 51st Hawaii International Conference on System Sciences, Honolulu, HI, USA.","DOI":"10.24251\/HICSS.2018.221"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"1023","DOI":"10.1080\/07421222.2017.1394049","article-title":"Exploring emerging hacker assets and key hackers for proactive cyber threat intelligence","volume":"34","author":"Samtani","year":"2017","journal-title":"J. Manag. Inf. Syst."},{"key":"ref_42","first-page":"1","article-title":"CSPM: Metamodel for Handling Security and Privacy Knowledge in Cloud Service Development","volume":"12","author":"Xia","year":"2021","journal-title":"J. Syst. Softw. Secur. Prot."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/12\/8\/298\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:35:08Z","timestamp":1760164508000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/12\/8\/298"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,26]]},"references-count":42,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2021,8]]}},"alternative-id":["info12080298"],"URL":"https:\/\/doi.org\/10.3390\/info12080298","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,7,26]]}}}