{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T20:24:55Z","timestamp":1757622295180,"version":"3.44.0"},"publisher-location":"Cham","reference-count":17,"publisher":"Springer Nature Switzerland","isbn-type":[{"type":"print","value":"9783032006325"},{"type":"electronic","value":"9783032006332"}],"license":[{"start":{"date-parts":[[2025,1,1]],"date-time":"2025-01-01T00:00:00Z","timestamp":1735689600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,8,9]],"date-time":"2025-08-09T00:00:00Z","timestamp":1754697600000},"content-version":"vor","delay-in-days":220,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Cyber Threat Intelligence (CTI) has become an indispensable element of cybersecurity operations and any mechanism or tool that alleviates the workload of security analysts is highly valuable. Natural Language Processing (NLP) supports efficient processing of news articles, and enables us to group articles that report about the same story. This allows Open Source Intelligence (OSINT) analysts to manage information overload and focus only on essential events. Therefore, the contributions of this paper are manyfold: (i) We identify the relevant requirements for designing an OSINT clustering tool, (ii) present a solution that can support such requirements, and (iii) evaluate the solution considering the needs of OSINT analysts. Our clustering approach, denoted as SC4OSINT, is inspired by an existing semi-supervised graph-based story clustering method and adapted to the OSINT requirements. Unlike the original method, SC4OSINT is a fully unsupervised two-layer approach, which handles multilingual streaming data and uses sentence transformers to create fine-grained clusters. We evaluate SC4OSINT\u2019s story clustering by letting security experts rate the clustering quality across various model configurations. The results show that the best hyper-parameter configuration achieves an average rating of 4.19\/5, demonstrating the efficiency of our approach.<\/jats:p>","DOI":"10.1007\/978-3-032-00633-2_1","type":"book-chapter","created":{"date-parts":[[2025,8,8]],"date-time":"2025-08-08T10:15:11Z","timestamp":1754648111000},"page":"5-24","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["SC4OSINT: A Story Clustering Approach to\u00a0Optimize OSINT Analysis"],"prefix":"10.1007","author":[{"given":"Elisabeth","family":"Woisetschl\u00e4ger","sequence":"first","affiliation":[]},{"given":"Medina","family":"Andresel","sequence":"additional","affiliation":[]},{"given":"Florian","family":"Skopik","sequence":"additional","affiliation":[]},{"given":"Benjamin","family":"Akhras","sequence":"additional","affiliation":[]},{"given":"Peter","family":"Leitmann","sequence":"additional","affiliation":[]},{"given":"Max","family":"Landauer","sequence":"additional","affiliation":[]},{"given":"Markus","family":"Wurzenberger","sequence":"additional","affiliation":[]},{"given":"Alexander","family":"Schindler","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,8,9]]},"reference":[{"key":"1_CR1","unstructured":"Taranis AI (2024). https:\/\/taranis.ai\/"},{"key":"1_CR2","unstructured":"Akbik, A., Bergmann, T., Blythe, D., Rasul, K., Schweter, S., Vollgraf, R.: FLAIR: an easy-to-use framework for state-of-the-art NLP. In: NAACL 2019, pp. 54\u201359 (2019)"},{"issue":"10","key":"1_CR3","doi-asserted-by":"publisher","first-page":"P10008","DOI":"10.1088\/1742-5468\/2008\/10\/P10008","volume":"2008","author":"VD Blondel","year":"2008","unstructured":"Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008(10), P10008 (2008)","journal-title":"J. Stat. Mech. Theory Exp."},{"key":"1_CR4","doi-asserted-by":"crossref","unstructured":"Chowdhary, K.R.: Natural Language Processing, pp. 603\u2013649. Springer, New Delhi (2020)","DOI":"10.1007\/978-81-322-3972-7_19"},{"key":"1_CR5","doi-asserted-by":"crossref","unstructured":"H\u00e4rdle, W.K., Simar, L., Fengler, M.R.: Uniform manifold approximation and projection, pp. 581\u2013595. Springer, Cham (2024)","DOI":"10.1007\/978-3-031-63833-6_23"},{"key":"1_CR6","unstructured":"Kuehn, P., Kerk, M., Wendelborn, M., Reuter, C.: Clustering of threat information to mitigate information overload for computer emergency response teams. CoRR arxiv:2210.14067 (2022)"},{"key":"1_CR7","doi-asserted-by":"crossref","unstructured":"Liu, B., Han, F.X., Niu, D., Kong, L., Lai, K., Xu, Y.: Story forest: extracting events and telling stories from breaking news. ACM Trans. Knowl. Discov. Data 14(3), 31:1\u201331:28 (2020)","DOI":"10.1145\/3377939"},{"key":"1_CR8","doi-asserted-by":"crossref","unstructured":"Ma, C., et al.: FineCTI: a framework for mining fine-grained cyber threat information from twitter using NER model. In: 2023 IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), pp. 531\u2013538 (2023)","DOI":"10.1109\/TrustCom60117.2023.00085"},{"issue":"3","key":"1_CR9","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3530977","volume":"25","author":"C Martins","year":"2022","unstructured":"Martins, C., Medeiros, I.: Generating quality threat intelligence leveraging OSINT and a cyber threat unified taxonomy. ACM Trans. Priv. Secur. 25(3), 1\u201339 (2022)","journal-title":"ACM Trans. Priv. Secur."},{"key":"1_CR10","unstructured":"Rajaraman, A., Ullman, J.D.: Data Mining, pp. 1\u201317. Cambridge University Press, Cambridge (2011)"},{"key":"1_CR11","doi-asserted-by":"crossref","unstructured":"Rani, N., Saha, B., Maurya, V., Shukla, S.: Ttpxhunter: actionable threat intelligence extraction as TTPS from finished cyber threat reports. Digit. Threats: Res. Pract. (2024)","DOI":"10.1145\/3696427"},{"key":"1_CR12","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"429","DOI":"10.1007\/978-3-030-86890-1_24","volume-title":"Information and Communications Security","author":"T Riebe","year":"2021","unstructured":"Riebe, T., et al.: CySecAlert: an alert generation system for cyber security events using open source intelligence data. In: Gao, D., Li, Q., Guan, X., Liao, X. (eds.) ICICS 2021. LNCS, vol. 12918, pp. 429\u2013446. Springer, Cham (2021). https:\/\/doi.org\/10.1007\/978-3-030-86890-1_24"},{"key":"1_CR13","doi-asserted-by":"crossref","unstructured":"Sayyadi, H., Raschid, L.: A graph analytical approach for topic detection. ACM Trans. Internet Technol. 13(2) (2013)","DOI":"10.1145\/2542214.2542215"},{"key":"1_CR14","doi-asserted-by":"publisher","DOI":"10.4324\/9781315397900","volume-title":"Collaborative Cyber Threat Intelligence: Detecting and Responding to Advanced Cyber Attacks at the National Level","author":"F Skopik","year":"2017","unstructured":"Skopik, F.: Collaborative Cyber Threat Intelligence: Detecting and Responding to Advanced Cyber Attacks at the National Level. CRC Press, Boca Raton (2017)"},{"key":"1_CR15","doi-asserted-by":"crossref","unstructured":"Skopik, F., Akhras, B., Woisetschl\u00e4ger, E., Andresel, M., Wurzenberger, M., Landauer, M.: On the application of natural language processing for advanced OSINT analysis in cyber defence. In: Proceedings of the 19th International Conference on Availability, Reliability and Security. ARES 2024. Association for Computing Machinery, New York (2024)","DOI":"10.1145\/3664476.3670899"},{"issue":"3","key":"1_CR16","doi-asserted-by":"publisher","first-page":"1748","DOI":"10.1109\/COMST.2023.3273282","volume":"25","author":"N Sun","year":"2023","unstructured":"Sun, N., et al.: Cyber threat intelligence mining for proactive cybersecurity defense: a survey and new perspectives. IEEE Commun. Surv. Tutorials 25(3), 1748\u20131774 (2023)","journal-title":"IEEE Commun. Surv. Tutorials"},{"issue":"12","key":"1_CR17","doi-asserted-by":"publisher","first-page":"3770","DOI":"10.14778\/3554821.3554896","volume":"15","author":"I Trummer","year":"2022","unstructured":"Trummer, I.: From BERT to GPT-3 codex: harnessing the potential of very large language models for data management. Proc. VLDB Endow. 15(12), 3770\u20133773 (2022)","journal-title":"Proc. VLDB Endow."}],"container-title":["Lecture Notes in Computer Science","Availability, Reliability and Security"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-032-00633-2_1","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,8]],"date-time":"2025-09-08T19:11:02Z","timestamp":1757358662000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-032-00633-2_1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025]]},"ISBN":["9783032006325","9783032006332"],"references-count":17,"URL":"https:\/\/doi.org\/10.1007\/978-3-032-00633-2_1","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"type":"print","value":"0302-9743"},{"type":"electronic","value":"1611-3349"}],"subject":[],"published":{"date-parts":[[2025]]},"assertion":[{"value":"9 August 2025","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"ARES","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"International Conference on Availability, Reliability and Security","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Ghent","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Belgium","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2025","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"11 August 2025","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"14 August 2025","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"20","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"ares-12025","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"https:\/\/2025.ares-conference.eu","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}}]}}