{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T14:54:39Z","timestamp":1777128879937,"version":"3.51.4"},"reference-count":45,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2025,4,21]],"date-time":"2025-04-21T00:00:00Z","timestamp":1745193600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>This study examines multi-lexical data sources, utilizing an extracted dataset from an open-source corpus and the Global Terrorism Datasets (GTDs), to predict lexical patterns that are directly linked to terrorism. This is essential as specific patterns within a textual context can facilitate the identification of terrorism-related content. The research methodology focuses on generating a corpus from various published works and extracting texts pertinent to \u201cterrorism\u201d. Afterwards, we extract additional lexical contexts of GTDs that directly relate to terrorism. The integration of multi-lexical data sources generates lexical patterns linked to terrorism. Machine learning models were used to train the dataset. We conducted two primary experiments and analyzed the results. The analysis of data obtained from open sources reveals that while the Extra Trees model achieved the highest accuracy at 94.31%, the XGBoost model demonstrated superior overall performance with a higher recall (81.32%) and F1-Score (83.06%) after tuning, indicating a better balance between sensitivity and precision. Similarly, on the GTD dataset, XGBoost consistently outperformed other models in recall and the F1-score, making it a more suitable candidate for tasks where minimizing false negatives is critical. This implies that we can establish a specific co-occurrence and context within the terrorism dataset from multiple lexical data sources in effectively identifying certain multi-lexical patterns such as \u201cSuicide Attack\/Casualty\u201d, \u201cCivilians\/Victims\u201d, and \u201cHostage Taking\/Abduction\u201d across various applications or contexts. This will facilitate the development of a framework for understanding the lexical patterns associated with terrorism.<\/jats:p>","DOI":"10.3390\/fi17040182","type":"journal-article","created":{"date-parts":[[2025,4,21]],"date-time":"2025-04-21T20:38:26Z","timestamp":1745267906000},"page":"182","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Cybersecurity Intelligence Through Textual Data Analysis: A Framework Using Machine Learning and Terrorism Datasets"],"prefix":"10.3390","volume":"17","author":[{"given":"Mohammed Salem","family":"Atoum","sequence":"first","affiliation":[{"name":"Department of Computer Science, The University of Jordan, Amman 11942, Jordan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6099-1027","authenticated-orcid":false,"given":"Ala Abdulsalam","family":"Alarood","sequence":"additional","affiliation":[{"name":"College of Computer Science and Engineering, University of Jeddah, Jeddah 21959, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0132-6662","authenticated-orcid":false,"given":"Eesa","family":"Alsolami","sequence":"additional","affiliation":[{"name":"College of Computer Science and Engineering, University of Jeddah, Jeddah 21959, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6471-481X","authenticated-orcid":false,"given":"Adamu","family":"Abubakar","sequence":"additional","affiliation":[{"name":"Department of Computer Science, International Islamic University Malaysia, Kuala Lumpur 53100, Malaysia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4930-0074","authenticated-orcid":false,"given":"Ahmad K. Al","family":"Hwaitat","sequence":"additional","affiliation":[{"name":"Department of Computer Science, The University of Jordan, Amman 11942, Jordan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7832-5081","authenticated-orcid":false,"given":"Izzat","family":"Alsmadi","sequence":"additional","affiliation":[{"name":"Department of Computing, Engineering and Mathematical Sciences, Texas A&M University, San Antonio, TX 78224, USA"},{"name":"Department of Computer Information Systems, The University of Jordan, Aqaba 77110, Jordan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,4,21]]},"reference":[{"key":"ref_1","unstructured":"Biber, D., and Egbert, J. (2020). Register Variation Online, Cambridge University Press."},{"key":"ref_2","unstructured":"Hoffmann, T., and Hilpert, M. (2021). Construction Grammar and Its Application to English, Cambridge University Press."},{"key":"ref_3","first-page":"104366","article-title":"A neural network model of linguistic and non-linguistic semantics: Exploring language grounding in text","volume":"132","author":"Louwerse","year":"2023","journal-title":"J. Mem. Lang."},{"key":"ref_4","first-page":"1","article-title":"Multi-label text classification for automated ICD-9 coding in healthcare","volume":"22","author":"Chiu","year":"2021","journal-title":"BMC Bioinform."},{"key":"ref_5","first-page":"1746","article-title":"A sensitivity analysis of (and practitioners\u2019 guide to) convolutional neural networks for sentence classification","volume":"31","author":"Zhang","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_6","first-page":"3111","article-title":"Distributed representations of words and phrases and their compositionality","volume":"26","author":"Mikolov","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"100263","DOI":"10.1016\/j.chbr.2022.100263","article-title":"Scoping review of the neural evidence on the uncanny valley","volume":"9","author":"Alimardani","year":"2023","journal-title":"Comput. Hum. Behav. Rep."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"103854","DOI":"10.1109\/ACCESS.2019.2929798","article-title":"Analysis of the terrorist organization alliance network based on complex network theory","volume":"7","author":"Li","year":"2019","journal-title":"IEEE Access"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Hu, J., Chu, C., Xu, L., Wu, P., and Lia, H.J. (2021). Critical terrorist organizations and terrorist organization alliance networks based on key nodes founding. Front. Phys., 9.","DOI":"10.3389\/fphy.2021.687883"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1016\/j.neucom.2020.07.125","article-title":"Research on historical phase division of terrorism: An analysis method by time series complex network","volume":"420","author":"Qiao","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Arslan, M.E. (2024). Targeting telecommunications: Why do rebel groups target information and communication technology infrastructure?. J. Peace Res.","DOI":"10.1177\/00223433241268668"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Sharma, A., Rushton, K., Lin, I.W., Nguyen, T., and Althoff, T. (2024, January 11). Facilitating self-guided mental health interventions through human-language model interaction: A case study of cognitive restructuring. Proceedings of the CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA.","DOI":"10.1145\/3613904.3642761"},{"key":"ref_13","first-page":"102788","article-title":"Analyzing social media language for detecting potential security threats: A case of Twitter and terrorism","volume":"58","author":"Alsmadi","year":"2021","journal-title":"J. Inf. Secur. Appl."},{"key":"ref_14","first-page":"1184","article-title":"Looking for patterns in terrorist communication: Machine learning approaches for distinguishing propaganda","volume":"32","author":"Youngblood","year":"2020","journal-title":"Terror. Polit. Viol."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1146\/annurev-polisci-053119-015921","article-title":"Machine learning for social science: An agnostic approach to text and its limits","volume":"24","author":"Grimmer","year":"2021","journal-title":"Annu. Rev. Polit. Sci."},{"key":"ref_16","first-page":"55","article-title":"Analyzing key term extraction for document summarization and clustering in social media content on terrorism","volume":"12","author":"Afzal","year":"2022","journal-title":"Soc. Netw. Anal. Min."},{"key":"ref_17","first-page":"238","article-title":"Understanding radicalization through linguistic patterns: Detecting extremist narratives","volume":"6","author":"Magdy","year":"2023","journal-title":"J. Comput. Soc. Sci."},{"key":"ref_18","first-page":"323","article-title":"Transformer-based deep intelligent contextual embedding for classification of fake news and toxic comments","volume":"140","author":"Naseem","year":"2020","journal-title":"Pattern Recognit. Lett."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"103251","DOI":"10.1016\/j.ipm.2022.103251","article-title":"Feature selection based on absolute deviation factor for text classification","volume":"60","author":"Jin","year":"2023","journal-title":"Inf. Process. Manag."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"105201","DOI":"10.1016\/j.resourpol.2024.105201","article-title":"The role of gold in terrorism: Risk aversion or financing source?","volume":"95","author":"Song","year":"2024","journal-title":"Resour. Policy"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1016\/j.ins.2022.11.158","article-title":"XRR: Extreme multi-label text classification with candidate retrieving and deep ranking","volume":"622","author":"Xiong","year":"2023","journal-title":"Inf. Sci."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"20898","DOI":"10.1073\/pnas.1904418116","article-title":"Local alliances and rivalries shape near-repeat terror activity of al-Qaeda, ISIS, and insurgents","volume":"116","author":"Chuang","year":"2019","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"123340","DOI":"10.1016\/j.energy.2022.123340","article-title":"Terrorist attacks and oil prices: A time-varying causal relationship analysis","volume":"246","author":"Song","year":"2022","journal-title":"Energy"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"107","DOI":"10.7763\/IJKE.2015.V1.18","article-title":"An experimental study of classification algorithms for terrorism prediction","volume":"1","author":"Tolan","year":"2015","journal-title":"Int. J. Knowl. Eng."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Hu, X., Lai, F., Chen, G., Zou, R., and Feng, Q. (2019). Quantitative research on global terrorist attacks and terrorist attack classification. Sustainability, 11.","DOI":"10.3390\/su11051487"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1186\/s40854-022-00445-3","article-title":"Cryptocurrency technology revolution: Are Bitcoin prices and terrorist attacks related?","volume":"9","author":"Song","year":"2023","journal-title":"Financ. Innov."},{"key":"ref_27","first-page":"e02728","article-title":"Predictive modeling for compressive strength of 3D printed fiber-reinforced concrete using machine learning algorithms","volume":"20","author":"Alyami","year":"2024","journal-title":"Case Stud. Constr. Mater."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Ogunpola, A., Saeed, F., Basurra, S., Albarrak, A.M., and Qasem, S.N. (2024). Machine learning-based predictive models for detection of cardiovascular diseases. Diagnostics, 14.","DOI":"10.3390\/diagnostics14020144"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"102148","DOI":"10.1016\/j.rineng.2024.102148","article-title":"Machine learning-based predictive model for thermal comfort and energy optimization in smart buildings","volume":"22","author":"Boutahri","year":"2024","journal-title":"Results Eng."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"121549","DOI":"10.1016\/j.eswa.2023.121549","article-title":"An improved random forest based on the classification accuracy and correlation measurement of decision trees","volume":"237","author":"Sun","year":"2024","journal-title":"Expert Syst. Appl."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1413","DOI":"10.3934\/mbe.2024061","article-title":"Decision tree models for the estimation of geo-polymer concrete compressive strength","volume":"21","author":"Zhou","year":"2024","journal-title":"Math. Biosci. Eng."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Blockeel, H., Devos, L., Fr\u00e9nay, B., Nanfack, G., and Nijssen, S. (2023). Decision trees: From efficient prediction to responsible AI. Front. Artif. Intell., 6.","DOI":"10.3389\/frai.2023.1124553"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"3111200","DOI":"10.1155\/2022\/3111200","article-title":"Cloud-Based Framework for COVID-19 Detection through Feature Fusion with Bootstrap Aggregated Extreme Learning Machine","volume":"2022","author":"Saba","year":"2022","journal-title":"Discret. Dyn. Nat. Soc."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1015","DOI":"10.1007\/s11600-022-00934-0","article-title":"Snow water equivalent prediction in a mountainous area using hybrid bagging machine learning approaches","volume":"71","author":"Khosravi","year":"2023","journal-title":"Acta Geophys."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1016\/j.procs.2019.01.179","article-title":"XG-SF: An XGBoost classifier based on shapelet features for time series classification","volume":"147","author":"Ji","year":"2019","journal-title":"Procedia Comput. Sci."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"122136","DOI":"10.1016\/j.eswa.2023.122136","article-title":"Evaluation of students\u2019 performance during the academic period using the XG-Boost Classifier-Enhanced AEO hybrid model","volume":"238","author":"Cheng","year":"2024","journal-title":"Expert Syst. Appl."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Liu, Y., Yang, T., Tian, L., Huang, B., Yang, J., and Zeng, Z. (2024). Ada-XG-CatBoost: A Combined Forecasting Model for Gross Ecosystem Product (GEP) Prediction. Sustainability, 16.","DOI":"10.3390\/su16167203"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Nivetha, M., and Sudha, I. Cocoon morphological Features Based Silk Quality Prediction Using XG Boost Algorithm. Proceedings of the 2024 IEEE International Students\u2019 Conference on Electrical, Electronics and Computer Science (SCEECS), Bhopal, India.","DOI":"10.1109\/SCEECS61402.2024.10481988"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1016\/j.isprsjprs.2016.01.011","article-title":"Random forest in remote sensing: A review of applications and future directions","volume":"114","author":"Belgiu","year":"2016","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"104503","DOI":"10.1016\/j.scs.2023.104503","article-title":"Predicting the carbon dioxide emission caused by road transport using a Random Forest (RF) model combined by Meta-Heuristic Algorithms","volume":"93","author":"Khajavi","year":"2023","journal-title":"Sustain. Cities Soc."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Iranzad, R., and Liu, X. (2024). A review of random forest-based feature selection methods for data science education and applications. Int. J. Data Sci. Anal., 1\u20135.","DOI":"10.1007\/s41060-024-00509-w"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"111752","DOI":"10.1016\/j.ecolind.2024.111752","article-title":"Improved random forest algorithms for increasing the accuracy of forest aboveground biomass estimation using Sentinel-2 imagery","volume":"159","author":"Zhang","year":"2024","journal-title":"Ecol. Indic."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"425","DOI":"10.1016\/j.aej.2023.11.061","article-title":"Boruta extra tree-bidirectional long short-term memory model development for Pan evaporation forecasting: Investigation of arid climate condition","volume":"86","author":"Karbasi","year":"2024","journal-title":"Alex. Eng. J."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"2350066","DOI":"10.1142\/S0219649223500661","article-title":"An Integrated Feature Extraction Based on Principal Components and Deep Auto Encoder with Extra Tree for Intrusion Detection Systems","volume":"23","author":"Mallampati","year":"2024","journal-title":"J. Inf. Knowl. Manag."},{"key":"ref_45","unstructured":"(2025, February 02). National Consortium for the Study of Terrorism and Responses to Terrorism (START), University of Maryland (2018). The Global Terrorism Database (GTD). Available online: https:\/\/www.kaggle.com\/datasets\/START-UMD\/gtd\/https:\/\/www.start.umd.edu\/gtd."}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/17\/4\/182\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:18:45Z","timestamp":1760030325000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/17\/4\/182"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,21]]},"references-count":45,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2025,4]]}},"alternative-id":["fi17040182"],"URL":"https:\/\/doi.org\/10.3390\/fi17040182","relation":{},"ISSN":["1999-5903"],"issn-type":[{"value":"1999-5903","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,4,21]]}}}