{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,20]],"date-time":"2026-04-20T10:38:30Z","timestamp":1776681510171,"version":"3.51.2"},"reference-count":57,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2023,2,7]],"date-time":"2023-02-07T00:00:00Z","timestamp":1675728000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Recently proposed methods in intrusion detection are iterating on machine learning methods as a potential solution. These novel methods are validated on one or more datasets from a sparse collection of academic intrusion detection datasets. Their recognition as improvements to the state-of-the-art is largely dependent on whether they can demonstrate a reliable increase in classification metrics compared to similar works validated on the same datasets. Whether these increases are meaningful outside of the training\/testing datasets is rarely asked and never investigated. This work aims to demonstrate that strong general performance does not typically follow from strong classification on the current intrusion detection datasets. Binary classification models from a range of algorithmic families are trained on the attack classes of CSE-CIC-IDS2018, a state-of-the-art intrusion detection dataset. After establishing baselines for each class at various points of data access, the same trained models are tasked with classifying samples from the corresponding attack classes in CIC-IDS2017, CIC-DoS2017 and CIC-DDoS2019. Contrary to what the baseline results would suggest, the models have rarely learned a generally applicable representation of their attack class. Stability and predictability of generalized model performance are central issues for all methods on all attack classes. 
Focusing only on the three best-in-class models in terms of interdataset generalization reveals that for network-centric attack classes (brute force, denial of service and distributed denial of service), general representations can be learned with flat losses in classification performance (precision and recall) below 5%. Other attack classes vary in generalized performance from stark losses in recall (\u221235%) with intact precision (98+%) for botnets to total degradation of precision and moderate recall loss for Web attack and infiltration models. The core conclusion of this article is a warning to researchers in the field. Expecting results of proposed methods on the test sets of state-of-the-art intrusion detection datasets to translate to generalized performance is likely a serious overestimation. Four proposals to reduce this overestimation are set out as future work directions.<\/jats:p>","DOI":"10.3390\/s23041846","type":"journal-article","created":{"date-parts":[[2023,2,8]],"date-time":"2023-02-08T02:04:16Z","timestamp":1675821856000},"page":"1846","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":17,"title":["Investigating Generalized Performance of Data-Constrained Supervised Machine Learning Models on Novel, Related Samples in Intrusion Detection"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5086-6361","authenticated-orcid":false,"given":"Laurens","family":"D\u2019hooge","sequence":"first","affiliation":[{"name":"IDLab, Department of Information Technology, Ghent University-imec, 9052 Gent, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1781-900X","authenticated-orcid":false,"given":"Miel","family":"Verkerken","sequence":"additional","affiliation":[{"name":"IDLab, Department of Information Technology, Ghent University-imec, 9052 Gent, 
Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2618-3311","authenticated-orcid":false,"given":"Tim","family":"Wauters","sequence":"additional","affiliation":[{"name":"IDLab, Department of Information Technology, Ghent University-imec, 9052 Gent, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4824-1199","authenticated-orcid":false,"given":"Filip","family":"De Turck","sequence":"additional","affiliation":[{"name":"IDLab, Department of Information Technology, Ghent University-imec, 9052 Gent, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0575-5894","authenticated-orcid":false,"given":"Bruno","family":"Volckaert","sequence":"additional","affiliation":[{"name":"IDLab, Department of Information Technology, Ghent University-imec, 9052 Gent, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,2,7]]},"reference":[{"key":"ref_1","unstructured":"Denning, D., and Neumann, P.G. (1985). Requirements and Model for IDES-a Real-Time Intrusion-Detection Expert System, SRI International Menlo Park."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"222","DOI":"10.1109\/TSE.1987.232894","article-title":"An intrusion-detection model","volume":"SE-13","author":"Denning","year":"1987","journal-title":"IEEE Trans. Softw. Eng."},{"key":"ref_3","unstructured":"Google (2022, December 20). Google Transparency Report. Available online: https:\/\/transparencyreport.google.com\/https\/overview?hl=en."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1011","DOI":"10.1109\/SURV.2011.092311.00082","article-title":"Evasion techniques: Sneaking through your intrusion detection\/prevention systems","volume":"14","author":"Cheng","year":"2011","journal-title":"IEEE Commun. Surv. 
Tutor."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1016\/j.ins.2013.03.022","article-title":"Adversarial attacks against intrusion detection systems: Taxonomy, solutions and open issues","volume":"239","author":"Corona","year":"2013","journal-title":"Inf. Sci."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Li, Z., Das, A., and Zhou, J. (2005, January 7\u201310). Model generalization and its implications on intrusion detection. Proceedings of the International Conference on Applied Cryptography and Network Security, New York, NY, USA.","DOI":"10.1007\/11496137_16"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Xu, X., and Wang, X. (2005, January 22\u201324). An adaptive network intrusion detection method based on PCA and support vector machines. Proceedings of the International Conference on Advanced Data Mining and Applications, Wuhan, China.","DOI":"10.1007\/11527503_82"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"577","DOI":"10.1109\/TSMCB.2007.914695","article-title":"Adaboost-based algorithm for network intrusion detection","volume":"38","author":"Hu","year":"2008","journal-title":"IEEE Trans. Syst. Man, Cybern. Part B (Cybern.)"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Chen, R.C., Cheng, K.F., Chen, Y.H., and Hsieh, C.F. (2009, January 1\u20133). Using rough set and support vector machine for network intrusion detection system. 
Proceedings of the 2009 First Asian Conference on Intelligent Information and Database Systems, Dong hoi, Vietnam.","DOI":"10.1109\/ACIIDS.2009.59"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.proeng.2012.01.827","article-title":"A hybrid intelligent approach for network intrusion detection","volume":"30","author":"Panda","year":"2012","journal-title":"Procedia Eng."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"753","DOI":"10.1016\/j.asej.2013.01.003","article-title":"A hybrid network intrusion detection framework based on random forests and weighted k-means","volume":"4","author":"Elbasiony","year":"2013","journal-title":"Ain Shams Eng. J."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Javaid, A., Niyaz, Q., Sun, W., and Alam, M. (2016, January 3\u20135). A deep learning approach for network intrusion detection system. Proceedings of the 9th EAI International Conference on Bio-Inspired Information and Communications Technologies (formerly BIONETICS), New York, NY, USA.","DOI":"10.4108\/eai.3-12-2015.2262516"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1051","DOI":"10.1007\/s00521-016-2418-1","article-title":"An effective combining classifier approach using tree algorithms for network intrusion detection","volume":"28","author":"Kevric","year":"2017","journal-title":"Neural Comput. Appl."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1109\/TETCI.2017.2772792","article-title":"A deep learning approach to network intrusion detection","volume":"2","author":"Shone","year":"2018","journal-title":"IEEE Trans. Emerg. Top. Comput. 
Intell."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"167455","DOI":"10.1109\/ACCESS.2019.2953451","article-title":"Classification hardness for supervised learners on 20 years of intrusion detection data","volume":"7","author":"Wauters","year":"2019","journal-title":"IEEE Access"},{"key":"ref_16","unstructured":"Recht, B., Roelofs, R., Schmidt, L., and Shankar, V. (2019, January 10\u201315). Do imagenet classifiers generalize to imagenet?. Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA."},{"key":"ref_17","unstructured":"Marasovic, A. (2018). NLP\u2019s generalization problem, and how researchers are tackling it. Gradient, Available online: https:\/\/thegradient.pub\/frontiers-of-generalization-in-natural-language-processing\/."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Jia, R., and Liang, P. (2017). Adversarial examples for evaluating reading comprehension systems. arXiv.","DOI":"10.18653\/v1\/D17-1215"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Iyyer, M., Wieting, J., Gimpel, K., and Zettlemoyer, L. (2018). Adversarial example generation with syntactically controlled paraphrase networks. arXiv.","DOI":"10.18653\/v1\/N18-1170"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Angiulli, F., Argento, L., and Furfaro, A. (2015, January 9\u201311). Exploiting n-gram location for intrusion detection. Proceedings of the 2015 IEEE 27th International Conference on Tools with Artificial Intelligence (ICTAI), Vietri sul Mare, Italy.","DOI":"10.1109\/ICTAI.2015.155"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Azizjon, M., Jumabek, A., and Kim, W. (2020, January 19\u201321). 1D CNN based network intrusion detection with normalization on imbalanced data. 
Proceedings of the 2020 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Fukuoka, Japan.","DOI":"10.1109\/ICAIIC48513.2020.9064976"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Kim, J., Kim, J., Kim, H., Shim, M., and Choi, E. (2020). CNN-based network intrusion detection against denial-of-service attacks. Electronics, 9.","DOI":"10.3390\/electronics9060916"},{"key":"ref_23","unstructured":"Ma\u0142owidzki, M., Berezinski, P., and Mazur, M. (2015, January 23). Network intrusion detection: Half a kingdom for a good dataset. Proceedings of the NATO STO SAS-139 Workshop, Lisbon, Portugal."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Vasilomanolakis, E., Cordero, C.G., Milanov, N., and M\u00fchlh\u00e4user, M. (2016, January 25\u201329). Towards the creation of synthetic, yet realistic, intrusion detection datasets. Proceedings of the NOMS 2016\u20142016 IEEE\/IFIP Network Operations and Management Symposium, Istanbul, Turkey.","DOI":"10.1109\/NOMS.2016.7502989"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1016\/j.cose.2019.06.005","article-title":"A survey of network-based intrusion detection data sets","volume":"86","author":"Ring","year":"2019","journal-title":"Comput. Secur."},{"key":"ref_26","first-page":"3237","article-title":"A novel approach of KPCA and SVM for intrusion detection","volume":"8","author":"Kuang","year":"2012","journal-title":"J. Comput. Inf. Syst."},{"key":"ref_27","unstructured":"Govindarajan, M., and Chandrasekaran, R. (2012, January 4\u20136). Intrusion detection using an ensemble of classification methods. Proceedings of the World Congress on Engineering and Computer Science, London, UK."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Sommer, R., and Paxson, V. (2010, January 16\u201319). Outside the closed world: On using machine learning for network intrusion detection. 
Proceedings of the 2010 IEEE Symposium on Security and Privacy, Oakland, CA, USA.","DOI":"10.1109\/SP.2010.25"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Gates, C., and Taylor, C. (2006, January 19\u201322). Challenging the Anomaly Detection Paradigm: A Provocative Discussion. Proceedings of the 2006 workshop on New Security Paradigms, Schloss Dagstuhl, Germany.","DOI":"10.1145\/1278940.1278945"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Maggi, F., Robertson, W., Kruegel, C., and Vigna, G. (2009, January 23\u201325). Protecting a moving target: Addressing web application concept drift. Proceedings of the International Workshop on Recent Advances in Intrusion Detection, Saint-Malo, France.","DOI":"10.1007\/978-3-642-04342-0_2"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Creech, G., and Hu, J. (2013, January 7\u201310). Generation of a new IDS test dataset: Time to retire the KDD collection. Proceedings of the 2013 IEEE Wireless Communications and Networking Conference (WCNC), Shanghai, China.","DOI":"10.1109\/WCNC.2013.6555301"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1109\/MC.2018.2888764","article-title":"KDD cup 99 data sets: A perspective on the role of data sets in network intrusion detection research","volume":"52","author":"Siddique","year":"2019","journal-title":"Computer"},{"key":"ref_33","unstructured":"Barbosa, R.R.R., Sadre, R., Pras, A., and van de Meent, R. (2010). Simpleweb\/university of twente traffic traces data repository. Cent. Telemat. Inf. Technol. Univ. Twente."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1016\/j.cose.2014.05.011","article-title":"An empirical comparison of botnet detection methods","volume":"45","author":"Garcia","year":"2014","journal-title":"Comput. Secur."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Moustafa, N., and Slay, J. (2015, January 10\u201312). 
UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, ACT, Australia.","DOI":"10.1109\/MilCIS.2015.7348942"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Tavallaee, M., Bagheri, E., Lu, W., and Ghorbani, A.A. (2009, January 8\u201310). A detailed analysis of the KDD CUP 99 data set. Proceedings of the 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, Ottawa, ON, Canada.","DOI":"10.1109\/CISDA.2009.5356528"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1016\/j.cose.2011.12.012","article-title":"Toward developing a systematic approach to generate benchmark datasets for intrusion detection","volume":"31","author":"Shiravi","year":"2012","journal-title":"Comput. Secur."},{"key":"ref_38","first-page":"108","article-title":"Toward generating a new intrusion detection dataset and intrusion traffic characterization","volume":"1","author":"Sharafaldin","year":"2018","journal-title":"ICISSp"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Sharafaldin, I., and CIC (2022, December 27). CIC-IDS2017. Available online: https:\/\/www.unb.ca\/cic\/datasets\/ids-2017.html.","DOI":"10.13052\/jsn2445-9739.2017.009"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1016\/j.comnet.2017.03.018","article-title":"Detecting HTTP-based application layer DoS attacks on web servers in the presence of sampling","volume":"121","author":"Jazi","year":"2017","journal-title":"Comput. Netw."},{"key":"ref_41","unstructured":"Jazi, H.H., and CIC (2022, December 27). CIC-DoS2017. 
Available online: https:\/\/www.unb.ca\/cic\/datasets\/dos-dataset.html."},{"key":"ref_42","first-page":"177","article-title":"Towards a reliable intrusion detection benchmark dataset","volume":"2018","author":"Sharafaldin","year":"2018","journal-title":"Softw. Netw."},{"key":"ref_43","unstructured":"Sharafaldin, I., and CIC (2022, December 27). CSE-CIC-IDS2018. Available online: https:\/\/www.unb.ca\/cic\/datasets\/ids-2018.html."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Sharafaldin, I., Lashkari, A.H., Hakak, S., and Ghorbani, A.A. (2019, January 1\u20133). Developing realistic distributed denial of service (DDoS) attack dataset and taxonomy. Proceedings of the 2019 International Carnahan Conference on Security Technology (ICCST), Chennai, India.","DOI":"10.1109\/CCST.2019.8888419"},{"key":"ref_45","unstructured":"Sharafaldin, I., and CIC (2022, December 27). CIC-DDoS2019. Available online: https:\/\/www.unb.ca\/cic\/datasets\/ddos-2019.html."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Hastie, T., Tibshirani, R., and Friedman, J.H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer.","DOI":"10.1007\/978-0-387-84858-7"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, January 13\u201317). Xgboost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939785"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/s10994-006-6226-1","article-title":"Extremely randomized trees","volume":"63","author":"Geurts","year":"2006","journal-title":"Mach. Learn."},{"key":"ref_49","unstructured":"(2022, December 27). Touyachrist. Evo-Zeus. Available online: https:\/\/github.com\/touyachrist\/evo-zeus."},{"key":"ref_50","unstructured":"(2022, December 27). Sweetsoftware. Ares. 
Available online: https:\/\/github.com\/sweetsoftware\/Ares."},{"key":"ref_51","unstructured":"Cybersecurity & Infrastructure Security Agency (CISA), U.G (2022, December 20). UDP-Based Amplification Attacks, Available online: https:\/\/www.us-cert.gov\/ncas\/alerts\/TA14-017A."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Alom, M.Z., Bontupalli, V., and Taha, T.M. (2015, January 15\u201319). Intrusion detection using deep belief networks. Proceedings of the 2015 National Aerospace and Electronics Conference (NAECON), Dayton, OH, USA.","DOI":"10.1109\/NAECON.2015.7443094"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Kim, J., Kim, J., Thu, H.L.T., and Kim, H. (2016, January 15\u201317). Long short term memory recurrent neural network classifier for intrusion detection. Proceedings of the 2016 International Conference on Platform Technology and Service (PlatCon), Jeju, Republic of Korea.","DOI":"10.1109\/PlatCon.2016.7456805"},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"12499","DOI":"10.1007\/s00521-020-04708-x","article-title":"An efficient XGBoost\u2013DNN-based classification model for network intrusion detection system","volume":"32","author":"Devan","year":"2020","journal-title":"Neural Comput. Appl."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Lei, M., Li, X., Cai, B., Li, Y., Liu, L., and Kong, W. (2020, January 19\u201324). P-DNN: An effective intrusion detection method based on pruning deep neural network. Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK.","DOI":"10.1109\/IJCNN48605.2020.9206805"},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"582","DOI":"10.1109\/TNSM.2021.3091517","article-title":"Practical Intrusion Detection of Emerging Threats","volume":"19","author":"Mills","year":"2022","journal-title":"IEEE Trans. Netw. Serv. 
Manag."},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"1077","DOI":"10.1109\/TNSM.2020.3036138","article-title":"WIDS: An Anomaly Based Intrusion Detection System for Wi-Fi (IEEE 802.11) Protocol","volume":"18","author":"Satam","year":"2021","journal-title":"IEEE Trans. Netw. Serv. Manag."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/4\/1846\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:26:39Z","timestamp":1760120799000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/4\/1846"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,7]]},"references-count":57,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2023,2]]}},"alternative-id":["s23041846"],"URL":"https:\/\/doi.org\/10.3390\/s23041846","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,7]]}}}