{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,5]],"date-time":"2026-05-05T18:15:17Z","timestamp":1778004917770,"version":"3.51.4"},"reference-count":72,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T00:00:00Z","timestamp":1770681600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T00:00:00Z","timestamp":1770681600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001798","name":"Edith Cowan University","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001798","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["New Gener. Comput."],"published-print":{"date-parts":[[2026,5]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Artificial Intelligence, particularly machine learning (ML) algorithms, plays a crucial role in detecting cyberattacks, including anomalies and intrusions. However, machine learning models trained on imbalanced cybersecurity datasets often struggle to accurately detect minority data instances and potential threats, thereby weakening overall system security. Despite extensive research, a persistent challenge is the inadequate explanation for model predictions concerning minority data classes. This study aims to address these limitations by developing a generative AI-based approach to manage minority classes in anomaly detection, incorporating concept drift handling and explainability analysis. We introduce an over-sampling technique, CGGReaT, designed to enhance the presence of minority classes in the anomaly detection domain. 
Leveraging Large Language Models (LLMs) as a hybrid approach, we use pre-trained transformer-based LLM DistilGPT-2 for generating synthetic tabular data. Extensive experiments on two publicly available benchmark datasets, UNSW NB15 and CIC-IDS2017, underscore the efficacy of our proposed approach. We employed concept drift detection and adaptation techniques to maintain reliable and sustainable ML performance. To enhance interpretability, eXplainable Artificial Intelligence (XAI) methods, including SHAP and LIME, are employed to quantify feature contributions to model outputs. Extensive experiments reveal that testing ML algorithms on datasets balanced with synthetic samples generated by cGGReaT boosts the prediction accuracy on the UNSW NB15 and CIC-IDS2017 datasets, compared to classifiers tested on imbalanced datasets.<\/jats:p>","DOI":"10.1007\/s00354-026-00318-8","type":"journal-article","created":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T08:44:26Z","timestamp":1770713066000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["A Generative AI Method for Minority Class Handling in Anomaly Detection with Drift and Explainability Analysis"],"prefix":"10.1007","volume":"44","author":[{"given":"Kelvin J.","family":"Mwiga","sequence":"first","affiliation":[]},{"given":"Mussa A.","family":"Dida","sequence":"additional","affiliation":[]},{"given":"Ahmad","family":"Mohsin","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1740-5517","authenticated-orcid":false,"given":"Iqbal H.","family":"Sarker","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2026,2,10]]},"reference":[{"key":"318_CR1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-54497-2","volume-title":"AI-Driven Cybersecurity and Threat Intelligence","author":"IH Sarker","year":"2024","unstructured":"Sarker, I.H.: AI-Driven Cybersecurity and 
Threat Intelligence. Springer, Berlin (2024)"},{"issue":"4","key":"318_CR2","doi-asserted-by":"publisher","first-page":"1119","DOI":"10.1109\/TSMCB.2012.2187280","volume":"42","author":"S Wang","year":"2012","unstructured":"Wang, S., Yao, X.: Multiclass imbalance problems: Analysis and potential solutions. IEEE Trans. Syst. Man Cybern. B Cybern. 42(4), 1119\u20131130 (2012)","journal-title":"IEEE Trans. Syst. Man Cybern. B Cybern."},{"key":"318_CR3","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1016\/j.ins.2020.01.032","volume":"519","author":"X Tao","year":"2020","unstructured":"Tao, X., Li, Q., Guo, W., Ren, C., He, Q., Liu, R., Zou, J.: Adaptive weighted over-sampling for imbalanced datasets based on density peaks clustering with heuristic filtering. Inf. Sci. 519, 43\u201373 (2020)","journal-title":"Inf. Sci."},{"issue":"1","key":"318_CR4","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/1007730.1007733","volume":"6","author":"NV Chawla","year":"2004","unstructured":"Chawla, N.V., Japkowicz, N., Kotcz, A.: Special issue on learning from imbalanced data sets. ACM SIGKDD Explor. Newsl. 6(1), 1\u20136 (2004)","journal-title":"ACM SIGKDD Explor. Newsl."},{"key":"318_CR5","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1016\/j.knosys.2013.01.018","volume":"42","author":"A Fern\u00e1ndez","year":"2013","unstructured":"Fern\u00e1ndez, A., L\u00f3pez, V., Galar, M., Del Jesus, M.J., Herrera, F.: Analysing the classification of imbalanced data-sets with multiple classes: Binarization techniques and ad-hoc approaches. Knowl.-Based Syst. 42, 97\u2013110 (2013)","journal-title":"Knowl.-Based Syst."},{"issue":"6","key":"318_CR6","doi-asserted-by":"publisher","first-page":"888","DOI":"10.1109\/TNNLS.2013.2246188","volume":"24","author":"CL Castro","year":"2013","unstructured":"Castro, C.L., Braga, A.P.: Novel cost-sensitive approach to improve the multilayer perceptron performance on imbalanced data. IEEE Trans. Neural Netw. Learn. Syst. 
24(6), 888\u2013899 (2013)","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"318_CR7","doi-asserted-by":"publisher","first-page":"51","DOI":"10.1007\/s13042-013-0180-6","volume":"5","author":"A Ghazikhani","year":"2014","unstructured":"Ghazikhani, A., Monsefi, R., Sadoghi Yazdi, H.: Online neural network model for non-stationary and imbalanced data stream classification. Int. J. Mach. Learn. Cybern. 5, 51\u201362 (2014)","journal-title":"Int. J. Mach. Learn. Cybern."},{"issue":"1","key":"318_CR8","first-page":"25","volume":"30","author":"S Kotsiantis","year":"2006","unstructured":"Kotsiantis, S., Kanellopoulos, D., Pintelas, P., et al.: Handling imbalanced datasets: A review. GESTS Int. Trans. Comput. Sci. Eng. 30(1), 25\u201336 (2006)","journal-title":"GESTS Int. Trans. Comput. Sci. Eng."},{"key":"318_CR9","doi-asserted-by":"publisher","first-page":"321","DOI":"10.1613\/jair.953","volume":"16","author":"NV Chawla","year":"2002","unstructured":"Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: Smote: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321\u2013357 (2002)","journal-title":"J. Artif. Intell. Res."},{"key":"318_CR10","doi-asserted-by":"crossref","unstructured":"Han, H., Wang, W.-Y., Mao, B.-H.: Borderline-smote: a new over-sampling method in imbalanced data sets learning. In: International Conference on Intelligent Computing, pp. 878\u2013887 (2005). Springer","DOI":"10.1007\/11538059_91"},{"key":"318_CR11","doi-asserted-by":"crossref","unstructured":"He, H., Bai, Y., Garcia, E.A., Li, S.: Adasyn: Adaptive synthetic sampling approach for imbalanced learning. In: 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), pp. 1322\u20131328 (2008). 
Ieee","DOI":"10.1109\/IJCNN.2008.4633969"},{"issue":"4","key":"318_CR12","first-page":"42","volume":"2","author":"V Ganganwar","year":"2012","unstructured":"Ganganwar, V.: An overview of classification algorithms for imbalanced datasets. Int. J. Emerging Technol. Adv. Eng. 2(4), 42\u201347 (2012)","journal-title":"Int. J. Emerging Technol. Adv. Eng."},{"issue":"9","key":"318_CR13","doi-asserted-by":"publisher","first-page":"1263","DOI":"10.1109\/TKDE.2008.239","volume":"21","author":"H He","year":"2009","unstructured":"He, H., Garcia, E.A.: Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 21(9), 1263\u20131284 (2009)","journal-title":"IEEE Trans. Knowl. Data Eng."},{"issue":"1","key":"318_CR14","doi-asserted-by":"publisher","first-page":"40","DOI":"10.1049\/cps2.12019","volume":"7","author":"R Ahsan","year":"2022","unstructured":"Ahsan, R., Shi, W., Ma, X., Lee Croft, W.: A comparative analysis of cgan-based oversampling for anomaly detection. IET Cyber-Phys. Syst.: Theory Appl. 7(1), 40\u201350 (2022)","journal-title":"IET Cyber-Phys. Syst.: Theory Appl."},{"key":"318_CR15","doi-asserted-by":"publisher","first-page":"148","DOI":"10.1016\/j.neucom.2020.12.001","volume":"431","author":"W Li","year":"2021","unstructured":"Li, W., Xu, L., Liang, Z., Wang, S., Cao, J., Lam, T.C., Cui, X.: Jdgan: Enhancing generator on extremely limited data via joint distribution. Neurocomputing 431, 148\u2013162 (2021)","journal-title":"Neurocomputing"},{"key":"318_CR16","doi-asserted-by":"publisher","first-page":"13635","DOI":"10.1007\/s00521-021-05993-w","volume":"33","author":"G Dlamini","year":"2021","unstructured":"Dlamini, G., Fahim, M.: Dgm: a data generative model to improve minority class presence in anomaly detection domain. Neural Comput. Appl. 33, 13635\u201313646 (2021)","journal-title":"Neural Comput. 
Appl."},{"key":"318_CR17","unstructured":"Borisov, V., Se\u00dfler, K., Leemann, T., Pawelczyk, M., Kasneci, G.: Language models are realistic tabular data generators. arXiv preprint arXiv:2210.06280 (2022)"},{"issue":"1","key":"318_CR18","first-page":"31704","volume":"12","author":"JM Corchado","year":"2023","unstructured":"Corchado, J.M., L\u00f3pez, S., Garcia, R., Chamoso, P., et al.: Generative artificial intelligence: fundamentals. Adv. Distrib. Comput. Artif. Intell. J. 12(1), 31704 (2023)","journal-title":"Adv. Distrib. Comput. Artif. Intell. J."},{"key":"318_CR19","unstructured":"Solaiman, I., Brundage, M., Clark, J., Askell, A., Herbert-Voss, A., Wu, J., Radford, A., Krueger, G., Kim, J.W., Kreps, S., et al.: Release strategies and the social impacts of language models. arXiv preprint arXiv:1908.09203 (2019)"},{"key":"318_CR20","doi-asserted-by":"crossref","unstructured":"Sheikholeslami, S., Ghasemirahni, H., Payberah, A.H., Wang, T., Dowling, J., Vlassov, V.: Utilizing large language models for ablation studies in machine learning and deep learning. In: Proceedings of the 5th Workshop on Machine Learning and Systems, pp. 230\u2013237 (2025)","DOI":"10.1145\/3721146.3721957"},{"key":"318_CR21","doi-asserted-by":"crossref","unstructured":"Bhuiyan, M.H., Alam, K., Shahin, K.I., Farid, D.M.: A deep learning approach for network intrusion classification. In: 2024 IEEE Region 10 Symposium (TENSYMP), pp. 1\u20136 (2024). IEEE","DOI":"10.1109\/TENSYMP61132.2024.10752251"},{"key":"318_CR22","doi-asserted-by":"publisher","first-page":"617","DOI":"10.1016\/j.procs.2024.04.061","volume":"235","author":"R Benni","year":"2024","unstructured":"Benni, R., Totad, S., Mulimani, D., KG, K.: Impact analysis of real and virtual concept drifts on the predictive performance of classifiers. Procedia Comput. Sci. 235, 617\u2013627 (2024)","journal-title":"Procedia Comput. 
Sci."},{"key":"318_CR23","doi-asserted-by":"crossref","unstructured":"Nayak, P.A., Sriganesh, P., Rakshitha, K., Kumar, M.M., BS, P., HR, S.: Concept drift and model decay detection using machine learning algorithm. In: 2021 12th International Conference on Computing Communication and Networking Technologies (ICCCNT), pp. 1\u20138 (2021). IEEE","DOI":"10.1109\/ICCCNT51525.2021.9580110"},{"key":"318_CR24","doi-asserted-by":"crossref","unstructured":"Yang, L., Manias, D.M., Shami, A.: Pwpae: An ensemble framework for concept drift adaptation in iot data streams. In: 2021 Ieee Global Communications Conference (globecom), pp. 01\u201306 (2021). IEEE","DOI":"10.1109\/GLOBECOM46510.2021.9685338"},{"key":"318_CR25","doi-asserted-by":"crossref","unstructured":"Horchulhack, P., Viegas, E.K., Lopez, M.A.: A stream learning intrusion detection system for concept drifting network traffic. In: 2022 6th Cyber Security in Networking Conference (CSNet), pp. 1\u20137 (2022). IEEE","DOI":"10.1109\/CSNet56116.2022.9955620"},{"key":"318_CR26","doi-asserted-by":"crossref","unstructured":"Hasan, M., Rahman, M.S., Janicke, H., Sarker, I.H.: Detecting anomalies in blockchain transactions using machine learning classifiers and explainability analysis. arXiv preprint arXiv:2401.03530 (2024)","DOI":"10.1016\/j.bcra.2024.100207"},{"key":"318_CR27","doi-asserted-by":"publisher","DOI":"10.1016\/j.icte.2024.05.007","author":"IH Sarker","year":"2024","unstructured":"Sarker, I.H., Janicke, H., Mohsin, A., Gill, A., Maglaras, L.: Explainable ai for cybersecurity automation, intelligence and trustworthiness in digital twin: Methods, taxonomy, challenges and prospects. ICT Express (2024). 
https:\/\/doi.org\/10.1016\/j.icte.2024.05.007","journal-title":"ICT Express"},{"key":"318_CR28","doi-asserted-by":"publisher","first-page":"820","DOI":"10.7717\/peerj-cs.820","volume":"8","author":"HA Ahmed","year":"2022","unstructured":"Ahmed, H.A., Hameed, A., Bawany, N.Z.: Network intrusion detection using oversampling technique and machine learning algorithms. PeerJ Comput. Sci. 8, 820 (2022)","journal-title":"PeerJ Comput. Sci."},{"issue":"1","key":"318_CR29","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1145\/1007730.1007735","volume":"6","author":"GE Batista","year":"2004","unstructured":"Batista, G.E., Prati, R.C., Monard, M.C.: A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explor. Newsl 6(1), 20\u201329 (2004)","journal-title":"ACM SIGKDD Explor. Newsl"},{"key":"318_CR30","doi-asserted-by":"publisher","first-page":"73127","DOI":"10.1109\/ACCESS.2020.2988359","volume":"8","author":"M Wang","year":"2020","unstructured":"Wang, M., Zheng, K., Yang, Y., Wang, X.: An explainable machine learning framework for intrusion detection systems. IEEE Access 8, 73127\u201373141 (2020)","journal-title":"IEEE Access"},{"key":"318_CR31","doi-asserted-by":"publisher","DOI":"10.1016\/j.cose.2020.102158","volume":"103","author":"J Gu","year":"2021","unstructured":"Gu, J., Lu, S.: An effective intrusion detection approach using svm with na\u00efve bayes feature embedding. Comput. Secur. 103, 102158 (2021)","journal-title":"Comput. Secur."},{"key":"318_CR32","doi-asserted-by":"crossref","unstructured":"Elmasri, T., Samir, N., Mashaly, M., Atef, Y.: Evaluation of cicids2017 with qualitative comparison of machine learning algorithm. In: 2020 IEEE Cloud Summit, pp. 46\u201351 (2020). 
IEEE","DOI":"10.1109\/IEEECloudSummit48914.2020.00013"},{"issue":"1","key":"318_CR33","doi-asserted-by":"publisher","first-page":"105","DOI":"10.1186\/s40537-020-00379-6","volume":"7","author":"SM Kasongo","year":"2020","unstructured":"Kasongo, S.M., Sun, Y.: Performance analysis of intrusion detection systems using a feature selection method on the unsw-nb15 dataset. J. Big Data 7(1), 105 (2020)","journal-title":"J. Big Data"},{"issue":"1","key":"318_CR34","doi-asserted-by":"publisher","first-page":"6","DOI":"10.1186\/s40537-020-00390-x","volume":"8","author":"S Bagui","year":"2021","unstructured":"Bagui, S., Li, K.: Resampling imbalanced data for network intrusion detection datasets. J. Big Data 8(1), 6 (2021)","journal-title":"J. Big Data"},{"issue":"6","key":"318_CR35","doi-asserted-by":"publisher","first-page":"898","DOI":"10.3390\/electronics11060898","volume":"11","author":"Y Fu","year":"2022","unstructured":"Fu, Y., Du, Y., Cao, Z., Li, Q., Xiang, W.: A deep learning model for network intrusion detection with imbalanced data. Electronics 11(6), 898 (2022)","journal-title":"Electronics"},{"issue":"1","key":"318_CR36","doi-asserted-by":"publisher","first-page":"550","DOI":"10.3390\/s23010550","volume":"23","author":"YN Rao","year":"2023","unstructured":"Rao, Y.N., Suresh Babu, K.: An imbalanced generative adversarial network-based approach for network intrusion detection in an imbalanced dataset. Sensors 23(1), 550 (2023)","journal-title":"Sensors"},{"issue":"15","key":"318_CR37","doi-asserted-by":"publisher","first-page":"3323","DOI":"10.3390\/electronics12153323","volume":"12","author":"H Yang","year":"2023","unstructured":"Yang, H., Xu, J., Xiao, Y., Hu, L.: Spe-acgan: A resampling approach for class imbalance problem in network intrusion detection systems. 
Electronics 12(15), 3323 (2023)","journal-title":"Electronics"},{"key":"318_CR38","unstructured":"Somepalli, G., Goldblum, M., Schwarzschild, A., Bruss, C.B., Goldstein, T.: Saint: Improved neural networks for tabular data via row attention and contrastive pre-training. arXiv preprint arXiv:2106.01342 (2021)"},{"key":"318_CR39","first-page":"28742","volume":"34","author":"J Kossen","year":"2021","unstructured":"Kossen, J., Band, N., Lyle, C., Gomez, A.N., Rainforth, T., Gal, Y.: Self-attention between datapoints: Going beyond individual input-output pairs in deep learning. Adv. Neural. Inf. Process. Syst. 34, 28742\u201328756 (2021)","journal-title":"Adv. Neural. Inf. Process. Syst."},{"key":"318_CR40","doi-asserted-by":"crossref","unstructured":"Yin, P., Neubig, G., Yih, W.-t., Riedel, S.: Tabert: Pretraining for joint understanding of textual and tabular data. arXiv preprint arXiv:2005.08314 (2020)","DOI":"10.18653\/v1\/2020.acl-main.745"},{"key":"318_CR41","doi-asserted-by":"crossref","unstructured":"Padhi, I., Schiff, Y., Melnyk, I., Rigotti, M., Mroueh, Y., Dognin, P., Ross, J., Nair, R., Altman, E.: Tabular transformers for modeling multivariate time series. In: ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3565\u20133569 (2021). IEEE","DOI":"10.1109\/ICASSP39728.2021.9414142"},{"issue":"4","key":"318_CR42","doi-asserted-by":"publisher","first-page":"332","DOI":"10.1016\/j.icte.2020.05.011","volume":"6","author":"MMW Yan","year":"2020","unstructured":"Yan, M.M.W.: Accurate detecting concept drift in evolving data streams. ICT Express 6(4), 332\u2013338 (2020)","journal-title":"ICT Express"},{"key":"318_CR43","doi-asserted-by":"crossref","unstructured":"Andresini, G., Pendlebury, F., Pierazzi, F., Loglisci, C., Appice, A., Cavallaro, L.: Insomnia: Towards concept-drift robustness in network intrusion detection. In: Proceedings of the 14th ACM Workshop on Artificial Intelligence and Security, pp. 
111\u2013122 (2021)","DOI":"10.1145\/3474369.3486864"},{"key":"318_CR44","doi-asserted-by":"crossref","unstructured":"Pesaranghader, A., Viktor, H.L.: Fast hoeffding drift detection method for evolving data streams. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 96\u2013111 (2016). Springer","DOI":"10.1007\/978-3-319-46227-1_7"},{"issue":"6","key":"318_CR45","doi-asserted-by":"publisher","first-page":"3692","DOI":"10.1109\/TII.2021.3108464","volume":"18","author":"H Qiao","year":"2022","unstructured":"Qiao, H., Novikov, B., Blech, J.O.: Concept drift analysis by dynamic residual projection for effectively detecting botnet cyber-attacks in iot scenarios. IEEE Trans. Industr. Inf. 18(6), 3692\u20133701 (2022). https:\/\/doi.org\/10.1109\/TII.2021.3108464","journal-title":"IEEE Trans. Industr. Inf."},{"issue":"12","key":"318_CR46","doi-asserted-by":"publisher","first-page":"3712","DOI":"10.3390\/s24123712","volume":"24","author":"M Zubair","year":"2024","unstructured":"Zubair, M., Janicke, H., Mohsin, A., Maglaras, L., Sarker, I.H.: Automated sensor node malicious activity detection with explainability analysis. Sensors 24(12), 3712 (2024). https:\/\/doi.org\/10.3390\/s24123712","journal-title":"Sensors"},{"issue":"9","key":"318_CR47","doi-asserted-by":"publisher","first-page":"5809","DOI":"10.3390\/app13095809","volume":"13","author":"T-T-H Le","year":"2023","unstructured":"Le, T.-T.-H., Prihatno, A.T., Oktian, Y.E., Kang, H., Kim, H.: Exploring local explanation of practical industrial ai applications: A systematic literature review. Appl. Sci. 13(9), 5809 (2023)","journal-title":"Appl. 
Sci."},{"issue":"19","key":"318_CR48","doi-asserted-by":"publisher","first-page":"3079","DOI":"10.3390\/electronics11193079","volume":"11","author":"S Patil","year":"2022","unstructured":"Patil, S., Varadarajan, V., Mazhar, S.M., Sahibzada, A., Ahmed, N., Sinha, O., Kumar, S., Shaw, K., Kotecha, K.: Explainable artificial intelligence for intrusion detection system. Electronics 11(19), 3079 (2022)","journal-title":"Electronics"},{"issue":"3","key":"318_CR49","doi-asserted-by":"publisher","first-page":"167","DOI":"10.1109\/LNET.2022.3186589","volume":"4","author":"P Barnard","year":"2022","unstructured":"Barnard, P., Marchetti, N., DaSilva, L.A.: Robust network intrusion detection through explainable artificial intelligence (xai). IEEE Netw. Lett. 4(3), 167\u2013171 (2022)","journal-title":"IEEE Netw. Lett."},{"key":"318_CR50","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141, Polosukhin, I.: Attention is all you need. Adv. Neural Inf. Process. Syst. 30, (2017)"},{"key":"318_CR51","doi-asserted-by":"crossref","unstructured":"Moustafa, N., Slay, J.: Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In: 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1\u20136 (2015). IEEE","DOI":"10.1109\/MilCIS.2015.7348942"},{"key":"318_CR52","first-page":"108","volume":"1","author":"I Sharafaldin","year":"2018","unstructured":"Sharafaldin, I., Lashkari, A.H., Ghorbani, A.A., et al.: Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp 1, 108\u2013116 (2018)","journal-title":"ICISSp"},{"key":"318_CR53","doi-asserted-by":"publisher","first-page":"148","DOI":"10.1016\/j.neucom.2013.06.035","volume":"122","author":"G Doquire","year":"2013","unstructured":"Doquire, G., Verleysen, M.: Mutual information-based feature selection for multilabel classification. 
Neurocomputing 122, 148\u2013155 (2013)","journal-title":"Neurocomputing"},{"key":"318_CR54","doi-asserted-by":"publisher","unstructured":"Information Theory and Statistics, pp. 347\u2013408. John Wiley & Sons, Ltd (2005). Chap. 11. https:\/\/doi.org\/10.1002\/047174882X.ch11 . https:\/\/onlinelibrary.wiley.com\/doi\/abs\/10.1002\/047174882X.ch11","DOI":"10.1002\/047174882X.ch11"},{"key":"318_CR55","doi-asserted-by":"publisher","DOI":"10.1201\/9781315140919","volume-title":"Density Estimation for Statistics and Data Analysis","author":"BW Silverman","year":"2018","unstructured":"Silverman, B.W.: Density Estimation for Statistics and Data Analysis. Routledge, ??? (2018)"},{"issue":"10","key":"318_CR56","doi-asserted-by":"publisher","first-page":"10611","DOI":"10.1007\/s11227-023-05073-x","volume":"79","author":"A Abdelkhalek","year":"2023","unstructured":"Abdelkhalek, A., Mashaly, M.: Addressing the class imbalance problem in network intrusion detection systems using data resampling and deep learning. J. Supercomput. 79(10), 10611\u201310644 (2023)","journal-title":"J. Supercomput."},{"key":"318_CR57","doi-asserted-by":"crossref","unstructured":"Mohammed, R., Rawashdeh, J., Abdullah, M.: Machine learning with oversampling and undersampling techniques: overview study and experimental results. In: 2020 11th International Conference on Information and Communication Systems (ICICS), pp. 243\u2013248 (2020). IEEE","DOI":"10.1109\/ICICS49469.2020.239556"},{"key":"318_CR58","unstructured":"Drummond, C., Holte, R.C., et al.: C4. 5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling. In: Workshop on Learning from Imbalanced Datasets II, vol. 
11 (2003)"},{"issue":"3","key":"318_CR59","doi-asserted-by":"publisher","first-page":"2247","DOI":"10.1007\/s40747-021-00638-w","volume":"8","author":"X Yi","year":"2022","unstructured":"Yi, X., Xu, Y., Hu, Q., Krishnamoorthy, S., Li, W., Tang, Z.: Asn-smote: a synthetic minority oversampling method with adaptive qualified synthesizer selection. Complex Intell. Syst. 8(3), 2247\u20132272 (2022)","journal-title":"Complex Intell. Syst."},{"key":"318_CR60","first-page":"1","volume":"14","author":"R Blagus","year":"2013","unstructured":"Blagus, R., Lusa, L.: Smote for high-dimensional class-imbalanced data. BMC Bioinformatics 14, 1\u201316 (2013)","journal-title":"BMC Bioinformatics"},{"key":"318_CR61","unstructured":"Sanh, V.: Distilbert, a distilled version of bert: Smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)"},{"issue":"8","key":"318_CR62","first-page":"9","volume":"1","author":"A Radford","year":"2019","unstructured":"Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., et al.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019)","journal-title":"OpenAI blog"},{"key":"318_CR63","unstructured":"Sennrich, R.: Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909 (2015)"},{"key":"318_CR64","unstructured":"Holtzman, A., Buys, J., Du, L., Forbes, M., Choi, Y.: The curious case of neural text degeneration. arXiv preprint arXiv:1904.09751 (2019)"},{"key":"318_CR65","unstructured":"Keskar, N.S., McCann, B., Varshney, L.R., Xiong, C., Socher, R.: Ctrl: A conditional transformer language model for controllable generation. arXiv preprint arXiv:1909.05858 (2019)"},{"key":"318_CR66","unstructured":"Chakravarti, I.M., Laha, R.G., Roy, J.: Handbook of methods of applied statistics. Wiley Series in Probability and Mathematical Statistics (USA) eng (1967)"},{"key":"318_CR67","unstructured":"Scikit-learn: sklearn.ensemble.ExtraTreesClassifier. 
https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.ensemble.ExtraTreesClassifier.html. Accessed: 2024-08-16"},{"issue":"11","key":"318_CR68","first-page":"738","volume":"11","author":"AA Ibrahim","year":"2020","unstructured":"Ibrahim, A.A., Ridwan, R.L., Muhammed, M.M., Abdulaziz, R.O., Saheed, G.A.: Comparison of the catboost classifier with other machine learning methods. Int. J. Adv. Comput. Sci. Appl. 11(11), 738\u2013748 (2020)","journal-title":"Int. J. Adv. Comput. Sci. Appl."},{"key":"318_CR69","doi-asserted-by":"publisher","DOI":"10.1016\/j.cose.2020.101984","volume":"97","author":"D Jin","year":"2020","unstructured":"Jin, D., Lu, Y., Qin, J., Cheng, Z., Mao, Z.: Swiftids: Real-time intrusion detection system based on lightgbm and parallel intrusion detection mechanism. Comput. Secur. 97, 101984 (2020)","journal-title":"Comput. Secur."},{"key":"318_CR70","first-page":"1342","volume":"16","author":"MN Aziz","year":"2021","unstructured":"Aziz, M.N., Ahmad, T.: Clustering under-sampling data for improving the performance of intrusion detection system. JESTEC 16, 1342\u20131355 (2021)","journal-title":"JESTEC"},{"key":"318_CR71","doi-asserted-by":"publisher","first-page":"578","DOI":"10.1016\/j.procs.2024.03.042","volume":"234","author":"AO Widodo","year":"2024","unstructured":"Widodo, A.O., Setiawan, B., Indraswari, R.: Machine learning-based intrusion detection on multi-class imbalanced dataset using smote. Procedia Comput. Sci. 234, 578\u2013583 (2024)","journal-title":"Procedia Comput. Sci."},{"issue":"2","key":"318_CR72","doi-asserted-by":"publisher","first-page":"217","DOI":"10.1007\/s11416-022-00441-2","volume":"19","author":"S Hariharan","year":"2023","unstructured":"Hariharan, S., Rejimol Robinson, R., Prasad, R.R., Thomas, C., Balakrishnan, N.: Xai for intrusion detection system: comparing explanations based on global and local scope. J. Comput. Virol. Hack. Tech. 19(2), 217\u2013239 (2023)","journal-title":"J. Comput. Virol. Hack. 
Tech."}],"container-title":["New Generation Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00354-026-00318-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00354-026-00318-8","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00354-026-00318-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,5,5]],"date-time":"2026-05-05T18:03:49Z","timestamp":1778004229000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00354-026-00318-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,10]]},"references-count":72,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2026,5]]}},"alternative-id":["318"],"URL":"https:\/\/doi.org\/10.1007\/s00354-026-00318-8","relation":{},"ISSN":["0288-3635","1882-7055"],"issn-type":[{"value":"0288-3635","type":"print"},{"value":"1882-7055","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,10]]},"assertion":[{"value":"18 September 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 January 2026","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 February 2026","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 April 2026","order":5,"name":"change_date","label":"Change Date","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Update","order":6,"name":"change_type","label":"Change Type","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The original 
online version of this article was revised to correct Table 9","order":7,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"13"}}