{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,22]],"date-time":"2025-11-22T11:34:33Z","timestamp":1763811273882,"version":"build-2065373602"},"reference-count":26,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2023,11,21]],"date-time":"2023-11-21T00:00:00Z","timestamp":1700524800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"CEEPUS network","award":["CIII-HR-0108","KK.01.1.1.01.0009","2021-1-HR01-KA220-HED-000031177","uniri-mladi-technic-22-61","uniri-tehnic-18-275-1447"],"award-info":[{"award-number":["CIII-HR-0108","KK.01.1.1.01.0009","2021-1-HR01-KA220-HED-000031177","uniri-mladi-technic-22-61","uniri-tehnic-18-275-1447"]}]},{"name":"European Regional Development","award":["CIII-HR-0108","KK.01.1.1.01.0009","2021-1-HR01-KA220-HED-000031177","uniri-mladi-technic-22-61","uniri-tehnic-18-275-1447"],"award-info":[{"award-number":["CIII-HR-0108","KK.01.1.1.01.0009","2021-1-HR01-KA220-HED-000031177","uniri-mladi-technic-22-61","uniri-tehnic-18-275-1447"]}]},{"name":"Erasmus+ project WICT","award":["CIII-HR-0108","KK.01.1.1.01.0009","2021-1-HR01-KA220-HED-000031177","uniri-mladi-technic-22-61","uniri-tehnic-18-275-1447"],"award-info":[{"award-number":["CIII-HR-0108","KK.01.1.1.01.0009","2021-1-HR01-KA220-HED-000031177","uniri-mladi-technic-22-61","uniri-tehnic-18-275-1447"]}]},{"name":"University of Rijeka Scientific","award":["CIII-HR-0108","KK.01.1.1.01.0009","2021-1-HR01-KA220-HED-000031177","uniri-mladi-technic-22-61","uniri-tehnic-18-275-1447"],"award-info":[{"award-number":["CIII-HR-0108","KK.01.1.1.01.0009","2021-1-HR01-KA220-HED-000031177","uniri-mladi-technic-22-61","uniri-tehnic-18-275-1447"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computers"],"abstract":"<jats:p>Malware detection using hybrid features, combining binary and hexadecimal analysis 
with DLL calls, is crucial for leveraging the strengths of both static and dynamic analysis methods. Artificial intelligence (AI) enhances this process by enabling automated pattern recognition, anomaly detection, and continuous learning, allowing security systems to adapt to evolving threats and identify complex, polymorphic malware that may exhibit varied behaviors. This synergy of hybrid features with AI empowers malware detection systems to efficiently and proactively identify and respond to sophisticated cyber threats in real time. In this paper, the genetic programming symbolic classifier (GPSC) algorithm was applied to a publicly available dataset to obtain symbolic expressions (SEs) that detect malware with high classification performance. The dataset was initially highly imbalanced between classes, so various oversampling techniques were used to obtain balanced dataset variations on which GPSC was applied. To find the optimal combination of GPSC hyperparameter values, the random hyperparameter value search method (RHVS) was developed and applied to obtain SEs with high classification accuracy. The GPSC was trained with five-fold cross-validation (5FCV) to obtain a robust set of SEs on each dataset variation. To choose the best SEs, several evaluation metrics were used, i.e., the length and depth of SEs, accuracy score (ACC), area under the receiver operating characteristic curve (AUC), precision, recall, F1-score, and the confusion matrix. The best SEs were then applied to the original imbalanced dataset to check whether the classification performance matched that obtained on the balanced dataset variations. 
The results of the investigation showed that the proposed method generated SEs with high classification accuracy (0.9962) in malware software detection.<\/jats:p>","DOI":"10.3390\/computers12120242","type":"journal-article","created":{"date-parts":[[2023,11,21]],"date-time":"2023-11-21T01:48:45Z","timestamp":1700531325000},"page":"242","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Improvement of Malicious Software Detection Accuracy through Genetic Programming Symbolic Classifier with Application of Dataset Oversampling Techniques"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0314-243X","authenticated-orcid":false,"given":"Nikola","family":"An\u0111eli\u0107","sequence":"first","affiliation":[{"name":"Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3015-1024","authenticated-orcid":false,"given":"Sandi","family":"Baressi \u0160egota","sequence":"additional","affiliation":[{"name":"Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2817-9252","authenticated-orcid":false,"given":"Zlatan","family":"Car","sequence":"additional","affiliation":[{"name":"Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,11,21]]},"reference":[{"key":"ref_1","first-page":"8176","article-title":"A deeper look into cybersecurity issues in the wake of COVID-19: A survey","volume":"34","author":"Alawida","year":"2022","journal-title":"J. King Saud Univ.-Comput. Inf. 
Sci."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1333","DOI":"10.3390\/electronics12061333","article-title":"A comprehensive review of cyber security vulnerabilities, threats, attacks, and solutions","volume":"12","author":"Aslan","year":"2023","journal-title":"Electronics"},{"key":"ref_3","unstructured":"Broadhurst, R. (2017). The Oxford Handbook of Cyber Security, Oxford Handbooks Press."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Li, B., Zhao, Q., Jiao, S., and Liu, X. (2023, January 2\u20136). DroidPerf: Profiling Memory Objects on Android Devices. Proceedings of the 29th Annual International Conference on Mobile Computing and Networking, Madrid, Spain.","DOI":"10.1145\/3570361.3592503"},{"key":"ref_5","first-page":"930","article-title":"Techniques in detection and analyzing malware executables: A review","volume":"3","author":"Jain","year":"2014","journal-title":"Int. J. Comput. Sci. Mob. Comput."},{"key":"ref_6","unstructured":"Monnappa, K. (2018). Learning Malware Analysis: Explore the Concepts, Tools, and Techniques to Analyze and Investigate Windows Malware, Packt Publishing Ltd."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"022097","DOI":"10.1088\/1742-6596\/1529\/2\/022097","article-title":"Malware Behaviour Analysis and Classification via Windows DLL and System Call","volume":"1529","author":"Rauf","year":"2020","journal-title":"J. Phys. Conf. Ser."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Narayanan, B.N., Djaneye-Boundjou, O., and Kebede, T.M. (2016, January 25\u201329). Performance analysis of machine learning and pattern recognition algorithms for malware classification. 
Proceedings of the 2016 IEEE National Aerospace and Electronics Conference (NAECON) and Ohio Innovation Summit (OIS), Dayton, OH, USA.","DOI":"10.1109\/NAECON.2016.7856826"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1007\/s11416-016-0274-2","article-title":"Structural analysis of binary executable headers for malware detection optimization","volume":"13","author":"David","year":"2017","journal-title":"J. Comput. Virol. Hacking Tech."},{"key":"ref_10","unstructured":"Shaid, S.Z.M., and Maarof, M.A. (2015, January 21\u201323). In memory detection of Windows API call hooking technique. Proceedings of the 2015 International Conference on Computer, Communications, and Control Technology (I4CT), Kuching, Malaysia."},{"key":"ref_11","unstructured":"Rathore, H., Agarwal, S., Sahay, S.K., and Sewak, M. (2018). Big Data Analytics, Proceedings of the 6th International Conference, BDA 2018, Warangal, India, 18\u201321 December 2018, Springer. Proceedings 6."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"5183","DOI":"10.1007\/s00521-020-05309-4","article-title":"MLDroid\u2014Framework for Android malware detection using machine learning techniques","volume":"33","author":"Mahindru","year":"2021","journal-title":"Neural Comput. Appl."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"46717","DOI":"10.1109\/ACCESS.2019.2906934","article-title":"Robust intelligent malware detection using deep learning","volume":"7","author":"Vinayakumar","year":"2019","journal-title":"IEEE Access"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Xu, Z., Ray, S., Subramanyan, P., and Malik, S. (2017, January 27\u201331). Malware detection using machine learning based analysis of virtual memory access patterns. Proceedings of the Design, Automation & Test in Europe Conference & Exhibition (DATE), Lausanne, Switzerland.","DOI":"10.23919\/DATE.2017.7926977"},{"key":"ref_15","unstructured":"Piyush AnastaRumao (2016). 
Using Two Dimensional Hybrid Feature Dataset to Detect Malicious Executables. Int. J. Innov. Res. Comp. Com. Eng., 4."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"863","DOI":"10.1613\/jair.1.11192","article-title":"SMOTE for learning from imbalanced data: Progress and challenges, marking the 15-year anniversary","volume":"61","author":"Garcia","year":"2018","journal-title":"J. Artif. Intell. Res."},{"key":"ref_17","unstructured":"He, H., Bai, Y., Garcia, E.A., and Li, S. (2008, January 1\u20138). ADASYN: Adaptive synthetic sampling approach for imbalanced learning. Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), Hong Kong, China."},{"key":"ref_18","unstructured":"Han, H., Wang, W.Y., and Mao, B.H. (2005). Advances in Intelligent Computing, Proceedings of the International Conference on Intelligent Computing, Hefei, China, 23\u201326 August 2005, Springer."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"6618841","DOI":"10.1155\/2021\/6618841","article-title":"Research on credit card default prediction based on k-means SMOTE and BP neural network","volume":"2021","author":"Chen","year":"2021","journal-title":"Complexity"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"67","DOI":"10.15294\/jaist.v3i2.57061","article-title":"Multilayer Perceptron Optimization on Imbalanced Data Using SVM-SMOTE and One-Hot Encoding for Credit Card Default Prediction","volume":"3","author":"Almajid","year":"2021","journal-title":"J. Adv. Inf. Syst. Technol."},{"key":"ref_21","unstructured":"Poli, R., Langdon, W.B., and McPhee, N.F. (2008). A Field Guide to Genetic Programming, Lulu.com."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"An\u0111eli\u0107, N., and Baressi \u0160egota, S. (2023). Development of Symbolic Expressions Ensemble for Breast Cancer Type Classification Using Genetic Programming Symbolic Classifier and Decision Tree Classifier. 
Cancers, 15.","DOI":"10.3390\/cancers15133411"},{"key":"ref_23","unstructured":"Sokolova, M., Japkowicz, N., and Szpakowicz, S. (2006). Advances in Artificial Intelligence, Proceedings of the Australasian Joint Conference on Artificial Intelligence, Hobart, Australia, 4\u20138 December 2006, Springer."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"190","DOI":"10.1177\/0272989X8900900307","article-title":"Analyzing a portion of the ROC curve","volume":"9","author":"McClish","year":"1989","journal-title":"Med. Decis. Mak."},{"key":"ref_25","unstructured":"Goutte, C., and Gaussier, E. (2005). Advances in Information Retrieval, Proceedings of the European Conference on Information Retrieval, Santiago de Compostela, Spain, 21\u201323 March 2005, Springer."},{"key":"ref_26","unstructured":"Susmaga, R. (2004). Intelligent Information Processing and Web Mining, Proceedings of the International IIS: IIPWM \u201804 Conference, Zakopane, Poland, 17\u201320 May 2004, Springer."}],"container-title":["Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-431X\/12\/12\/242\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:25:43Z","timestamp":1760131543000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-431X\/12\/12\/242"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,21]]},"references-count":26,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["computers12120242"],"URL":"https:\/\/doi.org\/10.3390\/computers12120242","relation":{},"ISSN":["2073-431X"],"issn-type":[{"type":"electronic","value":"2073-431X"}],"subject":[],"published":{"date-parts":[[2023,11,21]]}}}