{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,16]],"date-time":"2026-02-16T10:01:36Z","timestamp":1771236096110,"version":"3.50.1"},"reference-count":59,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2024,12,10]],"date-time":"2024-12-10T00:00:00Z","timestamp":1733788800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>Detecting malware has become extremely important with the increasing exposure of computational systems and mobile devices to online services. However, the rapidly evolving nature of malicious software makes this task particularly challenging. Despite the significant number of machine learning works for malware detection proposed in the last few years, limited interest has been devoted to continual learning approaches, which could allow models to showcase effective performance in challenging and dynamic scenarios while being computationally efficient. Moreover, most of the research works proposed thus far adopt a fully supervised setting, which relies on fully labelled data and appears to be impractical in a rapidly evolving malware landscape. In this paper, we address malware detection from a continual semi-supervised one-class learning perspective, which only requires normal\/benign data and empowers models with a greater degree of flexibility, allowing them to detect multiple malware types with different morphology. Specifically, we assess the effectiveness of two replay strategies on anomaly detection models and analyze their performance in continual learning scenarios with three popular malware detection datasets (CIC-AndMal2017, CIC-MalMem-2022, and CIC-Evasive-PDFMal2022). 
Our evaluation shows that replay-based strategies can achieve competitive performance in terms of continual ROC-AUC with respect to the considered baselines and bring new perspectives and insights on this topic.<\/jats:p>","DOI":"10.3390\/make6040135","type":"journal-article","created":{"date-parts":[[2024,12,10]],"date-time":"2024-12-10T12:22:06Z","timestamp":1733833326000},"page":"2829-2854","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Continual Semi-Supervised Malware Detection"],"prefix":"10.3390","volume":"6","author":[{"given":"Matthew","family":"Chin","sequence":"first","affiliation":[{"name":"Department of Computer Science, American University, Washington, DC 20016, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8366-6059","authenticated-orcid":false,"given":"Roberto","family":"Corizzo","sequence":"additional","affiliation":[{"name":"Department of Computer Science, American University, Washington, DC 20016, USA"}]}],"member":"1968","published-online":{"date-parts":[[2024,12,10]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"102860","DOI":"10.1016\/j.cose.2022.102860","article-title":"MFMCNS: A multi-feature and multi-classifier network-based system for ransomworm detection","volume":"121","author":"Almashhadani","year":"2022","journal-title":"Comput. Secur."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"617","DOI":"10.1109\/TCYB.2022.3164625","article-title":"Cyber code intelligence for android malware detection","volume":"53","author":"Qiu","year":"2022","journal-title":"IEEE Trans. Cybern."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"118705","DOI":"10.1016\/j.eswa.2022.118705","article-title":"Android malware detection based on multi-head squeeze-and-excitation residual network","volume":"212","author":"Zhu","year":"2023","journal-title":"Expert Syst. 
Appl."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Hessel, M., Modayil, J., Van Hasselt, H., Schaul, T., Ostrovski, G., Dabney, W., Horgan, D., Piot, B., Azar, M., and Silver, D. (2018, January 2\u20137). Rainbow: Combining improvements in deep reinforcement learning. Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA.","DOI":"10.1609\/aaai.v32i1.11796"},{"key":"ref_5","unstructured":"Gu, S., Lillicrap, T., Sutskever, I., and Levine, S. (2016, January 20\u201322). Continuous deep q-learning with model-based acceleration. Proceedings of the International Conference on Machine Learning, PMLR, New York, NY, USA."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"639","DOI":"10.1613\/jair.1.14819","article-title":"Actor prioritized experience replay","volume":"78","author":"Saglam","year":"2023","journal-title":"J. Artif. Intell. Res."},{"key":"ref_7","unstructured":"Andrychowicz, M., Wolski, F., Ray, A., Schneider, J., Fong, R., Welinder, P., McGrew, B., Tobin, J., Pieter Abbeel, O., and Zaremba, W. (2017). Hindsight experience replay. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1016\/j.neunet.2019.01.012","article-title":"Continual lifelong learning with neural networks: A review","volume":"113","author":"Parisi","year":"2019","journal-title":"Neural Netw."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"120495","DOI":"10.1016\/j.eswa.2023.120495","article-title":"Reinforcement learning algorithms: A brief survey","volume":"231","author":"Shakya","year":"2023","journal-title":"Expert Syst. Appl."},{"key":"ref_10","first-page":"100546","article-title":"A survey of malware detection using deep learning","volume":"16","author":"Bensaoud","year":"2024","journal-title":"Mach. Learn. Appl."},{"key":"ref_11","unstructured":"Zhu, D., Jin, H., Yang, Y., Wu, D., and Chen, W. (2017, January 3\u20136). 
DeepFlow: Deep learning-based malware detection by mining Android application for abnormal usage of sensitive data. Proceedings of the 2017 IEEE Symposium on Computers and Communications (ISCC), Heraklion, Greece."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"88","DOI":"10.1016\/j.future.2018.03.007","article-title":"A deep recurrent neural network based approach for internet of things malware threat hunting","volume":"85","author":"HaddadPajouh","year":"2018","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_13","first-page":"2390","article-title":"Understanding illicit UI in iOS apps through hidden UI analysis","volume":"18","author":"Lee","year":"2019","journal-title":"IEEE Trans. Dependable Secur. Comput."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"102490","DOI":"10.1016\/j.cose.2021.102490","article-title":"Ransomware: Recent advances, analysis, challenges and future research directions","volume":"111","author":"Beaman","year":"2021","journal-title":"Comput. Secur."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"102659","DOI":"10.1016\/j.cose.2022.102659","article-title":"FeSA: Feature selection architecture for ransomware detection under concept drift","volume":"116","author":"Fernando","year":"2022","journal-title":"Comput. Secur."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Liu, M., Yang, Q., Wang, W., and Liu, S. (2024). Semi-Supervised Encrypted Malicious Traffic Detection Based on Multimodal Traffic Characteristics. Sensors, 24.","DOI":"10.3390\/s24206507"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Memon, M., Unar, A.A., Ahmed, S.S., Daudpoto, G.H., and Jaffari, R. (2023). Feature-based semi-supervised learning approach to android malware detection. Eng. Proc., 32.","DOI":"10.3390\/engproc2023032006"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Yu, X., Lin, G., Hu, X., Keung, J.W., and Xia, X. (Acm Trans. Softw. Eng. Methodol., 2024). 
Less is More: Unlocking Semi-Supervised Deep Learning for Vulnerability Detection, Acm Trans. Softw. Eng. Methodol.","DOI":"10.1145\/3699602"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Eren, M.E., Alexandrov, B.S., and Nicholas, C. (2024). Classifying Malware Using Tensor Decomposition. Malware: Handbook of Prevention and Detection, Springer.","DOI":"10.1007\/978-3-031-66245-4_1"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3624567","article-title":"Semi-supervised classification of malware families under extreme class imbalance via hierarchical non-negative matrix factorization with automatic model selection","volume":"26","author":"Eren","year":"2023","journal-title":"ACM Trans. Priv. Secur."},{"key":"ref_21","unstructured":"Van de Ven, G.M., and Tolias, A.S. (2019). Three scenarios for continual learning. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"De Lange, M., and Tuytelaars, T. (2021, January 10\u201317). Continual Prototype Evolution: Learning Online From Non-Stationary Data Streams. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00814"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Cossu, A., Graffieti, G., Pellegrini, L., Maltoni, D., Bacciu, D., Carta, A., and Lomonaco, V. (2022). Is Class-Incremental Enough for Continual Learning? arXiv.","DOI":"10.3389\/frai.2022.829842"},{"key":"ref_24","first-page":"32854","article-title":"AnoShift: A distribution shift benchmark for unsupervised anomaly detection","volume":"35","author":"Dragoi","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Sharif Razavian, A., Azizpour, H., Sullivan, J., and Carlsson, S. (2014, January 23\u201328). CNN features off-the-shelf: An astounding baseline for recognition. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Columbus, OH, USA.","DOI":"10.1109\/CVPRW.2014.131"},{"key":"ref_26","unstructured":"Lomonaco, V., and Maltoni, D. (2017, January 13\u201315). Core50: A new dataset and benchmark for continuous object recognition. Proceedings of the Conference on Robot Learning\u2014PMLR, Mountain View, CA, USA."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"3521","DOI":"10.1073\/pnas.1611835114","article-title":"Overcoming catastrophic forgetting in neural networks","volume":"114","author":"Kirkpatrick","year":"2017","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"2935","DOI":"10.1109\/TPAMI.2017.2773081","article-title":"Learning without Forgetting","volume":"40","author":"Li","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_29","unstructured":"Diethe, T., Borchert, T., Thereska, E., Balle, B., and Lawrence, N. (2018). Continual learning in practice. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"4445","DOI":"10.1007\/s10994-023-06501-y","article-title":"Distributed and explainable GHSOM for anomaly detection in sensor networks","volume":"113","author":"Mignone","year":"2024","journal-title":"Mach. Learn."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Mallya, A., and Lazebnik, S. (2017). PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning. arXiv.","DOI":"10.1109\/CVPR.2018.00810"},{"key":"ref_32","first-page":"10734","article-title":"Forget-free Continual Learning with Winning Subnetworks","volume":"162","author":"Kang","year":"2022","journal-title":"ICML"},{"key":"ref_33","unstructured":"Pietro\u0144, M., \u017burek, D., Faber, K., and Corizzo, R. (October, January 30). Ada-QPacknet\u2013adaptive pruning with bit width reduction as an efficient continual learning method without forgetting. 
Proceedings of the European Conference on Artificial Intelligence (ECAI), Krakow, Poland."},{"key":"ref_34","first-page":"28530","article-title":"Retrospective adversarial replay for continual learning","volume":"35","author":"Kumari","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"4069","DOI":"10.1038\/s41467-020-17866-2","article-title":"Brain-inspired replay for continual learning with artificial neural networks","volume":"11","author":"Siegelmann","year":"2020","journal-title":"Nat. Commun."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"127204","DOI":"10.1016\/j.neucom.2023.127204","article-title":"AdaER: An adaptive experience replay approach for continual lifelong learning","volume":"572","author":"Li","year":"2024","journal-title":"Neurocomputing"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"7135","DOI":"10.1007\/s10489-024-05488-w","article-title":"Uncertainty-aware enhanced dark experience replay for continual learning","volume":"54","author":"Wang","year":"2024","journal-title":"Appl. Intell."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"41801","DOI":"10.1109\/ACCESS.2024.3378606","article-title":"MixER: Mixup-Based Experience Replay for Online Class-Incremental Learning","volume":"12","author":"Lim","year":"2024","journal-title":"IEEE Access"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Buzzega, P., Boschini, M., Porrello, A., and Calderara, S. (2021, January 10\u201315). Rethinking experience replay: A bag of tricks for continual learning. Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy.","DOI":"10.1109\/ICPR48806.2021.9412614"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Faber, K., Sniezynski, B., and Corizzo, R. (2023, January 15\u201318). Distributed Continual Intrusion Detection: A Collaborative Replay Framework. 
Proceedings of the 2023 IEEE International Conference on Big Data (BigData), Sorrento, Italy.","DOI":"10.1109\/BigData59044.2023.10386211"},{"key":"ref_41","unstructured":"Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R. (2017). Continual Learning with Deep Generative Replay. Proceedings of the NeurIPS, Curran Associates, Inc."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1016\/j.neunet.2023.05.032","article-title":"VLAD: Task-agnostic VAE-based lifelong anomaly detection","volume":"165","author":"Faber","year":"2023","journal-title":"Neural Netw."},{"key":"ref_43","unstructured":"Rahman, M.S., Coull, S., and Wright, M. (2022). On the Limitations of Continual Learning for Malware Classification. arXiv."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Faber, K., Corizzo, R., Sniezynski, B., and Japkowicz, N. (2023). Lifelong Learning for Anomaly Detection: New Challenges, Perspectives, and Insights. arXiv.","DOI":"10.2139\/ssrn.4374293"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1016\/j.neunet.2020.05.011","article-title":"Progressive learning: A deep learning framework for continual learning","volume":"128","author":"Fayek","year":"2020","journal-title":"Neural Netw."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"8137","DOI":"10.1007\/s10994-024-06524-z","article-title":"From MNIST to ImageNet and back: Benchmarking continual curriculum learning","volume":"113","author":"Faber","year":"2024","journal-title":"Mach. Learn."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"12864","DOI":"10.1109\/TNNLS.2023.3265331","article-title":"Cyclical curriculum learning","volume":"35","author":"Kesgin","year":"2023","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Carrier, T. (2022, January 11\u201313). 
Detecting obfuscated malware using memory feature engineering. Proceedings of the ICISSP, Virtual.","DOI":"10.5220\/0010908200003120"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Issakhani, M., Victor, P., Tekeoglu, A., and Lashkari, A.H. (2022, January 9\u201311). PDF Malware Detection based on Stacking Learning. Proceedings of the ICISSP, Virtual.","DOI":"10.5220\/0010908400003120"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Lashkari, A.H., Kadir, A.F.A., Taheri, L., and Ghorbani, A.A. (2018, January 22\u201325). Toward developing a systematic approach to generate benchmark android malware datasets and classification. Proceedings of the 2018 International Carnahan Conference on Security Technology (ICCST), Madrid, Spain.","DOI":"10.1109\/CCST.2018.8585560"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Breunig, M.M., Kriegel, H.P., Ng, R.T., and Sander, J. (2000, January 15\u201318). LOF: Identifying density-based local outliers. Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA.","DOI":"10.1145\/342009.335388"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Liu, F.T., Ting, K.M., and Zhou, Z.H. (2008, January 15\u201319). Isolation forest. Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy.","DOI":"10.1109\/ICDM.2008.17"},{"key":"ref_53","unstructured":"Sch\u00f6lkopf, B., Williamson, R.C., Smola, A.J., Shawe-Taylor, J., and Platt, J.C. (2000). Support vector method for novelty detection. Advances in Neural Information Processing Systems, MIT Press."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Li, Z., Zhao, Y., Botta, N., Ionescu, C., and Hu, X. (2020, January 17\u201320). COPOD: Copula-Based Outlier Detection. 
Proceedings of the 2020 IEEE International Conference on Data Mining (ICDM), Sorrento, Italy.","DOI":"10.1109\/ICDM50108.2020.00135"},{"key":"ref_55","unstructured":"Goldstein, M., and Dengel, A. (2012, January 24\u201327). Histogram-based Outlier Score (HBOS): A fast Unsupervised Anomaly Detection Algorithm. Proceedings of the KI-2012: Poster and Demo Track, Saarbr\u00fccken, Germany."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Kriegel, H., Schubert, M., and Zimek, A. (2008, January 24\u201327). Angle-based outlier detection in high-dimensional data. Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, NV, USA.","DOI":"10.1145\/1401890.1401946"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Gehring, J., Miao, Y., Metze, F., and Waibel, A. (2013, January 26\u201331). Extracting deep bottleneck features using stacked auto-encoders. Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada.","DOI":"10.1109\/ICASSP.2013.6638284"},{"key":"ref_58","unstructured":"D\u00edaz-Rodr\u00edguez, N., Lomonaco, V., Filliat, D., and Maltoni, D. (2018). Don\u2019t forget, there is more than forgetting: New metrics for Continual Learning. arXiv."},{"key":"ref_59","first-page":"1","article-title":"PyOD: A Python Toolbox for Scalable Outlier Detection","volume":"20","author":"Zhao","year":"2019","journal-title":"J. Mach. Learn. 
Res."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/6\/4\/135\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:51:34Z","timestamp":1760115094000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/6\/4\/135"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,10]]},"references-count":59,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["make6040135"],"URL":"https:\/\/doi.org\/10.3390\/make6040135","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,10]]}}}