{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,24]],"date-time":"2026-04-24T19:35:05Z","timestamp":1777059305188,"version":"3.51.4"},"reference-count":16,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2025,9,10]],"date-time":"2025-09-10T00:00:00Z","timestamp":1757462400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["JCP"],"abstract":"<jats:p>Accurate malware family classification from dynamic sandbox reports continues to be a fundamental cybersecurity challenge. Most prior works depend on random splits that tend to overestimate accuracy, whereas deployment requires robustness under temporal drift as well as changing behaviors. We present a leakage-aware pipeline that transforms CAPEv2 sandbox JSON reports into structured visual heatmaps and evaluate models under stratified and chronological splits. The pipeline rigorously flattens behavioral keys, builds normalized representations, and benchmarks Random Forest, MLP, CNN64, HybridNet, and a modern ResNeXt-50 backbone. On the Avast\u2013CTU CAPEv2 dataset containing ten malware families, Random Forest achieves nearly state-of-the-art accuracy (97.2% accuracy, 0.993 AUC) with high efficiency on CPUs, making it attractive for triage. ResNeXt-50 achieves the best overall performance (98.4% accuracy, 0.998 AUC) and provides visual interpretability via Grad-CAM, enabling analysts to verify predictions. We further quantify efficiency trade-offs (inference throughput and GPU memory) and report ablation studies on vocabulary size and keyset choices. These results affirm that though ensemble methods are still robust, heatmap-based CNNs provide better accuracy, interpretability, and robustness against drift.<\/jats:p>","DOI":"10.3390\/jcp5030072","type":"journal-article","created":{"date-parts":[[2025,9,10]],"date-time":"2025-09-10T09:32:01Z","timestamp":1757496721000},"page":"72","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Structured Heatmap Learning for Multi-Family Malware Classification: A Deep and Explainable Approach Using CAPEv2"],"prefix":"10.3390","volume":"5","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-5336-7913","authenticated-orcid":false,"given":"Oussama","family":"El Rhayati","sequence":"first","affiliation":[{"name":"Faculty of Sciences Dhar El Mahraz (FSDM), Sidi Mohamed Ben Abdellah University, Fez 30000, Morocco"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hatim","family":"Essadeq","sequence":"additional","affiliation":[{"name":"Faculty of Sciences Dhar El Mahraz (FSDM), Sidi Mohamed Ben Abdellah University, Fez 30000, Morocco"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Omar","family":"El Beqqali","sequence":"additional","affiliation":[{"name":"Faculty of Sciences Dhar El Mahraz (FSDM), Sidi Mohamed Ben Abdellah University, Fez 30000, Morocco"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hamid","family":"Tairi","sequence":"additional","affiliation":[{"name":"Faculty of Sciences Dhar El Mahraz (FSDM), Sidi Mohamed Ben Abdellah University, Fez 30000, Morocco"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mohamed","family":"Lamrini","sequence":"additional","affiliation":[{"name":"Faculty of Sciences Dhar El Mahraz (FSDM), Sidi Mohamed Ben Abdellah University, Fez 30000, Morocco"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jamal","family":"Riffi","sequence":"additional","affiliation":[{"name":"Faculty of Sciences Dhar El Mahraz (FSDM), Sidi Mohamed Ben Abdellah University, Fez 30000, Morocco"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,9,10]]},"reference":[{"key":"ref_1","first-page":"3283","article-title":"A Hybrid Approach for Malware Detection using Machine Learning and Behavior Analysis","volume":"18","author":"Zhang","year":"2023","journal-title":"IEEE Trans. Inf. Forensics Secur."},{"key":"ref_2","first-page":"190","article-title":"Hybrid Malware Detection System Using Deep Autoencoders and Clustering","volume":"29","author":"Gupta","year":"2022","journal-title":"J. Comput. Secur."},{"key":"ref_3","first-page":"567","article-title":"Recurrent Neural Network-based Approach for Sequential API Call Analysis in Malware Detection","volume":"18","author":"Kim","year":"2023","journal-title":"IEEE Trans. Inf. Forensics Secur."},{"key":"ref_4","first-page":"920","article-title":"Extending MalBERT: Improving Transformer-based Models for Malware Family Detection","volume":"68","author":"Thompson","year":"2023","journal-title":"J. Artif. Intell. Res."},{"key":"ref_5","unstructured":"Bosansky, B., Kouba, D., Manhal, O., Sick, T., Lisy, V., Kroustek, J., and Somol, P. (2024). Avast-CTU Public CAPE Dataset. arXiv."},{"key":"ref_6","unstructured":"Raff, E., Barker, J., Sylvester, J., Brandon, R., Catanzaro, C., and McLean, J. (2018). Malware Detection by Eating a Whole EXE. arXiv."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Nataraj, L., Karthikeyan, S., Jacob, G., and Manjunath, B.S. (2011, January 20). Malware Images: Visualization and Automatic Classification. Proceedings of the 8th International Symposium on Visualization for Cyber Security (VizSec), Pittsburgh, PA, USA.","DOI":"10.1145\/2016904.2016908"},{"key":"ref_8","first-page":"207","article-title":"CNN-based Detection of Malware API Patterns from Sequence Images","volume":"16","author":"Sharma","year":"2020","journal-title":"J. Comput. Virol. Hacking Tech."},{"key":"ref_9","first-page":"e18","article-title":"Memory Access Pattern Analysis for Malware Detection Using Deep CNNs","volume":"8","author":"Lopez","year":"2022","journal-title":"J. Cybersecur."},{"key":"ref_10","first-page":"456","article-title":"MalNet: Graph-based Malware Classification Using Structural and Semantic Features","volume":"468","author":"Khoa","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_11","first-page":"1","article-title":"Efficient Cyber Attack Detection in Industrial Control Systems Using Visual Representations","volume":"24","author":"Kravchik","year":"2021","journal-title":"ACM Trans. Priv. Secur."},{"key":"ref_12","unstructured":"Chang, T., Li, W., and Huang, Z. (2021, January 15\u201319). Transformer-based Sequence Modeling for Dynamic Malware Classification. Proceedings of the ACM Conference on Computer and Communications Security, Virtual."},{"key":"ref_13","unstructured":"Wang, X., Zhou, Y., and Liu, J. (2022, January 23\u201326). MalBERT: Using Transformers for Dynamic Malware Family Classification. Proceedings of the IEEE Symposium on Security and Privacy Workshops, San Francisco, CA, USA."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., and Batra, D. (2017, January 22\u201329). Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.74"},{"key":"ref_15","first-page":"144","article-title":"Detecting Crypto Ransomware Using Machine Learning Techniques","volume":"73","author":"Azmoodeh","year":"2018","journal-title":"Comput. Secur."},{"key":"ref_16","first-page":"1042","article-title":"Zero-day Malware Detection Based on Supervised Learning Algorithms of API Call Signatures","volume":"26","author":"Alazab","year":"2020","journal-title":"Adv. Sci. Lett."}],"container-title":["Journal of Cybersecurity and Privacy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2624-800X\/5\/3\/72\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:43:01Z","timestamp":1760035381000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2624-800X\/5\/3\/72"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,10]]},"references-count":16,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2025,9]]}},"alternative-id":["jcp5030072"],"URL":"https:\/\/doi.org\/10.3390\/jcp5030072","relation":{},"ISSN":["2624-800X"],"issn-type":[{"value":"2624-800X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,10]]}}}