{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T12:44:19Z","timestamp":1774529059349,"version":"3.50.1"},"reference-count":49,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2023,1,20]],"date-time":"2023-01-20T00:00:00Z","timestamp":1674172800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100003407","name":"Minister for Education, University and Research","doi-asserted-by":"publisher","award":["Law 232\/216"],"award-info":[{"award-number":["Law 232\/216"]}],"id":[{"id":"10.13039\/501100003407","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Convolutional Neural Networks (CNNs) have received a large share of research in mammography image analysis due to their capability of extracting hierarchical features directly from raw data. Recently, Vision Transformers have emerged as a viable alternative to CNNs in medical imaging, in some cases performing on par with or better than their convolutional counterparts. In this work, we conduct an extensive experimental study comparing the most recent CNN and Vision Transformer architectures for whole-mammogram classification. We selected, trained and tested 33 different models, 19 convolutional- and 14 transformer-based, on the largest publicly available mammography image database, OMI-DB. We also analyzed performance at eight different image resolutions and for each individual lesion category in isolation (masses, calcifications, focal asymmetries, architectural distortions). 
Our findings confirm the potential of Vision Transformers, which performed on par with traditional CNNs such as ResNet, while also showing the superiority of modern convolutional networks such as EfficientNet.<\/jats:p>","DOI":"10.3390\/s23031229","type":"journal-article","created":{"date-parts":[[2023,1,23]],"date-time":"2023-01-23T01:36:26Z","timestamp":1674437786000},"page":"1229","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":36,"title":["Convolutional Networks and Transformers for Mammography Classification: An Experimental Study"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6225-6680","authenticated-orcid":false,"given":"Marco","family":"Cantone","sequence":"first","affiliation":[{"name":"Department of Electrical and Information Engineering, University of Cassino and Southern Latium, 03043 Cassino, FR, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0840-7350","authenticated-orcid":false,"given":"Claudio","family":"Marrocco","sequence":"additional","affiliation":[{"name":"Department of Electrical and Information Engineering, University of Cassino and Southern Latium, 03043 Cassino, FR, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5033-9323","authenticated-orcid":false,"given":"Francesco","family":"Tortorella","sequence":"additional","affiliation":[{"name":"Department of Information and Electrical Engineering and Applied Mathematics, University of Salerno, 84084 Fisciano, SA, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2895-6544","authenticated-orcid":false,"given":"Alessandro","family":"Bria","sequence":"additional","affiliation":[{"name":"Department of Electrical and Information Engineering, University of Cassino and Southern Latium, 03043 Cassino, FR, 
Italy"}]}],"member":"1968","published-online":{"date-parts":[[2023,1,20]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1183","DOI":"10.1002\/cac2.12207","article-title":"Global patterns of breast cancer incidence and mortality: A population-based cancer registry data analysis from 2000 to 2020","volume":"41","author":"Lei","year":"2021","journal-title":"Cancer Commun."},{"key":"ref_2","first-page":"1","article-title":"Breast cancer","volume":"5","author":"Harbeck","year":"2019","journal-title":"Nat. Rev. Dis. Prim."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"524","DOI":"10.3322\/caac.21754","article-title":"Breast Cancer Statistics, 2022","volume":"72","author":"Giaquinto","year":"2022","journal-title":"CA Cancer J. Clin."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1016\/j.media.2018.03.006","article-title":"Deep learning in mammography and breast histology, an overview and future trends","volume":"47","author":"Hamidinekoo","year":"2018","journal-title":"Med. Image Anal."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1016\/j.media.2017.07.005","article-title":"A survey on deep learning in medical image analysis","volume":"42","author":"Litjens","year":"2017","journal-title":"Med. Image Anal."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Liu, Z., Mao, H., Wu, C.Y., Feichtenhofer, C., Darrell, T., and Xie, S. (2022, January 18\u201324). A convnet for the 2020s. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01167"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"2278","DOI":"10.1109\/5.726791","article-title":"Gradient-based learning applied to document recognition","volume":"86","author":"LeCun","year":"1998","journal-title":"Proc. 
IEEE"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s12065-020-00540-3","article-title":"Convolutional neural networks in medical image understanding: A survey","volume":"15","author":"Sarvamangala","year":"2022","journal-title":"Evol. Intell."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Abdelhafiz, D., Nabavi, S., Ammar, R., and Yang, C. (2017, January 19\u201321). Survey on deep convolutional neural networks in mammography. Proceedings of the 2017 IEEE 7th International Conference on Computational Advances in Bio and Medical Sciences (ICCABS), Orlando, FL, USA.","DOI":"10.1109\/ICCABS.2017.8114310"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Casal-Guisande, M., Comesa\u00f1a-Campos, A., Dutra, I., Cerqueiro-Peque\u00f1o, J., and Bouza-Rodr\u00edguez, J.B. (2022). Design and Development of an Intelligent Clinical Decision Support System Applied to the Evaluation of Breast Cancer Risk. J. Pers. Med., 12.","DOI":"10.3390\/jpm12020169"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1184","DOI":"10.1109\/TMI.2019.2945514","article-title":"Deep neural networks improve radiologists\u2019 performance in breast cancer screening","volume":"39","author":"Wu","year":"2019","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Abdelhafiz, D., Yang, C., Ammar, R., and Nabavi, S. (2019). Deep convolutional neural networks for mammography: Advances, challenges and applications. BMC Bioinform., 20.","DOI":"10.1186\/s12859-019-2823-4"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Abdelhafiz, D., Bi, J., Ammar, R., Yang, C., and Nabavi, S. (2020). Convolutional neural network for automated mass segmentation in mammography. BMC Bioinform., 21.","DOI":"10.1186\/s12859-020-3521-y"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Castro, E., Cardoso, J.S., and Pereira, J.C. (2018, January 4\u20137). 
Elastic deformations for data augmentation in breast cancer mass detection. Proceedings of the 2018 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI), Las Vegas, NV, USA.","DOI":"10.1109\/BHI.2018.8333411"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"12495","DOI":"10.1038\/s41598-019-48995-4","article-title":"Deep learning to improve breast cancer detection on screening mammography","volume":"9","author":"Shen","year":"2019","journal-title":"Sci. Rep."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"103774","DOI":"10.1016\/j.compbiomed.2020.103774","article-title":"Deep learning for mass detection in full field digital mammograms","volume":"121","author":"Agarwal","year":"2020","journal-title":"Comput. Biol. Med."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"103735","DOI":"10.1016\/j.compbiomed.2020.103735","article-title":"Addressing class imbalance in deep learning for small lesion detection on medical images","volume":"120","author":"Bria","year":"2020","journal-title":"Comput. Biol. Med."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Dhungel, N., Carneiro, G., and Bradley, A.P. (2017, January 18\u201321). Fully automated classification of mammograms using deep residual neural networks. Proceedings of the 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017), Melbourne, VIC, Australia.","DOI":"10.1109\/ISBI.2017.7950526"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1159","DOI":"10.1007\/s11517-021-02497-6","article-title":"A real use case of semi-supervised learning for mammogram classification in a local clinic of Costa Rica","volume":"60","author":"Elizondo","year":"2022","journal-title":"Med. Biol. Eng. Comput."},{"key":"ref_20","unstructured":"L\u00e9vy, D., and Jain, A. (2016). Breast mass classification from mammograms using deep convolutional neural networks. 
arXiv."},{"key":"ref_21","unstructured":"Heath, M., Bowyer, K., Kopans, D., Kegelmeyer, P., Moore, R., Chang, K., and Munishkumaran, S. (1998). Digital Mammography, Springer."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1016\/j.acra.2011.09.014","article-title":"Inbreast: Toward a full-field digital mammographic database","volume":"19","author":"Moreira","year":"2012","journal-title":"Acad. Radiol."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Tsochatzidis, L., Costaridou, L., and Pratikakis, I. (2019). Deep learning for breast cancer diagnosis from mammograms\u2014A comparative study. J. Imaging, 5.","DOI":"10.3390\/jimaging5030037"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"170177","DOI":"10.1038\/sdata.2017.177","article-title":"A curated mammography data set for use in computer-aided detection and diagnosis research","volume":"4","author":"Lee","year":"2017","journal-title":"Sci. Data"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"Imagenet large scale visual recognition challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_27","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16 \u00d7 16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_28","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017). Attention is all you need. Adv. 
Neural Inf. Process. Syst., 30."},{"key":"ref_29","unstructured":"Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., and J\u00e9gou, H. (2021, January 18\u201324). Training data-efficient image transformers & distillation through attention. Proceedings of the International Conference on Machine Learning, Online."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 10\u201317). Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Liu, Z., Hu, H., Lin, Y., Yao, Z., Xie, Z., Wei, Y., Ning, J., Cao, Y., Zhang, Z., and Dong, L. (2022, January 18\u201324). Swin transformer v2: Scaling up capacity and resolution. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01170"},{"key":"ref_32","unstructured":"Zhang, Z., Zhang, H., Zhao, L., Chen, T., Arik, S.\u00d6., and Pfister, T. (March, January 22). Nested hierarchical transformer: Towards accurate, data-efficient and interpretable visual understanding. Proceedings of the AAAI Conference on Artificial Intelligence, Online."},{"key":"ref_33","unstructured":"Shamshad, F., Khan, S., Zamir, S.W., Khan, M.H., Hayat, M., Khan, F.S., and Fu, H. (2022). Transformers in medical imaging: A survey. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Chen, R.J., Chen, C., Li, Y., Chen, T.Y., Trister, A.D., Krishnan, R.G., and Mahmood, F. (2022, January 18\u201324). Scaling Vision Transformers to Gigapixel Images via Hierarchical Self-Supervised Learning. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01567"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Garrucho, L., Kushibar, K., Jouide, S., Diaz, O., Igual, L., and Lekadir, K. (2022). Domain generalization in deep learning-based mass detection in mammography: A large-scale multi-center study. arXiv.","DOI":"10.1016\/j.artmed.2022.102386"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Chen, X., Zhang, K., Abdoli, N., Gilley, P.W., Wang, X., Liu, H., Zheng, B., and Qiu, Y. (2022). Transformers Improve Breast Cancer Diagnosis from Unregistered Multi-View Mammograms. Diagnostics, 12.","DOI":"10.20944\/preprints202206.0315.v1"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Sun, Z., Jiang, H., Ma, L., Yu, Z., and Xu, H. (2022, January 18\u201322). Transformer Based Multi-view Network for Mammographic Image Classification. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Singapore.","DOI":"10.1007\/978-3-031-16437-8_5"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K.Q. (2017, January 21\u201326). Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_39","unstructured":"Tan, M., and Le, Q. (2019, January 9\u201315). Efficientnet: Rethinking model scaling for convolutional neural networks. Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA."},{"key":"ref_40","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_41","unstructured":"Wightman, R. (2022, November 29). PyTorch Image Models. 
Available online: https:\/\/github.com\/rwightman\/pytorch-image-models."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Chicco, D., and Jurman, G. (2020). The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom., 21.","DOI":"10.1186\/s12864-019-6413-7"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., and Torralba, A. (2016, January 27\u201330). Learning deep features for discriminative localization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.319"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., and Batra, D. (2017, January 22\u201329). Grad-cam: Visual explanations from deep networks via gradient-based localization. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.74"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Chattopadhay, A., Sarkar, A., Howlader, P., and Balasubramanian, V.N. (2018, January 12\u201315). Grad-cam++: Generalized gradient-based visual explanations for deep convolutional networks. Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA.","DOI":"10.1109\/WACV.2018.00097"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Shome, D., Kar, T., Mohanty, S.N., Tiwari, P., Muhammad, K., AlTameem, A., Zhang, Y., and Saudagar, A.K.J. (2021). Covid-transformer: Interpretable covid-19 detection using vision transformer for healthcare. Int. J. Environ. Res. Public Health, 18.","DOI":"10.3390\/ijerph182111086"},{"key":"ref_47","unstructured":"Ali, A., Schnake, T., Eberle, O., Montavon, G., M\u00fcller, K.R., and Wolf, L. (2022). XAI for transformers: Better explanations through conservative propagation. 
arXiv."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Abnar, S., and Zuidema, W. (2020). Quantifying attention flow in transformers. arXiv.","DOI":"10.18653\/v1\/2020.acl-main.385"},{"key":"ref_49","first-page":"e200103","article-title":"Optimam mammography image database: A large-scale resource of mammography images and clinical data","volume":"3","author":"Warren","year":"2020","journal-title":"Radiol. Artif. Intell."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/3\/1229\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:12:24Z","timestamp":1760119944000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/3\/1229"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,1,20]]},"references-count":49,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2023,2]]}},"alternative-id":["s23031229"],"URL":"https:\/\/doi.org\/10.3390\/s23031229","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,1,20]]}}}