{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,5]],"date-time":"2025-11-05T05:39:45Z","timestamp":1762321185977,"version":"build-2065373602"},"reference-count":43,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2025,11,3]],"date-time":"2025-11-03T00:00:00Z","timestamp":1762128000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],
"funder":[{"DOI":"10.13039\/501100002322","name":"Coordena\u00e7\u00e3o de Aperfei\u00e7oamento de Pessoal de N\u00edvel Superior","doi-asserted-by":"crossref","award":["Finance Code 001"],"award-info":[{"award-number":["Finance Code 001"]}],"id":[{"id":"10.13039\/501100002322","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100003758","name":"Funda\u00e7\u00e3o de Amparo \u00e0 Pesquisa e ao Desenvolvimento Cient\u00edfico e Tecnol\u00f3gico do Maranh\u00e3o","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100003758","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100003593","name":"Conselho Nacional de Desenvolvimento Cient\u00edfico e Tecnol\u00f3gico","doi-asserted-by":"publisher","award":["305253\/2025-5"],"award-info":[{"award-number":["305253\/2025-5"]}],"id":[{"id":"10.13039\/501100003593","id-type":"DOI","asserted-by":"publisher"}]}],
"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Vision"],
"abstract":"<jats:p>Deep learning for glaucoma screening often relies on high-resolution clinical images and convolutional neural networks (CNNs). However, these methods face significant performance drops when applied to noisy, low-resolution images from portable devices. To address this, our work investigates ensemble methods using multiple Transformer architectures for automated glaucoma detection in challenging scenarios. We use the Brazil Glaucoma (BrG) and private D-Eye datasets to assess model robustness. These datasets include images typical of smartphone-coupled ophthalmoscopes, which are often noisy and variable in quality. Four Transformer models\u2014Swin-Tiny, ViT-Base, MobileViT-Small, and DeiT-Base\u2014were trained and evaluated both individually and in ensembles. We evaluated the results at both image and patient levels to reflect clinical practice. The results show that, although performance drops on lower-quality images, ensemble combinations and patient-level aggregation significantly improve accuracy and sensitivity. We achieved up to 85% accuracy and an 84.2% F1-score on the D-Eye dataset, with a notable reduction in false negatives. Grad-CAM attention maps confirmed that Transformers identify anatomical regions relevant to diagnosis. These findings reinforce the potential of Transformer ensembles as an accessible solution for early glaucoma detection in populations with limited access to specialized equipment.<\/jats:p>",
"DOI":"10.3390\/vision9040093","type":"journal-article","created":{"date-parts":[[2025,11,3]],"date-time":"2025-11-03T19:30:27Z","timestamp":1762198227000},"page":"93","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Comparative Analysis of Transformer Architectures and Ensemble Methods for Automated Glaucoma Screening in Fundus Images from Portable Ophthalmoscopes"],"prefix":"10.3390","volume":"9",
"author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-9504-2443","authenticated-orcid":false,"given":"Rodrigo Ot\u00e1vio Cantanhede","family":"Costa","sequence":"first","affiliation":[{"name":"Computer Science Department, Universidade Federal do Maranh\u00e3o (UFMA), Campus do Bacanga, S\u00e3o Lu\u00eds 65085-580, Brazil"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-2226-1125","authenticated-orcid":false,"given":"Pedro Alexandre Ferreira","family":"Fran\u00e7a","sequence":"additional","affiliation":[{"name":"Computer Science Department, Universidade Federal do Maranh\u00e3o (UFMA), Campus do Bacanga, S\u00e3o Lu\u00eds 65085-580, Brazil"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4995-8909","authenticated-orcid":false,"given":"Alexandre C\u00e9sar Pinto","family":"Pessoa","sequence":"additional","affiliation":[{"name":"Computer Science Department, Universidade Federal do Maranh\u00e3o (UFMA), Campus do Bacanga, S\u00e3o Lu\u00eds 65085-580, Brazil"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3731-6431","authenticated-orcid":false,"given":"Geraldo","family":"Braz J\u00fanior","sequence":"additional","affiliation":[{"name":"Computer Science Department, Universidade Federal do Maranh\u00e3o (UFMA), Campus do Bacanga, S\u00e3o Lu\u00eds 65085-580, Brazil"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7013-9700","authenticated-orcid":false,"given":"Jo\u00e3o Dallyson Sousa","family":"de Almeida","sequence":"additional","affiliation":[{"name":"Computer Science Department, Universidade Federal do Maranh\u00e3o (UFMA), Campus do Bacanga, S\u00e3o Lu\u00eds 65085-580, Brazil"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3458-7693","authenticated-orcid":false,"given":"Ant\u00f3nio","family":"Cunha","sequence":"additional","affiliation":[{"name":"School of Science and Technology, University of Tr\u00e1s-os-Montes e Alto Douro, Quinta de Prados, 5000-801 Vila Real, Portugal"},{"name":"ALGORITMI Research Centre, University of Minho, 4800-058 Guimar\u00e3es, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2025,11,3]]},
"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1901","DOI":"10.1001\/jama.2014.3192","article-title":"The pathophysiology and treatment of glaucoma: A review","volume":"311","author":"Weinreb","year":"2014","journal-title":"JAMA"},
{"key":"ref_2","unstructured":"Vision Loss Expert Group of the Global Burden of Disease Study, and the GBD 2019 Blindness and Vision Impairment Collaborators (2024). Global estimates on the number of people blind or visually impaired by glaucoma: A meta-analysis from 2000 to 2020. Eye, 38, 2036."},
{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1788","DOI":"10.1016\/S0140-6736(23)01289-8","article-title":"Glaucoma: Now and beyond","volume":"402","author":"Jayaram","year":"2023","journal-title":"Lancet"},
{"key":"ref_4","doi-asserted-by":"crossref","first-page":"588","DOI":"10.1136\/bjophthalmol-2019-314336","article-title":"Estimated number of ophthalmologists worldwide (International Council of Ophthalmology update): Will we meet the needs?","volume":"104","author":"Resnikoff","year":"2020","journal-title":"Br. J. Ophthalmol."},
{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Furtado, J.M., Fernandes, A.G., Silva, J.C., Del Pino, S., and Hommes, C. (2023). Indigenous eye health in the americas: The burden of vision impairment and ocular diseases. Int. J. Environ. Res. Public Health, 20.","DOI":"10.3390\/ijerph20053820"},
{"key":"ref_6","doi-asserted-by":"crossref","first-page":"2033","DOI":"10.1016\/j.ophtha.2012.04.019","article-title":"The Nakuru posterior segment eye disease study: Methods and prevalence of blindness and visual impairment in Nakuru, Kenya","volume":"119","author":"Mathenge","year":"2012","journal-title":"Ophthalmology"},
{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Jin, K., Lu, H., Su, Z., Cheng, C., Ye, J., and Qian, D. (2017). Telemedicine screening of retinal diseases with a handheld portable non-mydriatic fundus camera. BMC Ophthalmol., 17.","DOI":"10.1186\/s12886-017-0484-5"},
{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2957","DOI":"10.2147\/OPTH.S423845","article-title":"A systematic review of digital ophthalmoscopes in medicine","volume":"17","author":"Robles","year":"2023","journal-title":"Clin. Ophthalmol."},
{"key":"ref_9","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1016\/j.media.2017.07.005","article-title":"A survey on deep learning in medical image analysis","volume":"42","author":"Litjens","year":"2017","journal-title":"Med. Image Anal."},
{"key":"ref_10","doi-asserted-by":"crossref","first-page":"2402","DOI":"10.1001\/jama.2016.17216","article-title":"Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs","volume":"316","author":"Gulshan","year":"2016","journal-title":"JAMA"},
{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Chen, X., Xu, Y., Wong, D.W.K., Wong, T.Y., and Liu, J. (2015, January 25\u201329). Glaucoma detection based on deep convolutional neural network. Proceedings of the 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Milan, Italy.","DOI":"10.1109\/EMBC.2015.7318462"},
{"key":"ref_12","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv."},
{"key":"ref_13","doi-asserted-by":"crossref","first-page":"274","DOI":"10.1038\/s41433-021-01926-y","article-title":"Feasibility and clinical utility of handheld fundus cameras for retinal imaging","volume":"37","author":"Das","year":"2023","journal-title":"Eye"},
{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Linde, G., Rodrigues de Souza, W., Chalakkal, R., Danesh-Meyer, H.V., O\u2019Keeffe, B., and Chiong Hong, S. (2024). A comparative evaluation of deep learning approaches for ophthalmology. Sci. Rep., 14.","DOI":"10.1038\/s41598-024-72752-x"},
{"key":"ref_15","doi-asserted-by":"crossref","first-page":"587","DOI":"10.1136\/bjophthalmol-2020-318107","article-title":"Deep learning-assisted (automatic) diagnosis of glaucoma using a smartphone","volume":"106","author":"Nakahara","year":"2022","journal-title":"Br. J. Ophthalmol."},
{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Neto, A., Camara, J., and Cunha, A. (2022). Evaluations of deep learning approaches for glaucoma screening using retinal images from mobile device. Sensors, 22.","DOI":"10.3390\/s22041449"},
{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Fumero, F., Alay\u00f3n, S., Sanchez, J.L., Sigut, J., and Gonzalez-Hernandez, M. (2011, January 27\u201330). RIM-ONE: An open retinal image database for optic nerve evaluation. Proceedings of the 2011 24th International Symposium on Computer-Based Medical Systems (CBMS), Bristol, UK.","DOI":"10.1109\/CBMS.2011.5999143"},
{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Sivaswamy, J., Krishnadas, S., Joshi, G.D., Jain, M., and Tabish, A.U.S. (May, January 29). Drishti-gs: Retinal image dataset for optic nerve head (onh) segmentation. Proceedings of the 2014 IEEE 11th International Symposium on Biomedical Imaging (ISBI), Beijing, China.","DOI":"10.1109\/ISBI.2014.6867807"},
{"key":"ref_19","doi-asserted-by":"crossref","first-page":"101570","DOI":"10.1016\/j.media.2019.101570","article-title":"Refuge challenge: A unified framework for evaluating automated methods for glaucoma assessment from fundus photographs","volume":"59","author":"Orlando","year":"2020","journal-title":"Med. Image Anal."},
{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., and Batra, D. (2017, January 22\u201329). Grad-cam: Visual explanations from deep networks via gradient-based localization. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.74"},
{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Bragan\u00e7a, C.P., Torres, J.M., Soares, C.P.d.A., and Macedo, L.O. (2022). Detection of glaucoma on fundus images using deep learning on a new image set obtained with a smartphone and handheld ophthalmoscope. Healthcare, 10.","DOI":"10.3390\/healthcare10122345"},
{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Li, F.F. (2009, January 20\u201325). Imagenet: A large-scale hierarchical image database. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},
{"key":"ref_23","doi-asserted-by":"crossref","first-page":"100233","DOI":"10.1016\/j.xops.2022.100233","article-title":"Detecting glaucoma from fundus photographs using deep learning without convolutions: Transformer for improved generalization","volume":"3","author":"Fan","year":"2023","journal-title":"Ophthalmol. Sci."},
{"key":"ref_24","doi-asserted-by":"crossref","first-page":"573","DOI":"10.1001\/archopht.117.5.573","article-title":"The Ocular Hypertension Treatment Study: Design and baseline description of the participants","volume":"117","author":"Gordon","year":"1999","journal-title":"Arch. Ophthalmol."},
{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Angara, S., and Kim, J. (2024, January 26\u201328). Deep ensemble learning for classification of glaucoma from smartphone fundus images. Proceedings of the 2024 IEEE 37th International Symposium on Computer-Based Medical Systems (CBMS), Guadalajara, Mexico.","DOI":"10.1109\/CBMS61543.2024.00074"},
{"key":"ref_26","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1109\/TMI.2019.2927226","article-title":"A large-scale database and a CNN model for attention-based glaucoma detection","volume":"39","author":"Li","year":"2019","journal-title":"IEEE Trans. Med. Imaging"},
{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Costa, R., Pimentel, P., Pessoa, A., J\u00fanior, G.B., and Almeida, J. (2025, January 9\u201313). Diagn\u00f3stico de Glaucoma em Retinografias de Oftalmosc\u00f3pio Port\u00e1til Utilizando Ensemble Baseado em Transformers. Proceedings of the Anais do XXV Simp\u00f3sio Brasileiro de Computa\u00e7\u00e3o Aplicada \u00e0 Sa\u00fade, Porto Alegre, Brasil.","DOI":"10.5753\/sbcas.2025.7270"},
{"key":"ref_28","doi-asserted-by":"crossref","first-page":"2888","DOI":"10.1109\/TVCG.2023.3261935","article-title":"How does attention work in vision transformers? A visual analytics attempt","volume":"29","author":"Li","year":"2023","journal-title":"IEEE Trans. Vis. Comput. Graph."},
{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Camara, J., Silva, B., Gouveia, A., Pires, I.M., Coelho, P., and Cunha, A. (2022). Detection and mosaicing techniques for low-quality retinal videos. Sensors, 22.","DOI":"10.3390\/s22052059"},
{"key":"ref_30","unstructured":"Jocher, G., Qiu, J., and Chaurasia, A. (2020, September 05). Ultralytics YOLO. Available online: https:\/\/github.com\/ultralytics\/ultralytics."},
{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Buslaev, A., Iglovikov, V.I., Khvedchenya, E., Parinov, A., Druzhinin, M., and Kalinin, A.A. (2020). Albumentations: Fast and Flexible Image Augmentations. Information, 11.","DOI":"10.3390\/info11020125"},
{"key":"ref_32","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1186\/s40537-019-0197-0","article-title":"A survey on image data augmentation for deep learning","volume":"6","author":"Shorten","year":"2019","journal-title":"J. Big Data"},
{"key":"ref_33","unstructured":"Wu, B., Xu, C., Dai, X., Wan, A., Zhang, P., Yan, Z., Tomizuka, M., Gonzalez, J., Keutzer, K., and Vajda, P. (2020). Visual Transformers: Token-based Image Representation and Processing for Computer Vision. arXiv."},
{"key":"ref_34","unstructured":"Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., and J\u00e9gou, H. (2021). Training data-efficient image transformers & distillation through attention. arXiv."},
{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021). Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. arXiv.","DOI":"10.1109\/ICCV48922.2021.00986"},
{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Liu, Z., Hu, H., Lin, Y., Yao, Z., Xie, Z., Wei, Y., Ning, J., Cao, Y., Zhang, Z., and Dong, L. (2021). Swin Transformer V2: Scaling Up Capacity and Resolution. arXiv.","DOI":"10.1109\/CVPR52688.2022.01170"},
{"key":"ref_37","unstructured":"Bao, H., Dong, L., and Wei, F. (2021). BEiT: BERT Pre-Training of Image Transformers. arXiv."},
{"key":"ref_38","unstructured":"Mehta, S., and Rastegari, M. (2022). MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer. arXiv."},
{"key":"ref_39","unstructured":"Loshchilov, I., and Hutter, F. (2017). Decoupled weight decay regularization. arXiv."},
{"key":"ref_40","unstructured":"Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., and Antiga, L. (2019, January 8\u201314). Pytorch: An imperative style, high-performance deep learning library. Proceedings of the Advances in Neural Information Processing Systems 32, Vancouver, BC, USA."},
{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., and Funtowicz, M. (2020, January 16\u201320). Transformers: State-of-the-art natural language processing. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Online.","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},
{"key":"ref_42","first-page":"2825","article-title":"Scikit-learn: Machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res."},
{"key":"ref_43","unstructured":"Gildenblat, J. (2025, August 28). PyTorch Library for CAM Methods. Available online: https:\/\/github.com\/jacobgil\/pytorch-grad-cam."}],
"container-title":["Vision"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2411-5150\/9\/4\/93\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,5]],"date-time":"2025-11-05T05:26:48Z","timestamp":1762320408000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2411-5150\/9\/4\/93"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,3]]},"references-count":43,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["vision9040093"],"URL":"https:\/\/doi.org\/10.3390\/vision9040093","relation":{},"ISSN":["2411-5150"],"issn-type":[{"type":"electronic","value":"2411-5150"}],"subject":[],"published":{"date-parts":[[2025,11,3]]}}}