{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,25]],"date-time":"2026-02-25T14:24:21Z","timestamp":1772029461186,"version":"3.50.1"},"reference-count":58,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2026,2,25]],"date-time":"2026-02-25T00:00:00Z","timestamp":1771977600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Imaging"],"abstract":"<jats:p>Accurate identification of Alzheimer\u2019s disease (AD)-related cellular characteristics from microscopy images is essential for understanding neurodegenerative mechanisms at the cellular level. While most computational approaches focus on macroscopic neuroimaging modalities, cell type classification from microscopy remains relatively underexplored. In this study, we propose a hybrid vision transformer\u2013convolutional neural network (ViT\u2013CNN) framework that integrates DeiT-Small and EfficientNet-B7 to classify three AD-related cell types\u2014astrocytes, cortical neurons, and SH-SY5Y neuroblastoma cells\u2014from phase-contrast microscopy images. We perform a comparative evaluation against conventional CNN architectures (DenseNet, ResNet, InceptionNet, and MobileNet) and prompt-based multimodal vision\u2013language models (GPT-5, GPT-4o, and Gemini 2.5-Flash) using zero-shot, few-shot, and chain-of-thought prompting. Experiments conducted with stratified fivefold cross-validation show that the proposed hybrid model achieves a test accuracy of 61.03% and a macro F1 score of 61.85, outperforming standalone CNN baselines and prompt-only LLM approaches under data-limited conditions. These results suggest that combining convolutional inductive biases with transformer-based global context modeling can improve generalization for cellular microscopy classification. 
While constrained by dataset size and scope, this work serves as a proof of concept and highlights promising directions for future research in domain-specific pretraining, multimodal data integration, and explainable AI for AD-related cellular analysis.<\/jats:p>","DOI":"10.3390\/jimaging12030098","type":"journal-article","created":{"date-parts":[[2026,2,25]],"date-time":"2026-02-25T12:35:17Z","timestamp":1772022917000},"page":"98","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Hybrid Vision Transformer\u2013CNN Framework for Alzheimer\u2019s Disease Cell Type Classification: A Comparative Study with Vision\u2013Language Models"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2446-7425","authenticated-orcid":false,"given":"Md Easin","family":"Hasan","sequence":"first","affiliation":[{"name":"Department of Mathematical Sciences, The University of Texas at El Paso, El Paso, TX 79968, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8144-1275","authenticated-orcid":false,"given":"Md Tahmid Hasan","family":"Fuad","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, University of Manitoba, Winnipeg, MB R3T 2N2, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4372-1572","authenticated-orcid":false,"given":"Omar","family":"Sharif","sequence":"additional","affiliation":[{"name":"School of Mathematical and Statistical Science, The University of Texas at Rio Grande Valley, Edinburg, TX 78541, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9799-6059","authenticated-orcid":false,"given":"Amy","family":"Wagler","sequence":"additional","affiliation":[{"name":"Office of Research, Creativity, and Economic Development, New Mexico State University, Las Cruces, NM 88003, 
USA"}]}],"member":"1968","published-online":{"date-parts":[[2026,2,25]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"777","DOI":"10.1126\/science.1132814","article-title":"A century of alzheimer\u2019s disease","volume":"314","author":"Goedert","year":"2006","journal-title":"Science"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1007\/s11538-024-01307-y","article-title":"Oscillations in neuronal activity: A neuron-centered spatiotemporal model of the unfolded protein response in prion diseases","volume":"86","author":"Miller","year":"2024","journal-title":"Bull. Math. Biol."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1007\/s00401-009-0532-1","article-title":"Classification and basic pathology of alzheimer disease","volume":"118","author":"Duyckaerts","year":"2009","journal-title":"Acta Neuropathol."},{"key":"ref_4","unstructured":"(2022, December 01). Alzheimer\u2019s Disease Facts and Figures. Available online: https:\/\/www.alz.org\/alzheimers-dementia\/facts-figures#:~:text=more%20than%206%20million%20americans%20of%20all%20ages%20have%20alzheimer\u2019s,older%20(10.7%25)%20has%20alzheimer\u2019s."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1111\/nan.12338","article-title":"astrocytes in alzheimer\u2019s disease and other age-associated dementias: A supporting player with a central role","volume":"43","author":"Garwood","year":"2017","journal-title":"Neuropathol. Appl. Neurobiol."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Gupta, A.D., Asan, L., John, J., Beretta, C., Kuner, T., and Knabbe, J. (2023). Accurate classification of major brain cell types using in vivo imaging and neural network processing. 
PLoS Biol., 21.","DOI":"10.1371\/journal.pbio.3002357"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1038","DOI":"10.1038\/s41592-021-01249-6","article-title":"Livecell\u2014A large-scale dataset for label-free live cell segmentation","volume":"18","author":"Edlund","year":"2021","journal-title":"Nat. Methods"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Alshammari, M., and Mezher, M. (2021). A modified convolutional neural networks for mri-based images for detection and stage classification of alzheimer disease. Proceedings of the 2021 National Computing Colleges Conference (NCCC), Taif, Saudi Arabia, 27\u201328 March 2021, IEEE.","DOI":"10.1109\/NCCC49330.2021.9428810"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"11149","DOI":"10.1007\/s10462-023-10415-5","article-title":"Trustworthy artificial intelligence in alzheimer\u2019s disease: State of the art, opportunities, and challenges","volume":"56","author":"Abuhmed","year":"2023","journal-title":"Artif. Intell. Rev."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"2216","DOI":"10.1111\/2041-210X.13075","article-title":"Machine learning for image based species identification","volume":"9","year":"2018","journal-title":"Methods Ecol. Evol."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Maaten, L.V.D., and Weinberger, K.Q. (2017, January 21\u201326). Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Solano-Rojas, B., and Villal\u00f3n-Fonseca, R. (2021). A low-cost three-dimensional densenet neural network for alzheimer\u2019s disease early discovery. Sensors, 21.","DOI":"10.3390\/s21041302"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). 
Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Fulton, L.V., Dolezel, D., Harrop, J., Yan, Y., and Fulton, C.P. (2019). Classification of alzheimer\u2019s disease with and without imagery using gradient boosted machines and resnet-50. Brain Sci., 9.","DOI":"10.20944\/preprints201907.0345.v1"},{"key":"ref_15","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv."},{"key":"ref_16","first-page":"042012","article-title":"Classification of alzheimer\u2019s disease in mobilenet","volume":"Volume 1345","author":"Lu","year":"2019","journal-title":"Journal of Physics: Conference Series"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Cui, Z., Gao, Z., Leng, J., Zhang, T., Quan, P., and Zhao, W. (2019). Alzheimer\u2019s disease diagnosis using enhanced inception network based on brain magnetic resonance image. Proceedings of the 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), San Diego, CA, USA, 18\u201321 November 2019, IEEE.","DOI":"10.1109\/BIBM47256.2019.8983046"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Sethi, M., Ahuja, S., Singh, S., Snehi, J., and Chawla, M. (2022). An intelligent framework for alzheimer\u2019s disease classification using efficientnet transfer learning model. 
Proceedings of the 2022 International Conference on Emerging Smart Computing and Informatics (ESCI), Pune, India, 9\u201311 March 2022, IEEE.","DOI":"10.1109\/ESCI53509.2022.9758195"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1049\/ipr2.12618","article-title":"A modified 3d efficientnet for the classification of alzheimer\u2019s disease using structural magnetic resonance images","volume":"17","author":"Zheng","year":"2023","journal-title":"IET Image Process"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1016\/j.nurt.2010.05.017","article-title":"Astrocytes in alzheimer\u2019s disease","volume":"7","author":"Verkhratsky","year":"2010","journal-title":"Neurotherapeutics"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"4491","DOI":"10.1523\/JNEUROSCI.16-14-04491.1996","article-title":"Profound loss of layer ii entorhinal cortex neurons occurs in very mild alzheimer\u2019s disease","volume":"16","author":"Price","year":"1996","journal-title":"J. Neurosci."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"485","DOI":"10.3233\/JAD-201409","article-title":"Expression and localization of a\u03b2pp in sh-sy5y cells depends on differentiation state","volume":"82","author":"Riegerova","year":"2021","journal-title":"J. Alzheimer\u2019s Dis."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"2036","DOI":"10.1093\/brain\/awp105","article-title":"Early diagnosis of alzheimer\u2019s disease using cortical thickness: Impact of cognitive reserve","volume":"132","author":"Querbes","year":"2009","journal-title":"Brain"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1299","DOI":"10.1007\/s10571-012-9856-9","article-title":"Ages induce cell death via oxidative and endoplasmic reticulum stresses in both human sh-sy5y neuroblastoma cells and rat cortical neurons","volume":"32","author":"Yin","year":"2012","journal-title":"Cell. Mol. 
Neurobiol."},{"key":"ref_26","first-page":"100224","article-title":"Study on artificial intelligence: The state of the art and future prospects","volume":"23","author":"Zhang","year":"2021","journal-title":"J. Ind. Inf. Integr."},{"key":"ref_27","unstructured":"Tan, M., and Le, Q. (2019). Efficientnet: Rethinking model scaling for convolutional neural networks. Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9\u201315 June 2019, PMLR."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"105634","DOI":"10.1016\/j.compbiomed.2022.105634","article-title":"Early diagnosis of alzheimer\u2019s disease based on deep learning: A systematic review","volume":"146","author":"Fathi","year":"2022","journal-title":"Comput. Biol. Med."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1016\/j.cogsys.2018.12.015","article-title":"Convolutional neural network based alzheimer\u2019s disease classification from magnetic resonance brain images","volume":"57","author":"Jain","year":"2019","journal-title":"Cogn. Syst. Res."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"456","DOI":"10.1148\/radiol.2018180958","article-title":"A deep learning model to predict a diagnosis of alzheimer disease by using 18f-fdg pet of the brain","volume":"290","author":"Ding","year":"2019","journal-title":"Radiology"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"105857","DOI":"10.1016\/j.asoc.2019.105857","article-title":"Analysis of brain sub regions using optimization techniques and deep learning method in alzheimer disease","volume":"86","author":"Chitradevi","year":"2020","journal-title":"Appl. Soft Comput."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"35789","DOI":"10.1007\/s11042-020-09087-y","article-title":"A deep feature-based real-time system for alzheimer disease stage detection","volume":"80","author":"Nawaz","year":"2021","journal-title":"Multimed. 
Tools Appl."},{"key":"ref_33","unstructured":"Kundaram, S.S., and Pathak, K.C. (2021). Deep learning-based alzheimer disease detection. Proceedings of the Fourth International Conference on Microelectronics, Computing and Communication Systems: MCCS 2019, Coimbatore, India, 17\u201319 July 2019, Springer."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Vu, T.D., Yang, H.-J., Nguyen, V.Q., Oh, A.-R., and Kim, M.-S. (2017). Multimodal learning using convolution neural network and sparse autoencoder. Proceedings of the 2017 IEEE International Conference on Big Data and Smart Computing (BigComp), Jeju, Republic of Korea, 13\u201316 February 2017, IEEE.","DOI":"10.1109\/BIGCOMP.2017.7881683"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Salehi, A.W., Baglat, P., Sharma, B.B., Gupta, G., and Upadhya, A. (2020). A cnn model: Earlier diagnosis and classification of alzheimer disease using mri. Proceedings of the 2020 International Conference on Smart Electronics and Communication (ICOSEC), Trichy, India, 10\u201312 September 2020, IEEE.","DOI":"10.1109\/ICOSEC49089.2020.9215402"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"101645","DOI":"10.1016\/j.nicl.2018.101645","article-title":"Automated classification of alzheimer\u2019s disease and mild cognitive impairment using a single mri and deep neural networks","volume":"21","author":"Basaia","year":"2019","journal-title":"Neuroimage Clin."},{"key":"ref_37","unstructured":"Khvostikov, A., Aderghal, K., Benois-Pineau, J., Krylov, A., and Catheline, G. (2018). 3d cnn-based classification using smri and md-dti images for alzheimer disease studies. arXiv."},{"key":"ref_38","first-page":"49250","article-title":"Instructblip: Towards general-purpose vision-language models with instruction tuning","volume":"36","author":"Dai","year":"2023","journal-title":"Adv. Neural Inf. Process. 
Syst."},{"key":"ref_39","first-page":"34892","article-title":"Visual instruction tuning","volume":"36","author":"Liu","year":"2023","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Wang, Z., Wu, Z., Agarwal, D., and Sun, J. (2022, January 7\u201311). Medclip: Contrastive learning from unpaired medical images and text. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), Abu Dhabi, United Arab Emirates.","DOI":"10.18653\/v1\/2022.emnlp-main.256"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"786","DOI":"10.5755\/j01.itc.51.4.28052","article-title":"Alzheimer\u2019s disease segmentation and classification on mri brain images using enhanced expectation maximization adaptive histogram (eem-ah) and machine learning","volume":"51","author":"Ramya","year":"2022","journal-title":"Inf. Technol. Control"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"302","DOI":"10.5755\/j01.itc.53.1.34536","article-title":"Dual attention aware octave convolution network for early-stage alzheimer\u2019s disease detection","volume":"53","author":"Rangaraju","year":"2024","journal-title":"Inf. Technol. Control"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Odusami, M., Maskeli\u016bnas, R., and Dama\u0161evi\u010dius, R. (2023). Pixel-level fusion approach with vision transformer for early detection of alzheimer\u2019s disease. Electronics, 12.","DOI":"10.3390\/electronics12051218"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1007\/s40846-023-00801-3","article-title":"Explainable deep-learning-based diagnosis of alzheimer\u2019s disease using multimodal input fusion of pet and mri images","volume":"43","author":"Odusami","year":"2023","journal-title":"J. Med. Biol. Eng."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Odusami, M., Maskeli\u016bnas, R., and Dama\u0161evi\u010dius, R. (2023). 
Pareto optimized adaptive learning with transposed convolution for image fusion alzheimer\u2019s disease classification. Brain Sci., 13.","DOI":"10.3390\/brainsci13071045"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Odusami, M., Maskeli\u016bnas, R., and Dama\u0161evi\u010dius, R. (2023). Optimized convolutional fusion for multimodal neuroimaging in alzheimer\u2019s disease diagnosis: Enhancing data integration and feature extraction. J. Pers. Med., 13.","DOI":"10.3390\/jpm13101496"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1613\/jair.953","article-title":"Smote: Synthetic minority over-sampling technique","volume":"16","author":"Chawla","year":"2002","journal-title":"J. Artif. Intell. Res."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"5979","DOI":"10.1038\/s41598-022-09954-8","article-title":"On evaluation metrics for medical applications of artificial intelligence","volume":"12","author":"Hicks","year":"2022","journal-title":"Sci. Rep."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Borji, A., Kronreif, G., Angermayr, B., and Hatamikia, S. (2025). Advanced hybrid deep learning model for enhanced evaluation of osteosarcoma histopathology images. Front. Med., 12.","DOI":"10.3389\/fmed.2025.1555907"},{"key":"ref_50","first-page":"104796","article-title":"Hybrid cnn-transformer architecture for medical image segmentation: A comprehensive review","volume":"84","author":"He","year":"2023","journal-title":"Biomed. Signal Process. Control"},{"key":"ref_51","unstructured":"Raghu, M., Zhang, C., Kleinberg, J., and Bengio, S. (2019, January 8\u201314). Transfusion: Understanding transfer learning for medical imaging. 
Proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, BC, Canada."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Azizi, S., Mustafa, B., Ryan, F., Beaver, Z., Freyberg, J., Deaton, J., Loh, A., Karthikesalingam, A., Kornblith, S., and Chen, T. (2021, January 11\u201317). Big self-supervised models advance medical image classification. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00346"},{"key":"ref_53","first-page":"540","article-title":"Foundation models for generalist medical artificial intelligence","volume":"624","author":"Tang","year":"2023","journal-title":"Nature"},{"key":"ref_54","first-page":"102559","article-title":"Contrastive learning for label-efficient pathological image analysis","volume":"81","author":"Jha","year":"2022","journal-title":"Med. Image Anal."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1007\/s10916-024-02105-8","article-title":"Comparison of vision transformers and convolutional neural networks in medical image analysis: A systematic review","volume":"48","author":"Takahashi","year":"2024","journal-title":"J. Med. Syst."},{"key":"ref_56","first-page":"106376","article-title":"Transformers in medical imaging: A survey on vision transformer applications in medical image analysis","volume":"152","author":"Gupta","year":"2023","journal-title":"Comput. Biol. Med."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Singh, A., Sengupta, S., and Lakshminarayanan, V. (2020). Explainable deep learning models in medical image analysis. J. Imaging, 6.","DOI":"10.3390\/jimaging6060052"},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1109\/JPROC.2021.3060483","article-title":"Explaining deep neural networks and beyond: A review of methods and applications","volume":"109","author":"Samek","year":"2021","journal-title":"Proc. 
IEEE"}],"container-title":["Journal of Imaging"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2313-433X\/12\/3\/98\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,25]],"date-time":"2026-02-25T13:16:00Z","timestamp":1772025360000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2313-433X\/12\/3\/98"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,25]]},"references-count":58,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2026,3]]}},"alternative-id":["jimaging12030098"],"URL":"https:\/\/doi.org\/10.3390\/jimaging12030098","relation":{},"ISSN":["2313-433X"],"issn-type":[{"value":"2313-433X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,25]]}}}