{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T10:41:47Z","timestamp":1767955307219,"version":"3.49.0"},"reference-count":70,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2026,1,7]],"date-time":"2026-01-07T00:00:00Z","timestamp":1767744000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Imaging"],"abstract":"<jats:p>Automated animal identification is a practical task for reuniting lost pets with their owners, yet current systems often struggle due to limited dataset scale and reliance on unimodal visual cues. This study introduces a multimodal verification framework that enhances visual features with semantic identity priors derived from synthetic textual descriptions. We constructed a massive training corpus of 1.9 million photographs covering 695,091 unique animals to support this investigation. Through systematic ablation studies, we identified SigLIP2-Giant and E5-Small-v2 as the optimal vision and text backbones. We further evaluated fusion strategies ranging from simple concatenation to adaptive gating to determine the best method for integrating these modalities. Our proposed approach utilizes a gated fusion mechanism and achieved a Top-1 accuracy of 84.28% and an Equal Error Rate of 0.0422 on a comprehensive test protocol. These results represent an 11% improvement over leading unimodal baselines and demonstrate that integrating synthesized semantic descriptions significantly refines decision boundaries in large-scale pet re-identification.<\/jats:p>","DOI":"10.3390\/jimaging12010030","type":"journal-article","created":{"date-parts":[[2026,1,7]],"date-time":"2026-01-07T11:46:43Z","timestamp":1767786403000},"page":"30","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["From Visual to Multimodal: Systematic Ablation of Encoders and Fusion Strategies in Animal Identification"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-9935-7514","authenticated-orcid":false,"given":"Vasiliy","family":"Kudryavtsev","sequence":"first","affiliation":[{"name":"Faculty of IT, Technical University of Communication and Informatics, Moscow 111024, Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-8203-1059","authenticated-orcid":false,"given":"Kirill","family":"Borodin","sequence":"additional","affiliation":[{"name":"Faculty of IT, Technical University of Communication and Informatics, Moscow 111024, Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-9791-5216","authenticated-orcid":false,"given":"German","family":"Berezin","sequence":"additional","affiliation":[{"name":"Faculty of IT, Technical University of Communication and Informatics, Moscow 111024, Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-3847-5719","authenticated-orcid":false,"given":"Kirill","family":"Bubenchikov","sequence":"additional","affiliation":[{"name":"AI Lab, Avito, Moscow 125196, Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5802-5513","authenticated-orcid":false,"given":"Grach","family":"Mkrtchian","sequence":"additional","affiliation":[{"name":"Faculty of IT, Technical University of Communication and Informatics, Moscow 111024, Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-5422-1649","authenticated-orcid":false,"given":"Alexander","family":"Ryzhkov","sequence":"additional","affiliation":[{"name":"AI Lab, Avito, Moscow 125196, Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2026,1,7]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"7033535","DOI":"10.1155\/2024\/7033535","article-title":"A Comprehensive Survey of Animal Identification: Exploring Data Sources, AI Advances, Classification Obstacles and the Role of Taxonomy","volume":"2024","author":"Zhang","year":"2024","journal-title":"Int. J. Intell. Syst."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"900","DOI":"10.1093\/icb\/icab107","article-title":"Perspectives on Individual Animal Identification from Biology and Computer Vision","volume":"61","author":"Vidal","year":"2021","journal-title":"Integr. Comp. Biol."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"301","DOI":"10.3390\/ani2020301","article-title":"Frequency of Lost Dogs and Cats in the United States and the Methods Used to Locate Them","volume":"2","author":"Weiss","year":"2012","journal-title":"Animals"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Zheng, Z., Zhao, Y., Li, A., and Yu, Q. (2022). Wild Terrestrial Animal Re-Identification Based on an Improved Locally Aware Transformer with a Cross-Attention Mechanism. Animals, 12.","DOI":"10.3390\/ani12243503"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"43","DOI":"10.5187\/jast.2025.e4","article-title":"Research trends in livestock facial identification: A review","volume":"67","author":"Kang","year":"2025","journal-title":"J. Anim. Sci. Technol."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"387","DOI":"10.2460\/javma.237.4.387","article-title":"Evaluation of collars and microchips for visual and permanent identification of pet cats","volume":"237","author":"Lord","year":"2010","journal-title":"J. Am. Vet. Med. Assoc."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"McGreevy, P., Masters, S., Richards, L., Magalhaes, R.J.S., Peaston, A., Combs, M., Irwin, P.J., Lloyd, J., Croton, C., and Wylie, C. (2019). Identification of Microchip Implantation Events for Dogs and Cats in the VetCompass Australia Database. Animals, 9.","DOI":"10.3390\/ani9070423"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"332","DOI":"10.3390\/ani5020332","article-title":"Problems Associated with the Microchip Data of Stray Dogs and Cats Entering RSPCA Queensland Shelters","volume":"5","author":"Lancaster","year":"2015","journal-title":"Animals"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1111\/2041-210X.14278","article-title":"An open-source general purpose machine learning framework for individual animal re-identification using few-shot learning","volume":"15","author":"Wahltinez","year":"2024","journal-title":"Methods Ecol. Evol."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1009","DOI":"10.1111\/faf.12861","article-title":"Long-term effects of tagging fishes with electronic tracking devices","volume":"25","author":"Matley","year":"2024","journal-title":"Fish Fish."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/s10344-021-01549-4","article-title":"Review on methods used for wildlife species and individual identification","volume":"68","author":"Petso","year":"2022","journal-title":"Eur. J. Wildl. Res."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1111\/jzo.13009","article-title":"Phenotypic matching by spot pattern potentially mediates female giraffe social associations","volume":"318","author":"Morandi","year":"2022","journal-title":"J. Zool."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"177","DOI":"10.3354\/meps13729","article-title":"Genetic markers validate photo-identification and uniqueness of spot patterns in whale sharks","volume":"668","author":"Meenakshisundaram","year":"2021","journal-title":"Mar. Ecol. Prog. Ser."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"McCutcheon, J., Campbell, B., Hudock, R.E., Motz, N., Windsor, M., Carlisle, A., and Hale, E. (2025). An approach to predicting linear trends in tagging-related mortality and tag loss during mark-recapture studies. Front. Ecol. Evol., 13.","DOI":"10.3389\/fevo.2025.1572994"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"863","DOI":"10.1111\/2041-210X.13577","article-title":"Revisiting animal photo-identification using deep metric learning and network flow","volume":"12","author":"Cheema","year":"2021","journal-title":"Methods Ecol. Evol."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"eaaw0736","DOI":"10.1126\/sciadv.aaw0736","article-title":"Chimpanzee face recognition from videos in the wild using deep learning","volume":"5","author":"Schofield","year":"2019","journal-title":"Sci. Adv."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Mougeot, G., Li, D., and Jia, S. (2019, January 26\u201330). A Deep Learning Approach for Dog Face Verification and Recognition. Proceedings of the PRICAI 2019: Trends in Artificial Intelligence, Cuvu, Yanuca Island, Fiji.","DOI":"10.1007\/978-3-030-29894-4_34"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1072","DOI":"10.1111\/2041-210X.13436","article-title":"Deep learning-based methods for individual recognition in small birds","volume":"11","author":"Ferreira","year":"2020","journal-title":"Methods Ecol. Evol."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1038\/s41592-018-0295-5","article-title":"idtracker.ai: Tracking all individuals in small or large collectives of unmarked animals","volume":"16","author":"Bergomi","year":"2019","journal-title":"Nat. Methods"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Sakamoto, N., Kakeno, H., Ozaki, N., Miyazaki, Y., Kobayashi, K., and Murata, T. (2023). Marker-less tracking system for multiple mice using Mask R-CNN. Front. Behav. Neurosci., 16.","DOI":"10.3389\/fnbeh.2022.1086242"},{"key":"ref_21","unstructured":"Hou, S., Huang, P., Wang, Z., Liu, Y., Li, Z., Zhang, M., and Huang, Y. (2025, January 19\u201323). OpenAnimals: Revisiting Person Re-Identification for Animals Towards Better Generalization. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"e10260","DOI":"10.1002\/ece3.10260","article-title":"Optimizing the automated recognition of individual animals to support population monitoring","volume":"13","author":"Horswill","year":"2023","journal-title":"Ecol. Evol."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Nepovinnykh, E., Eerola, T., Biard, V., Mutka, P., Niemi, M., Kunnasranta, M., and K\u00e4lvi\u00e4inen, H. (2022). SealID: Saimaa Ringed Seal Re-Identification Dataset. Sensors, 22.","DOI":"10.3390\/s22197602"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Cermak, V., Picek, L., Adam, L., and Papafitsoros, K. (2024, January 3\u20138). WildlifeDatasets: An Open-Source Toolkit for Animal Re-Identification. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA.","DOI":"10.1109\/WACV57701.2024.00585"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Adam, L., \u010cerm\u00e1k, V., Papafitsoros, K., and Picek, L. (2025, January 11\u201312). WildlifeReID-10k: Wildlife Re-Identification Dataset with 10k Individual Animals. Proceedings of the CVPR Workshops (CVPRW), Nashville, TN, USA.","DOI":"10.1109\/CVPRW67362.2025.00197"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1016\/j.biosystemseng.2025.02.001","article-title":"Re-identification for long-term tracking and management of health and welfare challenges in pigs","volume":"251","author":"Odo","year":"2025","journal-title":"Biosyst. Eng."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Beery, S., Van Horn, G., and Perona, P. (2018, January 8\u201314). Recognition in Terra Incognita. Proceedings of the Computer Vision\u2014ECCV 2018: 15th European Conference, Munich, Germany. Proceedings, Part XVI.","DOI":"10.1007\/978-3-030-01270-0_28"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Santamaria, J.D., Isaza, C., and Giraldo, J.H. (March, January 26). CATALOG: A Camera Trap Language-Guided Contrastive Learning Model. Proceedings of the 2025 IEEE\/CVF Winter Conference on Applications of Computer Vision (WACV), Tucson, AZ, USA.","DOI":"10.1109\/WACV61041.2025.00124"},{"key":"ref_29","unstructured":"Otarashvili, L., Subramanian, T., Holmberg, J., Levenson, J.J., and Stewart, C.V. (2024). Multispecies Animal Re-ID Using a Large Community-Curated Dataset. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Schneider, S., Taylor, G.W., and Kremer, S.C. (2020, January 1\u20135). Similarity Learning Networks for Animal Individual Re-Identification\u2014Beyond the Capabilities of a Human Observer. Proceedings of the 2020 IEEE Winter Applications of Computer Vision Workshops (WACVW), Snowmass Village, CO, USA.","DOI":"10.1109\/WACVW50321.2020.9096925"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Ratnasingham, S., and Hebert, P.D.N. (2013). A DNA-Based Registry for All Animal Species: The Barcode Index Number (BIN) System. PLoS ONE, 8.","DOI":"10.1371\/journal.pone.0066213"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Pereira, K.S., Gibson, L., Biggs, D., Samarasinghe, D., and Braczkowski, A.R. (2022). Individual Identification of Large Felids in Field Studies: Common Methods, Challenges, and Implications for Conservation Science. Front. Ecol. Evol., 10.","DOI":"10.3389\/fevo.2022.866403"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Paudel, S., and Brown-Brandl, T. (2025). Advancements in Individual Animal Identification: A Historical Perspective from Prehistoric Times to the Present. Animals, 15.","DOI":"10.3390\/ani15172514"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"108414","DOI":"10.1016\/j.biocon.2020.108414","article-title":"Identification of animal individuals using deep learning: A case study of giant panda","volume":"242","author":"Hou","year":"2020","journal-title":"Biol. Conserv."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Shinoda, R., and Shiohara, K. (2024). PetFace: A Large-Scale Dataset and Benchmark for Animal Identification. arXiv.","DOI":"10.1007\/978-3-031-72649-1_2"},{"key":"ref_36","first-page":"100194","article-title":"Pseudo-labeling and semi-supervised learning for individual cattle identification","volume":"4","author":"Ferreira","year":"2023","journal-title":"Smart Agric. Technol."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Feragen, A., Pelillo, M., and Loog, M. (2015). Deep Metric Learning Using Triplet Network. Similarity-Based Pattern Recognition. SIMBAD 2015. Lecture Notes in Computer Science, Springer.","DOI":"10.1007\/978-3-319-24261-3"},{"key":"ref_38","first-page":"1","article-title":"A survey on learning from data with label noise via deep neural networks","volume":"13","author":"Song","year":"2025","journal-title":"Syst. Sci. Control Eng."},{"key":"ref_39","unstructured":"Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., and Clark, J. (2021, January 18\u201324). Learning Transferable Visual Models From Natural Language Supervision. Proceedings of the 38th International Conference on Machine Learning, Virtual. Proceedings of Machine Learning Research (PMLR): New York, NY, USA, 2021."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Azizi, E., and Zaman, L. (2023). Deep Learning Pet Identification Using Face and Body. Information, 14.","DOI":"10.3390\/info14050278"},{"key":"ref_41","unstructured":"Zhou, X., Zhang, X., Niyato, D., and Shen, Z. (2025). Learning Item Representations Directly from Multimodal Features for Effective Recommendation. arXiv."},{"key":"ref_42","unstructured":"Huang, Z., and Liu, X. (2025, January 25\u201326). Generalizable Object Re-Identification via Visual In-Context Prompting. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Paris, France."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Zhai, X., Mustafa, B., Kolesnikov, A., and Beyer, L. (2023, January 2\u20133). Sigmoid loss for language image pre-training. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.01100"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Schroff, F., Kalenichenko, D., and Philbin, J. (2015, January 7\u201312). FaceNet: A unified embedding for face recognition and clustering. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"ref_45","unstructured":"Khosla, A., Jayadevaprakash, N., Yao, B., and Li, F.F. (2011, January 20\u201325). Novel Dataset for Fine-Grained Image Categorization: Stanford Dogs. Proceedings of the First Workshop on Fine-Grained Visual Categorization (FGVC), Colorado Springs, CO, USA."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Parkhi, O.M., Vedaldi, A., Zisserman, A., and Jawahar, C.V. (2012, January 16\u201321). Cats and Dogs. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248092"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Khan, M.H., McDonagh, J., Khan, S., Shahabuddin, M., Arora, A., Khan, F.S., Shao, L., and Tzimiropoulos, G. (2020, January 13\u201319). AnimalWeb: A Large-Scale Hierarchical Dataset of Annotated Animal Faces. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00697"},{"key":"ref_48","first-page":"13","article-title":"Animal biometrics: Quantifying and detecting phenotypic appearance","volume":"33","author":"Tharwat","year":"2018","journal-title":"Trends Ecol. Evol."},{"key":"ref_49","first-page":"1","article-title":"Study on the Viability of Canine Nose Pattern as a Unique Biometric Identifier","volume":"17","author":"Choi","year":"2021","journal-title":"BMC Vet. Res."},{"key":"ref_50","first-page":"121353","article-title":"Dog nose-print recognition based on the shape and spatial features of scales","volume":"237","author":"Chan","year":"2024","journal-title":"Expert Syst. Appl."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"12883","DOI":"10.1002\/ece3.6840","article-title":"Automated facial recognition for wildlife that lack unique markings: A deep learning approach for brown bears","volume":"10","author":"Clapham","year":"2020","journal-title":"Ecol. Evol."},{"key":"ref_52","unstructured":"Biggs, B., Boyne, O., Charles, J., Fitzgibbon, A., and Cipolla, R. (2020, January 23\u201328). Deep Cross-species Feature Learning for Animal Face Recognition. Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK."},{"key":"ref_53","unstructured":"Oquab, M., Darcet, T., Moutakanni, T., Vo, H., Szafraniec, M., Khalidov, V., Fernandez, P., Haziza, D., Massa, F., and El-Nouby, A. (2024). DINOv2: Learning Robust Visual Features without Supervision. arXiv."},{"key":"ref_54","unstructured":"Tschannen, M., Gritsenko, A., Wang, X., Naeem, M.F., Alabdulmohsin, I., Parthasarathy, N., Evans, T., Beyer, L., Xia, Y., and Mustafa, B. (2025). SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features. arXiv."},{"key":"ref_55","first-page":"825","article-title":"Transfer Learning for Animal Species Identification from CCTV Image","volume":"11","author":"Hosny","year":"2023","journal-title":"Int. J. Intell. Syst. Appl. Eng."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Li, Y., Zhao, D., Qiao, T., Wu, Y., Pang, B., and Koh, Y.S. (2025, January 27\u201331). MetaWild: A Multimodal Dataset for Animal Re-Identification with Environmental Metadata. Proceedings of the 33rd ACM International Conference on Multimedia, MM \u201925, Dublin, Ireland.","DOI":"10.1145\/3746027.3758249"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Cermak, V., Picek, L., Adam, L., Neumann, L., and Matas, J. (October, January 29). WildFusion: Individual Animal Identification with Calibrated Similarity Fusion. Proceedings of the Computer Vision\u2014ECCV 2024 Workshops, Milan, Italy. Proceedings, Part II.","DOI":"10.1007\/978-3-031-92387-6_2"},{"key":"ref_58","unstructured":"Toumbourou, L. (2025, November 15). Dogs of the World. Dataset, 300k+ Images, 240+ Breeds, CC0 Public Domain License. Available online: https:\/\/www.kaggle.com\/datasets\/lextoumbourou\/dogs-world."},{"key":"ref_59","unstructured":"DSEIDLI (2025, November 15). LCW (Labeled Cats in the Wild). Dataset of 140k+ Unique Cats for Cat Face Recognition and Related Research. Licensed Under Apache 2.0. Available online: https:\/\/www.kaggle.com\/datasets\/dseidli\/lcwlabeled-cats-in-the-wild."},{"key":"ref_60","unstructured":"Lin, T.Y. (2025, November 15). Cat Individual Images. Kaggle. Available online: https:\/\/www.kaggle.com\/datasets\/timost1234\/cat-individuals."},{"key":"ref_61","unstructured":"Tian, Y., Ye, Q., and Doermann, D. (2025, January 2\u20137). YOLOv12: Attention-Centric Real-Time Object Detectors. Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), San Diego, CA, USA."},{"key":"ref_62","unstructured":"Bai, S., Chen, K., Liu, X., Wang, J., Ge, W., Song, S., Dang, K., Wang, P., Wang, S., and Tang, J. (2025). Qwen2.5-VL Technical Report. arXiv."},{"key":"ref_63","unstructured":"Wang, L., Yang, N., Huang, X., Jiao, B., Yang, L., Jiang, D., Majumder, R., and Wei, F. (2024). Text Embeddings by Weakly-Supervised Contrastive Pre-training. arXiv."},{"key":"ref_64","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2019, January 2\u20137). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA."},{"key":"ref_65","unstructured":"Otarashvili, L. (MiewID, 2023). MiewID, Version 1.0.1."},{"key":"ref_66","first-page":"6105","article-title":"EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks","volume":"Volume 97","author":"Tan","year":"2019","journal-title":"Proceedings of the 36th International Conference on Machine Learning (ICML)"},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 11\u201317). Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (ICCV), Virtual.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Stevens, S., Wu, J., Thompson, M.J., Campolongo, E.G., Song, C.H., Carlyn, D.E., Dong, L., Dahdul, W.M., Stewart, C., and Berger-Wolf, T. (2024, January 16\u201322). BioCLIP: A Vision Foundation Model for the Tree of Life. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR52733.2024.01836"},{"key":"ref_69","unstructured":"Yu, B., and Tao, D. (November, January 27). Deep Metric Learning with Tuplet Margin Loss. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Seoul, Republic of Korea."},{"key":"ref_70","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Hinton","year":"2008","journal-title":"J. Mach. Learn. Res."}],"container-title":["Journal of Imaging"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2313-433X\/12\/1\/30\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T05:12:57Z","timestamp":1767935577000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2313-433X\/12\/1\/30"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,7]]},"references-count":70,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2026,1]]}},"alternative-id":["jimaging12010030"],"URL":"https:\/\/doi.org\/10.3390\/jimaging12010030","relation":{},"ISSN":["2313-433X"],"issn-type":[{"value":"2313-433X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1,7]]}}}