{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T14:06:06Z","timestamp":1780581966848,"version":"3.54.1"},"reference-count":46,"publisher":"Springer Science and Business Media LLC","issue":"26","license":[{"start":{"date-parts":[[2025,8,1]],"date-time":"2025-08-01T00:00:00Z","timestamp":1754006400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,8,1]],"date-time":"2025-08-01T00:00:00Z","timestamp":1754006400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100023991","name":"Swinburne University of Technology Sarawak Campus","doi-asserted-by":"publisher","award":["1.7.02(2022).01-02(02)"],"award-info":[{"award-number":["1.7.02(2022).01-02(02)"]}],"id":[{"id":"10.13039\/501100023991","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Swinburne University of Technology"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Comput & Applic"],"published-print":{"date-parts":[[2025,9]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Plant species identification is a fundamental process in botany and the agricultural sector. In recent years, deep neural networks have become the primary approach for automating this task, providing valuable insights into biodiversity, ecological systems, and agricultural practices. As more plant species are discovered, training a deep neural network becomes increasingly challenging because collecting and annotating plant samples is expensive and impractical. 
To address the lack of labelled plant samples, recent studies have explored the potential of leveraging publicly available and systematically annotated plant specimens in herbaria, coupled with field images, for plant species identification through cross-domain adaptation techniques. However, the accuracy of these methods remains unsatisfactory, motivating the exploration of alternative approaches. In this paper, we evaluate the feasibility of employing a pre-trained transformer-based self-distillation model (DINOv2) for cross-domain plant species identification tasks. We trained our model on the PlantCLEF2020 dataset, which comprises approximately 320k herbarium and field images representing 997 plant species. Our approach leverages the advanced feature extraction capabilities of DINOv2 to enhance cross-domain adaptation by bridging the gap between herbarium and field images, achieving a 17.7% improvement over the best model proposed in previous work, which employs ensembles of Siamese network architectures with triplet loss (HFTL-ENS and OSM-ENS). 
<\/jats:p>","DOI":"10.1007\/s00521-025-11499-6","type":"journal-article","created":{"date-parts":[[2025,8,1]],"date-time":"2025-08-01T06:19:35Z","timestamp":1754029175000},"page":"21969-21995","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["An evaluation of a pre-trained transformer-based self-distillation model (DINOv2) for cross-domain plant species identification"],"prefix":"10.1007","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0335-9011","authenticated-orcid":false,"given":"Chin Ann","family":"Ong","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Fei Siang","family":"Tay","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yi Lung","family":"Then","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Chris","family":"McCarthy","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2025,8,1]]},"reference":[{"issue":"8","key":"11499_CR1","doi-asserted-by":"publisher","first-page":"7562","DOI":"10.1016\/j.eswa.2012.01.073","volume":"39","author":"JS Cope","year":"2012","unstructured":"Cope JS, Corney D, Clark JY, Remagnino P, Wilkin P (2012) Plant species identification using digital morphometrics: a review. Expert Syst Appl 39(8):7562\u20137573. https:\/\/doi.org\/10.1016\/j.eswa.2012.01.073","journal-title":"Expert Syst Appl"},{"key":"11499_CR2","doi-asserted-by":"crossref","unstructured":"Dwivedi P, Sharma DK (2023) Plant species classification using information measure and deep learning for an actual environmental problem. In: Recent developments in machine and human intelligence, pp 208\u2013227. 
IGI Global, Hershey, PA, USA","DOI":"10.4018\/978-1-6684-9189-8.ch015"},{"key":"11499_CR3","doi-asserted-by":"publisher","first-page":"1286088","DOI":"10.3389\/fpls.2023.1286088","volume":"14","author":"AK Mulugeta","year":"2024","unstructured":"Mulugeta AK, Sharma DP, Mesfin AH (2024) Deep learning for medicinal plant species classification and recognition: a systematic review. Front Plant Sci 14:1286088","journal-title":"Front Plant Sci"},{"issue":"2","key":"11499_CR4","doi-asserted-by":"publisher","first-page":"20207","DOI":"10.3390\/agronomy10020207","volume":"10","author":"V Saiz-Rubio","year":"2020","unstructured":"Saiz-Rubio V, Rovira-M\u00e1s F (2020) From smart farming towards agriculture 5.0: a review on crop data management. Agronomy 10(2):20207. https:\/\/doi.org\/10.3390\/agronomy10020207","journal-title":"Agronomy"},{"key":"11499_CR5","doi-asserted-by":"publisher","DOI":"10.3389\/fpls.2022.805738","volume":"13","author":"N Katal","year":"2022","unstructured":"Katal N, Rzanny M, M\u00e4der P, W\u00e4ldchen J (2022) Deep learning in plant phenological research: a systematic literature review. Front Plant Sci 13:805738","journal-title":"Front Plant Sci"},{"key":"11499_CR6","doi-asserted-by":"publisher","unstructured":"Lee SH, Go\u00ebau H, Bonnet P, Joly A (2021) Conditional multi-task learning for plant disease identification. In: 2020 25th international conference on pattern recognition (ICPR), pp 3320\u20133327. https:\/\/doi.org\/10.1109\/ICPR48806.2021.9412643","DOI":"10.1109\/ICPR48806.2021.9412643"},{"issue":"2","key":"11499_CR7","doi-asserted-by":"publisher","first-page":"507","DOI":"10.1007\/s11831-016-9206-z","volume":"25","author":"J W\u00e4ldchen","year":"2018","unstructured":"W\u00e4ldchen J, M\u00e4der P (2018) Plant species identification using computer vision techniques: a systematic literature review. 
Arch Comput Methods Eng 25(2):507\u2013543","journal-title":"Arch Comput Methods Eng"},{"key":"11499_CR8","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"11499_CR9","doi-asserted-by":"crossref","unstructured":"Szegedy C, Ioffe S, Vanhoucke V, Alemi A (2017) Inception-v4, inception-resnet and the impact of residual connections on learning. In: Proceedings of the AAAI conference on artificial intelligence, vol 31","DOI":"10.1609\/aaai.v31i1.11231"},{"key":"11499_CR10","unstructured":"Go\u00ebau H, Bonnet P, Joly A (2020) Overview of lifeclef plant identification task 2020. CEUR-WS"},{"issue":"1","key":"11499_CR11","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12862-017-1014-z","volume":"17","author":"J Carranza-Rojas","year":"2017","unstructured":"Carranza-Rojas J, Goeau H, Bonnet P, Mata-Montero E, Joly A (2017) Going deeper in the automated identification of herbarium specimens. BMC Evol Biol 17(1):1\u201314","journal-title":"BMC Evol Biol"},{"issue":"1","key":"11499_CR12","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1109\/JPROC.2020.3004555","volume":"109","author":"F Zhuang","year":"2020","unstructured":"Zhuang F, Qi Z, Duan K, Xi D, Zhu Y, Zhu H, Xiong H, He Q (2020) A comprehensive survey on transfer learning. 
Proc IEEE 109(1):43\u201376","journal-title":"Proc IEEE"},{"key":"11499_CR13","unstructured":"Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N (2021) An image is worth 16x16 words: transformers for image recognition at scale"},{"key":"11499_CR14","first-page":"12116","volume":"34","author":"M Raghu","year":"2021","unstructured":"Raghu M, Unterthiner T, Kornblith S, Zhang C, Dosovitskiy A (2021) Do vision transformers see like convolutional neural networks? Adv Neural Inf Process Syst 34:12116\u201312128","journal-title":"Adv Neural Inf Process Syst"},{"key":"11499_CR15","unstructured":"Oquab M, Darcet T, Moutakanni T, Vo H, Szafraniec M, Khalidov V, Fernandez P, Haziza D, Massa F, El-Nouby A et al (2023) Dinov2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193"},{"issue":"11","key":"11499_CR16","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1145\/3422622","volume":"63","author":"I Goodfellow","year":"2020","unstructured":"Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2020) Generative adversarial networks. Commun ACM 63(11):139\u2013144","journal-title":"Commun ACM"},{"key":"11499_CR17","unstructured":"Zhang Y, Davison BD (2020) Adversarial consistent learning on partial domain adaptation of plantclef 2020 challenge. arXiv preprint arXiv:2009.09289"},{"key":"11499_CR18","doi-asserted-by":"crossref","unstructured":"Zoph B, Vasudevan V, Shlens J, Le QV (2018) Learning transferable architectures for scalable image recognition. 
In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8697\u20138710","DOI":"10.1109\/CVPR.2018.00907"},{"key":"11499_CR19","unstructured":"Zhang Y, Davison BD (2021) Weighted pseudo labeling refinement for plant identification"},{"key":"11499_CR20","doi-asserted-by":"crossref","unstructured":"Sun B, Saenko K (2016) Deep coral: Correlation alignment for deep domain adaptation. In: computer vision\u2013ECCV 2016 Workshops: Amsterdam, The Netherlands, October 8-10 and 15-16, 2016, Proceedings, Part III 14, pp. 443\u2013450. Springer","DOI":"10.1007\/978-3-319-49409-8_35"},{"key":"11499_CR21","unstructured":"Villacis J, Go\u00ebau H, Bonnet P, Joly A, Mata-Montero E (2020) Domain adaptation in the context of herbarium collections. a submission to plantclef 2020. CEUR-WS"},{"key":"11499_CR22","unstructured":"Motiian S, Jones Q, Iranmanesh S, Doretto G (2017) Few-shot adversarial domain adaptation. Advances in neural information processing systems 30"},{"key":"11499_CR23","doi-asserted-by":"crossref","unstructured":"Noroozi M, Favaro P (2016) Unsupervised learning of visual representations by solving jigsaw puzzles. In: European conference on computer vision, Springer, pp 69\u201384","DOI":"10.1007\/978-3-319-46466-4_5"},{"key":"11499_CR24","unstructured":"Go\u00ebau H, Bonnet P, Joly A (2021) Overview of plantclef 2021: cross-domain plant identification. In: CLEF (Working Notes), pp 1422\u20131436"},{"issue":"9","key":"11499_CR25","doi-asserted-by":"publisher","first-page":"1066","DOI":"10.3390\/sym11091066","volume":"11","author":"M Kaya","year":"2019","unstructured":"Kaya M, Bilge H\u015e (2019) Deep metric learning: a survey. 
Symmetry 11(9):1066","journal-title":"Symmetry"},{"issue":"6","key":"11499_CR26","doi-asserted-by":"publisher","first-page":"76","DOI":"10.1109\/MSP.2017.2732900","volume":"34","author":"J Lu","year":"2017","unstructured":"Lu J, Hu J, Zhou J (2017) Deep metric learning for visual understanding: an overview of recent advances. IEEE Signal Process Mag 34(6):76\u201384","journal-title":"IEEE Signal Process Mag"},{"key":"11499_CR27","doi-asserted-by":"crossref","unstructured":"Chulif S, Chang YL (2021) Herbarium-field triplet network for cross-domain plant identification. In: Experimental IR meets multilinguality, multimodality, and interaction: 12th international conference of the CLEF association, CLEF 2021, Virtual Event, September 21\u201324, 2021, Proceedings 12, Springer, pp 173\u2013188","DOI":"10.1007\/978-3-030-85251-1_14"},{"key":"11499_CR28","doi-asserted-by":"crossref","unstructured":"Chulif S, Chang YL (2021) Improved herbarium-field triplet network for cross-domain plant identification: Neuon submission to lifeclef 2021 plant","DOI":"10.1007\/978-3-030-85251-1_14"},{"key":"11499_CR29","unstructured":"Balestriero R, Ibrahim M, Sobal V, Morcos A, Shekhar S, Goldstein T, Bordes F, Bardes A, Mialon G, Tian Y et al (2023) A cookbook of self-supervised learning. arXiv preprint arXiv:2304.12210"},{"key":"11499_CR30","unstructured":"Gui J, Chen T, Cao Q, Sun Z, Luo H, Tao D (2023) A survey of self-supervised learning from multiple perspectives: algorithms, theory, applications and future trends. arXiv preprint arXiv:2301.05712"},{"key":"11499_CR31","doi-asserted-by":"crossref","unstructured":"He K, Fan H, Wu Y, Xie S, Girshick R (2020) Momentum contrast for unsupervised visual representation learning. 
In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 9729\u20139738","DOI":"10.1109\/CVPR42600.2020.00975"},{"key":"11499_CR32","unstructured":"Chen T, Kornblith S, Norouzi M, Hinton G (2020) A simple framework for contrastive learning of visual representations. In: International conference on machine learning, pp 1597\u20131607, PMLR"},{"key":"11499_CR33","first-page":"21271","volume":"33","author":"J-B Grill","year":"2020","unstructured":"Grill J-B, Strub F, Altch\u00e9 F, Tallec C, Richemond P, Buchatskaya E, Doersch C, Avila Pires B, Guo Z, Gheshlaghi Azar M et al (2020) Bootstrap your own latent-a new approach to self-supervised learning. Adv Neural Inf Process Syst 33:21271\u201321284","journal-title":"Adv Neural Inf Process Syst"},{"key":"11499_CR34","first-page":"9912","volume":"33","author":"M Caron","year":"2020","unstructured":"Caron M, Misra I, Mairal J, Goyal P, Bojanowski P, Joulin A (2020) Unsupervised learning of visual features by contrasting cluster assignments. Adv Neural Inf Process Syst 33:9912\u20139924","journal-title":"Adv Neural Inf Process Syst"},{"key":"11499_CR35","doi-asserted-by":"crossref","unstructured":"Chen X, He K (2021) Exploring simple siamese representation learning. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 15750\u201315758","DOI":"10.1109\/CVPR46437.2021.01549"},{"key":"11499_CR36","unstructured":"Zbontar J, Jing L, Misra I, LeCun Y, Deny S (2021) Barlow twins: self-supervised learning via redundancy reduction. In: International conference on machine learning, PMLR, pp 12310\u201312320"},{"key":"11499_CR37","unstructured":"Bardes A, Ponce J, LeCun Y (2021) Vicreg: Variance-invariance-covariance regularization for self-supervised learning. 
arXiv preprint arXiv:2105.04906"},{"key":"11499_CR38","doi-asserted-by":"crossref","unstructured":"Caron M, Touvron H, Misra I, J\u00e9gou H, Mairal J, Bojanowski P, Joulin A (2021) Emerging properties in self-supervised vision transformers","DOI":"10.1109\/ICCV48922.2021.00951"},{"key":"11499_CR39","unstructured":"Devlin J, Chang M-W, Lee K, Toutanova K (2018) Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805"},{"issue":"8","key":"11499_CR40","first-page":"9","volume":"1","author":"A Radford","year":"2019","unstructured":"Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I et al (2019) Language models are unsupervised multitask learners. OpenAI blog 1(8):9","journal-title":"OpenAI blog"},{"key":"11499_CR41","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2023.12.045","volume":"172","author":"Z Liu","year":"2024","unstructured":"Liu Z, Qian S, Xia C, Wang C (2024) Are transformer-based models more robust than CNN-based models? Neural Netw 172:106091","journal-title":"Neural Netw"},{"issue":"8","key":"11499_CR42","doi-asserted-by":"publisher","first-page":"5963","DOI":"10.1007\/s00521-022-07951-6","volume":"35","author":"S Chulif","year":"2023","unstructured":"Chulif S, Lee SH, Chang YL, Chai KC (2023) A machine learning approach for cross-domain plant identification using herbarium specimens. Neural Comput Appl 35(8):5963\u20135985","journal-title":"Neural Comput Appl"},{"key":"11499_CR43","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","volume":"115","author":"O Russakovsky","year":"2015","unstructured":"Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M et al (2015) Imagenet large scale visual recognition challenge. 
Int J Comput Vision 115:211\u2013252","journal-title":"Int J Comput Vision"},{"key":"11499_CR44","unstructured":"Tuli S, Dasgupta I, Grant E, Griffiths TL (2021) Are convolutional neural networks or transformers more like human vision? arXiv preprint arXiv:2105.07197"},{"key":"11499_CR45","first-page":"23296","volume":"34","author":"MM Naseer","year":"2021","unstructured":"Naseer MM, Ranasinghe K, Khan SH, Hayat M, Shahbaz Khan F, Yang M-H (2021) Intriguing properties of vision transformers. Adv Neural Inf Process Syst 34:23296\u201323308","journal-title":"Adv Neural Inf Process Syst"},{"key":"11499_CR46","doi-asserted-by":"crossref","unstructured":"Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D (2017) Grad-cam: Visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE international conference on computer vision, pp 618\u2013626","DOI":"10.1109\/ICCV.2017.74"}],"container-title":["Neural Computing and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-025-11499-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00521-025-11499-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-025-11499-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,8]],"date-time":"2025-09-08T10:03:08Z","timestamp":1757325788000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00521-025-11499-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,1]]},"references-count":46,"journal-issue":{"issue":"26","published-print":{"date-parts":[[2025,9]]}},"alternative-id":["11499"],"URL":"https:\/\/doi.org\/
10.1007\/s00521-025-11499-6","relation":{},"ISSN":["0941-0643","1433-3058"],"issn-type":[{"value":"0941-0643","type":"print"},{"value":"1433-3058","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,8,1]]},"assertion":[{"value":"31 July 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 July 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 August 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"We declare that there is no Conflict of interest of any kind to the best of our knowledge.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}