{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T08:26:16Z","timestamp":1774427176620,"version":"3.50.1"},"reference-count":13,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2025,11,1]],"date-time":"2025-11-01T00:00:00Z","timestamp":1761955200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/deed.de"},{"start":{"date-parts":[[2025,11,27]],"date-time":"2025-11-27T00:00:00Z","timestamp":1764201600000},"content-version":"vor","delay-in-days":26,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/deed.de"}],"funder":[{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["442032008"],"award-info":[{"award-number":["442032008"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100007537","name":"Freie Universit\u00e4t Berlin","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100007537","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Datenbank Spektrum"],"published-print":{"date-parts":[[2025,11]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>As the availability of historical biodiversity data continues to grow, ensuring its usability through adherence to FAIR principles (Findable, Accessible, Interoperable, and Reusable) has become increasingly essential. This study focuses on solving key challenges in interpreting biodiversity data from historical texts, particularly in identifying and aligning common species names with their modern scientific counterparts. We address five main challenges: spelling variations, the invention of new terms, semantic shifts between broad and narrow naming conventions, and the renaming or reclassification of historical terms. To tackle these issues, we tested a\u00a0range of large language models (LLMs) (GPT\u20114, LLaMA3-405B, Mistral-8B, and Qwen3-30B-A3B) for their ability to resolve these challenges and support terminology alignment. The initial entity detection was performed using GPT-4o, which achieved a\u00a092% success rate in detecting historical common names and correctly identified 98% of scientific terms on a\u00a0test dataset. Comparative evaluation of the ability to match historical common names with modern equivalents revealed that GPT-4o consistently delivered the most accurate and nuanced outputs across four of the five challenges, demonstrating strong contextual understanding. The results highlight the potential of advanced LLMs to not only identify entities but also to interpret historical naming conventions, thereby enhancing the reusability and interoperability of biodiversity data in line with FAIR principles.<\/jats:p>","DOI":"10.1007\/s13222-025-00519-3","type":"journal-article","created":{"date-parts":[[2025,11,27]],"date-time":"2025-11-27T08:46:38Z","timestamp":1764233198000},"page":"179-186","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Historic to FAIR: Leveraging LLMs for Historic Term Identification and Standardization","Historisch zu FAIR: Einsatz von LLMs zur Identifikation und Standardisierung historischer Begriffe"],"prefix":"10.1007","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2997-4656","authenticated-orcid":false,"given":"Jan","family":"Fillies","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Maximilian","family":"Teich","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Naouel","family":"Karam","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Adrian","family":"Paschke","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Malte","family":"Rehbein","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2025,11,27]]},"reference":[{"key":"519_CR1","unstructured":"Dhananjay Ashok and Zachary C. Lipton. Promptner: Prompting for named entity recognition, 2023."},{"key":"519_CR2","unstructured":"Patrick Barkham. Country diary 100 years on: sheep and dogs dominate over rabbits and house martins. The Guardian, pages 49\u201350, 2024-09-28."},{"key":"519_CR3","doi-asserted-by":"crossref","unstructured":"Sabit Ekin. Prompt engineering for chatgpt: A\u00a0quick guide to techniques, tips, and best practices, 05 2023.","DOI":"10.36227\/techrxiv.22683919"},{"key":"519_CR4","first-page":"e112926","volume":"7","author":"M Elliott","year":"2023","unstructured":"Elliott\u00a0M, Fortes\u00a0J (2023) Using chatgpt with confidence for biodiversity-related information tasks. Biodivers Inf Sci Stand 7:e112926","journal-title":"Biodivers Inf Sci Stand"},{"issue":"2","key":"519_CR5","first-page":"241","volume":"30","author":"S Govaerts","year":"2024","unstructured":"Govaerts\u00a0S (2024) Biodiversity in the late middle ages: Wild birds in the fourteenth-century county of holland. environ hist camb 30(2):241\u2013266","journal-title":"environ hist camb"},{"issue":"2","key":"519_CR6","doi-asserted-by":"publisher","first-page":"129","DOI":"10.1007\/s13222-024-00474-5","volume":"24","author":"-","year":"2024","unstructured":"- (2024) Biodivportal: Enabling semantic services for biodiversity within the german national research data infrastructure. Datenbank Spektrum 24(2):129\u2013137","journal-title":"Datenbank Spektrum"},{"key":"519_CR7","unstructured":"Andreas Kohlbecker, Naouel Karam, Adrian Paschke, and Anton G\u00fcntsch. Preserving taxonomic change and subsequent taxon relationships over time. In JOWO, 2021."},{"key":"519_CR8","unstructured":"Shervin Minaee, Tomas Mikolov, Narjes Nikzad, Meysam Chenaghlu, Richard Socher, Xavier Amatriain, and Jianfeng Gao. Large language models: A\u00a0survey, 2024."},{"key":"519_CR9","unstructured":"OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, and more. Gpt\u20114 technical report, 2024."},{"key":"519_CR10","unstructured":"T Osawa, N Tsutsumida, et\u00a0al. The role of large language models in ecology and biodiversity conservation: Opportunities and challenges. 2023."},{"key":"519_CR11","unstructured":"(2014) Germinal Rouhan and Myriam Gaudeul. Plant taxonomy: a\u00a0historical perspective, current challenges, and perspectives. In: Molecular plant taxonomy: Methods and protocols, pp\u00a01\u201337"},{"key":"519_CR12","doi-asserted-by":"crossref","unstructured":"Elaine Svenonius. The intellectual foundation of information organization, 2000.","DOI":"10.7551\/mitpress\/3828.001.0001"},{"issue":"10","key":"519_CR13","first-page":"e3783","volume":"103","author":"DS Viana","year":"2022","unstructured":"Viana\u00a0DS (2022) Francisco Blanco-Garrido, Miguel Delibes, and Miguel Clavero. A\u00a016th-century Biodivers Crop Invent Ecol 103(10):e3783","journal-title":"A 16th-century Biodivers Crop Invent Ecol"}],"container-title":["Datenbank-Spektrum"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13222-025-00519-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s13222-025-00519-3","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13222-025-00519-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,6]],"date-time":"2026-02-06T07:52:47Z","timestamp":1770364367000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s13222-025-00519-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11]]},"references-count":13,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,11]]}},"alternative-id":["519"],"URL":"https:\/\/doi.org\/10.1007\/s13222-025-00519-3","relation":{},"ISSN":["1618-2162","1610-1995"],"issn-type":[{"value":"1618-2162","type":"print"},{"value":"1610-1995","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,11]]},"assertion":[{"value":"28 May 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 November 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 November 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}