{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,20]],"date-time":"2026-03-20T16:02:32Z","timestamp":1774022552615,"version":"3.50.1"},"reference-count":18,"publisher":"Oxford University Press (OUP)","issue":"17","license":[{"start":{"date-parts":[[2021,1,28]],"date-time":"2021-01-28T00:00:00Z","timestamp":1611792000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Helmholtz Einstein International Berlin Research School in Data Science"},{"name":"German Research Council","award":["LE-1428\/7-1"],"award-info":[{"award-number":["LE-1428\/7-1"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,9,9]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Summary<\/jats:title>\n                  <jats:p>Named entity recognition (NER) is an important step in biomedical information extraction pipelines. Tools for NER should be easy to use, cover multiple entity types, be highly accurate and be robust toward variations in text genre and style. We present HunFlair, a NER tagger fulfilling these requirements. HunFlair is integrated into the widely used NLP framework Flair, recognizes five biomedical entity types, reaches or overcomes state-of-the-art performance on a wide set of evaluation corpora, and is trained in a cross-corpus setting to avoid corpus-specific bias. Technically, it uses a character-level language model pretrained on roughly 24 million biomedical abstracts and three million full texts. It outperforms other off-the-shelf biomedical NER tools with an average gain of 7.26 pp over the next best tool in a cross-corpus setting and achieves on-par results with state-of-the-art research prototypes in in-corpus experiments. 
HunFlair can be installed with a single command and is applied with only four lines of code. Furthermore, it is accompanied by harmonized versions of 23 biomedical NER corpora.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>HunFlair is freely available through the Flair NLP framework (https:\/\/github.com\/flairNLP\/flair) under an MIT license and is compatible with all major operating systems.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab042","type":"journal-article","created":{"date-parts":[[2021,1,20]],"date-time":"2021-01-20T20:11:10Z","timestamp":1611173470000},"page":"2792-2794","source":"Crossref","is-referenced-by-count":82,"title":["HunFlair: an easy-to-use tool for state-of-the-art biomedical named entity recognition"],"prefix":"10.1093","volume":"37","author":[{"given":"Leon","family":"Weber","sequence":"first","affiliation":[{"name":"Computer Science Department, Humboldt-Universit\u00e4t zu Berlin , Berlin 10099, Germany"},{"name":"Group Mathematical Modelling of Cellular Processes, Max Delbr\u00fcck Center for Molecular Medicine in the Helmholtz Association , Berlin 13125, Germany"}]},{"given":"Mario","family":"S\u00e4nger","sequence":"additional","affiliation":[{"name":"Computer Science Department, Humboldt-Universit\u00e4t zu Berlin , Berlin 10099, Germany"}]},{"given":"Jannes","family":"M\u00fcnchmeyer","sequence":"additional","affiliation":[{"name":"Computer Science Department, Humboldt-Universit\u00e4t zu Berlin , Berlin 10099, Germany"},{"name":"Section Seismology, GFZ German Research Centre for Geosciences , Potsdam 14473, 
Germany"}]},{"given":"Maryam","family":"Habibi","sequence":"additional","affiliation":[{"name":"Computer Science Department, Humboldt-Universit\u00e4t zu Berlin , Berlin 10099, Germany"}]},{"given":"Ulf","family":"Leser","sequence":"additional","affiliation":[{"name":"Computer Science Department, Humboldt-Universit\u00e4t zu Berlin , Berlin 10099, Germany"}]},{"given":"Alan","family":"Akbik","sequence":"additional","affiliation":[{"name":"Computer Science Department, Humboldt-Universit\u00e4t zu Berlin , Berlin 10099, Germany"}]}],"member":"286","published-online":{"date-parts":[[2021,1,28]]},"reference":[{"key":"2023051609164927300_btab042-B1","first-page":"1638","volume-title":"Proceedings of the 27th International Conference on Computational Linguistics","author":"Akbik","year":"2018"},{"key":"2023051609164927300_btab042-B2","first-page":"54","author":"Akbik","year":"2019"},{"key":"2023051609164927300_btab042-B3","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1186\/1471-2105-13-161","article-title":"Concept annotation in the craft corpus","volume":"13","author":"Bada","year":"2012","journal-title":"BMC Bioinformatics"},{"key":"2023051609164927300_btab042-B4","volume-title":"Empirical Methods in Natural Language Processing 2019 (EMNLP)","author":"Beltagy","year":"2019"},{"key":"2023051609164927300_btab042-B5","first-page":"135","article-title":"Enriching word vectors with subword information","volume":"5","author":"Bojanowski","year":"2017","journal-title":"Trans. ACL"},{"key":"2023051609164927300_btab042-B6","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.jbi.2013.12.006","article-title":"NCBI disease corpus: a resource for disease name recognition and concept normalization","volume":"47","author":"Do\u011fan","year":"2014","journal-title":"J. Biomed. 
Inform"},{"key":"2023051609164927300_btab042-B7","article-title":"Bidirectional LSTM-CRF models for sequence tagging","author":"Huang","year":"2015"},{"key":"2023051609164927300_btab042-B8","doi-asserted-by":"crossref","first-page":"e0221582","DOI":"10.1371\/journal.pone.0221582","article-title":"A corpus of plant\u2013disease relations in the biomedical domain","volume":"14","author":"Kim","year":"2019","journal-title":"PLoS One"},{"key":"2023051609164927300_btab042-B9","first-page":"73","author":"Kim","year":"2004"},{"key":"2023051609164927300_btab042-B10","doi-asserted-by":"crossref","first-page":"2909","DOI":"10.1093\/bioinformatics\/btt474","article-title":"DNorm: disease name normalization with pairwise learning to rank","volume":"29","author":"Leaman","year":"2013","journal-title":"Bioinformatics"},{"key":"2023051609164927300_btab042-B11","doi-asserted-by":"crossref","first-page":"S3","DOI":"10.1186\/1758-2946-7-S1-S3","article-title":"tmchem: a high performance approach for chemical named entity recognition and normalization","volume":"7","author":"Leaman","year":"2015","journal-title":"J. 
Cheminf"},{"key":"2023051609164927300_btab042-B12","doi-asserted-by":"crossref","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: a pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee","year":"2019","journal-title":"Bioinformatics"},{"key":"2023051609164927300_btab042-B13","doi-asserted-by":"crossref","first-page":"baw068","DOI":"10.1093\/database\/baw068","article-title":"BioCreative V CDR task corpus: a resource for chemical disease relation extraction","volume":"2016","author":"Li","year":"2016","journal-title":"Database"},{"key":"2023051609164927300_btab042-B14","volume-title":"18th BioNLP Workshop and Shared Task","author":"Neumann","year":"2019"},{"key":"2023051609164927300_btab042-B15","volume-title":"BioNLP Shared Task 2013 Workshop","author":"Pyysalo","year":"2013"},{"key":"2023051609164927300_btab042-B16","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1093\/bioinformatics\/btz528","article-title":"HUNER: improving biomedical NER with pretraining","volume":"36","author":"Weber","year":"2020","journal-title":"Bioinformatics"},{"key":"2023051609164927300_btab042-B17","first-page":"1","article-title":"Gnormplus: an integrative approach for tagging genes, gene families, and protein domains","volume":"2015","author":"Wei","year":"2015","journal-title":"BioMed. Res. 
Int"},{"key":"2023051609164927300_btab042-B18","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1186\/s12859-019-2813-6","article-title":"Collabonet: collaboration of deep neural networks for biomedical named entity recognition","volume":"20","author":"Yoon","year":"2019","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab042\/36180397\/btab042.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/17\/2792\/50339119\/btab042.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/17\/2792\/50339119\/btab042.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T09:20:17Z","timestamp":1684228817000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/17\/2792\/6122692"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,1,28]]},"references-count":18,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2021,9,9]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab042","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,9,1]]},"published":{"date-parts":[[2021,1,28]]}}}