{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,11]],"date-time":"2026-05-11T19:31:14Z","timestamp":1778527874877,"version":"3.51.4"},"reference-count":23,"publisher":"Springer Science and Business Media LLC","issue":"S2","license":[{"start":{"date-parts":[[2021,7,1]],"date-time":"2021-07-01T00:00:00Z","timestamp":1625097600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,7,30]],"date-time":"2021-07-30T00:00:00Z","timestamp":1627603200000},"content-version":"vor","delay-in-days":29,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Beijing Natural Science Foundation","award":["Z200016"],"award-info":[{"award-number":["Z200016"]}]},{"DOI":"10.13039\/501100005150","name":"Chinese Academy of Medical Sciences","doi-asserted-by":"crossref","award":["2018-I2M-AI-016"],"award-info":[{"award-number":["2018-I2M-AI-016"]}],"id":[{"id":"10.13039\/501100005150","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61906214"],"award-info":[{"award-number":["61906214"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Med Inform Decis Mak"],"published-print":{"date-parts":[[2021,7]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p><jats:italic>Transformer<\/jats:italic> is an attention-based architecture proven the state-of-the-art model in natural language processing (NLP). 
To reduce the difficulty of beginning to use <jats:italic>transformer-based<\/jats:italic> models in medical language understanding and expand the capability of the <jats:italic>scikit-learn<\/jats:italic> toolkit in deep learning, we proposed an easy to learn Python toolkit named <jats:italic>transformers-sklearn.<\/jats:italic> By wrapping the interfaces of <jats:italic>transformers<\/jats:italic> in only three functions (i.e., fit, score, and predict), <jats:italic>transformers-sklearn<\/jats:italic> combines the advantages of the <jats:italic>transformers<\/jats:italic> and <jats:italic>scikit-learn<\/jats:italic> toolkits.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Methods<\/jats:title>\n                <jats:p>In <jats:italic>transformers-sklearn<\/jats:italic>, three Python classes were implemented, namely, <jats:italic>BERTologyClassifier<\/jats:italic> for the classification task, <jats:italic>BERTologyNERClassifier<\/jats:italic> for the named entity recognition (NER) task, and <jats:italic>BERTologyRegressor<\/jats:italic> for the regression task. Each class contains three methods, i.e., <jats:italic>fit<\/jats:italic> for fine-tuning <jats:italic>transformer-based<\/jats:italic> models with the training dataset, <jats:italic>score<\/jats:italic> for evaluating the performance of the fine-tuned model, and <jats:italic>predict<\/jats:italic> for predicting the labels of the test dataset. <jats:italic>transformers-sklearn<\/jats:italic> is a user-friendly toolkit that (1) Is customizable via a few parameters (e.g., <jats:italic>model_name_or_path<\/jats:italic> and <jats:italic>model_type<\/jats:italic>), (2) Supports multilingual NLP tasks, and (3) Requires less coding. The input data format is automatically generated by <jats:italic>transformers-sklearn<\/jats:italic> with the annotated corpus. Newcomers only need to prepare the dataset. 
The model framework and training methods are predefined in <jats:italic>transformers-sklearn<\/jats:italic>.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>We collected four open-source medical language datasets, including <jats:italic>TrialClassification<\/jats:italic> for Chinese medical trial text multi-label classification, <jats:italic>BC5CDR<\/jats:italic> for English biomedical text named entity recognition, <jats:italic>DiabetesNER<\/jats:italic> for Chinese diabetes entity recognition and <jats:italic>BIOSSES<\/jats:italic> for English biomedical sentence similarity estimation.<\/jats:p>\n                <jats:p>In the four medical NLP tasks, the average code size of our script is 45 lines\/task, which is one-sixth the size of <jats:italic>transformers<\/jats:italic>\u2019 script. The experimental results show that <jats:italic>transformers-sklearn<\/jats:italic> based on pretrained BERT models achieved macro F1 scores of 0.8225, 0.8703 and 0.6908, respectively, on the <jats:italic>TrialClassification<\/jats:italic>, <jats:italic>BC5CDR<\/jats:italic> and <jats:italic>DiabetesNER<\/jats:italic> tasks and a <jats:italic>Pearson correlation<\/jats:italic> of 0.8260 on the <jats:italic>BIOSSES<\/jats:italic> task, which is consistent with the results of <jats:italic>transformers<\/jats:italic>.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>The proposed toolkit could help newcomers easily address medical language understanding tasks using the <jats:italic>scikit-learn<\/jats:italic> coding style. The code and tutorials of <jats:italic>transformers-sklearn<\/jats:italic> are available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/doi.org\/10.5281\/zenodo.4453803\">https:\/\/doi.org\/10.5281\/zenodo.4453803<\/jats:ext-link>. 
In the future, more medical language understanding tasks will be supported to improve the applications of <jats:italic>transformers-sklearn<\/jats:italic>.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12911-021-01459-0","type":"journal-article","created":{"date-parts":[[2021,7,30]],"date-time":"2021-07-30T09:03:36Z","timestamp":1627635816000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":32,"title":["Transformers-sklearn: a toolkit for medical language understanding with transformer-based models"],"prefix":"10.1186","volume":"21","author":[{"given":"Feihong","family":"Yang","sequence":"first","affiliation":[]},{"given":"Xuwen","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Hetong","family":"Ma","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6391-8343","authenticated-orcid":false,"given":"Jiao","family":"Li","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,7,30]]},"reference":[{"key":"1459_CR1","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser U, Polosukhin I. Attention is all you need. In: NIPS\u201917. Red Hook, NY, USA; 2017, p. 6000\u20136010."},{"key":"1459_CR2","unstructured":"Devlin J, Chang M, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. In: NAACL-HLT:2019; 2019."},{"key":"1459_CR3","unstructured":"Liu Y, Ott M, Goyal N, et al. RoBERTa: a robustly optimized BERT pretraining approach. In: ArXiv 2019, abs\/1907.11692."},{"key":"1459_CR4","doi-asserted-by":"crossref","unstructured":"Wolf T, Debut L, Sanh V, Chaumond J, Delangue C, Moi A, Cistac P, Rault T, Louf R, Funtowicz M et al. HuggingFace's transformers: state-of-the-art natural language processing. 
ArXiv 2019, abs\/1910.03771.","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"1459_CR5","doi-asserted-by":"crossref","unstructured":"Gardner M, Grus J, Neumann M, Tafjord O, Dasigi P, Liu NF, Peters ME, Schmitz M, Zettlemoyer L. AllenNLP: a deep semantic natural language processing platform. ArXiv 2018, abs\/1803.07640.","DOI":"10.18653\/v1\/W18-2501"},{"key":"1459_CR6","unstructured":"Akbik A, Blythe D, Vollgraf R. Contextual string embeddings for sequence labeling. In: COLING2018:27th international conference on computational linguistics; 2018, p. 1638\u20131649."},{"key":"1459_CR7","unstructured":"Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, M\u00fcller A, Nothman J, Louppe G et al. Scikit-learn: machine learning in python. ArXiv 2012, abs\/1201.0490."},{"key":"1459_CR8","unstructured":"Lemaitre G, Nogueira F, Aridas CK. Imbalanced-learn: a python toolbox to tackle the curse of imbalanced datasets in machine learning. ArXiv 2016, abs\/1609.06570"},{"key":"1459_CR9","unstructured":"Szyma\u0144ski P, Kajdanowicz T. A scikit-based Python environment for performing multi-label classification. ArXiv 2017, abs\/1702.01460 ."},{"key":"1459_CR10","unstructured":"L\u00f6ning M, Bagnall A, Ganesh S, Kazakov V, Lines J, Kir\u00e1ly FJ. sktime: A Unified Interface for Machine Learning with Time Series. ArXiv 2019, abs\/1909.07872."},{"key":"1459_CR11","unstructured":"de Vazelhes W, Carey CJ, Tang Y, Vauquier N, Bellet A. metric-learn: Metric Learning algorithms in python. ArXiv 2019, abs\/1908.04710."},{"key":"1459_CR12","doi-asserted-by":"crossref","unstructured":"Zhao Z, Chen H, Zhang J, Zhao X, Liu T, Lu W, Chen X, Deng H, Ju Q, Du X. UER: An Open-source toolkit for pre-training models. 
In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP): system demonstrations: 1990\u201311\u201301 2019; Hong Kong, China: Association for Computational","DOI":"10.18653\/v1\/D19-3041"},{"key":"1459_CR13","unstructured":"Yang Z, Dai Z, Yang Y, Carbonell J, Salakhutdinov R, Le QV. XLNet: generalized autoregressive pretraining for language understanding. ArXiv 2019, abs\/1906.08237."},{"key":"1459_CR14","unstructured":"Lample G, Conneau A. Cross-lingual Language Model Pretraining. ArXiv 2019, abs\/1901.0729."},{"key":"1459_CR15","unstructured":"Sanh V, Debut L, Chaumond J, Wolf T. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. ArXiv 2019, abs\/1910.01108."},{"key":"1459_CR16","unstructured":"Lan Z, Chen M, Goodman S, Gimpel K, Sharma P, Soricut R. ALBERT: a Lite BERT for self-supervised learning of language representations. ArXiv 2019, abs\/1909.11942."},{"key":"1459_CR17","unstructured":"NumPy. https:\/\/numpy.org\/. Accessed 21 Aug 2020"},{"key":"1459_CR18","unstructured":"pandas: Python data analysis library. https:\/\/pandas.pydata.org\/index.html. Accessed 21 Aug 2020"},{"key":"1459_CR19","unstructured":"Google Research.GitHub Repository. https:\/\/github.com\/google-research\/bert. Accessed 21 Aug 2020"},{"key":"1459_CR20","unstructured":"CHIP: Short text classification for clinical trial screening criteria. http:\/\/www.cips-chip.org.cn:8088\/evaluation. Accessed 21 Aug 2020"},{"key":"1459_CR21","unstructured":"Wei C, Peng Y, Leaman R, Davis AP, Mattingly CJ, Li J, Wiegers TC, Lu Z. Overview of the BioCreative V chemical disease relation (CDR) task. In: Proceedings of the fifth biocreative challenge evaluation workshop:2015; 2015: 154\u2013166."},{"key":"1459_CR22","unstructured":"Cloud A: Alibaba Cloud Labeled Chinese Dataset for diabetes. https:\/\/tianchi.aliyun.com\/dataset\/dataDetail?dataId=22288. 
Accessed 21 Aug 2020"},{"issue":"14","key":"1459_CR23","doi-asserted-by":"publisher","first-page":"i49","DOI":"10.1093\/bioinformatics\/btx238","volume":"33","author":"G So\u011fanc\u0131o\u011flu","year":"2017","unstructured":"So\u011fanc\u0131o\u011flu G, \u00d6zt\u00fcrk H, \u00d6zg\u00fcr A. BIOSSES: a semantic sentence similarity estimation system for the biomedical domain. Bioinformatics. 2017;33(14):i49\u201358.","journal-title":"Bioinformatics"}],"container-title":["BMC Medical Informatics and Decision Making"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-021-01459-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12911-021-01459-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-021-01459-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,30]],"date-time":"2021-07-30T09:15:10Z","timestamp":1627636510000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcmedinformdecismak.biomedcentral.com\/articles\/10.1186\/s12911-021-01459-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7]]},"references-count":23,"journal-issue":{"issue":"S2","published-print":{"date-parts":[[2021,7]]}},"alternative-id":["1459"],"URL":"https:\/\/doi.org\/10.1186\/s12911-021-01459-0","relation":{},"ISSN":["1472-6947"],"issn-type":[{"value":"1472-6947","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,7]]},"assertion":[{"value":"21 February 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 March 
2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 July 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"90"}}