{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,22]],"date-time":"2026-04-22T04:54:24Z","timestamp":1776833664318,"version":"3.51.2"},"reference-count":37,"publisher":"Springer Science and Business Media LLC","issue":"S2","license":[{"start":{"date-parts":[[2021,7,1]],"date-time":"2021-07-01T00:00:00Z","timestamp":1625097600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,7,30]],"date-time":"2021-07-30T00:00:00Z","timestamp":1627603200000},"content-version":"vor","delay-in-days":29,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"2019 Guangzhou Innovation and Entrepreneurship Leader Team","award":["20190901008"],"award-info":[{"award-number":["20190901008"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["U1711266"],"award-info":[{"award-number":["U1711266"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["9174620"],"award-info":[{"award-number":["9174620"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61772146"],"award-info":[{"award-number":["61772146"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Guangdong Provincial Key R&D Programme","award":["2019B010153001"],"award-info":[{"award-number":["2019B010153001"]}]},{"DOI":"10.13039\/501100003453","name":"Natural Science Foundation of Guangdong Province","doi-asserted-by":"crossref","award":["2021A1515011339"],"award-info":[{"award-number":["2021A1515011339"]}],"id":[{"id":"10.13039\/501100003453","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Med Inform Decis Mak"],"published-print":{"date-parts":[[2021,7]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Eligibility criteria are the primary strategy for screening the target participants of a clinical trial. Automated classification of clinical trial eligibility criteria text by using machine learning methods improves recruitment efficiency to reduce the cost of clinical research. However, existing methods suffer from poor classification performance due to the complexity and imbalance of eligibility criteria text data.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Methods<\/jats:title>\n                <jats:p>An ensemble learning-based model with metric learning is proposed for eligibility criteria classification. The model integrates a set of pre-trained models including Bidirectional Encoder Representations from Transformers (BERT), A Robustly Optimized BERT Pretraining Approach (RoBERTa), XLNet, Pre-training Text Encoders as Discriminators Rather Than Generators (ELECTRA), and Enhanced Representation through Knowledge Integration (ERNIE). Focal Loss is used as a loss function to address the data imbalance problem. Metric learning is employed to train the embedding of each base model for feature distinguish. Soft Voting is applied to achieve final classification of the ensemble model. The dataset is from the standard evaluation task 3 of 5th China Health Information Processing Conference containing 38,341 eligibility criteria text in 44 categories.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>Our ensemble method had an accuracy of 0.8497, a precision of 0.8229, and a recall of 0.8216 on the dataset. The macro F1-score was 0.8169, outperforming state-of-the-art baseline methods by 0.84% improvement on average. In addition, the performance improvement had a p-value of 2.152e-07 with a standard t-test, indicating that our model achieved a significant improvement.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>A model for classifying eligibility criteria text of clinical trials based on multi-model ensemble learning and metric learning was proposed. The experiments demonstrated that the classification performance was improved by our ensemble model significantly. In addition, metric learning was able to improve word embedding representation and the focal loss reduced the impact of data imbalance to model performance.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12911-021-01492-z","type":"journal-article","created":{"date-parts":[[2021,7,30]],"date-time":"2021-07-30T09:03:36Z","timestamp":1627635816000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Automated classification of clinical trial eligibility criteria text based on ensemble learning and metric learning"],"prefix":"10.1186","volume":"21","author":[{"given":"Kun","family":"Zeng","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yibin","family":"Xu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ge","family":"Lin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Likeng","family":"Liang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9792-3949","authenticated-orcid":false,"given":"Tianyong","family":"Hao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,7,30]]},"reference":[{"key":"1492_CR1","unstructured":"He Z, Carini S, Hao T, Sim I, Weng C. A method for analyzing commonalities in clinical trial target populations. In: AMIA 2014 annual symposium (AMIA), November 15\u201319, 2014;777\u20131786."},{"key":"1492_CR2","doi-asserted-by":"publisher","first-page":"112","DOI":"10.1016\/j.jbi.2014.01.009","volume":"52","author":"T Hao","year":"2014","unstructured":"Hao T, Rusanov A, Boland MR, Weng C. Clustering clinical trials with similar eligibility criteria features. J Biomed Inform. 2014;52:112\u201320.","journal-title":"J Biomed Inform"},{"issue":"6","key":"1492_CR3","first-page":"869","volume":"16","author":"SR Thadani","year":"2009","unstructured":"Thadani SR, Weng C, Bigger JT, Ennever JF, Wajngurt D. Case report: electronic screening improves efficiency in clinical trial recruitment. JAMIA. 2009;16(6):869\u201373.","journal-title":"JAMIA"},{"issue":"6","key":"1492_CR4","doi-asserted-by":"publisher","first-page":"365","DOI":"10.1200\/JOP.2012.000646","volume":"8","author":"L Penberthy","year":"2012","unstructured":"Penberthy L, Dahman B, Petkov V, et al. Effort required in eligibility screening for clinical trials. J Oncol Pract. 2012;8(6):365\u201370.","journal-title":"J Oncol Pract"},{"key":"1492_CR5","doi-asserted-by":"publisher","first-page":"114","DOI":"10.1016\/j.ijmedinf.2019.05.019","volume":"129","author":"C Gulden","year":"2019","unstructured":"Gulden C, Kirchner M, Sch\u00fcttler C, Hinderer M, Kampf MO, Prokosch H-U, Toddenroth D. Extractive summarization of clinical trial descriptions. Int J Med Inform. 2019;129:114\u201321.","journal-title":"Int J Med Inform"},{"issue":"5","key":"1492_CR6","doi-asserted-by":"publisher","first-page":"530","DOI":"10.1093\/jamia\/ocx160","volume":"25","author":"H Wu","year":"2018","unstructured":"Wu H, Toti G, Morley KI, Ibrahim ZM, Folarin A, Jackson R, et al. SemEHR: a general-purpose semantic search system to surface semantic data from clinical notes for tailored care, trial recruitment, and clinical research. J Am Med Inform Assoc. 2018;25(5):530\u20137.","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"1492_CR7","doi-asserted-by":"publisher","first-page":"132","DOI":"10.1093\/bib\/bbv024","volume":"17","author":"C-C Huang","year":"2016","unstructured":"Huang C-C, Zhiyong Lu. Community challenges in biomedical text mining over 10 years: success, failure and the future. Brief Bioinform. 2016;17(1):132\u201344.","journal-title":"Brief Bioinform"},{"issue":"4","key":"1492_CR8","doi-asserted-by":"publisher","first-page":"453","DOI":"10.1007\/s10115-006-0013-y","volume":"10","author":"T Li","year":"2006","unstructured":"Li T, Zhu S, Ogihara M. Using discriminant analysis for multi-class classification: an experimental investigation. Knowl Inf Syst. 2006;10(4):453\u201372.","journal-title":"Knowl Inf Syst"},{"issue":"2","key":"1492_CR9","first-page":"159","volume":"19-S","author":"B Chen","year":"2019","unstructured":"Chen B, Jin H, Yang Z, Qu Y, Weng H, Hao T. An approach for transgender population information extraction and summarization from clinical trial text. BMC Med Inf Decis Mak. 2019;19-S(2):159\u201370.","journal-title":"BMC Med Inf Decis Mak"},{"key":"1492_CR10","unstructured":"Tseo Y, Salkola M I, Mohamed A, et al. Information extraction of clinical trial eligibility criteria 2020; arXiv preprint arXiv:2006.07296."},{"issue":"6","key":"1492_CR11","doi-asserted-by":"publisher","first-page":"1062","DOI":"10.1093\/jamia\/ocx019","volume":"24","author":"T Kang","year":"2017","unstructured":"Kang T, Zhang S, Tang Y, et al. EliIE: an open-source information extraction system for clinical trial eligibility criteria. J Am Med Inform Assoc. 2017;24(6):1062\u201371.","journal-title":"J Am Med Inform Assoc"},{"key":"1492_CR12","unstructured":"Luo Z, Johnson SB, Lai AM, et al. Extracting temporal constraints from clinical research eligibility criteria using conditional random fields. In: AMIA annual symposium proceedings. Am Med Inform Assoc. 2011;2011:843."},{"issue":"6","key":"1492_CR13","doi-asserted-by":"publisher","first-page":"927","DOI":"10.1016\/j.jbi.2011.06.001","volume":"44","author":"Z Luo","year":"2011","unstructured":"Luo Z, Yetisgen-Yildiz M, Weng C. Dynamic categorization of clinical research eligibility criteria by hierarchical clustering. J Biomed Inform. 2011;44(6):927\u201335.","journal-title":"J Biomed Inform"},{"key":"1492_CR14","doi-asserted-by":"crossref","unstructured":"Chuan CH. Classifying eligibility criteria in clinical trials using active deep learning. In: 17th IEEE international conference on machine learning and applications (ICMLA). IEEE 2018;305\u2013310.","DOI":"10.1109\/ICMLA.2018.00052"},{"issue":"7553","key":"1492_CR15","doi-asserted-by":"publisher","first-page":"436","DOI":"10.1038\/nature14539","volume":"521","author":"Y LeCun","year":"2015","unstructured":"LeCun Y, Bengio Y, Hinton GE. Deep learning. Nature. 2015;521(7553):436\u201344.","journal-title":"Nature"},{"key":"1492_CR16","unstructured":"Kaljahi, R., Foster, J. Any-gram kernels for sentence classification: a sentiment analysis case study. lthaca, New York: arXiv preprint 2017."},{"key":"1492_CR17","doi-asserted-by":"crossref","unstructured":"Kim Y. Convolutional neural networks for sentence classification. EMNLP:2014;1746\u20131751.","DOI":"10.3115\/v1\/D14-1181"},{"key":"1492_CR18","doi-asserted-by":"crossref","unstructured":"Lee JY, Dernoncourt F. Sequential short-text classification with recurrent and convolutional neural networks. HLT-NAACL. 2016;515\u2013520.","DOI":"10.18653\/v1\/N16-1062"},{"key":"1492_CR19","first-page":"443","volume":"2","author":"ST Hsu","year":"2017","unstructured":"Hsu ST, Moon C, Jones P, et al. A Hybrid CNN-RNN alignment model for phrase-aware sentence classification. EACL. 2017;2:443\u20139.","journal-title":"EACL"},{"key":"1492_CR20","unstructured":"Zhou P, Qi Z, Zheng S, et al. Text classification improved by integrating bidirectional lstm with two-dimensional max pooling. Coling: 3485\u20133495; 2016."},{"key":"1492_CR21","first-page":"4171","volume":"1","author":"J Devlin","year":"2019","unstructured":"Devlin J, Chang M-W, Lee K, et al. BERT, pre-training of deep bidirectional transformers for language understanding. NAACL-HLT. 2019;1:4171\u201386.","journal-title":"NAACL-HLT"},{"key":"1492_CR22","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I. Attention is all you need. NIPS: 2017;5998\u20136008."},{"key":"1492_CR23","doi-asserted-by":"crossref","unstructured":"Zhang K, Demner-Fushman D. Automated classification of eligibility criteria in clinical trials to facilitate patient-trial matching for specific patient populations. J Am Med Inform Assoc. 2017.","DOI":"10.1093\/jamia\/ocw176"},{"key":"1492_CR24","doi-asserted-by":"crossref","unstructured":"Stubbs A et al. Cohort selection for clinical trials. n2c2 2018 shared task track 1. J Am Med Inform Assoc.  2019.","DOI":"10.1093\/jamia\/ocz163"},{"issue":"11","key":"1492_CR25","doi-asserted-by":"publisher","first-page":"1247","DOI":"10.1093\/jamia\/ocz149","volume":"26","author":"M Olevnik","year":"2019","unstructured":"Olevnik M, Kugic A, Kasac Z, Kreuzthaler M. Evaluating shallow and deep learning strategies for the 2018 N2c2 shared task on clinical text classification. J Am Med Inform Assoc. 2019;26(11):1247\u201354.","journal-title":"J Am Med Inform Assoc"},{"issue":"33","key":"1492_CR26","doi-asserted-by":"publisher","first-page":"3781","DOI":"10.1200\/JCO.2017.74.4144","volume":"35","author":"L Gore","year":"2017","unstructured":"Gore L, Ivy SP, Balis FM, et al. modernizing clinical trial eligibility: recommendations of the American Society of Clinical Oncology-friends of cancer research minimum age working group. J Clin Oncol. 2017;35(33):3781\u20137.","journal-title":"J Clin Oncol"},{"issue":"33","key":"1492_CR27","doi-asserted-by":"publisher","first-page":"3774","DOI":"10.1200\/JCO.2017.73.7338","volume":"35","author":"TS Uldrick","year":"2017","unstructured":"Uldrick TS, Ison G, Rudek M, et al. Modernizing clinical trial eligibility criteria: recommendations of the American Society of Clinical Oncology-friends of cancer research HIV Working Group. J Clin Oncol. 2017;35(33):3774\u201380.","journal-title":"J Clin Oncol"},{"issue":"33","key":"1492_CR28","doi-asserted-by":"publisher","first-page":"3753","DOI":"10.1200\/JCO.2017.74.4102","volume":"35","author":"SM Lichtman","year":"2017","unstructured":"Lichtman SM, Harvey RD, Damiette SMA, et al. Modernizing clinical trial eligibility criteria: recommendations of the American Society of Clinical Oncology-Friends of Cancer Research Organ Dysfunction, Prior or Concurrent Malignancy, and Comorbidities Working Group. J Clin Oncol. 2017;35(33):3753\u20139.","journal-title":"J Clin Oncol"},{"issue":"33","key":"1492_CR29","doi-asserted-by":"publisher","first-page":"3760","DOI":"10.1200\/JCO.2017.74.0761","volume":"35","author":"NU Lin","year":"2017","unstructured":"Lin NU, Prowell T, Tan AR, et al. modernizing clinical trial eligibility criteria: recommendations of the American Society of Clinical Oncology-Friends of Cancer Research Brain Metastases Working Group. JCO. 2017;35(33):3760\u201373.","journal-title":"JCO"},{"key":"1492_CR30","unstructured":"Xing EP, Ng AY, Jordan MI, Russell S. Distance metric learning with application to clustering with side-information. In: Advances in neural information processing systems. 2003;521\u2013528."},{"key":"1492_CR31","unstructured":"Weinberger KQ, Blitzer J, Saul LK. Distance metric learning for large mar-gin nearest neighbor classification. In: Advances inneural information processing systems. 2006;1473\u20131480."},{"issue":"2","key":"1492_CR32","doi-asserted-by":"publisher","first-page":"573","DOI":"10.1109\/TIP.2012.2219547","volume":"22","author":"M Gong","year":"2013","unstructured":"Gong M, Liang Y, Shi J, Ma W, Ma J. Fuzzy c-means clustering with local information and kernel metric for image segmentation. IEEE Trans Image Process. 2013;22(2):573\u201384.","journal-title":"IEEE Trans Image Process"},{"key":"1492_CR33","doi-asserted-by":"crossref","unstructured":"Guillaumin M, Verbeek J, Schmid C. Is that you? Metric learning approaches for face identification. In: 2009 IEEE 12th international conference on computer vision, 2009;498\u2013505. IEEE.","DOI":"10.1109\/ICCV.2009.5459197"},{"key":"1492_CR34","doi-asserted-by":"crossref","unstructured":"Xu Z, Chen M, Weinberger KQ, Sha F. From sbow to dcotmarginalized encoders for text representation. In: Proceedings of the 21st ACM international conference on information and knowledge management, CIKM 12, 2012;1879\u20131884, New York, NY, USA. ACM.","DOI":"10.1145\/2396761.2398536"},{"key":"1492_CR35","doi-asserted-by":"crossref","unstructured":"Hsieh CK, Yang L, Cui Y, Lin TY, Belongie S, Estrin D. Collaborative metric learning. In: Proceedings of the26th international conference on world wide web, 2017;193\u2013201. International World Wide Web Conferences Steering Committee.","DOI":"10.1145\/3038912.3052639"},{"key":"1492_CR36","unstructured":"Amit Mandelbaum and Daphna Weinshall. Distance-based confidence score for neural network classifiers. 2017;arXiv preprint arXiv:1709.09844."},{"key":"1492_CR37","doi-asserted-by":"crossref","unstructured":"Lin TY, Goyal P, Girshick R, He K, Doll\u00e1r P. Focal loss for dense object detection. ICCV: 2017;2999\u20133007.","DOI":"10.1109\/ICCV.2017.324"}],"container-title":["BMC Medical Informatics and Decision Making"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-021-01492-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12911-021-01492-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-021-01492-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,30]],"date-time":"2021-07-30T09:07:43Z","timestamp":1627636063000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcmedinformdecismak.biomedcentral.com\/articles\/10.1186\/s12911-021-01492-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7]]},"references-count":37,"journal-issue":{"issue":"S2","published-print":{"date-parts":[[2021,7]]}},"alternative-id":["1492"],"URL":"https:\/\/doi.org\/10.1186\/s12911-021-01492-z","relation":{},"ISSN":["1472-6947"],"issn-type":[{"value":"1472-6947","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,7]]},"assertion":[{"value":"15 March 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 April 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 July 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"129"}}