{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,25]],"date-time":"2026-02-25T17:15:14Z","timestamp":1772039714166,"version":"3.50.1"},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"9","license":[{"start":{"date-parts":[[2023,9,5]],"date-time":"2023-09-05T00:00:00Z","timestamp":1693872000000},"content-version":"vor","delay-in-days":4,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000092","name":"National Library of Medicine","doi-asserted-by":"publisher","award":["R01LM013519"],"award-info":[{"award-number":["R01LM013519"]}],"id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,9,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Automated extraction of population, intervention, comparison\/control, and outcome (PICO) from the randomized controlled trial (RCT) abstracts is important for evidence synthesis. Previous studies have demonstrated the feasibility of applying natural language processing (NLP) for PICO extraction. However, the performance is not optimal due to the complexity of PICO information in RCT abstracts and the challenges involved in their annotation.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We propose a two-step NLP pipeline to extract PICO elements from RCT abstracts: (i) sentence classification using a prompt-based learning model and (ii) PICO extraction using a named entity recognition (NER) model. First, the sentences in abstracts were categorized into four sections namely background, methods, results, and conclusions. 
Next, the NER model was applied to extract the PICO elements from the sentences within the title and methods sections that include &gt;96% of PICO information. We evaluated our proposed NLP pipeline on three datasets, the EBM-NLPmod dataset, a randomly selected and re-annotated dataset of 500 RCT abstracts from the EBM-NLP corpus, a dataset of 150 Coronavirus Disease 2019 (COVID-19) RCT abstracts, and a dataset of 150 Alzheimer\u2019s disease (AD) RCT abstracts. The end-to-end evaluation reveals that our proposed approach achieved an overall micro F1 score of 0.833 on the EBM-NLPmod dataset, 0.928 on the COVID-19 dataset, and 0.899 on the AD dataset when measured at the token-level and an overall micro F1 score of 0.712 on EBM-NLPmod dataset, 0.850 on the COVID-19 dataset, and 0.805 on the AD dataset when measured at the entity-level.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Our codes and datasets are publicly available at https:\/\/github.com\/BIDS-Xu-Lab\/section_specific_annotation_of_PICO.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad542","type":"journal-article","created":{"date-parts":[[2023,9,4]],"date-time":"2023-09-04T13:53:25Z","timestamp":1693835605000},"source":"Crossref","is-referenced-by-count":19,"title":["Towards precise PICO extraction from abstracts of randomized controlled trials using a section-specific learning approach"],"prefix":"10.1093","volume":"39","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-2413-5918","authenticated-orcid":false,"given":"Yan","family":"Hu","sequence":"first","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77054, United States"}]},{"given":"Vipina K","family":"Keloth","sequence":"additional","affiliation":[{"name":"Section of Biomedical Informatics and Data Science, School of 
Medicine, Yale University, 100 College St, New Haven, CT 06510, United States"}]},{"given":"Kalpana","family":"Raja","sequence":"additional","affiliation":[{"name":"Section of Biomedical Informatics and Data Science, School of Medicine, Yale University, 100 College St, New Haven, CT 06510, United States"}]},{"given":"Yong","family":"Chen","sequence":"additional","affiliation":[{"name":"Center for Health Analytics and Synthesis of Evidence (CHASE), Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, 423 Guardian Dr, Philadelphia, PA 19104, United States"},{"name":"Penn Medicine Center for Evidence-based Practice (CEP), University of Pennsylvania, 3600 Civic Center Blvd, Philadelphia, PA 19104, United States"}]},{"given":"Hua","family":"Xu","sequence":"additional","affiliation":[{"name":"Section of Biomedical Informatics and Data Science, School of Medicine, Yale University, 100 College St, New Haven, CT 06510, United States"}]}],"member":"286","published-online":{"date-parts":[[2023,9,5]]},"reference":[{"key":"2023091405003203400_btad542-B1","doi-asserted-by":"crossref","first-page":"1637","DOI":"10.1016\/S0140-6736(21)00676-0","article-title":"Tocilizumab in patients admitted to hospital with COVID-19 (RECOVERY): a randomised, controlled, open-label, platform trial","volume":"397","author":"Abani","year":"2021","journal-title":"Lancet"},{"key":"2023091405003203400_btad542-B2","first-page":"221","author":"Alrowili","year":"2021"},{"key":"2023091405003203400_btad542-B3","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1186\/1472-6947-10-29","article-title":"Combining classifiers for robust PICO element detection","volume":"10","author":"Boudin","year":"2010","journal-title":"BMC Med Inform Decis 
Mak"},{"key":"2023091405003203400_btad542-B4","first-page":"1","author":"Chabou","year":"2015"},{"key":"2023091405003203400_btad542-B5","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1186\/s12911-018-0699-2","article-title":"Combination of conditional random field with a rule based method in the extraction of PICO elements","volume":"18","author":"Chabou","year":"2018","journal-title":"BMC Med Inform Decis Mak"},{"key":"2023091405003203400_btad542-B6","first-page":"121","author":"Chung","year":"2007"},{"key":"2023091405003203400_btad542-B7","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1186\/1472-6947-9-10","article-title":"Sentence retrieval for abstracts of randomized controlled trials","volume":"9","author":"Chung","year":"2009","journal-title":"BMC Med Inform Decis Mak"},{"key":"2023091405003203400_btad542-B8","author":"Cohan","year":"2019"},{"key":"2023091405003203400_btad542-B9","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1162\/coli.2007.33.1.63","article-title":"Answering clinical questions with knowledge-based and statistical techniques","volume":"33","author":"Demner-Fushman","year":"2007","journal-title":"Comput Linguist"},{"key":"2023091405003203400_btad542-B10","author":"Dernoncourt","year":"2017"},{"key":"2023091405003203400_btad542-B11","author":"Dernoncourt","year":"2016"},{"key":"2023091405003203400_btad542-B12","author":"Devlin","year":"2018"},{"key":"2023091405003203400_btad542-B13","first-page":"65","author":"Dhrangadhariya","year":"2021"},{"key":"2023091405003203400_btad542-B14","author":"Ding","year":"2021"},{"key":"2023091405003203400_btad542-B15","doi-asserted-by":"crossref","first-page":"106222","DOI":"10.1016\/j.ijantimicag.2020.106222","article-title":"Randomised controlled trials for COVID-19: evaluation of optimal randomisation methodologies\u2014need for data validation of the completed trials and to improve ongoing and future randomised trial 
designs","volume":"57","author":"Emani","year":"2021","journal-title":"Int J Antimicrob Agents"},{"key":"2023091405003203400_btad542-B16","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13054-020-03406-3","article-title":"The adaptive COVID-19 treatment trial-1 (ACTT-1) in a real-world population: a comparative observational study","volume":"24","author":"Frost","year":"2020","journal-title":"Crit Care"},{"key":"2023091405003203400_btad542-B17","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3458754","article-title":"Domain-specific language model pretraining for biomedical natural language processing","volume":"3","author":"Gu","year":"2022","journal-title":"ACM Trans Comput Healthc"},{"key":"2023091405003203400_btad542-B18","author":"Hirohata","year":"2008"},{"key":"2023091405003203400_btad542-B19","first-page":"606","author":"Hu","year":"2022"},{"key":"2023091405003203400_btad542-B20","first-page":"359","article-title":"Evaluation of PICO as a knowledge representation for clinical questions","volume":"2006","author":"Huang","year":"2006","journal-title":"AMIA Annu Symp Proc"},{"key":"2023091405003203400_btad542-B21","author":"Jin","year":"2018"},{"key":"2023091405003203400_btad542-B22","first-page":"67","author":"Jin","year":"2018"},{"key":"2023091405003203400_btad542-B23","doi-asserted-by":"crossref","first-page":"812","DOI":"10.1093\/jamia\/ocaa309","article-title":"UMLS-based data augmentation for natural language processing of clinical research literature","volume":"28","author":"Kang","year":"2021","journal-title":"J Am Med Inform Assoc"},{"key":"2023091405003203400_btad542-B24","first-page":"188","article-title":"Pretraining to recognize PICO elements from randomized controlled trial literature","volume":"264","author":"Kang","year":"2019","journal-title":"Stud Health Technol 
Inform"},{"key":"2023091405003203400_btad542-B25","doi-asserted-by":"crossref","first-page":"S5","DOI":"10.1186\/1471-2105-12-S2-S5","article-title":"Automatic classification of sentences to support evidence based medicine","volume":"12(Suppl 2)","author":"Kim","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023091405003203400_btad542-B26","doi-asserted-by":"crossref","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: a pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee","year":"2020","journal-title":"Bioinformatics"},{"key":"2023091405003203400_btad542-B27","first-page":"65","author":"Lin","year":"2006"},{"key":"2023091405003203400_btad542-B28","author":"Liu","year":"2021"},{"key":"2023091405003203400_btad542-B29","author":"Liu","year":"2021"},{"key":"2023091405003203400_btad542-B30","first-page":"440","article-title":"Categorization of sentence types in medical abstracts","volume":"2003","author":"McKnight","year":"2003","journal-title":"AMIA Annu Symp Proc"},{"key":"2023091405003203400_btad542-B31","first-page":"299","article-title":"Aggregating and predicting sequence labels from crowd annotations","volume":"2017","author":"Nguyen","year":"2017","journal-title":"Proc Conf Assoc Comput Linguist Meet"},{"key":"2023091405003203400_btad542-B32","first-page":"197","article-title":"A corpus with multi-level annotations of patients, interventions and outcomes to support language processing for medical literature","volume":"2018","author":"Nye","year":"2018","journal-title":"Proc Conf Assoc Comput Linguist Meet"},{"key":"2023091405003203400_btad542-B33","author":"Peng","year":"2019"},{"key":"2023091405003203400_btad542-B34","author":"Petroni","year":"2019"},{"key":"2023091405003203400_btad542-B35","doi-asserted-by":"crossref","first-page":"A12","DOI":"10.7326\/ACPJC-1995-123-3-A12","article-title":"The well-built clinical question: a key to evidence-based 
decisions","volume":"123","author":"Richardson","year":"1995","journal-title":"ACP J Club"},{"key":"2023091405003203400_btad542-B36","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/j.ijmedinf.2006.05.002","article-title":"Using argumentation to extract key sentences from biomedical abstracts","volume":"76","author":"Ruch","year":"2007","journal-title":"Int J Med Inform"},{"key":"2023091405003203400_btad542-B37","first-page":"198","author":"Shang","year":"2021"},{"key":"2023091405003203400_btad542-B38","author":"Shimbo","year":"2003"},{"key":"2023091405003203400_btad542-B39","author":"Wei","year":"2019"},{"key":"2023091405003203400_btad542-B40","first-page":"824","article-title":"Combining text classification and hidden Markov modeling techniques for structuring randomized clinical trial abstracts","volume":"2006","author":"Xu","year":"2006","journal-title":"AMIA Annu Symp Proc"},{"key":"2023091405003203400_btad542-B41","first-page":"871","author":"Yamada","year":"2020"},{"key":"2023091405003203400_btad542-B42","author":"Zhang","year":"2020"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad542\/51359143\/btad542.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/9\/btad542\/51546696\/btad542.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/9\/btad542\/51546696\/btad542.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,14]],"date-time":"2023-09-14T05:38:42Z","timestamp":1694669922000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/b
ioinformatics\/btad542\/7260503"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,9,1]]},"references-count":42,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2023,9,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad542","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,9,1]]},"published":{"date-parts":[[2023,9,1]]},"article-number":"btad542"}}