{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T20:50:29Z","timestamp":1760215829500,"version":"3.37.3"},"reference-count":31,"publisher":"Oxford University Press (OUP)","issue":"11","license":[{"start":{"date-parts":[[2019,7,15]],"date-time":"2019-07-15T00:00:00Z","timestamp":1563148800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objective<\/jats:title>\n                  <jats:p>Cohort selection for clinical trials is a key step for clinical research. We proposed a hierarchical neural network to determine whether a patient satisfied selection criteria or not.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and Methods<\/jats:title>\n                  <jats:p>We designed a hierarchical neural network (denoted as CNN-Highway-LSTM or LSTM-Highway-LSTM) for the track 1 of the national natural language processing (NLP) clinical challenge (n2c2) on cohort selection for clinical trials in 2018. 
The neural network is composed of 5 components: (1) sentence representation using convolutional neural network (CNN) or long short-term memory (LSTM) network; (2) a highway network to adjust information flow; (3) a self-attention neural network to reweight sentences; (4) document representation using LSTM, which takes sentence representations in chronological order as input; (5) a fully connected neural network to determine whether each criterion is met or not. We compared the proposed method with its variants, including the methods only using the first component to represent documents directly and the fully connected neural network for classification (denoted as CNN-only or LSTM-only) and the methods without using the highway network (denoted as CNN-LSTM or LSTM-LSTM). The performance of all methods was measured by micro-averaged precision, recall, and F1 score.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>The micro-averaged F1 scores of CNN-only, LSTM-only, CNN-LSTM, LSTM-LSTM, CNN-Highway-LSTM, and LSTM-Highway-LSTM were 85.24%, 84.25%, 87.27%, 88.68%, 88.48%, and 90.21%, respectively. The highest micro-averaged F1 score is higher than our submitted one of 88.55%, which is one of the top-ranked results in the challenge. The results indicate that the proposed method is effective for cohort selection for clinical trials.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Discussion<\/jats:title>\n                  <jats:p>Although the proposed method achieved promising results, some mistakes were caused by word ambiguity, negation, number analysis and incomplete dictionary. 
Moreover, imbalanced data was another challenge that needs to be tackled in the future.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Conclusion<\/jats:title>\n                  <jats:p>In this article, we proposed a hierarchical neural network for cohort selection. Experimental results show that this method is good at selecting cohort.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocz099","type":"journal-article","created":{"date-parts":[[2019,6,13]],"date-time":"2019-06-13T19:13:24Z","timestamp":1560453204000},"page":"1203-1208","source":"Crossref","is-referenced-by-count":26,"title":["Cohort selection for clinical trials using hierarchical neural network"],"prefix":"10.1093","volume":"26","author":[{"given":"Ying","family":"Xiong","sequence":"first","affiliation":[{"name":"Department of Computer Science, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China"}]},{"given":"Xue","family":"Shi","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China"}]},{"given":"Shuai","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China"}]},{"given":"Dehuan","family":"Jiang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China"}]},{"given":"Buzhou","family":"Tang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China"}]},{"given":"Xiaolong","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, 
China"}]},{"given":"Qingcai","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China"}]},{"given":"Jun","family":"Yan","sequence":"additional","affiliation":[{"name":"Yidu Cloud (Beijing) Technology Co., Ltd, Beijing, China"}]}],"member":"286","published-online":{"date-parts":[[2019,7,15]]},"reference":[{"issue":"4","key":"2021012411200660000_ocz099-B1","doi-asserted-by":"crossref","first-page":"440","DOI":"10.1136\/jamia.2010.003707","article-title":"Symbolic rule-based classification of lung cancer stages from free-text pathology reports","volume":"17","author":"Nguyen","year":"2010","journal-title":"J Am Med Inform Assoc"},{"key":"2021012411200660000_ocz099-B2","doi-asserted-by":"crossref","first-page":"120","DOI":"10.1016\/j.ijmedinf.2016.09.014","article-title":"A machine learning-based framework to identify type 2 diabetes through electronic health records","volume":"97","author":"Zheng","year":"2017","journal-title":"Int J Med Inform"},{"issue":"5","key":"2021012411200660000_ocz099-B3","doi-asserted-by":"crossref","first-page":"859","DOI":"10.1136\/amiajnl-2011-000535","article-title":"Automated extraction of ejection fraction for quality measurement using regular expressions in Unstructured Information Management Architecture (UIMA) for heart failure","volume":"19","author":"Garvin","year":"2012","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"2021012411200660000_ocz099-B4","doi-asserted-by":"crossref","first-page":"30.","DOI":"10.1186\/1472-6947-6-30","article-title":"Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system","volume":"6","author":"Zeng","year":"2006","journal-title":"BMC Med Inform Dec 
Making"},{"issue":"4","key":"2021012411200660000_ocz099-B5","doi-asserted-by":"crossref","first-page":"561","DOI":"10.1197\/jamia.M3115","article-title":"Recognizing obesity and comorbidities in sparse data","volume":"16","author":"Uzuner","year":"2009","journal-title":"J Am Med Inform Assoc"},{"issue":"2","key":"2021012411200660000_ocz099-B6","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1136\/amiajnl-2013-001935","article-title":"A review of approaches to identifying patient phenotype cohorts using electronic health records","volume":"21","author":"Shivade","year":"2014","journal-title":"J Am Med Inform Assoc"},{"issue":"4","key":"2021012411200660000_ocz099-B7","doi-asserted-by":"crossref","first-page":"731","DOI":"10.1093\/jamia\/ocw011","article-title":"Electronic medical record phenotyping using the anchor and learn framework","volume":"23","author":"Halpern","year":"2016","journal-title":"J Am Med Inform Assoc"},{"issue":"3","key":"2021012411200660000_ocz099-B8","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1016\/j.cct.2010.03.005","article-title":"Automated matching software for clinical trials eligibility: measuring efficiency and flexibility","volume":"31","author":"Penberthy","year":"2010","journal-title":"Contemporary Clinical Trials"},{"issue":"2","key":"2021012411200660000_ocz099-B9","doi-asserted-by":"crossref","first-page":"212","DOI":"10.1136\/amiajnl-2011-000439","article-title":"Use of diverse electronic medical record systems to identify genetic risk for type 2 diabetes within a genome-wide association study","volume":"19","author":"Kho","year":"2012","journal-title":"J Am Med Inform Assoc"},{"issue":"7","key":"2021012411200660000_ocz099-B10","doi-asserted-by":"crossref","first-page":"e2626.","DOI":"10.1371\/journal.pone.0002626","article-title":"Automated identification of acute hepatitis B using electronic medical record data to facilitate public health 
surveillance","volume":"3","author":"Klompas","year":"2008","journal-title":"PLOS One"},{"issue":"6","key":"2021012411200660000_ocz099-B11","doi-asserted-by":"crossref","first-page":"270","DOI":"10.1177\/106286069901400607","article-title":"Identifying persons with diabetes using Medicare claims data","volume":"14","author":"Hebert","year":"1999","journal-title":"Am J Med Qual"},{"key":"2021012411200660000_ocz099-B12","first-page":"505\u201311.","article-title":"Risk stratification of ICU patients using topic models inferred from unstructured progress notes","volume":"2012","author":"Lehman","year":"2012","journal-title":"AMIA Annu Symp Proc"},{"issue":"6","key":"2021012411200660000_ocz099-B13","doi-asserted-by":"crossref","first-page":"1191","DOI":"10.1016\/j.jbi.2012.07.008","article-title":"Synergistic effect of different levels of genomic data for cancer clinical outcome prediction","volume":"45","author":"Kim","year":"2012","journal-title":"J. Biomed Inform"},{"key":"2021012411200660000_ocz099-B14","first-page":"1564","article-title":"Extracting and integrating data from entire electronic health records for detecting colorectal cancer cases","volume":"2011","author":"Xu","year":"2011","journal-title":"AMIA Annu Symp Proc"},{"issue":"5","key":"2021012411200660000_ocz099-B15","doi-asserted-by":"crossref","first-page":"993","DOI":"10.1093\/jamia\/ocv034","article-title":"Toward high-throughput phenotyping: unbiased automated feature extraction and selection from knowledge sources","volume":"22","author":"Yu","year":"2015","journal-title":"J Am Med Inform Assoc"},{"issue":"10","key":"2021012411200660000_ocz099-B16","doi-asserted-by":"crossref","first-page":"761.","DOI":"10.1038\/gim.2013.72","article-title":"The electronic medical records and genomics (eMERGE) network: past, present, and future","volume":"15","author":"Gottesman","year":"2013","journal-title":"Genet 
Med"},{"issue":"6","key":"2021012411200660000_ocz099-B17","doi-asserted-by":"crossref","first-page":"1046","DOI":"10.1093\/jamia\/ocv202","article-title":"PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability","volume":"23","author":"Kirby","year":"2016","journal-title":"J Am Med Inform Assoc"},{"key":"2021012411200660000_ocz099-B18","first-page":"606.","article-title":"Type 2 diabetes risk forecasting from EMR data using machine learning","volume":"2012","author":"Mani","year":"2012","journal-title":"AMIA Annu Symp Proc"},{"year":"2017","author":"Che","key":"2021012411200660000_ocz099-B19"},{"year":"2001","author":"Aronson","key":"2021012411200660000_ocz099-B20"},{"key":"2021012411200660000_ocz099-B21","first-page":"237","article-title":"Comparing methods for identifying pancreatic cancer patients using electronic data sources","volume":"2010","author":"Friedlin","year":"2010;","journal-title":"AMIA Annu Symp Proc"},{"key":"2021012411200660000_ocz099-B22","first-page":"1191","article-title":"EpiDEA: extracting structured epilepsy and seizure information from patient discharge summaries for cohort identification","volume":"2012","author":"Cui","year":"2012","journal-title":"AMIA Annu Symp Proc"},{"author":"Jain","key":"2021012411200660000_ocz099-B23"},{"issue":"5","key":"2021012411200660000_ocz099-B24","doi-asserted-by":"crossref","first-page":"507","DOI":"10.1136\/jamia.2009.001560","article-title":"Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications","volume":"17","author":"Savova","year":"2010","journal-title":"J Am Med Inform Assoc"},{"issue":"5","key":"2021012411200660000_ocz099-B25","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1006\/jbin.2001.1029","article-title":"A simple algorithm for identifying negated findings and diseases in discharge summaries","volume":"34","author":"Chapman","year":"2001","journal-title":"J Biomed 
Inform"},{"issue":"1","key":"2021012411200660000_ocz099-B26","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1197\/jamia.M2408","article-title":"Identifying patient smoking status from medical discharge records","volume":"15","author":"Uzuner","year":"2008","journal-title":"J Am Med Inform Assoc"},{"key":"2021012411200660000_ocz099-B27","doi-asserted-by":"crossref","first-page":"S62","DOI":"10.1016\/j.jbi.2017.04.017","article-title":"Symptom severity prediction from neuropsychiatric clinical records: overview of 2016 CEGS N-GRID shared tasks track 2","volume":"75","author":"Filannino","year":"2017","journal-title":"J Biomed Inform"},{"author":"Yang","key":"2021012411200660000_ocz099-B28"},{"first-page":"2377","year":"2015","author":"Srivastava","key":"2021012411200660000_ocz099-B29"},{"author":"Bahdanau","key":"2021012411200660000_ocz099-B30"},{"issue":"5","key":"2021012411200660000_ocz099-B31","doi-asserted-by":"crossref","first-page":"828","DOI":"10.1136\/amiajnl-2013-001635","article-title":"A hybrid system for temporal information extraction from clinical text","volume":"20","author":"Tang","year":"2013","journal-title":"J Am Med Inform Assoc"}],"container-title":["Journal of the American Medical Informatics 
Association"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/26\/11\/1203\/36089025\/ocz099.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/26\/11\/1203\/36089025\/ocz099.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,1,24]],"date-time":"2021-01-24T16:20:18Z","timestamp":1611505218000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/26\/11\/1203\/5532320"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,7,15]]},"references-count":31,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2019,7,15]]},"published-print":{"date-parts":[[2019,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocz099","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"type":"print","value":"1067-5027"},{"type":"electronic","value":"1527-974X"}],"subject":[],"published-other":{"date-parts":[[2019,11]]},"published":{"date-parts":[[2019,7,15]]}}}