{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,31]],"date-time":"2025-12-31T10:23:20Z","timestamp":1767176600792,"version":"build-2238731810"},"reference-count":75,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,6,11]],"date-time":"2022-06-11T00:00:00Z","timestamp":1654905600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,6,11]],"date-time":"2022-06-11T00:00:00Z","timestamp":1654905600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100008113","name":"Indiana University-Purdue University Indianapolis","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100008113","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Ohio State University"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Biomed Semant"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>Adverse events induced by drug-drug interactions are a major concern in the United States. Current research is moving toward using electronic health record (EHR) data, including for adverse drug events discovery. One of the first steps in EHR-based studies is to define a phenotype for establishing a cohort of patients. However, phenotype definitions are not readily available for all phenotypes. One of the first steps of developing automated text mining tools is building a corpus. Therefore, this study aimed to develop annotation guidelines and a gold standard corpus to facilitate building future automated approaches for mining phenotype definitions contained in the literature. 
Furthermore, our aim is to improve the understanding of how these published phenotype definitions are presented in the literature and how we annotate them for future text mining tasks.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Two annotators manually annotated the corpus at the sentence level for the presence of evidence for phenotype definitions. Three major categories (inclusion, intermediate, and exclusion) with a total of ten dimensions were proposed to characterize major contextual patterns and cues for presenting phenotype definitions in published literature. The developed annotation guidelines were used to annotate the corpus that contained 3971 sentences: 1923 out of 3971 (48.4%) for the inclusion category, 1851 out of 3971 (46.6%) for the intermediate category, and 2273 out of 3971 (57.2%) for the exclusion category. The highest number of annotated sentences was 1449 out of 3971 (36.5%) for the \u201cBiomedical &amp; Procedure\u201d dimension. The lowest number of annotated sentences was 49 out of 3971 (1.2%) for \u201cThe use of NLP\u201d. The overall percent inter-annotator agreement was 97.8%. 
Percent and Kappa statistics also showed high inter-annotator agreement across all dimensions.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusions<\/jats:title>\n                    <jats:p>The corpus and annotation guidelines can serve as a foundational informatics approach for annotating and mining phenotype definitions in literature, and can be used later for text mining applications.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s13326-022-00272-6","type":"journal-article","created":{"date-parts":[[2022,6,11]],"date-time":"2022-06-11T08:03:34Z","timestamp":1654934614000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["PhenoDEF: a corpus for annotating sentences with information of phenotype definitions in biomedical literature"],"prefix":"10.1186","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0400-823X","authenticated-orcid":false,"given":"Samar","family":"Binkheder","sequence":"first","affiliation":[]},{"given":"Heng-Yi","family":"Wu","sequence":"additional","affiliation":[]},{"given":"Sara K.","family":"Quinney","sequence":"additional","affiliation":[]},{"given":"Shijun","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Md. 
Muntasir","family":"Zitu","sequence":"additional","affiliation":[]},{"given":"Chien\u2010Wei","family":"Chiang","sequence":"additional","affiliation":[]},{"given":"Lei","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Josette","family":"Jones","sequence":"additional","affiliation":[]},{"given":"Lang","family":"Li","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,6,11]]},"reference":[{"issue":"4","key":"272_CR1","doi-asserted-by":"publisher","first-page":"237","DOI":"10.1007\/s40264-014-0145-z","volume":"37","author":"R Eriksson","year":"2014","unstructured":"Eriksson R, Werge T, Jensen LJ, Brunak S. Dose-specific adverse drug reaction identification in electronic patient records: temporal data mining in an inpatient psychiatric population. Drug Saf. 2014;37(4):237\u201347. https:\/\/doi.org\/10.1007\/s40264-014-0145-z.","journal-title":"Drug Saf"},{"issue":"10","key":"272_CR2","first-page":"12","volume":"42","author":"S Pal","year":"2017","unstructured":"Pal S. Reporting and Consequences of Adverse Events. US Pharmacist. 2017;42(10):12.","journal-title":"US Pharmacist"},{"key":"272_CR3","doi-asserted-by":"publisher","first-page":"S1","DOI":"10.1186\/1472-6947-15-S4-S1","volume":"15","author":"J Zhao","year":"2015","unstructured":"Zhao J, Henriksson A, Asker L, Bostrom H. Predictive modeling of structured electronic health records for adverse drug event detection. BMC Med Inform Decis Mak. 2015;15:S1. https:\/\/doi.org\/10.1186\/1472-6947-15-S4-S1.","journal-title":"BMC Med Inform Decis Mak"},{"key":"272_CR4","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1186\/1472-6947-14-13","volume":"14","author":"S Yeleswarapu","year":"2014","unstructured":"Yeleswarapu S, Rao A, Joseph T, Saipradeep VG, Srinivasan R. A pipeline to extract drug-adverse event pairs from multiple data sources. BMC Med Inform Decis Mak. 2014;14:13. 
https:\/\/doi.org\/10.1186\/1472-6947-14-13.","journal-title":"BMC Med Inform Decis Mak"},{"issue":"7","key":"272_CR5","doi-asserted-by":"publisher","first-page":"815","DOI":"10.1002\/pds.4562","volume":"27","author":"AS Czaja","year":"2018","unstructured":"Czaja AS, Ross ME, Liu W, Fiks AG, Localio R, Wasserman RC, Grundmeier RW, Adams WG. Electronic health record (EHR) based postmarketing surveillance of adverse events associated with pediatric off-label medication use: a case study of short-acting beta-2 agonists and arrhythmias. Pharmacoepidemiol Drug Saf. 2018;27(7):815\u201322. https:\/\/doi.org\/10.1002\/pds.4562.","journal-title":"Pharmacoepidemiol Drug Saf"},{"issue":"2","key":"272_CR6","doi-asserted-by":"publisher","first-page":"287","DOI":"10.1002\/cpt.914","volume":"103","author":"C-W Chiang","year":"2018","unstructured":"Chiang C-W, Zhang P, Wang X, Wang L, Zhang S, Ning X, Shen L, Quinney SK, Li L. Translational high-dimensional drug interaction discovery and validation using health record databases and pharmacokinetics models. Clin Pharmacol Ther. 2018;103(2):287\u201395. https:\/\/doi.org\/10.1002\/cpt.914.","journal-title":"Clin Pharmacol Ther"},{"issue":"e2","key":"272_CR7","doi-asserted-by":"publisher","first-page":"e226","DOI":"10.1136\/amiajnl-2013-001926","volume":"20","author":"RL Richesson","year":"2013","unstructured":"Richesson RL, Hammond WE, Nahm M, Wixted D, Simon GE, Robinson JG, Bauck AE, Cifelli D, Smerek MM, Dickerson J, et al. Electronic health records based phenotyping in next-generation clinical trials: a perspective from the NIH Health Care Systems Collaboratory. J Am Med Inform Assoc. 2013;20(e2):e226-231. 
https:\/\/doi.org\/10.1136\/amiajnl-2013-001926.","journal-title":"J Am Med Inform Assoc"},{"key":"272_CR8","doi-asserted-by":"publisher","first-page":"145","DOI":"10.1142\/9789813235533_0014","volume":"23","author":"BS Glicksberg","year":"2018","unstructured":"Glicksberg BS, Miotto R, Johnson KW, Shameer K, Li L, Chen R, Dudley JT. Automated disease cohort selection using word embeddings from Electronic Health Records. Pac Symp Biocomput. 2018;23:145\u201356. https:\/\/doi.org\/10.1142\/9789813235533_0014.","journal-title":"Pac Symp Biocomput"},{"issue":"6","key":"272_CR9","doi-asserted-by":"publisher","first-page":"1046","DOI":"10.1093\/jamia\/ocv202","volume":"23","author":"JC Kirby","year":"2016","unstructured":"Kirby JC, Speltz P, Rasmussen LV, Basford M, Gottesman O, Peissig PL, Pacheco JA, Tromp G, Pathak J, Carrell DS, et al. PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability. J Am Med Inform Assoc. 2016;23(6):1046\u201352. https:\/\/doi.org\/10.1093\/jamia\/ocv202.","journal-title":"J Am Med Inform Assoc"},{"issue":"4","key":"272_CR10","doi-asserted-by":"publisher","first-page":"469","DOI":"10.2217\/pgs.10.41","volume":"11","author":"D Gurwitz","year":"2010","unstructured":"Gurwitz D, Pirmohamed M. Pharmacogenomics: the importance of accurate phenotypes. Pharmacogenomics. 2010;11(4):469\u201370. https:\/\/doi.org\/10.2217\/pgs.10.41.","journal-title":"Pharmacogenomics"},{"issue":"3","key":"272_CR11","doi-asserted-by":"publisher","first-page":"289","DOI":"10.1093\/jamia\/ocx110","volume":"25","author":"G Hripcsak","year":"2017","unstructured":"Hripcsak G, Albers DJ. High-fidelity phenotyping: richness and freedom from bias. J Am Med Inform Assoc. 2017;25(3):289\u201394. 
https:\/\/doi.org\/10.1093\/jamia\/ocx110.","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"272_CR12","doi-asserted-by":"publisher","first-page":"117","DOI":"10.1136\/amiajnl-2012-001145","volume":"20","author":"G Hripcsak","year":"2013","unstructured":"Hripcsak G, Albers DJ. Next-generation phenotyping of electronic health records. J Am Med Inform Assoc. 2013;20(1):117\u201321. https:\/\/doi.org\/10.1136\/amiajnl-2012-001145.","journal-title":"J Am Med Inform Assoc"},{"issue":"2","key":"272_CR13","doi-asserted-by":"publisher","first-page":"221","DOI":"10.1136\/amiajnl-2013-001935","volume":"21","author":"C Shivade","year":"2014","unstructured":"Shivade C, Raghavan P, Fosler-Lussier E, Embi PJ, Elhadad N, Johnson SB, Lai AM. A review of approaches to identifying patient phenotype cohorts using electronic health records. J Am Med Inform Assoc. 2014;21(2):221\u201330. https:\/\/doi.org\/10.1136\/amiajnl-2013-001935.","journal-title":"J Am Med Inform Assoc"},{"issue":"e1","key":"272_CR14","doi-asserted-by":"publisher","first-page":"e28","DOI":"10.1136\/amiajnl-2011-000699","volume":"19","author":"M Liu","year":"2012","unstructured":"Liu M, Wu Y, Chen Y, Sun J, Zhao Z, Chen XW, Matheny ME, Xu H. Large-scale prediction of adverse drug reactions using chemical, biological, and phenotypic properties of drugs. J Am Med Inform Assoc. 2012;19(e1):e28\u201335. https:\/\/doi.org\/10.1136\/amiajnl-2011-000699.","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"272_CR15","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1146\/annurev-biodatasci-080917-013315","volume":"1","author":"JM Banda","year":"2018","unstructured":"Banda JM, Seneviratne M, Hernandez-Boussard T, Shah NH. Advances in electronic phenotyping: from rule-based definitions to machine learning models. Annu Rev Biomed Data Sci. 2018;1(1):53\u201368. 
https:\/\/doi.org\/10.1146\/annurev-biodatasci-080917-013315.","journal-title":"Annu Rev Biomed Data Sci"},{"issue":"1","key":"272_CR16","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1186\/s13073-015-0166-y","volume":"7","author":"WQ Wei","year":"2015","unstructured":"Wei WQ, Denny JC. Extracting research-quality phenotypes from electronic health records to support precision medicine. Genome Med. 2015;7(1):41. https:\/\/doi.org\/10.1186\/s13073-015-0166-y.","journal-title":"Genome Med"},{"key":"272_CR17","first-page":"248","volume":"2011","author":"CG Chute","year":"2011","unstructured":"Chute CG, Pathak J, Savova GK, Bailey KR, Schor MI, Hart LA, Beebe CE, Huff SM. The SHARPn project on secondary use of electronic medical record data: progress, plans, and possibilities. AMIA Annu Symp Proc. 2011;2011:248\u201356.","journal-title":"AMIA Annu Symp Proc"},{"issue":"e2","key":"272_CR18","doi-asserted-by":"publisher","first-page":"e206","DOI":"10.1136\/amiajnl-2013-002428","volume":"20","author":"J Pathak","year":"2013","unstructured":"Pathak J, Kho AN, Denny JC. Electronic health records-driven phenotyping: challenges, recent advances, and perspectives. J Am Med Inform Assoc. 2013;20(e2):e206\u2013211. https:\/\/doi.org\/10.1136\/amiajnl-2013-002428.","journal-title":"J Am Med Inform Assoc"},{"key":"272_CR19","first-page":"189","volume":"2011","author":"RJ Carroll","year":"2011","unstructured":"Carroll RJ, Eyler AE, Denny JC. Naive electronic health record phenotype identification for rheumatoid arthritis. AMIA Ann Symp Proc. 2011;2011:189\u201396.","journal-title":"AMIA Ann Symp Proc"},{"issue":"3","key":"272_CR20","doi-asserted-by":"publisher","first-page":"298","DOI":"10.1002\/cpt.321","volume":"99","author":"DM Roden","year":"2016","unstructured":"Roden DM, Denny JC. Integrating electronic health record genotype and phenotype datasets to transform patient care. Clin Pharmacol Ther. 2016;99(3):298\u2013305. 
https:\/\/doi.org\/10.1002\/cpt.321.","journal-title":"Clin Pharmacol Ther"},{"key":"272_CR21","doi-asserted-by":"publisher","first-page":"h1885","DOI":"10.1136\/bmj.h1885","volume":"350","author":"KP Liao","year":"2015","unstructured":"Liao KP, Cai T, Savova GK, Murphy SN, Karlson EW, Ananthakrishnan AN, Gainer VS, Shaw SY, Xia Z, Szolovits P, et al. Development of phenotype algorithms using electronic medical records and incorporating natural language processing. BMJ. 2015;350:h1885. https:\/\/doi.org\/10.1136\/bmj.h1885.","journal-title":"BMJ"},{"issue":"5","key":"272_CR22","doi-asserted-by":"publisher","first-page":"993","DOI":"10.1093\/jamia\/ocv034","volume":"22","author":"S Yu","year":"2015","unstructured":"Yu S, Liao KP, Shaw SY, Gainer VS, Churchill SE, Szolovits P, Murphy SN, Kohane IS, Cai T. Toward high-throughput phenotyping: unbiased automated feature extraction and selection from knowledge sources. J Am Med Inform Assoc. 2015;22(5):993\u20131000. https:\/\/doi.org\/10.1093\/jamia\/ocv034.","journal-title":"J Am Med Inform Assoc"},{"key":"272_CR23","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1016\/j.artmed.2016.05.005","volume":"71","author":"RL Richesson","year":"2016","unstructured":"Richesson RL, Sun J, Pathak J, Kho AN, Denny JC. Clinical phenotyping in selected national networks: demonstrating the need for high-throughput, portable, and computational methods. Artif Intell Med. 2016;71:57\u201361. https:\/\/doi.org\/10.1016\/j.artmed.2016.05.005.","journal-title":"Artif Intell Med"},{"issue":"6","key":"272_CR24","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3127881","volume":"50","author":"P Yadav","year":"2018","unstructured":"Yadav P, Steinbach M, Kumar V, Simon G. Mining electronic health records (EHRs): a survey. ACM Comput Surv. 2018;50(6):1\u201340. 
https:\/\/doi.org\/10.1145\/3127881.","journal-title":"ACM Comput Surv"},{"key":"272_CR25","doi-asserted-by":"publisher","unstructured":"Richesson R, Wiley LK, Gold S, Rasmussen L; for the NIH Health Care Systems Research Collaboratory Electronic Health Records Core Working Group. Electronic Health Records\u2013Based Phenotyping: Introduction. In: Rethinking Clinical Trials: A Living Textbook of Pragmatic Clinical Trials. Bethesda: NIH Health Care Systems Research Collaboratory. Available at: https:\/\/rethinkingclinicaltrials.org\/chapters\/conduct\/electronic-health-records-based-phenotyping\/electronichealth-records-based-phenotyping-introduction\/. Updated December 27, 2021. https:\/\/doi.org\/10.28929\/143.","DOI":"10.28929\/143"},{"issue":"2","key":"272_CR26","doi-asserted-by":"publisher","first-page":"140","DOI":"10.2500\/ajra.2014.28.4012","volume":"28","author":"J Hsu","year":"2014","unstructured":"Hsu J, Pacheco JA, Stevens WW, Smith ME, Avila PC. Accuracy of phenotyping chronic rhinosinusitis in the electronic health record. Am J Rhinol Allergy. 2014;28(2):140\u20134. https:\/\/doi.org\/10.2500\/ajra.2014.28.4012.","journal-title":"Am J Rhinol Allergy"},{"key":"272_CR27","unstructured":"International Classification of Diseases,Ninth Revision (ICD-9). https:\/\/www.cdc.gov\/nchs\/icd\/icd9.htm. Accessed 1 Jan 2019."},{"key":"272_CR28","unstructured":"CPT code\/relative value search. https:\/\/ocm.ama-assn.org\/OCM\/CPTRelativeValueSearch.do?submitbutton=accept. Accessed 3 Apr 2022."},{"issue":"10","key":"272_CR29","doi-asserted-by":"publisher","first-page":"761","DOI":"10.1038\/gim.2013.72","volume":"15","author":"O Gottesman","year":"2013","unstructured":"Gottesman O, Kuivaniemi H, Tromp G, Faucett WA, Li R, Manolio TA, Sanderson SC, Kannry J, Zinberg R, Basford MA, et al. The Electronic Medical Records and Genomics (eMERGE) Network: past, present, and future. Genet Med. 2013;15(10):761\u201371. 
https:\/\/doi.org\/10.1038\/gim.2013.72.","journal-title":"Genet Med"},{"issue":"10","key":"272_CR30","doi-asserted-by":"publisher","first-page":"e75256","DOI":"10.1371\/journal.pone.0075256","volume":"8","author":"A Leong","year":"2013","unstructured":"Leong A, Dasgupta K, Bernatsky S, Lacaille D, Avina-Zubieta A, Rahme E. Systematic review and meta-analysis of validation studies on a diabetes case definition from health administrative records. PLoS One [Electronic Resource]. 2013;8(10):e75256. https:\/\/doi.org\/10.1371\/journal.pone.0075256.","journal-title":"PLoS One [Electronic Resource]"},{"issue":"1","key":"272_CR31","doi-asserted-by":"publisher","first-page":"38","DOI":"10.1186\/s13643-017-0431-9","volume":"6","author":"S Souri","year":"2017","unstructured":"Souri S, Symonds NE, Rouhi A, Lethebe BC, Garies S, Ronksley PE, Williamson TS, Fabreau GE, Birtwhistle R, Quan H, et al. Identification of validated case definitions for chronic disease using electronic medical records: a systematic review protocol. Syst Rev. 2017;6(1):38. https:\/\/doi.org\/10.1186\/s13643-017-0431-9.","journal-title":"Syst Rev"},{"issue":"8","key":"272_CR32","doi-asserted-by":"publisher","first-page":"1343","DOI":"10.1002\/acr.21959","volume":"65","author":"C Barber","year":"2013","unstructured":"Barber C, Lacaille D, Fortin PR. Systematic review of validation studies of the use of administrative data to identify serious infections. Arthritis Care Res. 2013;65(8):1343\u201357. https:\/\/doi.org\/10.1002\/acr.21959.","journal-title":"Arthritis Care Res"},{"issue":"5","key":"272_CR33","doi-asserted-by":"publisher","first-page":"e146","DOI":"10.2500\/ajra.2015.29.4229","volume":"29","author":"JT Lui","year":"2015","unstructured":"Lui JT, Rudmik L. Case definitions for chronic rhinosinusitis in administrative data: a systematic review. Am J Rhinol Allergy. 2015;29(5):e146\u2013151. 
https:\/\/doi.org\/10.2500\/ajra.2015.29.4229.","journal-title":"Am J Rhinol Allergy"},{"key":"272_CR34","doi-asserted-by":"publisher","first-page":"289","DOI":"10.1186\/s12888-014-0289-5","volume":"14","author":"KM Fiest","year":"2014","unstructured":"Fiest KM, Jette N, Quan H, St Germaine-Smith C, Metcalfe A, Patten SB, Beck CA. Systematic review and assessment of validated case definitions for depression in administrative data. BMC Psychiatry. 2014;14:289. https:\/\/doi.org\/10.1186\/s12888-014-0289-5.","journal-title":"BMC Psychiatry"},{"issue":"8","key":"272_CR35","doi-asserted-by":"publisher","first-page":"1052","DOI":"10.1016\/j.cjca.2017.05.025","volume":"33","author":"R Pace","year":"2017","unstructured":"Pace R, Peters T, Rahme E, Dasgupta K. Validity of health administrative database definitions for hypertension: a systematic review. Can J Cardiol. 2017;33(8):1052\u20139. https:\/\/doi.org\/10.1016\/j.cjca.2017.05.025.","journal-title":"Can J Cardiol"},{"issue":"6","key":"272_CR36","doi-asserted-by":"publisher","first-page":"1303","DOI":"10.1002\/lary.25804","volume":"126","author":"KI Macdonald","year":"2016","unstructured":"Macdonald KI, Kilty SJ, van Walraven C. Chronic rhinosinusitis identification in administrative databases and health surveys: a systematic review. Laryngoscope. 2016;126(6):1303\u201310. https:\/\/doi.org\/10.1002\/lary.25804.","journal-title":"Laryngoscope"},{"key":"272_CR37","doi-asserted-by":"publisher","unstructured":"Cohen AM, Adams CE, Davis JM, Yu C, Yu PS, Meng W, Duggan L, McDonagh M, Smalheiser NR. Evidence-based medicine, the essential role of systematic reviews, and the need for automated text mining tools. In: Proceedings of the 1st ACM International Health Informatics Symposium. Arlington:\u00a0Association for Computing Machinery; 2010. p. 376\u2013380. 
https:\/\/doi.org\/10.1145\/1882992.1883046.","DOI":"10.1145\/1882992.1883046"},{"key":"272_CR38","doi-asserted-by":"publisher","unstructured":"Collier N, Groza T, Smedley D, Robinson PN, Oellrich A, Rebholz-Schuhmann D: PhenoMiner: from text to a database of phenotypes associated with OMIM diseases. Database (Oxford) 2015, 2015.\u00a0https:\/\/doi.org\/10.1093\/database\/bav104.","DOI":"10.1093\/database\/bav104"},{"key":"272_CR39","first-page":"149","volume":"2017","author":"J Henderson","year":"2017","unstructured":"Henderson J, Bridges R, Ho JC, Wallace BC, Ghosh J. PheKnow-cloud: a tool for evaluating high-throughput phenotype candidates using online medical literature. AMIA Jt Summits Transl Sci Proc. 2017;2017:149\u201357.","journal-title":"AMIA Jt Summits Transl Sci Proc"},{"issue":"5","key":"272_CR40","doi-asserted-by":"publisher","first-page":"859","DOI":"10.1016\/j.jbi.2011.05.004","volume":"44","author":"D Zhao","year":"2011","unstructured":"Zhao D, Weng C. Combining PubMed knowledge and EHR data to develop a weighted bayesian network for pancreatic cancer prediction. J Biomed Inform. 2011;44(5):859\u201368. https:\/\/doi.org\/10.1016\/j.jbi.2011.05.004.","journal-title":"J Biomed Inform"},{"issue":"4","key":"272_CR41","doi-asserted-by":"publisher","first-page":"515","DOI":"10.4338\/ACI-2013-04-RA-0028","volume":"4","author":"T Botsis","year":"2013","unstructured":"Botsis T, Ball R. Automating case definitions using literature-based reasoning. Appl Clin Inform. 2013;4(4):515\u201327. https:\/\/doi.org\/10.4338\/ACI-2013-04-RA-0028.","journal-title":"Appl Clin Inform"},{"issue":"2","key":"272_CR42","doi-asserted-by":"publisher","first-page":"199","DOI":"10.11613\/BM.2014.022","volume":"24","author":"MSJBmBm Thiese","year":"2014","unstructured":"Thiese MSJBmBm. Observational and interventional study design types; an overview. Biochema Medica. 2014;24(2):199\u2013210. 
https:\/\/doi.org\/10.11613\/BM.2014.022.","journal-title":"Biochema Medica"},{"key":"272_CR43","doi-asserted-by":"publisher","first-page":"196","DOI":"10.1016\/j.jbi.2014.11.002","volume":"53","author":"A Sarker","year":"2015","unstructured":"Sarker A, Gonzalez G. Portable automatic text classification for adverse drug reaction detection via multi-corpus training. J Biomed Inform. 2015;53:196\u2013207. https:\/\/doi.org\/10.1016\/j.jbi.2014.11.002.","journal-title":"J Biomed Inform"},{"issue":"e1","key":"272_CR44","doi-asserted-by":"publisher","first-page":"e147","DOI":"10.1136\/amiajnl-2012-000896","volume":"20","author":"KM Newton","year":"2013","unstructured":"Newton KM, Peissig PL, Kho AN, Bielinski SJ, Berg RL, Choudhary V, Basford M, Chute CG, Kullo IJ, Li R, et al. Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network. J Am Med Inform Assoc. 2013;20(e1):e147\u2013154. https:\/\/doi.org\/10.1136\/amiajnl-2012-000896.","journal-title":"J Am Med Inform Assoc"},{"key":"272_CR45","doi-asserted-by":"publisher","first-page":"105","DOI":"10.1016\/j.jbi.2014.08.012","volume":"52","author":"VM Castro","year":"2014","unstructured":"Castro VM, Apperson WK, Gainer VS, Ananthakrishnan AN, Goodson AP, Wang TD, Herrick CD, Murphy SN. Evaluation of matched control algorithms in EHR-based phenotyping studies: a case study of inflammatory bowel disease comorbidities. J Biomed Inform. 2014;52:105\u201311. https:\/\/doi.org\/10.1016\/j.jbi.2014.08.012.","journal-title":"J Biomed Inform"},{"key":"272_CR46","unstructured":"Phenome Wide Association Studies. https:\/\/phewascatalog.org\/. Accessed 1 Jan 2019."},{"issue":"12","key":"272_CR47","doi-asserted-by":"publisher","first-page":"e1000597","DOI":"10.1371\/journal.pcbi.1000597","volume":"5","author":"R Rodriguez-Esteban","year":"2009","unstructured":"Rodriguez-Esteban R. Biomedical text mining and its applications. PLoS Comput Biol. 2009;5(12):e1000597. 
https:\/\/doi.org\/10.1371\/journal.pcbi.1000597.","journal-title":"PLoS Comput Biol"},{"key":"272_CR48","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.jbi.2013.12.006","volume":"47","author":"RI Dogan","year":"2014","unstructured":"Dogan RI, Leaman R, Lu Z. NCBI disease corpus: a resource for disease name recognition and concept normalization. J Biomed Inform. 2014;47:1\u201310. https:\/\/doi.org\/10.1016\/j.jbi.2013.12.006.","journal-title":"J Biomed Inform"},{"issue":"5","key":"272_CR49","doi-asserted-by":"publisher","first-page":"885","DOI":"10.1016\/j.jbi.2012.04.008","volume":"45","author":"H Gurulingappa","year":"2012","unstructured":"Gurulingappa H, Rajput AM, Roberts A, Fluck J, Hofmann-Apitius M, Toldo L. Development of a benchmark corpus to support the automatic extraction of drug-related adverse effects from medical case reports. J Biomed Inform. 2012;45(5):885\u201392. https:\/\/doi.org\/10.1016\/j.jbi.2012.04.008.","journal-title":"J Biomed Inform"},{"issue":"Suppl 1","key":"272_CR50","doi-asserted-by":"publisher","first-page":"i180","DOI":"10.1093\/bioinformatics\/btg1023","volume":"19","author":"JD Kim","year":"2003","unstructured":"Kim JD, Ohta T, Tateisi Y, Tsujii J. GENIA corpus\u2013semantically annotated corpus for bio-textmining. Bioinformatics. 2003;19(Suppl 1):i180-182. https:\/\/doi.org\/10.1093\/bioinformatics\/btg1023.","journal-title":"Bioinformatics"},{"issue":"Suppl 1 Text mi","key":"272_CR51","doi-asserted-by":"publisher","first-page":"S2","DOI":"10.1186\/1758-2946-7-S1-S2","volume":"7","author":"M Krallinger","year":"2015","unstructured":"Krallinger M, Rabal O, Leitner F, Vazquez M, Salgado D, Lu Z, Leaman R, Lu Y, Ji D, Lowe DM, et al. The CHEMDNER corpus of chemicals and drugs and its annotation principles. J Cheminform. 2015;7(Suppl 1 Text mining for chemistry and the CHEMDNER track):S2. 
https:\/\/doi.org\/10.1186\/1758-2946-7-S1-S2.","journal-title":"J Cheminform"},{"issue":"2","key":"272_CR52","doi-asserted-by":"publisher","first-page":"73","DOI":"10.4103\/2229-3485.153997","volume":"6","author":"FF Ozair","year":"2015","unstructured":"Ozair FF, Jamshed N, Sharma A, Aggarwal P. Ethical issues in electronic health records: a general overview. Perspect Clin Res. 2015;6(2):73\u20136. https:\/\/doi.org\/10.4103\/2229-3485.153997.","journal-title":"Perspect Clin Res"},{"key":"272_CR53","first-page":"2010","volume-title":"2nd Workshop on Building and evaluating resources for biomedical text mining (7th edition of the Language Resources and Evaluation Conference)","author":"H Gurulingappa","year":"2010","unstructured":"Gurulingappa H, Klinger R, Hofmann-Apitius M, Fluck J. An empirical evaluation of resources for the identification of diseases and adverse effects in biomedical literature. In: 2nd Workshop on Building and evaluating resources for biomedical text mining (7th edition of the Language Resources and Evaluation Conference). 2010. p. 2010."},{"issue":"5","key":"272_CR54","doi-asserted-by":"publisher","first-page":"950","DOI":"10.1016\/j.jbi.2008.12.013","volume":"42","author":"A Roberts","year":"2009","unstructured":"Roberts A, Gaizauskas R, Hepple M, Demetriou G, Guo Y, Roberts I, Setzer A. Building a semantically annotated corpus of clinical texts. J Biomed Inform. 2009;42(5):950\u201366. https:\/\/doi.org\/10.1016\/j.jbi.2008.12.013.","journal-title":"J Biomed Inform"},{"issue":"Suppl 2","key":"272_CR55","doi-asserted-by":"publisher","first-page":"S3","DOI":"10.1186\/1472-6947-15-S2-S3","volume":"15","author":"N Alnazzawi","year":"2015","unstructured":"Alnazzawi N, Thompson P, Batista-Navarro R, Ananiadou S. Using text mining techniques to extract phenotypic information from the PhenoCHF corpus. BMC Med Inform Decis Mak. 2015;15(Suppl 2):S3. 
https:\/\/doi.org\/10.1186\/1472-6947-15-S2-S3.","journal-title":"BMC Med Inform Decis Mak"},{"key":"272_CR56","first-page":"69","volume-title":"Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis: 2014","author":"N Alnazzawi","year":"2014","unstructured":"Alnazzawi N, Thompson P, Ananiadou S. Building a semantically annotated corpus for congestive heart and renal failure from clinical records and the literature. In: Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis: 2014. Gothenburg, Sweden: Association for Computational Linguistics; 2014. p. 69\u201374."},{"issue":"4","key":"272_CR57","doi-asserted-by":"publisher","first-page":"561","DOI":"10.1197\/jamia.M3115","volume":"16","author":"O Uzuner","year":"2009","unstructured":"Uzuner O. Recognizing obesity and comorbidities in sparse data. J Am Med Inform Assoc. 2009;16(4):561\u201370. https:\/\/doi.org\/10.1197\/jamia.M3115.","journal-title":"J Am Med Inform Assoc"},{"key":"272_CR58","doi-asserted-by":"publisher","first-page":"bat019","DOI":"10.1093\/database\/bat019","volume":"2013","author":"K Verspoor","year":"2013","unstructured":"Verspoor K, JimenoYepes A, Cavedon L, McIntosh T, Herten-Crabb A, Thomas Z, Plazzer JP. Annotating the biomedical literature for the human variome. Database. 2013;2013:bat019. https:\/\/doi.org\/10.1093\/database\/bat019.","journal-title":"Database"},{"issue":"8","key":"272_CR59","doi-asserted-by":"publisher","first-page":"e1002614","DOI":"10.1371\/journal.pcbi.1002614","volume":"8","author":"JD Duke","year":"2012","unstructured":"Duke JD, Han X, Wang Z, Subhadarshini A, Karnik SD, Li X, Hall SD, Jin Y, Callaghan JT, Overhage MJ, et al. Literature based drug interaction prediction with clinical assessment using electronic medical records: novel myopathy associated drug interactions. PLoS Comput Biol. 2012;8(8):e1002614. 
https:\/\/doi.org\/10.1371\/journal.pcbi.1002614.","journal-title":"PLoS Comput Biol"},{"issue":"S1","key":"272_CR60","doi-asserted-by":"publisher","first-page":"S91","DOI":"10.1002\/cpt.1745","volume":"101","author":"HY Wu","year":"2017","unstructured":"Wu HY, Zhang S, Desta Z, Quinney S, Li L. Translational drug interaction evidence gap discovery using text mining. Clin Pharmacol Ther. 2017;101(S1):S91\u20132. https:\/\/doi.org\/10.1002\/cpt.1745.","journal-title":"Clin Pharmacol Ther"},{"issue":"4","key":"272_CR61","doi-asserted-by":"publisher","first-page":"342","DOI":"10.2174\/138920010791514180","volume":"11","author":"J-F Wang","year":"2010","unstructured":"Wang J-F, Chou K-C. Molecular modeling of cytochrome P450 and drug metabolism. Curr Drug Metab. 2010;11(4):342\u20136. https:\/\/doi.org\/10.2174\/138920010791514180.","journal-title":"Curr Drug Metab"},{"issue":"4","key":"272_CR62","doi-asserted-by":"publisher","first-page":"421","DOI":"10.1080\/08998280.2000.11927719","volume":"13","author":"CC Ogu","year":"2000","unstructured":"Ogu CC, Maxa JL. Drug interactions due to cytochrome P450. Proc (Baylor Univ Med Cent). 2000;13(4):421\u20133. https:\/\/doi.org\/10.1080\/08998280.2000.11927719.","journal-title":"Proc (Baylor Univ Med Cent)"},{"issue":"2","key":"272_CR63","doi-asserted-by":"publisher","first-page":"109","DOI":"10.2165\/00002018-199920020-00002","volume":"20","author":"EG Brown","year":"1999","unstructured":"Brown EG, Wood L, Wood S. The medical dictionary for regulatory activities (MedDRA). Drug Saf. 1999;20(2):109\u201317. https:\/\/doi.org\/10.2165\/00002018-199920020-00002.","journal-title":"Drug Saf"},{"issue":"D1","key":"272_CR64","doi-asserted-by":"publisher","first-page":"D1075","DOI":"10.1093\/nar\/gkv1075","volume":"44","author":"M Kuhn","year":"2016","unstructured":"Kuhn M, Letunic I, Jensen LJ, Bork P. The SIDER database of drugs and side effects. Nucleic Acids Res. 2016;44(D1):D1075\u20131079. 
https:\/\/doi.org\/10.1093\/nar\/gkv1075.","journal-title":"Nucleic Acids Res"},{"key":"272_CR65","first-page":"662","volume-title":"AMIA Annual Symposium Proceedings. 2002\/02\/05 edn","author":"MQ Stearns","year":"2001","unstructured":"Stearns MQ, Price C, Spackman KA, Wang AY. SNOMED clinical terms: overview of the development process and project status. In: AMIA Annual Symposium Proceedings. 2002\/02\/05 edn. 2001. p. 662\u20136."},{"key":"272_CR66","doi-asserted-by":"crossref","unstructured":"Artstein R. Inter-annotator Agreement. In: Handbook of Linguistic Annotation. edn. Edited by Ide N, Pustejovsky J. Dordrecht: Springer Netherlands; 2017: 297\u2013313.","DOI":"10.1007\/978-94-024-0881-2_11"},{"key":"272_CR67","doi-asserted-by":"publisher","first-page":"356","DOI":"10.1186\/1471-2105-7-356","volume":"7","author":"WJ Wilbur","year":"2006","unstructured":"Wilbur WJ, Rzhetsky A, Shatkay H. New directions in biomedical text annotation: definitions, guidelines and corpus construction. BMC Bioinformatics. 2006;7:356. https:\/\/doi.org\/10.1186\/1471-2105-7-356.","journal-title":"BMC Bioinformatics"},{"issue":"3","key":"272_CR68","doi-asserted-by":"publisher","first-page":"276","DOI":"10.11613\/BM.2012.031","volume":"22","author":"ML McHugh","year":"2012","unstructured":"McHugh ML. Interrater reliability: the kappa statistic. Biochemia Medica. 2012;22(3):276\u201382. https:\/\/doi.org\/10.11613\/BM.2012.031.","journal-title":"Biochemia Medica"},{"key":"272_CR69","first-page":"145","volume-title":"Proceedings of BioCreative Workshop: 2012; Washington, DC USA","author":"CH Wei","year":"2012","unstructured":"Wei CH, Kao HY, Lu Z. PubTator: A PubMed-like interactive curation system for document triage and literature curation. In: Proceedings of BioCreative Workshop: 2012; Washington, DC USA. 2012. p. 145\u201350."},{"issue":"5","key":"272_CR70","first-page":"360","volume":"37","author":"AJ Viera","year":"2005","unstructured":"Viera AJ, Garrett JM. 
Understanding interobserver agreement: the kappa statistic. Fam Med. 2005;37(5):360\u20133.","journal-title":"Fam Med"},{"key":"272_CR71","doi-asserted-by":"publisher","first-page":"1937","DOI":"10.1109\/BigData.2016.7840814","volume-title":"2016 IEEE International Conference on Big Data (Big Data)","author":"SR Kundeti","year":"2016","unstructured":"Kundeti SR, Vijayananda J, Mujjiga S, Kalyan M. Clinical named entity recognition: Challenges and opportunities. In: 2016 IEEE International Conference on Big Data (Big Data). 2016. p. 1937\u201345."},{"key":"272_CR72","unstructured":"Unified Medical Language System (UMLS).\u00a0https:\/\/www.nlm.nih.gov\/research\/umls\/index.html. Accessed 1 Jan 2019."},{"issue":"Database issue","key":"272_CR73","doi-asserted-by":"publisher","first-page":"D267","DOI":"10.1093\/nar\/gkh061","volume":"32","author":"O Bodenreider","year":"2004","unstructured":"Bodenreider O. The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Res. 2004;32(Database issue):D267\u2013270. https:\/\/doi.org\/10.1093\/nar\/gkh061.","journal-title":"Nucleic Acids Res"},{"key":"272_CR74","first-page":"466","volume":"2015","author":"LK Wiley","year":"2015","unstructured":"Wiley LK, Moretz JD, Denny JC, Peterson JF, Bush WS. Phenotyping adverse drug reactions: statin-related myotoxicity. AMIA Summits Transl Sci Proc. 2015;2015:466\u201370.","journal-title":"AMIA Summits Transl Sci Proc"},{"issue":"e2","key":"272_CR75","doi-asserted-by":"publisher","first-page":"e319","DOI":"10.1136\/amiajnl-2013-001952","volume":"20","author":"RL Richesson","year":"2013","unstructured":"Richesson RL, Rusincovitch SA, Wixted D, Batch BC, Feinglos MN, Miranda ML, Hammond WE, Califf RM, Spratt SE. A comparison of phenotype definitions for diabetes mellitus. J Am Med Inform Assoc. 2013;20(e2):e319\u2013326. 
https:\/\/doi.org\/10.1136\/amiajnl-2013-001952.","journal-title":"J Am Med Inform Assoc"}],"updated-by":[{"DOI":"10.1186\/s13326-022-00275-3","type":"correction","label":"Correction","source":"publisher","updated":{"date-parts":[[2022,7,20]],"date-time":"2022-07-20T00:00:00Z","timestamp":1658275200000}}],"container-title":["Journal of Biomedical Semantics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13326-022-00272-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13326-022-00272-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13326-022-00272-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,7,20]],"date-time":"2022-07-20T06:06:01Z","timestamp":1658297161000},"score":1,"resource":{"primary":{"URL":"https:\/\/jbiomedsem.biomedcentral.com\/articles\/10.1186\/s13326-022-00272-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,11]]},"references-count":75,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["272"],"URL":"https:\/\/doi.org\/10.1186\/s13326-022-00272-6","relation":{},"ISSN":["2041-1480"],"issn-type":[{"value":"2041-1480","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,6,11]]},"assertion":[{"value":"7 August 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 May 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 June 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 
July 2022","order":4,"name":"change_date","label":"Change Date","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Correction","order":5,"name":"change_type","label":"Change Type","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"A Correction to this paper has been published:","order":6,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"https:\/\/doi.org\/10.1186\/s13326-022-00275-3","URL":"https:\/\/doi.org\/10.1186\/s13326-022-00275-3","order":7,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"17"}}