{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,20]],"date-time":"2026-02-20T23:20:00Z","timestamp":1771629600163,"version":"3.50.1"},"reference-count":50,"publisher":"Oxford University Press (OUP)","issue":"6","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2016,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Objective Health care generated data have become an important source for clinical and genomic research. Often, investigators create and iteratively refine phenotype algorithms to achieve high positive predictive values (PPVs) or sensitivity, thereby identifying valid cases and controls. These algorithms achieve the greatest utility when validated and shared by multiple health care systems.<\/jats:p>\n               <jats:p>Materials and Methods We report the current status and impact of the Phenotype KnowledgeBase (PheKB, http:\/\/phekb.org ), an online environment supporting the workflow of building, sharing, and validating electronic phenotype algorithms. We analyze the most frequent components used in algorithms and their performance at authoring institutions and secondary implementation sites.<\/jats:p>\n               <jats:p>Results As of June 2015, PheKB contained 30 finalized phenotype algorithms and 62 algorithms in development spanning a range of traits and diseases. Phenotypes have had over 3500 unique views in a 6-month period and have been reused by other institutions. International Classification of Disease codes were the most frequently used component, followed by medications and natural language processing. Among algorithms with published performance data, the median PPV was nearly identical when evaluated at the authoring institutions (n = 44; case 96.0%, control 100%) compared to implementation sites (n = 40; case 97.5%, control 100%).<\/jats:p>\n               <jats:p>Discussion These results demonstrate that a broad range of algorithms to mine electronic health record data from different health systems can be developed with high PPV, and algorithms developed at one site are generally transportable to others.<\/jats:p>\n               <jats:p>Conclusion By providing a central repository, PheKB enables improved development, transportability, and validity of algorithms for research-grade phenotypes using health care generated data.<\/jats:p>","DOI":"10.1093\/jamia\/ocv202","type":"journal-article","created":{"date-parts":[[2016,3,30]],"date-time":"2016-03-30T00:18:52Z","timestamp":1459297132000},"page":"1046-1052","source":"Crossref","is-referenced-by-count":303,"title":["PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability"],"prefix":"10.1093","volume":"23","author":[{"given":"Jacqueline C","family":"Kirby","sequence":"first","affiliation":[{"name":"Vanderbilt University Medical Center, Nashville, TN, USA"}]},{"given":"Peter","family":"Speltz","sequence":"additional","affiliation":[{"name":"Vanderbilt University Medical Center, Nashville, TN, USA"}]},{"given":"Luke V","family":"Rasmussen","sequence":"additional","affiliation":[{"name":"Northwestern University, Feinberg School of Medicine, Chicago, IL, USA"}]},{"given":"Melissa","family":"Basford","sequence":"additional","affiliation":[{"name":"Vanderbilt University Medical Center, Nashville, TN, USA"}]},{"given":"Omri","family":"Gottesman","sequence":"additional","affiliation":[{"name":"Icahn School of Medicine at Mount Sinai, New York, NY, USA"}]},{"given":"Peggy L","family":"Peissig","sequence":"additional","affiliation":[{"name":"Marshfield Clinic Research Foundation, Marshfield, WI, USA"}]},{"given":"Jennifer A","family":"Pacheco","sequence":"additional","affiliation":[{"name":"Northwestern University, Feinberg School of Medicine, Chicago, IL, USA"}]},{"given":"Gerard","family":"Tromp","sequence":"additional","affiliation":[{"name":"Geisinger Health System, Danville, PA, USA"}]},{"given":"Jyotishman","family":"Pathak","sequence":"additional","affiliation":[{"name":"Mayo Clinic, Rochester, MN, USA"}]},{"given":"David S","family":"Carrell","sequence":"additional","affiliation":[{"name":"Group Health Research Institute, Seattle, WA, USA"}]},{"given":"Stephen B","family":"Ellis","sequence":"additional","affiliation":[{"name":"Icahn School of Medicine at Mount Sinai, New York, NY, USA"}]},{"given":"Todd","family":"Lingren","sequence":"additional","affiliation":[{"name":"Cincinnati Children\u2019s Hospital Medical Center, Cincinnati, OH, USA"}]},{"given":"Will K","family":"Thompson","sequence":"additional","affiliation":[{"name":"Northwestern University, Feinberg School of Medicine, Chicago, IL, USA"}]},{"given":"Guergana","family":"Savova","sequence":"additional","affiliation":[{"name":"Boston Children\u2019s Hospital and Harvard Medical School, Boston, MA, USA"}]},{"given":"Jonathan","family":"Haines","sequence":"additional","affiliation":[{"name":"Case Western University, Cleveland, OH, USA"}]},{"given":"Dan M","family":"Roden","sequence":"additional","affiliation":[{"name":"Vanderbilt University Medical Center, Nashville, TN, USA"}]},{"given":"Paul A","family":"Harris","sequence":"additional","affiliation":[{"name":"Vanderbilt University Medical Center, Nashville, TN, USA"}]},{"given":"Joshua C","family":"Denny","sequence":"additional","affiliation":[{"name":"Vanderbilt University Medical Center, Nashville, TN, USA"}]}],"member":"286","published-online":{"date-parts":[[2016,3,28]]},"reference":[{"key":"2020110612372817500_ocv202-B1","first-page":"761","article-title":"The Electronic Medical Records and Genomics (eMERGE) Network: past, present, and future","volume":"15","author":"Gottesman","year":"2013","journal-title":"Genet Med Off J Am Coll Med Genet."},{"key":"2020110612372817500_ocv202-B2","first-page":"248","article-title":"The SHARPn project on secondary use of electronic medical record data: progress, plans, and possibilities","volume":"2011","author":"Chute","year":"2011","journal-title":"AMIA Annu Symp Proc."},{"key":"2020110612372817500_ocv202-B3","doi-asserted-by":"crossref","first-page":"e226","DOI":"10.1136\/amiajnl-2013-001926","article-title":"Electronic health records based phenotyping in next-generation clinical trials: a perspective from the NIH Health Care Systems Collaboratory","volume":"20","author":"Richesson","year":"2013","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612372817500_ocv202-B4","doi-asserted-by":"crossref","first-page":"578","DOI":"10.1136\/amiajnl-2014-002747","article-title":"Launching PCORnet, a national patient-centered clinical research network","volume":"21","author":"Fleurence","year":"2014","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612372817500_ocv202-B5","article-title":"Toward high-throughput phenotyping: unbiased automated feature extraction and selection from knowledge sources [published online ahead of print April 29, 2015]","author":"Yu","journal-title":"J Am Med Inform Assoc"},{"key":"2020110612372817500_ocv202-B6","doi-asserted-by":"crossref","first-page":"1095","DOI":"10.1038\/nbt.2757","article-title":"Mining the ultimate phenome repository","volume":"31","author":"Shah","year":"2013","journal-title":"Nat Biotechnol."},{"key":"2020110612372817500_ocv202-B7","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1186\/s13326-015-0010-8","article-title":"Development and validation of a classification approach for extracting severity automatically from electronic health records","volume":"6","author":"Boland","year":"2015","journal-title":"J Biomed Semant."},{"key":"2020110612372817500_ocv202-B8","doi-asserted-by":"crossref","first-page":"e319","DOI":"10.1136\/amiajnl-2013-001952","article-title":"A comparison of phenotype definitions for diabetes mellitus","volume":"20","author":"Richesson","year":"2013","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612372817500_ocv202-B9","doi-asserted-by":"crossref","first-page":"557","DOI":"10.4338\/ACI-2014-02-RA-0013","article-title":"Development and validation of a computer-based algorithm to identify foreign-born patients with HIV infection from the electronic medical record","volume":"5","author":"Levison","year":"2014","journal-title":"Appl Clin Inform."},{"key":"2020110612372817500_ocv202-B10","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1136\/amiajnl-2013-001942","article-title":"Database queries for hospitalizations for acute congestive heart failure: flexible methods and validation based on set theory","volume":"21","author":"Rosenman","year":"2014","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612372817500_ocv202-B11","doi-asserted-by":"crossref","first-page":"e0124653","DOI":"10.1371\/journal.pone.0124653","article-title":"Proton pump inhibitor usage and the risk of myocardial infarction in the general population","volume":"10","author":"Shah","year":"2015","journal-title":"PLoS One."},{"key":"2020110612372817500_ocv202-B12","doi-asserted-by":"crossref","first-page":"776","DOI":"10.1136\/amiajnl-2013-001914","article-title":"Phenotyping for patient safety: algorithm development for electronic health record based automated adverse event and medical error detection in neonatal intensive care","volume":"21","author":"Li","year":"2014","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612372817500_ocv202-B13","doi-asserted-by":"crossref","first-page":"305","DOI":"10.1086\/664052","article-title":"Implementing automated surveillance for tracking clostridium difficile infection at multiple healthcare facilities","volume":"33","author":"Dubberke","year":"2012","journal-title":"Infect Control Hosp Epidemiol."},{"key":"2020110612372817500_ocv202-B14","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1002\/cpt.2","article-title":"Systems pharmacology augments drug safety surveillance","volume":"97","author":"Lorberbaum","year":"2015","journal-title":"Clin Pharmacol Ther."},{"key":"2020110612372817500_ocv202-B15","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1016\/j.ajhg.2011.09.008","article-title":"Variants near FOXE1 are associated with hypothyroidism and other thyroid conditions: using electronic medical records for genome- and phenome-wide studies","volume":"89","author":"Denny","year":"2011","journal-title":"Am J Hum Genet."},{"key":"2020110612372817500_ocv202-B16","doi-asserted-by":"crossref","first-page":"79re1","DOI":"10.1126\/scitranslmed.3001807","article-title":"Electronic medical records for genetic research: results of the eMERGE consortium","volume":"3","author":"Kho","year":"2011","journal-title":"Sci Transl Med"},{"key":"2020110612372817500_ocv202-B17","doi-asserted-by":"crossref","first-page":"560","DOI":"10.1016\/j.ajhg.2010.03.003","article-title":"Robust replication of genotype-phenotype associations across multiple diseases in an electronic medical record","volume":"86","author":"Ritchie","year":"2010","journal-title":"Am J Hum Genet."},{"key":"2020110612372817500_ocv202-B18","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1136\/amiajnl-2011-000456","article-title":"Importance of multi-modal approaches to effectively identify cataract cases from electronic health records","volume":"19","author":"Peissig","year":"2012","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612372817500_ocv202-B19","doi-asserted-by":"crossref","first-page":"394","DOI":"10.1111\/j.1752-8062.2012.00446.x","article-title":"High density GWAS for LDL cholesterol in African Americans using electronic medical records reveals a strong protective variant in APOE","volume":"5","author":"Rasmussen-Torvik","year":"2012","journal-title":"Clin Transl Sci."},{"key":"2020110612372817500_ocv202-B20","doi-asserted-by":"crossref","first-page":"268","DOI":"10.3389\/fgene.2013.00268","article-title":"EMR-linked GWAS study: investigation of variation landscape of loci for body mass index in children","volume":"4","author":"Namjou","year":"2013","journal-title":"Front Genet."},{"key":"2020110612372817500_ocv202-B21","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1186\/s13073-015-0166-y","article-title":"Extracting research-quality phenotypes from electronic health records to support precision medicine","volume":"7","author":"Wei","year":"2015","journal-title":"Genome Med"},{"key":"2020110612372817500_ocv202-B22","doi-asserted-by":"crossref","first-page":"e147","DOI":"10.1136\/amiajnl-2012-000896","article-title":"Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network","volume":"20","author":"Newton","year":"2013","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612372817500_ocv202-B23","article-title":"Development and validation of algorithms to identify acute diverticulitis [published online ahead of print September 25, 2014]","author":"Kawatkar","journal-title":"Pharmacoepidemiol Drug Saf"},{"key":"2020110612372817500_ocv202-B24","doi-asserted-by":"crossref","first-page":"1377","DOI":"10.1161\/CIRCULATIONAHA.112.000604","article-title":"Genome- and phenome-wide analyses of cardiac conduction identifies markers of arrhythmia risk","volume":"127","author":"Ritchie","year":"2013","journal-title":"Circulation."},{"key":"2020110612372817500_ocv202-B25","doi-asserted-by":"crossref","first-page":"212","DOI":"10.1136\/amiajnl-2011-000439","article-title":"Use of diverse electronic medical record systems to identify genetic risk for type 2 diabetes within a genome-wide association study","volume":"19","author":"Kho","year":"2012","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612372817500_ocv202-B26","doi-asserted-by":"crossref","first-page":"e162","DOI":"10.1136\/amiajnl-2011-000583","article-title":"Portability of an algorithm to identify rheumatoid arthritis in electronic health records","volume":"19","author":"Carroll","year":"2012","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612372817500_ocv202-B27","first-page":"274","article-title":"Analyzing the heterogeneity and complexity of Electronic Health Record oriented phenotyping algorithms","volume":"2011","author":"Conway","year":"2011","journal-title":"AMIA Annu Symp Proc."},{"key":"2020110612372817500_ocv202-B28","doi-asserted-by":"crossref","first-page":"1205","DOI":"10.1093\/bioinformatics\/btq126","article-title":"PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene\u2013disease associations","volume":"26","author":"Denny","year":"2010","journal-title":"Bioinformatics."},{"key":"2020110612372817500_ocv202-B29","first-page":"1281","article-title":"Electronic medical records and genomics (eMERGE) network exploration in cataract: several new potential susceptibility loci","volume":"20","author":"Ritchie","year":"2014","journal-title":"Mol Vis."},{"key":"2020110612372817500_ocv202-B30","first-page":"606","article-title":"Using anchors to estimate clinical state without labeled data","volume":"2014","author":"Yoni Halpern","year":"2014","journal-title":"AMIA Annu Symp Proc"},{"key":"2020110612372817500_ocv202-B31","first-page":"722","article-title":"Discovering peripheral arterial disease cases from radiology notes using natural language processing","volume":"2010","author":"Savova","year":"2010","journal-title":"AMIA Annu Symp Proc."},{"key":"2020110612372817500_ocv202-B32"},{"key":"2020110612372817500_ocv202-B33","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1145\/358108.358110","article-title":"Model-driven development of web applications: the AutoWeb system","volume":"18","author":"Fraternali","year":"2000","journal-title":"ACM Trans Inf Syst."},{"key":"2020110612372817500_ocv202-B34"},{"key":"2020110612372817500_ocv202-B35","doi-asserted-by":"crossref","first-page":"118","DOI":"10.4338\/ACI-2013-09-RA-0074","article-title":"A rigorous algorithm to detect and clean inaccurate adult height records within EHR systems","volume":"5","author":"Muthalagu","year":"2014","journal-title":"Appl Clin Inform."},{"key":"2020110612372817500_ocv202-B36","doi-asserted-by":"crossref","first-page":"1844","DOI":"10.1038\/ajg.2014.147","article-title":"Anatomic and advanced adenoma detection rates as quality metrics determined via natural language processing","volume":"109","author":"Gawron","year":"2014","journal-title":"Am J Gastroenterol."},{"key":"2020110612372817500_ocv202-B37","first-page":"113","article-title":"Ephenotyping for abdominal aortic aneurysm in the electronic medical records and genomics (emerge) network: algorithm development and Konstanz Information Miner Workflow","volume":"4","author":"Tromp","year":"2015","journal-title":"Int J Biomed Data Min"},{"key":"2020110612372817500_ocv202-B38","doi-asserted-by":"crossref","first-page":"45","DOI":"10.4103\/0301-4738.37595","article-title":"Understanding and using sensitivity, specificity and predictive values","volume":"56","author":"Parikh","year":"2008","journal-title":"Indian J Ophthalmol."},{"key":"2020110612372817500_ocv202-B39","doi-asserted-by":"crossref","first-page":"D975","DOI":"10.1093\/nar\/gkt1211","article-title":"NCBI\u2019s database of genotypes and phenotypes: dbGaP","volume":"42","author":"Tryka","year":"2014","journal-title":"Nucleic Acids Res."},{"key":"2020110612372817500_ocv202-B40","doi-asserted-by":"crossref","first-page":"1181","DOI":"10.1038\/ng1007-1181","article-title":"The NCBI dbGaP database of genotypes and phenotypes","volume":"39","author":"Mailman","year":"2007","journal-title":"Nat Genet."},{"key":"2020110612372817500_ocv202-B41","article-title":"Design patterns for the development of electronic health record-driven phenotype extraction algorithms [published online ahead of print June 21, 2014]","author":"Rasmussen","journal-title":"J Biomed Inform"},{"key":"2020110612372817500_ocv202-B42","doi-asserted-by":"crossref","first-page":"325","DOI":"10.1186\/1471-2474-15-325","article-title":"A comparative effectiveness trial of postoperative management for lumbar spine surgery: changing behavior through physical therapy (CBPT) study protocol","volume":"15","author":"Archer","year":"2014","journal-title":"BMC Musculoskelet Disord"},{"key":"2020110612372817500_ocv202-B43","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1136\/amiajnl-2012-001145","article-title":"Next-generation phenotyping of electronic health records","volume":"20","author":"Hripcsak","year":"2013","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612372817500_ocv202-B44","doi-asserted-by":"crossref","first-page":"e1002823","DOI":"10.1371\/journal.pcbi.1002823","article-title":"Chapter 13: mining electronic health records in the genomics era","volume":"8","author":"Denny","year":"2012","journal-title":"PLoS Comput Biol."},{"key":"2020110612372817500_ocv202-B45","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1136\/amiajnl-2013-001935","article-title":"A review of approaches to identifying patient phenotype cohorts using electronic health records","volume":"21","author":"Shivade","year":"2014","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612372817500_ocv202-B46","doi-asserted-by":"crossref","first-page":"364","DOI":"10.1016\/j.jbi.2014.07.016","article-title":"Integrating electronic health record information to support integrated care: Practical application of ontologies to improve the accuracy of diabetes disease registers","volume":"52","author":"Liaw","year":"2014","journal-title":"J Biomed Inform."},{"key":"2020110612372817500_ocv202-B47","doi-asserted-by":"crossref","first-page":"1083","DOI":"10.1038\/clpt.2012.42","article-title":"Electronic medical records as a tool in clinical pharmacology: opportunities and challenges","volume":"91","author":"Roden","year":"2012","journal-title":"Clin Pharmacol Ther."},{"key":"2020110612372817500_ocv202-B48","doi-asserted-by":"crossref","first-page":"260","DOI":"10.1016\/j.jbi.2014.07.007","article-title":"Relational machine learning for electronic health record-driven phenotyping","volume":"52","author":"Peissig","year":"2014","journal-title":"J Biomed Inform."},{"key":"2020110612372817500_ocv202-B49","first-page":"911","article-title":"An evaluation of the NQF Quality Data Model for representing Electronic Health Record driven phenotyping algorithms","volume":"2012","author":"Thompson","year":"2012","journal-title":"AMIA Annu Symp Proc."},{"key":"2020110612372817500_ocv202-B50","volume-title":"Mining the Electronic Health Record for Disease Knowledge - Springer","author":"Kumar","year":"2014"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/23\/6\/1046\/34148009\/ocv202.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/23\/6\/1046\/34148009\/ocv202.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,11,6]],"date-time":"2020-11-06T17:50:14Z","timestamp":1604685014000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/23\/6\/1046\/2399228"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,3,28]]},"references-count":50,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2016,3,28]]},"published-print":{"date-parts":[[2016,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocv202","relation":{},"ISSN":["1527-974X","1067-5027"],"issn-type":[{"value":"1527-974X","type":"electronic"},{"value":"1067-5027","type":"print"}],"subject":[],"published-other":{"date-parts":[[2016,11]]},"published":{"date-parts":[[2016,3,28]]}}}