{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,25]],"date-time":"2025-09-25T16:34:43Z","timestamp":1758818083954,"version":"3.37.3"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2022,11,30]],"date-time":"2022-11-30T00:00:00Z","timestamp":1669766400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/pages\/standard-publication-reuse-rights"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01 HL133786","R01 GM139891","R01AG069900","U01 HG011181","P50HD106446","R01 LM012806"],"award-info":[{"award-number":["R01 HL133786","R01 GM139891","R01AG069900","U01 HG011181","P50HD106446","R01 LM012806"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100013017","name":"Vanderbilt University Medical Center","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100013017","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006108","name":"National Center for Advancing Translational Science","doi-asserted-by":"crossref","award":["UL1 TR002243"],"award-info":[{"award-number":["UL1 TR002243"]}],"id":[{"id":"10.13039\/100006108","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,2,16]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Objective<\/jats:title><jats:p>A previous study, PheMAP, combined independent, online resources to enable high-throughput phenotyping (HTP) using electronic health records (EHRs). However, online resources offer distinct quality descriptions of diseases which may affect phenotyping performance. We aimed to evaluate the phenotyping performance of single resource-based PheMAPs and investigate an optimized strategy for HTP.<\/jats:p><\/jats:sec><jats:sec><jats:title>Materials and Methods<\/jats:title><jats:p>We compared how each resource produced top-ranked concept unique identifiers (CUIs) by term frequency\u2014inverse document frequency with Jaccard matrices comparing single resources and the original PheMAP. We correlated top-ranked concepts from each resource to features used in established Phenotype KnowledgeBase (PheKB) algorithms for hypothyroidism, type II diabetes mellitus (T2DM), and dementias. Using resources separately, we calculated multiple phenotype risk scores for individuals from Vanderbilt University Medical Center\u2019s BioVU DNA Biobank and compared phenotyping performance against rule-based eMERGE algorithms. Lastly, we implemented an ensemble strategy which classified patient case\/control status based upon PheMAP resource agreement.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Jaccard similarity matrices indicate that the similarity of CUIs comprising single resource-based PheMAPs varies. Single resource-based PheMAPs generated from MedlinePlus and MedicineNet outperformed others but only encompass 81.6% of overall disease phenotypes. We propose the PheMAP-Ensemble which provides higher average accuracy and precision than the combined average accuracy and precision of single resource-based PheMAPs. While offering complete phenotype coverage, PheMAP-Ensemble significantly increases phenotyping recall compared to the original iteration.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>Resources comprising the PheMAP produce different phenotyping performance when implemented individually. The ensemble method significantly improves the quality of PheMAP by fully utilizing dissimilar resources to capture accurate phenotyping data from EHRs.<\/jats:p><\/jats:sec>","DOI":"10.1093\/jamia\/ocac234","type":"journal-article","created":{"date-parts":[[2022,12,1]],"date-time":"2022-12-01T04:53:15Z","timestamp":1669870395000},"page":"456-465","source":"Crossref","is-referenced-by-count":6,"title":["Evaluating resources composing the PheMAP knowledge base to enhance high-throughput phenotyping"],"prefix":"10.1093","volume":"30","author":[{"given":"Nicholas C","family":"Wan","sequence":"first","affiliation":[{"name":"Department of Biomedical Engineering, Vanderbilt University , Nashville, Tennessee, USA"}]},{"given":"Ali A","family":"Yaqoob","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Vanderbilt University Medical Center , Nashville, Tennessee, USA"}]},{"given":"Henry H","family":"Ong","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Vanderbilt University Medical Center , Nashville, Tennessee, USA"}]},{"given":"Juan","family":"Zhao","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Vanderbilt University Medical Center , Nashville, Tennessee, USA"}]},{"given":"Wei-Qi","family":"Wei","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Vanderbilt University Medical Center , Nashville, Tennessee, USA"}]}],"member":"286","published-online":{"date-parts":[[2022,11,30]]},"reference":[{"issue":"6","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1038\/nrg2999","article-title":"Using electronic health records to drive discovery in disease genomics","volume":"12","author":"Kohane","year":"2011","journal-title":"Nat Rev Genet"},{"issue":"234","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"234cm3","DOI":"10.1126\/scitranslmed.3008604","article-title":"Biobanks and electronic medical records: enabling cost-effective research","volume":"6","author":"Bowton","year":"2014","journal-title":"Sci Transl Med"},{"issue":"7576","key":"2023021611051964200_","doi-asserted-by":"publisher","first-page":"S14","DOI":"10.1038\/527S14a","article-title":"Deep phenotyping: the details of disease","volume":"527","author":"Delude","year":"2015","journal-title":"Nature"},{"issue":"1","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1186\/s13073-015-0166-y","article-title":"Extracting research-quality phenotypes from electronic health records to support precision medicine","volume":"7","author":"Wei","year":"2015","journal-title":"Genome Med"},{"issue":"e1","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"e147","DOI":"10.1136\/amiajnl-2012-000896","article-title":"Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network","volume":"20","author":"Newton","year":"2013","journal-title":"J Am Med Inform Assoc"},{"issue":"12","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"3426","DOI":"10.1038\/s41596-019-0227-6","article-title":"High-throughput phenotyping with electronic medical record data using a common semi-supervised approach (PheCAP)","volume":"14","author":"Zhang","year":"2019","journal-title":"Nat Protoc"},{"issue":"4","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"e14325","DOI":"10.2196\/14325","article-title":"Mapping ICD-10 and ICD-10-CM codes to phecodes: workflow development and initial evaluation","volume":"7","author":"Wu","year":"2019","journal-title":"JMIR Med Inform"},{"issue":"7","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"e0175508","DOI":"10.1371\/journal.pone.0175508","article-title":"Evaluating phecodes, clinical classification software, and ICD-9-CM codes for phenome-wide association studies in the electronic health record","volume":"12","author":"Wei","year":"2017","journal-title":"PLoS One"},{"issue":"1","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"18953","DOI":"10.1038\/s41598-021-98579-4","article-title":"An updated, computable MEDication-Indication resource for biomedical research","volume":"11","author":"Zheng","year":"2021","journal-title":"Sci Rep"},{"issue":"5","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"954","DOI":"10.1136\/amiajnl-2012-001431","article-title":"Development and evaluation of an ensemble resource linking medications to their indications","volume":"20","author":"Wei","year":"2013","journal-title":"J Am Med Inform Assoc"},{"issue":"11","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"1675","DOI":"10.1093\/jamia\/ocaa104","article-title":"PheMap: a multi-resource knowledge base for high-throughput phenotyping within electronic health records","volume":"27","author":"Zheng","year":"2020","journal-title":"J Am Med Inform Assoc"},{"issue":"2","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"212","DOI":"10.1136\/amiajnl-2011-000439","article-title":"Use of diverse electronic medical record systems to identify genetic risk for type 2 diabetes within a genome-wide association study","volume":"19","author":"Kho","year":"2012","journal-title":"J Am Med Inform Assoc"},{"author":"PheKB","key":"2023021611051964200_"},{"issue":"4","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1016\/j.ajhg.2011.09.008","article-title":"Variants near FOXE1 are associated with hypothyroidism and other thyroid conditions: using electronic medical records for genome- and phenome-wide studies","volume":"89","author":"Denny","year":"2011","journal-title":"Am J Hum Genet"},{"issue":"13","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"5645","DOI":"10.1016\/j.eswa.2015.02.055","article-title":"An analysis of the coherence of descriptors in topic modeling","volume":"42","author":"O\u2019Callaghan","year":"2015","journal-title":"Expert Syst Appl"},{"issue":"6","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"1046","DOI":"10.1093\/jamia\/ocv202","article-title":"PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability","volume":"23","author":"Kirby","year":"2016","journal-title":"J Am Med Inform Assoc"},{"issue":"4","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"351","DOI":"10.1197\/jamia.M1176","article-title":"\u201cUnderstanding\u201d medical school curriculum content using KnowledgeMap","volume":"10","author":"Denny","year":"2003","journal-title":"J Am Med Inform Assoc"},{"issue":"9","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"1205","DOI":"10.1093\/bioinformatics\/btq126","article-title":"PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene\u2013disease associations","volume":"26","author":"Denny","year":"2010","journal-title":"Bioinformatics"},{"issue":"12","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"1102","DOI":"10.1038\/nbt.2749","article-title":"Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data","volume":"31","author":"Denny","year":"2013","journal-title":"Nat Biotechnol"},{"issue":"13","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"1377","DOI":"10.1161\/CIRCULATIONAHA.112.000604","article-title":"Genome- and phenome-wide analyses of cardiac conduction identifies markers of arrhythmia risk","volume":"127","author":"Ritchie","year":"2013","journal-title":"Circulation"},{"issue":"13","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"2322","DOI":"10.1093\/bioinformatics\/bty109","article-title":"nVenn: generalized, quasi-proportional Venn and Euler diagrams","volume":"34","author":"P\u00e9rez-Silva","year":"2018","journal-title":"Bioinformatics"},{"issue":"3","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"362","DOI":"10.1038\/clpt.2008.89","article-title":"Development of a large-scale de-identified DNA biobank to enable personalized medicine","volume":"84","author":"Roden","year":"2008","journal-title":"Clin Pharmacol Ther"},{"author":"PheKB","key":"2023021611051964200_"},{"author":"PheKB","key":"2023021611051964200_"},{"issue":"12","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"1983","DOI":"10.1109\/TVCG.2014.2346248","article-title":"UpSet: visualization of intersecting sets","volume":"20","author":"Lex","year":"2014","journal-title":"IEEE Trans Vis Comput Graph"},{"key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"276","DOI":"10.11613\/BM.2012.031","article-title":"Interrater reliability: the kappa statistic","volume":"22","author":"McHugh","year":"2012","journal-title":"Biochem Med"},{"key":"2023021611051964200_","doi-asserted-by":"publisher","DOI":"10.1007\/s13187-021-02075-2","article-title":"Quality assessment of online resources for the most common cancers","author":"Li","year":"2021","journal-title":"J Cancer Educ"},{"issue":"e1","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"e20","DOI":"10.1093\/jamia\/ocv130","article-title":"Combining billing codes, clinical notes, and medications from electronic health records provides superior phenotyping performance","volume":"23","author":"Wei","year":"2016","journal-title":"J Am Med Inform Assoc"},{"issue":"3","key":"2023021611051964200_","doi-asserted-by":"crossref","first-page":"553","DOI":"10.1093\/jamia\/ocu023","article-title":"Feasibility and utility of applications of the common data model to multiple, disparate observational health databases","volume":"22","author":"Voss","year":"2015","journal-title":"J Am Med Inform Assoc"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/30\/3\/456\/49198743\/ocac234.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/30\/3\/456\/49198743\/ocac234.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,3,13]],"date-time":"2023-03-13T19:43:52Z","timestamp":1678736632000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/30\/3\/456\/6855150"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,30]]},"references-count":29,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2022,11,30]]},"published-print":{"date-parts":[[2023,2,16]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocac234","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"type":"print","value":"1067-5027"},{"type":"electronic","value":"1527-974X"}],"subject":[],"published-other":{"date-parts":[[2023,3,1]]},"published":{"date-parts":[[2022,11,30]]}}}