{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,15]],"date-time":"2025-12-15T19:45:47Z","timestamp":1765827947837,"version":"3.37.3"},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2020,10,12]],"date-time":"2020-10-12T00:00:00Z","timestamp":1602460800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000051","name":"National Human Genome Research Institute","doi-asserted-by":"publisher","award":["R01-HG009174"],"award-info":[{"award-number":["R01-HG009174"]}],"id":[{"id":"10.13039\/100000051","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objective<\/jats:title>\n                  <jats:p>Due to a complex set of processes involved with the recording of health information in the Electronic Health Records (EHRs), the truthfulness of EHR diagnosis records is questionable. 
We present a computational approach to estimate the probability that a single diagnosis record in the EHR reflects the true disease.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and Methods<\/jats:title>\n                  <jats:p>Using EHR data on 18 diseases from the Mass General Brigham (MGB) Biobank, we develop generative classifiers on a small set of disease-agnostic features from EHRs that aim to represent Patients, pRoviders, and their Interactions within the healthcare SysteM (PRISM features).<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We demonstrate that PRISM features and the generative PRISM classifiers are potent for estimating disease probabilities and exhibit generalizable and transferable distributional characteristics across diseases and patient populations. The joint probabilities we learn about diseases through the PRISM features via PRISM generative models are transferable and generalizable to multiple diseases.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Discussion<\/jats:title>\n                  <jats:p>The Generative Transfer Learning (GTL) approach with PRISM classifiers enables the scalable validation of computable phenotypes in EHRs without the need for domain-specific knowledge about specific disease processes.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Conclusion<\/jats:title>\n                  <jats:p>Probabilities computed from the generative PRISM classifier can enhance and accelerate applied Machine Learning research and discoveries with EHR data.<\/jats:p>\n               
<\/jats:sec>","DOI":"10.1093\/jamia\/ocaa215","type":"journal-article","created":{"date-parts":[[2020,8,18]],"date-time":"2020-08-18T11:51:14Z","timestamp":1597751474000},"page":"559-568","source":"Crossref","is-referenced-by-count":17,"title":["Generative transfer learning for measuring plausibility of EHR diagnosis records"],"prefix":"10.1093","volume":"28","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0204-8978","authenticated-orcid":false,"given":"Hossein","family":"Estiri","sequence":"first","affiliation":[{"name":"Harvard Medical School, Boston, Massachusetts, USA"},{"name":"Massachusetts General Hospital, Boston, Massachusetts, USA"},{"name":"Mass General Brigham, Boston, Massachusetts, USA"}]},{"given":"Sebastien","family":"Vasey","sequence":"additional","affiliation":[{"name":"Department of Mathematics, Harvard University, Cambridge, Massachusetts, USA"}]},{"given":"Shawn N","family":"Murphy","sequence":"additional","affiliation":[{"name":"Harvard Medical School, Boston, Massachusetts, USA"},{"name":"Massachusetts General Hospital, Boston, Massachusetts, USA"},{"name":"Mass General Brigham, Boston, Massachusetts, USA"}]}],"member":"286","published-online":{"date-parts":[[2020,10,12]]},"reference":[{"volume-title":"The Learning Healthcare System: Workshop Summary","year":"2007","key":"2021030612201944600_ocaa215-B1"},{"key":"2021030612201944600_ocaa215-B2","doi-asserted-by":"crossref","first-page":"w181","DOI":"10.1377\/hlthaff.26.2.w181","article-title":"Bridging the inferential gap: the electronic health record and clinical evidence","volume":"26","author":"Stewart","year":"2007","journal-title":"Health Aff"},{"issue":"8 Suppl 3","key":"2021030612201944600_ocaa215-B3","doi-asserted-by":"crossref","first-page":"S22","DOI":"10.1097\/MLR.0b013e31829b1e2c","article-title":"Data quality assessment for comparative effectiveness research in distributed data networks","volume":"51","author":"Brown","year":"2013","journal-title":"Med 
Care"},{"issue":"Suppl","key":"2021030612201944600_ocaa215-B4","doi-asserted-by":"crossref","first-page":"S60","DOI":"10.1097\/MLR.0b013e318259bff4","article-title":"Data model considerations for clinical effectiveness researchers","volume":"50","author":"Kahn","year":"2012","journal-title":"Med Care"},{"issue":"5","key":"2021030612201944600_ocaa215-B5","doi-asserted-by":"crossref","first-page":"830","DOI":"10.1016\/j.jbi.2013.06.010","article-title":"Defining and measuring completeness of electronic health records for secondary use","volume":"46","author":"Weiskopf","year":"2013","journal-title":"J Biomed Inform"},{"key":"2021030612201944600_ocaa215-B6","doi-asserted-by":"crossref","first-page":"456","DOI":"10.1002\/9781119940012.ch23","volume-title":"Statistical Methods in Healthcare","author":"Gregori","year":"2012"},{"year":"2020","key":"2021030612201944600_ocaa215-B7"},{"issue":"1","key":"2021030612201944600_ocaa215-B8","doi-asserted-by":"crossref","first-page":"18","DOI":"10.13063\/2327-9214.1244","article-title":"A harmonized data quality assessment terminology and framework for the secondary use of electronic health record data","volume":"4","author":"Kahn","year":"2016","journal-title":"eGEMs"},{"issue":"Suppl 1","key":"2021030612201944600_ocaa215-B9","doi-asserted-by":"crossref","first-page":"i109","DOI":"10.1136\/amiajnl-2011-000463","article-title":"Exploiting time in electronic health record correlations","volume":"18","author":"Hripcsak","year":"2011","journal-title":"J Am Med Informatics Assoc"},{"issue":"1","key":"2021030612201944600_ocaa215-B10","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1136\/amiajnl-2012-001145","article-title":"Next-generation phenotyping of electronic health records","volume":"20","author":"Hripcsak","year":"2013","journal-title":"J Am Med Informatics Assoc"},{"key":"2021030612201944600_ocaa215-B11","doi-asserted-by":"crossref","first-page":"k1479","DOI":"10.1136\/bmj.k1479","article-title":"Biases in electronic 
health record data due to processes within the healthcare system: Retrospective observational study","volume":"361","author":"Agniel","year":"2018","journal-title":"BMJ"},{"issue":"1","key":"2021030612201944600_ocaa215-B12","doi-asserted-by":"crossref","first-page":"11","DOI":"10.3390\/jpm6010011","article-title":"The biobank portal for partners personalized medicine: A query tool for working with consented biobank samples, genotypes, and phenotypes using i2b2","volume":"6","author":"Gainer","year":"2016","journal-title":"J Pers Med"},{"issue":"1","key":"2021030612201944600_ocaa215-B13","doi-asserted-by":"crossref","first-page":"2","DOI":"10.3390\/jpm6010002","article-title":"Building the partners healthcare biobank at partners personalized medicine: Informed consent, return of research results, recruitment lessons and operational considerations","volume":"6","author":"Karlson","year":"2016","journal-title":"J Pers Med"},{"issue":"1","key":"2021030612201944600_ocaa215-B14","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1146\/annurev-biodatasci-080917-013315","article-title":"Advances in electronic phenotyping: from rule-based definitions to machine learning models","volume":"1","author":"Banda","year":"2018","journal-title":"Annu Rev Biomed Data Sci"},{"key":"2021030612201944600_ocaa215-B15","first-page":"18","article-title":"The effectiveness of multitask learning for phenotyping with electronic health records data","volume":"24","author":"Ding","year":"2019","journal-title":"Pac Symp Biocomput"},{"issue":"2","key":"2021030612201944600_ocaa215-B16","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1136\/amiajnl-2013-001935","article-title":"A review of approaches to identifying patient phenotype cohorts using electronic health records","volume":"21","author":"Shivade","year":"2014","journal-title":"J Am Med Inform Assoc"},{"key":"2021030612201944600_ocaa215-B17","first-page":"48","article-title":"Electronic phenotyping with APHRODITE and the 
Observational Health Sciences and Informatics (OHDSI) data network","volume":"2017","author":"Banda","year":"2017","journal-title":"AMIA Jt Summits Transl Sci Proc"},{"key":"2021030612201944600_ocaa215-B18","first-page":"606","article-title":"Using anchors to estimate clinical state without labeled data","volume":"2014","author":"Halpern","year":"2014","journal-title":"AMIA Annu Symp Proc"},{"issue":"4","key":"2021030612201944600_ocaa215-B19","doi-asserted-by":"crossref","first-page":"731","DOI":"10.1093\/jamia\/ocw011","article-title":"Electronic medical record phenotyping using the anchor and learn framework","volume":"23","author":"Halpern","year":"2016","journal-title":"J Am Med Inform Assoc"},{"issue":"6","key":"2021030612201944600_ocaa215-B20","doi-asserted-by":"crossref","first-page":"1166","DOI":"10.1093\/jamia\/ocw028","article-title":"Learning statistical models of phenotypes using noisy labeled training data","volume":"23","author":"Agarwal","year":"2016","journal-title":"J Am Med Inform Assoc"},{"issue":"5","key":"2021030612201944600_ocaa215-B21","doi-asserted-by":"crossref","first-page":"993","DOI":"10.1093\/jamia\/ocv034","article-title":"Toward high-throughput phenotyping: unbiased automated feature extraction and selection from knowledge sources","volume":"22","author":"Yu","year":"2015","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"2021030612201944600_ocaa215-B22","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1093\/jamia\/ocx111","article-title":"Enabling phenotypic big data with PheNorm","volume":"25","author":"Yu","year":"2018","journal-title":"J Am Med Informatics Assoc"},{"key":"2021030612201944600_ocaa215-B23","doi-asserted-by":"crossref","first-page":"e143","DOI":"10.1093\/jamia\/ocw135","article-title":"Surrogate-assisted feature extraction for high-throughput phenotyping","volume":"24","author":"Yu","year":"2017","journal-title":"J Am Med Inform 
Assoc"},{"key":"2021030612201944600_ocaa215-B24","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1016\/j.jbi.2017.04.009","article-title":"EHR-based phenotyping: bulk learning and evaluation","volume":"70","author":"Chiu","year":"2017","journal-title":"J Biomed Inform"},{"issue":"10","key":"2021030612201944600_ocaa215-B25","doi-asserted-by":"crossref","first-page":"3200","DOI":"10.1093\/bioinformatics\/btaa088","article-title":"Polar labeling: silver standard algorithm for training disease classifiers","volume":"36","author":"Wagholikar","year":"2020","journal-title":"Bioinformatics"},{"key":"2021030612201944600_ocaa215-B26","first-page":"169","volume-title":"Adv Neural Inf Process Syst","author":"Ng","year":"2002"},{"year":"2016","author":"Goodfellow","key":"2021030612201944600_ocaa215-B27"},{"first-page":"242","year":"2010","author":"Torrey","key":"2021030612201944600_ocaa215-B28"},{"year":"1999","author":"Yang","key":"2021030612201944600_ocaa215-B29"},{"volume-title":"Elements of Information Theory","year":"2012","author":"Cover","key":"2021030612201944600_ocaa215-B30"},{"key":"2021030612201944600_ocaa215-B31","article-title":"High-throughput multimodal automated phenotyping (MAP) with application to PheWAS","author":"Liao","year":"2019","journal-title":"bioRxiv"},{"key":"2021030612201944600_ocaa215-B32","doi-asserted-by":"crossref","first-page":"103122","DOI":"10.1016\/j.jbi.2019.103122","article-title":"Feature extraction for phenotyping from semantic and knowledge resources","volume":"91","author":"Ning","year":"2019","journal-title":"J Biomed Inform"},{"issue":"1","key":"2021030612201944600_ocaa215-B33","doi-asserted-by":"crossref","DOI":"10.1038\/srep26094","article-title":"Deep patient: an unsupervised representation to predict the future of patients from the electronic health records","volume":"6","author":"Miotto","year":"2016","journal-title":"Sci 
Rep"},{"key":"2021030612201944600_ocaa215-B34","doi-asserted-by":"crossref","first-page":"S106","DOI":"10.1097\/MLR.0b013e3181de9e17","article-title":"Prediction modeling using EHR data: challenges, strategies, and a comparison of machine learning approaches","volume":"48","author":"Wu","year":"2010","journal-title":"Med Care"},{"author":"Liu","key":"2021030612201944600_ocaa215-B35","first-page":"705"},{"issue":"2","key":"2021030612201944600_ocaa215-B36","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1136\/jamia.2009.000893","article-title":"Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2)","volume":"17","author":"Murphy","year":"2010","journal-title":"J Am Med Inform Assoc"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/28\/3\/559\/36428626\/ocaa215.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/28\/3\/559\/36428626\/ocaa215.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,6]],"date-time":"2021-03-06T12:20:52Z","timestamp":1615033252000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/28\/3\/559\/5920886"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,12]]},"references-count":36,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2020,10,12]]},"published-print":{"date-parts":[[2021,3,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocaa215","relation":{},"ISSN":["1527-974X"],"issn-type":[{"type":"electronic","value":"1527-974X"}],"subject":[],"published-other":{"date-parts":[[2021,3,1]]},"published":{"date-parts":[[2020,10,12]]}}}