{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T00:55:48Z","timestamp":1774572948133,"version":"3.50.1"},"reference-count":22,"publisher":"Oxford University Press (OUP)","issue":"6","funder":[{"DOI":"10.13039\/100000057","name":"National Institute of General Medical Sciences","doi-asserted-by":"publisher","award":["5R01GM104303"],"award-info":[{"award-number":["5R01GM104303"]}],"id":[{"id":"10.13039\/100000057","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["U54HG007963"],"award-info":[{"award-number":["U54HG007963"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["U01CA198934"],"award-info":[{"award-number":["U01CA198934"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006093","name":"Patient-Centered Outcomes Research Institute","doi-asserted-by":"crossref","award":["CDRN130604608"],"award-info":[{"award-number":["CDRN130604608"]}],"id":[{"id":"10.13039\/100006093","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objective<\/jats:title>\n                  <jats:p>One promise of nationwide adoption of electronic health records (EHRs) is the availability of data for large-scale clinical research studies. However, because the same patient could be treated at multiple health care institutions, data from only a single site might not contain the complete medical history for that patient, meaning that critical events could be missing. 
In this study, we evaluate how simple heuristic checks for data \u201ccompleteness\u201d affect the number of patients in the resulting cohort and introduce potential biases.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and Methods<\/jats:title>\n                  <jats:p>We began with a set of 16 filters that check for the presence of demographics, laboratory tests, and other types of data, and then systematically applied all 2<jats:sup>16<\/jats:sup> possible combinations of these filters to the EHR data for 12 million patients at 7 health care systems and a separate payor claims database of 7 million members.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>EHR data showed considerable variability in data completeness across sites and high correlation between data types. For example, the fraction of patients with diagnoses increased from 35.0% in all patients to 90.9% in those with at least 1 medication. 
An unrelated claims dataset independently showed that most filters select members who are older and more likely female and can eliminate large portions of the population whose data are actually complete.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Discussion and Conclusion<\/jats:title>\n                  <jats:p>As investigators design studies, they need to balance their confidence in the completeness of the data with the effects of placing requirements on the data on the resulting patient cohort.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocx071","type":"journal-article","created":{"date-parts":[[2017,6,12]],"date-time":"2017-06-12T11:08:09Z","timestamp":1497265689000},"page":"1134-1141","source":"Crossref","is-referenced-by-count":79,"title":["Biases introduced by filtering electronic health records for patients with \u201ccomplete data\u201d"],"prefix":"10.1093","volume":"24","author":[{"given":"Griffin M","family":"Weber","sequence":"first","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA"},{"name":"Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA"}]},{"given":"William G","family":"Adams","sequence":"additional","affiliation":[{"name":"Department of Pediatrics, Boston Medical Center, Boston, MA, USA"}]},{"given":"Elmer V","family":"Bernstam","sequence":"additional","affiliation":[{"name":"Department of Internal Medicine, McGovern Medical School, School of Biomedical Informatics, The University of Texas Health Science Center, Houston, TX, USA"}]},{"given":"Jonathan P","family":"Bickel","sequence":"additional","affiliation":[{"name":"Computational Health Informatics Program, Boston Children\u2019s Hospital, Boston, MA, USA"}]},{"given":"Kathe P","family":"Fox","sequence":"additional","affiliation":[{"name":"Department of Analytics and Behavior Change, Aetna, Hartford, CT, 
USA"}]},{"given":"Keith","family":"Marsolo","sequence":"additional","affiliation":[{"name":"Department of Pediatrics, Division of Biomedical Informatics, Cincinnati Children\u2019s Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, OH, USA"}]},{"given":"Vijay A","family":"Raghavan","sequence":"additional","affiliation":[{"name":"Scientific Information Management, Merck, Boston, MA, USA"}]},{"given":"Alexander","family":"Turchin","sequence":"additional","affiliation":[{"name":"Division of Endocrinology, Brigham and Women\u2019s Hospital, Boston, MA, USA"}]},{"given":"Xiaobo","family":"Zhou","sequence":"additional","affiliation":[{"name":"Department of Radiology, Wake Forest University School of Medicine, Winston Salem, NC, USA"}]},{"given":"Shawn N","family":"Murphy","sequence":"additional","affiliation":[{"name":"Department of Neurology, Massachusetts General Hospital, Boston, MA, USA"}]},{"given":"Kenneth D","family":"Mandl","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA"},{"name":"Computational Health Informatics Program, Boston Children\u2019s Hospital, Boston, MA, USA"}]}],"member":"286","published-online":{"date-parts":[[2017,8,1]]},"reference":[{"issue":"4","key":"2020110612450904500_ocx071-B1","doi-asserted-by":"crossref","first-page":"351","DOI":"10.1370\/afm.1279","article-title":"Electronic health records vs Medicaid claims: completeness of diabetes preventive care data in community health centers","volume":"9","author":"Devoe","year":"2011","journal-title":"Ann Fam Med"},{"key":"2020110612450904500_ocx071-B2","doi-asserted-by":"crossref","first-page":"S30","DOI":"10.1097\/MLR.0b013e31829b1dbd","article-title":"Caveats for the use of operational electronic health record data in comparative effectiveness research","volume":"51","author":"Hersh","year":"2013","journal-title":"Med 
Care"},{"issue":"4","key":"2020110612450904500_ocx071-B3","doi-asserted-by":"crossref","first-page":"720","DOI":"10.1136\/amiajnl-2013-002333","article-title":"Agreement of Medicaid claims and electronic health records for assessing preventive care quality among adults","volume":"21","author":"Heintzman","year":"2014","journal-title":"J Am Med Inform Assoc"},{"key":"2020110612450904500_ocx071-B4","doi-asserted-by":"crossref","first-page":"1989","DOI":"10.1001\/archinternmed.2010.439","article-title":"Patients treated at multiple acute health care facilities: quantifying information fragmentation","volume":"170","author":"Bourgeois","year":"2010","journal-title":"Arch Int Med"},{"key":"2020110612450904500_ocx071-B5","first-page":"1","article-title":"Secondary use of EHR: data quality issues and informatics opportunities","volume":"2010","author":"Botsis","year":"2010","journal-title":"AMIA Jt Summits Transl Sci Proc"},{"key":"2020110612450904500_ocx071-B6","first-page":"409","article-title":"All health care is not local: an evaluation of the distribution of Emergency Department care delivered in Indiana","volume":"2011","author":"Finnell","year":"2011","journal-title":"AMIA Annu Symp Proc"},{"key":"2020110612450904500_ocx071-B7","first-page":"259","article-title":"Use of electronic medical records (EMR) for oncology outcomes research: assessing the comparability of EMR information to patient registry and health claims data","volume":"3","author":"Lau","year":"2011","journal-title":"Clin Epidemiol"},{"issue":"2","key":"2020110612450904500_ocx071-B8","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1136\/amiajnl-2011-000597","article-title":"Impact of data fragmentation across healthcare centers on the accuracy of a high-throughput clinical phenotyping algorithm for specifying subjects with type 2 diabetes mellitus","volume":"19","author":"Wei","year":"2012","journal-title":"J Am Med Inform 
Assoc"},{"issue":"4","key":"2020110612450904500_ocx071-B9","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1016\/j.ijmedinf.2012.05.015","article-title":"The absence of longitudinal data limits the accuracy of high-throughput clinical phenotyping for identifying type 2 diabetes mellitus subjects","volume":"82","author":"Wei","year":"2013","journal-title":"Int J Med Inform"},{"issue":"8","key":"2020110612450904500_ocx071-B10","doi-asserted-by":"crossref","first-page":"1486","DOI":"10.1377\/hlthaff.2013.0124","article-title":"Operational health information exchanges show substantial growth, but long-term funding remains a concern","volume":"32","author":"Adler-Milstein","year":"2013","journal-title":"Health Aff (Millwood)"},{"issue":"1","key":"2020110612450904500_ocx071-B11","first-page":"26","article-title":"Health information exchange among US hospitals: Who's in, who's out, and why? Healthcare","volume":"2","author":"Adler-Milstein","year":"2014"},{"issue":"3","key":"2020110612450904500_ocx071-B12","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1016\/j.annemergmed.2013.09.024","article-title":"Emergency physicians' perspectives on their use of health information exchange","volume":"63","author":"Thorn","year":"2014","journal-title":"Ann Emerg Med"},{"issue":"8","key":"2020110612450904500_ocx071-B13","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1007\/s10916-014-0078-1","article-title":"Factors related to health information exchange participation and use systems-level quality improvement","volume":"38","author":"Yeager","year":"2014","journal-title":"J Med Syst"},{"issue":"24","key":"2020110612450904500_ocx071-B14","first-page":"2479","article-title":"Finding the missing link for big biomedical data","volume":"311","author":"Weber","year":"2014","journal-title":"JAMA"},{"key":"2020110612450904500_ocx071-B15","doi-asserted-by":"crossref","first-page":"615","DOI":"10.1136\/amiajnl-2014-002727","article-title":"Scalable Collaborative 
Infrastructure for a Learning Healthcare System (SCILHS): architecture","volume":"21","author":"Mandl","year":"2014","journal-title":"J Am Med Inform Assoc"},{"key":"2020110612450904500_ocx071-B16","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1136\/amiajnl-2014-002864","article-title":"PCORnet: turning a dream into reality","volume":"21","author":"Collins","year":"2014","journal-title":"J Am Med Inform Assoc"},{"key":"2020110612450904500_ocx071-B17","doi-asserted-by":"crossref","first-page":"592","DOI":"10.1056\/NEJMp1313061","article-title":"PCORI at 3 years: progress, lessons, and plans","volume":"370","author":"Selby","year":"2014","journal-title":"New Engl J Med"},{"key":"2020110612450904500_ocx071-B18","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1186\/1472-6947-14-51","article-title":"Hidden in plain sight: bias towards sick patients when sampling patients with sufficient electronic health record data for research","volume":"14","author":"Rusanov","year":"2014","journal-title":"BMC Med Inform Decis Mak"},{"issue":"5","key":"2020110612450904500_ocx071-B19","doi-asserted-by":"crossref","first-page":"830","DOI":"10.1016\/j.jbi.2013.06.010","article-title":"Defining and measuring completeness of electronic health records for secondary use","volume":"46","author":"Weiskopf","year":"2013","journal-title":"J Biomed Inform"},{"issue":"2","key":"2020110612450904500_ocx071-B20","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1136\/jamia.2009.000893","article-title":"Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2)","volume":"17","author":"Murphy","year":"2010","journal-title":"J Am Med Inform Assoc"},{"key":"2020110612450904500_ocx071-B21","doi-asserted-by":"crossref","first-page":"1840","DOI":"10.1111\/1475-6773.12102","article-title":"Accountable Care Organizations in the United States: market and demographic factors associated with 
formation","volume":"48","author":"Lewis","year":"2013","journal-title":"Health Services Res"},{"key":"2020110612450904500_ocx071-B22","doi-asserted-by":"crossref","first-page":"1493","DOI":"10.1001\/jama.2012.451","article-title":"Accountable care organizations and antitrust: restructuring the health care market","volume":"307","author":"Scheffler","year":"2012","journal-title":"JAMA"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/24\/6\/1134\/34149462\/ocx071.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/24\/6\/1134\/34149462\/ocx071.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,11,6]],"date-time":"2020-11-06T18:26:37Z","timestamp":1604687197000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/24\/6\/1134\/4057688"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,8,1]]},"references-count":22,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2017,8,1]]},"published-print":{"date-parts":[[2017,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocx071","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2017,11]]},"published":{"date-parts":[[2017,8,1]]}}}