{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:39:47Z","timestamp":1772138387042,"version":"3.50.1"},"reference-count":25,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2020,5,6]],"date-time":"2020-05-06T00:00:00Z","timestamp":1588723200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000092","name":"NLM","doi-asserted-by":"publisher","award":["R01LM011369-06"],"award-info":[{"award-number":["R01LM011369-06"]}],"id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000092","name":"NLM","doi-asserted-by":"publisher","award":["R01LM006910"],"award-info":[{"award-number":["R01LM006910"]}],"id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100005205","name":"Janssen Research and Development LLC","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100005205","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Objective<\/jats:title>\n                    <jats:p>Accurate electronic phenotyping is essential to support collaborative observational research. Supervised machine learning methods can be used to train phenotype classifiers in a high-throughput manner using imperfectly labeled data. We developed 10 phenotype classifiers using this approach and evaluated performance across multiple sites within the Observational Health Data Sciences and Informatics (OHDSI) network.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Materials and Methods<\/jats:title>\n                    <jats:p>We constructed classifiers using the Automated PHenotype Routine for Observational Definition, Identification, Training and Evaluation (APHRODITE) R-package, an open-source framework for learning phenotype classifiers using datasets in the Observational Medical Outcomes Partnership Common Data Model. We labeled training data based on the presence of multiple mentions of disease-specific codes. Performance was evaluated on cohorts derived using rule-based definitions and real-world disease prevalence. Classifiers were developed and evaluated across 3 medical centers, including 1 international site.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Compared to the multiple mentions labeling heuristic, classifiers showed a mean recall boost of 0.43 with a mean precision loss of 0.17. Performance decreased slightly when classifiers were shared across medical centers, with mean recall and precision decreasing by 0.08 and 0.01, respectively, at a site within the USA, and by 0.18 and 0.10, respectively, at an international site.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Discussion and Conclusion<\/jats:title>\n                    <jats:p>We demonstrate a high-throughput pipeline for constructing and sharing phenotype classifiers across sites within the OHDSI network using APHRODITE. Classifiers exhibit good portability between sites within the USA, however limited portability internationally, indicating that classifier generalizability may have geographic limitations, and, consequently, sharing the classifier-building recipe, rather than the pretrained classifiers, may be more useful for facilitating collaborative observational research.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/jamia\/ocaa032","type":"journal-article","created":{"date-parts":[[2020,3,12]],"date-time":"2020-03-12T08:39:32Z","timestamp":1584002372000},"page":"877-883","source":"Crossref","is-referenced-by-count":18,"title":["Development and validation of phenotype classifiers across multiple sites in the observational health data sciences and informatics network"],"prefix":"10.1093","volume":"27","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7336-1056","authenticated-orcid":false,"given":"Mehr","family":"Kashyap","sequence":"first","affiliation":[{"name":"Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA"}]},{"given":"Martin","family":"Seneviratne","sequence":"additional","affiliation":[{"name":"Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8499-824X","authenticated-orcid":false,"given":"Juan M","family":"Banda","sequence":"additional","affiliation":[{"name":"Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA"},{"name":"Department of Computer Science, Georgia State University, Atlanta, Georgia, USA"}]},{"given":"Thomas","family":"Falconer","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Columbia University, New York, New York, USA"}]},{"given":"Borim","family":"Ryu","sequence":"additional","affiliation":[{"name":"Office of eHealth and Business, Seoul National University Bundang Hospital, Gyeonggi-do, South Korea"}]},{"given":"Sooyoung","family":"Yoo","sequence":"additional","affiliation":[{"name":"Office of eHealth and Business, Seoul National University Bundang Hospital, Gyeonggi-do, South Korea"}]},{"given":"George","family":"Hripcsak","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Columbia University, New York, New York, USA"}]},{"given":"Nigam H","family":"Shah","sequence":"additional","affiliation":[{"name":"Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA"}]}],"member":"286","published-online":{"date-parts":[[2020,5,6]]},"reference":[{"issue":"1","key":"2020110613100061400_ocaa032-B1","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1146\/annurev-biodatasci-080917-013315","article-title":"Advances in electronic phenotyping: from rule-based definitions to machine learning models","volume":"1","author":"Banda","year":"2018","journal-title":"Annu Rev Biomed Data Sci."},{"issue":"e2","key":"2020110613100061400_ocaa032-B2","doi-asserted-by":"crossref","first-page":"e226","DOI":"10.1136\/amiajnl-2013-001926","article-title":"Electronic health records based phenotyping in next-generation clinical trials: a perspective from the NIH Health Care Systems Collaboratory","volume":"20","author":"Richesson","year":"2013","journal-title":"J Am Med Inform Assoc."},{"issue":"1","key":"2020110613100061400_ocaa032-B3","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1136\/amiajnl-2012-001145","article-title":"Next-generation phenotyping of electronic health records","volume":"20","author":"Hripcsak","year":"2013","journal-title":"J Am Med Inform Assoc."},{"issue":"e2","key":"2020110613100061400_ocaa032-B4","doi-asserted-by":"crossref","first-page":"e206","DOI":"10.1136\/amiajnl-2013-002428","article-title":"Electronic health records-driven phenotyping: challenges, recent advances, and perspectives","volume":"20","author":"Pathak","year":"2013","journal-title":"J Am Med Inform Assoc."},{"issue":"2","key":"2020110613100061400_ocaa032-B5","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1136\/amiajnl-2013-001935","article-title":"A review of approaches to identifying patient phenotype cohorts using electronic health records","volume":"21","author":"Shivade","year":"2014","journal-title":"J Am Med Inform Assoc."},{"issue":"6","key":"2020110613100061400_ocaa032-B6","doi-asserted-by":"crossref","first-page":"1046","DOI":"10.1093\/jamia\/ocv202","article-title":"PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability","volume":"23","author":"Kirby","year":"2016","journal-title":"J Am Med Inform Assoc."},{"issue":"11","key":"2020110613100061400_ocaa032-B7","doi-asserted-by":"crossref","first-page":"1540","DOI":"10.1093\/jamia\/ocy101","article-title":"A case study evaluating the portability of an executable computable phenotype algorithm across multiple institutions and electronic health record environments","volume":"25","author":"Pacheco","year":"2018","journal-title":"J Am Med Inform Assoc."},{"issue":"6","key":"2020110613100061400_ocaa032-B8","doi-asserted-by":"crossref","first-page":"1625","DOI":"10.1093\/ije\/dys188","article-title":"Data resource profile: cardiovascular disease research using linked bespoke studies and electronic health records (CALIBER)","volume":"41","author":"Denaxas","year":"2012","journal-title":"Int J Epidemiol."},{"issue":"e1","key":"2020110613100061400_ocaa032-B9","doi-asserted-by":"crossref","first-page":"e147","DOI":"10.1136\/amiajnl-2012-000896","article-title":"Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network","volume":"20","author":"Newton","year":"2013","journal-title":"J Am Med Inform Assoc."},{"key":"2020110613100061400_ocaa032-B10","first-page":"574","article-title":"Observational Health Data Sciences and Informatics (OHDSI): opportunities for observational researchers","volume":"216","author":"Hripcsak","year":"2015","journal-title":"Stud Heal Technol Inform"},{"issue":"2","key":"2020110613100061400_ocaa032-B11","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1136\/amiajnl-2011-000492","article-title":"A translational engine at the national scale: informatics for integrating biology and the bedside","volume":"19","author":"Kohane","year":"2012","journal-title":"J Am Med Inform Assoc."},{"key":"2020110613100061400_ocaa032-B12","doi-asserted-by":"crossref","first-page":"103253","DOI":"10.1016\/j.jbi.2019.103253","article-title":"Facilitating phenotype transfer using a common data model","volume":"96","author":"Hripcsak","year":"2019","journal-title":"J Biomed Inform."},{"issue":"e2","key":"2020110613100061400_ocaa032-B13","doi-asserted-by":"crossref","first-page":"e275","DOI":"10.1136\/amiajnl-2013-001856","article-title":"Using electronic health records data to identify patients with chronic pain in a primary care setting","volume":"20","author":"Tian","year":"2013","journal-title":"J Am Med Inform Assoc."},{"key":"2020110613100061400_ocaa032-B14","first-page":"189","article-title":"Na\u00efve Electronic Health Record phenotype identification for rheumatoid arthritis","volume":"2011","author":"Carroll","year":"2011","journal-title":"AMIA Annu Symp Proc"},{"issue":"e1","key":"2020110613100061400_ocaa032-B15","doi-asserted-by":"crossref","first-page":"e162","DOI":"10.1136\/amiajnl-2011-000583","article-title":"Portability of an algorithm to identify rheumatoid arthritis in electronic health records","volume":"19","author":"Carroll","year":"2012","journal-title":"J Am Med Inform Assoc."},{"issue":"e2","key":"2020110613100061400_ocaa032-B16","doi-asserted-by":"crossref","first-page":"e253","DOI":"10.1136\/amiajnl-2013-001945","article-title":"Applying active learning to high-throughput phenotyping algorithms for electronic health records data","volume":"20","author":"Chen","year":"2013","journal-title":"J Am Med Inform Assoc."},{"issue":"6","key":"2020110613100061400_ocaa032-B17","doi-asserted-by":"crossref","first-page":"1166","DOI":"10.1093\/jamia\/ocw028","article-title":"Learning statistical models of phenotypes using noisy labeled training data","volume":"23","author":"Agarwal","year":"2016","journal-title":"J Am Med Inform Assoc."},{"key":"2020110613100061400_ocaa032-B18","first-page":"606","article-title":"Using anchors to estimate clinical state without labeled data","volume":"2014","author":"Halpern","year":"2014","journal-title":"AMIA Annu Symp Proc"},{"issue":"4","key":"2020110613100061400_ocaa032-B19","doi-asserted-by":"crossref","first-page":"731","DOI":"10.1093\/jamia\/ocw011","article-title":"Electronic medical record phenotyping using the anchor and learn framework","volume":"23","author":"Halpern","year":"2016","journal-title":"J Am Med Inform Assoc."},{"key":"2020110613100061400_ocaa032-B20","doi-asserted-by":"crossref","first-page":"168","DOI":"10.1016\/j.jbi.2016.10.007","article-title":"Consortium PRO-AALSCT. Semi-supervised learning of the electronic health record for phenotype stratification","volume":"64","author":"Beaulieu-Jones","year":"2016","journal-title":"J Biomed Inform."},{"issue":"e1","key":"2020110613100061400_ocaa032-B21","doi-asserted-by":"crossref","first-page":"e143","DOI":"10.1093\/jamia\/ocw135","article-title":"Surrogate-assisted feature extraction for high-throughput phenotyping","volume":"24","author":"Yu","year":"2017","journal-title":"J Am Med Inform Assoc."},{"issue":"1","key":"2020110613100061400_ocaa032-B22","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1093\/jamia\/ocy154","article-title":"Automated and flexible identification of complex disease: building a model for systemic lupus erythematosus using noisy labeling","volume":"26","author":"Murray","year":"2019","journal-title":"J Am Med Informatics Assoc."},{"issue":"2","key":"2020110613100061400_ocaa032-B23","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1006\/jcss.1996.0019","article-title":"General bounds on the number of examples needed for learning probabilistic concepts","volume":"52","author":"Simon","year":"1996","journal-title":"J Comput Syst Sci."},{"key":"2020110613100061400_ocaa032-B24","first-page":"48","article-title":"Electronic phenotyping with APHRODITE and the Observational Health Sciences and Informatics (OHDSI) data network","volume":"2017","author":"Banda","year":"2017","journal-title":"AMIA Jt Summits Transl Sci Proc"},{"key":"2020110613100061400_ocaa032-B25","doi-asserted-by":"crossref","first-page":"103258","DOI":"10.1016\/j.jbi.2019.103258","article-title":"PheValuator: development and evaluation of a phenotype algorithm evaluator","volume":"97","author":"Swerdel","year":"2019","journal-title":"J Biomed Inform."}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/27\/6\/877\/34152823\/ocaa032.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/27\/6\/877\/34152823\/ocaa032.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,11,6]],"date-time":"2020-11-06T14:30:49Z","timestamp":1604673049000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/27\/6\/877\/5831103"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,6]]},"references-count":25,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2020,5,6]]},"published-print":{"date-parts":[[2020,6,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocaa032","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/673418","asserted-by":"object"}]},"ISSN":["1527-974X"],"issn-type":[{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,6]]},"published":{"date-parts":[[2020,5,6]]}}}