{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,2]],"date-time":"2026-01-02T17:17:34Z","timestamp":1767374254641,"version":"3.48.0"},"reference-count":65,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2025,9,3]],"date-time":"2025-09-03T00:00:00Z","timestamp":1756857600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Objectives<\/jats:title>\n                    <jats:p>Electronic Health Records (EHRs) sampled from different populations can introduce unwanted biases, limit individual-level data sharing, and make the data and fitted model hardly transferable across different population groups. In this context, our main goal is to design an effective method to transfer knowledge between population groups, with computable guarantees for suitability, and that can be applied to quantify treatment disparities.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Materials and Methods<\/jats:title>\n                    <jats:p>For a model trained in an embedded feature space of one subgroup, our proposed framework, Optimal Transport-based Transfer Learning for EHRs (OTTEHR), combines feature embedding of the data and unbalanced optimal transport (OT) for domain adaptation to another population group. 
To test our method, we processed and divided the MIMIC-III and MIMIC-IV databases into multiple population groups using ICD codes and multiple labels.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We derive a theoretical bound for the generalization error of our method, and interpret it in terms of the Wasserstein distance, unbalancedness between the source and target domains, and labeling divergence, which can be used as a guide for assessing the suitability of binary classification and regression tasks. In general, our method achieves better accuracy and computational efficiency compared with standard and machine learning transfer learning methods on various tasks. Upon testing our method for populations with different insurance plans, we detect various levels of disparities in hospital duration stay between groups.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Discussion and Conclusion<\/jats:title>\n                    <jats:p>By leveraging tools from OT theory, our proposed framework allows to compare statistical models on EHR data between different population groups. As a potential application for clinical decision making, we quantify treatment disparities between different population groups. 
Future directions include applying OTTEHR to broader regression and classification tasks and extending the method to semi-supervised learning.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/jamia\/ocaf134","type":"journal-article","created":{"date-parts":[[2025,8,8]],"date-time":"2025-08-08T12:06:48Z","timestamp":1754654808000},"page":"15-25","source":"Crossref","is-referenced-by-count":1,"title":["Transport-based transfer learning on Electronic Health Records: application to detection of treatment disparities"],"prefix":"10.1093","volume":"33","author":[{"given":"Wanxin","family":"Li","sequence":"first","affiliation":[{"name":"Department of Computer Science, University of British Columbia , Vancouver V6T 1Z4,","place":["Canada"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Saad","family":"Ahmed","sequence":"additional","affiliation":[{"name":"Vancouver Coastal Health , Vancouver V5Z 1M9,","place":["Canada"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yongjin P","family":"Park","sequence":"additional","affiliation":[{"name":"Department of Molecular Oncology, BC Cancer Research, Part of Provincial Health Care Authority , Vancouver V5Z 4E6,","place":["Canada"]},{"name":"Department of Pathology and Laboratory Medicine, University of British Columbia , Vancouver V6T 2B5,","place":["Canada"]},{"name":"Department of Statistics, University of British Columbia , Vancouver V6T 1Z4,","place":["Canada"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9685-5171","authenticated-orcid":false,"given":"Khanh","family":"Dao Duc","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of British Columbia , Vancouver V6T 1Z4,","place":["Canada"]},{"name":"Department of Mathematics, University of British Columbia , Vancouver V6T 
1Z2,","place":["Canada"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2025,9,3]]},"reference":[{"key":"2026010211515937600_ocaf134-B1","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1016\/j.hlpt.2012.07.003","article-title":"UK Biobank: current status and what it means for epidemiology","volume":"1","author":"Allen","year":"2012","journal-title":"Health Policy Technol"},{"key":"2026010211515937600_ocaf134-B2","doi-asserted-by":"crossref","first-page":"47","DOI":"10.2147\/RMHP.S12985","article-title":"Benefits and drawbacks of electronic health record systems","volume":"4","author":"Menachemi","year":"2011","journal-title":"Risk Manag Healthc Policy"},{"key":"2026010211515937600_ocaf134-B3","doi-asserted-by":"crossref","first-page":"e58227","DOI":"10.7554\/eLife.58227","article-title":"Augmented curation of clinical notes from a massive EHR system reveals symptoms of impending COVID-19 diagnosis","volume":"9","author":"Wagner","year":"2020","journal-title":"Elife."},{"key":"2026010211515937600_ocaf134-B4","doi-asserted-by":"crossref","first-page":"92","DOI":"10.1159\/000437311","article-title":"Utilities of electronic medical records to improve quality of care for acute kidney injury: past, present, future","volume":"131","author":"Kashani","year":"2015","journal-title":"Nephron"},{"key":"2026010211515937600_ocaf134-B5","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1111\/j.1469-1809.1955.tb01348.x","article-title":"On estimating the relation between blood group and disease","volume":"19","author":"Woolf","year":"1955","journal-title":"Ann Hum Genet"},{"key":"2026010211515937600_ocaf134-B6","first-page":"184","article-title":"PI S and PI Z alpha-1 antitrypsin deficiency worldwide. 
A review of existing genetic epidemiological data","volume":"67","author":"De Serres","year":"20","journal-title":"Monaldi Arch Chest Dis"},{"key":"2026010211515937600_ocaf134-B7","doi-asserted-by":"crossref","first-page":"e26","DOI":"10.1016\/j.ijmedinf.2010.10.001","article-title":"Aspects of privacy for electronic health records","volume":"80","author":"Haas","year":"2011","journal-title":"Int J Med Inform"},{"key":"2026010211515937600_ocaf134-B8","first-page":"1180","volume-title":"International Conference on Machine Learning","author":"Ganin","year":"2015"},{"key":"2026010211515937600_ocaf134-B9","first-page":"7673","volume-title":"International Conference on Machine Learning","author":"Pham","year":"2020"},{"key":"2026010211515937600_ocaf134-B10","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1007\/s10994-009-5152-4","article-title":"A theory of learning from different domains","volume":"79","author":"Ben-David","year":"2010","journal-title":"Mach Learn"},{"key":"2026010211515937600_ocaf134-B11","doi-asserted-by":"crossref","DOI":"10.1109\/JBHI.2023.3253208","article-title":"Cross-hospital sepsis early detection via semi-supervised optimal transport with self-paced ensemble","author":"Ding","year":"2023","journal-title":"IEEE J Biomed Health Informat"},{"key":"2026010211515937600_ocaf134-B12","first-page":"474","volume-title":"Machine Learn Health","author":"Wang","year":"2022"},{"year":"2017","author":"Gautheron","key":"2026010211515937600_ocaf134-B13"},{"key":"2026010211515937600_ocaf134-B14","doi-asserted-by":"crossref","first-page":"2866","DOI":"10.1109\/TNSRE.2022.3211881","article-title":"Transfer learning with optimal transportation and frequency mixup for EEG-based motor imagery recognition","volume":"30","author":"Chen","year":"2022","journal-title":"IEEE Trans Neural Syst Rehab Eng"},{"key":"2026010211515937600_ocaf134-B15","doi-asserted-by":"crossref","first-page":"6935","DOI":"10.1109\/TGRS.2020.3031337","article-title":"Geographic 
optimal transport for heterogeneous data: fusing remote sensing and social media","volume":"59","author":"Liu","year":"2020","journal-title":"IEEE Trans Geosci Remote Sens"},{"key":"2026010211515937600_ocaf134-B16","first-page":"137","volume-title":"Advances in Neural Information Processing Systems","author":"Ben-David","year":"2006"},{"key":"2026010211515937600_ocaf134-B17","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1007\/s10472-013-9371-9","article-title":"Domain adaptation\u2013can quantity compensate for quality?","volume":"70","author":"Ben-David","year":"2014","journal-title":"Ann Math Artificial Intell"},{"key":"2026010211515937600_ocaf134-B18","article-title":"Joint distribution optimal transportation for domain adaptation","volume":"30","author":"Courty","year":"2017","journal-title":"Adv Neural Inform Process Syst"},{"author":"Malik","key":"2026010211515937600_ocaf134-B19"},{"key":"2026010211515937600_ocaf134-B20","doi-asserted-by":"crossref","first-page":"4353","DOI":"10.18653\/v1\/2022.acl-long.299","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Li","year":"2022"},{"key":"2026010211515937600_ocaf134-B21","doi-asserted-by":"crossref","first-page":"160035","DOI":"10.1038\/sdata.2016.35","article-title":"MIMIC-III, a freely accessible critical care database","volume":"3","author":"Johnson","year":"2016","journal-title":"Sci Data"},{"first-page":"49","year":"2020","author":"Johnson","key":"2026010211515937600_ocaf134-B22"},{"key":"2026010211515937600_ocaf134-B23","doi-asserted-by":"crossref","first-page":"180178","DOI":"10.1038\/sdata.2018.178","article-title":"The eICU Collaborative Research Database, a freely available multi-center database for critical care research","volume":"5","author":"Pollard","year":"2018","journal-title":"Sci 
Data"},{"key":"2026010211515937600_ocaf134-B24","doi-asserted-by":"crossref","first-page":"375","DOI":"10.1002\/j.1477-8696.1990.tb05558.x","article-title":"Principal component analysis: a beginner\u2019s guide\u2014I. Introduction and application","volume":"45","author":"Jolliffe","year":"1990","journal-title":"Weather"},{"key":"2026010211515937600_ocaf134-B25","first-page":"1","article-title":"POT: Python optimal transport","volume":"22","author":"Flamary","year":"2021","journal-title":"J Machine Learn Res"},{"first-page":"1","year":"2010","author":"McKnight","key":"2026010211515937600_ocaf134-B26"},{"key":"2026010211515937600_ocaf134-B27","article-title":"An exploratory analysis of electronic intensive care unit (EICU) collaborative research database","volume":"2","author":"Rajabalizadeh","year":"2020","journal-title":"ICIS 2020 Proceedings"},{"key":"2026010211515937600_ocaf134-B28","first-page":"1805","article-title":"Moving beyond regression techniques in cardiovascular risk prediction: applying machine learning to address analytic challenges","volume":"38","author":"Goldstein","year":"2017","journal-title":"Eur Heart J"},{"key":"2026010211515937600_ocaf134-B29","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1093\/ageing\/afw039","article-title":"Development and validation of an electronic frailty index using routine primary care electronic health record data","volume":"45","author":"Clegg","year":"2016","journal-title":"Age Ageing"},{"key":"2026010211515937600_ocaf134-B30","doi-asserted-by":"crossref","first-page":"96","DOI":"10.1038\/s41597-019-0103-9","article-title":"Multitask learning and benchmarking with clinical time series data","volume":"6","author":"Harutyunyan","year":"2019","journal-title":"Sci Data"},{"key":"2026010211515937600_ocaf134-B31","first-page":"6240","article-title":"Spectrally-normalized margin bounds for neural networks","volume":"30","author":"Bartlett","year":"2017","journal-title":"Adv Neural Inform Process 
Syst"},{"first-page":"807","year":"2010","author":"Nair","key":"2026010211515937600_ocaf134-B32"},{"key":"2026010211515937600_ocaf134-B33","first-page":"448","volume-title":"Int Conf Machine Learn","author":"Ioffe","year":"2015"},{"key":"2026010211515937600_ocaf134-B34","first-page":"26103","article-title":"A mathematical framework for quantifying transferability in multi-source transfer learning","volume":"34","author":"Tong","year":"2021","journal-title":"Adv Neural Inform Process Syst"},{"key":"2026010211515937600_ocaf134-B35","doi-asserted-by":"crossref","first-page":"2684","DOI":"10.1080\/01621459.2022.2071278","article-title":"Transfer learning under high-dimensional generalized linear models","volume":"118","author":"Tian","year":"2023","journal-title":"J Am Stat Assoc"},{"key":"2026010211515937600_ocaf134-B36","doi-asserted-by":"crossref","first-page":"4181","DOI":"10.1109\/TNNLS.2021.3119889","article-title":"Learning smooth representation for unsupervised domain adaptation","volume":"34","author":"Cai","year":"2021","journal-title":"IEEE Trans Neural Networks Learn Syst"},{"key":"2026010211515937600_ocaf134-B37","doi-asserted-by":"crossref","first-page":"10929","DOI":"10.1073\/pnas.162086599","article-title":"Can patient self-management help explain the SES health gradient?","volume":"99","author":"Goldman","year":"2002","journal-title":"Proc Natl Acad Sci"},{"key":"2026010211515937600_ocaf134-B38","doi-asserted-by":"crossref","first-page":"306","DOI":"10.2307\/2136848","article-title":"Family status and health behaviors: social control as a dimension of social integration","volume":"28","author":"Umberson","year":"1987","journal-title":"J Health Soc Behav"},{"key":"2026010211515937600_ocaf134-B39","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1109\/TNN.2010.2091281","article-title":"Domain adaptation via transfer component analysis","volume":"22","author":"Pan","year":"2010","journal-title":"IEEE Trans Neural 
Netw"},{"key":"2026010211515937600_ocaf134-B40","first-page":"153","article-title":"Correlation alignment for unsupervised domain adaptation","author":"Sun","year":"2017"},{"key":"2026010211515937600_ocaf134-B41","first-page":"2066","volume-title":"2012 IEEE Conf Computer Vision Pattern Recogn","author":"Gong","year":"2012"},{"first-page":"44","year":"2018","author":"Damodaran","key":"2026010211515937600_ocaf134-B42"},{"first-page":"1749","year":"2021","author":"Chen","key":"2026010211515937600_ocaf134-B43"},{"first-page":"11744","year":"2023","author":"Nejjar","key":"2026010211515937600_ocaf134-B44"},{"key":"2026010211515937600_ocaf134-B45","doi-asserted-by":"crossref","first-page":"1247","DOI":"10.5194\/gmd-7-1247-2014","article-title":"Root mean square error (RMSE) or mean absolute error (MAE)","volume":"7","author":"Chai","year":"2014","journal-title":"Geoscient Model Develop"},{"key":"2026010211515937600_ocaf134-B46","doi-asserted-by":"crossref","first-page":"15","DOI":"10.7208\/chicago\/9780226533575.003.0002","volume-title":"Means-Tested Transfer Programs in the United States","author":"Gruber","year":"2003"},{"key":"2026010211515937600_ocaf134-B47","doi-asserted-by":"crossref","first-page":"1644","DOI":"10.1016\/j.jpubeco.2007.10.005","article-title":"What did Medicare do? 
The initial impact of Medicare on mortality and out of pocket medical spending","volume":"92","author":"Finkelstein","year":"2008","journal-title":"J Public Econ"},{"key":"2026010211515937600_ocaf134-B48","doi-asserted-by":"crossref","first-page":"194","DOI":"10.1377\/hlthaff.16.1.194","article-title":"Medicaid and private insurance: evidence and implications","volume":"16","author":"Cutler","year":"1997","journal-title":"Health Affairs"},{"year":"2023","author":"Foundation KF","key":"2026010211515937600_ocaf134-B49"},{"year":"2023","author":"Kaiser Family Foundation","key":"2026010211515937600_ocaf134-B50"},{"year":"2021","author":"Kaiser Family Foundation","key":"2026010211515937600_ocaf134-B51"},{"key":"2026010211515937600_ocaf134-B52","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1377\/hlthaff.2013.0934","article-title":"The health reform monitoring survey: addressing data gaps to provide timely insights into the Affordable Care Act","volume":"33","author":"Long","year":"2014","journal-title":"Health Affairs"},{"key":"2026010211515937600_ocaf134-B53","doi-asserted-by":"crossref","first-page":"609","DOI":"10.1086\/505049","article-title":"The labor market effects of rising health insurance premiums","volume":"24","author":"Baicker","year":"2006","journal-title":"J Labor Econ"},{"key":"2026010211515937600_ocaf134-B54","doi-asserted-by":"crossref","first-page":"719","DOI":"10.1038\/s41551-023-01056-8","article-title":"Algorithmic fairness in artificial intelligence for medicine and healthcare","volume":"7","author":"Chen","year":"2023","journal-title":"Nat Biomed Eng"},{"key":"2026010211515937600_ocaf134-B55","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1038\/s41746-023-00858-z","article-title":"Bias in AI-based models for medical applications: challenges and mitigation strategies","volume":"6","author":"Mittermaier","year":"2023","journal-title":"NPJ Digital 
Med"},{"key":"2026010211515937600_ocaf134-B56","first-page":"438","article-title":"Predicting emergency department visits","volume":"2016","author":"Poole","year":"2016","journal-title":"AMIA Summits Transl Sci Proc"},{"key":"2026010211515937600_ocaf134-B57","doi-asserted-by":"crossref","first-page":"372","DOI":"10.1136\/emj.2005.028522","article-title":"Prediction of mortality among emergency medical admissions","volume":"23","author":"Goodacre","year":"2006","journal-title":"Emerg Med J"},{"key":"2026010211515937600_ocaf134-B58","first-page":"1117","article-title":"Examining the etiology of early-onset breast cancer in the Canadian Partnership for Tomorrow\u2019s Health (CanPath)","volume":"32","author":"Pader","year":"2021","journal-title":"Cancer Causes Control"},{"key":"2026010211515937600_ocaf134-B59","doi-asserted-by":"crossref","first-page":"e1001779","DOI":"10.1371\/journal.pmed.1001779","article-title":"UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age","volume":"12","author":"Sudlow","year":"2015","journal-title":"PLoS Med"},{"key":"2026010211515937600_ocaf134-B60","doi-asserted-by":"crossref","first-page":"j4366","DOI":"10.1136\/bmj.j4366","article-title":"Comparison of postoperative outcomes among patients treated by male and female surgeons: a population based matched cohort study","volume":"359","author":"Wallis","year":"2017","journal-title":"BMJ"},{"key":"2026010211515937600_ocaf134-B61","doi-asserted-by":"crossref","first-page":"764934","DOI":"10.3389\/fmed.2021.764934","article-title":"An Unsupervised machine learning clustering and prediction of differential clinical phenotypes of COVID-19 patients based on blood tests\u2014a Hong Kong population study","volume":"8","author":"Lau","year":"2022","journal-title":"Front Med"},{"volume-title":"COVID-19 Pneumonia: Different Respiratory Treatments for Different 
Phenotypes","year":"2020","author":"Gattinoni","key":"2026010211515937600_ocaf134-B62"},{"first-page":"3872","year":"2019","author":"Wei","key":"2026010211515937600_ocaf134-B63"},{"key":"2026010211515937600_ocaf134-B64","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1016\/j.patcog.2017.03.004","article-title":"Semi-supervised manifold-embedded hashing with joint feature representation and classifier learning","volume":"68","author":"Song","year":"2017","journal-title":"Pattern Recogn"},{"key":"2026010211515937600_ocaf134-B65","first-page":"8766","article-title":"The unbalanced Gromov Wasserstein distance: conic formulation and relaxation","volume":"34","author":"S\u00e9journ\u00e9","year":"2021","journal-title":"Adv Neural Inform Process Syst"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/33\/1\/15\/64200503\/ocaf134.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/33\/1\/15\/64200503\/ocaf134.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,2]],"date-time":"2026-01-02T16:52:16Z","timestamp":1767372736000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/33\/1\/15\/8246852"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,3]]},"references-count":65,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,9,3]]},"published-print":{"date-parts":[[2026,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocaf134","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"type":"print","value":"1067-5027"},{"type":"electronic","value":"1527-974X"}],"subject":[],"published-other":{"date-parts":[[2026,1]]},"published":{"date-parts":[[2025,9,3]]}}}