{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,22]],"date-time":"2026-02-22T03:57:55Z","timestamp":1771732675226,"version":"3.50.1"},"reference-count":44,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2020,11,19]],"date-time":"2020-11-19T00:00:00Z","timestamp":1605744000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"Bio Industrial Strategic Technology Development Program","award":["20001234"],"award-info":[{"award-number":["20001234"]}]},{"name":"Bio Industrial Strategic Technology Development Program","award":["20003883"],"award-info":[{"award-number":["20003883"]}]},{"name":"Ministry of Trade, Industry & Energy"},{"name":"Korea Health Technology R&D Project through the Korea Health Industry Development Institute"},{"name":"Ministry of Health & Welfare, Republic of Korea","award":["HI16C0992"],"award-info":[{"award-number":["HI16C0992"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,6,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objective<\/jats:title>\n                  <jats:p>Cause of death is used as an important outcome of clinical research; however, access to cause-of-death data is limited. This study aimed to develop and validate a machine-learning model that predicts the cause of death from the patient\u2019s last medical checkup.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and Methods<\/jats:title>\n                  <jats:p>To classify the mortality status and each individual cause of death, we used a stacking ensemble method. The prediction outcomes were all-cause mortality, 8 leading causes of death in South Korea, and other causes. The clinical data of study populations were extracted from the national claims (n\u2009=\u2009174\u00a0747) and electronic health records (n\u2009=\u2009729\u00a0065) and were used for model development and external validation. Moreover, we imputed the cause of death from the data of 3 US claims databases (n\u2009=\u2009994\u00a0518, 995\u00a0372, and 407\u00a0604, respectively). All databases were formatted to the Observational Medical Outcomes Partnership Common Data Model.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>The generalized area under the receiver operating characteristic curve (AUROC) of the model predicting the cause of death within 60 days was 0.9511. Moreover, the AUROC of the external validation was 0.8887. Among the causes of death imputed in the Medicare Supplemental database, 11.32% of deaths were due to malignant neoplastic disease.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Discussion<\/jats:title>\n                  <jats:p>This study showed the potential of machine-learning models as a new alternative to address the lack of access to cause-of-death data. All processes were disclosed to maintain transparency, and the model was easily applicable to other institutions.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Conclusion<\/jats:title>\n                  <jats:p>A machine-learning model with competent performance was developed to predict cause of death.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocaa277","type":"journal-article","created":{"date-parts":[[2020,10,23]],"date-time":"2020-10-23T19:11:48Z","timestamp":1603480308000},"page":"1098-1107","source":"Crossref","is-referenced-by-count":38,"title":["Machine-learning model to predict the cause of death using a stacking ensemble method for observational data"],"prefix":"10.1093","volume":"28","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1802-1777","authenticated-orcid":false,"given":"Chungsoo","family":"Kim","sequence":"first","affiliation":[{"name":"Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Gyeonggi-do, Republic of Korea"}]},{"given":"Seng Chan","family":"You","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Gyeonggi-do, Republic of Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2970-0778","authenticated-orcid":false,"given":"Jenna M.","family":"Reps","sequence":"additional","affiliation":[{"name":"Janssen Research and Development, Titusville, NJ, USA"}]},{"given":"Jae Youn","family":"Cheong","sequence":"additional","affiliation":[{"name":"Department of Gastroenterology, Ajou University School of Medicine, Suwon, Gyeonggi-do, Republic of Korea"}]},{"given":"Rae Woong","family":"Park","sequence":"additional","affiliation":[{"name":"Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Gyeonggi-do, Republic of Korea"},{"name":"Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Gyeonggi-do, Republic of Korea"}]}],"member":"286","published-online":{"date-parts":[[2020,11,19]]},"reference":[{"issue":"3","key":"2021061318593944600_ocaa277-B1","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1007\/s10654-014-9899-y","article-title":"All-cause mortality as an outcome in epidemiologic studies: proceed with caution","volume":"29","author":"Weiss","year":"2014","journal-title":"Eur J Epidemiol"},{"issue":"3","key":"2021061318593944600_ocaa277-B2","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1093\/jnci\/94.3.167","article-title":"All-cause mortality in randomized trials of cancer screening","volume":"94","author":"Black","year":"2002","journal-title":"J Natl Cancer Inst"},{"issue":"21","key":"2021061318593944600_ocaa277-B3","doi-asserted-by":"crossref","first-page":"1985","DOI":"10.1161\/CIRCULATIONAHA.116.023359","article-title":"Should a reduction in all-cause mortality be the goal when assessing preventive medical therapies?","volume":"135","author":"Sasieni","year":"2017","journal-title":"Circulation"},{"issue":"13","key":"2021061318593944600_ocaa277-B4","doi-asserted-by":"crossref","first-page":"6127","DOI":"10.1002\/cam4.2476","article-title":"All-cause mortality versus cancer-specific mortality as outcome in cancer screening trials: a review and modeling study","volume":"8","author":"Heijnsdijk","year":"2019","journal-title":"Cancer Med"},{"issue":"23","key":"2021061318593944600_ocaa277-B5","doi-asserted-by":"crossref","first-page":"2576","DOI":"10.1001\/jama.2016.3332","article-title":"Screening for colorectal cancer: updated evidence report and systematic review for the US preventive services task force","volume":"315","author":"Lin","year":"2016","journal-title":"JAMA"},{"issue":"1","key":"2021061318593944600_ocaa277-B6","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1067\/mhj.2002.119770","article-title":"Choice of clinical outcomes in randomized trials of heart failure therapies: disease-specific or overall outcomes?","volume":"143","author":"Yusuf","year":"2002","journal-title":"Am Heart J"},{"issue":"10159","key":"2021061318593944600_ocaa277-B7","doi-asserted-by":"crossref","first-page":"1736","DOI":"10.1016\/S0140-6736(18)32203-7","article-title":"Global, regional, and national age-sex-specific mortality for 282 causes of death in 195 countries and territories, 1980\u20132017: a systematic analysis for the Global Burden of Disease Study 2017","volume":"392","author":"Roth","year":"2018","journal-title":"Lancet"},{"issue":"4","key":"2021061318593944600_ocaa277-B8","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1001\/jama.284.4.483","article-title":"Is US health really the best in the world?","volume":"284","author":"Starfield","year":"2000","journal-title":"JAMA"},{"issue":"5288","key":"2021061318593944600_ocaa277-B9","doi-asserted-by":"crossref","first-page":"740","DOI":"10.1126\/science.274.5288.740","article-title":"Evidence-based health policy\u2013lessons from the Global Burden of Disease Study","volume":"274","author":"Murray","year":"1996","journal-title":"Science"},{"issue":"1","key":"2021061318593944600_ocaa277-B10","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1111\/1475-6773.13069","article-title":"Alive or dead: validity of the social security administration death master file after 2011","volume":"54","author":"Levin","year":"2019","journal-title":"Health Serv Res"},{"issue":"5","key":"2021061318593944600_ocaa277-B11","doi-asserted-by":"crossref","first-page":"e66116","DOI":"10.1371\/journal.pone.0066116","article-title":"Claims-based definition of death in Japanese claims database: validity and implications","volume":"8","author":"Ooba","year":"2013","journal-title":"PLoS One"},{"issue":"11","key":"2021061318593944600_ocaa277-B12","doi-asserted-by":"crossref","first-page":"831","DOI":"10.2471\/BLT.09.068809","article-title":"Availability and quality of cause-of-death data for estimating the global burden of injuries","volume":"88","author":"Bhalla","year":"2010","journal-title":"Bull World Health Organ"},{"key":"2021061318593944600_ocaa277-B13","doi-asserted-by":"crossref","first-page":"e2018062-0","DOI":"10.4178\/epih.e2018062","article-title":"Data resource profile: the National Health Insurance Research Database (NHIRD)","volume":"40","author":"Lin","year":"2018","journal-title":"Epidemiol Health"},{"issue":"8","key":"2021061318593944600_ocaa277-B14","doi-asserted-by":"crossref","first-page":"954","DOI":"10.1002\/pds.4233","article-title":"The national healthcare system claims databases in France, SNIIRAM and EGB: powerful tools for pharmacoepidemiology","volume":"26","author":"Bezin","year":"2017","journal-title":"Pharmacoepidemiol Drug Saf"},{"issue":"7","key":"2021061318593944600_ocaa277-B15","doi-asserted-by":"crossref","first-page":"778","DOI":"10.1002\/pds.4005","article-title":"Validating mortality in the German Pharmacoepidemiological Research Database (GePaRD) against a mortality registry","volume":"25","author":"Ohlmeier","year":"2016","journal-title":"Pharmacoepidemiol Drug Saf"},{"issue":"4","key":"2021061318593944600_ocaa277-B16","doi-asserted-by":"crossref","first-page":"515","DOI":"10.1007\/s40264-018-0754-z","article-title":"Diagnostic algorithms for cardiovascular death in administrative claims databases: a systematic review","volume":"42","author":"Singh","year":"2019","journal-title":"Drug Saf"},{"issue":"7","key":"2021061318593944600_ocaa277-B17","doi-asserted-by":"crossref","first-page":"e026834","DOI":"10.1136\/bmjopen-2018-026834","article-title":"Implementation of an algorithm for the identification of breast cancer deaths in German health insurance claims data: a validation study based on a record linkage with administrative mortality data","volume":"9","author":"Langner","year":"2019","journal-title":"BMJ Open"},{"issue":"6","key":"2021061318593944600_ocaa277-B18","doi-asserted-by":"crossref","first-page":"856","DOI":"10.1200\/JCO.2005.02.1790","article-title":"Identification in administrative databases of women dying of breast cancer","volume":"24","author":"Gagnon","year":"2006","journal-title":"J Clin Oncol"},{"issue":"3","key":"2021061318593944600_ocaa277-B19","doi-asserted-by":"crossref","first-page":"e0214365","DOI":"10.1371\/journal.pone.0214365","article-title":"Prediction of premature all-cause mortality: a prospective general population cohort study comparing machine-learning and standard epidemiological approaches","volume":"14","author":"Weng","year":"2019","journal-title":"PLoS One"},{"issue":"11","key":"2021061318593944600_ocaa277-B20","doi-asserted-by":"crossref","first-page":"1377","DOI":"10.1007\/s40264-019-00827-0","article-title":"Identifying the DEAD: development and validation of a patient-level model to predict death status in population-level claims data","volume":"42","author":"Reps","year":"2019","journal-title":"Drug Saf"},{"key":"2021061318593944600_ocaa277-B21","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1038\/s41746-018-0029-1","article-title":"Scalable and accurate deep learning with electronic health records","volume":"1","author":"Rajkomar","year":"2018","journal-title":"NPJ Digit Med"},{"issue":"3","key":"2021061318593944600_ocaa277-B22","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1111\/acem.12876","article-title":"Prediction of in-hospital mortality in emergency department patients with sepsis: a local big data\u2013driven machine learning approach","volume":"23","author":"Taylor","year":"2016","journal-title":"Acad Emerg Med"},{"issue":"4","key":"2021061318593944600_ocaa277-B23","doi-asserted-by":"crossref","first-page":"305","DOI":"10.1001\/jama.2019.20866","article-title":"Challenges to the reproducibility of machine learning models in health care","volume":"323","author":"Beam","year":"2020","journal-title":"JAMA"},{"issue":"8","key":"2021061318593944600_ocaa277-B24","doi-asserted-by":"crossref","first-page":"969","DOI":"10.1093\/jamia\/ocy032","article-title":"Design and implementation of a standardized framework to generate and evaluate patient-level prediction models using observational healthcare data","volume":"25","author":"Reps","year":"2018","journal-title":"J Am Med Inform Assoc"},{"key":"2021061318593944600_ocaa277-B25","first-page":"574","article-title":"Observational Health Data Sciences and Informatics (OHDSI): opportunities for observational researchers","volume":"216","author":"Hripcsak","year":"2015","journal-title":"Stud Health Technol Inform"},{"issue":"239","key":"2021061318593944600_ocaa277-B26","first-page":"2","article-title":"Docker: lightweight Linux containers for consistent development and deployment","volume":"2014","author":"Merkel","year":"2014","journal-title":"Linux J"},{"issue":"8","key":"2021061318593944600_ocaa277-B27","doi-asserted-by":"crossref","first-page":"922","DOI":"10.1001\/jamadermatol.2019.0629","article-title":"All-cause and cause-specific mortality risks associated with alopecia areata: a Korean nationwide population-based study","volume":"155","author":"Lee","year":"2019","journal-title":"JAMA Dermatol"},{"issue":"2","key":"2021061318593944600_ocaa277-B28","first-page":"e15","article-title":"Cohort Profile: The National Health Insurance Service-National Sample Cohort (NHIS-NSC), South Korea","volume":"46","author":"Lee","year":"2017","journal-title":"Int J Epidemiol"},{"key":"2021061318593944600_ocaa277-B29","first-page":"467","article-title":"Conversion of National Health Insurance Service-National Sample Cohort (NHIS-NSC) database into Observational Medical Outcomes Partnership-Common Data Model (OMOP-CDM)","volume":"245","author":"You","year":"2017","journal-title":"Stud Health Technol Inform"},{"key":"2021061318593944600_ocaa277-B30"},{"issue":"12","key":"2021061318593944600_ocaa277-B31","doi-asserted-by":"crossref","first-page":"1618","DOI":"10.1093\/jamia\/ocy124","article-title":"Effect of vocabulary mapping for conditions on phenotype cohorts","volume":"25","author":"Hripcsak","year":"2018","journal-title":"J Am Med Inform Assoc"},{"issue":"2","key":"2021061318593944600_ocaa277-B32","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1016\/S0893-6080(05)80023-1","article-title":"Stacked generalization","volume":"5","author":"Wolpert","year":"1992","journal-title":"Neural Netw"},{"key":"2021061318593944600_ocaa277-B33","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1016\/j.engappai.2015.04.003","article-title":"Multi-class classification via heterogeneous ensemble of one-class classifiers","volume":"43","author":"Kang","year":"2015","journal-title":"Eng Appl Artific Intell"},{"key":"2021061318593944600_ocaa277-B34","doi-asserted-by":"crossref","first-page":"644","DOI":"10.1016\/j.scitotenv.2018.04.040","article-title":"Development of a stacked ensemble model for forecasting and analyzing daily average PM2.5 concentrations in Beijing, China","volume":"635","author":"Zhai","year":"2018","journal-title":"Sci Total Environ"},{"key":"2021061318593944600_ocaa277-B35","doi-asserted-by":"crossref","first-page":"188","DOI":"10.1016\/j.asoc.2019.01.015","article-title":"Stacking-based ensemble learning of decision trees for interpretable prostate cancer detection","volume":"77","author":"Wang","year":"2019","journal-title":"Appl Soft Comput"},{"issue":"4","key":"2021061318593944600_ocaa277-B36","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1016\/j.ipm.2009.03.002","article-title":"A systematic analysis of performance measures for classification tasks","volume":"45","author":"Sokolova","year":"2009","journal-title":"Inf Process Manag"},{"issue":"2","key":"2021061318593944600_ocaa277-B37","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1023\/A:1010920819831","article-title":"A simple generalisation of the area under the ROC curve for multiple class classification problems","volume":"45","author":"Hand","year":"2001","journal-title":"Mach Learn"},{"issue":"3","key":"2021061318593944600_ocaa277-B38","doi-asserted-by":"crossref","first-page":"420","DOI":"10.1001\/jamainternmed.2019.6447","article-title":"Association of Ticagrelor vs Clopidogrel with major adverse coronary events in patients with acute coronary syndrome undergoing percutaneous coronary intervention","volume":"180","author":"Turgeon","year":"2020","journal-title":"JAMA Intern Med"},{"issue":"10","key":"2021061318593944600_ocaa277-B39","doi-asserted-by":"crossref","first-page":"969","DOI":"10.1001\/jama.2019.1347","article-title":"Association of tramadol with all-cause mortality among patients with osteoarthritis","volume":"321","author":"Zeng","year":"2019","journal-title":"JAMA"},{"issue":"10181","key":"2021061318593944600_ocaa277-B40","doi-asserted-by":"crossref","first-page":"1577","DOI":"10.1016\/S0140-6736(19)30037-6","article-title":"Reporting of artificial intelligence prediction models","volume":"393","author":"Collins","year":"2019","journal-title":"The Lancet"},{"issue":"1","key":"2021061318593944600_ocaa277-B41","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12916-014-0241-z","article-title":"Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement","volume":"13","author":"Collins","year":"2015","journal-title":"BMC Med"},{"issue":"7","key":"2021061318593944600_ocaa277-B42","doi-asserted-by":"crossref","first-page":"1229","DOI":"10.1038\/sj.bjc.6602102","article-title":"A note on competing risks in survival data analysis","volume":"91","author":"Satagopan","year":"2004","journal-title":"Br J Cancer"},{"key":"2021061318593944600_ocaa277-B43","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1016\/j.procs.2019.08.183","article-title":"Comparison of temporal and non-temporal features effect on machine learning models quality and interpretability for chronic heart failure patients","volume":"156","author":"Balabaeva","year":"2019","journal-title":"Procedia Comput Sci"},{"issue":"9","key":"2021061318593944600_ocaa277-B44","article-title":"Deaths: Final data for 2017","volume":"68","year":"2019","journal-title":"Natl Vital Stat Rep"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/28\/6\/1098\/38615460\/ocaa277.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/28\/6\/1098\/38615460\/ocaa277.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,6,13]],"date-time":"2021-06-13T19:01:26Z","timestamp":1623610886000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/28\/6\/1098\/5992328"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,11,19]]},"references-count":44,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2020,11,19]]},"published-print":{"date-parts":[[2021,6,12]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocaa277","relation":{},"ISSN":["1527-974X"],"issn-type":[{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,6,1]]},"published":{"date-parts":[[2020,11,19]]}}}