{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,9]],"date-time":"2026-05-09T16:45:49Z","timestamp":1778345149599,"version":"3.51.4"},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"9","license":[{"start":{"date-parts":[[2020,9,1]],"date-time":"2020-09-01T00:00:00Z","timestamp":1598918400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Objective<\/jats:title>\n                  <jats:p>Recent studies on electronic health records (EHRs) started to learn deep generative models and synthesize a huge amount of realistic records, in order to address significant privacy issues surrounding the EHR. However, most of them only focus on structured records about patients\u2019 independent visits, rather than on chronological clinical records. In this article, we aim to learn and synthesize realistic sequences of EHRs based on the generative autoencoder.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Materials and Methods<\/jats:title>\n                  <jats:p>We propose a dual adversarial autoencoder (DAAE), which learns set-valued sequences of medical entities, by combining a recurrent autoencoder with 2 generative adversarial networks (GANs). DAAE improves the mode coverage and quality of generated sequences by adversarially learning both the continuous latent distribution and the discrete data distribution. Using the MIMIC-III (Medical Information Mart for Intensive Care-III) and UT Physicians clinical databases, we evaluated the performances of DAAE in terms of predictive modeling, plausibility, and privacy preservation.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Our generated sequences of EHRs showed the comparable performances to real data for a predictive modeling task, and achieved the best score in plausibility evaluation conducted by medical experts among all baseline models. In addition, differentially private optimization of our model enables to generate synthetic sequences without increasing the privacy leakage of patients\u2019 data.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Conclusions<\/jats:title>\n                  <jats:p>DAAE can effectively synthesize sequential EHRs by addressing its main challenges: the synthetic records should be realistic enough not to be distinguished from the real records, and they should cover all the training patients to reproduce the performance of specific downstream tasks.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/jamia\/ocaa119","type":"journal-article","created":{"date-parts":[[2020,6,16]],"date-time":"2020-06-16T19:11:24Z","timestamp":1592334684000},"page":"1411-1419","source":"Crossref","is-referenced-by-count":53,"title":["Generating sequential electronic health records using dual adversarial autoencoder"],"prefix":"10.1093","volume":"27","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2173-3476","authenticated-orcid":false,"given":"Dongha","family":"Lee","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, Pohang University of Science and Technology, Pohang, South Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hwanjo","family":"Yu","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Pohang University of Science and Technology, Pohang, South Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaoqian","family":"Jiang","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Deevakar","family":"Rogith","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Meghana","family":"Gudala","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mubeen","family":"Tejani","sequence":"additional","affiliation":[{"name":"School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qiuchen","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Emory University, Atlanta, Georgia, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Li","family":"Xiong","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Emory University, Atlanta, Georgia, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2020,9,25]]},"reference":[{"key":"2020110613121043500_ocaa119-B1","doi-asserted-by":"crossref","first-page":"h1139","DOI":"10.1136\/bmj.h1139","article-title":"Anonymising and sharing individual patient data","volume":"350","author":"El Emam","year":"2015","journal-title":"BMJ"},{"issue":"12","key":"2020110613121043500_ocaa119-B2","doi-asserted-by":"crossref","first-page":"e28071","DOI":"10.1371\/journal.pone.0028071","article-title":"A systematic review of re-identification attacks on health data","volume":"6","author":"El Emam","year":"2011","journal-title":"PLoS One"},{"issue":"1","key":"2020110613121043500_ocaa119-B3","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1186\/1472-6947-13-114","article-title":"Evaluating the risk of patient re-identification from adverse drug event reports","volume":"13","author":"El Emam","year":"2013","journal-title":"BMC Med Inform Decis Mak"},{"issue":"1","key":"2020110613121043500_ocaa119-B4","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1186\/1472-6947-12-66","article-title":"Estimating the re-identification risk of clinical data sets","volume":"12","author":"Dankar","year":"2012","journal-title":"BMC Med Inform Decis Mak"},{"issue":"1","key":"2020110613121043500_ocaa119-B5","first-page":"6","article-title":"Assessing and minimizing re-identification risk in research data derived from health care records","volume":"7","author":"Simon","year":"2019","journal-title":"EGEMS (Wash DC)"},{"key":"2020110613121043500_ocaa119-B6","article-title":"Standards for privacy of individually identifiable health information","author":"Department of Health and Human Services","year":"2002","journal-title":"Federal Register"},{"key":"2020110613121043500_ocaa119-B7","first-page":"286","article-title":"Generating multi-label discrete patient records using generative adversarial networks","author":"Choi","year":"2017","journal-title":"Proc Machine Learn Healthcare"},{"issue":"7","key":"2020110613121043500_ocaa119-B8","doi-asserted-by":"crossref","first-page":"e005122","DOI":"10.1161\/CIRCOUTCOMES.118.005122","article-title":"Privacy-preserving generative deep neural networks support clinical data sharing","volume":"12","author":"Beaulieu-Jones","year":"2019","journal-title":"Circ Cardiovasc Qual Outcomes"},{"key":"2020110613121043500_ocaa119-B9","first-page":"417","author":"Nie","year":"2017"},{"issue":"3\u20134","key":"2020110613121043500_ocaa119-B10","doi-asserted-by":"crossref","first-page":"383","DOI":"10.1007\/s12021-018-9377-x","article-title":"Adversarial network with multi-scale l1 loss for medical image segmentation","volume":"16","author":"Xue","year":"2018","journal-title":"Neuroinformatics"},{"key":"2020110613121043500_ocaa119-B11","first-page":"66","author":"Spinks","year":"2018"},{"key":"2020110613121043500_ocaa119-B12","first-page":"2720","author":"Zhang","year":"2018"},{"issue":"3","key":"2020110613121043500_ocaa119-B13","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1093\/jamia\/ocy142","article-title":"Synthesizing electronic health records using improved generative adversarial networks","volume":"26","author":"Baowaly","year":"2019","journal-title":"J Am Med Inform Assoc"},{"key":"2020110613121043500_ocaa119-B14","first-page":"2672","author":"Goodfellow","year":"2014"},{"key":"2020110613121043500_ocaa119-B15","author":"Arjovsky","year":"2017"},{"key":"2020110613121043500_ocaa119-B16","author":"Kingma"},{"key":"2020110613121043500_ocaa119-B17","first-page":"2172","author":"Chen","year":"2016"},{"key":"2020110613121043500_ocaa119-B18","first-page":"1125","author":"Isola","year":"2017"},{"key":"2020110613121043500_ocaa119-B19","first-page":"2670","author":"Nguyen","year":"2017"},{"key":"2020110613121043500_ocaa119-B20","first-page":"2852","author":"Yu","year":"2017"},{"key":"2020110613121043500_ocaa119-B21","author":"Che","year":"2017"},{"key":"2020110613121043500_ocaa119-B22","first-page":"6682","author":"Li","year":"2019"},{"key":"2020110613121043500_ocaa119-B23","first-page":"10","author":"Bowman","year":"2016"},{"key":"2020110613121043500_ocaa119-B24","author":"Makhzani"},{"key":"2020110613121043500_ocaa119-B25","author":"Tolstikhin"},{"key":"2020110613121043500_ocaa119-B26","first-page":"5902","article-title":"Adversarially regularized autoencoders","volume":"80","author":"Zhao","year":"2018","journal-title":"Proc Mach Learn Res"},{"key":"2020110613121043500_ocaa119-B27","first-page":"7562","author":"Subramanian","year":"2018"},{"issue":"23","key":"2020110613121043500_ocaa119-B28","doi-asserted-by":"crossref","first-page":"e215","DOI":"10.1161\/01.CIR.101.23.e215","article-title":"Physiobank, physiotoolkit, and physionet: components of a new research resource for complex physiologic signals","volume":"101","author":"Goldberger","year":"2000","journal-title":"Circulation"},{"issue":"1","key":"2020110613121043500_ocaa119-B29","doi-asserted-by":"crossref","first-page":"160035","DOI":"10.1038\/sdata.2016.35","article-title":"Mimic-iii, a freely accessible critical care database","volume":"3","author":"Johnson","year":"2016","journal-title":"Sci Data"},{"key":"2020110613121043500_ocaa119-B30","first-page":"5769","author":"Gulrajani","year":"2017"},{"issue":"3\u20134","key":"2020110613121043500_ocaa119-B31","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1561\/0400000042","article-title":"The algorithmic foundations of differential privacy","volume":"9","author":"Dwork","year":"2013","journal-title":"Foundations Theor Comput Sci"},{"key":"2020110613121043500_ocaa119-B32","first-page":"308","author":"Abadi","year":"2016"},{"key":"2020110613121043500_ocaa119-B33","first-page":"1746","author":"Kim","year":"2014"},{"issue":"10","key":"2020110613121043500_ocaa119-B34","doi-asserted-by":"crossref","first-page":"1533","DOI":"10.1109\/TASLP.2014.2339736","article-title":"Convolutional neural networks for speech recognition","volume":"22","author":"Abdel-Hamid","year":"2014","journal-title":"IEEE\/ACM Trans Audio Speech Lang Process"},{"key":"2020110613121043500_ocaa119-B35","first-page":"233","author":"Kim","year":"2016"},{"key":"2020110613121043500_ocaa119-B36","first-page":"2007","author":"Chan","year":"2018"},{"key":"2020110613121043500_ocaa119-B37","first-page":"1558","article-title":"Autoencoding beyond pixels using a learned similarity metric","volume":"48","author":"Larsen","year":"2016","journal-title":"Proc Mach Learn Res"},{"issue":"1","key":"2020110613121043500_ocaa119-B38","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1093\/jamia\/ocz161","article-title":"Ensuring electronic medical record simulation through better training, modeling, and evaluation","volume":"27","author":"Zhang","year":"2020","journal-title":"J Am Med Inform Assoc"},{"key":"2020110613121043500_ocaa119-B39","first-page":"301","article-title":"Doctor AI: predicting clinical events via recurrent neural networks","volume":"56","author":"Choi","year":"2016","journal-title":"Proc Mach Learn Res"},{"key":"2020110613121043500_ocaa119-B40","first-page":"226","author":"Ester","year":"1996"},{"key":"2020110613121043500_ocaa119-B41","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Maaten","year":"2008","journal-title":"J Machine Learn Res"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/27\/9\/1411\/34153657\/ocaa119.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/27\/9\/1411\/34153657\/ocaa119.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,11,6]],"date-time":"2020-11-06T19:38:50Z","timestamp":1604691530000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/27\/9\/1411\/5912632"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9,1]]},"references-count":41,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2020,9,25]]},"published-print":{"date-parts":[[2020,9,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocaa119","relation":{},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,9]]},"published":{"date-parts":[[2020,9,1]]}}}