{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,12]],"date-time":"2025-11-12T13:46:53Z","timestamp":1762955213712,"version":"3.45.0"},"publisher-location":"New York, NY, USA","reference-count":47,"publisher":"ACM","funder":[{"DOI":"10.13039\/100000002","name":"NIH (National Institutes of Health)","doi-asserted-by":"publisher","award":["P20GM103446","P20GM113125","U54-GM104941"],"award-info":[{"award-number":["P20GM103446","P20GM113125","U54-GM104941"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,6,24]]},"DOI":"10.1145\/3721201.3721373","type":"proceedings-article","created":{"date-parts":[[2025,11,12]],"date-time":"2025-11-12T13:35:21Z","timestamp":1762954521000},"page":"234-244","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Fairness-Optimized Synthetic EHR Generation for Arbitrary Downstream Predictive Tasks"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0009-0403-6442","authenticated-orcid":false,"given":"Mirza Farhan","family":"Bin Tarek","sequence":"first","affiliation":[{"name":"University of Delaware, Newark, DE, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9893-5469","authenticated-orcid":false,"given":"Raphael","family":"Poulain","sequence":"additional","affiliation":[{"name":"University of Delaware, Newark, DE, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8912-3063","authenticated-orcid":false,"given":"Rahmatollah","family":"Beheshti","sequence":"additional","affiliation":[{"name":"University of Delaware, Newark, DE, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,11,12]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"NeurIPS Workshop on Synthetic Data for Empowering ML Research.","author":"Abroshan Mahed","year":"2022","unstructured":"Mahed Abroshan, Mohammad Mahdi Khalili, and Andrew Elliott. 2022. Counter-factual fairness in synthetic data generation. In NeurIPS Workshop on Synthetic Data for Empowering ML Research."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1093\/jamia\/ocy142"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3616865"},{"key":"e_1_3_2_1_4_1","volume-title":"Tingting Zhu, Andrew P Creagh, and David A Clifton.","author":"Ceritli Taha","year":"2023","unstructured":"Taha Ceritli, Ghadeer O Ghosheh, Vinod Kumar Chauhan, Tingting Zhu, Andrew P Creagh, and David A Clifton. 2023. Synthesizing mixed-type electronic health records using diffusion models. arXiv preprint arXiv:2302.14679 (2023)."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098126"},{"key":"e_1_3_2_1_6_1","volume-title":"Jimeng Sun, Joshua Kulas, Andy Schuetz, and Walter Stewart.","author":"Choi Edward","year":"2016","unstructured":"Edward Choi, Mohammad Taha Bahadori, Jimeng Sun, Joshua Kulas, Andy Schuetz, and Walter Stewart. 2016. Retain: An interpretable predictive model for healthcare using reverse time attention mechanism. Advances in neural information processing systems 29 (2016)."},{"key":"e_1_3_2_1_7_1","unstructured":"Edward Choi Siddharth Biswal Bradley Malin Jon Duke Walter F Stewart and Jimeng Sun. 2017. Generating multi-label discrete patient records using generative adversarial networks. In Machine learning for healthcare conference. PMLR 286\u2013305."},{"key":"e_1_3_2_1_8_1","volume-title":"The frontiers of fairness in machine learning. arXiv preprint arXiv:1810.08810","author":"Chouldechova Alexandra","year":"2018","unstructured":"Alexandra Chouldechova and Aaron Roth. 2018. The frontiers of fairness in machine learning. arXiv preprint arXiv:1810.08810 (2018)."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i01.5401"},{"key":"e_1_3_2_1_10_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3461702.3462523"},{"key":"e_1_3_2_1_12_1","volume-title":"The accuracy, fairness, and limits of predicting recidivism. Science advances 4, 1","author":"Dressel Julia","year":"2018","unstructured":"Julia Dressel and Hany Farid. 2018. The accuracy, fairness, and limits of predicting recidivism. Science advances 4, 1 (2018), eaao5580."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2090236.2090255"},{"key":"e_1_3_2_1_14_1","volume-title":"International Conference on Machine Learning. PMLR, 6944\u20136959","author":"Ganev Georgi","year":"2022","unstructured":"Georgi Ganev, Bristena Oprisanu, and Emiliano De Cristofaro. 2022. Robin hood and matthew effects: Differential privacy has disparate impact on synthetic data. In International Conference on Machine Learning. PMLR, 6944\u20136959."},{"key":"e_1_3_2_1_15_1","volume-title":"Leon Glass, Jeffrey M Hausdorff, Plamen Ch Ivanov, Roger G Mark, Joseph E Mietus, George B Moody, Chung-Kang Peng, and H Eugene Stanley.","author":"Goldberger Ary L","year":"2000","unstructured":"Ary L Goldberger, Luis AN Amaral, Leon Glass, Jeffrey M Hausdorff, Plamen Ch Ivanov, Roger G Mark, Joseph E Mietus, George B Moody, Chung-Kang Peng, and H Eugene Stanley. 2000. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. circulation 101, 23 (2000), e215\u2013e220."},{"key":"e_1_3_2_1_16_1","volume-title":"Generative adversarial nets. Advances in neural information processing systems 27","author":"Goodfellow Ian","year":"2014","unstructured":"Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. Advances in neural information processing systems 27 (2014)."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3506719"},{"key":"e_1_3_2_1_18_1","volume-title":"Greg Ver Steeg, and Aram Galstyan","author":"Harutyunyan Hrayr","year":"2019","unstructured":"Hrayr Harutyunyan, Hrant Khachatrian, David C Kale, Greg Ver Steeg, and Aram Galstyan. 2019. Multitask learning and benchmarking with clinical time series data. Scientific data 6, 1 (2019), 96."},{"key":"e_1_3_2_1_19_1","volume-title":"MedDiff: Generating electronic health records using accelerated denoising diffusion model. arXiv preprint arXiv:2302.04355","author":"He Huan","year":"2023","unstructured":"Huan He, Shifan Zhao, Yuanzhe Xi, and Joyce C Ho. 2023. MedDiff: Generating electronic health records using accelerated denoising diffusion model. arXiv preprint arXiv:2302.04355 (2023)."},{"key":"e_1_3_2_1_20_1","volume-title":"Leo Anthony Celi, and Roger G Mark","author":"Johnson Alistair E W","year":"2016","unstructured":"Alistair E W Johnson, Tom J Pollard, Lu Shen, Li-Wei H Lehman, Mengling Feng, Mohammad Ghassemi, Benjamin Moody, Peter Szolovits, Leo Anthony Celi, and Roger G Mark. 2016. MIMIC-III, a freely accessible critical care database. Scientific data 3, 1 (2016), 1\u20139."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41746-023-00834-7"},{"key":"e_1_3_2_1_22_1","volume-title":"A survey on bias and fairness in machine learning. ACM computing surveys (CSUR) 54, 6","author":"Mehrabi Ninareh","year":"2021","unstructured":"Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan. 2021. A survey on bias and fairness in machine learning. ACM computing surveys (CSUR) 54, 6 (2021), 1\u201335."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/JBHI.2023.3237753"},{"key":"e_1_3_2_1_24_1","volume-title":"Proc. conf. fairness accountability transp., new york, usa","volume":"1170","author":"Narayanan Arvind","year":"2018","unstructured":"Arvind Narayanan. 2018. Translation tutorial: 21 fairness definitions and their politics. In Proc. conf. fairness accountability transp., new york, usa, Vol. 1170. 3."},{"key":"e_1_3_2_1_25_1","volume-title":"Krishna S. Kalluri, Elise L. Minto, Jason Patterson, Linying Zhang, George Hripcsak, No\u00e9mie Elhadad, and Karthik Natarajan.","author":"Pang Chao","year":"2024","unstructured":"Chao Pang, Xinzhuo Jiang, Nishanth Parameshwar Pavinkurve, Krishna S. Kalluri, Elise L. Minto, Jason Patterson, Linying Zhang, George Hripcsak, No\u00e9mie Elhadad, and Karthik Natarajan. 2024. CEHR-GPT: Generating Electronic Health Records with Chronological Patient Timelines. arXiv:2402.04400 [cs.LG]"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1101\/2023.05.02.23289405"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3593013.3594102"},{"key":"e_1_3_2_1_28_1","volume-title":"Proceedings of the Machine Learning for Healthcare (MLHC-2022)","author":"Poulain Raphael","year":"2022","unstructured":"Raphael Poulain, Mehak Gupta, and Rahmatollah Beheshti. 2022. Few-Shot Learning with Semi-Supervised Transformers for Electronic Health Records. In Proceedings of the Machine Learning for Healthcare (MLHC-2022)."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","unstructured":"Raphael Poulain Mehak Gupta Randi Foraker and Rahmatollah Beheshti. 2021. Transformer-based Multi-target Regression on Electronic Health Records for Primordial Prevention of Cardiovascular Disease. In 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). 726\u2013731. 10.1109\/BIBM52615.2021.9669441","DOI":"10.1109\/BIBM52615.2021.9669441"},{"key":"e_1_3_2_1_30_1","unstructured":"Alec Radford Jeffrey Wu Rewon Child David Luan Dario Amodei Ilya Sutskever et al. 2019. Language models are unsupervised multitask learners. OpenAI blog 1 8 (2019) 9."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3307339.3342177"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1613\/jair.1.13200"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-59137-3_4"},{"key":"e_1_3_2_1_34_1","volume-title":"Pre-training of graph augmented transformers for medication recommendation. arXiv preprint arXiv:1906.00346","author":"Shang Junyuan","year":"2019","unstructured":"Junyuan Shang, Tengfei Ma, Cao Xiao, and Jimeng Sun. 2019. Pre-training of graph augmented transformers for medication recommendation. arXiv preprint arXiv:1906.00346 (2019)."},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33011126"},{"key":"e_1_3_2_1_36_1","volume-title":"Poly 2021 and DMAH 2021, Virtual Event","author":"Sun Siao","year":"2021","unstructured":"Siao Sun, Fusheng Wang, Sina Rashidian, Tahsin Kurc, Kayley Abell-Hart, Janos Hajagos, Wei Zhu, Mary Saltz, and Joel Saltz. 2021. Generating longitudinal synthetic ehr data with recurrent autoencoders and generative adversarial networks. In Heterogeneous Data Management, Polystores, and Analytics for Healthcare: VLDB Workshops, Poly 2021 and DMAH 2021, Virtual Event, August 20, 2021, Revised Selected Papers 7. Springer, 153\u2013165."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41467-023-41093-0"},{"key":"e_1_3_2_1_38_1","volume-title":"CorGAN: correlation-capturing convolutional generative adversarial networks for generating synthetic healthcare records. arXiv preprint arXiv:2001.09346","author":"Torfi Amirsina","year":"2020","unstructured":"Amirsina Torfi and Edward A Fox. 2020. CorGAN: correlation-capturing convolutional generative adversarial networks for generating synthetic healthcare records. arXiv preprint arXiv:2001.09346 (2020)."},{"key":"e_1_3_2_1_39_1","volume-title":"Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971","author":"Touvron Hugo","year":"2023","unstructured":"Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timoth\u00e9e Lacroix, Baptiste Rozi\u00e8re, Naman Goyal, Eric Hambro, Faisal Azhar, et al. 2023. Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)."},{"key":"e_1_3_2_1_40_1","volume-title":"Attention is all you need. Advances in neural information processing systems 30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in neural information processing systems 30 (2017)."},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219961"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData.2018.8622525"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData47090.2019.9006322"},{"key":"e_1_3_2_1_44_1","volume-title":"AMIA annual symposium proceedings","author":"Yan Chao","unstructured":"Chao Yan, Ziqi Zhang, Steve Nyemba, and Bradley A Malin. 2020. Generating electronic health records with multiple data types and constraints. In AMIA annual symposium proceedings, Vol. 2020. American Medical Informatics Association, 1335."},{"key":"e_1_3_2_1_45_1","volume-title":"a paediatric-specific intensive care database. Scientific data 7, 1","author":"Zeng Xian","year":"2020","unstructured":"Xian Zeng, Gang Yu, Yang Lu, Linhua Tan, Xiujing Wu, Shanshan Shi, Huilong Duan, Qiang Shu, and Haomin Li. 2020. PIC, a paediatric-specific intensive care database. Scientific data 7, 1 (2020), 14."},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1093\/jamia\/ocaa262"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1093\/jamia\/ocz161"}],"event":{"name":"CHASE '25: ACM\/IEEE International Conference on Connected Health: Applications, Systems and Engineering Technologies","location":"Yeshiva University Museum New York NY USA","acronym":"CHASE '25","sponsor":["SIGBED ACM Special Interest Group on Embedded Systems","IEEE Computer Society"]},"container-title":["Proceedings of the ACM\/IEEE International Conference on Connected Health: Applications, Systems and Engineering Technologies"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3721201.3721373","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,12]],"date-time":"2025-11-12T13:38:01Z","timestamp":1762954681000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3721201.3721373"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,24]]},"references-count":47,"alternative-id":["10.1145\/3721201.3721373","10.1145\/3721201"],"URL":"https:\/\/doi.org\/10.1145\/3721201.3721373","relation":{},"subject":[],"published":{"date-parts":[[2025,6,24]]},"assertion":[{"value":"2025-11-12","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}