{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T18:14:41Z","timestamp":1780596881042,"version":"3.54.1"},"reference-count":50,"publisher":"Association for Computing Machinery (ACM)","issue":"9","license":[{"start":{"date-parts":[[2024,11,12]],"date-time":"2024-11-12T00:00:00Z","timestamp":1731369600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"U.S. National Institutes of Health\/National Library of Medicine","award":["R01LM011834"],"award-info":[{"award-number":["R01LM011834"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2024,11,30]]},"abstract":"<jats:p>\n            Process data constructed from event logs provides valuable insights into procedural dynamics over time. The confidential information in process data, together with the data\u2019s intricate nature, makes the datasets not sharable and challenging to collect. Consequently, research is limited using process data and analytics in the process mining domain. In this study, we introduced a synthetic process data generation task to address the limitation of sharable process data. We introduced a generative adversarial network, called ProcessGAN, to generate process data with activity sequences and corresponding timestamps. ProcessGAN consists of a transformer-based network as the generator, and a time-aware self-attention network as the discriminator. It can generate privacy-preserving process data from random noise. ProcessGAN considers the duration of the process and time intervals between activities to generate realistic activity sequences with timestamps. We evaluated ProcessGAN on five real-world datasets, two that are public and three collected in medical domains that are private. To evaluate the synthetic data, in addition to statistical metrics, we trained a supervised model to score the synthetic processes. We also used process mining to discover workflows for synthetic medical processes and had domain experts evaluate the clinical applicability of the synthetic workflows. ProcessGAN outperformed the existing generative models in generating complex processes with valid parallel pathways. The synthetic process data generated by ProcessGAN better represented the long-range dependencies between activities, a feature relevant to complicated medical and other processes. The timestamps generated by the ProcessGAN model showed similar distributions with the authentic timestamps. In addition, we trained a transformer-based network to generate synthetic contexts (e.g., patient demographics) that were associated with the synthetic processes. The synthetic contexts generated by our model outperformed the baseline models, with the distributions similar to the authentic contexts. We conclude that ProcessGAN can generate sharable synthetic process data indistinguishable from authentic data. Our source code is available in\n            <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/raaachli\/ProcessGAN\">https:\/\/github.com\/raaachli\/ProcessGAN<\/jats:ext-link>\n            .\n          <\/jats:p>","DOI":"10.1145\/3687464","type":"journal-article","created":{"date-parts":[[2024,8,28]],"date-time":"2024-08-28T18:22:55Z","timestamp":1724869375000},"page":"1-31","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["ProcessGAN: Generating Privacy-Preserving Time-Aware Process Data with Conditional Generative Adversarial Nets"],"prefix":"10.1145","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8493-2187","authenticated-orcid":false,"given":"Keyi","family":"Li","sequence":"first","affiliation":[{"name":"Electrical and Computer Engineering Department, Rutgers University, New Brunswick, New Jersey, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0407-6398","authenticated-orcid":false,"given":"Sen","family":"Yang","sequence":"additional","affiliation":[{"name":"Waymo, Mountain View, CA, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4399-7037","authenticated-orcid":false,"given":"Travis M.","family":"Sullivan","sequence":"additional","affiliation":[{"name":"Children\u2019s National Hospital, Washington, DC, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4465-9117","authenticated-orcid":false,"given":"Randall S.","family":"Burd","sequence":"additional","affiliation":[{"name":"Children\u2019s National Hospital, Washington, DC, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1033-6865","authenticated-orcid":false,"given":"Ivan","family":"Marsic","sequence":"additional","affiliation":[{"name":"Electrical and Computer Engineering Department, Rutgers University, New Brunswick, NJ, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2024,11,12]]},"reference":[{"key":"e_1_3_1_2_2","first-page":"28","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","author":"Bengio Samy","year":"2015","unstructured":"Samy Bengio, Oriol Vinyals, Navdeep Jaitly, and Noam Shazeer. 2015. Scheduled sampling for sequence prediction with recurrent neural networks. In Proceedings of the Advances in Neural Information Processing Systems, Vol. 28."},{"key":"e_1_3_1_3_2","volume-title":"Proceedings of the AAAI Spring Symposium on Designing AI for Open Worlds","author":"Briscoe Jarren","year":"2022","unstructured":"Jarren Briscoe, Assefaw Gebremedhin, Lawrence B. Holder, and Diane J. Cook. 2022. Adversarial creation of a smart home testbed for novelty detection. In Proceedings of the AAAI Spring Symposium on Designing AI for Open Worlds."},{"key":"e_1_3_1_4_2","unstructured":"Zaharah A. Bukhsh Aaqib Saeed and Remco M. Dijkman. 2021. Processtransformer: Predictive business process monitoring with transformer network. arXiv:2104.00721. Retrieved from https:\/\/arxiv.org\/pdf\/2104.00721"},{"issue":"1","key":"e_1_3_1_5_2","first-page":"219","article-title":"Process mining in the education domain","volume":"8","author":"Cairns Awatef Hicheur","year":"2015","unstructured":"Awatef Hicheur Cairns, Billel Gueni, Mehdi Fhima, Andrew Cairns, St\u00e9phane David, and Nasser Khelifa. 2015. Process mining in the education domain. International Journal on Advances in Intelligent Systems 8, 1 (2015), 219\u2013232.","journal-title":"International Journal on Advances in Intelligent Systems"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDMW.2017.63"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.3390\/s19051181"},{"key":"e_1_3_1_8_2","first-page":"2","volume-title":"Proceedings of naacL-HLT","volume":"1","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of naacL-HLT, Vol. 1, 2."},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2021.114582"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/3459637.3482242"},{"key":"e_1_3_1_11_2","first-page":"27","article-title":"Generative adversarial nets","author":"Goodfellow Ian","year":"2014","unstructured":"Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Proceedings of the Advances in Neural Information Processing Systems, Vol. 27.","journal-title":"Proceedings of the Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_12_2","unstructured":"Roberto Gozalo-Brizuela and Eduardo C. Garrido-Merchan. 2023. ChatGPT is not all you need. A state of the art review of large generative AI models. arXiv:2301.04655. Retrieved from https:\/\/arxiv.org\/pdf\/2301.04655"},{"key":"e_1_3_1_13_2","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","volume":"30","author":"Gulrajani Ishaan","year":"2017","unstructured":"Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron C. Courville. 2017. Improved training of wasserstein gans. In Proceedings of the Advances in Neural Information Processing Systems, Vol. 30."},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11957"},{"key":"e_1_3_1_15_2","unstructured":"Eric Jang Shixiang Gu and Ben Poole. 2016. Categorical reparameterization with gumbel-softmax. arXiv:1611.01144."},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00813"},{"key":"e_1_3_1_17_2","first-page":"7482","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Kendall Alex","year":"2018","unstructured":"Alex Kendall, Yarin Gal, and Roberto Cipolla. 2018. Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 7482\u20137491."},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ssaho.2023.100648"},{"key":"e_1_3_1_19_2","unstructured":"Jiwei Li Will Monroe Tianlin Shi S\u00e9bastien Jean Alan Ritter and Dan Jurafsky. 2017. Adversarial learning for neural dialogue generation. arXiv:1701.06547. Retrieved from https:\/\/arxiv.org\/pdf\/1701.06547"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/3336191.3371786"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2023.104344"},{"key":"e_1_3_1_22_2","unstructured":"Keyi Li Sen Yang Travis M. Sullivan Randall S. Burd and Ivan Marsic. 2022. Exploring runtime decision support for trauma resuscitation. arXiv:2207.02922."},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-09342-5_13"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2023.3310909"},{"key":"e_1_3_1_25_2","first-page":"72","volume-title":"Proceedings of the RADAR+ EMISA 2017","author":"Mannhardt Felix","year":"2017","unstructured":"Felix Mannhardt and Daan Blinde. 2017. Analyzing the trajectories of patients with sepsis using process mining. In Proceedings of the RADAR+ EMISA 2017. CEUR-ws.org, 72\u201380."},{"key":"e_1_3_1_26_2","first-page":"26","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","author":"Mikolov Tomas","year":"2013","unstructured":"Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Proceedings of the Advances in Neural Information Processing Systems, Vol. 26."},{"key":"e_1_3_1_27_2","unstructured":"Mehdi Mirza and Simon Osindero. 2014. Conditional generative adversarial nets. arXiv:1411.1784. Retrieved from https:\/\/arxiv.org\/abs\/1411.1784"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2022.103994"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-021-09960-8"},{"key":"e_1_3_1_30_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Nie Weili","year":"2018","unstructured":"Weili Nie, Nina Narodytska, and Ankit Patel. 2018. RELGAN: Relational generative adversarial networks for text generation. In Proceedings of the International Conference on Learning Representations."},{"issue":"8","key":"e_1_3_1_31_2","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford Alec","year":"2019","unstructured":"Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. OpenAI Blog 1, 8 (2019), 9.","journal-title":"OpenAI Blog"},{"key":"e_1_3_1_32_2","unstructured":"Marc\u2019Aurelio Ranzato Sumit Chopra Michael Auli and Wojciech Zaremba. 2015. Sequence level training with recurrent neural networks. arXiv:1511.06732. Retrieved from https:\/\/arxiv.org\/abs\/1511.06732"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2016.04.007"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1145\/2240156.2240161"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.artmed.2023.102507"},{"key":"e_1_3_1_36_2","doi-asserted-by":"crossref","unstructured":"Peter Shaw Jakob Uszkoreit and Ashish Vaswani. 2018. Self-attention with relative position representations. arXiv:1803.02155. Retrieved from https:\/\/arxiv.org\/abs\/1803.02155","DOI":"10.18653\/v1\/N18-2074"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1186\/s40537-019-0197-0"},{"issue":"2000","key":"e_1_3_1_38_2","first-page":"1","article-title":"Simple demographics often identify people uniquely","volume":"671","author":"Sweeney Latanya","year":"2000","unstructured":"Latanya Sweeney. 2000. Simple demographics often identify people uniquely. Health (San Francisco) 671, 2000 (2000), 1\u201334.","journal-title":"Health (San Francisco)"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58666-9_14"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2004.47"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2006.05.003"},{"key":"e_1_3_1_42_2","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","volume":"30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Vol. 30."},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/CBMS.2019.00036"},{"key":"e_1_3_1_44_2","first-page":"283","volume-title":"Proceedings of the Belgium-Netherlands Conference on Artificial Intelligence","author":"Weijters A. J. M. M.","year":"2001","unstructured":"A. J. M. M. Weijters and Wil M. P. van der Aalst. 2001. Process mining: Discovering workflow models from event-based data. In Proceedings of the Belgium-Netherlands Conference on Artificial Intelligence, 283\u2013290."},{"key":"e_1_3_1_45_2","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","volume":"32","author":"Xu Lei","year":"2019","unstructured":"Lei Xu, Maria Skoularidou, Alfredo Cuesta-Infante, and Kalyan Veeramachaneni. 2019. Modeling tabular data using conditional gan. In Proceedings of the Advances in Neural Information Processing Systems, Vol. 32."},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICHI.2017.67"},{"key":"e_1_3_1_47_2","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","volume":"32","author":"Yoon Jinsung","year":"2019","unstructured":"Jinsung Yoon, Daniel Jarrett, and Mihaela Van der Schaar. 2019. Time-series generative adversarial networks. In Proceedings of the Advances in Neural Information Processing Systems, Vol. 32."},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v31i1.10804"},{"key":"e_1_3_1_49_2","doi-asserted-by":"publisher","DOI":"10.1145\/3424116"},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.techfore.2021.121021"},{"key":"e_1_3_1_51_2","first-page":"21","volume-title":"Proceedings of the NIPS workshop on Adversarial Training","volume":"21","author":"Zhang Yizhe","year":"2016","unstructured":"Yizhe Zhang, Zhe Gan, and Lawrence Carin. 2016. Generating text via adversarial training. In Proceedings of the NIPS workshop on Adversarial Training, Vol. 21. academia. edu, 21\u201332."}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3687464","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3687464","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:05:59Z","timestamp":1750291559000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3687464"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,12]]},"references-count":50,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2024,11,30]]}},"alternative-id":["10.1145\/3687464"],"URL":"https:\/\/doi.org\/10.1145\/3687464","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,11,12]]},"assertion":[{"value":"2022-10-09","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-07-26","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-11-12","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}