{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T15:42:48Z","timestamp":1774539768957,"version":"3.50.1"},"reference-count":27,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2023,2,16]],"date-time":"2023-02-16T00:00:00Z","timestamp":1676505600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Natural Sciences and Engineering Research Council of Canada (NSERC)","award":["RGPIN-2017-06627"],"award-info":[{"award-number":["RGPIN-2017-06627"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>Synthetic data, artificially generated by computer programs, has become more widely used in the financial domain to mitigate privacy concerns. Variational Autoencoder (VAE) is one of the most popular deep-learning models for generating synthetic data. However, VAE is often considered a \u201cblack box\u201d due to its opaqueness. Although some studies have been conducted to provide explanatory insights into VAE, research focusing on explaining how the input data could influence VAE to create synthetic data, especially for tabular data, is still lacking. However, in the financial industry, most data are stored in a tabular format. This paper proposes a sensitivity-based method to assess the impact of inputted tabular data on how VAE synthesizes data. This sensitivity-based method can provide both global and local interpretations efficiently and intuitively. To test this method, a simulated dataset and three Kaggle banking tabular datasets were employed. The results confirmed the applicability of this proposed method.<\/jats:p>","DOI":"10.3390\/a16020121","type":"journal-article","created":{"date-parts":[[2023,2,16]],"date-time":"2023-02-16T03:32:30Z","timestamp":1676518350000},"page":"121","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["Interpretation for Variational Autoencoder Used to Generate Financial Synthetic Tabular Data"],"prefix":"10.3390","volume":"16","author":[{"given":"Jinhong","family":"Wu","sequence":"first","affiliation":[{"name":"Department of Chemical Engineering & Applied Chemistry, University of Toronto, Toronto, ON M5S 3E5, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3647-5473","authenticated-orcid":false,"given":"Konstantinos","family":"Plataniotis","sequence":"additional","affiliation":[{"name":"Department of Electrical & Computer Engineering, University of Toronto, Toronto, ON M5S 3G4, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lucy","family":"Liu","sequence":"additional","affiliation":[{"name":"Royal Bank of Canada, Toronto, ON M5J 0B6, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4400-6717","authenticated-orcid":false,"given":"Ehsan","family":"Amjadian","sequence":"additional","affiliation":[{"name":"Royal Bank of Canada, Toronto, ON M5J 0B6, Canada"},{"name":"David R. Cheriton School of Computer Science, University of Waterloo, Toronto, ON N2L 3G1, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuri","family":"Lawryshyn","sequence":"additional","affiliation":[{"name":"Department of Chemical Engineering & Applied Chemistry, University of Toronto, Toronto, ON M5S 3E5, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,2,16]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Alabdullah, B., Beloff, N., and White, M. (2018, January 25\u201326). Rise of Big Data\u2013Issues and Challenges. Proceedings of the 2018 21st Saudi Computer Society National Computer Conference (NCC), Riyadh, Saudi Arabia.","DOI":"10.1109\/NCG.2018.8593166"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Assefa, S.A., Dervovic, D., Mahfouz, M., Tillman, R.E., Reddy, P., and Veloso, M. (2020, January 15\u201316). Generating synthetic data in finance: Opportunities, challenges and pitfalls. Proceedings of the First ACM International Conference on AI in Finance, New York, NY, USA.","DOI":"10.1145\/3383455.3422554"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1038\/s41746-020-00353-9","article-title":"Generating high-fidelity synthetic patient data for assessing machine learning healthcare software","volume":"3","author":"Tucker","year":"2020","journal-title":"NPJ Digit. Med."},{"key":"ref_4","unstructured":"Joseph, A. (2022, March 26). We need Synthetic Data. Available online: https:\/\/towardsdatascience.com\/we-need-synthetic-data-e6f90a8532a4."},{"key":"ref_5","unstructured":"Christoph, M. (2022, March 26). How do You Generate Synthetic Data?. Available online: https:\/\/www.statice.ai\/post\/how-generate-synthetic-data."},{"key":"ref_6","unstructured":"Mi, L., Shen, M., and Zhang, J. (2018). A Probe Towards Understanding GAN and VAE Models. arXiv."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Singh, A., and Ogunfunmi, T. (2022). An Overview of Variational Autoencoders for Source Separation, Finance, and Bio-Signal Applications. Entropy, 24.","DOI":"10.3390\/e24010055"},{"key":"ref_8","unstructured":"van Bree, M. (2020). Unlocking the Potential of Synthetic Tabular Data Generation with Variational Autoencoders. [Master\u2019s Thesis, Tilburg University]."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Shankaranarayana, S.M., and Runje, D. (2019). ALIME: Autoencoder Based Approach for Local. arXiv.","DOI":"10.1007\/978-3-030-33607-3_49"},{"key":"ref_10","first-page":"563","article-title":"Explainable AI: A Brief Survey on History, Research Areas, Approaches and Challenges","volume":"Volume 11839","author":"Xu","year":"2019","journal-title":"Natural Language Processing and Chinese Computing, Proceedings of the 8th CCF International Conference, NLPCC, Dunhuang, China, 9\u201314 October 2019"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Bengio, Y., Courville, A., and Vincent, P. (2013). Representation Learning: A Review and New Perspectives. arXiv.","DOI":"10.1109\/TPAMI.2013.50"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"2225","DOI":"10.1016\/j.neucom.2010.01.011","article-title":"First and second order sensitivity analysis of MLP","volume":"73","author":"Yeh","year":"2010","journal-title":"Neurocomputing"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Shah, C., Du, Q., and Xu, Y. (2022). Enhanced TabNet: Attentive Interpretable Tabular Learning for Hyperspectral Image Classification. Remote Sens., 14.","DOI":"10.3390\/rs14030716"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Arik, S.\u00d6., and Pfister, T. (2020). TabNet: Attentive Interpretable Tabular Learning. arXiv.","DOI":"10.1609\/aaai.v35i8.16826"},{"key":"ref_15","unstructured":"Kingma, D.P., and Welling, M. (2014). Auto-Encoding Variational Bayes. arXiv."},{"key":"ref_16","unstructured":"Spinner, T., K\u00f6rner, J., G\u00f6rtler, J., and Deussen, O. (2018, January 22). Towards an Interpretable Latent Space. Proceedings of the Workshop on Visualization for AI Explainability, Berlin, Germany."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"5684","DOI":"10.1038\/s41467-021-26017-0","article-title":"VEGA is an interpretable generative model for inferring biological network activity in single-cell transcriptomics","volume":"12","author":"Seninge","year":"2021","journal-title":"Nat. Commun."},{"key":"ref_18","unstructured":"Fortuin, V., H\u00fcser, M., Locatello, F., Strathmann, H., and R\u00e4tsch, G. (2019). Som-vae: Interpretable discrete representation learning on time series. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Pizarroso, J., Pizarroso, J., and Mu\u00f1oz, A. (2021). NeuralSens: Sensitivity Analysis of Neural Networks. arXiv.","DOI":"10.18637\/jss.v102.i07"},{"key":"ref_20","unstructured":"Mison, V., Xiong, T., Giesecke, K., and Mangu, L. (2018). Sensitivity based Neural Networks Explanations. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"272","DOI":"10.1007\/s42452-021-04148-9","article-title":"Comparison of feature importance measures as explanations for classification models","volume":"3","author":"Saarela","year":"2021","journal-title":"SN Appl. Sci."},{"key":"ref_22","unstructured":"Terence, S. (2022, March 26). Understanding Feature Importance and How to Implement it in Python. Available online: https:\/\/towardsdatascience.com\/understanding-feature-importance-and-how-to-implement-it-in-python-ff0287b20285."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1561\/2200000056","article-title":"An Introduction to Variational Autoencoders","volume":"12","author":"Kingma","year":"2019","journal-title":"Found. Trends R Mach. Learn."},{"key":"ref_24","unstructured":"Zurada, J.M., Malinowski, A., and Cloete, I. (June, January 30). Sensitivity analysis for minimization of input data dimension for feedforward neural network. Proceedings of the IEEE International Symposium on Circuits and Systems-ISCAS\u201994, London, UK."},{"key":"ref_25","unstructured":"Chandran, S. (2022, March 26). Significance of I.I.D in Machine Learning. Available online: https:\/\/medium.datadriveninvestor.com\/significance-of-i-i-d-in-machine-learning-281da0d0cbef."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"137444","DOI":"10.1109\/ACCESS.2021.3116664","article-title":"Explainable student agency analytics","volume":"9","author":"Saarela","year":"2021","journal-title":"IEEE Access"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Linardatos, P., Papastefanopoulos, V., and Kotsiantis, S. (2021). Explainable AI: A Review of Machine Learning Interpretability Methods. Entropy, 23.","DOI":"10.3390\/e23010018"}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/16\/2\/121\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:37:35Z","timestamp":1760121455000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/16\/2\/121"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,16]]},"references-count":27,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2023,2]]}},"alternative-id":["a16020121"],"URL":"https:\/\/doi.org\/10.3390\/a16020121","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,16]]}}}