{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,4]],"date-time":"2026-04-04T07:52:25Z","timestamp":1775289145970,"version":"3.50.1"},"reference-count":24,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2025,2,19]],"date-time":"2025-02-19T00:00:00Z","timestamp":1739923200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>Generating high-quality synthetic data is essential for advancing machine learning applications in financial time series, where data scarcity and privacy concerns often pose significant challenges. This study proposes a novel hybrid architecture that combines variational autoencoders (VAEs) with Markov Chain Monte Carlo (MCMC) sampling to enhance the generation of robust synthetic sequential data. The model leverages Gated Recurrent Unit (GRU) layers for capturing long-term temporal dependencies and MCMC sampling for effective latent space exploration, ensuring high variability and accuracy. Experimental evaluations on datasets of Google, Tesla, and Nestl\u00e9 stock prices demonstrate the model\u2019s superior performance in preserving statistical and temporal patterns, as validated by quantitative metrics (discriminative and predictive scores), statistical tests (Kolmogorov\u2013Smirnov), and t-Distributed Stochastic Neighbour Embedding (t-SNE) visualisations. The experiments reveal the model\u2019s scalability, maintaining high fidelity even under augmented dataset sizes and missing data scenarios. These findings position the proposed framework as a computationally efficient and structurally simple alternative to Generative Adversarial Network (GAN)-based methods, suitable for real-world applications in data-driven financial modelling.<\/jats:p>","DOI":"10.3390\/fi17020095","type":"journal-article","created":{"date-parts":[[2025,2,19]],"date-time":"2025-02-19T08:36:46Z","timestamp":1739954206000},"page":"95","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Robust Synthetic Data Generation for Sequential Financial Models Using Hybrid Variational Autoencoder\u2013Markov Chain Monte Carlo Architectures"],"prefix":"10.3390","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-9666-3452","authenticated-orcid":false,"given":"Francesco","family":"Bruni Prenestino","sequence":"first","affiliation":[{"name":"Department of Mathematics and Physics, Catholic University of the Sacred Heart, 25121 Brescia, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1466-0248","authenticated-orcid":false,"given":"Enrico","family":"Barbierato","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Physics, Catholic University of the Sacred Heart, 25121 Brescia, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-8422-8024","authenticated-orcid":false,"given":"Alice","family":"Gatti","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Physics, Catholic University of the Sacred Heart, 25121 Brescia, Italy"}]}],"member":"1968","published-online":{"date-parts":[[2025,2,19]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"121160","DOI":"10.1016\/j.techfore.2021.121160","article-title":"Data science as knowledge creation a framework for synergies between data analysts and domain professionals","volume":"173","author":"Cunningham","year":"2021","journal-title":"Technol. Forecast. Soc. Change"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1089\/big.2013.1508","article-title":"Data science and its relationship to big data and data-driven decision making","volume":"1","author":"Provost","year":"2013","journal-title":"Big Data"},{"key":"ref_3","first-page":"20160114","article-title":"Compelling truth: Legal protection of the infosphere against big data spills","volume":"374","author":"Schafer","year":"2016","journal-title":"Philos. Trans. R. Soc. Math. Phys. Eng. Sci."},{"key":"ref_4","unstructured":"Rezzani, A. (2013). Big Data: Architettura, Tecnologie e Metodi per L\u2019utilizzo di Grandi Basi di Dati, Maggioli Editore."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Assefa, S.A., Dervovic, D., Mahfouz, M., Tillman, R.E., Reddy, P., and Veloso, M. (2020, January 15\u201316). Generating synthetic data in finance: Opportunities, challenges and pitfalls. Proceedings of the First ACM International Conference on AI in Finance, New York, NY, USA.","DOI":"10.1145\/3383455.3422554"},{"key":"ref_6","unstructured":"Lu, Y., Wang, H., and Wei, W. (2023). Machine Learning for Synthetic Data Generation: A Review. arXiv."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Lee, M. (2023). Recent advances in generative adversarial networks for gene expression data: A comprehensive review. Mathematics, 11.","DOI":"10.3390\/math11143055"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3501305","article-title":"Generation of realistic synthetic financial time-series","volume":"18","author":"Dogariu","year":"2022","journal-title":"ACM Trans. Multimed. Comput. Commun. Appl. (Tomm)"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Orlandi, F., Barbierato, E., and Gatti, A. (2024). Enhancing Financial Time Series Prediction with Quantum-Enhanced Synthetic Data Generation: A Case Study on the S&P 500 Using a Quantum Wasserstein Generative Adversarial Network Approach with a Gradient Penalty. Electronics, 13.","DOI":"10.3390\/electronics13112158"},{"key":"ref_10","unstructured":"Tamblyn, I., Yu, T., and Benlolo, I. (2023). fintech-kMC: Agent based simulations of financial platforms for design and testing of machine learning systems. arXiv."},{"key":"ref_11","unstructured":"Yoon, J., Jarrett, D., and Van der Schaar, M. (2019). Time-series generative adversarial networks. Adv. Neural Inf. Process. Syst., 32."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Pei, H., Ren, K., Yang, Y., Liu, C., Qin, T., and Li, D. (2021, January 7\u201310). Towards generating real-world time series data. Proceedings of the 2021 IEEE International Conference on Data Mining (ICDM), Auckland, New Zealand.","DOI":"10.1109\/ICDM51629.2021.00058"},{"key":"ref_13","unstructured":"Desai, A., Freeman, C., Wang, Z., and Beaver, I. (2021). Timevae: A variational auto-encoder for multivariate time series generation. arXiv."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Razghandi, M., Zhou, H., Erol-Kantarci, M., and Turgut, D. (2022, January 16\u201320). Variational autoencoder generative adversarial network for synthetic data generation in smart home. Proceedings of the ICC 2022-IEEE International Conference on Communications, Seoul, Republic of Korea.","DOI":"10.1109\/ICC45855.2022.9839249"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Lowd, D., and Domingos, P. (2005, January 7\u201311). Naive Bayes models for probability estimation. Proceedings of the 22nd International Conference on Machine Learning, Bonn, Germany.","DOI":"10.1145\/1102351.1102418"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Liu, S., McGree, J., Ge, Z., and Xie, Y. (2016). 2 - Classification methods. Computational and Statistical Methods for Analysing Big Data with Applications, Academic Press.","DOI":"10.1016\/B978-0-12-803732-4.00002-7"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Bouguila, N., Fan, W., and Amayri, M. (2022). Hidden Markov Models and Applications, Unsupervised and Semi-Supervised Learning, Springer International Publishing.","DOI":"10.1007\/978-3-030-99142-5"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"El-Amir, H., and Hamdy, M. (2019). Deep Learning Pipeline: Building a Deep Learning Model with TensorFlow, Apress.","DOI":"10.1007\/978-1-4842-5349-6"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Nosouhian, S., Nosouhian, F., and Khoshouei, A.K. (2025, January 04). A Review of Recurrent Neural Network Architecture for Sequence Learning: Comparison Between LSTM and GRU 2021. Available online: https:\/\/scholar.google.com\/scholar?hl=en&as_sdt=0%2C5&q=A+review+of+recurrent+neural+network+architecture+for+sequence+++learning++Comparison+between+LSTM+and+GRU&btnG=.","DOI":"10.20944\/preprints202107.0252.v1"},{"key":"ref_20","unstructured":"Bok, V., and Langr, J. (2019). GANs in Action: Deep learning with Generative Adversarial Networks, Manning."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Lin, Z., Jain, A., Wang, C., Fanti, G., and Sekar, V. (2020, January 27\u201329). Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions. Proceedings of the ACM Internet Measurement Conference, New York, NY, USA. IMC \u201920.","DOI":"10.1145\/3419394.3423643"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"106886","DOI":"10.1016\/j.compchemeng.2020.106886","article-title":"A review On reinforcement learning: Introduction and applications in industrial process control","volume":"139","author":"Nian","year":"2020","journal-title":"Comput. Chem. Eng."},{"key":"ref_23","unstructured":"Candanedo, L., Feldheim, V., and Deramaix, D. (2025, January 04). Appliances Energy Prediction. Available online: https:\/\/doi.org\/10.24432\/C5VC8G."},{"key":"ref_24","unstructured":"Vito, S.D., Massera, E., Piga, M., Martinotto, L., and Francia, G. (2025, January 04). Air Quality. Available online: https:\/\/doi.org\/10.24432\/C59K5F."}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/17\/2\/95\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T16:38:01Z","timestamp":1760027881000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/17\/2\/95"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,19]]},"references-count":24,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2025,2]]}},"alternative-id":["fi17020095"],"URL":"https:\/\/doi.org\/10.3390\/fi17020095","relation":{},"ISSN":["1999-5903"],"issn-type":[{"value":"1999-5903","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,19]]}}}