{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T18:20:54Z","timestamp":1775067654782,"version":"3.50.1"},"reference-count":44,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2024,1,24]],"date-time":"2024-01-24T00:00:00Z","timestamp":1706054400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Generative Adversarial Networks (GANs) for 3D volume generation and reconstruction, such as shape generation, visualization, automated design, real-time simulation, and research applications, are receiving increased amounts of attention in various fields. However, challenges such as limited training data, high computational costs, and mode collapse issues persist. We propose combining a Variational Autoencoder (VAE) and a GAN to uncover enhanced 3D structures and introduce a stable and scalable progressive growth approach for generating and reconstructing intricate voxel-based 3D shapes. The cascade-structured network involves a generator and discriminator, starting with small voxel sizes and incrementally adding layers, while subsequently supervising the discriminator with ground-truth labels in each newly added layer to model a broader voxel space. Our method enhances the convergence speed and improves the quality of the generated 3D models through stable growth, thereby facilitating an accurate representation of intricate voxel-level details. Through comparative experiments with existing methods, we demonstrate the effectiveness of our approach in evaluating voxel quality, variations, and diversity. The generated models exhibit improved accuracy in 3D evaluation metrics and visual quality, making them valuable across various fields, including virtual reality, the metaverse, and gaming.<\/jats:p>","DOI":"10.3390\/s24030751","type":"journal-article","created":{"date-parts":[[2024,1,24]],"date-time":"2024-01-24T09:57:42Z","timestamp":1706090262000},"page":"751","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["A Variational Autoencoder Cascade Generative Adversarial Network for Scalable 3D Object Generation and Reconstruction"],"prefix":"10.3390","volume":"24","author":[{"given":"Min-Su","family":"Yu","sequence":"first","affiliation":[{"name":"Department of Smart Convergence, Kwangwoon University, Seoul 01897, Republic of Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9732-0881","authenticated-orcid":false,"given":"Tae-Won","family":"Jung","sequence":"additional","affiliation":[{"name":"Department of Immersive Content Convergence, Kwangwoon University, Seoul 01897, Republic of Korea"}]},{"given":"Dai-Yeol","family":"Yun","sequence":"additional","affiliation":[{"name":"Institute of Information Technology, Kwangwoon University, Seoul 01897, Republic of Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3403-540X","authenticated-orcid":false,"given":"Chi-Gon","family":"Hwang","sequence":"additional","affiliation":[{"name":"Institute of Information Technology, Kwangwoon University, Seoul 01897, Republic of Korea"}]},{"given":"Sea-Young","family":"Park","sequence":"additional","affiliation":[{"name":"Department of Immersive Content Convergence, Kwangwoon University, Seoul 01897, Republic of Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6595-6415","authenticated-orcid":false,"given":"Soon-Chul","family":"Kwon","sequence":"additional","affiliation":[{"name":"Department of Smart Convergence, Kwangwoon University, Seoul 01897, Republic of Korea"}]},{"given":"Kye-Dong","family":"Jung","sequence":"additional","affiliation":[{"name":"Ingenium College of Liberal Arts, Kwangwoon University, Seoul 01897, Republic of Korea"}]}],"member":"1968","published-online":{"date-parts":[[2024,1,24]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1109\/MMUL.2012.24","article-title":"Microsoft kinect sensor and its effect","volume":"19","author":"Zhang","year":"2012","journal-title":"IEEE MultiMedia"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Dai, A., Chang, A.X., Savva, M., Halber, M., Funkhouser, T., and Nie\u00dfner, M. (2017, January 21\u201326). Scannet: Richly-annotated 3d reconstructions of indoor scenes. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.261"},{"key":"ref_3","first-page":"2012","article-title":"Unsupervised feature learning and deep learning: A review and new perspectives","volume":"1","author":"Bengio","year":"2012","journal-title":"CoRR"},{"key":"ref_4","unstructured":"Kingma, D.P., and Welling, M. (2014). Auto-encoding variational bayes. arXiv."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1145\/3422622","article-title":"Generative Adversarial Networks","volume":"63","author":"Goodfellow","year":"2014","journal-title":"Commun. ACM"},{"key":"ref_6","unstructured":"Van den Oord, A., Kalchbrenner, N., and Kavukcuoglu, K. (2016). Pixel Recurrent Neural Networks, ICML."},{"key":"ref_7","first-page":"4797","article-title":"Conditional image generation with PixelCNN decoders","volume":"29","author":"Kalchbrenner","year":"2016","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_8","first-page":"4743","article-title":"Improved variational inference with inverse autoregressive flow","volume":"29","author":"Kingma","year":"2016","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_9","unstructured":"Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., and Courville, A. (2017, January 4\u20139). Improved training of Wasserstein GANs. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_10","first-page":"1972","article-title":"PixelGAN autoencoders","volume":"30","author":"Makhzani","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_11","unstructured":"Dumoulin, V., Belghazi, I., Poole, B., Mastropietro, O., Lamb, A., Arjovsky, M., and Courville, A. (2016). Adversarially learned inference. arXiv."},{"key":"ref_12","unstructured":"Brock, A., Donahue, J., and Simonyan, K. (2018). Large scale GAN training for high fidelity natural image synthesis. arXiv."},{"key":"ref_13","unstructured":"Rusu, A.A., Rabinowitz, N.C., Desjardins, G., Soyer, H., Kirkpatrick, J., Kavukcuoglu, K., Pascanu, R., and Hadsell, R. (2016). Progressive neural networks. arXiv."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"He, K.M., Zhang, X.Y., Ren, S.Q., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_15","unstructured":"Arjovsky, M., and Bottou, L. (2017). Towards principled methods for training generative adversarial networks. arXiv."},{"key":"ref_16","unstructured":"Hjelm, R.D., Jacob, A.P., Che, T., Trischler, A., Cho, K., and Bengio, Y. (2017). Boundary Seeking Generative Adversarial Networks. arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Mao, X., Li, Q., Xie, H., Lau, R.Y., and Wang, Z. (2017, January 22\u201329). Least squares generative adversarial networks. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.304"},{"key":"ref_18","unstructured":"Zhao, J., Mathieu, M., and LeCun, Y. (2017). Energy-based generative adversarial network. arXiv."},{"key":"ref_19","first-page":"2053","article-title":"Learning with a Wasserstein loss","volume":"28","author":"Frogner","year":"2015","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_20","unstructured":"Arjovsky, M., Chintala, S., and Bottou, L. (2017). Wasserstein GAN. arXiv e-prints. arXiv, 685."},{"key":"ref_21","unstructured":"Wu, J., Zhang, C., Xue, T., Freeman, B., and Tenenbaum, J. (2016, January 5\u201310). Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling. Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Girdhar, R., Fouhey, D.F., Rodriguez, M., and Gupta, A. (2016, January 11\u201314). Learning a predictable and generative vector representation for objects. Proceedings of the Computer Vision\u2013ECCV 2016: 14th European Conference, Amsterdam, The Netherlands. Available online: http:\/\/arxiv.org\/abs\/1603.08637.","DOI":"10.1007\/978-3-319-46466-4_29"},{"key":"ref_23","unstructured":"Radford, A., Metz, L., and Chintala, S. (2016, January 2\u20134). Unsupervised representation learning with deep convolutional generative adversarial networks. Proceedings of the International Conference on Learning Representations, San Juan, Puerto Rico."},{"key":"ref_24","unstructured":"Smith, E.J., and Meger, D. (2017, January 13\u201315). Improved adversarial systems for 3D object generation and reconstruction. Proceedings of the 1st Annual Conference on Robot Learning, Mountain View, CA, USA."},{"key":"ref_25","unstructured":"Brock, A., Lim, T., Ritchie, J.M., and Weston, N. (2016). Generative and discriminative voxel modeling with convolutional neural networks. arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Balashova, E., Singh, V., Wang, J.P., Teixeira, B., Chen, T., and Funkhouser, T. (2018, January 5\u20138). Structure-aware shape synthesis. Proceedings of the International Conference on 3D Vision, Verona, Italy.","DOI":"10.1109\/3DV.2018.00026"},{"key":"ref_27","unstructured":"Zhu, J.Y., Zhang, Z., Zhang, C., Wu, J., Torralba, A., Tenenbaum, J., and Freeman, B. (2018, January 2\u20138). Visual object networks: Image generation with disentangled 3D representation. Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_28","unstructured":"Arjovsky, M., Chintala, S., and Bottou, L. (2017, January 6\u201311). Wasserstein generative adversarial networks. Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia."},{"key":"ref_29","unstructured":"Henzler, P., Mitra, N., and Ritschel, T. (November, January 27). Escaping plato\u2019s cave: 3D shape from adversarial rendering. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Gadelha, M., Maji, S., and Wang, R. (2017, January 10\u201312). 3D shape induction from 2D views of multiple objects. Proceedings of the International Conference on 3D Vision, Qingdao, China.","DOI":"10.1109\/3DV.2017.00053"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Li, X., Dong, Y., Peers, P., and Tong, X. (2019, January 15\u201320). Synthesizing 3D shapes from silhouette image collections using multi-projection generative adversarial networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00568"},{"key":"ref_32","unstructured":"Karras, T., Aila, T., Laine, S., and Lehtinen, J. (2017). Progressive growing of gans for improved quality, stability, and variation. arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1016\/S0925-2312(02)00578-7","article-title":"A computational inverse technique for material characterization of a functionally graded cylinder using a progressive neural network","volume":"51","author":"Han","year":"2003","journal-title":"Neurocomputing"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"2935","DOI":"10.1109\/TPAMI.2017.2773081","article-title":"Learning without forgetting","volume":"40","author":"Li","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Gidaris, S., and Komodakis, N. (2018, January 18\u201323). Dynamic few-shot visual learning without forgetting. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00459"},{"key":"ref_36","unstructured":"Odena, A., Olah, C., and Shlens, J. (2017). Conditional Image Synthesis with Auxiliary Classifier GANs, ICML."},{"key":"ref_37","unstructured":"Chang, A.X., Funkhouser, T., Guibas, L., Hanrahan, P., Huang, Q.X., Li, Z.M., Savarese, S., Savva, M., Song, S.R., and Su, H. (2015). ShapeNet: An information-rich 3D model repository. arXiv."},{"key":"ref_38","unstructured":"Antoniou, A., Storkey, A., and Edwards, H. (2017). Data augmentation generative adversarial networks. arXiv."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1186\/s40537-021-00492-0","article-title":"Text data augmentation for deep learning","volume":"8","author":"Shorten","year":"2021","journal-title":"J. Big Data"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"545","DOI":"10.1111\/1754-9485.13261","article-title":"A review of medical image data augmentation techniques for deep learning applications","volume":"65","author":"Chlap","year":"2021","journal-title":"J. Med. Imaging Radiat. Oncol."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Li, M., Lin, J., Ding, Y., Liu, Z., Zhu, J.Y., and Han, S. (2020, January 13\u201319). Gan compression: Efficient architectures for interactive conditional gans. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00533"},{"key":"ref_42","unstructured":"Zhang, Z., Ning, G., Cen, Y., Li, Y., Zhao, Z., Sun, H., and He, Z. (2018). Progressive neural networks for image classification. arXiv."},{"key":"ref_43","unstructured":"Pei, S., Da Xu, R.Y., Xiang, S., and Meng, G. (2021). Alleviating mode collapse in GAN via diversity penalty module. arXiv."},{"key":"ref_44","unstructured":"Yang, D., Hong, S., Jang, Y., Zhao, T., and Lee, H. (2019). Diversity-sensitive conditional generative adversarial networks. arXiv."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/3\/751\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T13:48:31Z","timestamp":1760104111000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/3\/751"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,1,24]]},"references-count":44,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2024,2]]}},"alternative-id":["s24030751"],"URL":"https:\/\/doi.org\/10.3390\/s24030751","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,1,24]]}}}