{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T01:44:08Z","timestamp":1760060648406,"version":"build-2065373602"},"reference-count":44,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2025,9,10]],"date-time":"2025-09-10T00:00:00Z","timestamp":1757462400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001691","name":"JSPS KAKENHI","doi-asserted-by":"publisher","award":["JP23K11277"],"award-info":[{"award-number":["JP23K11277"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],
"abstract":"<jats:p>This study presents a deep generative model designed to predict intermediate stages in the drawing process of character illustrations. To enhance generalization and robustness, the model integrates a variational bottleneck based on the Variational Autoencoder (VAE) and employs Gaussian noise augmentation during training. We also investigate the effect of U-Net-style skip connections, which allow for the direct propagation of low-level features, on autoregressive sequence generation. Comparative experiments with baseline models demonstrate that the proposed VAE with noise augmentation outperforms both CNN- and RNN-based baselines in long-term stability and visual fidelity. While skip connections improve local detail retention, they also introduce instability in extended sequences, suggesting a trade-off between spatial precision and temporal coherence. The findings highlight the advantages of probabilistic modeling and data augmentation for sequential image generation and provide practical insights for designing intelligent drawing support systems.<\/jats:p>",
"DOI":"10.3390\/fi17090413","type":"journal-article","created":{"date-parts":[[2025,9,10]],"date-time":"2025-09-10T08:43:54Z","timestamp":1757493834000},"page":"413","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Improved Generation of Drawing Sequences Using Variational and Skip-Connected Deep Networks for a Drawing Support System"],"prefix":"10.3390","volume":"17","author":[{"given":"Atomu","family":"Nakamura","sequence":"first","affiliation":[{"name":"Faculty of Engineering, Kyoto Tachibana University, Kyoto 607-8175, Japan"}]},{"given":"Homari","family":"Matsumoto","sequence":"additional","affiliation":[{"name":"Graduate School of Informatics, Kyoto Tachibana University, Kyoto 607-8175, Japan"}]},{"given":"Koharu","family":"Chiba","sequence":"additional","affiliation":[{"name":"Faculty of Engineering, Kyoto Tachibana University, Kyoto 607-8175, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-0461-8606","authenticated-orcid":false,"given":"Shun","family":"Nishide","sequence":"additional","affiliation":[{"name":"Graduate School of Informatics, Kyoto Tachibana University, Kyoto 607-8175, Japan"}]}],"member":"1968","published-online":{"date-parts":[[2025,9,10]]},
"reference":[{"key":"ref_1","first-page":"2249","article-title":"Cascaded Diffusion Models for High Fidelity Image Generation","volume":"23","author":"Ho","year":"2022","journal-title":"J. Mach. Learn. Res."},
{"key":"ref_2","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1007\/s10462-025-11110-3","article-title":"Comprehensive Exploration of Diffusion Models in Image Generation: A Survey","volume":"58","author":"Chen","year":"2025","journal-title":"Artif. Intell. Rev."},
{"key":"ref_3","doi-asserted-by":"crossref","first-page":"7203","DOI":"10.1109\/TIP.2020.2999855","article-title":"MEF-GAN: Multi-Exposure Image Fusion via Generative Adversarial Networks","volume":"29","author":"Xu","year":"2020","journal-title":"IEEE Trans. Image Process."},
{"key":"ref_4","doi-asserted-by":"crossref","first-page":"15725","DOI":"10.1109\/TPAMI.2023.3306436","article-title":"StudioGAN: A Taxonomy and Benchmark of GANs for Image Synthesis","volume":"45","author":"Kang","year":"2023","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},
{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1175","DOI":"10.1016\/j.acra.2019.12.024","article-title":"Creating Artificial Images for Radiology Applications Using Generative Adversarial Networks (GANs) \u2013 A Systematic Review","volume":"27","author":"Sorin","year":"2020","journal-title":"Acad. Radiol."},
{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Matsumoto, H., Nakamura, A., and Nishide, S. (2025). Learning and Generation of Drawing Sequences Using a Deep Network for a Drawing Support System. Appl. Sci., 15.","DOI":"10.3390\/app15137038"},
{"key":"ref_7","unstructured":"Kingma, D.P., and Welling, M. (2013). Auto-Encoding Variational Bayes. arXiv."},
{"key":"ref_8","first-page":"19667","article-title":"NVAE: A Deep Hierarchical Variational Autoencoder","volume":"33","author":"Vahdat","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},
{"key":"ref_9","first-page":"1","article-title":"Neural Discrete Representation Learning","volume":"30","author":"Vinyals","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},
{"key":"ref_10","first-page":"1","article-title":"Generating Diverse High-Fidelity Images with VQ-VAE-2","volume":"32","author":"Razavi","year":"2019","journal-title":"Adv. Neural Inf. Process. Syst."},
{"key":"ref_11","unstructured":"Yu, J., Li, X., Koh, J.Y., Zhang, H., Pang, R., Qin, J., Ku, A., Xu, Y., Baldridge, J., and Wu, Y. (2022, January 25\u201329). Vector-Quantized Image Modeling with Improved VQGAN. Proceedings of the International Conference on Learning Representations (ICLR), Online."},
{"key":"ref_12","doi-asserted-by":"crossref","first-page":"109962","DOI":"10.1016\/j.patcog.2023.109962","article-title":"Reparameterizing and Dynamically Quantizing Image Features for Image Generation","volume":"146","author":"Sun","year":"2024","journal-title":"Pattern Recognit."},
{"key":"ref_13","doi-asserted-by":"crossref","first-page":"153651","DOI":"10.1109\/ACCESS.2020.3018151","article-title":"Variations in Variational Autoencoders: A Comparative Evaluation","volume":"8","author":"Wei","year":"2020","journal-title":"IEEE Access"},
{"key":"ref_14","unstructured":"Higgins, I., Matthey, L., Pal, A., Burgess, C., Glorot, X., Botvinick, M., Mohamed, S., and Lerchner, A. (2017, January 24\u201326). Beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Proceedings of the International Conference on Learning Representations (ICLR), Toulon, France."},
{"key":"ref_15","doi-asserted-by":"crossref","first-page":"02001","DOI":"10.1051\/itmconf\/20257002001","article-title":"Research on the Application of Variational Autoencoder in Image Generation","volume":"70","author":"Liu","year":"2025","journal-title":"ITM Web Conf."},
{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-Net: Convolutional Networks for Biomedical Image Segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},
{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Isola, P., Zhu, J.Y., Zhou, T., and Efros, A.A. (2017, January 21\u201326). Image-to-Image Translation with Conditional Adversarial Networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.632"},
{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Park, T., Liu, M.Y., Wang, T.C., and Zhu, J.Y. (2019, January 16\u201320). Semantic Image Synthesis with Spatially-Adaptive Normalization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00244"},
{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Baur, C., Wiestler, B., Albarqouni, S., and Navab, N. (2018). Deep Autoencoding Models for Unsupervised Anomaly Segmentation in Brain MR Images. arXiv.","DOI":"10.1007\/978-3-030-11723-8_16"},
{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Rombach, R., Blattmann, A., Lorenz, D., Esser, P., and Ommer, B. (2022, January 18\u201324). High-Resolution Image Synthesis with Latent Diffusion Models. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01042"},
{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zhou, Z., Siddiquee, M.M.R., Tajbakhsh, N., and Liang, J. (2018, January 20). UNet++: A Nested U-Net Architecture for Medical Image Segmentation. Proceedings of the Deep Learning in Medical Image Analysis (DLMIA) Workshop, Granada, Spain.","DOI":"10.1007\/978-3-030-00889-5_1"},
{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"\u00c7i\u00e7ek, \u00d6., Abdulkadir, A., Lienkamp, S.S., Brox, T., and Ronneberger, O. (2016, January 17\u201321). 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Athens, Greece.","DOI":"10.1007\/978-3-319-46723-8_49"},
{"key":"ref_23","doi-asserted-by":"crossref","first-page":"117982","DOI":"10.1016\/j.cma.2025.117982","article-title":"Real-time Inference and Extrapolation with Time-Conditioned UNet: Applications in Hypersonic Flows, Incompressible Flows, and Global Temperature Forecasting","volume":"441","author":"Ovadia","year":"2025","journal-title":"Comput. Methods Appl. Mech. Eng."},
{"key":"ref_24","unstructured":"Villegas, R., Yang, J., Hong, S., Lin, X., and Lee, H. (2017, January 24\u201326). Decomposing Motion and Content for Natural Video Sequence Prediction. Proceedings of the International Conference on Learning Representations (ICLR), Toulon, France."},
{"key":"ref_25","doi-asserted-by":"crossref","first-page":"176202","DOI":"10.1109\/ACCESS.2020.3024554","article-title":"Weakly-Supervised Defect Segmentation on Periodic Textures Using CycleGAN","volume":"8","author":"Kim","year":"2020","journal-title":"IEEE Access"},
{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1743","DOI":"10.1111\/mice.13103","article-title":"A Lightweight Encoder\u2013Decoder Network for Automatic Pavement Crack Detection","volume":"39","author":"Zhu","year":"2024","journal-title":"Comput.-Aided Civ. Infrastruct. Eng."},
{"key":"ref_27","unstructured":"Peng, Y., Sonka, M., and Chen, D.Z. (2023). U-Net v2: Rethinking the Skip Connections of U-Net for Medical Image Segmentation. arXiv."},
{"key":"ref_28","doi-asserted-by":"crossref","first-page":"106546","DOI":"10.1016\/j.neunet.2024.106546","article-title":"Narrowing the semantic gaps in U-Net with learnable skip connections: The case of medical image segmentation","volume":"178","author":"Wang","year":"2024","journal-title":"Neural Netw."},
{"key":"ref_29","first-page":"802","article-title":"Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting","volume":"28","author":"Shi","year":"2015","journal-title":"Adv. Neural Inf. Process. Syst."},
{"key":"ref_30","first-page":"12345","article-title":"PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning","volume":"45","author":"Wang","year":"2023","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},
{"key":"ref_31","first-page":"2501","article-title":"Dual Convolutional LSTM Network for Referring Image Segmentation","volume":"25","author":"Zuo","year":"2022","journal-title":"IEEE Trans. Multimedia"},
{"key":"ref_32","unstructured":"Bertasius, G., Wang, H., and Torresani, L. (2021, January 18\u201324). Is Space-Time Attention All You Need for Video Understanding?. Proceedings of the International Conference on Machine Learning (ICML), Virtual."},
{"key":"ref_33","first-page":"6840","article-title":"Denoising Diffusion Probabilistic Models","volume":"33","author":"Ho","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},
{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Avrahami, O., Bahat, Y., and Dekel, T. (2022, January 18\u201324). Blended Diffusion for Text-driven Editing of Natural Images. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01767"},
{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Chan, K.C.K., Xie, J., Lu, W., and Loy, C.C. (2021, January 20\u201325). GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01402"},
{"key":"ref_36","first-page":"1234","article-title":"Accurately 3D Neuron Localization Using 2D Conv-LSTM Super-Resolution Segmentation","volume":"17","author":"Zhou","year":"2023","journal-title":"IET Image Process."},
{"key":"ref_37","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1145\/3677102","article-title":"Sketchar: Supporting Character Design and Illustration Prototyping Using Generative AI","volume":"8","author":"Ling","year":"2024","journal-title":"Proc. ACM Hum.-Comput. Interact."},
{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Noor, N.Q.M., Zabidi, A., Jaya, M.I.B.M., and Ler, T.J. (2024, January 18\u201321). Performance Comparison between Generative Adversarial Networks (GAN) Variants in Generating Anime\/Comic Character Images\u2014A Preliminary Result. Proceedings of the IEEE International Symposium on Industrial Electronics (ISIE), Ulsan, Korea.","DOI":"10.1109\/ISIEA61920.2024.10607225"},
{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"K\u0131rb\u0131y\u0131k, \u00d6., Simsar, E., and Cemgil, A.T. (2019, January 24\u201326). Comparison of Deep Generative Models for the Generation of Handwritten Character Images. Proceedings of the 27th Signal Process. Commun. Appl. Conf., Sivas, Turkey.","DOI":"10.1109\/SIU.2019.8806416"},
{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Cao, N., Yan, X., Shi, Y., and Chen, C. (February, January 27). AI-Sketcher: A Deep Generative Model for Producing High-Quality Sketches. Proceedings of the 33rd AAAI Conference on Artificial Intelligence, Honolulu, HI, USA.","DOI":"10.1609\/aaai.v33i01.33012564"},
{"key":"ref_41","unstructured":"Chen, Y., Tu, S., Yi, Y., and Xu, L. (2017). Sketch-pix2seq: A Model to Generate Sketches of Multiple Categories. arXiv."},
{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Cai, A., Rick, S.R., Heyman, J., Zhang, Y., Filipowicz, A., Hong, M.K., Klenk, M., and Malone, T. (2023, January 7\u20139). DesignAID: Using Generative AI and Semantic Diversity for Design Inspiration. Proceedings of the ACM Collective Intelligence Conference, Delft, The Netherlands.","DOI":"10.1145\/3582269.3615596"},
{"key":"ref_43","doi-asserted-by":"crossref","first-page":"146","DOI":"10.1108\/ITSE-05-2012-0016","article-title":"A Drawing Learning Support System Based on the Drawing Process Model","volume":"11","author":"Nagai","year":"2014","journal-title":"Interact. Technol. Smart Educ."},
{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Hiroi, Y., and Ito, A. (2023). A Robotic System for Remote Teaching of Technical Drawing. Educ. Sci., 13.","DOI":"10.3390\/educsci13040347"}],
"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/17\/9\/413\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:42:56Z","timestamp":1760035376000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/17\/9\/413"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,10]]},"references-count":44,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2025,9]]}},"alternative-id":["fi17090413"],"URL":"https:\/\/doi.org\/10.3390\/fi17090413","relation":{},"ISSN":["1999-5903"],"issn-type":[{"type":"electronic","value":"1999-5903"}],"subject":[],"published":{"date-parts":[[2025,9,10]]}}}