{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T06:23:52Z","timestamp":1776925432561,"version":"3.51.2"},"reference-count":42,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2024,5,11]],"date-time":"2024-05-11T00:00:00Z","timestamp":1715385600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Federal Ministry for Economic Affairs and Climate Action (BMWK)","award":["22849 BG\/2"],"award-info":[{"award-number":["22849 BG\/2"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>This study addresses a significant gap in the field of time series regression modeling by highlighting the central role of data augmentation in improving model accuracy. The primary objective is to present a detailed methodology for systematic sampling of training datasets through data augmentation to improve the accuracy of time series regression models. Therefore, different augmentation techniques are compared to evaluate their impact on model accuracy across different datasets and model architectures. In addition, this research highlights the need for a standardized approach to creating training datasets using multiple augmentation methods. The lack of a clear framework hinders the easy integration of data augmentation into time series regression pipelines. Our systematic methodology promotes model accuracy while providing a robust foundation for practitioners to seamlessly integrate data augmentation into their modeling practices. The effectiveness of our approach is demonstrated using process data from two milling machines. Experiments show that the optimized training dataset improves the generalization ability of machine learning models in 86.67% of the evaluated scenarios. However, the prediction accuracy of models trained on a sufficient dataset remains largely unaffected. Based on these results, sophisticated sampling strategies such as Quadratic Weighting of multiple augmentation approaches may be beneficial.<\/jats:p>","DOI":"10.3390\/make6020049","type":"journal-article","created":{"date-parts":[[2024,5,16]],"date-time":"2024-05-16T09:30:03Z","timestamp":1715851803000},"page":"1072-1086","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Improving Time Series Regression Model Accuracy via Systematic Training Dataset Augmentation and Sampling"],"prefix":"10.3390","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2424-9789","authenticated-orcid":false,"given":"Robin","family":"Str\u00f6bel","sequence":"first","affiliation":[{"name":"wbk Institute of Production Science, Karlsruhe Institute of Technology, Kaiserstra\u00dfe 12, 76131 Karlsruhe, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-4578-3507","authenticated-orcid":false,"given":"Marcus","family":"Mau","sequence":"additional","affiliation":[{"name":"wbk Institute of Production Science, Karlsruhe Institute of Technology, Kaiserstra\u00dfe 12, 76131 Karlsruhe, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9975-7481","authenticated-orcid":false,"given":"Alexander","family":"Puchta","sequence":"additional","affiliation":[{"name":"wbk Institute of Production Science, Karlsruhe Institute of Technology, Kaiserstra\u00dfe 12, 76131 Karlsruhe, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0961-7675","authenticated-orcid":false,"given":"J\u00fcrgen","family":"Fleischer","sequence":"additional","affiliation":[{"name":"wbk Institute of Production Science, Karlsruhe Institute of Technology, Kaiserstra\u00dfe 12, 76131 Karlsruhe, Germany"}]}],"member":"1968","published-online":{"date-parts":[[2024,5,11]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1016\/j.jmsy.2018.02.004","article-title":"A survey of the advancing use and development of machine learning in smart manufacturing","volume":"48","author":"Sharp","year":"2018","journal-title":"J. Manuf. Syst."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"114820","DOI":"10.1016\/j.eswa.2021.114820","article-title":"Machine Learning for industrial applications: A comprehensive literature review","volume":"175","author":"Bertolini","year":"2021","journal-title":"Expert Syst. Appl."},{"key":"ref_3","first-page":"75","article-title":"Evaluating the effect of dataset size on predictive model using supervised learning technique","volume":"1","author":"Ajiboye","year":"2015","journal-title":"Int. J. Softw. Eng. Comput. Sci."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"106368","DOI":"10.1016\/j.infsof.2020.106368","article-title":"Large-scale machine learning systems in real-world industrial settings: A review of challenges and solutions","volume":"127","author":"Lwakatare","year":"2020","journal-title":"Inf. Softw. Technol."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"401","DOI":"10.1016\/j.repl.2015.09.004","article-title":"Customization of mass-produced parts by combining injection molding and additive manufacturing with Industry 4.0 technologies","volume":"60","author":"Gaub","year":"2016","journal-title":"Reinf. Plast."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Fawzi, A., Samulowitz, H., Turaga, D., and Frossard, P. (2016, January 25\u201328). Adaptive data augmentation for image classification. Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA.","DOI":"10.1109\/ICIP.2016.7533048"},{"key":"ref_7","unstructured":"Schlagenhauf, T. (2022). Bildbasierte Quantifizierung und Prognose des Verschlei\u00dfes an Kugelgewindetriebspindeln: Ein Beitrag zur Zustands\u00fcberwachung von Kugelgewindetrieben Mittels Methoden des maschinellen Lernens, Shaker Verlag."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Wen, Q., Sun, L., Yang, F., Song, X., Gao, J., Wang, X., and Xu, H. (2021, January 19\u201326). Time Series Data Augmentation for Deep Learning: A Survey. Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, Montreal, QC, Canada.","DOI":"10.24963\/ijcai.2021\/631"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Fu, Q., and Wang, H. (2020). A Novel Deep Learning System with Data Augmentation for Machine Fault Diagnosis from Vibration Signals. Appl. Sci., 10.","DOI":"10.3390\/app10175765"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Bui, V., Pham, T.L., Nguyen, H., and Jang, Y.M. (2021). Data Augmentation Using Generative Adversarial Network for Automatic Machine Fault Detection Based on Vibration Signals. Appl. Sci., 11.","DOI":"10.3390\/app11052166"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Lin, J.C., and Yang, F. (2022, January 7\u20139). Data Augmentation for Industrial Multivariate Time Series via a Spatial and Frequency Domain Knowledge GAN. Proceedings of the 2022 IEEE International Symposium on Advanced Control of Industrial Processes (AdCONIP), Vancouver, BC, Canada.","DOI":"10.1109\/AdCONIP55568.2022.9894177"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1198\/10618600152418584","article-title":"The Art of Data Augmentation","volume":"10","author":"Meng","year":"2001","journal-title":"J. Comput. Graph. Stat."},{"key":"ref_13","unstructured":"DeVries, T., and Taylor, G.W. (2017). Dataset Augmentation in Feature Space. arXiv."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Wong, S.C., Gatt, A., Stamatescu, V., and McDonnell, M.D. (December, January 30). Understanding Data Augmentation for Classification: When to Warp?. Proceedings of the 2016 International Conference on Digital Image Computing: Techniques and Applications (DICTA), Gold Coast, QLD, Australia.","DOI":"10.1109\/DICTA.2016.7797091"},{"key":"ref_15","unstructured":"Wu, S., Zhang, H.R., Valiant, G., and R\u00e9, C. (2020). On the Generalization Effects of Linear Transformations in Data Augmentation. arXiv."},{"key":"ref_16","unstructured":"Tran, T., Pham, T., Carneiro, G., Palmer, L., and Reid, I. (2017). A Bayesian Data Augmentation Approach for Learning Deep Models. arXiv."},{"key":"ref_17","unstructured":"Hu, W., Miyato, T., Tokui, S., Matsumoto, E., and Sugiyama, M. (2017). Learning Discrete Representations via Information Maximizing Self-Augmented Training. arXiv."},{"key":"ref_18","unstructured":"Chung, A.C.S., Gee, J.C., Yushkevich, P.A., and Bao, S. Information Processing in Medical Imaging, Springer International Publishing. Series Title: Lecture Notes in Computer Science."},{"key":"ref_19","unstructured":"Hu, Z., Tan, B., Salakhutdinov, R., Mitchell, T., and Xing, E.P. (2019). Learning Data Manipulation for Augmentation and Weighting. arXiv."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Vedaldi, A., Bischof, H., Brox, T., and Frahm, J.M. Computer Vision\u2014ECCV 2020, Springer International Publishing. Series Title: Lecture Notes in Computer Science.","DOI":"10.1007\/978-3-030-58604-1"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Sakai, A., Minoda, Y., and Morikawa, K. (September, January 31). Data augmentation methods for machine-learning-based classification of bio-signals. Proceedings of the 2017 10th Biomedical Engineering International Conference (BMEiCON), Hokkaido, Japan.","DOI":"10.1109\/BMEiCON.2017.8229109"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"108148","DOI":"10.1016\/j.patcog.2021.108148","article-title":"Improving the accuracy of global forecasting models using time series data augmentation","volume":"120","author":"Bandara","year":"2021","journal-title":"Pattern Recognit."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Forestier, G., Petitjean, F., Dau, H.A., Webb, G.I., and Keogh, E. (2017, January 18\u201321). Generating Synthetic Time Series to Augment Sparse Datasets. Proceedings of the 2017 IEEE International Conference on Data Mining (ICDM), New Orleans, LA, USA.","DOI":"10.1109\/ICDM.2017.106"},{"key":"ref_24","unstructured":"Fawaz, H.I., Forestier, G., Weber, J., Idoumghar, L., and Muller, P.A. (2018). Data augmentation using synthetic data for time series classification with deep residual networks. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Iwana, B.K., and Uchida, S. (2021). An empirical survey of data augmentation for time series classification with neural networks. PLoS ONE, 16.","DOI":"10.1371\/journal.pone.0254841"},{"key":"ref_26","unstructured":"Fu, B., Kirchbuchner, F., and Kuijper, A. (July, January 30). Data augmentation for time series: Traditional vs generative models on capacitive proximity time series. Proceedings of the 13th ACM International Conference on PErvasive Technologies Related to Assistive Environments, Corfu, Greece."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3559540","article-title":"Generative Adversarial Networks in Time Series: A Systematic Literature Review","volume":"55","author":"Brophy","year":"2023","journal-title":"ACM Comput. Surv."},{"key":"ref_28","unstructured":"Tanaka, F.H.K.d.S., and Aranha, C. (2019). Data Augmentation Using GANs. arXiv."},{"key":"ref_29","unstructured":"Ramponi, G., Protopapas, P., Brambilla, M., and Janssen, R. (2018). T-CGAN: Conditional Generative Adversarial Network for Data Augmentation in Noisy Time Series with Irregular Sampling. arXiv."},{"key":"ref_30","first-page":"7920","article-title":"Neural networks generative models for time series","volume":"34","author":"Gatta","year":"2022","journal-title":"J. King Saud Univ. Comput. Inf. Sci."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"64601","DOI":"10.1109\/ACCESS.2022.3177906","article-title":"Assessing Deep Generative Models on Time Series Network Data","volume":"10","author":"Naveed","year":"2022","journal-title":"IEEE Access"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Haradal, S., Hayashi, H., and Uchida, S. (2018, January 18\u201321). Biosignal Data Augmentation Based on Generative Adversarial Networks. Proceedings of the 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Honolulu, HI, USA.","DOI":"10.1109\/EMBC.2018.8512396"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Str\u00f6bel, R., Probst, Y., Deucker, S., and Fleischer, J. (2023). Time Series Prediction for Energy Consumption of Computer Numerical Control Axes Using Hybrid Machine Learning Models. Machines, 11.","DOI":"10.3390\/machines11111015"},{"key":"ref_34","unstructured":"Str\u00f6bel, R., Mau, M., Deucker, S., and Fleischer, J. (2023). Training and Validation Dataset 2 of Milling Processes for Time Series Prediction, Karlsruhe Institute of Technology."},{"key":"ref_35","unstructured":"Str\u00f6bel, R., Probst, Y., and Fleischer, J. (2023). Training and Validation Dataset of Milling Processes for Time Series Prediction, Karlsruhe Institute of Technology."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"10123","DOI":"10.1007\/s00521-023-08459-3","article-title":"Data Augmentation techniques in time series domain: A survey and taxonomy","volume":"35","author":"Iglesias","year":"2023","journal-title":"Neural Comput. Appl."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Um, T.T., Pfister, F.M.J., Pichler, D., Endo, S., Lang, M., Hirche, S., Fietzek, U., and Kuli\u0107, D. (2017, January 13\u201317). Data augmentation of wearable sensor data for parkinson\u2019s disease monitoring using convolutional neural networks. Proceedings of the 19th ACM International Conference on Multimodal Interaction, New York, NY, USA.","DOI":"10.1145\/3136755.3136817"},{"key":"ref_38","unstructured":"Le Guennec, A., Malinowski, S., and Tavenard, R. (2021, January 13). Data Augmentation for Time Series Classification using Convolutional Neural Networks. Proceedings of the ECML\/PKDD Workshop on Advanced Analytics and Learning on Temporal Data, Bilbao, Spain."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"117695","DOI":"10.1016\/j.apenergy.2021.117695","article-title":"Data augmentation for time series regression: Applying transformations, autoencoders and adversarial networks to electricity price forecasting","volume":"304","author":"Demir","year":"2021","journal-title":"Appl. Energy"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Nanni, L., Paci, M., Brahnam, S., and Lumini, A. (2021). Comparison of Different Image Data Augmentation Approaches. J. Imaging, 7.","DOI":"10.20944\/preprints202111.0047.v1"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"126991","DOI":"10.1016\/j.neucom.2023.126991","article-title":"Confidence-based interactable neural-symbolic visual question answering","volume":"564","author":"Bao","year":"2024","journal-title":"Neurocomputing"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"391","DOI":"10.1007\/s10994-011-5268-1","article-title":"Robustness and generalization","volume":"86","author":"Xu","year":"2012","journal-title":"Mach. Learn."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/6\/2\/49\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:44:21Z","timestamp":1760107461000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/6\/2\/49"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,11]]},"references-count":42,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2024,6]]}},"alternative-id":["make6020049"],"URL":"https:\/\/doi.org\/10.3390\/make6020049","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,11]]}}}