{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,20]],"date-time":"2026-06-20T01:16:42Z","timestamp":1781918202886,"version":"3.54.5"},"reference-count":45,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2025,8,26]],"date-time":"2025-08-26T00:00:00Z","timestamp":1756166400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>The data landscape has changed, as more and more information is produced in the form of continuous data streams instead of stationary datasets. In this context, several online machine learning techniques have been proposed with the aim of automatically adapting to changes in data distributions, known as drifts. Though effective in certain scenarios, contemporary techniques do not generalize well to different types of data, while they also require manual parameter tuning, thus significantly hindering their applicability. Moreover, current methods do not thoroughly address drifts, as they mostly focus on concept drifts (distribution shifts on the target variable) and not on data drifts (changes in feature distributions). To confront these challenges, in this paper, we propose an AutoML Pipeline for Streams (AML4S), which automates the choice of preprocessing techniques, the choice of machine learning models, and the tuning of hyperparameters. Our pipeline further includes a drift detection mechanism that identifies different types of drifts, therefore continuously adapting the underlying models. We assess our pipeline on several real and synthetic data streams, including a data stream that we crafted to focus on data drifts. 
Our results indicate that AML4S produces robust pipelines and outperforms existing online learning or AutoML algorithms.<\/jats:p>","DOI":"10.3390\/make7030087","type":"journal-article","created":{"date-parts":[[2025,8,26]],"date-time":"2025-08-26T14:43:18Z","timestamp":1756219398000},"page":"87","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["AML4S: An AutoML Pipeline for Data Streams"],"prefix":"10.3390","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0009-0005-0672-7766","authenticated-orcid":false,"given":"Eleftherios","family":"Kalaitzidis","sequence":"first","affiliation":[{"name":"Electrical and Computer Engineering Department, Aristotle University of Thessaloniki, 541 24 Thessaloniki, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0520-7225","authenticated-orcid":false,"given":"Themistoklis","family":"Diamantopoulos","sequence":"additional","affiliation":[{"name":"Electrical and Computer Engineering Department, Aristotle University of Thessaloniki, 541 24 Thessaloniki, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-9824-0975","authenticated-orcid":false,"given":"Athanasios","family":"Michailoudis","sequence":"additional","affiliation":[{"name":"Electrical and Computer Engineering Department, Aristotle University of Thessaloniki, 541 24 Thessaloniki, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0235-6046","authenticated-orcid":false,"given":"Andreas L.","family":"Symeonidis","sequence":"additional","affiliation":[{"name":"Electrical and Computer Engineering Department, Aristotle University of Thessaloniki, 541 24 Thessaloniki, 
Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2025,8,26]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Sagiroglu, S., and Sinanc, D. (2013, January 20\u201324). Big data: A review. Proceedings of the 2013 International Conference on Collaboration Technologies and Systems (CTS), San Diego, CA, USA.","DOI":"10.1109\/CTS.2013.6567202"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"9523","DOI":"10.1016\/j.jksuci.2021.11.006","article-title":"Concept Drift Detection in Data Stream Mining: A literature review","volume":"34","author":"Agrahari","year":"2022","journal-title":"J. King Saud Univ.\u2014Comput. Inf. Sci."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2523813","article-title":"A survey on concept drift adaptation","volume":"46","author":"Gama","year":"2014","journal-title":"ACM Comput. Surv."},{"key":"ref_4","first-page":"2346","article-title":"Learning under Concept Drift: A Review","volume":"31","author":"Lu","year":"2019","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1145\/3373464.3373470","article-title":"Machine learning for streaming data: State of the art, challenges, and opportunities","volume":"21","author":"Gomes","year":"2019","journal-title":"SIGKDD Explor. Newsl."},{"key":"ref_6","unstructured":"Bifet, A., and Gavald\u00e0, R. (2009). Adaptive Learning from Evolving Data Streams. Advances in Intelligent Data Analysis VIII: 8th International Symposium on Intelligent Data Analysis, IDA 2009, Lyon, France, 31 August\u20132 September 2009, Springer."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Bifet, A., Holmes, G., and Pfahringer, B. (2010). Leveraging bagging for evolving data streams. Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2010, Barcelona, Spain, 20\u201324 September 2010. 
Proceedings, Part I, Springer.","DOI":"10.1007\/978-3-642-15880-3_15"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1469","DOI":"10.1007\/s10994-017-5642-8","article-title":"Adaptive random forests for evolving data stream classification","volume":"106","author":"Gomes","year":"2017","journal-title":"Mach. Learn."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Gomes, H.M., Read, J., and Bifet, A. (2019, January 8\u201311). Streaming Random Patches for Evolving Data Stream Classification. Proceedings of the 2019 IEEE International Conference on Data Mining (ICDM), Beijing, China.","DOI":"10.1109\/ICDM.2019.00034"},{"key":"ref_10","unstructured":"Rad, R.H., and Haeri, M.A. (2019). Hybrid Forest: A Concept Drift Aware Data Stream Mining Algorithm. arXiv."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Yang, L., Manias, D.M., and Shami, A. (2021, January 7\u201311). PWPAE: An Ensemble Framework for Concept Drift Adaptation in IoT Data Streams. Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM), Madrid, Spain.","DOI":"10.1109\/GLOBECOM46510.2021.9685338"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Madrid, J.G., Jair Escalante, H., Morales, E.F., Tu, W.W., Yu, Y., Sun-Hosoya, L., Guyon, I., and Sebag, M. (2018, January 14). Towards AutoML in the presence of Drift: First results. Proceedings of the Workshop AutoML 2018 @ ICML\/IJCAI-ECAI, Stockholm, Sweden.","DOI":"10.52591\/lxai201812039"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Kulbach, C., Montiel, J., Bahri, M., Heyden, M., and Bifet, A. (2022). Evolution-Based Online Automated Machine Learning. Advances in Knowledge Discovery and Data Mining: 26th Pacific-Asia Conference, PAKDD 2022, Chengdu, China, 16\u201319 May 2022, Proceedings, Part I, Springer.","DOI":"10.1007\/978-3-031-05933-9_37"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Bahri, M., and Georgantas, N. (2023, January 15\u201318). 
AutoClass: AutoML for Data Stream Classification. Proceedings of the 2023 IEEE International Conference on Big Data (BigData), Sorrento, Italy.","DOI":"10.1109\/BigData59044.2023.10386362"},{"key":"ref_15","first-page":"11263","article-title":"ChaCha for Online AutoML. In Proceedings of the 38th International Conference on Machine Learning","volume":"139","author":"Wu","year":"2021","journal-title":"Proc. Mach. Learn. Res."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1897","DOI":"10.1007\/s10994-022-06262-0","article-title":"Online AutoML: An adaptive AutoML framework for online learning","volume":"112","author":"Celik","year":"2022","journal-title":"Mach. Learn."},{"key":"ref_17","unstructured":"Verma, N., Bifet, A., Pfahringer, B., and Bahri, M. (2024, January 9\u201312). ASML: A Scalable and Efficient AutoML Solution for Data Streams. Proceedings of the AutoML 2024\u2014International Conference on Automated Machine Learning, Paris, France."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"535","DOI":"10.1007\/s10115-014-0808-1","article-title":"A survey on data stream clustering and classification","volume":"45","author":"Nguyen","year":"2015","journal-title":"Knowl. Inf. Syst."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"3725","DOI":"10.1007\/s10462-020-09939-x","article-title":"Concept learning using one-class classifiers for implicit drift detection in evolving data streams","volume":"54","author":"Can","year":"2021","journal-title":"Artif. Intell. Rev."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Bifet, A., and Gavalda, R. (2007, January 26\u201328). Learning from Time-Changing Data with Adaptive Windowing. Proceedings of the 2007 SIAM International Conference on Data Mining (SDM), Minneapolis, MN, USA.","DOI":"10.1137\/1.9781611972771.42"},{"key":"ref_21","unstructured":"Oza, N.C., and Russell, S.J. (2001, January 4\u20137). Online Bagging and Boosting. 
Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics, Key West, FL, USA."},{"key":"ref_22","unstructured":"Feurer, M., Klein, A., Eggensperger, K., Springenberg, J.T., Blum, M., and Hutter, F. (2015, January 7\u201312). Efficient and robust automated machine learning. Proceedings of the 29th International Conference on Neural Information Processing Systems (NIPS\u201915), Montreal, QC, Canada."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Pesaranghader, A., and Viktor, H.L. (2016, January 19\u201323). Fast Hoeffding Drift Detection Method for Evolving Data Streams. Proceedings of the Machine Learning and Knowledge Discovery in Databases, Riva del Garda, Italy.","DOI":"10.1007\/978-3-319-46227-1_7"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"B\u00e4ck, T. (1996). Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms, Oxford University Press.","DOI":"10.1093\/oso\/9780195099713.001.0001"},{"key":"ref_25","unstructured":"Baena-Garc\u00eda, M., del Campo-\u00c1vila, J., Fidalgo, R., and Bifet, A. (2006, January 20). Early drift detection method. Proceedings of the Fourth International Workshop on Knowledge Discovery from Data Streams, Philadelphia, PA, USA."},{"key":"ref_26","first-page":"1","article-title":"River: Machine learning for streaming data in Python","volume":"22","author":"Montiel","year":"2021","journal-title":"J. Mach. Learn. Res."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Gonz\u00e1lez, M.L., Sedano, J., Garc\u00eda-Vico, \u00c1.M., and Villar, J.R. (2022). A Comparison of Techniques for Virtual Concept Drift Detection. 16th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2021), Springer.","DOI":"10.1007\/978-3-030-87869-6_1"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Mahgoub, M., Moharram, H., Elkafrawy, P., and Awad, A. (2023). 
Benchmarking Concept Drift Detectors for Online Machine Learning. Model and Data Engineering: 11th International Conference, MEDI 2022, Cairo, Egypt, 21\u201324 November 2022, Springer.","DOI":"10.1007\/978-3-031-21595-7_4"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"914","DOI":"10.1109\/69.250074","article-title":"Database Mining: A Performance Perspective","volume":"5","author":"Agrawal","year":"1993","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_30","unstructured":"Blackard, J. (1998). Covertype. UCI Machine Learning Repository, University of California."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Losing, V., Hammer, B., and Wersing, H. (2016, January 12\u201315). KNN Classifier with Self Adjusting Memory for Heterogeneous Concept Drift. Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM), Barcelona, Spain.","DOI":"10.1109\/ICDM.2016.0040"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"80","DOI":"10.2307\/3001968","article-title":"Individual Comparisons by Ranking Methods","volume":"1","author":"Wilcoxon","year":"1945","journal-title":"Biom. Bull."},{"key":"ref_33","unstructured":"Harries, M. (1999). Splice-2 Comparative Evaluation: Electricity Pricing, University of New South Wales, School of Computer Science and Engineering. Technical Report UNSW-CSE-TR-9905."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"826","DOI":"10.1016\/j.jpdc.2004.03.020","article-title":"Vehicle classification in distributed sensor networks","volume":"64","author":"Duarte","year":"2004","journal-title":"J. Parallel Distrib. Comput."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Bifet, A., Holmes, G., Pfahringer, B., Read, J., Kranen, P., Kremer, H., Jansen, T., and Seidl, T. (2011). MOA: A real-time analytics open source framework. 
Machine Learning and Knowledge Discovery in Databases, Part III: European Conference, ECML PKDD 2011, Athens, Greece, 5\u20139 September 2011, Proceedings, Part III, Springer.","DOI":"10.1007\/978-3-642-23808-6_41"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Hulten, G., Spencer, L., and Domingos, P. (2001, January 26\u201329). Mining time-changing data streams. Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD \u201901, New York, NY, USA.","DOI":"10.1145\/502512.502529"},{"key":"ref_37","first-page":"1601","article-title":"MOA: Massive Online Analysis, a Framework for Stream Classification and Clustering","volume":"11","author":"Bifet","year":"2010","journal-title":"J. Mach. Learn. Res."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Gomes, H.M., and Bifet, A. (2024, January 25\u201329). Practical Machine Learning for Streaming Data. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD \u201924, New York, NY, USA.","DOI":"10.1145\/3637528.3671442"},{"key":"ref_39","unstructured":"Snoek, J., Larochelle, H., and Adams, R.P. (2011, January 12\u201315). Practical Bayesian optimization of machine learning algorithms. Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS\u201912), Granada, Spain."},{"key":"ref_40","unstructured":"Bergstra, J., Bardenet, R., Bengio, Y., and K\u00e9gl, B. (2010, January 6\u20139). Algorithms for hyper-parameter optimization. Proceedings of the 24th International Conference on Neural Information Processing Systems (NIPS\u201911), Vancouver, BC, Canada."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Hu, L., Lu, Y., and Feng, Y. (2025). Concept Drift Detection Based on Deep Neural Networks and Autoencoders. Appl. 
Sci., 15.","DOI":"10.3390\/app15063056"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"110705","DOI":"10.1016\/j.knosys.2023.110705","article-title":"Model-centric transfer learning framework for concept drift detection","volume":"275","author":"Wang","year":"2023","journal-title":"Knowl.-Based Syst."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"996","DOI":"10.1016\/j.ins.2022.07.022","article-title":"Meta-ADD: A meta-learning based pre-trained model for concept drift active detection","volume":"608","author":"Yu","year":"2022","journal-title":"Inf. Sci."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"4802","DOI":"10.1109\/TNNLS.2017.2771290","article-title":"A Systematic Study of Online Class Imbalance Learning with Concept Drift","volume":"29","author":"Wang","year":"2018","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"3499","DOI":"10.1007\/s40747-021-00456-0","article-title":"Deep learning framework for handling concept drift and class imbalanced complex decision-making on streaming data","volume":"9","author":"Priya","year":"2023","journal-title":"Complex Intell. 
Syst."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/7\/3\/87\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:33:19Z","timestamp":1760034799000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/7\/3\/87"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,26]]},"references-count":45,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2025,9]]}},"alternative-id":["make7030087"],"URL":"https:\/\/doi.org\/10.3390\/make7030087","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,8,26]]}}}