{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,6]],"date-time":"2026-01-06T07:17:29Z","timestamp":1767683849882,"version":"3.48.0"},"reference-count":67,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2017,7,1]],"date-time":"2017-07-01T00:00:00Z","timestamp":1498867200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Intelligent Data Analysis: An International Journal"],"published-print":{"date-parts":[[2017,7]]},"abstract":"<jats:p>The problem of selecting learning algorithms has been studied by the meta-learning community for more than two decades. One of the most important tasks for the success of a meta-learning system is gathering data about the learning process. This data is used to induce a (meta) model able to map characteristics extracted from different data sets to the performance of learning algorithms on these data sets. These systems are built under the assumption that the data are generated by a stationary distribution, i.e., a learning algorithm will perform similarly for new data from the same problem. However, many applications generate data whose characteristics can change over time. Therefore, a suitable bias at a given time may become inappropriate at another time. Although meta-learning has been used to continuously select a learning algorithm in data streams, data characterization has received less attention in this context. In this study, we provide a set of guidelines to support the proposal of characteristics able to describe non-stationary data over time. This guidance considers both the order of arrival of the examples and the type of variables involved in the base-level learning. 
In addition, we analyze the influence of characteristics regarding their dependence on data morphology. Experimental results using real data streams showed the effectiveness of the proposed data characterization general scheme to support algorithm selection by meta-learning systems. Moreover, the dependent meta-features provided crucial information for the success of some meta-models.<\/jats:p>","DOI":"10.3233\/ida-160083","type":"journal-article","created":{"date-parts":[[2017,8,22]],"date-time":"2017-08-22T14:00:08Z","timestamp":1503410408000},"page":"1015-1035","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":7,"title":["A guidance of data stream characterization for meta-learning"],"prefix":"10.1177","volume":"21","author":[{"given":"Andr\u00e9 Luis Debiaso","family":"Rossi","sequence":"first","affiliation":[{"name":"Universidade Estadual Paulista (UNESP)","place":["Brazil"]}]},{"given":"Bruno Feres","family":"de Souza","sequence":"additional","affiliation":[{"name":"Universidade Federal do Maranh\u00e3o","place":["Brazil"]}]},{"given":"Carlos","family":"Soares","sequence":"additional","affiliation":[{"name":"Universidade do Porto","place":["Portugal"]}]},{"given":"Andr\u00e9 Carlos Ponce","family":"de Leon Ferreira de Carvalho","sequence":"additional","affiliation":[{"name":"Universidade de S\u00e3o Paulo","place":["Brazil"]}]}],"member":"179","published-online":{"date-parts":[[2017,7]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0169-2070(01)00079-6"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2004.12.002"},{"key":"e_1_3_2_4_2","unstructured":"AmasyaliM. and ErsoyO. A study of meta learning for regression Technical report Purdue University 2009. http:\/\/docs.lib.purdue.edu\/ecetr\/386\/."},{"key":"e_1_3_2_5_2","unstructured":"ASAA.S.A. Data Expo 2009 \u2013 Sections on Statistical Computing and Statistical Graphics 2009. 
http:\/\/stat-computing.org\/dataexpo\/2009\/."},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/380995.380999"},{"key":"e_1_3_2_7_2","doi-asserted-by":"crossref","unstructured":"BifetA. ReadJ. \u017dliobaiteI. PfahringerB. and HolmesG. Pitfalls in benchmarking data stream classification and how to avoid them in: Machine Learning and Knowledge Discovery in Databases BlockeelH. KerstingK. NijssenS. and \u017delezn\u00fdF. eds volume 8188 Springer Berlin Heidelberg 2013 pp. 465\u2013479.","DOI":"10.1007\/978-3-642-40988-2_30"},{"key":"e_1_3_2_8_2","doi-asserted-by":"crossref","unstructured":"BrazdilP. Giraud-CarrierC. SoaresC. and VilaltaR. Metalearning: Applications to Data Mining Springer Verlag 2009.","DOI":"10.1007\/978-3-540-73263-1"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1021713901879"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1010933404324"},{"key":"e_1_3_2_11_2","unstructured":"BreimanL. FriedmanJ. OlshenR. and StoneC. Classification and Regression Trees Chapman & Hall (Wadsworth Inc.) 1984."},{"issue":"12","key":"e_1_3_2_12_2","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1016\/S0925-2312(03)00433-8","article-title":"A comparison of pca, KPCA and ICA for dimensionality reduction in support vector machine","volume":"55","author":"Cao L.","year":"2003","unstructured":"CaoL. ChuaK. ChongW. LeeH. and GuQ., A comparison of pca, KPCA and ICA for dimensionality reduction in support vector machine, Neurocomputing 55(12) (2003), 321\u2013336.","journal-title":"Neurocomputing"},{"key":"e_1_3_2_13_2","doi-asserted-by":"crossref","unstructured":"CaruanaR. and Niculescu-MizilA. An empirical comparison of supervised learning algorithms in Proceedings of the 23rd International Conference on Machine Learning New York NY USA 2006 pp. 161\u2013168. ACM.","DOI":"10.1145\/1143844.1143865"},{"key":"e_1_3_2_14_2","doi-asserted-by":"crossref","unstructured":"CastielloC. CastellanoG. 
and FanelliA. Meta-data: Characterization of input features for meta-learning in: Modeling Decisions for Artificial Intelligence TorraV. NarukawaY. and MiyamotoS. eds Springer Berlin\/Heidelberg 2005 pp. 295\u2013304.","DOI":"10.1007\/11526018_45"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1177\/001316446002000104"},{"key":"e_1_3_2_16_2","doi-asserted-by":"crossref","unstructured":"CristianiniN. and Shawe-TaylorJ. An introduction to support Vector Machines: and Other Kernel-Based Learning Methods Cambridge University Press New York NY USA 2000.","DOI":"10.1017\/CBO9780511801389"},{"key":"e_1_3_2_17_2","doi-asserted-by":"crossref","unstructured":"DasuT. KrishnanS. LinD. VenkatasubramanianS. and YiK. Change (detection) you can believe in: Finding distributional shifts in data streams in: Proceedings of the 8th International Symposium on Intelligent Data Analysis: Advances in Intelligent Data Analysis VIII Berlin Heidelberg 2009 pp. 21\u201334. Springer-Verlag.","DOI":"10.1007\/978-3-642-03915-7_3"},{"key":"e_1_3_2_18_2","doi-asserted-by":"crossref","unstructured":"DomingosP. and HultenG. Mining high-speed data streams in Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining New York NY USA 2000 pp. 71\u201380. ACM.","DOI":"10.1145\/347090.347107"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2011.19"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1214\/aos\/1176347963"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1981.10477729"},{"key":"e_1_3_2_22_2","unstructured":"GamaJ. Knowledge Discovery from Data Streams CRC Press 2010."},{"key":"e_1_3_2_23_2","doi-asserted-by":"crossref","unstructured":"GamaJ. and KosinaP. Learning about the learning process in: Proceedings of the 10th International Conference on Advances in Intelligent Data Analysis Berlin Heidelberg 2011 pp. 162\u2013172. 
Springer-Verlag.","DOI":"10.1007\/978-3-642-24800-9_17"},{"key":"e_1_3_2_24_2","doi-asserted-by":"crossref","unstructured":"GamaJ. Sebasti\u00e3oR. and RodriguesP.P. Issues in evaluation of stream learning algorithms in: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining New York NY USA 2009 pp. 329\u2013338. ACM.","DOI":"10.1145\/1557019.1557060"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-012-5320-9"},{"key":"e_1_3_2_26_2","unstructured":"GarciaL.P.F. de CarvalhoA.C.P.F. and LorenaA.C. Noise detection in the meta-learning level Neurocomputing (2015). in press 2015."},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1023\/B:MACH.0000015878.60765.42"},{"key":"e_1_3_2_28_2","doi-asserted-by":"crossref","unstructured":"GomesJ.B. MenasalvasE. and SousaP.A.C. Learning recurring concepts from data streams with a context-aware ensemble In Proceedings of the ACM Symposium on Applied Computing New York NY USA 2011 994\u2013999. ACM.","DOI":"10.1145\/1982185.1982403"},{"key":"e_1_3_2_29_2","unstructured":"HarriesM. Splice-2 comparative evaluation: Electricity pricing Technical Report 9905 School of Computer Science and Engineering University of New South Wales 1999."},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007420529897"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/34.990132"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10618-010-0201-y"},{"key":"e_1_3_2_33_2","unstructured":"KalousisA. Algorithm Selection via Meta-Learning PhD thesis University of Geneva Faculty of Sciences Geneva Switzerland 2002."},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1023\/B:MACH.0000015882.38031.85"},{"key":"e_1_3_2_35_2","unstructured":"KalousisA. and HilarioM. Representational issues in meta-learning in: Proceedings of the Twentieth International Conference on Machine Learning FawcettT. and MishraN. 
eds AAAI Press 2003 pp. 313\u2013320."},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.3233\/IDA-1999-3502"},{"key":"e_1_3_2_37_2","doi-asserted-by":"crossref","unstructured":"KiferD. Ben-DavidS. and GehrkeJ. Detecting change in data streams in: Proceedings of the Thirtieth International Conference on Very Large Data Bases VLDB Endowment 2004 pp. 180\u2013191.","DOI":"10.1016\/B978-012088469-8.50019-X"},{"key":"e_1_3_2_38_2","unstructured":"KlinkenbergR. Meta-learning model selection and example selection in machine learning domains with concept drift in: LWA BauerM. BrandhermB. F\u00fcrnkranzJ. GrieserG. HothoA. JedlitschkaA. and Kr\u00f6nerA. eds DFKI 2005 pp. 164\u2013171."},{"key":"e_1_3_2_39_2","unstructured":"KubaP. BrazdilP. SoaresC. and WoznicaA. Exploiting sampling and meta-learning for parameter setting support vector machines in: Proceedings of the Workshop de Miner\u00eda de Datos Y Aprendizaje of IBERAMIA GarijoM.T.F. and RiquelmeJ. eds Universidad de Sevilla 2002 pp. 217\u2013225."},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2009.09.020"},{"issue":"3","key":"e_1_3_2_41_2","first-page":"18","article-title":"Classification and regression by randomforest","volume":"2","author":"Liaw A.","year":"2002","unstructured":"LiawA. and WienerM., Classification and regression by randomforest, R News 2(3) (2002), 18\u201322.","journal-title":"R News"},{"key":"e_1_3_2_42_2","unstructured":"MeyerD. DimitriadouE. HornikK. WeingesselA. and LeischF. e1071: Misc Functions of the Department of Statistics (e1071) TU Wien 2012. R package version 16-1."},{"key":"e_1_3_2_43_2","unstructured":"MichieD. SpiegelhalterD. and TaylorC. Introduction In MichieD. SpiegelhalterD. and TaylorC. editors Machine Learning Neural and Statistical Classification Ellis Horwood 1994."},{"key":"e_1_3_2_44_2","unstructured":"MilborrowS. Earth: Multivariate Adaptive Regression Spline Models 2012. 
Derived from mda:mars by Trevor Hastie and Rob Tibshirani. R package version 32-3."},{"key":"e_1_3_2_45_2","unstructured":"MoreiraJ.P.C.L.M. Travel time prediction for the planning of mass transit companies: a machine learning approach PhD thesis Faculty of Engineering of University of Porto 2008."},{"key":"e_1_3_2_46_2","doi-asserted-by":"crossref","unstructured":"MusliuN. and SchwengererM. Algorithm selection for the graph coloring problem in: Proceedings of the Learning and Intelligent Optimization Conference NicosiaG. and PardalosP. eds Springer Berlin Heidelberg 2013 pp. 389\u2013403.","DOI":"10.1007\/978-3-642-44973-4_42"},{"key":"e_1_3_2_47_2","unstructured":"PfahringerB. BensusanH. and Giraud-CarrierC.G. Meta-learning by landmarking various learning algorithms in: Proceedings of the Seventeenth International Conference on Machine Learning San Francisco CA USA 2000 pp. 743\u2013750. Morgan Kaufmann Publishers Inc."},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2004.03.008"},{"key":"e_1_3_2_49_2","unstructured":"R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing Vienna Austria 2012. ISBN 3-900051-07-0."},{"key":"e_1_3_2_50_2","doi-asserted-by":"crossref","unstructured":"RaedtL.D. Logical and Relational Learning Cognitive Technologies. Springer-Verlag New York Inc. Secaucus NJ USA 2008.","DOI":"10.1007\/978-3-540-68856-3"},{"key":"e_1_3_2_51_2","unstructured":"RendellL.A. SheshuR. and TchengD.K. Layered concept-learning and dynamically variable bias management in: Proceedings of the International Joint Conference on Artificial Intelligence Morgan Kaufmann 1987 pp. 308\u2013314."},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2013.05.048"},{"key":"e_1_3_2_53_2","doi-asserted-by":"crossref","unstructured":"RossiA.L.D. de CarvalhoA.C.P.L.F. and SoaresC. 
Meta-learning for periodic algorithms selection in time-changing data in: Proceedings of the Brazilian Symposium on Neural Networks IEEE Computer Society 2012 pp. 7\u201312.","DOI":"10.1109\/SBRN.2012.50"},{"key":"e_1_3_2_54_2","doi-asserted-by":"crossref","unstructured":"Sebasti\u00e3oR. RodriguesP. and GamaJ. Change detection in climate data over the iberian peninsula in: IEEE International Conference on Data Mining Workshops 2009 pp. 248\u2013253.","DOI":"10.1109\/ICDMW.2009.27"},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1145\/1456650.1456656"},{"key":"e_1_3_2_56_2","unstructured":"SoaresC. Learning Rankings of Learning Algorithms: Recomendation of Algorithms with Meta-Learning PhD thesis Faculdade de Ci\u00eancias da Universidade do Porto Porto Portugal 2004."},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1023\/B:MACH.0000015879.28004.9b"},{"key":"e_1_3_2_58_2","doi-asserted-by":"crossref","unstructured":"TaoY. and OzsuM.T. Mining data streams with periodically changing distributions in: Proceeding of the 18th ACM Conference on Information and Knowledge Management New York NY USA 2009 pp. 887\u2013896. ACM.","DOI":"10.1145\/1645953.1646065"},{"key":"e_1_3_2_59_2","unstructured":"TherneauT. AtkinsonB. and RipleyB. rpart: Recursive Partitioning 2012. R package version 31-52."},{"key":"e_1_3_2_60_2","doi-asserted-by":"crossref","unstructured":"TodorovskiL. and D\u017eeroskiS. Experiments in meta-level learning with ilp in: Proceedings of the 3rd European Conference on Principles and Practice of Knowledge Discovery in Databases \u017dytkowJ. and RauchJ. eds Springer Berlin Heidelberg 1999 pp. 
98\u2013106.","DOI":"10.1007\/978-3-540-48247-5_11"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1019956318069"},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1145\/2408736.2408746"},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2008.10.017"},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1109\/TGRS.2010.2041784"},{"key":"e_1_3_2_65_2","unstructured":"WestonJ. MukherjeeS. ChapelleO. PontilM. PoggioT. and VapnikV. Feature selection for svms in: Advances in Neural Information Processing Systems 13 Cambridge MA USA 2001 pp. 668\u2013674. MIT Press."},{"key":"e_1_3_2_66_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007365809034"},{"key":"e_1_3_2_67_2","unstructured":"WittenI.H. and FrankE. Data Mining: Practical Machine Learning Tools and Techniques Morgan Kaufmann Publishers Inc. San Francisco CA USA 2005."},{"key":"e_1_3_2_68_2","doi-asserted-by":"publisher","DOI":"10.1109\/4235.585893"}],"container-title":["Intelligent Data Analysis: An International 
Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/IDA-160083","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/IDA-160083","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/IDA-160083","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,6]],"date-time":"2026-01-06T07:13:56Z","timestamp":1767683636000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/IDA-160083"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,7]]},"references-count":67,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2017,7]]}},"alternative-id":["10.3233\/IDA-160083"],"URL":"https:\/\/doi.org\/10.3233\/ida-160083","relation":{},"ISSN":["1088-467X","1571-4128"],"issn-type":[{"type":"print","value":"1088-467X"},{"type":"electronic","value":"1571-4128"}],"subject":[],"published":{"date-parts":[[2017,7]]}}}