{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T22:01:52Z","timestamp":1773093712672,"version":"3.50.1"},"reference-count":69,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2017,10,4]],"date-time":"2017-10-04T00:00:00Z","timestamp":1507075200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>Time series forecasting using machine learning algorithms has gained popularity recently. Random forest is a machine learning algorithm implemented in time series forecasting; however, most of its forecasting properties have remained unexplored. Here we focus on assessing the performance of random forests in one-step forecasting using two large datasets of short time series with the aim of suggesting an optimal set of predictor variables. Furthermore, we compare its performance to benchmarking methods. The first dataset is composed of 16,000 simulated time series from a variety of Autoregressive Fractionally Integrated Moving Average (ARFIMA) models. The second dataset consists of 135 mean annual temperature time series. The highest predictive performance of RF is observed when using a low number of recent lagged predictor variables. 
This outcome could be useful in relevant future applications, with the prospect to achieve higher predictive accuracy.<\/jats:p>","DOI":"10.3390\/a10040114","type":"journal-article","created":{"date-parts":[[2017,10,4]],"date-time":"2017-10-04T13:53:55Z","timestamp":1507125235000},"page":"114","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":166,"title":["Variable Selection in Time Series Forecasting Using Random Forests"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8932-4997","authenticated-orcid":false,"given":"Hristos","family":"Tyralis","sequence":"first","affiliation":[{"name":"Department of Water Resources and Environmental Engineering, School of Civil Engineering, National Technical University of Athens, Iroon Polytechniou 5, 157 80 Zografou, Greece"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5446-954X","authenticated-orcid":false,"given":"Georgia","family":"Papacharalampous","sequence":"additional","affiliation":[{"name":"Department of Water Resources and Environmental Engineering, School of Civil Engineering, National Technical University of Athens, Iroon Polytechniou 5, 157 80 Zografou, Greece"}]}],"member":"1968","published-online":{"date-parts":[[2017,10,4]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1214\/10-STS330","article-title":"To explain or to predict?","volume":"25","author":"Shmueli","year":"2010","journal-title":"Stat. 
Sci."},{"key":"ref_2","first-page":"62","article-title":"Machine learning strategies for time series forecasting","volume":"Volume 138","author":"Aufaure","year":"2013","journal-title":"Business Intelligence (Lecture Notes in Business Information Processing)"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/j.ijforecast.2006.01.001","article-title":"25 years of time series forecasting","volume":"22","author":"Hyndman","year":"2006","journal-title":"Int. J. Forecast."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1150","DOI":"10.1057\/palgrave.jors.2602597","article-title":"Forecasting and operational research: A review","volume":"59","author":"Fildes","year":"2008","journal-title":"J. Oper. Res. Soc."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1030","DOI":"10.1016\/j.ijforecast.2014.08.008","article-title":"Electricity price forecasting: A review of the state-of-the-art with a look into the future","volume":"30","author":"Weron","year":"2014","journal-title":"Int. J. Forecast."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"914","DOI":"10.1016\/j.ijforecast.2015.11.011","article-title":"Probabilistic electric load forecasting: A tutorial review","volume":"32","author":"Hong","year":"2016","journal-title":"Int. J. Forecast."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"7067","DOI":"10.1016\/j.eswa.2012.01.039","article-title":"A review and comparison of strategies for multi-step ahead time series forecasting based on the NN5 forecasting competition","volume":"39","author":"Taieb","year":"2012","journal-title":"Expert Syst. Appl."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"454","DOI":"10.1088\/1009-1963\/13\/4\/007","article-title":"Chaotic time series prediction using least squares support vector machines","volume":"13","year":"2004","journal-title":"Chin. 
Phys."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1111\/1467-9876.00109","article-title":"Time series forecasting with neural networks: A comparative study using the air line data","volume":"47","author":"Faraway","year":"1998","journal-title":"J. R. Stat. Soc. C Appl. Stat."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1179","DOI":"10.1016\/j.ymssp.2007.11.012","article-title":"Machine condition prognosis based on regression trees and one-step-ahead prediction","volume":"22","author":"Yang","year":"2008","journal-title":"Mech. Syst. Signal Process."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1016\/S0169-2070(03)00004-9","article-title":"Combining time series models for forecasting","volume":"20","author":"Zou","year":"2004","journal-title":"Int. J. Forecast."},{"key":"ref_12","unstructured":"Papacharalampous, G.A., Tyralis, H., and Koutsoyiannis, D. (2017, January 5\u20139). Forecasting of geophysical processes using stochastic and machine learning algorithms. Proceedings of the 10th World Congress of EWRA on Water Resources and Environment \u201cPanta Rhei\u201d, Athens, Greece."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"490","DOI":"10.1016\/j.jempfin.2004.03.001","article-title":"STAR and ANN models: Forecasting performance on the Spanish \u201cIbex-35\u201d stock index","volume":"12","author":"Torra","year":"2005","journal-title":"J. Empir. Financ."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"2664","DOI":"10.1016\/j.asoc.2010.10.015","article-title":"A novel hybridization of artificial neural networks and ARIMA models for time series forecasting","volume":"11","author":"Khashei","year":"2011","journal-title":"Appl. 
Soft Comput."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1028","DOI":"10.1109\/TNNLS.2012.2198074","article-title":"Toward automatic time-series forecasting using neural networks","volume":"23","author":"Yan","year":"2012","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1016\/j.asoc.2014.05.028","article-title":"A moving-average filter based hybrid ARIMA\u2013ANN model for forecasting time series data","volume":"23","author":"Babu","year":"2014","journal-title":"Appl. Soft Comput."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"164","DOI":"10.1016\/j.eswa.2017.04.013","article-title":"Random forests-based extreme learning machine ensemble for multi-regime time series prediction","volume":"85","author":"Lin","year":"2017","journal-title":"Expert Syst. Appl."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random Forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1716","DOI":"10.1214\/15-AOS1321","article-title":"Consistency of random forests","volume":"43","author":"Scornet","year":"2015","journal-title":"Ann. Stat."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1007\/s11749-016-0481-7","article-title":"A random forest guided tour","volume":"25","author":"Biau","year":"2016","journal-title":"Test"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning, Springer. 
[2nd ed.].","DOI":"10.1007\/978-0-387-84858-7"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"330","DOI":"10.1016\/j.patcog.2010.08.011","article-title":"Mining data with random forests: A survey and results of new tests","volume":"44","author":"Verikas","year":"2011","journal-title":"Pattern Recognit."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1016\/j.jhydrol.2010.04.005","article-title":"Predictive models for forecasting hourly urban water demand","volume":"387","author":"Herrera","year":"2010","journal-title":"J. Hydrol."},{"key":"ref_24","first-page":"821","article-title":"Short-term load forecasting using random forests","volume":"Volume 323","author":"Filev","year":"2015","journal-title":"Proceedings of the 7th IEEE International Conference Intelligent Systems IS\u20192014 (Advances in Intelligent Systems and Computing)"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"915053","DOI":"10.1155\/2012\/915053","article-title":"Statistical uncertainty estimation using random forests and its application to drought forecast","volume":"2012","author":"Chen","year":"2012","journal-title":"Math. Probl. Eng."},{"key":"ref_26","first-page":"10109","article-title":"Forecasting of monthly temperature variations using random forests","volume":"10","author":"Naing","year":"2015","journal-title":"ARPN J. Eng. Appl. Sci."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Nguyen, T.T., Huu, Q.N., and Li, M.J. (2015, January 8\u201310). Forecasting time series water levels on Mekong river using machine learning models. Proceedings of the 2015 Seventh International Conference on Knowledge and Systems Engineering (KSE), Ho Chi Minh City, Vietnam.","DOI":"10.1109\/KSE.2015.53"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Kumar, M., and Thenmozhi, M. (2006). Forecasting stock index movement: A comparison of support vector machines and random forest. 
Indian Institute of Capital Markets 9th Capital Markets Conference Paper, Indian Institute of Capital Markets.","DOI":"10.2139\/ssrn.876544"},{"key":"ref_29","first-page":"284","article-title":"Forecasting stock index returns using ARIMA-SVM, ARIMA-ANN, and ARIMA-random forest hybrid models","volume":"5","author":"Kumar","year":"2014","journal-title":"Int. J. Bank. Acc. Financ."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Kane, M.J., Price, N., Scotch, M., and Rabinowitz, P. (2014). Comparison of ARIMA and Random Forest time series models for prediction of avian influenza H5N1 outbreaks. BMC Bioinform.","DOI":"10.1186\/1471-2105-15-276"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"2225","DOI":"10.1016\/j.patrec.2010.03.014","article-title":"Variable selection using random forests","volume":"31","author":"Genuer","year":"2010","journal-title":"Pattern Recognit. Lett."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Perner, P. (2012). How many trees in a random forest?. Machine Learning and Data Mining in Pattern Recognition (Lecture Notes in Computer Science), Springer.","DOI":"10.1007\/978-3-642-31537-4"},{"key":"ref_33","unstructured":"Probst, P., and Boulesteix, A.L. (2017). To tune or not to tune the number of trees in random forest?. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Kuhn, M., and Johnson, K. (2013). Applied Predictive Modeling, Springer.","DOI":"10.1007\/978-1-4614-6849-3"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"D\u00edaz-Uriarte, R., and De Andres, S.A. (2006). Gene selection and classification of microarray data using random forest. BMC Bioinform., 7.","DOI":"10.1186\/1471-2105-7-3"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"489","DOI":"10.1016\/0169-2070(87)90045-8","article-title":"Confidence intervals: An empirical investigation of the series in the M-competition","volume":"3","author":"Makridakis","year":"1987","journal-title":"Int. 
J. Forecast."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"451","DOI":"10.1016\/S0169-2070(00)00057-1","article-title":"The M3-Competition: Results, conclusions and implications","volume":"16","author":"Makridakis","year":"2000","journal-title":"Int. J. Forecast."},{"key":"ref_38","unstructured":"Pritzsche, U. (2015). Benchmarking of classical and machine-learning algorithms (with special emphasis on bagging and boosting approaches) for time series forecasting. [Master\u2019s Thesis, Ludwig-Maximilians-Universit\u00e4t M\u00fcnchen]."},{"key":"ref_39","unstructured":"Bagnall, A., and Cawley, G.C. (2017). On the use of default parameter settings in the empirical evaluation of classification algorithms. arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Salles, R., Assis, L., Guedes, G., Bezerra, E., Porto, F., and Ogasawara, E. (2017, January 14\u201319). A framework for benchmarking machine learning methods using linear models for univariate time series prediction. Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA.","DOI":"10.1109\/IJCNN.2017.7966139"},{"key":"ref_41","unstructured":"Bontempi, G. (2017, September 25). Machine Learning Strategies for Time Series Prediction. Available online: https:\/\/pdfs.semanticscholar.org\/f8ad\/a97c142b0a2b1bfe20d8317ef58527ee329a.pdf."},{"key":"ref_42","unstructured":"McShane, B.B. (2010). Machine Learning Methods with Time Series Dependence. [Ph.D. Thesis, University of Pennsylvania]."},{"key":"ref_43","unstructured":"Bagnall, A., Bostrom, A., Large, J., and Lines, J. (2017). Simulated data experiments for time series classification part 1: Accuracy comparison with default settings. arXiv."},{"key":"ref_44","first-page":"91","article-title":"Some recent advances in forecasting and control","volume":"17","author":"Box","year":"1968","journal-title":"J. R. Stat. Soc. C Appl. Stat."},{"key":"ref_45","unstructured":"Wei, W.W.S. (2006). 
Time Series Analysis, Univariate and Multivariate Methods, Pearson Addison Wesley. [2nd ed.]."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1016\/S0169-7439(03)00111-4","article-title":"Using support vector machines for time series prediction","volume":"69","author":"Thissen","year":"2003","journal-title":"Chemom. Intell. Lab."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"1183","DOI":"10.1016\/S0305-0548(00)00033-2","article-title":"An investigation of neural networks for linear time-series forecasting","volume":"28","author":"Zhang","year":"2001","journal-title":"Comput. Oper. Res."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Lawrimore, J.H., Menne, M.J., Gleason, B.E., Williams, C.N., Wuertz, D.B., Vose, R.S., and Rennie, J. (2011). An overview of the Global Historical Climatology Network monthly mean temperature data set, version 3. J. Geophys. Res., 116.","DOI":"10.1029\/2011JD016187"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"521","DOI":"10.1016\/S0169-2070(00)00066-2","article-title":"The theta model: A decomposition approach to forecasting","volume":"16","author":"Assimakopoulos","year":"2000","journal-title":"Int. J. Forecast."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Kuhn, M. (2008). Building predictive models in R using the caret package. J. Stat. Softw., 28.","DOI":"10.18637\/jss.v028.i05"},{"key":"ref_51","unstructured":"Kuhn, M., Wing, J., Weston, S., Williams, A., Keefer, C., Engelhardt, A., Cooper, T., Mayer, Z., Kenkel, B., and The R Core Team (2017, September 07). Available online: https:\/\/cran.r-project.org\/web\/packages\/caret\/index.html."},{"key":"ref_52","unstructured":"The R Core Team (2017). 
R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.1467-9574.1966.tb00488.x","article-title":"Underlining random variables","volume":"20","author":"Hemelrijk","year":"1966","journal-title":"Stat. Neerl."},{"key":"ref_54","unstructured":"Fraley, C., Leisch, F., Maechler, M., Reisen, V., and Lemonte, A. (2012, December 02). Fracdiff: Fractionally Differenced ARIMA aka ARFIMA(p,d,q) Models, Available online: https:\/\/rdrr.io\/cran\/fracdiff\/."},{"key":"ref_55","unstructured":"Hyndman, R.J., O\u2019Hara-Wild, M., Bergmeir, C., Razbash, S., and Wang, E. (2017, September 25). Forecast: Forecasting Functions for Time Series and Linear Models, Available online: https:\/\/rdrr.io\/cran\/forecast\/."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Hyndman, R.J., and Khandakar, Y. (2008). Automatic time series forecasting: The forecast package for R. J. Stat. Softw., 27.","DOI":"10.18637\/jss.v027.i03"},{"key":"ref_57","unstructured":"Hyndman, R.J., and Athanasopoulos, G. (2017, September 25). Available online: http:\/\/otexts.org\/fpp\/."},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1016\/S0169-2070(01)00143-1","article-title":"Unmasking the Theta method","volume":"19","author":"Hyndman","year":"2003","journal-title":"Int. J. Forecast."},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008). 
Forecasting with Exponential Smoothing: The State Space Approach, Springer.","DOI":"10.1007\/978-3-540-71918-2"},{"key":"ref_60","first-page":"18","article-title":"Classification and Regression by randomForest","volume":"2","author":"Liaw","year":"2002","journal-title":"R News"},{"key":"ref_61","first-page":"572","article-title":"Data mining with neural networks and support vector machines using the R\/rminer tool","volume":"Volume 6171","author":"Perner","year":"2010","journal-title":"Advances in Data Mining. Applications and Theoretical Aspects (Lecture Notes in Artificial Intelligence)"},{"key":"ref_62","unstructured":"Cortez, P. (2016, September 02). Rminer: Data Mining Classification and Regression Methods, Available online: https:\/\/rdrr.io\/cran\/rminer\/."},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1016\/j.ijforecast.2006.03.001","article-title":"Another look at measures of forecast accuracy","volume":"22","author":"Hyndman","year":"2006","journal-title":"Int. J. Forecast."},{"key":"ref_64","doi-asserted-by":"crossref","first-page":"1316","DOI":"10.1021\/acs.jcim.5b00206","article-title":"Beware of R2: Simple, unambiguous assessment of the prediction accuracy of QSAR and QSPR models","volume":"55","author":"Alexander","year":"2015","journal-title":"J. Chem. Inf. Model."},{"key":"ref_65","doi-asserted-by":"crossref","first-page":"1127","DOI":"10.1021\/acs.jcim.6b00088","article-title":"A historical excursus on the statistical validation parameters for QSAR models: A clarification concerning metrics and terminology","volume":"56","author":"Gramatica","year":"2016","journal-title":"J. Chem. Inf. Model."},{"key":"ref_66","unstructured":"Warnes, G.R., Bolker, B., Gorjanc, G., Grothendieck, G., Korosec, A., Lumley, T., MacQueen, D., Magnusson, A., and Rogers, J. (2017, June 06). 
Gdata: Various R Programming Tools for Data Manipulation, Available online: https:\/\/cran.r-project.org\/web\/packages\/gdata\/index.html."},{"key":"ref_67","unstructured":"Wickham, H. (2016). Ggplot2: Elegant Graphics for Data Analysis, Springer International Publishing. [2nd ed.]."},{"key":"ref_68","unstructured":"Wickham, H., Hester, J., Francois, R., Jyl\u00e4nki, J., and J\u00f8rgensen, M. (2017). Readr: Read Rectangular Text Data, Available online: https:\/\/cran.r-project.org\/web\/packages\/readr\/index.html."},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Wickham, H. (2007). Reshaping data with the reshape package. J. Stat. Softw., 21.","DOI":"10.18637\/jss.v021.i12"}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/10\/4\/114\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T18:46:35Z","timestamp":1760208395000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/10\/4\/114"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,10,4]]},"references-count":69,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2017,12]]}},"alternative-id":["a10040114"],"URL":"https:\/\/doi.org\/10.3390\/a10040114","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,10,4]]}}}