{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T15:36:38Z","timestamp":1776785798846,"version":"3.51.2"},"reference-count":66,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2022,1,30]],"date-time":"2022-01-30T00:00:00Z","timestamp":1643500800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100006952","name":"Louisiana Board of Regents","doi-asserted-by":"publisher","award":["LEQSF (2016-19)-RD-B-07"],"award-info":[{"award-number":["LEQSF (2016-19)-RD-B-07"]}],"id":[{"id":"10.13039\/100006952","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>To encourage proper employee scheduling for managing crew load, restaurants need accurate sales forecasting. This paper proposes a case study on many machine learning (ML) models using real-world sales data from a mid-sized restaurant. Trendy recurrent neural network (RNN) models are included for direct comparison to many methods. To test the effects of trend and seasonality, we generate three different datasets to train our models with and to compare our results. To aid in forecasting, we engineer many features and demonstrate good methods to select an optimal sub-set of highly correlated features. We compare the models based on their performance for forecasting time steps of one-day and one-week over a curated test dataset. The best results seen in one-day forecasting come from linear models with a sMAPE of only 19.6%. Two RNN models, LSTM and TFT, and ensemble models also performed well with errors less than 20%. When forecasting one-week, non-RNN models performed poorly, giving results worse than 20% error. RNN models extended better with good sMAPE scores giving 19.5% in the best result. 
The RNN models performed worse overall on datasets with trend and seasonality removed, however many simpler ML models performed well when linearly separating each training instance.<\/jats:p>","DOI":"10.3390\/make4010006","type":"journal-article","created":{"date-parts":[[2022,1,31]],"date-time":"2022-01-31T01:46:08Z","timestamp":1643593568000},"page":"105-130","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":44,"title":["Machine Learning Based Restaurant Sales Forecasting"],"prefix":"10.3390","volume":"4","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7838-6116","authenticated-orcid":false,"given":"Austin","family":"Schmidt","sequence":"first","affiliation":[{"name":"Department of Computer Science, University of New Orleans, New Orleans, LA 70148, USA"}]},{"given":"Md Wasi Ul","family":"Kabir","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of New Orleans, New Orleans, LA 70148, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0110-2194","authenticated-orcid":false,"given":"Md Tamjidul","family":"Hoque","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of New Orleans, New Orleans, LA 70148, USA"}]}],"member":"1968","published-online":{"date-parts":[[2022,1,30]]},"reference":[{"key":"ref_1","unstructured":"Green, Y.N.J. (2001). An Exploratory Investigation of the Sales Forecasting Process in the Casual Theme and Family Dining Segments of Commercial Restaurant Corporations, Virginia Polytechnic Institute and State University."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1016\/0278-4319(92)90006-H","article-title":"A comparison of time series and econometric models for forecasting restaurant sales","volume":"11","author":"Cranage","year":"1992","journal-title":"Int. J. Hosp. Manag."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Lasek, A., Cercone, N., and Saunders, J. 
(2016). Restaurant Sales and Customer Demand Forecasting: Literature Survey and Categorization of Methods, in Smart City 360\u00b0, Springer International Publishing.","DOI":"10.1007\/978-3-319-33681-7_40"},{"key":"ref_4","first-page":"164","article-title":"Approaches, techniques, and information technology systems in the restaurants and foodservice industry: A qualitative study in sales forecasting","volume":"9","author":"Green","year":"2008","journal-title":"Int. J. Hosp. Tour. Adm."},{"key":"ref_5","unstructured":"Lim, B., Arik, S.O., Loeff, N., and Pfister, T. (2019). Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting. arXiv."},{"key":"ref_6","unstructured":"Borovykh, A., Bohte, S., and Oosterlee, C.W. (2018). Conditional Time Series Forecasting with Convolutional Neural Networks. arXiv."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"20200209","DOI":"10.1098\/rsta.2020.0209","article-title":"Time-series forecasting with deep learning: A survey","volume":"379","author":"Lim","year":"2021","journal-title":"Philos. Trans. R. Soc."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Bandara, K., Shi, P., Bergmeir, C., Hewamalage, H., Tran, Q., and Seaman, B. (2019). Sales Demand Forecast in E-commerce Using a Long Short-Term Memory Neural Network Methodology. International Conference on Neural Information Processing, Springer.","DOI":"10.1007\/978-3-030-36718-3_39"},{"key":"ref_9","first-page":"e27712v1","article-title":"Sales forecasting using multivariate long short term memory network models","volume":"7","author":"Helmini","year":"2019","journal-title":"PeerJ PrePrints"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Makridakis, S., Spiliotis, E., and Assimakopoulos, V. (2018). Statistical and Machine Learning forecasting methods: Concerns and ways forward. 
PLoS ONE, 13.","DOI":"10.1371\/journal.pone.0194889"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"16713","DOI":"10.1007\/s00521-021-06266-2","article-title":"Application of deep learning and chaos theory for load forecasting in Greece","volume":"33","author":"Stergiou","year":"2021","journal-title":"Neural Comput. Appl."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"132306","DOI":"10.1016\/j.physd.2019.132306","article-title":"Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network","volume":"404","author":"Sherstinsky","year":"2020","journal-title":"Physica D Nonlinear Phenom."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Graves, A. (2013). Generating Sequences with Recurrent Neural Networks. arXiv.","DOI":"10.1007\/978-3-642-24797-2_3"},{"key":"ref_14","unstructured":"Holmberg, M., and Halld\u00e9n, P. (2018). Machine Learning for Restaurant Sales Forecast, Department of Information Technology, UPPSALA University."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1016\/j.procir.2019.02.042","article-title":"Demand forecasting in restaurants using machine learning and statistical analysis","volume":"79","author":"Tanizaki","year":"2019","journal-title":"Procedia CIRP"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1007\/978-981-15-9651-3_31","article-title":"Machine Learning based Restaurant Revenue Prediction","volume":"53","author":"Rao","year":"2021","journal-title":"Lect. Notes Data Eng. Commun. Technol."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Sakib, S.N. (2022, January 10). Restaurant Sales Prediction Using Machine Learning. 
Available online: https:\/\/engrxiv.org\/preprint\/view\/2073.","DOI":"10.31224\/osf.io\/wa927"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1007\/978-3-319-61845-6_10","article-title":"Food Sales Prediction with Meteorological Data-A Case Study of a Japanese Chain Supermarket","volume":"10387","author":"Liu","year":"2017","journal-title":"Data Min. Big Data"},{"key":"ref_19","unstructured":"Schmidt, A. (2021). Machine Learning based Restaurant Sales Forecasting. Computer Science, University of New Orleans."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"106973","DOI":"10.1016\/j.patcog.2019.106973","article-title":"Learning representations of multivariate time series with missing data","volume":"96","author":"Bianchi","year":"2019","journal-title":"Pattern Recognit."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Allison, P.D. (2001). Missing Data, Sage Publications.","DOI":"10.4135\/9781412985079"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"14889","DOI":"10.1073\/pnas.0701020104","article-title":"On the trend, detrending, and variability of nonlinear and nonstationary time series","volume":"104","author":"Wu","year":"2007","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_23","unstructured":"Hyndman, R.J., and Athanasopoulos, G. (2018). Forecasting: Principles and Practice, OTexts."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1080\/00031305.1975.10479105","article-title":"Ridge regression in practice","volume":"29","author":"Marquardt","year":"1975","journal-title":"Am. Stat."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1214\/aos\/1176344891","article-title":"Adaptive Multivariant Ridge Regression","volume":"8","author":"Brown","year":"1980","journal-title":"Ann. 
Stat."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression Shrinkage and Selection via the Lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. Ser. B"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Bottou, L. (2010). Large-Scale Machine Learning with Stochastic Gradient Descent. COMPSTAT\u20192010, Physica-Verlag HD.","DOI":"10.1007\/978-3-7908-2604-3_16"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1080\/00401706.1970.10488634","article-title":"Ridge Regression: Biased Estimation for Nonorthogonal Problems","volume":"12","author":"Hoerl","year":"1970","journal-title":"Technometrics"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v033.i01","article-title":"Regularization Paths for Generalized Linear Models via Coordinate Descent","volume":"33","author":"Friedman","year":"2010","journal-title":"J. Stat. Softw."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1111\/j.1467-9868.2005.00503.x","article-title":"Regularization and variable selection via the elastic net","volume":"67","author":"Zou","year":"2005","journal-title":"J. R. Stat. Soc. Ser. B"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1162\/neco.1992.4.3.415","article-title":"Bayesian Interpolation","volume":"4","author":"MacKay","year":"1991","journal-title":"Neural Comput."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1080\/01621459.1997.10473615","article-title":"Bayesian Model Averaging for Linear Regressions Models","volume":"92","author":"Raftery","year":"1997","journal-title":"J. Am. Stat. 
Assoc."},{"key":"ref_33","first-page":"1","article-title":"Support vector machines-kernels and the kernel trick","volume":"Volume 26","author":"Hofmann","year":"2006","journal-title":"Notes"},{"key":"ref_34","unstructured":"Welling, M. (2013). Kernel ridge Regression. Max Welling\u2019s Class Lecture Notes in Machine Learning, University of Toronto."},{"key":"ref_35","unstructured":"Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning Data Mining, Inference, and Prediction, Springer Science & Business Media."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1002\/widm.8","article-title":"Classification and Regression Trees","volume":"1","author":"Loh","year":"2011","journal-title":"Wiley Interdiscip. Rev. Data Min. Knowl. Discov."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, January 13\u201317). Xgboost: A scalable tree boosting system. Proceedings of the 22nd Acm Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939785"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1023\/B:STCO.0000035301.49549.88","article-title":"A tutorial on support vector regression","volume":"14","author":"Smola","year":"2004","journal-title":"Stat. Comput."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1207","DOI":"10.1162\/089976600300015565","article-title":"New Support Vector Algorithms","volume":"12","author":"Smola","year":"2000","journal-title":"Neural Comput."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1961189.1961199","article-title":"LIBSVM: A Library for Support Vector Machines","volume":"2","author":"Chang","year":"2011","journal-title":"ACM Trans. Intell. Syst. 
Technol."},{"key":"ref_41","first-page":"3323","article-title":"Large-scale Linear Support Vector Regression","volume":"13","author":"Ho","year":"2012","journal-title":"J. Mach. Learn. Res."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1109\/TIT.1967.1053964","article-title":"Nearest Neighbor Pattern Classification","volume":"13","author":"Cover","year":"1967","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_43","first-page":"513","article-title":"Neighbourhood Components Analysis","volume":"17","author":"Goldberger","year":"2004","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s40064-016-2941-7","article-title":"The distance function effect on k-nearest neighbor classification for medical datasets","volume":"5","author":"Hu","year":"2016","journal-title":"SpringerPlus"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Rasmussen, C.E. (2003). Gaussian Processes for Machine Learning, Springer. Summer school on machine learning.","DOI":"10.1007\/978-3-540-28650-9_4"},{"key":"ref_46","unstructured":"Duvenaud, D. (2014). Automatic Model Construction with Gaussian Processes, University of Cambridge."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1007\/BF00117832","article-title":"Stacked Regressions","volume":"24","author":"Breiman","year":"1996","journal-title":"Mach. 
Learn."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"3289","DOI":"10.1093\/bioinformatics\/bty352","article-title":"PBRpredict-Suite: A Suite of Models to Predict Peptide Recognition Domain Residues from Protein Sequence","volume":"34","author":"Iqbal","year":"2018","journal-title":"Bioinformatics"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"107857","DOI":"10.1016\/j.carres.2019.107857","article-title":"StackCBPred: A Stacking based Prediction of Protein-Carbohydrate Binding Sites from Sequence","volume":"486","author":"Gattani","year":"2019","journal-title":"Carbohydr. Res."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"433","DOI":"10.1093\/bioinformatics\/bty653","article-title":"StackDPPred: A Stacking based Prediction of DNA-binding Protein from Sequence","volume":"35","author":"Mishra","year":"2019","journal-title":"Bioinformatics"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1016\/S0893-6080(05)80023-1","article-title":"Stacked Generalization","volume":"5","author":"Wolpert","year":"1992","journal-title":"Neural Netw."},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1214\/aos\/1016218223","article-title":"Additive Logistic Regression: A Statistical View of Boosting","volume":"28","author":"Friedman","year":"2000","journal-title":"Ann. Stat."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/s10994-006-6226-1","article-title":"Extremely randomized trees","volume":"63","author":"Geurts","year":"2006","journal-title":"Mach. Learn."},{"key":"ref_54","first-page":"3146","article-title":"LightGBM: A Highly Efficient Gradient Boosting Decision Tree","volume":"30","author":"Ke","year":"2017","journal-title":"Adv. Neural Inf. Processing Syst."},{"key":"ref_55","unstructured":"Anderson, J.A. (1993). An Introduction to Neural Networks, MIT Press."},{"key":"ref_56","unstructured":"Bishop, C.M. (2006). 
Pattern Recognition and Machine Learning, Springer."},{"key":"ref_57","unstructured":"Hecht-Nielsen, R. (1992). Theory of the Backpropagation Neural Network in Neural Networks for Perception, Academic Press."},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Medsker, L., and Jain, L.C. (1999). Recurrent Neural Networks: Design and Applications, CRC Press LLC.","DOI":"10.1201\/9781420049176"},{"key":"ref_59","unstructured":"Pascanu, R., Mikolov, T., and Bengio, Y. (2013, January 26). On the difficulty of training recurrent neural networks. Proceedings of the 30th International Conference on Machine Learning, PMLR, Atlanta, GA, USA."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long Short-Term Memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Cho, K., Van Merri\u00ebnboer, B., Bahdanau, D., and Bengio, Y. (2014). On the Properties of Neural Machine Translation: Encoder\u2013Decoder Approaches. arXiv.","DOI":"10.3115\/v1\/W14-4012"},{"key":"ref_62","unstructured":"Chung, J., Gulcehre, C., Cho, K., and Bengio, Y. (2014). Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. arXiv."},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1002\/widm.1157","article-title":"A survey on multi-output regression","volume":"5","author":"Borchani","year":"2015","journal-title":"Wiley Interdiscip. Rev. Data Min. Knowl. Discov."},{"key":"ref_64","doi-asserted-by":"crossref","first-page":"1997","DOI":"10.1007\/s10994-020-05910-7","article-title":"Evaluating time series forecasting models: An empirical study on performance estimation methods","volume":"109","author":"Cerqueira","year":"2020","journal-title":"Mach. 
Learn."},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Cho, K., Van Merri\u00ebnboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., and Bengio, Y. (2014). Learning Phrase Representations using RNN Encoder\u2013Decoder for Statistical Machine Translation. arXiv.","DOI":"10.3115\/v1\/D14-1179"},{"key":"ref_66","unstructured":"Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning, MIT Press."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/4\/1\/6\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:11:30Z","timestamp":1760134290000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/4\/1\/6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,30]]},"references-count":66,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,3]]}},"alternative-id":["make4010006"],"URL":"https:\/\/doi.org\/10.3390\/make4010006","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,1,30]]}}}