{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,25]],"date-time":"2026-02-25T07:43:27Z","timestamp":1772005407371,"version":"3.50.1"},"reference-count":26,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2024,7,28]],"date-time":"2024-07-28T00:00:00Z","timestamp":1722124800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>The field of sports analytics has grown rapidly, with a primary focus on performance forecasting, enhancing the understanding of player capabilities, and indirectly benefiting team strategies and player development. This work aims to forecast and comparatively evaluate players\u2019 goal-scoring likelihood in four elite football leagues (Premier League, Bundesliga, La Liga, and Serie A) by mining advanced statistics from 2017 to 2023. Six types of machine learning (ML) models were developed and tested individually through experiments on the comprehensive datasets collected for these leagues. We also tested the upper 30th percentile of the best-performing players based on their performance in the last season, with varied features evaluated to enhance prediction accuracy in distinct scenarios. The results offer insights into the forecasting abilities of those leagues, identifying the best forecasting methodologies and the factors that most significantly contribute to the prediction of players\u2019 goal-scoring. XGBoost consistently outperformed other models in most experiments, yielding the most accurate results and leading to a well-generalized model. Notably, when applied to Serie A, it achieved a mean absolute error (MAE) of 1.29. This study provides insights into ML-based performance prediction, advancing the field of player performance forecasting.<\/jats:p>","DOI":"10.3390\/make6030086","type":"journal-article","created":{"date-parts":[[2024,8,1]],"date-time":"2024-08-01T18:04:59Z","timestamp":1722535499000},"page":"1762-1781","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Diverse Machine Learning for Forecasting Goal-Scoring Likelihood in Elite Football Leagues"],"prefix":"10.3390","volume":"6","author":[{"given":"Christina","family":"Markopoulou","sequence":"first","affiliation":[{"name":"School of Science and Technology, International Hellenic University, 57001 Thessaloniki, Greece"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9361-8621","authenticated-orcid":false,"given":"George","family":"Papageorgiou","sequence":"additional","affiliation":[{"name":"School of Science and Technology, International Hellenic University, 57001 Thessaloniki, Greece"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8263-9024","authenticated-orcid":false,"given":"Christos","family":"Tjortjis","sequence":"additional","affiliation":[{"name":"School of Science and Technology, International Hellenic University, 57001 Thessaloniki, Greece"}]}],"member":"1968","published-online":{"date-parts":[[2024,7,28]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1007\/s41060-017-0093-7","article-title":"Sports Analytics and the Big-Data Era","volume":"5","author":"Morgulev","year":"2018","journal-title":"Int. J. Data Sci. Anal."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"4333","DOI":"10.1007\/s10115-024-02092-9","article-title":"Evaluating the Effectiveness of Machine Learning Models for Performance Forecasting in Basketball: A Comparative Study","volume":"66","author":"Papageorgiou","year":"2024","journal-title":"Knowl. Inf. Syst."},{"key":"ref_3","first-page":"9","article-title":"Application of Machine Learning Approaches in Intrusion Detection System: A Survey","volume":"4","author":"Haq","year":"2015","journal-title":"Int. J. Adv. Res. Artif. Intell."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Papageorgiou, G., Sarlis, V., and Tjortjis, C. (2024). An Innovative Method for Accurate NBA Player Performance Forecasting and Line-up Optimization in Daily Fantasy Sports. Int. J. Data Sci. Anal.","DOI":"10.1007\/s41060-024-00523-y"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Pantzalis, V.C., and Tjortjis, C. (2020, January 15\u201317). Sports Analytics for Football League Table and Player Performance Prediction. Proceedings of the 2020 11th International Conference on Information, Intelligence, Systems and Applications, Piraeus, Greece.","DOI":"10.1109\/IISA50023.2020.9284352"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Zeng, Z., and Pan, B. (2021, January 28\u201329). A Machine Learning Model to Predict Player\u2019s Positions Based on Performance. Proceedings of the 9th International Conference on Sport Sciences Research and Technology Support, Online.","DOI":"10.5220\/0010653300003059"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1044","DOI":"10.1016\/j.jsams.2020.04.021","article-title":"Using Machine Learning to Improve Our Understanding of Injury Risk and Prediction in Elite Male Youth Football Players","volume":"23","author":"Oliver","year":"2020","journal-title":"J. Sci. Med. Sport"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Martins, F., Przednowek, K., Fran\u00e7a, C., Lopes, H., de Maio Nascimento, M., Sarmento, H., Marques, A., Ihle, A., Henriques, R., and Gouveia, \u00c9.R. (2022). Predictive Modeling of Injury Risk Based on Body Composition and Selected Physical Fitness Tests for Elite Football Players. J. Clin. Med., 11.","DOI":"10.3390\/jcm11164923"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1186\/s40798-022-00465-4","article-title":"Machine Learning for Understanding and Predicting Injuries in Football","volume":"8","author":"Majumdar","year":"2022","journal-title":"Sports Med. Open"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Pariath, R., Shah, S., Surve, A., and Mittal, J. (2018, January 29\u201331). Player Performance Prediction in Football Game. Proceedings of the 2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India.","DOI":"10.1109\/ICECA.2018.8474750"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"741","DOI":"10.1016\/j.ijforecast.2018.01.003","article-title":"Predictive Analysis and Modelling Football Results Using Machine Learning Approach for English Premier League","volume":"35","author":"Baboota","year":"2019","journal-title":"Int. J. Forecast."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"St\u00fcbinger, J., Mangold, B., and Knoll, J. (2019). Machine Learning in Football Betting: Prediction of Match Results Based on Player Characteristics. Appl. Sci., 10.","DOI":"10.3390\/app10010046"},{"key":"ref_13","unstructured":"Sports Reference (2023, September 01). Sports Reference. Available online: https:\/\/www.sports-reference.com."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Chatzilygeroudis, K., Hatzilygeroudis, I., and Perikos, I. (2021). Machine Learning Basics. Intelligent Computing for Interactive System Design, ACM.","DOI":"10.1145\/3447404.3447414"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Poole, M.A., and O\u2019farrell, P.N. (1971). The Assumptions of the Linear Regression Model. Trans. Inst. Br. Geogr., 145\u2013158.","DOI":"10.2307\/621706"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1002\/wics.14","article-title":"Ridge Regression","volume":"1","author":"McDonald","year":"2009","journal-title":"WIREs Comput. Stat."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Parmar, A., Katariya, R., and Patel, V. (2018, January 7\u20138). A Review on Random Forest: An Ensemble Classifier. Proceedings of the International conference on intelligent data communication technologies and internet of things (ICICI), Coimbatore, India.","DOI":"10.1007\/978-3-030-03146-6_86"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1937","DOI":"10.1007\/s10462-020-09896-5","article-title":"A Comparative Analysis of Gradient Boosting Algorithms","volume":"54","year":"2021","journal-title":"Artif. Intell. Rev."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"109564","DOI":"10.1016\/j.enbuild.2019.109564","article-title":"Developing Window Behavior Models for Residential Buildings Using XGBoost Algorithm","volume":"205","author":"Mo","year":"2019","journal-title":"Energy Build."},{"key":"ref_20","first-page":"440","article-title":"Forecasting of Stock Prices Using Multi Layer Perceptron","volume":"2","author":"Devadoss","year":"2013","journal-title":"Int. J. Comput. Algorithm"},{"key":"ref_21","unstructured":"Huang, Q., Mao, J., and Liu, Y. (2012, January 9\u201311). An Improved Grid Search Algorithm of SVR Parameters Optimization. Proceedings of the 2012 IEEE 14th International Conference on Communication Technology, Chengdu, China."},{"key":"ref_22","unstructured":"Maglogiannis, I., Iliadis, L., MacIntyre, J., and Dominguez, M. (2023, January 14\u201317). Forecasting Goal Performance for Top League Football Players: A Comparative Study. Proceedings of the Artificial Intelligence Applications and Innovations, Le\u00f3n, Spain."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1177\/1941738115616917","article-title":"Wearable Performance Devices in Sports Medicine","volume":"8","author":"Li","year":"2016","journal-title":"Sports Health Multidiscip. Approach"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1177\/21674795221141328","article-title":"It\u2019s Okay to Be Not Okay: An Analysis of Twitter Responses to Naomi Osaka\u2019s Withdrawal Due to Mental Health Concerns","volume":"11","author":"Chen","year":"2023","journal-title":"Commun. Sport"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Dreyer, F., Greif, J., G\u00fcnther, K., Spiliopoulou, M., and Niemann, U. (2022, January 10\u201312). Data-Driven Prediction of Athletes\u2019 Performance Based on Their Social Media Presence. Proceedings of the Discovery Science (DS), Montpellier, France.","DOI":"10.1007\/978-3-031-18840-4_15"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1080\/24733938.2022.2095006","article-title":"Forecasting Football Injuries by Combining Screening, Monitoring and Machine Learning","volume":"7","author":"Hecksteden","year":"2023","journal-title":"Sci. Med. Footb."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/6\/3\/86\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T15:25:15Z","timestamp":1760109915000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/6\/3\/86"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,28]]},"references-count":26,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2024,9]]}},"alternative-id":["make6030086"],"URL":"https:\/\/doi.org\/10.3390\/make6030086","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,28]]}}}