{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,6]],"date-time":"2026-03-06T04:18:53Z","timestamp":1772770733927,"version":"3.50.1"},"reference-count":47,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2022,7,13]],"date-time":"2022-07-13T00:00:00Z","timestamp":1657670400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Due to a colossal soccer market, soccer analysis has attracted considerable attention from industry and academia. In-game outcome prediction has great potential in various applications such as game broadcasting, tactical decision making, and betting. In some sports, the method of directly predicting in-game outcomes based on the ongoing game state is already being used as a statistical tool. However, soccer is a sport with low-scoring games and frequent draws, which makes in-game prediction challenging. Most existing studies focus on pre-game prediction instead. This paper, however, proposes a two-stage method for soccer in-game outcome prediction, namely in-game outcome prediction (IGSOP). When the full length of a soccer game is divided into sufficiently small time frames, the goal scored by each team in each time frame can be modeled as a random variable following the Bernoulli distribution. In the first stage, IGSOP adopts state-based machine learning to predict the probability of a scoring goal in each future time frame. In the second stage, IGSOP simulates the remainder of the game to estimate the outcome of a game. This two-stage approach effectively captures the dynamic situation after a goal and the uncertainty in the late phase of a game. 
Chinese Super League data have been used for algorithm training and evaluation, and the results demonstrate that IGSOP outperforms existing methods, especially in predicting draws and prediction during final moments of games. IGSOP provides a novel perspective to solve the problem of in-game outcome prediction in soccer, which has a potential ripple effect on related research.<\/jats:p>","DOI":"10.3390\/e24070971","type":"journal-article","created":{"date-parts":[[2022,7,13]],"date-time":"2022-07-13T22:06:00Z","timestamp":1657749960000},"page":"971","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Goal or Miss? A Bernoulli Distribution for In-Game Outcome Prediction in Soccer"],"prefix":"10.3390","volume":"24","author":[{"given":"Wendi","family":"Yao","sequence":"first","affiliation":[{"name":"Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Shanghai Institute of Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China"}]},{"given":"Yifan","family":"Wang","sequence":"additional","affiliation":[{"name":"Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Shanghai Institute of Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China"}]},{"given":"Mengyao","family":"Zhu","sequence":"additional","affiliation":[{"name":"Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Shanghai Institute of Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China"}]},{"given":"Yixin","family":"Cao","sequence":"additional","affiliation":[{"name":"School of Computer Science, Fudan University, Shanghai 200433, China"}]},{"given":"Dan","family":"Zeng","sequence":"additional","affiliation":[{"name":"Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Shanghai Institute of Advanced Communication and Data Science, Shanghai University, Shanghai 200444, 
China"}]}],"member":"1968","published-online":{"date-parts":[[2022,7,13]]},"reference":[{"key":"ref_1","unstructured":"(2022, May 25). Sports Industry Statistic and Market Size Overview, Business and Industry Statistics. Available online: https:\/\/www.plunkettresearch.com\/statistics\/Industry-Statistics-Sports-Industry-Statistic-and-Market-Size-Overview."},{"key":"ref_2","first-page":"1","article-title":"Learning basketball dribbling skills using trajectory optimization and deep reinforcement learning","volume":"37","author":"Liu","year":"2018","journal-title":"ACM Trans. Graph. (TOG)"},{"key":"ref_3","first-page":"15","article-title":"Predicting Attendance at Major League Soccer Matches: A Comparison of Four Techniques","volume":"6","author":"King","year":"2018","journal-title":"J. Comput. Sci. Inf. Technol."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1007\/s00521-015-2056-z","article-title":"Neural network models for group behavior prediction: A case of soccer match attendance","volume":"28","author":"Strnad","year":"2017","journal-title":"Neural Comput. Appl."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"115912","DOI":"10.1016\/j.eswa.2021.115912","article-title":"Customized prediction of attendance to soccer matches based on symbolic regression and genetic programming","volume":"187","author":"Yamashita","year":"2022","journal-title":"Expert Syst. 
Appl."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"6","DOI":"10.2165\/00007256-198401010-00002","article-title":"The Predictability of Sports Injuries","volume":"1","author":"Lysens","year":"1984","journal-title":"Sports Med."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"2325967120953404","DOI":"10.1177\/2325967120953404","article-title":"Machine Learning Outperforms Logistic Regression Analysis to Predict Next-Season NHL Player Injury: An Analysis of 2322 Players From 2007 to 2017","volume":"8","author":"Luu","year":"2020","journal-title":"Orthop. J. Sports Med."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1464","DOI":"10.1177\/0363546514529083","article-title":"Major and minor League baseball hamstring injuries: Epidemiologic findings from the major league baseball injury surveillance system","volume":"42","author":"Ahmad","year":"2014","journal-title":"Am. J. Sports Med."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"101750","DOI":"10.1016\/j.is.2021.101750","article-title":"A Data Science approach analysing the Impact of Injuries on Basketball Player and Team Performance","volume":"99","author":"Sarlis","year":"2021","journal-title":"Inf. Syst."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Dijkhuis, T., Kempe, M., and Lemmink, K. (2021). Early Prediction of Physical Performance in Elite Soccer Matches\u2014A Machine Learning Approach to Support Substitutions. Entropy, 23.","DOI":"10.3390\/e23080952"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1715","DOI":"10.1111\/sms.13078","article-title":"Modeling the impact of players\u2019 workload on the injury-burden of English Premier League football clubs","volume":"28","author":"Fuller","year":"2018","journal-title":"Scand. J. Med. Sci. Sports"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Decroos, T., Bransen, L., Van Haaren, J., and Davis, J. (2019, January 4\u20138). 
Actions Speak Louder than Goals: Valuing Player Actions in Soccer. Proceedings of the Kdd\u201919: Proceedings of the 25th Acm Sigkdd International Conference on Knowledge Discovery and Data Mining, Anchorage, AK, USA.","DOI":"10.1145\/3292500.3330758"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Bialkowski, A., Lucey, P., Carr, P., Yue, Y., Sridharan, S., and Matthews, I. (2014, January 14\u201317). Large-scale analysis of soccer matches using spatiotemporal tracking data. Proceedings of the 2014 IEEE International Conference on Data Mining, Shenzhen, China.","DOI":"10.1109\/ICDM.2014.133"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"2596","DOI":"10.1109\/TKDE.2016.2581158","article-title":"Discovering Team Structures in Soccer from Spatiotemporal Data","volume":"28","author":"Bialkowski","year":"2016","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1109\/TVCG.2018.2865041","article-title":"ForVizor: Visualizing Spatio-Temporal Team Formations in Soccer","volume":"25","author":"Wu","year":"2018","journal-title":"IEEE Trans. Vis. Comput. Graph."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1007\/s40745-018-00189-x","article-title":"NBA Game Result Prediction Using Feature Analysis and Machine Learning","volume":"6","author":"Thabtah","year":"2019","journal-title":"Ann. Data Sci."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Chen, W.-J., Jhou, M.-J., Lee, T.-S., and Lu, C.-J. (2021). Hybrid Basketball Game Outcome Prediction Model by Integrating Data Mining Methods for the National Basketball Association. Entropy, 23.","DOI":"10.3390\/e23040477"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, January 13\u201317). Xgboost: A scalable tree boosting system. 
Proceedings of the 22nd Acm Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939785"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1109\/TG.2018.2841057","article-title":"Machine Learning Approaches to Competing in Fantasy Leagues for the NFL","volume":"11","author":"Landers","year":"2018","journal-title":"IEEE Trans. Games"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"741","DOI":"10.1016\/j.ijforecast.2018.01.003","article-title":"Predictive analysis and modelling football results using machine learning approach for English Premier League","volume":"35","author":"Baboota","year":"2019","journal-title":"Int. J. Forecast."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Robberechts, P., Van Haaren, J., and Davis, J. (2021, January 14\u201318). A Bayesian Approach to In-Game Win Probability in Soccer. Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Online, Singapore.","DOI":"10.1145\/3447548.3467194"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1128","DOI":"10.1080\/01621459.1994.10476851","article-title":"A Brownian motion model for the progress of sports scores","volume":"89","author":"Stern","year":"1994","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"96","DOI":"10.1089\/big.2017.0054","article-title":"A Data Snapshot Approach for Making Real-Time Predictions in Basketball","volume":"6","author":"Kayhan","year":"2018","journal-title":"Big Data"},{"key":"ref_24","first-page":"197","article-title":"Using random forests to estimate win probability before each play of an NFL game","volume":"10","author":"Lock","year":"2014","journal-title":"J. Quant. Anal. Sports"},{"key":"ref_25","unstructured":"Pelechrinis, K. (2017). iWinRNFL: A Simple, Interpretable & Well-Calibrated In-Game Win Probability Model for NFL. 
arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Zou, Q., Song, K., and Shi, J. (2020). A Bayesian In-Play Prediction Model for Association Football Outcomes. Appl. Sci., 10.","DOI":"10.3390\/app10082904"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"24139","DOI":"10.1038\/s41598-021-03157-3","article-title":"In-play forecasting in football using event and positional data","volume":"11","author":"Klemp","year":"2021","journal-title":"Sci. Rep."},{"key":"ref_28","first-page":"1","article-title":"Automatic differentiation variational inference","volume":"18","author":"Kucukelbir","year":"2017","journal-title":"J. Mach. Learn. Res."},{"key":"ref_29","unstructured":"Singh, K. (2022, May 25). Introducing Expected Threat (xT). Available online: https:\/\/karun.in\/blog\/expected-threat.html."},{"key":"ref_30","first-page":"229","article-title":"On modelling soccer data","volume":"3","author":"Karlis","year":"2000","journal-title":"Student"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1111\/1467-9876.00065","article-title":"Modelling Association Football Scores and Inefficiencies in the Football Betting Market","volume":"46","author":"Dixon","year":"1997","journal-title":"J. R. Stat. Soc. Ser. C (Appl. Stat.)"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1080\/09332480.1997.10554791","article-title":"Modeling scores in the Premier League: Is Manchester United really the best?","volume":"10","author":"Lee","year":"1997","journal-title":"Chance"},{"key":"ref_33","first-page":"381","article-title":"Analysis of sports data by using bivariate Poisson models","volume":"52","author":"Karlis","year":"2003","journal-title":"J. R. Stat. Soc. Ser. D (Stat.)"},{"key":"ref_34","first-page":"2825","article-title":"Scikit-learn: Machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. 
Res."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1502","DOI":"10.12928\/telkomnika.v14i4.3956","article-title":"SVM parameter optimization using grid search and genetic algorithm to improve classification performance","volume":"14","author":"Syarif","year":"2016","journal-title":"Telkomnika"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1080\/00401706.1970.10488634","article-title":"Ridge regression: Biased estimation for nonorthogonal problems","volume":"12","author":"Hoerl","year":"1970","journal-title":"Technometrics"},{"key":"ref_37","first-page":"211","article-title":"Sparse Bayesian learning and the relevance vector machine","volume":"1","author":"Tipping","year":"2001","journal-title":"J. Mach. Learn. Res."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1247","DOI":"10.5194\/gmd-7-1247-2014","article-title":"Root mean square error (RMSE) or mean absolute error (MAE)?\u2013Arguments against avoiding RMSE in the literature","volume":"7","author":"Chai","year":"2014","journal-title":"Geosci. Model Dev."},{"key":"ref_40","first-page":"209","article-title":"R-squared measures for count data regression models with applications to health-care utilization","volume":"14","author":"Cameron","year":"1996","journal-title":"J. Bus. Econ. Stat."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"985","DOI":"10.1175\/1520-0450(1969)008<0985:ASSFPF>2.0.CO;2","article-title":"A scoring system for probability forecasts of ranked categories","volume":"8","author":"Epstein","year":"1969","journal-title":"J. Appl. 
Meteorol."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1007\/s10994-018-5703-7","article-title":"Dolores: A model that predicts football match outcomes from all over the world","volume":"108","author":"Constantinou","year":"2019","journal-title":"Mach. Learn."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Constantinou, A.C., and Fenton, N.E. (2012). Solving the problem of inadequate scoring rules for assessing probabilistic football forecast models. J. Quant. Anal. Sports, 8.","DOI":"10.1515\/1559-0410.1418"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"1895","DOI":"10.1162\/089976698300017197","article-title":"Approximate statistical tests for comparing supervised classification learning algorithms","volume":"10","author":"Dietterich","year":"1998","journal-title":"Neural Comput."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Niculescu-Mizil, A., and Caruana, R. (2005, January 7\u201311). Predicting good probabilities with supervised learning. Proceedings of the 22nd International Conference on Machine Learning, New York, NY, USA.","DOI":"10.1145\/1102351.1102430"},{"key":"ref_46","unstructured":"Guo, C., Pleiss, G., Sun, Y., and Weinberger, K.Q. (2017, January 6\u201311). On calibration of modern neural networks. Proceedings of the International Conference on Machine Learning, PMLR, Sydney, Australia."},{"key":"ref_47","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017). Attention is all you need. 
Advances in Neural Information Processing Systems, Neural Information Processing Systems Foundation, Inc."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/24\/7\/971\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:49:42Z","timestamp":1760140182000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/24\/7\/971"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,13]]},"references-count":47,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2022,7]]}},"alternative-id":["e24070971"],"URL":"https:\/\/doi.org\/10.3390\/e24070971","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,7,13]]}}}