{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T07:02:14Z","timestamp":1775718134667,"version":"3.50.1"},"reference-count":30,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2021,11,24]],"date-time":"2021-11-24T00:00:00Z","timestamp":1637712000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e Tecnologia","doi-asserted-by":"publisher","award":["FCT DSAIPA\/DS\/0090\/2018"],"award-info":[{"award-number":["FCT DSAIPA\/DS\/0090\/2018"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computers"],"abstract":"<jats:p>Traffic accidents are one of the most important concerns of the world, since they result in numerous casualties, injuries, and fatalities each year, as well as significant economic losses. There are many factors that are responsible for causing road accidents. If these factors can be better understood and predicted, it might be possible to take measures to mitigate the damages and its severity. The purpose of this work is to identify these factors using accident data from 2016 to 2019 from the district of Set\u00fabal, Portugal. This work aims at developing models that can select a set of influential factors that may be used to classify the severity of an accident, supporting an analysis on the accident data. In addition, this study also proposes a predictive model for future road accidents based on past data. Various machine learning approaches are used to create these models. 
Supervised machine learning methods such as decision trees (DT), random forests (RF), logistic regression (LR), and naive Bayes (NB) are used, as well as unsupervised machine learning techniques including DBSCAN and hierarchical clustering. Results show that a rule-based model using the C5.0 algorithm is capable of accurately detecting the most relevant factors describing road accident severity. Further, the results of the predictive model suggest that the RF model could be a useful tool for forecasting accident hotspots.<\/jats:p>","DOI":"10.3390\/computers10120157","type":"journal-article","created":{"date-parts":[[2021,11,28]],"date-time":"2021-11-28T22:19:16Z","timestamp":1638137956000},"page":"157","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":108,"title":["Machine Learning Approaches to Traffic Accident Analysis and Hotspot Prediction"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9906-0358","authenticated-orcid":false,"given":"Daniel","family":"Santos","sequence":"first","affiliation":[{"name":"Informatics Departament, University of \u00c9vora, 7002-554 \u00c9vora, Portugal"}]},{"given":"Jos\u00e9","family":"Saias","sequence":"additional","affiliation":[{"name":"Informatics Departament, University of \u00c9vora, 7002-554 \u00c9vora, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5086-059X","authenticated-orcid":false,"given":"Paulo","family":"Quaresma","sequence":"additional","affiliation":[{"name":"Informatics Departament, University of \u00c9vora, 7002-554 \u00c9vora, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0793-0003","authenticated-orcid":false,"given":"V\u00edtor Beires","family":"Nogueira","sequence":"additional","affiliation":[{"name":"Informatics Departament, University of \u00c9vora, 7002-554 \u00c9vora, 
Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2021,11,24]]},"reference":[{"key":"ref_1","unstructured":"(2021, August 02). Moprevis. Available online: https:\/\/moprevis.uevora.pt\/en\/."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"H\u00e9bert, A., Gu\u00e9don, T., Glatard, T., and Jaumard, B. (2019, January 9\u201312). High-Resolution Road Vehicle Collision Prediction for the City of Montreal. Proceedings of the 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA.","DOI":"10.1109\/BigData47090.2019.9006009"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Hernes, M., Wojtkiewicz, K., and Szczerbicki, E. (2020). Study of Machine Learning Techniques on Accident Data. Advances in Computational Collective Intelligence, Springer International Publishing.","DOI":"10.1007\/978-3-030-63119-2"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1016\/j.jsr.2013.04.007","article-title":"Identifying crash-prone traffic conditions under different weather on freeways","volume":"46","author":"Xu","year":"2013","journal-title":"J. Saf. Res."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1016\/j.trc.2014.09.016","article-title":"A correlated random parameter approach to investigate the effects of weather conditions on crash risk for a mountainous freeway","volume":"50","author":"Yu","year":"2014","journal-title":"Transp. Res. Part C Emerg. Technol."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1080\/15389588.2016.1198871","article-title":"Investigation of Powered-Two-Wheeler accident involvement in urban arterials by considering real-time traffic and weather data","volume":"18","author":"Theofilatos","year":"2016","journal-title":"Traffic Inj. 
Prev."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1080\/15389588.2012.661110","article-title":"Factors Affecting Accident Severity Inside and Outside Urban Areas in Greece","volume":"13","author":"Theofilatos","year":"2012","journal-title":"Traffic Inj. Prev."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1016\/j.aap.2017.08.008","article-title":"Comparison of four statistical and machine learning methods for crash severity prediction","volume":"108","author":"Iranitalab","year":"2017","journal-title":"Accid. Anal. Prev."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1016\/j.trc.2015.03.015","article-title":"A Novel Variable Selection Method based on Frequent Pattern Tree for Real-time Traffic Accident Risk Prediction","volume":"55","author":"Lin","year":"2015","journal-title":"Transp. Res. Part C Emerg. Technol."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1016\/j.jsr.2005.06.013","article-title":"Data mining of tree-based models to analyze freeway accident frequency","volume":"36","author":"Chang","year":"2005","journal-title":"J. Saf. Res."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"657","DOI":"10.1016\/j.aap.2006.10.012","article-title":"A crash-prediction model for multilane roads","volume":"39","author":"Caliendo","year":"2007","journal-title":"Accid. Anal. Prev."},{"key":"ref_12","first-page":"775","article-title":"Machine learning applied to road safety modeling: A systematic literature review","volume":"7","author":"Silva","year":"2020","journal-title":"J. Traffic Transp. Eng. (Engl. Ed.)"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1016\/j.jsr.2017.02.003","article-title":"Incorporating real-time traffic and weather data to explore road accident likelihood and severity in urban arterials","volume":"61","author":"Theofilatos","year":"2017","journal-title":"J. Saf. 
Res."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"036119811984157","DOI":"10.1177\/0361198119841571","article-title":"Comparing Machine Learning and Deep Learning Methods for Real-Time Crash Prediction","volume":"2673","author":"Theofilatos","year":"2019","journal-title":"Transp. Res. Rec. J. Transp. Res. Board"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Ren, H., Song, Y., Wang, J., Hu, Y., and Lei, J. (2018, January 4\u20137). A Deep Learning Approach to the Citywide Traffic Accident Risk Prediction. Proceedings of the 2018 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, HI, USA.","DOI":"10.1109\/ITSC.2018.8569437"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Yuan, Z., Zhou, X., and Yang, T. (2018, January 24). Hetero-ConvLSTM: A Deep Learning Approach to Traffic Accident Prediction on Heterogeneous Spatio-Temporal Data. Proceedings of the KDD\u201918: 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, New York, NY, USA.","DOI":"10.1145\/3219819.3219922"},{"key":"ref_17","first-page":"432","article-title":"Modern data sources and techniques for analysis and forecast of road accidents: A review","volume":"7","author":"Pedraza","year":"2020","journal-title":"J. Traffic Transp. Eng. (Engl. Ed.)"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1007\/s40534-016-0095-5","article-title":"A data mining approach to characterize road accident locations","volume":"24","author":"Kumar","year":"2016","journal-title":"J. Mod. Transp."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"571","DOI":"10.1016\/j.aap.2009.09.025","article-title":"A comparative analysis of hotspot identification methods","volume":"42","author":"Montella","year":"2010","journal-title":"Accid. Anal. Prev."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Szenasi, S., and Csiba, P. (2014, January 19\u201325). 
Clustering Algorithm in Order to Find Accident Black Spots Identified By GPS Coordinates. Proceedings of the 14th GeoConference on Informatics, Geoinformatics, and Remote Sensing, Ilza, Poland.","DOI":"10.5593\/SGEM2014\/B21\/S8.063"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1016\/j.aap.2015.03.013","article-title":"Impact of real-time traffic characteristics on freeway crash occurrence: Systematic review and meta-analysis","volume":"79","author":"Roshandel","year":"2015","journal-title":"Accid. Anal. Prev."},{"key":"ref_22","first-page":"82","article-title":"Clustering Techniques: A Brief Survey of Different Clustering Algorithms","volume":"1","author":"Sisodia","year":"2012","journal-title":"Int. J. Latest Trends Eng. Technol."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","article-title":"Silhouettes: A Graphical Aid to the Interpretation and Validation of Cluster Analysis","volume":"20","author":"Rousseeuw","year":"1987","journal-title":"J. Comput. Appl. Math."},{"key":"ref_24","unstructured":"Ho, T.K. (1995, January 14\u201316). Random decision forests. Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, Canada."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random Forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1007\/s10462-007-9052-3","article-title":"Machine learning: A review of classification and combining techniques","volume":"26","author":"Kotsiantis","year":"2006","journal-title":"Artif. Intell. 
Rev."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s42979-021-00592-x","article-title":"Machine Learning: Algorithms, Real-World Applications and Research Directions","volume":"2","author":"Sarker","year":"2021","journal-title":"SN Comput. Sci."},{"key":"ref_28","unstructured":"Yuan, Z., Zhou, X., Yang, T., and Tamerius, J. (2017, January 13\u201317). Predicting Traffic Accidents Through Heterogeneous Urban Data: A Case Study. Proceedings of the 6th international workshop on urban computing (UrbComp 2017), Halifax, NS, Canada."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1016\/j.aap.2008.12.014","article-title":"Kernel density estimation and K-means clustering to profile road accident hotspots","volume":"41","author":"Anderson","year":"2009","journal-title":"Accid. Anal. Prev."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Elvik, R. (2008). A Survey of Operational Definitions of Hazardous Road Locations in Some European Countries, Accident Analysis & Prevention.","DOI":"10.1016\/j.aap.2008.08.001"}],"container-title":["Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-431X\/10\/12\/157\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:35:03Z","timestamp":1760168103000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-431X\/10\/12\/157"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,24]]},"references-count":30,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2021,12]]}},"alternative-id":["computers10120157"],"URL":"https:\/\/doi.org\/10.3390\/computers10120157","relation":{},"ISSN":["2073-431X"],"issn-type":[{"value":"2073-431X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,11,24]]}}}