{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,18]],"date-time":"2026-04-18T01:52:08Z","timestamp":1776477128586,"version":"3.51.2"},"reference-count":60,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2025,1,31]],"date-time":"2025-01-31T00:00:00Z","timestamp":1738281600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Projecto de Desenvolvimento de Ci\u00eancia e Tecnologia, from MESCTI","award":["011\/D-UL\/PDCT-M003\/2022"],"award-info":[{"award-number":["011\/D-UL\/PDCT-M003\/2022"]}]},{"name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","award":["011\/D-UL\/PDCT-M003\/2022"],"award-info":[{"award-number":["011\/D-UL\/PDCT-M003\/2022"]}]},{"name":"LAETA Programatic Funding","award":["011\/D-UL\/PDCT-M003\/2022"],"award-info":[{"award-number":["011\/D-UL\/PDCT-M003\/2022"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Aerospace"],"abstract":"<jats:p>The use of machine learning techniques to identify contributing factors in air incidents has grown significantly, helping to identify and prevent accidents and improve air safety. In this paper, classifier models such as LS, KNN, Random Forest, Extra Trees, and XGBoost, which have proven effective in classification tasks, are used to analyze incident reports parsed with natural language processing (NLP) techniques, to uncover hidden patterns and prevent future incidents. Metrics such as precision, recall, F1-score and accuracy are used to assess the degree of correctness of the predictive models. The adjustment of hyperparameters is obtained with Grid Search and Bayesian Optimization. KNN had the best predictive rating, followed by Random Forest and Extra Trees. 
The results indicate that the use of machine learning tools to classify incidents and accidents helps to identify their root cause, improving situational decision-making.<\/jats:p>","DOI":"10.3390\/aerospace12020106","type":"journal-article","created":{"date-parts":[[2025,1,31]],"date-time":"2025-01-31T04:05:47Z","timestamp":1738296347000},"page":"106","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Identifying Human Factors in Aviation Accidents with Natural Language Processing and Machine Learning Models"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-1429-4414","authenticated-orcid":false,"given":"Fl\u00e1vio L.","family":"L\u00e1zaro","sequence":"first","affiliation":[{"name":"Institute of Mechanical Engineering (IDMEC-LAETA), Instituto Superior T\u00e9cnico, Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001 Lisboa, Portugal"},{"name":"Faculdade de Engenharia, Universidade Agostinho Neto, Av. 21 de Janeiro, Luanda 1756, Angola"}]},{"given":"Tom\u00e1s","family":"Madeira","sequence":"additional","affiliation":[{"name":"Institute of Mechanical Engineering (IDMEC-LAETA), Instituto Superior T\u00e9cnico, Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001 Lisboa, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1081-2729","authenticated-orcid":false,"given":"Rui","family":"Melicio","sequence":"additional","affiliation":[{"name":"Institute of Mechanical Engineering (IDMEC-LAETA), Instituto Superior T\u00e9cnico, Universidade de Lisboa, Av. 
Rovisco Pais 1, 1049-001 Lisboa, Portugal"},{"name":"Aeronautics and Astronautics Research Center (AEROG-LAETA), Universidade da Beira Interior, Cal\u00e7ada Fonte do Lameiro, 6200-358 Covilh\u00e3, Portugal"},{"name":"Synopsis Planet, Advance Engineering Unipessoal LDA, 2810-174 Almada, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9388-4308","authenticated-orcid":false,"given":"Duarte","family":"Val\u00e9rio","sequence":"additional","affiliation":[{"name":"Institute of Mechanical Engineering (IDMEC-LAETA), Instituto Superior T\u00e9cnico, Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001 Lisboa, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7169-2660","authenticated-orcid":false,"given":"Lu\u00eds F. F. M.","family":"Santos","sequence":"additional","affiliation":[{"name":"Aeronautics and Astronautics Research Center (AEROG-LAETA), Universidade da Beira Interior, Cal\u00e7ada Fonte do Lameiro, 6200-358 Covilh\u00e3, Portugal"},{"name":"ISEC Lisboa, Alameda das Linhas de Torres 179, 1750-142 Lisboa, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2025,1,31]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Ali, T., Khazaei, H., Moghaddam, M.H.Y., and Hassan, Y. (2019). Machine Learning in Transportation, Hindawi.","DOI":"10.1155\/2019\/4359785"},{"key":"ref_2","first-page":"35","article-title":"Stress, pressure and fatigue on aircraft maintenance personal","volume":"12","author":"Santos","year":"2019","journal-title":"Int. Rev. Aerosp. Eng."},{"key":"ref_3","unstructured":"Cusick, S.K., Cortes, A.I., and Rodrigues, C.C. (2017). Commercial Aviation Safety, McGraw Hill Education. [6th ed.]."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"100184","DOI":"10.1016\/j.treng.2023.100184","article-title":"The role of human factors in aviation ground operation-related accidents\/incidents: A human error analysis approach","volume":"13","author":"Muecklich","year":"2023","journal-title":"Transp. 
Eng."},{"key":"ref_5","unstructured":"International Air Transport Association (2024, May 09). Annual Report 2023. IATA. Available online: https:\/\/www.iata.org\/contentassets\/c81222d96c9a4e0bb4ff6ced0126f0bb\/annual-review-2023.pdf."},{"key":"ref_6","unstructured":"ICAO (2024, May 09). Annual Report of the Council to the Assembly. Available online: https:\/\/www.icao.int\/about-icao\/Annual_Report_2023_EN\/AnnualReport2023.html#p=1."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"5540046","DOI":"10.1155\/2021\/5540046","article-title":"Identifying Incident Causal Factors to Improve Aviation Transportation Safety: Proposing a Deep Learning Approach","volume":"2021","author":"Dong","year":"2021","journal-title":"J. Adv. Transp."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1054","DOI":"10.1080\/07421222.2017.1394056","article-title":"A data-mining approach to identification of risk factors in safety management systems","volume":"34","author":"Shi","year":"2017","journal-title":"J. Manag. Inf. Syst."},{"key":"ref_9","unstructured":"Dodd, R.S., Eldredge, D., and Mangold, S.J. (1992). A Review and Discussion of Flight Management System Incidents Reported to the Aviation Safety Reporting System, The National Academies of Sciences, Engineering, and Medicine."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"L\u00e1zaro, F.L., Nogueira, R.P.R., Melicio, R., Val\u00e9rio, D., and Santos, L.F.F.M. (2024). Human Factors as Predictor of Fatalities in Aviation Accidents: A Neural Network Analysis. Appl. Sci, 14.","DOI":"10.3390\/app14020640"},{"key":"ref_11","unstructured":"Council, N.R. (1998). Improving the Continued Airworthiness of Civil Aircraft: A Strategy for the FAA\u2019s Aircraft Certification Service, The National Academies Press."},{"key":"ref_12","unstructured":"Schreiber, F. (2024, May 23). Human Performance Error Management. 
Available online: https:\/\/skybrary.aero\/bookshelf\/books\/1640.pdf."},{"key":"ref_13","first-page":"80","article-title":"Mental Workload Evaluation of Aircraft Operators\u2019 Using Pupil Dilation and Nasa-Task Load Index","volume":"9","author":"Othman","year":"2016","journal-title":"Int. Rev. Aerospace Eng."},{"key":"ref_14","unstructured":"ICAO (2013). Annex 19 to the Convention on International Civil Aviation\u2013Safety Management, ICAO."},{"key":"ref_15","unstructured":"ICAO (2024, August 02). Safety Report. Available online: https:\/\/www.icao.int\/safety\/Documents\/ICAO_SR_2024.pdf."},{"key":"ref_16","unstructured":"International Air Transport Association (IATA), International Civil Aviation Organization (ICAO), and International Federation of Air Line Pilots\u2019 Associations (IFALPA) (2015). Fatigue Management Guide for Airline Operators, International Civil Aviation Organization. Available online: https:\/\/www.icao.int\/safety\/fatiguemanagement\/frms%20tools\/frms%20implementation%20guide%20for%20operators%20july%202011.pdf."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"105","DOI":"10.3141\/2449-12","article-title":"Human Reliability Analysis for Visual Inspection in Aviation Maintenance by a Bayesian Network Approach","volume":"2449","author":"Chen","year":"2014","journal-title":"Transp. Res. Rec. J. Transp. Res. Board"},{"key":"ref_18","first-page":"115","article-title":"A Taxonomy of Performance Shaping Factors for Human Reliability Analysis in Industrial Maintenance","volume":"12","author":"Franciosi","year":"2019","journal-title":"J. Ind. Eng. Manag."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"485","DOI":"10.1108\/IJQRM-01-2018-0008","article-title":"Enhancing human performance reliability in aircraft pushback operations","volume":"36","author":"Ng","year":"2019","journal-title":"Int. J. Qual. Reliab. 
Manag."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"106021","DOI":"10.1016\/j.ssci.2022.106021","article-title":"Human reliability assessment on building construction work at height: The case of scaffolding work","volume":"159","author":"Li","year":"2023","journal-title":"Saf. Sci."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"39596","DOI":"10.1109\/ACCESS.2022.3166157","article-title":"Predicting airline additional services consumption willingness based on high-dimensional incomplete data","volume":"10","author":"Chen","year":"2022","journal-title":"IEEE Access"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Lee, H., Madar, S., Sairam, S., Puranik, T.G., Payan, A.P., Kirby, M., Pinon, O.J., and Mavris, D.N. (2020). Critical Parameter Identification for Safety Events in Commercial Aviation Using Machine Learning. Aerospace, 7.","DOI":"10.3390\/aerospace7060073"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"160","DOI":"10.1007\/s42979-021-00592-x","article-title":"Machine learning: Algorithms, real-world applications and research directions","volume":"2","author":"Sarker","year":"2021","journal-title":"SN Comput. Sci."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"108354","DOI":"10.1016\/j.ast.2023.108354","article-title":"Improving aircraft performance using machine learning: A review","volume":"138","author":"Ferrer","year":"2023","journal-title":"Aerosp. Sci. Technol."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Yang, C., and Huang, C. (2023). Natural language processing (NLP) in aviation safety: Systematic review of research and outlook into the future. 
Aerospace, 10.","DOI":"10.3390\/aerospace10070600"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"115694","DOI":"10.1016\/j.eswa.2021.115694","article-title":"Natural Language Processing for the identification of Human factors in aviation accidents causes: An application to the SHEL methodology","volume":"186","author":"Perboli","year":"2021","journal-title":"Expert Syst. Appl."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1007\/s44196-024-00671-w","article-title":"Artificial Intelligence in Aviation Safety: Systematic Review and Biometric Analysis","volume":"17","author":"Demir","year":"2024","journal-title":"Int. J. Comput. Intell. Syst."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Madeira, T., Mel\u00edcio, R., Val\u00e9rio, D., and Santos, L. (2021). Machine Learning and Natural Language Processing for Prediction of Human Factors in Aviation Incident Reports. Aerospace, 8.","DOI":"10.3390\/aerospace8020047"},{"key":"ref_29","unstructured":"ASN (2024, March 12). Aviation Safety Database. Available online: https:\/\/aviation-safety.net\/database\/."},{"key":"ref_30","unstructured":"Wiegmann, D.A., and Shappell, S.A. (2003). A Human Error Approach to Aviation Accident Analysis. The Human Factors Analysis and Classification System, Ashgate Publishing Limited."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"205","DOI":"10.3233\/WEB-200442","article-title":"A survey on text classification and its applications","volume":"18","author":"Zhou","year":"2020","journal-title":"Web Intell."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"De Vries, V. (2020, January 3\u20134). Classification of Aviation Safety Reports Using Machine Learning. Proceedings of the International Conference on Artificial Intelligence and Data Analytics for Air Transportation (AIDA-AT), Singapore.","DOI":"10.1109\/AIDA-AT48540.2020.9049187"},{"key":"ref_33","unstructured":"Bird, S., Klein, E., and Loper, E. 
(2024, May 23). Natural Language Toolkit (NLTK) Documentation. Available online: https:\/\/www.nltk.org\/."},{"key":"ref_34","unstructured":"Bastian, M. (2024, May 13). GPT-4 Has More Than a Trillion Parameters\u2014Report. The Decoder: AI in Practice. Available online: https:\/\/the-decoder.com\/gpt-4-has-a-trillion-parameters\/."},{"key":"ref_35","unstructured":"Saunders, D., Hu, K., and Li, W.C. (2024). The Process of Training ChatGPT Using HFACS to Analyse Aviation Accident Reports. Ergonomics & Human Factors, Proceedings of the Conference, 22\u201324 April 2024, Kenilworth, UK, Chartered Institute of Ergonomics and Human Factors (CIEHF). Available online: https:\/\/publications.ergonomics.org.uk\/uploads\/The-Process-of-Training-ChatGPT-Using-HFACS-to-Analyse-Aviation-Accident-Reports.pdf\/."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"36120","DOI":"10.1109\/ACCESS.2023.3266377","article-title":"A survey of text representation and embedding techniques in nlp","volume":"11","author":"Patil","year":"2023","journal-title":"IEEE Access"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Iscen, A., Tolias, G., Avrithis, Y., and Chum, O. (2019). Label Propagation for Deep Semi-supervised Learning. arXiv.","DOI":"10.1109\/CVPR.2019.00521"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1613\/jair.1.12228","article-title":"A survey on the explainability of supervised machine learning","volume":"70","author":"Burkart","year":"2021","journal-title":"J. Artif. Intell. Res."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Lau, J.H., and Baldwin, T. (2016). An Empirical Evaluation of doc2vec with Practical Insights into Document Embedding Generation. Proceedings of the 1st Workshop on Representation Learning for NLP, Association for Computational Linguistics.","DOI":"10.18653\/v1\/W16-1609"},{"key":"ref_40","unstructured":"Le, Q., and Mikolov, T. (2014, January 22\u201324). 
Distributed representations of sentences and documents. Proceedings of the International Conference on Machine Learning, PMLR, Beijing, China."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1007\/978-3-319-53817-4_4","article-title":"Word embedding for understanding natural language: A survey","volume":"Volume 26","author":"Li","year":"2018","journal-title":"Guide to Big Data Applications, Studies in Big Data"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Robinson, S.D. (2018). Multi-Label Classification of Contributing Causal Factors in Self-Reported Safety Narratives. Safety, 4.","DOI":"10.3390\/safety4030030"},{"key":"ref_43","first-page":"321","article-title":"Learning with local and global consistency","volume":"Volume 16","author":"Thrun","year":"2004","journal-title":"Advances in Neural Information Processing Systems"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1016\/S1005-8885(11)60321-X","article-title":"Semi-supervised learning via manifold regularization","volume":"19","author":"Yu","year":"2012","journal-title":"J. China Univ. Post Telecommun."},{"key":"ref_45","first-page":"1229","article-title":"Manifold regularization and semi-supervised learning: Some theoretical analyses","volume":"14","author":"Niyogi","year":"2013","journal-title":"J. Mach. Learn. Res."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"1883","DOI":"10.4249\/scholarpedia.1883","article-title":"K-nearest neighbor","volume":"4","author":"Peterson","year":"2009","journal-title":"Scholarpedia"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random Forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/s10994-006-6226-1","article-title":"Extremely randomized trees","volume":"63","author":"Geurts","year":"2006","journal-title":"Mach. 
Learn."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Jiao, Y., Dong, J., Han, J., and Sun, H. (2022). Classification and Causes Identification of Chinese Civil Aviation Incident Reports. Appl. Sci., 12.","DOI":"10.3390\/app122110765"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Mehdary, A., Chehri, A., Jakimi, A., and Saadane, R. (2024). Hyperparameter Optimization with Genetic Algorithms and XGBoost: A Step Forward in Smart Grid Fraud Detection. Sensors, 24.","DOI":"10.3390\/s24041230"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1016\/j.neucom.2020.07.061","article-title":"On hyperparameter optimization of machine learning algorithms: Theory and practice","volume":"415","author":"Yang","year":"2020","journal-title":"Neurocomputing"},{"key":"ref_52","unstructured":"Li, H., Chaudhari, P., Yang, H., Lam, M., Ravichandran, A., Bhotika, R., and Soatto, S. (2020). Rethinking the Hyperparameters for Fine-tuning. arXiv."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"102275","DOI":"10.1016\/j.scs.2020.102275","article-title":"A survey on hyperparameters optimization algorithms of forecasting models in smart grid","volume":"61","author":"Khalid","year":"2020","journal-title":"Sustain. Cities Soc."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"124154","DOI":"10.1016\/j.eswa.2024.124154","article-title":"One-step vs horizon-step training strategies for multi-step traffic flow forecasting with direct particle swarm optimization grid search support vector regression and long short-term memory","volume":"252","author":"Omar","year":"2024","journal-title":"Expert Syst. Appl."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Hutter, F., Kotthoff, L., and Vanschoren, J. (2019). Automated Machine Learning: Methods, Systems, Challenges, Springer Publishing Company, Incorporated. [1st. 
ed.].","DOI":"10.1007\/978-3-030-05318-5"},{"key":"ref_56","first-page":"2825","article-title":"Scikit-learn: Machine Learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1016\/j.aap.2019.05.005","article-title":"A feature learning approach based on xgboost for driving assessment and risk prediction","volume":"129","author":"Shi","year":"2019","journal-title":"Accid. Anal. Prev."},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"206806","DOI":"10.1109\/ACCESS.2020.3037922","article-title":"Application of XGBoost for Hazardous Material Road Transport Accident Severity Analysis","volume":"8","author":"Shen","year":"2020","journal-title":"IEEE Access"},{"key":"ref_59","first-page":"1","article-title":"Recent advances in Bayesian optimization","volume":"55","author":"Wang","year":"2023","journal-title":"ACM Comput. Surv."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"4055","DOI":"10.1007\/s10994-023-06467-x","article-title":"Hybrid approaches to optimization and machine learning methods: A systematic literature review","volume":"113","author":"Azevedo","year":"2024","journal-title":"Mach. 
Learn."}],"container-title":["Aerospace"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2226-4310\/12\/2\/106\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T16:24:37Z","timestamp":1760027077000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2226-4310\/12\/2\/106"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1,31]]},"references-count":60,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2025,2]]}},"alternative-id":["aerospace12020106"],"URL":"https:\/\/doi.org\/10.3390\/aerospace12020106","relation":{},"ISSN":["2226-4310"],"issn-type":[{"value":"2226-4310","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1,31]]}}}