{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,20]],"date-time":"2026-06-20T05:09:58Z","timestamp":1781932198475,"version":"3.54.5"},"reference-count":34,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2025,2,26]],"date-time":"2025-02-26T00:00:00Z","timestamp":1740528000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004837","name":"Ministerio de Ciencia, Innovaci\u00f3n y Universidades","doi-asserted-by":"publisher","award":["PID2020-113192GB-I00"],"award-info":[{"award-number":["PID2020-113192GB-I00"]}],"id":[{"id":"10.13039\/501100004837","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004837","name":"Ministerio de Ciencia, Innovaci\u00f3n y Universidades","doi-asserted-by":"publisher","award":["PID2021-127946OB-I00"],"award-info":[{"award-number":["PID2021-127946OB-I00"]}],"id":[{"id":"10.13039\/501100004837","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100014440","name":"MCIN\/AEI\/10.13039\/501100011033 by \u201cERDF A way of making Europe\u201d","doi-asserted-by":"publisher","award":["PID2020-113192GB-I00"],"award-info":[{"award-number":["PID2020-113192GB-I00"]}],"id":[{"id":"10.13039\/100014440","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100014440","name":"MCIN\/AEI\/10.13039\/501100011033 by \u201cERDF A way of making Europe\u201d","doi-asserted-by":"publisher","award":["PID2021-127946OB-I00"],"award-info":[{"award-number":["PID2021-127946OB-I00"]}],"id":[{"id":"10.13039\/100014440","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>In the era of increasing digitalisation, organisations face the critical challenge of detecting anomalies in large volumes of data, which may indicate suspicious activities. 
To address this challenge, audit engagements are conducted regularly, and internal auditors and purchasing specialists seek innovative methods to streamline these processes. This study introduces a methodology to prioritise the investigation of anomalies identified in two large real-world purchase datasets. The primary objective is to enhance the effectiveness of companies\u2019 control efforts and improve the efficiency of anomaly detection tasks. The approach begins with a comprehensive exploratory data analysis, followed by the application of unsupervised machine learning techniques to identify anomalies. A univariate analysis is performed using the z-Score index and the DBSCAN algorithm, while multivariate analysis employs k-Means clustering and Isolation Forest algorithms. Additionally, the Silhouette index is used to evaluate the quality of the clustering, ensuring each method produces a prioritised list of candidate transactions for further review. To refine this process, an ensemble prioritisation framework is developed, integrating multiple methods. Furthermore, explainability tools such as SHAP are utilised to provide actionable insights and support specialists in interpreting the results. 
This methodology aims to empower organisations to detect anomalies more effectively and streamline the audit process.<\/jats:p>","DOI":"10.3390\/info16030177","type":"journal-article","created":{"date-parts":[[2025,2,26]],"date-time":"2025-02-26T08:28:31Z","timestamp":1740558511000},"page":"177","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Applied Machine Learning to Anomaly Detection in Enterprise Purchase Processes: A Hybrid Approach Using Clustering and Isolation Forest"],"prefix":"10.3390","volume":"16","author":[{"given":"Antonio","family":"Herreros-Mart\u00ednez","sequence":"first","affiliation":[{"name":"Electrical Engineering Department, University of Valencia, Ave Universitats, s\/n. Burjassot, 46100 Valencia, Spain"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3752-8231","authenticated-orcid":false,"given":"Rafael","family":"Magdalena-Benedicto","sequence":"additional","affiliation":[{"name":"Electrical Engineering Department, University of Valencia, Ave Universitats, s\/n. Burjassot, 46100 Valencia, Spain"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8293-8235","authenticated-orcid":false,"given":"Joan","family":"Vila-Franc\u00e9s","sequence":"additional","affiliation":[{"name":"Electrical Engineering Department, University of Valencia, Ave Universitats, s\/n. Burjassot, 46100 Valencia, Spain"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3863-8611","authenticated-orcid":false,"given":"Antonio Jos\u00e9","family":"Serrano-L\u00f3pez","sequence":"additional","affiliation":[{"name":"Electrical Engineering Department, University of Valencia, Ave Universitats, s\/n. 
Burjassot, 46100 Valencia, Spain"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0174-5325","authenticated-orcid":false,"given":"Sonia","family":"P\u00e9rez-D\u00edaz","sequence":"additional","affiliation":[{"name":"Physics and Mathematics Department, University of Alcal\u00e1 de Henares, Campus Tecnol\u00f3gico Universitario, Alcal\u00e1 de Henares, 28805 Madrid, Spain"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2351-7163","authenticated-orcid":false,"given":"Jos\u00e9 Javier","family":"Mart\u00ednez-Herr\u00e1iz","sequence":"additional","affiliation":[{"name":"Computer Science Department, University of Alcal\u00e1 de Henares, Campus Tecnol\u00f3gico Universitario, Alcal\u00e1 de Henares, 28805 Madrid, Spain"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2025,2,26]]},"reference":[{"key":"ref_1","unstructured":"IIA (2025, February 04). Definition of Internal Auditing. Available online: https:\/\/www.theiia.org\/en\/standards\/what-are-the-standards\/definition-of-internal-audit\/."},{"key":"ref_2","unstructured":"IIA (2025, February 04). International Standards for the Professional Practice of Internal Auditing (Standards). Available online: https:\/\/www.theiia.org\/globalassets\/site\/standards\/mandatory-guidance\/ippf\/2017\/ippf-standards-2017-english.pdf."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Byrnes, P., Al-Awadhi, A., Gullkvist, B., Brown-Liburd, H., Teeter, R., Warren, J., and Vasarhelyi, M. (2018). Evolution of Auditing: From the Traditional Approach to the Future Audit: Theory and Application. Continuous Auditing: Theory and Application, Emerald Publishing Limited.","DOI":"10.1108\/978-1-78743-413-420181014"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Chiu, V., Liu, Q., and Vasarhelyi, M.A. (2018). 
The Development and Intellectual Structure of Continuous Auditing Research. Continuous Auditing, Emerald Publishing Limited.","DOI":"10.1108\/978-1-78743-413-420181003"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1016\/j.knosys.2017.05.001","article-title":"Mining corporate annual reports for intelligent detection of financial statement fraud\u2014A comparative study of machine learning methods","volume":"128","author":"Hajek","year":"2017","journal-title":"Knowl.-Based Syst."},{"key":"ref_6","unstructured":"Liu, Q. (2014). The Application of Exploratory Data Analysis in Auditing. [Master\u2019s Thesis, Graduate School-NewarkRutgers]."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"115","DOI":"10.2308\/jeta-51730","article-title":"The Emergence of Artificial Intelligence: How Automation is Changing Auditing","volume":"14","author":"Kokina","year":"2017","journal-title":"J. Emerg. Technol. Account."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1016\/j.accinf.2009.12.004","article-title":"Internal fraud risk reduction: Results of a data mining case study","volume":"11","author":"Jans","year":"2010","journal-title":"Int. J. Account. Inf. Syst."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2133360.2133363","article-title":"Isolation-Based Anomaly Detection","volume":"6","author":"Liu","year":"2012","journal-title":"ACM Trans. Knowl. Discov. Data"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"43","DOI":"10.2308\/jeta-52474","article-title":"Automated Clustering for Data Analytics","volume":"16","author":"Byrnes","year":"2019","journal-title":"J. Emerg. Technol. Account."},{"key":"ref_11","first-page":"1","article-title":"Unsupervised anomaly detection for internal auditing: Literature review and research agenda","volume":"21","author":"Nonnenmacher","year":"2021","journal-title":"Int. J. Digital Account. 
Res."},{"key":"ref_12","unstructured":"Sharma, A., and Panigrahi, P.K. (2013). A Review of Financial Accounting Fraud Detection based on Data Mining Techniques. arXiv."},{"key":"ref_13","unstructured":"Jans, M., Lybaert, N., and Vanhoof, K. (2008, January 23\u201325). Data Mining for Fraud Detection: Toward an Improvement on Internal Control Systems?. Proceedings of the European Accounting Association-Annual Congress, Rotterdam, The Netherlands."},{"key":"ref_14","unstructured":"PwC (2022). Global Economic Crime Survey 2022. Technical Report, PricewaterhouseCoopers International Limited."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Kotu, V., and Deshpande, B. (2019). Data Science: Concepts and Practice, Elsevier Science & Technology Books.","DOI":"10.1016\/B978-0-12-814761-0.00002-2"},{"key":"ref_16","unstructured":"Soria, E., Mart\u00edn, J.D., Serrano, A.J., and Aguado, D. (2007). An\u00e1lisis de Datos Experimentales, Ed. UPV."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1541880.1541882","article-title":"Anomaly Detection: A Survey","volume":"41","author":"Chandola","year":"2009","journal-title":"ACM Comput. Surv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Kotu, V., and Deshpande, B. (2019). Anomaly Detection. Data Science, Elsevier.","DOI":"10.1016\/B978-0-12-814761-0.00013-7"},{"key":"ref_19","first-page":"2501","article-title":"Effect of Different Distance Measures on the Performance of k-Means Algorithm: An Experimental Study in Matlab","volume":"5","author":"Bora","year":"2014","journal-title":"Int. J. Comput. Sci. Inf. Technol. (IJCSIT)"},{"key":"ref_20","unstructured":"Banerji, A. (2025, February 04). K-Mean: Getting The Optimal Number of Clusters. 
Available online: https:\/\/www.analyticsvidhya.com\/blog\/2021\/05\/k-mean-getting-the-optimal-number-of-clusters\/."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","article-title":"Silhouettes: A graphical aid to the interpretation and validation of cluster analysis","volume":"20","author":"Rousseeuw","year":"1987","journal-title":"J. Comput. Appl. Math."},{"key":"ref_22","unstructured":"Ester, M., Kriegel, H.-P., Sander, J., and Xu, X. (1996, January 2\u20134). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of the KDD\u201996: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, OR, USA."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3068335","article-title":"DBSCAN Revisited, Revisited: Why and How You Should (Still) Use DBSCAN","volume":"42","author":"Schubert","year":"2017","journal-title":"ACM Trans. Database Syst."},{"key":"ref_24","first-page":"69","article-title":"Cluster Analysis for Anomaly Detection in Accounting Data: An Audit Approach","volume":"11","author":"Thiprungsri","year":"2011","journal-title":"Int. J. Digit. Account. Res."},{"key":"ref_25","unstructured":"Hayasaka, S. (2025, February 04). What Is Clustering and How Does It Work? 21 June 2021. Available online: https:\/\/www.knime.com\/blog\/what-is-clustering-how-does-it-work."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1023\/A:1009745219419","article-title":"Density-Based Clustering in Spatial Databases: The Algorithm GDBSCAN and Its Applications","volume":"2","author":"Sander","year":"1998","journal-title":"Data Min. And Knowledge Discov."},{"key":"ref_27","unstructured":"KNIME (2025, February 04). Introduction to Machine Learning Algorithms. 
Available online: https:\/\/www.knime.com\/sites\/default\/files\/2021-01\/2021-01-slides-l4-ml.pdf."},{"key":"ref_28","unstructured":"Zuccarelli, E. (2025, February 04). Handling Categorical Data, The Right Way. Available online: https:\/\/medium.com\/towards-data-science\/handling-categorical-data-the-right-way-9d1279956fc6."},{"key":"ref_29","unstructured":"Scikit-Learn (2025, February 04). Isolation Forest Algorithm. Available online: https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.ensemble.IsolationForest.html."},{"key":"ref_30","unstructured":"Palacios, V., Widmann, M., and Cadili, R. (2025, February 04). The Pros and Cons of Statistical Sampling Methods Plus How to Find the Right Strategy. Available online: https:\/\/www.knime.com\/blog\/statistical-sampling."},{"key":"ref_31","unstructured":"(2025, February 04). H2O, Isolation Forest\u2014H2O implementation. Available online: https:\/\/docs.h2o.ai\/h2o\/latest-stable\/h2o-docs\/data-science\/if.html."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1016\/j.future.2015.01.001","article-title":"A survey of anomaly detection techniques in financial domain","volume":"55","author":"Ahmed","year":"2016","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1016\/j.is.2014.03.001","article-title":"Outlier detection in audit logs for application systems","volume":"44","author":"Kuna","year":"2014","journal-title":"Inf. Syst."},{"key":"ref_34","unstructured":"Molnar, C. (2025, February 04). Interpretable Machine Learning. 
Available online: https:\/\/christophm.github.io\/interpretable-ml-book."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/3\/177\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T16:43:05Z","timestamp":1760028185000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/3\/177"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,26]]},"references-count":34,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2025,3]]}},"alternative-id":["info16030177"],"URL":"https:\/\/doi.org\/10.3390\/info16030177","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,26]]}}}