{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T06:43:15Z","timestamp":1769150595664,"version":"3.49.0"},"reference-count":64,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2023,8,9]],"date-time":"2023-08-09T00:00:00Z","timestamp":1691539200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computation"],"abstract":"<jats:p>Cryptocurrencies are rapidly growing and are increasingly accepted by major commercial vendors. However, along with their rising popularity, they have also become the go-to currency for illicit activities driven by the anonymity they provide. Cryptocurrencies such as the one on the Ethereum blockchain provide a way for entities to hide their real-world identities behind pseudonyms, also known as addresses. Hence, the purpose of this work is to uncover the level of anonymity in Ethereum by investigating multiclass classification models for Externally Owned Accounts (EOAs) of Ethereum. The researchers aim to achieve this by examining patterns of transaction activity associated with these addresses. Using a labelled Ethereum address dataset from Kaggle and the Ethereum crypto dataset by Google BigQuery, an address profiles dataset was compiled based on the transaction history of the addresses. The compiled dataset, consisting of 4371 samples, was used to tune and evaluate the Random Forest, Gradient Boosting and XGBoost classifier for predicting the category of the addresses. The best-performing model found for the problem was the XGBoost classifier, achieving an accuracy of 75.3% with a macro-averaged F1-Score of 0.689. Following closely was the Random Forest classifier, with an accuracy of 73.7% and a macro-averaged F1-Score of 0.641. Gradient Boosting came in last with 73% accuracy and a macro-averaged F1-Score of 0.659. Owing to the data limitations in this study, the overall scores of the best model were weaker in comparison to similar research, with the exception of precision, which scored slightly higher. Nevertheless, the results proved that it is possible to predict the category of an Ethereum wallet address such as Phish\/Hack, Scamming, Exchange and ICO wallets based on its transaction behaviour.<\/jats:p>","DOI":"10.3390\/computation11080156","type":"journal-article","created":{"date-parts":[[2023,8,9]],"date-time":"2023-08-09T10:08:59Z","timestamp":1691575739000},"page":"156","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["CEAT: Categorising Ethereum Addresses\u2019 Transaction Behaviour with Ensemble Machine Learning Algorithms"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-9145-816X","authenticated-orcid":false,"given":"Tiffany Tien Nee","family":"Pragasam","sequence":"first","affiliation":[{"name":"Department of Computing, UOW Malaysia, KDU Penang University College, George Town 10400, Malaysia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9992-6094","authenticated-orcid":false,"given":"John Victor Joshua","family":"Thomas","sequence":"additional","affiliation":[{"name":"Department of Computing, UOW Malaysia, KDU Penang University College, George Town 10400, Malaysia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5712-7761","authenticated-orcid":false,"given":"Maria Anu","family":"Vensuslaus","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Vellore Institute of Technology, Chennai 600127, India"}]},{"given":"Subhashini","family":"Radhakrishnan","sequence":"additional","affiliation":[{"name":"Department of Information Technology, Sathyabama Institute of Science and Technology, Chennai 600119, India"}]}],"member":"1968","published-online":{"date-parts":[[2023,8,9]]},"reference":[{"key":"ref_1","unstructured":"(2023, March 12). CoinMarketCap Cryptocurrency Prices, Charts and Market Capitalizations. Available online: https:\/\/coinmarketcap.com\/."},{"key":"ref_2","unstructured":"Wu, M., McTighe, W., Wang, K., Seres, I.A., Bax, N., Puebla, M., Mendez, M., Carrone, F., De Mattey, T., and Demaestri, H.O. (2022). Tutela: An Open-Source Tool for Assessing User-Privacy on Ethereum and Tornado Cash. arXiv."},{"key":"ref_3","unstructured":"Grauer, K., Jardine, E., Leosz, E., and Updegrave, H. (2023). The 2023 Crypto Crime Report, Chainalysis."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"113318","DOI":"10.1016\/j.eswa.2020.113318","article-title":"Detection of Illicit Accounts over the Ethereum Blockchain","volume":"150","author":"Farrugia","year":"2020","journal-title":"Expert Syst. Appl."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Harlev, M.A., Sun Yin, H., Langenheldt, K.C., Mukkamala, R., and Vatrapu, R. (2018). Breaking Bad: De-Anonymising Entity Types on the Bitcoin Blockchain Using Supervised Machine Learning, Publisher Hawaii International Conference on System Sciences (HICSS).","DOI":"10.24251\/HICSS.2018.443"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Alarab, I., Prakoonwit, S., and Nacer, M.I. (2020, January 19\u201321). Comparative Analysis Using Supervised Learning Methods for Anti-Money Laundering in Bitcoin. Proceedings of the 2020 5th International Conference on Machine Learning Technologies, Beijing, China.","DOI":"10.1145\/3409073.3409078"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1007\/978-3-030-49785-9_7","article-title":"Detecting Malicious Accounts on the Ethereum Blockchain with Supervised Learning","volume":"Volume 12161","author":"Dolev","year":"2020","journal-title":"Cyber Security Cryptography and Machine Learning"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Lorenz, J., Silva, M.I., Apar\u00edcio, D., Ascens\u00e3o, J.T., and Bizarro, P. (2020, January 15\u201316). Machine Learning Methods to Detect Money Laundering in the Bitcoin Blockchain in the Presence of Label Scarcity. Proceedings of the First ACM International Conference on AI in Finance, New York, NY, USA.","DOI":"10.1145\/3383455.3422549"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Poursafaei, F., Hamad, G.B., and Zilic, Z. (2020, January 28\u201330). Detecting Malicious Ethereum Entities via Application of Machine Learning Classification. Proceedings of the 2020 2nd Conference on Blockchain Research & Applications for Innovative Networks and Services (BRAINS), Paris, France.","DOI":"10.1109\/BRAINS49436.2020.9223304"},{"key":"ref_10","unstructured":"Weber, M., Domeniconi, G., Chen, J., Weidele, D.K.I., Bellei, C., Robinson, T., and Leiserson, C.E. (2019). Anti-Money Laundering in Bitcoin: Experimenting with Graph Convolutional Networks for Financial Forensics. arXiv."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1080\/07421222.2018.1550550","article-title":"Regulating Cryptocurrencies: A Supervised Machine Learning Approach to De-Anonymizing the Bitcoin Blockchain","volume":"36","author":"Langenheldt","year":"2019","journal-title":"J. Manag. Inf. Syst."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Zola, F., Eguimendia, M., Bruse, J.L., and Orduna Urrutia, R. (2019, January 14\u201317). Cascading Machine Learning to Attack Bitcoin Anonymity. Proceedings of the 2019 IEEE International Conference on Blockchain (Blockchain), Seoul, Korea.","DOI":"10.1109\/Blockchain.2019.00011"},{"key":"ref_13","unstructured":"Hall, H. (2023, March 09). Labelled Ethereum Addresses|Kaggle. Available online: https:\/\/www.kaggle.com\/datasets\/hamishhall\/labelled-ethereum-addresses."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1613\/jair.953","article-title":"SMOTE: Synthetic Minority Over-Sampling Technique","volume":"16","author":"Chawla","year":"2002","journal-title":"J. Artif. Intell. Res."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Maimon, O., and Rokach, L. (2005). Data Mining and Knowledge Discovery Handbook, Springer.","DOI":"10.1007\/b107408"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"644","DOI":"10.1111\/j.1651-2227.2006.00178.x","article-title":"Understanding Diagnostic Tests 3: Receiver Operating Characteristic Curves","volume":"96","author":"Akobeng","year":"2007","journal-title":"Acta Paediatr."},{"key":"ref_17","unstructured":"Crosby, M., Pattanayak, P., Verma, S., and Kalyanaraman, V. (2016). BlockChain Technology: Beyond Bitcoin, Sutardja Center for Entrepreneurship and Technology."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zheng, Z., Xie, S., Dai, H., Chen, X., and Wang, H. (2017, January 25\u201330). An Overview of Blockchain Technology: Architecture, Consensus, and Future Trends. Proceedings of the 2017 IEEE International Congress on Big Data (BigData Congress), Honolulu, HI, USA.","DOI":"10.1109\/BigDataCongress.2017.85"},{"key":"ref_19","unstructured":"Niranjanamurthy, M., Nithya, B.N., and Jagannatha, S. (2023, July 24). Analysis of Blockchain Technology: Pros, Cons and SWOT|SpringerLink. Available online: https:\/\/link.springer.com\/article\/10.1007\/s10586-018-2387-5."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1016\/j.rser.2018.10.014","article-title":"Blockchain Technology in the Energy Sector: A Systematic Review of Challenges and Opportunities","volume":"100","author":"Andoni","year":"2019","journal-title":"Renew. Sustain. Energy Rev."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"117134","DOI":"10.1109\/ACCESS.2019.2936094","article-title":"A Survey of Blockchain from the Perspectives of Applications, Challenges, and Opportunities","volume":"7","author":"Monrat","year":"2019","journal-title":"IEEE Access"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"e5493","DOI":"10.1002\/cpe.5493","article-title":"On the Ethereum Blockchain Structure: A Complex Networks Theory Perspective","volume":"32","author":"Ferretti","year":"2020","journal-title":"Concurr. Comput. Pract. Exp."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"352","DOI":"10.1504\/IJWGS.2018.095647","article-title":"Blockchain Challenges and Opportunities: A Survey","volume":"14","author":"Zheng","year":"2018","journal-title":"Int. J. Web Grid Serv."},{"key":"ref_24","unstructured":"Nakamoto, S. (2023, July 24). Bitcoin: A Peer-to-Peer Electronic Cash System. Available online: https:\/\/bitcoin.org\/bitcoin.pdf."},{"key":"ref_25","unstructured":"Buterin, V. (2023, July 24). A Next Generation Smart Contract & Decentralized Application Platform. Available online: https:\/\/finpedia.vn\/wp-content\/uploads\/2022\/02\/Ethereum_white_paper-a_next_generation_smart_contract_and_decentralized_application_platform-vitalik-buterin.pdf."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Vuji\u010di\u0107, D., Jagodi\u0107, D., and Ran\u0111i\u0107, S. (2018, January 21\u201323). Blockchain Technology, Bitcoin, and Ethereum: A Brief Overview. Proceedings of the 2018 17th International Symposium INFOTEH-JAHORINA (INFOTEH), East Sarajevo, Bosnia and Herzegovina.","DOI":"10.1109\/INFOTEH.2018.8345547"},{"key":"ref_27","unstructured":"Wood, D.G. (2023, July 24). Ethereum: A Secure Decentralised Generalised Transaction Ledger. Available online: https:\/\/cryptodeep.ru\/doc\/paper.pdf."},{"key":"ref_28","first-page":"2737","article-title":"Modeling and Understanding Ethereum Transaction Records via a Complex Network Approach","volume":"67","author":"Lin","year":"2020","journal-title":"IEEE Trans. Circuits Syst. II Express Briefs"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1093\/nsr\/nwx106","article-title":"A Brief Introduction to Weakly Supervised Learning","volume":"5","author":"Zhou","year":"2018","journal-title":"Natl. Sci. Rev."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1002\/widm.8","article-title":"Classification and Regression Trees","volume":"1","author":"Loh","year":"2011","journal-title":"WIREs Data Min. Knowl. Discov."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1016\/j.ipm.2009.03.002","article-title":"A Systematic Analysis of Performance Measures for Classification Tasks","volume":"45","author":"Sokolova","year":"2009","journal-title":"Inf. Process. Manag."},{"key":"ref_32","unstructured":"Han, J., Pei, J., and Tong, H. (2022). Data Mining: Concepts and Techniques, Morgan Kaufmann."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"160","DOI":"10.1007\/s42979-021-00592-x","article-title":"Machine Learning: Algorithms, Real-World Applications and Research Directions","volume":"2","author":"Sarker","year":"2021","journal-title":"SN Comput. Sci."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1545","DOI":"10.1162\/neco.1997.9.7.1545","article-title":"Shape Quantization and Recognition with Randomized Trees","volume":"9","author":"Amit","year":"1997","journal-title":"Neural Comput."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random Forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Zhang, C., and Ma, Y. (2012). Ensemble Machine Learning: Methods and Applications, Springer.","DOI":"10.1007\/978-1-4419-9326-7"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1937","DOI":"10.1007\/s10462-020-09896-5","article-title":"A Comparative Analysis of Gradient Boosting Algorithms","volume":"54","year":"2021","journal-title":"Artif. Intell. Rev."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1016\/j.isprsjprs.2016.01.011","article-title":"Random Forest in Remote Sensing: A Review of Applications and Future Directions","volume":"114","author":"Belgiu","year":"2016","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_39","first-page":"18","article-title":"Classification and Regression by RandomForest","volume":"2","author":"Liaw","year":"2002","journal-title":"R News"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1189","DOI":"10.1214\/aos\/1013203451","article-title":"Greedy Function Approximation: A Gradient Boosting Machine","volume":"29","author":"Friedman","year":"2001","journal-title":"Ann. Stat."},{"key":"ref_41","first-page":"771","article-title":"A Short Introduction to Boosting","volume":"14","author":"Freund","year":"1999","journal-title":"J. Jpn. Soc. Artif. Intell."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Abraham, A., Dutta, P., Mandal, J.K., Bhattacharya, A., and Dutta, S. (2019). Proceedings of the Emerging Technologies in Data Mining and Information Security, Springer.","DOI":"10.1007\/978-981-13-1498-8"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, January 13\u201317). XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939785"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Chen, Z., Jiang, F., Cheng, Y., Gu, X., Liu, W., and Peng, J. (2018, January 15\u201317). XGBoost Classifier for DDoS Attack Detection and Analysis in SDN-Based Cloud. Proceedings of the 2018 IEEE International Conference on Big Data and Smart Computing (BigComp), Shanghai, China.","DOI":"10.1109\/BigComp.2018.00044"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"617","DOI":"10.1007\/978-3-030-51280-4_33","article-title":"Address Clustering Heuristics for Ethereum","volume":"Volume 12059","author":"Bonneau","year":"2020","journal-title":"Financial Cryptography and Data Security"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Bhargavi, M.S., Katti, S.M., Shilpa, M., Kulkarni, V.P., and Prasad, S. (2020, January 3\u20135). Transactional Data Analytics for Inferring Behavioural Traits in Ethereum Blockchain Network. Proceedings of the 2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP), Cluj-Napoca, Romania.","DOI":"10.1109\/ICCP51029.2020.9266176"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Ashfaq, T., Khalid, R., Yahaya, A.S., Aslam, S., Azar, A.T., Alsafari, S., and Hameed, I.A. (2022). A Machine Learning and Blockchain Based Efficient Fraud Detection Mechanism. Sensors, 22.","DOI":"10.3390\/s22197162"},{"key":"ref_48","first-page":"14","article-title":"Applying Supervised Machine Learning Algorithms for Fraud Detection in Anti-Money Laundering","volume":"1","author":"Raiter","year":"2021","journal-title":"J. Mod. Issues Bus. Res."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Krishnan, L.P., Vakilinia, I., Reddivari, S., and Ahuja, S. (2023). Scams and Solutions in Cryptocurrencies\u2014A Survey Analyzing Existing Machine Learning Models. Information, 14.","DOI":"10.3390\/info14030171"},{"key":"ref_50","unstructured":"Payette, J., Schwager, S., and Murphy, J. (2023, July 24). Characterizing the Ethereum Address Space. Available online: http:\/\/cs229.stanford.edu\/proj2017\/final-reports\/5244232.pdf."},{"key":"ref_51","unstructured":"Day, A., Medvedev, E., Risdal, M., and Katesit, T. (2023, July 26). Ethereum in BigQuery: A Public Dataset for Smart Contract Analytics. Available online: https:\/\/cloud.google.com\/blog\/products\/data-analytics\/ethereum-bigquery-public-dataset-smart-contract-analytics."},{"key":"ref_52","unstructured":"Johnson, N. (2023, March 09). Ethereum Analytics with BigQuery. Available online: https:\/\/mirror.xyz\/nick.eth\/INhEmxgxoyoa8kPZ3rjYNZXoyfGsReLgx42MdDvn4SM."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1214\/09-SS054","article-title":"A Survey of Cross-Validation Procedures for Model Selection","volume":"4","author":"Arlot","year":"2010","journal-title":"Statist. Surv."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Liu, L., and \u00d6zsu, M.T. (2009). Encyclopedia of Database Systems, Springer.","DOI":"10.1007\/978-0-387-39940-9"},{"key":"ref_55","unstructured":"Berrar, D. (2019). Encyclopedia of Bioinformatics and Computational Biology, Elsevier."},{"key":"ref_56","first-page":"281","article-title":"Random Search for Hyper-Parameter Optimization","volume":"13","author":"Bergstra","year":"2012","journal-title":"J. Mach. Learn. Res."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Dalianis, H. (2018). Clinical Text Mining, Springer International Publishing.","DOI":"10.1007\/978-3-319-78503-5"},{"key":"ref_58","unstructured":"Grandini, M., Bagli, E., and Visani, G. (2020). Metrics for Multi-Class Classification: An Overview. arXiv."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"928","DOI":"10.1161\/CIRCULATIONAHA.106.672402","article-title":"Use and Misuse of the Receiver Operating Characteristic Curve in Risk Prediction","volume":"115","author":"Cook","year":"2007","journal-title":"Circulation"},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Hosmer, D.W., Lemeshow, S., and Sturdivant, R.X. (2013). Applied Logistic Regression, John Wiley & Sons.","DOI":"10.1002\/9781118548387"},{"key":"ref_61","first-page":"517","article-title":"Enhancement of Cross Validation Using Hybrid Visual and Analytical Means with Shannon Function","volume":"Volume 835","author":"Kosheleva","year":"2020","journal-title":"Beyond Traditional Probabilistic Data Processing Techniques: Interval, Fuzzy etc. Methods and Their Applications"},{"key":"ref_62","first-page":"1089","article-title":"No Unbiased Estimator of the Variance of K-Fold Cross-Validation","volume":"5","author":"Bengio","year":"2004","journal-title":"J. Mach. Learn. Res."},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1016\/j.jeconom.2015.02.006","article-title":"Cross-Validation for Selecting a Model Selection Procedure","volume":"187","author":"Zhang","year":"2015","journal-title":"J. Econom."},{"key":"ref_64","doi-asserted-by":"crossref","first-page":"105845","DOI":"10.1016\/j.knosys.2020.105845","article-title":"LR-SMOTE\u2014An Improved Unbalanced Data Set Oversampling Based on K-Means and SVM","volume":"196","author":"Liang","year":"2020","journal-title":"Knowl.-Based Syst."}],"container-title":["Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-3197\/11\/8\/156\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:28:21Z","timestamp":1760128101000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-3197\/11\/8\/156"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,9]]},"references-count":64,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2023,8]]}},"alternative-id":["computation11080156"],"URL":"https:\/\/doi.org\/10.3390\/computation11080156","relation":{},"ISSN":["2079-3197"],"issn-type":[{"value":"2079-3197","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,8,9]]}}}