{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T13:27:32Z","timestamp":1776778052084,"version":"3.51.2"},"reference-count":53,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2025,12,27]],"date-time":"2025-12-27T00:00:00Z","timestamp":1766793600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>Wangiri fraud is a pervasive telecommunications scam that exploits missed calls to lure victims into dialing premium-rate numbers, resulting in significant financial losses for operators and consumers. This paper presents a comprehensive machine learning framework for detecting Wangiri fraud in highly imbalanced and unlabeled Call Detail Record (CDR) datasets. We introduce a novel unsupervised labeling approach using domain-driven heuristics, coupled with advanced feature engineering to capture temporal, geographic, and behavioral patterns indicative of fraud. To address severe class imbalance, we evaluate multiple sampling strategies like the Synthetic Minority Over-sampling Technique (SMOTE) and undersampling, and also compare the performance of Logistic Regression, Decision Trees, Random Forest, XGBoost, and Multi-Layer Perceptron (MLP). Our results demonstrate that ensemble methods, particularly Random Forest and XGBoost, achieve near-perfect accuracy (e.g., Receiver Operating Characteristic Area Under the Curve (ROC-AUC) &gt;0.99) on balanced data while maintaining interpretability. 
The proposed pipeline offers a scalable and practical solution for real-time fraud detection, providing telecom operators with an effective tool to mitigate Wangiri fraud risks.<\/jats:p>","DOI":"10.3390\/fi18010015","type":"journal-article","created":{"date-parts":[[2025,12,28]],"date-time":"2025-12-28T23:54:36Z","timestamp":1766966076000},"page":"15","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Wangiri Fraud Detection: A Comprehensive Approach to Unlabeled Telecom Data"],"prefix":"10.3390","volume":"18","author":[{"given":"Amirreza","family":"Balouchi","sequence":"first","affiliation":[{"name":"Department of Electrical & Computer Engineering, University of Victoria, Victoria, BC V8P 5C2, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0187-6867","authenticated-orcid":false,"given":"Meisam","family":"Abdollahi","sequence":"additional","affiliation":[{"name":"Department of Electrical & Computer Engineering, University of Victoria, Victoria, BC V8P 5C2, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ali","family":"Eskandarian","sequence":"additional","affiliation":[{"name":"Computer Engineering Department, Shiraz University, Shiraz 8433471946, Iran"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kianoush","family":"Karimi Pour Kerman","sequence":"additional","affiliation":[{"name":"Department of Telecommunication Engineering, Islamic Azad University, Tehran 1477893855, Iran"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Elham","family":"Majd","sequence":"additional","affiliation":[{"name":"Department of Electrical & Computer Engineering, University of Victoria, Victoria, BC V8P 5C2, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Neda","family":"Azouji","sequence":"additional","affiliation":[{"name":"Department of Electrical & Computer Engineering, 
University of Victoria, Victoria, BC V8P 5C2, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Amirali","family":"Baniasadi","sequence":"additional","affiliation":[{"name":"Department of Electrical & Computer Engineering, University of Victoria, Victoria, BC V8P 5C2, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,12,27]]},"reference":[{"key":"ref_1","first-page":"25","article-title":"A Review of AI-Driven Predictive Maintenance in Telecommunications","volume":"3","author":"Silitonga","year":"2024","journal-title":"Int. J. Inf. Syst. Innov. Technol."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"104","DOI":"10.1109\/MCOM.003.2300055","article-title":"TelOps: AI-driven operations and maintenance for telecommunication networks","volume":"62","author":"Yang","year":"2023","journal-title":"IEEE Commun. Mag."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Aziz, Z., and Bestak, R. (2024). Insight into anomaly detection and prediction and mobile network security enhancement leveraging k-means clustering on call detail records. Sensors, 24.","DOI":"10.3390\/s24061716"},{"key":"ref_4","first-page":"3507","article-title":"Analysis and Modeling of Mobile Phone Activity Data Using Interactive Cyber-Physical Social System","volume":"80","author":"Amin","year":"2024","journal-title":"Comput. Mater. Contin."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Abdollahi, M., Mashhadi, S., Sabzalizadeh, R., Mirzaei, A., Elahi, M., Baharloo, M., and Baniasadi, A. (2023, January 18\u201321). IODnet: Indoor\/Outdoor Telecommunication Signal Detection through Deep Neural Network. 
Proceedings of the 2023 IEEE 16th International Symposium on Embedded Multicore\/Many-Core Systems-on-Chip (MCSoC), Singapore.","DOI":"10.1109\/MCSoC60832.2023.00028"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1109\/JAS.2019.1911795","article-title":"Big data analytics in telecommunications: Literature review and architecture recommendations","volume":"7","author":"Zahid","year":"2019","journal-title":"IEEE\/CAA J. Autom. Sin."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Farabi, S. (2025). AI-Driven Predictive Maintenance Model for DWDM Systems to Enhance Fiber Network Uptime in Underserved US Regions. Preprints.","DOI":"10.20944\/preprints202506.1152.v1"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Singh, P. (2025). Streamlining telecom customer support with AI-enhanced IVR and chat. Preprints.","DOI":"10.20944\/preprints202504.1359.v1"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Chegini, M., Abdollahi, M., Baniasadi, A., and Patooghy, A. (2024, January 16\u201317). Tiny-RFNet: Enabling Modulation Classification of Radio Signals on Edge Systems. Proceedings of the 2024 5th CPSSI International Symposium on Cyber-Physical Systems (Applications and Theory) (CPSAT), Tehran, Iran.","DOI":"10.1109\/CPSAT64082.2024.10745466"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Abdollahi, M., Sabzalizadeh, R., Javadinia, S., Mashhadi, S., Mehrizi, S.S., and Baniasadi, A. (2023, January 26\u201328). Automatic modulation classification for nlos 5g signals with deep learning approaches. Proceedings of the 2023 10th International Conference on Wireless Networks and Mobile Communications (WINCOM), Istanbul, Turkiye.","DOI":"10.1109\/WINCOM59760.2023.10322928"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Mashhadi, S., Diyanat, A., Abdollahi, M., and Baniasadi, A. (2023, January 26\u201328). DSP: A Deep Neural Network Approach for Serving Cell Positioning in Mobile Networks. 
Proceedings of the 2023 10th International Conference on Wireless Networks and Mobile Communications (WINCOM), Istanbul, Turkiye.","DOI":"10.1109\/WINCOM59760.2023.10323029"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"41728","DOI":"10.1109\/ACCESS.2018.2859756","article-title":"Call detail records driven anomaly detection and traffic prediction in mobile cellular networks","volume":"6","author":"Sultan","year":"2018","journal-title":"IEEE Access"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Konstantoulas, I., Loi, I., Tsimas, D., Sgarbas, K., Gkamas, A., and Bouras, C. (2025). A Framework for User Traffic Prediction and Resource Allocation in 5G Networks. Appl. Sci., 15.","DOI":"10.20944\/preprints202506.1524.v1"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1007\/s10462-025-11108-x","article-title":"Artificial intelligence advances in anomaly detection for telecom networks","volume":"58","author":"Edozie","year":"2025","journal-title":"Artif. Intell. Rev."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"33","DOI":"10.4236\/jcc.2023.114003","article-title":"Artificial Intelligence Self-Organising (AI-SON) Frameworks for 5G-Enabled Networks: A Review","volume":"11","author":"Dake","year":"2023","journal-title":"J. Comput. Commun."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Chang, V., Hall, K., Xu, Q.A., Amao, F.O., Ganatra, M.A., and Benson, V. (2024). Prediction of customer churn behavior in the telecommunication industry using machine learning models. Algorithms, 17.","DOI":"10.3390\/a17060231"},{"key":"ref_17","first-page":"100212","article-title":"A pricing optimization modelling for assisted decision making in telecommunication product-service bundling","volume":"4","author":"Zakaria","year":"2024","journal-title":"Int. J. Inf. Manag. 
Data Insights"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"101","DOI":"10.36676\/j.sust.sol.v1.i4.42","article-title":"Machine learning models for customer segmentation in telecom","volume":"1","author":"Bagam","year":"2024","journal-title":"J. Sustain. Solut."},{"key":"ref_19","unstructured":"Panahi, P.H., Jalilvand, A.H., and Diyanat, A. (2024). Enhancing quality of experience in telecommunication networks: A review of frameworks and machine learning algorithms. arXiv."},{"key":"ref_20","first-page":"89012","article-title":"Wangiri Fraud Pattern Analysis and Machine-Learning-Based Detection","volume":"11","author":"Smith","year":"2023","journal-title":"IEEE Access"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Hu, X., Chen, H., Chen, H., Zhang, S., Liu, S., and Li, X. (2022, January 11\u201314). Telecom fraud detection via imbalanced graph learning. Proceedings of the 2022 IEEE 22nd International Conference on Communication Technology (ICCT), Nanjing, China.","DOI":"10.1109\/ICCT56141.2022.10073400"},{"key":"ref_22","unstructured":"Taylor, L. (2025, December 21). Telecoms Fraud Costing Operators $40 Billion Annually. Capacity Media. Available online: https:\/\/cfca.org\/telecommunications-fraud-increased-12-in-2023-equating-to-an-estimated-38-95-billion-lost-to-fraud\/."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Mawgoud, A.A., Abu-Talleb, A., and Tawfik, B.S. (2021). A Holistic Neural Networks Classification for Wangiri Fraud Detection in Telecommunications Regulatory Authorities. International Conference on Advanced Machine Learning Technologies and Applications, Springer.","DOI":"10.1007\/978-3-030-69717-4_19"},{"key":"ref_24","unstructured":"Author1, A., and Author2, B. (2019). Detection of Wangiri Telecommunication Fraud Using Ensemble Learning. J. Electron. Eng. Inf. Technol., 10, 123\u2013135."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Mishra, N., and Shivaji, G.B. 
(2025, January 4\u20136). Data Mining for Fraud Detection in Telecommunications: Detecting Anomalous Behaviors in Real-Time. Proceedings of the 2025 International Conference on Automation and Computation (AUTOCOM), Dehradun, India.","DOI":"10.1109\/AUTOCOM64127.2025.10957085"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"\u015eahin, Y.G., and Duman, E. (2011). Detecting credit card fraud by decision trees and support vector machines. Proceedings of the International MultiConference of Engineers and Computer Scientists 2011, International Association of Engineers.","DOI":"10.1109\/INISTA.2011.5946108"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Arafat, M., Qusef, A., and Sammour, G. (2019, January 9\u201311). Detection of wangiri telecommunication fraud using ensemble learning. Proceedings of the 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology (JEEIT), Amman, Jordan.","DOI":"10.1109\/JEEIT.2019.8717528"},{"key":"ref_28","unstructured":"Birhanu, M. (2024). Near Real-time SIM-box Fraud Detection in Telecommunication System Using Machine Learning Approach in the Case of Ethio Telecom. [Ph.D. Thesis, St. Mary\u2019s University]."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Krasi\u0107, I., and \u010celar, S. (2022, January 22\u201324). Telecom fraud detection with machine learning on imbalanced dataset. 
Proceedings of the 2022 International Conference on Software, Telecommunications and Computer Networks (SoftCOM), Split, Croatia.","DOI":"10.23919\/SoftCOM55329.2022.9911518"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"6794","DOI":"10.1109\/JIOT.2022.3174143","article-title":"Wangiri fraud: Pattern analysis and machine-learning-based detection","volume":"10","author":"Ravi","year":"2022","journal-title":"IEEE Internet Things J."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Liang, F.Y., Li, F.P., Xu, R.H., Cheng, W., Deng, S.X., Yang, Z.R., and Wang, C.D. (2023, January 1\u20134). Telecom fraud detection based on feature binning and autoencoder. Proceedings of the 2023 IEEE International Conference on Data Mining (ICDM), Shanghai, China.","DOI":"10.1109\/ICDM58522.2023.00046"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1016\/j.dcan.2023.03.002","article-title":"NFA: A neural factorization autoencoder based online telephony fraud detection","volume":"10","author":"Wahid","year":"2024","journal-title":"Digit. Commun. Netw."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Cazzolato, M., Vijayakumar, S., Lee, M.C., Vajiac, C., Park, N., Fidalgo, P., Traina, A.J., and Faloutsos, C. (2023, January 21\u201325). Callmine: Fraud detection and visualization of million-scale call graphs. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, Birmingham, UK.","DOI":"10.1145\/3583780.3614662"},{"key":"ref_34","unstructured":"Singh, G., Singh, P., and Singh, M. (2025). Advanced Real-Time Fraud Detection Using RAG-Based LLMs. arXiv."},{"key":"ref_35","first-page":"29487","article-title":"Combating Phone Scams with LLM-based Detection: Where Do We Stand? (Student Abstract)","volume":"39","author":"Shen","year":"2025","journal-title":"AAAI Conf. Artif. Intell."},{"key":"ref_36","unstructured":"Shen, Z., Yan, S., Zhang, Y., Luo, X., Ngai, G., and Fu, E.Y. 
(May, January 26). It Warned Me Just at the Right Moment: Exploring LLM-based Real-time Detection of Phone Scams. Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, Yokohama, Japan."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Kirkos, E., Boskou, G., Chatzipetrou, E., Tiakas, E., and Spathis, C. (2025, December 21). Exploring the Boundaries of Financial Statement Fraud Detection with Large Language Models. Available online: https:\/\/www.researchgate.net\/publication\/381676241_Exploring_the_Boundaries_of_Financial_Statement_Fraud_Detection_with_Large_Language_Models.","DOI":"10.2139\/ssrn.4895081"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Korkanti, S. (2024, January 23\u201325). Enhancing Financial Fraud Detection Using LLMs and Advanced Data Analytics. Proceedings of the 2024 2nd International Conference on Self Sustainable Artificial Intelligence Systems (ICSSAS), Erode, India.","DOI":"10.1109\/ICSSAS64001.2024.10760895"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"279","DOI":"10.37284\/eajit.7.1.2212","article-title":"Assessing Mobile Network Fraud Threats and Prevention Strategies in Kenya","volume":"7","author":"Mundia","year":"2024","journal-title":"East Afr. J. Inf. Technol."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Muchilwa, L., and Musuva, P. (2023, January 26\u201328). Coeus: A Cyber Threat Intelligence Sharing Platform for Fraudulent Phone Numbers. Proceedings of the 2023 IST-Africa Conference (IST-Africa), Istanbul, Turkiye.","DOI":"10.23919\/IST-Africa60249.2023.10187737"},{"key":"ref_41","first-page":"365","article-title":"Regulatory Recommendations for Fraud Problem in The Turkish Telecommunication Sector","volume":"14","author":"Bayram","year":"2023","journal-title":"AJIT-e Acad. J. Inf. Technol."},{"key":"ref_42","unstructured":"SahaIDak, V., LySenko, Y., and Senkov, Y. (2022). 
Telecom fraud and it\u2019s impact on mobile carrier business. Communication, 17\u201320."},{"key":"ref_43","unstructured":"Lundberg, S.M., and Lee, S.I. (2017). A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst., 30."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1111\/j.2517-6161.1958.tb00292.x","article-title":"The regression analysis of binary sequences","volume":"20","author":"Cox","year":"1958","journal-title":"J. R. Stat. Soc. Ser. B (Methodol.)"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Hosmer, D.W., and Lemeshow, S. (2000). Applied Logistic Regression, John Wiley & Sons. [2nd ed.].","DOI":"10.1002\/0471722146"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Kleinbaum, D.G., and Klein, M. (2010). Logistic Regression: A Self-Learning Text, Springer. [3rd ed.].","DOI":"10.1007\/978-1-4419-1742-3"},{"key":"ref_47","unstructured":"Breiman, L., Friedman, J.H., Olshen, R.A., and Stone, C.J. (1984). Classification and Regression Trees, Wadsworth International Group."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random Forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn."},{"key":"ref_49","unstructured":"Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning, MIT Press. Available online: http:\/\/www.deeplearningbook.org."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, January 13\u201317). Xgboost: A scalable tree boosting system. Proceedings of the 22nd Acm Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939785"},{"key":"ref_51","first-page":"37","article-title":"Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness and Correlation","volume":"2","author":"Powers","year":"2011","journal-title":"J. Mach. 
Learn. Technol."},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1016\/j.patrec.2005.10.010","article-title":"An Introduction to ROC Analysis","volume":"27","author":"Fawcett","year":"2006","journal-title":"Pattern Recognit. Lett."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Davis, J., and Goadrich, M. (2006, January 25\u201329). The Relationship Between Precision-Recall and ROC Curves. Proceedings of the 23rd International Conference on Machine Learning (ICML), Pittsburgh, PA, USA.","DOI":"10.1145\/1143844.1143874"}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/18\/1\/15\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,31]],"date-time":"2025-12-31T14:13:25Z","timestamp":1767190405000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/18\/1\/15"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,27]]},"references-count":53,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2026,1]]}},"alternative-id":["fi18010015"],"URL":"https:\/\/doi.org\/10.3390\/fi18010015","relation":{},"ISSN":["1999-5903"],"issn-type":[{"value":"1999-5903","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12,27]]}}}