{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,21]],"date-time":"2026-01-21T13:12:29Z","timestamp":1769001149353,"version":"3.49.0"},"reference-count":57,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2020,11,28]],"date-time":"2020-11-28T00:00:00Z","timestamp":1606521600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>One of the significant challenges in machine learning is the classification of imbalanced data. In many situations, standard classifiers cannot learn how to distinguish minority class examples from the others. Since many real problems are unbalanced, this problem has become very relevant and deeply studied today. This paper presents a new preprocessing method based on Delaunay tessellation and the preprocessing algorithm SMOTE (Synthetic Minority Over-sampling Technique), which we call DTO-SMOTE (Delaunay Tessellation Oversampling SMOTE). DTO-SMOTE constructs a mesh of simplices (in this paper, we use tetrahedrons) for creating synthetic examples. We compare results with five preprocessing algorithms (GEOMETRIC-SMOTE, SVM-SMOTE, SMOTE-BORDERLINE-1, SMOTE-BORDERLINE-2, and SMOTE), eight classification algorithms, and 61 binary-class data sets. For some classifiers, DTO-SMOTE has higher performance than others in terms of Area Under the ROC curve (AUC), Geometric Mean (GEO), and Generalized Index of Balanced Accuracy (IBA).<\/jats:p>","DOI":"10.3390\/info11120557","type":"journal-article","created":{"date-parts":[[2020,11,28]],"date-time":"2020-11-28T03:51:16Z","timestamp":1606535476000},"page":"557","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["DTO-SMOTE: Delaunay Tessellation Oversampling for Imbalanced Data Sets"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8801-4321","authenticated-orcid":false,"given":"Alexandre M.","family":"de Carvalho","sequence":"first","affiliation":[{"name":"Center of Mathematics, Computing and Cognition (CMCC), Federal University of ABC (UFABC), Santo Andr\u00e9, SP 09210-580, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8597-4987","authenticated-orcid":false,"given":"Ronaldo C.","family":"Prati","sequence":"additional","affiliation":[{"name":"Center of Mathematics, Computing and Cognition (CMCC), Federal University of ABC (UFABC), Santo Andr\u00e9, SP 09210-580, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,11,28]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1007\/s10115-014-0794-3","article-title":"Class imbalance revisited: A new experimental setup to assess the performance of treatment methods","volume":"45","author":"Prati","year":"2015","journal-title":"Knowl. Inf. Syst."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"714","DOI":"10.1016\/j.asoc.2015.08.060","article-title":"Evolutionary undersampling boosting for imbalanced classification of breast cancer malignancy","volume":"38","author":"Krawczyk","year":"2016","journal-title":"Appl. Soft Comput. J."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1016\/j.envsoft.2017.11.024","article-title":"Imbalanced classification techniques for monsoon forecasting based on a new climatic time series","volume":"106","author":"Troncoso","year":"2018","journal-title":"Environ. Model. Softw."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Yan, B., and Han, G. (2018). LA-GRU: Building Combined Intrusion Detection Model Based on Imbalanced Learning and Gated Recurrent Unit Neural Network. Secur. Commun. Netw., 2018.","DOI":"10.1155\/2018\/6026878"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"2147","DOI":"10.3233\/JIFS-179880","article-title":"Irony detection in Twitter with imbalanced class distributions","volume":"39","author":"Prati","year":"2020","journal-title":"J. Intell. Fuzzy Syst."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1007\/s10614-020-09975-3","article-title":"Predicting Extreme Financial Risks on Imbalanced Dataset: A Combined Kernel FCM and Kernel SMOTE Based SVM Classifier","volume":"56","author":"Huang","year":"2020","journal-title":"Comput. Econ."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"4161","DOI":"10.1007\/s10462-019-09789-2","article-title":"Predicting firm failure in the software industry","volume":"53","author":"Roumani","year":"2020","journal-title":"Artif. Intell. Rev."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1016\/j.patcog.2016.08.023","article-title":"KRNN: K Rare-class Nearest Neighbour classification","volume":"62","author":"Zhang","year":"2017","journal-title":"Pattern Recognit."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Sawangarreerak, S., and Thanathamathee, P. (2020). Random Forest with Sampling Techniques for Handling Imbalanced Prediction of University Student Depression. Information, 11.","DOI":"10.3390\/info11110519"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Oksuz, K., Cam, B.C., Kalkan, S., and Akbas, E. (2020). Imbalance problems in object detection: A review. IEEE Trans. Pattern Anal. Mach. Intell.","DOI":"10.1109\/TPAMI.2020.2981890"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Fiorentini, N., and Losa, M. (2020). Handling imbalanced data in road crash severity prediction by machine learning algorithms. Infrastructures, 5.","DOI":"10.3390\/infrastructures5070061"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1550147720916404","DOI":"10.1177\/1550147720916404","article-title":"A review on classification of imbalanced data for wireless sensor networks","volume":"16","author":"Patel","year":"2020","journal-title":"Int. J. Distrib. Sens. Netw."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1613\/jair.953","article-title":"SMOTE: Synthetic Minority Over-sampling Technique Nitesh","volume":"16","author":"Chawla","year":"2002","journal-title":"J. Artif. Intell. Res."},{"key":"ref_14","first-page":"L29","article-title":"Continuous fields and discrete samples: Reconstruction through Delaunay tessellations","volume":"363","author":"Schaap","year":"2000","journal-title":"Astron. Astrophys."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Carvalho, A.M.D., and Prati, R.C. (2018, January 8\u201313). Improving kNN classification under Unbalanced Data. A New Geometric Oversampling Approach. Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil.","DOI":"10.1109\/IJCNN.2018.8489411"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Fern\u00e1ndez, A., Garc\u00eda, S., Galar, M., Prati, R.C., Krawczyk, B., and Herrera, F. (2018). Learning from Imbalanced Data Sets, Springer International Publishing.","DOI":"10.1007\/978-3-319-98074-4"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Japkowicz, N., and Shah, M. (2011). Evaluating Learning Algorithms: A Classification Perspective, Cambridge University Press.","DOI":"10.1017\/CBO9780511921803"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1016\/j.knosys.2011.06.013","article-title":"On the effectiveness of preprocessing methods when dealing with different levels of class imbalance","volume":"25","author":"Mollineda","year":"2012","journal-title":"Knowl.-Based Syst."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1601","DOI":"10.1109\/TKDE.2011.59","article-title":"A Survey on Graphical Methods for Classification Predictive Performance Evaluation","volume":"23","author":"Prati","year":"2011","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"220","DOI":"10.1016\/j.eswa.2016.12.035","article-title":"Learning from class-imbalanced data: Review of methods and applications","volume":"73","author":"Haixiang","year":"2017","journal-title":"Expert Syst. Appl."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"863","DOI":"10.1613\/jair.1.11192","article-title":"SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary","volume":"61","author":"Fernandez","year":"2018","journal-title":"J. Artif. Intell. Res."},{"key":"ref_22","first-page":"1","article-title":"Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning","volume":"18","author":"Nogueira","year":"2017","journal-title":"J. Mach. Learn. Res."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"275","DOI":"10.1162\/evco.2009.17.3.275","article-title":"Evolutionary undersampling for classification with imbalanced datasets: Proposals and taxonomy","volume":"17","author":"Herrera","year":"2009","journal-title":"Evol. Comput."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1016\/j.neucom.2012.08.018","article-title":"ACOSampling: An ant colony optimization-based undersampling method for classifying imbalanced DNA microarray data","volume":"101","author":"Yu","year":"2013","journal-title":"Neurocomputing"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1109\/MCI.2006.329691","article-title":"Ant colony optimization","volume":"1","author":"Dorigo","year":"2006","journal-title":"IEEE Comput. Intell. Mag."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1949","DOI":"10.1016\/j.patcog.2009.01.027","article-title":"Using pre & post-processing methods to improve binding site predictions","volume":"42","author":"Sun","year":"2009","journal-title":"Pattern Recognit."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1145\/1007730.1007735","article-title":"A study of the behavior of several methods for balancing machine learning training data","volume":"6","author":"Batista","year":"2004","journal-title":"ACM Sigkdd Explor. Newsl."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1007\/s10115-011-0465-6","article-title":"SMOTE-RSB *: A hybrid preprocessing approach based on oversampling and undersampling for high imbalanced data-sets using SMOTE and rough sets theory","volume":"33","author":"Ramentol","year":"2012","journal-title":"Knowl. Inf. Syst."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"184","DOI":"10.1016\/j.ins.2014.08.051","article-title":"SMOTE-IPF: Addressing the noisy and borderline examples problem in imbalanced classification by a re-sampling method with filtering","volume":"291","author":"Luengo","year":"2015","journal-title":"Inf. Sci."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Guo, H., Zhou, J., and Wu, C.A. (2018). Imbalanced learning based on data-partition and SMOTE. Information, 9.","DOI":"10.3390\/info9090238"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Fern\u00e1ndez, A., Garc\u00eda, S., Galar, M., Prati, R.C., Krawczyk, B., and Herrera, F. (2018). Cost-Sensitive Learning. Learning from Imbalanced Data Sets, Springer.","DOI":"10.1007\/978-3-319-98074-4"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Fern\u00e1ndez, A., Garc\u00eda, S., Galar, M., Prati, R.C., Krawczyk, B., and Herrera, F. (2018). Ensemble Learning. Learning from Imbalanced Data Sets, Springer.","DOI":"10.1007\/978-3-319-98074-4"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"463","DOI":"10.1109\/TSMCC.2011.2161285","article-title":"A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches","volume":"42","author":"Galar","year":"2012","journal-title":"Syst. Man Cybern. Part C Appl. Rev. IEEE Trans."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/BF00058655","article-title":"Bagging predictors","volume":"24","author":"Leo","year":"1996","journal-title":"Mach. Learn."},{"key":"ref_35","unstructured":"Huang, D.S., Zhang, X.P., and Huang, G.B. (2005). Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning. Advances in Intelligent Computing, Springer."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1504\/IJKESDP.2011.039875","article-title":"Borderline over-sampling for imbalanced data classification","volume":"3","author":"Nguyen","year":"2011","journal-title":"Int. J. Knowl. Eng. Soft Data Paradig."},{"key":"ref_37","unstructured":"He, H., Bai, Y., Garcia, E.A., and Li, S. (2018, January 8\u201313). ADASYN: Adaptive synthetic sampling approach for imbalanced learning. Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), Rio de Janeiro, Brazil."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1016\/j.ins.2019.06.007","article-title":"Geometric SMOTE a geometrically enhanced drop-in replacement for SMOTE","volume":"501","author":"Douzas","year":"2019","journal-title":"Inf. Sci."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"605","DOI":"10.1007\/s10994-017-5670-4","article-title":"Manifold-based synthetic oversampling with manifold conformance estimation","volume":"107","author":"Bellinger","year":"2018","journal-title":"Mach. Learn."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1016\/j.ins.2019.07.070","article-title":"A Comprehensive Analysis of Synthetic Minority Oversampling TEchnique (SMOTE) for Handling Class Imbalance","volume":"505","author":"Elreedy","year":"2019","journal-title":"Inf. Sci."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1016\/j.gmod.2012.10.007","article-title":"Feature-preserving surface mesh smoothing via suboptimal Delaunay triangulation","volume":"75","author":"Gao","year":"2013","journal-title":"Graph. Model."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"6803","DOI":"10.1109\/TGRS.2016.2591066","article-title":"Jointly Informative and Manifold Structure Representative Sampling Based Active Learning for Remote Sensing Image Classification","volume":"54","author":"Samat","year":"2016","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Kolluri, R., Shewchuk, J.R., and O\u2019Brien, J.F. (2004, January 8\u201310). Spectral surface reconstruction from noisy point clouds. Proceedings of the 2004 Eurographics\/ACM SIGGRAPH Symposium on Geometry Processing, Nice, France.","DOI":"10.1145\/1057432.1057434"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1016\/j.comgeo.2005.09.005","article-title":"Generating realistic terrains with higher-order Delaunay triangulations","volume":"36","year":"2007","journal-title":"Comput. Geom."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Anderson, S.J., Karumanchi, S.B., and Iagnemma, K. (2012, January 3\u20137). Constraint-based planning and control for safe, semi-autonomous operation of vehicles. Proceedings of the 2012 IEEE Intelligent Vehicles Symposium (IV), Madrid, Spain.","DOI":"10.1109\/IVS.2012.6232153"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"469","DOI":"10.1093\/comnet\/cny036","article-title":"The simplex geometry of graphs","volume":"7","author":"Devriendt","year":"2019","journal-title":"J. Complex Netw."},{"key":"ref_47","unstructured":"Jones, E., Oliphant, T., and Peterson, P. (2020, November 05). SciPy: Open Source Scientific Tools for Python. Available online: https:\/\/www.scipy.org\/."},{"key":"ref_48","unstructured":"Maur, P. (2002). Delaunay Triangulation in 3D. [Ph.D. Thesis, University of West Bohemia in Pilsen]."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1109\/MCI.2018.2866730","article-title":"Cross-validation for imbalanced datasets: Avoiding overoptimistic and overfitting approaches [research frontier]","volume":"13","author":"Santos","year":"2018","journal-title":"IEEE Comput. Intell. Mag."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/s10994-006-6226-1","article-title":"Extremely randomized trees","volume":"63","author":"Geurts","year":"2006","journal-title":"Mach. Learn."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1007\/s11634-017-0286-x","article-title":"Local generalized quadratic distance metrics: Application to the k-nearest neighbors classifier","volume":"12","author":"Ferrie","year":"2018","journal-title":"Adv. Data Anal. Classif."},{"key":"ref_52","first-page":"1","article-title":"Classification and regression trees","volume":"1","author":"Breiman","year":"2017","journal-title":"Classif. Regres. Trees"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1162\/neco.1994.6.1.147","article-title":"Fast Exact Multiplication by the Hessian","volume":"6","author":"Pearlmutter","year":"1994","journal-title":"Neural Comput."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1016\/j.knosys.2014.02.007","article-title":"Robust boosting classification models with local sets of probability distributions","volume":"61","author":"Utkin","year":"2014","journal-title":"Knowl.-Based Syst."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Shen, H. (2018, January 18\u201322). Towards a Mathematical Understanding of the Difficulty in Learning with Feedforward Neural Networks. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00091"},{"key":"ref_56","first-page":"1","article-title":"LIBSVM: A Library for Support Vector Machines","volume":"307","author":"Chang","year":"2008","journal-title":"ACM Trans. Intell. Syst. Technol. (TIST)"},{"key":"ref_57","first-page":"615","article-title":"Text chunking based on a generalization of winnow","volume":"2","author":"Zhang","year":"2002","journal-title":"J. Mach. Learn. Res."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/11\/12\/557\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:38:46Z","timestamp":1760179126000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/11\/12\/557"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,11,28]]},"references-count":57,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2020,12]]}},"alternative-id":["info11120557"],"URL":"https:\/\/doi.org\/10.3390\/info11120557","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,11,28]]}}}