{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,2]],"date-time":"2026-06-02T18:24:48Z","timestamp":1780424688996,"version":"3.54.1"},"reference-count":47,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2021,10,20]],"date-time":"2021-10-20T00:00:00Z","timestamp":1634688000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,10,20]],"date-time":"2021-10-20T00:00:00Z","timestamp":1634688000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"University of Dublin, Trinity College"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Cent Eur J Oper Res"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The Na\u00efve Bayes is a tractable and efficient approach for statistical classification. In general classification problems, the consequences of misclassifications may be rather different in different classes, making it crucial to control misclassification rates in the most critical and, in many real-world problems, minority cases, possibly at the expense of higher misclassification rates in less problematic classes. One traditional approach to address this problem consists of assigning misclassification costs to the different classes and applying the Bayes rule, by optimizing a loss function. However, fixing precise values for such misclassification costs may be problematic in real-world applications. In this paper we address the issue of misclassification for the Na\u00efve Bayes classifier. Instead of requesting precise values of misclassification costs, threshold values are used for different performance measures. 
This is done by adding constraints to the optimization problem underlying the estimation process. Our findings show that, under a reasonable computational cost, indeed, the performance measures under consideration achieve the desired levels yielding a user-friendly constrained classification procedure.<\/jats:p>","DOI":"10.1007\/s10100-021-00782-1","type":"journal-article","created":{"date-parts":[[2021,10,20]],"date-time":"2021-10-20T21:46:57Z","timestamp":1634766417000},"page":"1403-1425","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Constrained Na\u00efve Bayes with application to unbalanced data classification"],"prefix":"10.1007","volume":"30","author":[{"given":"Rafael","family":"Blanquero","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Emilio","family":"Carrizosa","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Pepa","family":"Ram\u00edrez-Cobo","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1097-391X","authenticated-orcid":false,"given":"M. Remedios","family":"Sillero-Denamiel","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2021,10,20]]},"reference":[{"key":"782_CR1","first-page":"255","volume":"17","author":"J Alcal\u00e1-Fdez","year":"2011","unstructured":"Alcal\u00e1-Fdez J, Fern\u00e1ndez A, Luengo J, Derrac J, Garc\u00eda S, S\u00e1nchez L, Herrera F (2011) KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework. 
J Mult-Valued Logic Soft Comput 17:255\u2013287","journal-title":"J Mult-Valued Logic Soft Comput"},{"issue":"3","key":"782_CR2","doi-asserted-by":"publisher","first-page":"307","DOI":"10.1007\/s00500-008-0323-y","volume":"13","author":"J Alcal\u00e1-Fdez","year":"2009","unstructured":"Alcal\u00e1-Fdez J, S\u00e1nchez L, Garc\u00eda S, del Jesus MJ, Ventura S, Garrell JM, Otero J, Romero C, Bacardit J, Rivas VM, Fern\u00e1ndez JC, Herrera F (2009) KEEL: A Software Tool to Assess Evolutionary Algorithms for Data Mining Problems. Soft Computing 13(3):307\u2013318","journal-title":"Soft Comput"},{"issue":"3","key":"782_CR3","doi-asserted-by":"publisher","first-page":"663","DOI":"10.1007\/s11634-018-0330-5","volume":"13","author":"S Ben\u00edtez-Pe\u00f1a","year":"2019","unstructured":"Ben\u00edtez-Pe\u00f1a S, Blanquero R, Carrizosa E, Ram\u00edrez-Cobo P (2019) On support vector machines under a multiple-cost scenario. Advances in Data Analysis and Classification 13(3):663\u2013682","journal-title":"Adv Data Anal Classif"},{"issue":"3","key":"782_CR4","doi-asserted-by":"publisher","first-page":"2072","DOI":"10.1016\/j.eswa.2010.07.146","volume":"38","author":"P Bermejo","year":"2011","unstructured":"Bermejo P, G\u00e1mez JA, Puerta JM (2011) Improving the performance of Naive Bayes multinomial in e-mail foldering by introducing distribution-based balance of datasets. Expert Systems with Applications 38(3):2072\u20132080","journal-title":"Expert Syst Appl"},{"issue":"2","key":"782_CR5","doi-asserted-by":"publisher","first-page":"177","DOI":"10.1080\/10556780701577730","volume":"23","author":"E Birgin","year":"2008","unstructured":"Birgin E, Mart\u00ednez J (2008) Improving ultimate convergence of an augmented Lagrangian method. 
Optim Methods Softw 23(2):177\u2013195","journal-title":"Optim Methods Softw"},{"key":"782_CR6","doi-asserted-by":"publisher","first-page":"105281","DOI":"10.1016\/j.cor.2021.105281","volume":"132","author":"R Blanquero","year":"2021","unstructured":"Blanquero R, Carrizosa E, Molero-R\u00edo C, Romero Morales D (2021) Optimal randomized classification trees. Computers & Operations Research 132:105281","journal-title":"Comput Oper Res"},{"key":"782_CR7","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1007\/s11634-020-00389-5","volume":"15","author":"R Blanquero","year":"2021","unstructured":"Blanquero R, Carrizosa E, Ram\u00edrez-Cobo P, Sillero-Denamiel MR (2021) A cost-sensitive constrained lasso. Advances in Data Analysis and Classification 15:121\u2013158","journal-title":"Adv Data Anal Classif"},{"key":"782_CR8","first-page":"1659","volume":"8","author":"M Boull\u00e9","year":"2007","unstructured":"Boull\u00e9 M (2007) Compression-based Averaging of Selective Naive Bayes Classifiers. Journal of Machine Learning Research 8:1659\u20131685","journal-title":"J Mach Learn Res"},{"key":"782_CR9","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1007\/BFb0026682","volume-title":"Machine learning: ECML-98","author":"JP Bradford","year":"1998","unstructured":"Bradford JP, Kunz C, Kohavi R, Brunk C, Brodley CE (1998) Pruning decision trees with misclassification costs. In: N\u00e9dellec C, Rouveirol C (eds) Machine learning: ECML-98. Springer, Berlin Heidelberg, Berlin, Heidelberg, pp 131\u2013136"},{"key":"782_CR10","doi-asserted-by":"publisher","first-page":"452","DOI":"10.1007\/978-3-642-40319-4_39","volume-title":"Trends and applications in knowledge discovery and data mining","author":"P Cao","year":"2013","unstructured":"Cao P, Zhao D, Za\u00efane OR (2013) A PSO-based cost-sensitive neural network for imbalanced data classification. 
In: Li J, Cao L, Wang C, Tan KC, Liu B, Pei J, Tseng VS (eds) Trends and applications in knowledge discovery and data mining. Springer, Berlin Heidelberg, Berlin, Heidelberg, pp 452\u2013463"},{"key":"782_CR11","doi-asserted-by":"publisher","first-page":"950","DOI":"10.1016\/j.dam.2007.05.060","volume":"156","author":"E Carrizosa","year":"2008","unstructured":"Carrizosa E, Mart\u00edn-Barrag\u00e1n B, Romero Morales D (2008) Multi-group support vector machines with measurement costs: A biobjective approach. Discrete Applied Mathematics 156:950\u2013966","journal-title":"Discrete Appl Math"},{"issue":"1","key":"782_CR12","doi-asserted-by":"publisher","first-page":"150","DOI":"10.1016\/j.cor.2012.05.015","volume":"40","author":"E Carrizosa","year":"2013","unstructured":"Carrizosa E, Romero Morales D (2013) Supervised classification and mathematical optimization. Computers and Operations Research 40(1):150\u2013165","journal-title":"Comput Oper Res"},{"issue":"3","key":"782_CR13","doi-asserted-by":"publisher","first-page":"1293","DOI":"10.1016\/j.eswa.2010.06.076","volume":"38","author":"B Chandra","year":"2011","unstructured":"Chandra B, Gupta M (2011) Robust approach for estimating probabilities in Na\u00efve-Bayes classifier for gene expression data. Expert Systems with Applications 38(3):1293\u20131298","journal-title":"Expert Syst Appl"},{"key":"782_CR14","doi-asserted-by":"publisher","first-page":"39","DOI":"10.1016\/j.neunet.2015.06.005","volume":"70","author":"S Datta","year":"2015","unstructured":"Datta S, Das S (2015) Near\u2013Bayesian support vector machines for imbalanced data classification with equal or unequal misclassification costs. Neural Netw 70:39\u201352","journal-title":"Neural Netw"},{"key":"782_CR15","first-page":"1","volume":"7","author":"J Dem\u0161ar","year":"2006","unstructured":"Dem\u0161ar J (2006) Statistical Comparisons of Classifiers over Multiple Data Sets. 
Journal of Machine Learning Research 7:1\u201330","journal-title":"J Mach Learn Res"},{"issue":"2\u20133","key":"782_CR16","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1023\/A:1007413511361","volume":"29","author":"P Domingos","year":"1997","unstructured":"Domingos P, Pazzani M (1997) On the optimality of the simple Bayesian classifier under zero-one loss. Mach Learn 29(2\u20133):103\u2013130","journal-title":"Mach Learn"},{"key":"782_CR17","doi-asserted-by":"publisher","first-page":"303","DOI":"10.1007\/978-3-540-74553-2_28","volume-title":"Data warehousing and knowledge discovery","author":"A Freitas","year":"2007","unstructured":"Freitas A, Costa-Pereira A, Brazdil P (2007) Cost-sensitive decision trees applied to medical data. In: Song IY, Eder J, Nguyen TM (eds) Data Warehousing and Knowledge Discovery. Springer, Berlin Heidelberg, pp 303\u2013312"},{"issue":"3","key":"782_CR18","doi-asserted-by":"publisher","first-page":"445","DOI":"10.1080\/07350015.2014.903086","volume":"32","author":"G Guan","year":"2014","unstructured":"Guan G, Guo J, Wang H (2014) Varying Na\u00efve Bayes Models With Applications to Classification of Chinese Text Documents. Journal of Business & Economic Statistics 32(3):445\u2013456","journal-title":"J Bus Econ Stat"},{"issue":"3","key":"782_CR19","first-page":"385","volume":"69","author":"DJ Hand","year":"2001","unstructured":"Hand DJ, Yu K (2001) Idiot\u2019s Bayes - Not So Stupid After All? International Statistical Review 69(3):385\u2013398","journal-title":"Int Stat Rev"},{"key":"782_CR20","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-21606-5","volume-title":"The elements of statistical learning","author":"T Hastie","year":"2001","unstructured":"Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning. 
Springer, NY"},{"key":"782_CR21","doi-asserted-by":"publisher","DOI":"10.1002\/9781118646106","volume-title":"Imbalanced learning: foundations, algorithms, and applications","author":"H He","year":"2013","unstructured":"He H, Yunqian M (2013) Imbalanced Learning: Foundations, Algorithms, and Applications. Wiley, Hoboken"},{"key":"782_CR22","unstructured":"Hogg RV, McKean J, Craig AT (2005) Introduction to Mathematical Statistics. Pearson Education"},{"issue":"Supplement C","key":"782_CR23","doi-asserted-by":"publisher","first-page":"346","DOI":"10.1016\/j.ins.2015.09.037","volume":"329","author":"L Jiang","year":"2016","unstructured":"Jiang L, Wang S, Li C, Zhang L (2016) Structure extended multinomial naive Bayes. Information Sciences 329(Supplement C):346\u2013356","journal-title":"Inf Sci"},{"issue":"Supplement C","key":"782_CR24","doi-asserted-by":"publisher","first-page":"92","DOI":"10.1016\/j.ins.2016.11.014","volume":"381","author":"W Lee","year":"2017","unstructured":"Lee W, Jun CH, Lee JS (2017) Instance categorization by support vector machines to adjust weights in adaboost for imbalanced data classification. Information Sciences 381(Supplement C):92\u2013103","journal-title":"Inf Sci"},{"key":"782_CR25","doi-asserted-by":"publisher","unstructured":"Leevy JL, Khoshgoftaar TM, Bauder RA, Seliya N (2018) A survey on addressing high-class imbalance in big data. J Big Data. https:\/\/doi.org\/10.1186\/s40537-018-0151-6","DOI":"10.1186\/s40537-018-0151-6"},{"key":"782_CR26","unstructured":"Lichman, M (2013) UCI machine learning repository. http:\/\/archive.ics.uci.edu\/ml"},{"key":"782_CR27","doi-asserted-by":"crossref","unstructured":"Ling CX, Yang Q, Wang J, Zhang S (2004) Decision trees with minimal costs. In: Proceedings of the twenty-first international conference on machine learning, ICML \u201904, p.\u00a069. 
New York, NY, USA","DOI":"10.1145\/1015330.1015369"},{"issue":"4","key":"782_CR28","first-page":"572","volume":"4","author":"N Mehra","year":"2013","unstructured":"Mehra N, Gupta S (2013) Survey on multiclass classification methods. International Journal of Computer Science and Information Technologies 4(4):572\u2013576","journal-title":"Int J Comput Sci Inf Technol"},{"issue":"1","key":"782_CR29","doi-asserted-by":"publisher","first-page":"2","DOI":"10.1109\/TSE.2007.256941","volume":"33","author":"T Menzies","year":"2007","unstructured":"Menzies T, Greenwald J, Frank A (2007) Data Mining Static Code Attributes to Learn Defect Predictors. IEEE Transactions on Software Engineering 33(1):2\u201313","journal-title":"IEEE Trans Softw Eng"},{"issue":"509","key":"782_CR30","doi-asserted-by":"publisher","first-page":"393","DOI":"10.1080\/01621459.2014.908778","volume":"110","author":"J Minnier","year":"2015","unstructured":"Minnier J, Yuan M, Liu JS, Cai T (2015) Risk Classification With an Adaptive Naive Bayes Kernel Machine Model. Journal of the American Statistical Association 110(509):393\u2013404","journal-title":"J Am Stat Assoc"},{"issue":"3","key":"782_CR31","first-page":"0975","volume":"24","author":"G Parthiban","year":"2011","unstructured":"Parthiban G, Rajesh A, Srivatsa SK (2011) Diagnosis of Heart Disease for Diabetic Patients using Naive Bayes Method. International Journal of Computer Applications 24(3):0975\u20138887","journal-title":"Int J Comput Appl"},{"issue":"Supplement C","key":"782_CR32","doi-asserted-by":"publisher","first-page":"347","DOI":"10.1016\/j.ins.2014.04.046","volume":"288","author":"L Peng","year":"2014","unstructured":"Peng L, Zhang H, Yang B, Chen Y (2014) A new approach for imbalanced data classification based on data gravitation. 
Inf Sci 288(Supplement C):347\u2013373","journal-title":"Inf Sci"},{"key":"782_CR33","doi-asserted-by":"publisher","first-page":"247","DOI":"10.1007\/s10115-014-0794-3","volume":"45","author":"RC Prati","year":"2015","unstructured":"Prati RC, Batista GE, Silva DF (2015) Class imbalance revisited: a new experimental setup to assess the performance of treatment methods. Knowledge and Information Systems 45:247\u2013270","journal-title":"Knowl Inf Syst"},{"issue":"5","key":"782_CR34","doi-asserted-by":"publisher","first-page":"582","DOI":"10.1017\/S0269888913000039","volume":"29","author":"A Romei","year":"2014","unstructured":"Romei A, Ruggieri S (2014) A multidisciplinary survey on discrimination analysis. The Knowledge Engineering Review 29(5):582\u2013638","journal-title":"Knowl Eng Rev"},{"issue":"1","key":"782_CR35","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1093\/bioinformatics\/btq619","volume":"27","author":"GL Rosen","year":"2010","unstructured":"Rosen GL, Reichenberger ER, Rosenfeld AM (2010) NBC: the Na\u00efve Bayes Classification tool webserver for taxonomic classification of metagenomic reads. Bioinformatics 27(1):127\u2013129","journal-title":"Bioinformatics"},{"issue":"4","key":"782_CR36","doi-asserted-by":"publisher","first-page":"427","DOI":"10.1016\/j.ipm.2009.03.002","volume":"45","author":"M Sokolova","year":"2009","unstructured":"Sokolova M, Lapalme G (2009) A systematic analysis of performance measures for classification tasks. Information Processing & Management 45(4):427\u2013437","journal-title":"Inf Process Manag"},{"issue":"12","key":"782_CR37","doi-asserted-by":"publisher","first-page":"3358","DOI":"10.1016\/j.patcog.2007.04.009","volume":"40","author":"Y Sun","year":"2007","unstructured":"Sun Y, Kamel MS, Wong AK, Wang Y (2007) Cost-sensitive boosting for classification of imbalanced data. 
Pattern Recognition 40(12):3358\u20133378","journal-title":"Pattern Recognit"},{"key":"782_CR38","doi-asserted-by":"publisher","first-page":"687","DOI":"10.1142\/S0218001409007326","volume":"23","author":"Y Sun","year":"2009","unstructured":"Sun Y, Wong AK, Kamel MS (2009) Classification of imbalanced data: A review. International Journal of Pattern Recognition and Artificial Intelligence 23:687\u2013719","journal-title":"Int J Pattern Recognit Artif Intell"},{"issue":"2","key":"782_CR39","doi-asserted-by":"publisher","first-page":"278","DOI":"10.1016\/j.datak.2008.10.005","volume":"68","author":"B Turhan","year":"2009","unstructured":"Turhan B, Bener A (2009) Analysis of Naive Bayes\u2019 assumptions on software fault data: An empirical study. Data & Knowledge Engineering 68(2):278\u2013290","journal-title":"Data Knowl Eng"},{"issue":"4","key":"782_CR40","doi-asserted-by":"publisher","first-page":"370","DOI":"10.1136\/amiajnl-2011-000101","volume":"18","author":"W Wei","year":"2011","unstructured":"Wei W, Visweswaran S, Cooper GF (2011) The application of naive Bayes model averaging to predict Alzheimer\u2019s disease from genome-wide data. Journal of the American Medical Informatics Association 18(4):370\u2013375","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"782_CR41","doi-asserted-by":"publisher","first-page":"112","DOI":"10.1080\/00401706.2013.810174","volume":"56","author":"DM Witten","year":"2014","unstructured":"Witten DM, Shojaie A, Zhang F (2014) The Cluster Elastic Net for High-Dimensional Regression With Unknown Variable Grouping. Technometrics 56(1):112\u2013122","journal-title":"Technometrics"},{"key":"782_CR42","doi-asserted-by":"crossref","unstructured":"Wolfson J, Bandyopadhyay S, Elidrisi M, Vazquez-Benitez G, Vock DM, Musgrove D, Adomavicius G, Johnson PE, O\u2019Connor PJ (2015) A Naive Bayes machine learning approach to risk prediction using censored, time-to-event data. 
Statistics in Medicine 34(21):2941\u20132957","DOI":"10.1002\/sim.6526"},{"issue":"3","key":"782_CR43","doi-asserted-by":"publisher","first-page":"1487","DOI":"10.1016\/j.eswa.2014.09.019","volume":"42","author":"J Wu","year":"2015","unstructured":"Wu J, Pan S, Zhu X, Cai Z, Zhang P, Zhang C (2015) Self-adaptive attribute weighting for Naive Bayes classification. Expert Systems with Applications 42(3):1487\u20131502","journal-title":"Expert Syst Appl"},{"issue":"1","key":"782_CR44","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/S0169-7439(00)00122-2","volume":"56","author":"QS Xu","year":"2001","unstructured":"Xu QS, Liang YZ (2001) Monte Carlo cross validation. Chemom Intell Lab Syst 56(1):1\u201311","journal-title":"Chemom Intell Lab Syst"},{"issue":"5","key":"782_CR45","doi-asserted-by":"publisher","first-page":"577","DOI":"10.1016\/j.ins.2004.12.006","volume":"176","author":"RR Yager","year":"2006","unstructured":"Yager RR (2006) An extension of the naive Bayesian classifier. Information Sciences 176(5):577\u2013588","journal-title":"Inf Sci"},{"key":"782_CR46","doi-asserted-by":"crossref","unstructured":"Yang Y, Liu X (1999). A re-examination of text categorization methods. In: Proceedings of the 22nd annual international ACM SIGIR conference on research and development in information retrieval (SIGIR), pp. 42\u201349. New York, NY, USA","DOI":"10.1145\/312624.312647"},{"issue":"1","key":"782_CR47","doi-asserted-by":"publisher","first-page":"63","DOI":"10.1109\/TKDE.2006.17","volume":"18","author":"Zhi-Hua Zhou","year":"2006","unstructured":"Zhou Zhi-Hua, Liu Xu-Ying (2006) Training cost-sensitive neural networks with methods addressing the class imbalance problem. 
IEEE Trans Knowl Data Eng 18(1):63\u201377","journal-title":"IEEE Trans Knowl Data Eng"}],"container-title":["Central European Journal of Operations Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10100-021-00782-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10100-021-00782-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10100-021-00782-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,10,17]],"date-time":"2022-10-17T17:16:15Z","timestamp":1666026975000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10100-021-00782-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,20]]},"references-count":47,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["782"],"URL":"https:\/\/doi.org\/10.1007\/s10100-021-00782-1","relation":{},"ISSN":["1435-246X","1613-9178"],"issn-type":[{"value":"1435-246X","type":"print"},{"value":"1613-9178","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,10,20]]},"assertion":[{"value":"15 September 2021","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 October 2021","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}