{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:33:58Z","timestamp":1760243638562,"version":"build-2065373602"},"reference-count":58,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2013,11,14]],"date-time":"2013-11-14T00:00:00Z","timestamp":1384387200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/3.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>For evaluating the classification model of an information system, a proper measure is usually needed to determine if the model is appropriate for dealing with the specific domain task. Though many performance measures have been proposed, few measures were specially defined for multi-class problems, which tend to be more complicated than two-class problems, especially in addressing the issue of class discrimination power. Confusion entropy was proposed for evaluating classifiers in the multi-class case. Nevertheless, it makes no use of the probabilities of samples classified into different classes. In this paper, we propose to calculate confusion entropy based on a probabilistic confusion matrix. Besides inheriting the merit of measuring if a classifier can classify with high accuracy and class discrimination power, probabilistic confusion entropy also tends to measure if samples are classified into true classes and separated from others with high probabilities. 
Analysis and experimental comparisons show the feasibility of the simply improved measure and demonstrate that the measure does not stand or fall over the classifiers on different datasets in comparison with the compared measures.<\/jats:p>","DOI":"10.3390\/e15114969","type":"journal-article","created":{"date-parts":[[2013,11,14]],"date-time":"2013-11-14T11:24:25Z","timestamp":1384428265000},"page":"4969-4992","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Probabilistic Confusion Entropy for Evaluating Classifiers"],"prefix":"10.3390","volume":"15","author":[{"given":"Xiao-Ning","family":"Wang","sequence":"first","affiliation":[{"name":"College of Information Technical Science, Nankai University, Tianjin 300071, China"}]},{"given":"Jin-Mao","family":"Wei","sequence":"additional","affiliation":[{"name":"College of Information Technical Science, Nankai University, Tianjin 300071, China"}]},{"given":"Han","family":"Jin","sequence":"additional","affiliation":[{"name":"College of Information Technical Science, Nankai University, Tianjin 300071, China"}]},{"given":"Gang","family":"Yu","sequence":"additional","affiliation":[{"name":"College of Information Technical Science, Nankai University, Tianjin 300071, China"}]},{"given":"Hai-Wei","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of Information Technical Science, Nankai University, Tianjin 300071, China"}]}],"member":"1968","published-online":{"date-parts":[[2013,11,14]]},"reference":[{"key":"ref_1","unstructured":"Egan, J.P. (1975). Series in Cognition and Perception, Academic Press."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Spackman, K.A. (1989, January 26\u201327). Signal Detection Theory: Valuable Tools for Evaluating Inductive Learning. 
Proceedings of the 6th International Workshop on Machine Learning, Ithaca, NY, USA.","DOI":"10.1016\/B978-1-55860-036-2.50047-3"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"3799","DOI":"10.1016\/j.eswa.2009.11.040","article-title":"A novel measure for evaluating classifiers","volume":"37","author":"Wei","year":"2010","journal-title":"Expert Syst. Appl."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pone.0041882","article-title":"A comparison of MCC and CEN error measures in multi-class prediction","volume":"7","author":"Jurman","year":"2012","journal-title":"PLoS One"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1016\/j.patrec.2008.08.010","article-title":"An experimental comparison of performance measures for classification","volume":"30","author":"Ferri","year":"2009","journal-title":"Pattern Recognit. Lett."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1007\/BF01617722","article-title":"Classification-algorithm evaluation: five performance measures based on confusion matrices","volume":"11","author":"Forbes","year":"1995","journal-title":"J. Clin. Monit."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1002\/1097-0142(1950)3:1<32::AID-CNCR2820030106>3.0.CO;2-3","article-title":"Index for rating diagnostic tests","volume":"3","author":"Youden","year":"1950","journal-title":"Cancer"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1002\/bimj.200410135","article-title":"Estimation of the Youden Index and its associated cutoff point","volume":"47","author":"Fluss","year":"2005","journal-title":"Biom. J."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"360","DOI":"10.7326\/0003-4819-122-5-199503010-00007","article-title":"Noninvasive carotid artery testing: A meta-analytic review","volume":"122","author":"Blakeley","year":"1995","journal-title":"Ann. Intern. 
Med."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1177\/001316446002000104","article-title":"A coefficient of agreement for nominal scales","volume":"20","author":"Cohen","year":"1960","journal-title":"Educ. Psychol. Meas."},{"key":"ref_11","unstructured":"Fleiss, J.L. (1981). Statistical Methods for Rates and Proportions, Wiley. [2nd ed.]."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Sindhwani, V., Bhattacharya, P., and Rakshit, S. (2001, January 5\u20137). Information Theoretic Feature Crediting in Multiclass Support Vector Machines. Proceedings of the First SIAM International Conference on Data Mining (ICDM\u201901), Chicago, IL, USA.","DOI":"10.1137\/1.9781611972719.16"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1093\/bioinformatics\/bti033","article-title":"A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis","volume":"21","author":"Statnikov","year":"2005","journal-title":"Bioinformatics"},{"key":"ref_14","unstructured":"Wickens, T.D. (1989). Multiway Contingency Table Analysis for the Social Sciences, Lawrence Erlbaum."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1016\/j.compbiolchem.2004.09.006","article-title":"Comparing two K-category assignments by a K-category correlation coefficient","volume":"28","author":"Gorodkin","year":"2004","journal-title":"Comput. Biol. Chem."},{"key":"ref_16","unstructured":"Yates, R.B., and Neto, B.R. (1999). Modern Information Retrieval, Addison Wesley."},{"key":"ref_17","unstructured":"Mitchell, T.M. (1997). Machine Learning, McGraw-Hill."},{"key":"ref_18","unstructured":"Lebanon, G., and Lafferty, J. (2002, January 8\u201312). Cranking: Combining Rankings Using Conditional Probability Models on Permutations. 
Proceedings of the 19th International Conference (ICML 2002), Sydney, Australia."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1111\/j.2517-6161.1952.tb00104.x","article-title":"Rational decisions","volume":"14","author":"Good","year":"1952","journal-title":"J. R. Stat. Soc. Series B"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1093\/bjps\/19.2.123","article-title":"Corroboration, explanation, evolving probability, simplicity, and a sharpened razor","volume":"19","author":"Good","year":"1968","journal-title":"Br. J. Philos. Sci."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1007\/s10994-007-5011-0","article-title":"PAV and the ROC convex hull","volume":"68","author":"Fawcett","year":"2007","journal-title":"Mach. Learn."},{"key":"ref_22","unstructured":"Flach, P., and Matsubara, E.T. (2007, January 17\u201321). A Simple Lexicographic Ranker and Probability Estimator. Proceedings of the 18th European Conference on Machine Learning, Warsaw, Poland."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Caruana, R., and Niculescu-Mizil, A. (2004, January 22\u201325). Data Mining in Metric Space: An Empirical Analysis of Supervised Learning Performance Criteria. Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD\u201904), Seattle, WA, USA.","DOI":"10.1145\/1014052.1014063"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1016\/j.patrec.2005.10.010","article-title":"An introduction to ROC analysis","volume":"27","author":"Fawcett","year":"2006","journal-title":"Pattern Recognit. 
Lett."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1145","DOI":"10.1016\/S0031-3203(96)00142-2","article-title":"The use of the area under the ROC curve in the evaluation of machine learning algorithms","volume":"30","author":"Bradley","year":"1997","journal-title":"Pattern Recognit."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"810","DOI":"10.1109\/TPAMI.2007.70740","article-title":"Efficient multiclass ROC approximation by decomposition via confusion matrix perturbation analysis","volume":"30","author":"Landgrebe","year":"2008","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1007\/s10994-009-5119-5","article-title":"Measuring classifier performance: A coherent alternative to the area under the ROC curve","volume":"77","author":"Hand","year":"2009","journal-title":"Mach. Learn."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1502","DOI":"10.1002\/sim.3859","article-title":"Evaluating diagnostic tests: the area under the ROC curve and the balance of errors","volume":"29","author":"Hand","year":"2010","journal-title":"Stat. Med."},{"key":"ref_29","unstructured":"Flach, P., Hern\u00e1ndez-Orallo, J., and Ferri, C. (July, January 28). A Coherent interpretation of AUC as a measure of aggregated classification performance. Proceedings of the 28th International Conference on Machine Learning (ICML\u201911), Bellevue, WA, USA."},{"key":"ref_30","unstructured":"Fawcett, T. (December, January 29). Using Rule Sets to Maximize ROC Performance. Proceedings of the 2001 IEEE International Conference on Data Mining (ICDM-01), San Jose, CA, USA."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1023\/A:1010920819831","article-title":"A simple generalisation of the area under the ROC curve for multiple class classification problems","volume":"45","author":"Hand","year":"2001","journal-title":"Mach. 
Learn."},{"key":"ref_32","unstructured":"Wu, S., Flach, P., and Ferri, C. (2007, January 17\u201321). An Improved Model Selection Heuristic for AUC. Proceedings of the 18th European Conference on Machine Learning, Warsaw, Poland."},{"key":"ref_33","unstructured":"Ferri, C., Flach, P., Hern\u00e1ndez-Orallo, J., and Senad, A. (2005, January 7\u201311). Modifying ROC Curves to Incorporate Predicted Probabilities. Proceedings of the ICML 2005 Workshop on ROC Analysis in Machine Learning, Bonn, Germany."},{"key":"ref_34","unstructured":"Provost, F., and Fawcett, T. (1997, January 14\u201317). Analysis and Visualization of Classifier Performance: Comparison Under Imprecise Class and Cost Distributions. Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, Newport Beach, CA, USA."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1023\/A:1007601015854","article-title":"Robust classification for imprecise environments","volume":"42","author":"Provost","year":"2001","journal-title":"Mach. Learn."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1007\/s10994-006-8199-5","article-title":"Cost curves: An improved method for visualizing classifier performance","volume":"65","author":"Drummond","year":"2006","journal-title":"Mach. Learn."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"548","DOI":"10.1007\/978-3-540-87479-9_54","article-title":"A projection-based framework for classifier performance evaluation","volume":"5211","author":"Japkowicz","year":"2008","journal-title":"Lecture Notes Comput. Sci."},{"key":"ref_38","unstructured":"Hern\u00e1ndez-Orallo, J., Flach, P., and Ferri, C. (July, January 28). Brier Curves: A New Cost-Based Visualisation of Classifier Performance. 
Proceedings of the 28th International Conference on Machine Learning, Bellevue, WA, USA."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1016\/S0001-2998(78)80014-2","article-title":"Basic principles of ROC analysis","volume":"8","author":"Metz","year":"1978","journal-title":"Semin. Nucl. Med."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1148\/radiology.148.3.6878708","article-title":"The meaning and use of the area under a receiver operating characteristics (ROC) curve","volume":"148","author":"Hanley","year":"1983","journal-title":"Radiology"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"1285","DOI":"10.1126\/science.3287615","article-title":"Measuring the accuracy of diagnostic systems","volume":"240","author":"Swets","year":"1988","journal-title":"Science"},{"key":"ref_42","unstructured":"Drummond, C. (2006, January 16\u201320). Machine Learning as an Experimental Science (Revisited). Proceedings of the AAAI06-Workshop on Evaluation Methods for Machine Learning, Boston, MA, USA."},{"key":"ref_43","unstructured":"Hand, D.J. (1997). Construction and Assessment of Classification Rules, John Wiley and Sons."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1111\/1467-9574.00153","article-title":"Measuring diagnostic accuracy of statistical prediction rules","volume":"55","author":"Hand","year":"2001","journal-title":"Stat. Neerlandica"},{"key":"ref_45","first-page":"1015","article-title":"Beyond accuracy, F-score and ROC: A family of discriminant measures for performance evaluation","volume":"4304","author":"Sokolova","year":"2006","journal-title":"AI 2006, Lecture Notes Comput. Sci."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1016\/j.ipm.2009.03.002","article-title":"A systematic analysis of performance measures for classification tasks","volume":"45","author":"Sokolova","year":"2009","journal-title":"Inf. Process. 
Manag."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Bishop, C.M. (1995). Neural Networks for Pattern Recognition, Oxford University Press.","DOI":"10.1093\/oso\/9780198538493.001.0001"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Harrell, F.E. (2001). Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis, Springer.","DOI":"10.1007\/978-1-4757-3462-1"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Ripley, B.D. (1996). Pattern Recognition and Neural Networks, Cambridge University Press.","DOI":"10.1017\/CBO9780511812651"},{"key":"ref_50","first-page":"2813","article-title":"A unified view of performance metrics: Translating threshold choice into expected classification loss","volume":"13","author":"Flach","year":"2012","journal-title":"J. Mach. Learn. Res."},{"key":"ref_51","first-page":"461","article-title":"Evaluation Measures of the Classification Performance of Imbalanced datasets","volume":"Volume 51","author":"Gu","year":"2009","journal-title":"Computational Intelligence and Intelligent Systems, Proceedings of the 4th International Symposium on Intelligence Computation and Applications (ISICA 2009)"},{"key":"ref_52","first-page":"21","article-title":"Evaluation of performance measures for classifiers comparison","volume":"6","author":"Labatut","year":"2011","journal-title":"Ubiquitous Comput. Commun. J."},{"key":"ref_53","unstructured":"Huang, J., and Ling, C. (2007, January 9\u201312). Constructing New and Better Evaluation Measures for Machine Learning. Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI\u20192007), Hyderabad, India."},{"key":"ref_54","unstructured":"Witten, I.H., and Frank, E. (2005). Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann. 
[2nd ed.]."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1007\/s10994-008-5070-x","article-title":"A critical analysis of variants of the AUC","volume":"72","author":"Vanderlooy","year":"2008","journal-title":"Mach. Learn."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1175\/1520-0493(1950)078<0001:VOFEIT>2.0.CO;2","article-title":"Verification of forecasts expressed in terms of probability","volume":"78","author":"Brier","year":"1950","journal-title":"Mon. Weather Rev."},{"key":"ref_57","unstructured":"UCI Machine Learning. Available online: http:\/\/mlearn\/MLRepository.html."},{"key":"ref_58","unstructured":"Quinlan, J.R. (1993). C4.5: Programs for Machine Learning, Morgan Kaufmann."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/15\/11\/4969\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T21:50:38Z","timestamp":1760219438000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/15\/11\/4969"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,11,14]]},"references-count":58,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2013,11]]}},"alternative-id":["e15114969"],"URL":"https:\/\/doi.org\/10.3390\/e15114969","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2013,11,14]]}}}