{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:45:42Z","timestamp":1760244342390,"version":"build-2065373602"},"reference-count":77,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2022,11,17]],"date-time":"2022-11-17T00:00:00Z","timestamp":1668643200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>The family of \u03b1-divergences including the oriented forward and reverse Kullback\u2013Leibler divergences is often used in signal processing, pattern recognition, and machine learning, among others. Choosing a suitable \u03b1-divergence can either be done beforehand according to some prior knowledge of the application domains or directly learned from data sets. In this work, we generalize the \u03b1-divergences using a pair of strictly comparable weighted means. Our generalization allows us to obtain in the limit case \u03b1\u21921 the 1-divergence, which provides a generalization of the forward Kullback\u2013Leibler divergence, and in the limit case \u03b1\u21920, the 0-divergence, which corresponds to a generalization of the reverse Kullback\u2013Leibler divergence. We then analyze the condition for a pair of weighted quasi-arithmetic means to be strictly comparable and describe the family of quasi-arithmetic \u03b1-divergences including its subfamily of power homogeneous \u03b1-divergences. In particular, we study the generalized quasi-arithmetic 1-divergences and 0-divergences and show that these counterpart generalizations of the oriented Kullback\u2013Leibler divergences can be rewritten as equivalent conformal Bregman divergences using strictly monotone embeddings. Finally, we discuss the applications of these novel divergences to k-means clustering by studying the robustness property of the centroids.<\/jats:p>","DOI":"10.3390\/a15110435","type":"journal-article","created":{"date-parts":[[2022,11,18]],"date-time":"2022-11-18T04:08:40Z","timestamp":1668744520000},"page":"435","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Generalizing the Alpha-Divergences and the Oriented Kullback\u2013Leibler Divergences with Quasi-Arithmetic Means"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5728-0726","authenticated-orcid":false,"given":"Frank","family":"Nielsen","sequence":"first","affiliation":[{"name":"Sony Computer Science Laboratories, Tokyo 141-0022, Japan"}]}],"member":"1968","published-online":{"date-parts":[[2022,11,17]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Keener, R.W. (2011). Theoretical Statistics: Topics for a Core Course, Springer.","DOI":"10.1007\/978-0-387-93839-4"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Basu, A., Shioya, H., and Park, C. (2011). Statistical Inference: The Minimum Distance Approach, CRC Press.","DOI":"10.1201\/b10956"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"621","DOI":"10.1016\/j.sigpro.2012.09.003","article-title":"Divergence measures for statistical data processing \u2014 An annotated bibliography","volume":"93","author":"Basseville","year":"2013","journal-title":"Signal Process."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Pardo, L. (2018). Statistical Inference Based on Divergence Measures, CRC Press.","DOI":"10.1201\/9781420034813"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Oller, J.M. (1989). Some geometrical aspects of data analysis and statistics. Statistical Data Analysis and Inference, Elsevier.","DOI":"10.1016\/B978-0-444-88029-1.50009-5"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Amari, S. (2016). Information Geometry and Its Applications, Applied Mathematical Sciences; Springer.","DOI":"10.1007\/978-4-431-55978-8"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"631","DOI":"10.32917\/hmj\/1206128508","article-title":"Geometry of minimum contrast","volume":"22","author":"Eguchi","year":"1992","journal-title":"Hiroshima Math. J."},{"key":"ref_8","unstructured":"Cover, T.M., and Thomas, J.A. (2012). Elements of Information Theory, John Wiley & Sons."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1532","DOI":"10.3390\/e12061532","article-title":"Families of alpha-beta-and gamma-divergences: Flexible and robust measures of similarities","volume":"12","author":"Cichocki","year":"2010","journal-title":"Entropy"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"4925","DOI":"10.1109\/TIT.2009.2030485","article-title":"\u03b1-Divergence is Unique, belonging to Both f-divergence and Bregman Divergence Classes","volume":"55","author":"Amari","year":"2009","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1162\/08997660460734047","article-title":"Divergence function, duality, and convex analysis","volume":"16","author":"Zhang","year":"2004","journal-title":"Neural Comput."},{"key":"ref_12","unstructured":"Hero, A.O., Ma, B., Michel, O., and Gorman, J. (2001). Alpha-Divergence for Classification, Indexing and Retrieval, Communication and Signal Processing Laboratory, University of Michigan. Technical Report CSPL-328."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1442","DOI":"10.1109\/TPAMI.2014.2366144","article-title":"Learning the information divergence","volume":"37","author":"Dikmen","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.artmed.2008.05.001","article-title":"On \u03b1-divergence based nonnegative matrix factorization for clustering cancer gene expression data","volume":"44","author":"Liu","year":"2008","journal-title":"Artif. Intell. Med."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"210","DOI":"10.1515\/crll.1909.136.210","article-title":"Neue Begr\u00fcndung der Theorie Quadratischer Formen von Unendlichvielen Ver\u00e4nderlichen","volume":"1909","author":"Hellinger","year":"1909","journal-title":"J. F\u00fcr Die Reine Und Angew. Math."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1111\/j.2517-6161.1966.tb00626.x","article-title":"A general class of coefficients of divergence of one distribution from another","volume":"28","author":"Ali","year":"1966","journal-title":"J. R. Stat. Soc. Ser. B"},{"key":"ref_17","first-page":"229","article-title":"Information-type measures of difference of probability distributions and indirect observation","volume":"2","year":"1967","journal-title":"Stud. Sci. Math. Hung."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"3884","DOI":"10.1109\/TSP.2010.2047340","article-title":"A study on invariance of f-divergence and its application to speech recognition","volume":"58","author":"Qiao","year":"2010","journal-title":"IEEE Trans. Signal Process."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1007\/s41884-021-00063-5","article-title":"Transport information Bregman divergences","volume":"4","author":"Li","year":"2021","journal-title":"Inf. Geom."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Li, W. (2021, January 21\u201323). Transport information Hessian distances. Proceedings of the International Conference on Geometric Science of Information (GSI), Paris, France.","DOI":"10.1007\/978-3-030-80209-7_87"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1007\/s41884-021-00059-1","article-title":"Transport information geometry: Riemannian calculus on probability simplex","volume":"5","author":"Li","year":"2022","journal-title":"Inf. Geom."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"2780","DOI":"10.1162\/neco.2007.19.10.2780","article-title":"Integration of stochastic models by minimizing \u03b1-divergence","volume":"19","author":"Amari","year":"2007","journal-title":"Neural Comput."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1433","DOI":"10.1016\/j.patrec.2008.02.016","article-title":"Non-negative matrix factorization with \u03b1-divergence","volume":"29","author":"Cichocki","year":"2008","journal-title":"Pattern Recognit. Lett."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1016\/j.mathsocsci.2018.02.003","article-title":"Studying malapportionment using \u03b1-divergence","volume":"93","author":"Wada","year":"2018","journal-title":"Math. Soc. Sci."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"5352","DOI":"10.1109\/TIT.2019.2915245","article-title":"Harmonic Bayesian prediction under \u03b1-divergence","volume":"65","author":"Maruyama","year":"2019","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"5729","DOI":"10.1109\/TIP.2019.2922074","article-title":"An \u03b1-Divergence-Based Approach for Robust Dictionary Learning","volume":"28","author":"Iqbal","year":"2019","journal-title":"IEEE Trans. Image Process."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1138","DOI":"10.1080\/03610918.2017.1406511","article-title":"Exponentiality test based on alpha-divergence and gamma-divergence","volume":"48","author":"Ahrari","year":"2019","journal-title":"Commun. Stat.-Simul. Comput."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Sarmiento, A., Fond\u00f3n, I., Dur\u00e1n-D\u00edaz, I., and Cruces, S. (2019). Centroid-based clustering with \u03b1\u03b2-divergences. Entropy, 21.","DOI":"10.3390\/e21020196"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Niculescu, C.P., and Persson, L.E. (2006). Convex Functions and Their Applications: A Contemporary Approach, Springer Science & Business Media. [1st ed.].","DOI":"10.1007\/0-387-31077-0_2"},{"key":"ref_30","first-page":"388","article-title":"Sur la notion de moyenne","volume":"12","author":"Kolmogorov","year":"1930","journal-title":"Acad. Naz. Lincei Mem. Cl. Sci. His. Mat. Natur. Sez."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"419","DOI":"10.1111\/j.1751-5823.2002.tb00178.x","article-title":"On choosing and bounding probability metrics","volume":"70","author":"Gibbs","year":"2002","journal-title":"Int. Stat. Rev."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Rachev, S.T., Klebanov, L.B., Stoyanov, S.V., and Fabozzi, F. (2013). The Methods of Distances in the Theory of Probability and Statistics, Springer.","DOI":"10.1007\/978-1-4614-4869-3"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"475","DOI":"10.1109\/TMI.2010.2086464","article-title":"Total Bregman divergence and its applications to DTI analysis","volume":"30","author":"Vemuri","year":"2010","journal-title":"IEEE Trans. Med Imaging"},{"key":"ref_34","unstructured":"Arthur, D., and Vassilvitskii, S. (2007, January 7\u20139). k-means++: The advantages of careful seeding. Proceedings of the SODA \u201907: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, LA, USA."},{"key":"ref_35","unstructured":"Bullen, P.S., Mitrinovic, D.S., and Vasic, M. (2013). Means and Their Inequalities, Springer Science & Business Media."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Toader, G., and Costin, I. (2017). Means in Mathematical Analysis: Bivariate Means, Academic Press.","DOI":"10.1016\/B978-0-12-811080-5.00002-5"},{"key":"ref_37","unstructured":"Cauchy, A.L.B. (1821). Cours d\u2019analyse de l\u2019\u00c9cole Royale Polytechnique, Debure fr\u00e8res."},{"key":"ref_38","first-page":"106","article-title":"Sul concetto di media","volume":"4","author":"Chisini","year":"1929","journal-title":"Period. Di Mat."},{"key":"ref_39","first-page":"99","article-title":"On a measure of divergence between two statistical populations defined by their probability distributions","volume":"35","author":"Bhattacharyya","year":"1943","journal-title":"Bull. Calcutta Math. Soc."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"5455","DOI":"10.1109\/TIT.2011.2159046","article-title":"The Burbea-Rao and Bhattacharyya centroids","volume":"57","author":"Nielsen","year":"2011","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1016\/j.patrec.2014.01.002","article-title":"Generalized Bhattacharyya and Chernoff upper bounds on Bayes error using quasi-arithmetic means","volume":"42","author":"Nielsen","year":"2014","journal-title":"Pattern Recognit. Lett."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"71","DOI":"10.4099\/jjm1924.7.0_71","article-title":"\u00dcber eine klasse der mittelwerte","volume":"7","author":"Nagumo","year":"1930","journal-title":"Jpn. J. Math. Trans. Abstr."},{"key":"ref_43","first-page":"369","article-title":"Sul concetto di media","volume":"3","year":"1931","journal-title":"Ist. Ital. Degli Attuari"},{"key":"ref_44","unstructured":"Hardy, G., Littlewood, J., and P\u00f3lya, G. (1988). Inequalities, Cambridge Mathematical Library, Cambridge University Press."},{"key":"ref_45","unstructured":"R\u00e9nyi, A. (July, January 20). On measures of entropy and information. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA. Contributions to the Theory of Statistics."},{"key":"ref_46","first-page":"38","article-title":"\u00dcber einen Mittelwertssatz","volume":"44","author":"Holder","year":"1889","journal-title":"Nachr. Akad. Wiss. Gottingen Math.-Phys. Kl."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Bhatia, R. (2013). The Riemannian mean of positive matrices. Matrix Information Geometry, Springer.","DOI":"10.1007\/978-3-642-30232-9_2"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s10463-021-00818-y","article-title":"Bahadur efficiency of the maximum likelihood estimator and one-step estimator for quasi-arithmetic means of the Cauchy distribution","volume":"74","author":"Akaoka","year":"2022","journal-title":"Ann. Inst. Stat. Math."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"753","DOI":"10.1515\/forum-2017-0136","article-title":"The quasi-arithmetic means and Cartan barycenters of compactly supported measures","volume":"30","author":"Kim","year":"2018","journal-title":"Forum Math. Gruyter"},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"615","DOI":"10.1080\/00029890.1972.11993095","article-title":"The logarithmic mean","volume":"79","author":"Carlson","year":"1972","journal-title":"Am. Math. Mon."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1080\/0025570X.1975.11976447","article-title":"Generalizations of the logarithmic mean","volume":"48","author":"Stolarsky","year":"1975","journal-title":"Math. Mag."},{"key":"ref_52","first-page":"71","article-title":"When Lagrangean and quasi-arithmetic means coincide","volume":"8","author":"Jarczyk","year":"2007","journal-title":"J. Inequal. Pure Appl. Math."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1007\/s00025-019-1141-5","article-title":"On the Equality of Bajraktarevi\u0107 Means to Quasi-Arithmetic Means","volume":"75","author":"Zakaria","year":"2020","journal-title":"Results Math."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"77","DOI":"10.4064\/cm120-1-6","article-title":"Remarks on the comparison of weighted quasi-arithmetic means","volume":"120","author":"Maksa","year":"2010","journal-title":"Colloq. Math."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"5384","DOI":"10.3390\/e15125384","article-title":"Nonparametric information geometry: From divergence function to referential-representational biduality on statistical manifolds","volume":"15","author":"Zhang","year":"2013","journal-title":"Entropy"},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"1123","DOI":"10.1109\/LSP.2017.2712195","article-title":"Generalizing Skew Jensen Divergences and Bregman Divergences with Comparative Convexity","volume":"24","author":"Nielsen","year":"2017","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Kuczma, M. (2009). An Introduction to the Theory of Functional Equations and Inequalities: Cauchy\u2019s Equation and Jensen\u2019s Inequality, Springer Science & Business Media.","DOI":"10.1007\/978-3-7643-8749-5"},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"527","DOI":"10.1109\/TIT.2015.2448072","article-title":"On conformal divergences and their population minimizers","volume":"62","author":"Nock","year":"2015","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Ohara, A. (2018). Conformal flattening for deformed information geometries on the probability simplex. Entropy, 20.","DOI":"10.3390\/e20030186"},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Ohara, A. (2019). Conformal Flattening on the Probability Simplex and Its Applications to Voronoi Partitions and Centroids. Geometric Structures of Information, Springer.","DOI":"10.1007\/978-3-030-02520-5_4"},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"200","DOI":"10.1016\/0041-5553(67)90040-7","article-title":"The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming","volume":"7","author":"Bregman","year":"1967","journal-title":"USSR Comput. Math. Math. Phys."},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"4485","DOI":"10.3390\/e17074485","article-title":"On monotone embedding in information geometry","volume":"17","author":"Zhang","year":"2015","journal-title":"Entropy"},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Nielsen, F., and Nock, R. (2009, January 23\u201326). The dual Voronoi diagrams with respect to representational Bregman divergences. Proceedings of the Sixth International Symposium on Voronoi Diagrams (ISVD), Copenhagen, Denmark.","DOI":"10.1109\/ISVD.2009.15"},{"key":"ref_64","unstructured":"Itakura, F., and Saito, S. (1968, January 21\u201328). Analysis synthesis telephony based on the maximum likelihood method. Proceedings of the 6th International Congress on Acoustics, Tokyo, Japan."},{"key":"ref_65","doi-asserted-by":"crossref","first-page":"961","DOI":"10.1214\/aos\/1176348131","article-title":"Asymptotic theory of sequential estimation: Differential geometrical approach","volume":"19","author":"Okamoto","year":"1991","journal-title":"Ann. Stat."},{"key":"ref_66","doi-asserted-by":"crossref","first-page":"1250063","DOI":"10.1142\/S0217984912500637","article-title":"Conformal geometry of escort probability and its applications","volume":"26","author":"Ohara","year":"2012","journal-title":"Mod. Phys. Lett. B"},{"key":"ref_67","first-page":"427","article-title":"On the divergences of 1-conformally flat statistical manifolds","volume":"46","author":"Kurose","year":"1994","journal-title":"Tohoku Math. J. Second Ser."},{"key":"ref_68","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1007\/s11579-015-0159-z","article-title":"The geometry of relative arbitrage","volume":"10","author":"Pal","year":"2016","journal-title":"Math. Financ. Econ."},{"key":"ref_69","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1109\/TIT.1982.1056489","article-title":"Least squares quantization in PCM","volume":"28","author":"Lloyd","year":"1982","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_70","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1016\/j.tcs.2010.05.034","article-title":"The planar k-means problem is NP-hard","volume":"442","author":"Mahajan","year":"2012","journal-title":"Theor. Comput. Sci."},{"key":"ref_71","doi-asserted-by":"crossref","first-page":"29","DOI":"10.32614\/RJ-2011-015","article-title":"Ckmeans.1d.dp: Optimal k-means clustering in one dimension by dynamic programming","volume":"3","author":"Wang","year":"2011","journal-title":"R J."},{"key":"ref_72","first-page":"1705","article-title":"Clustering with Bregman divergences","volume":"6","author":"Banerjee","year":"2005","journal-title":"J. Mach. Learn. Res."},{"key":"ref_73","doi-asserted-by":"crossref","first-page":"2882","DOI":"10.1109\/TIT.2009.2018176","article-title":"Sided and symmetrized Bregman centroids","volume":"55","author":"Nielsen","year":"2009","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_74","unstructured":"Ronchetti, E.M., and Huber, P.J. (2009). Robust Statistics, John Wiley & Sons."},{"key":"ref_75","doi-asserted-by":"crossref","unstructured":"Nielsen, F., and Nock, R. (2015, January 19\u201324). Total Jensen divergences: Definition, properties and clustering. Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, Australia.","DOI":"10.1109\/ICASSP.2015.7178324"},{"key":"ref_76","doi-asserted-by":"crossref","unstructured":"Eguchi, S., and Komori, O. (2022). Minimum Divergence Methods in Statistical Machine Learning, Springer.","DOI":"10.1007\/978-4-431-56922-0"},{"key":"ref_77","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1109\/TCOM.1967.1089532","article-title":"The divergence and Bhattacharyya distance measures in signal selection","volume":"15","author":"Kailath","year":"1967","journal-title":"IEEE Trans. Commun. Technol."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/15\/11\/435\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:20:39Z","timestamp":1760145639000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/15\/11\/435"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,17]]},"references-count":77,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2022,11]]}},"alternative-id":["a15110435"],"URL":"https:\/\/doi.org\/10.3390\/a15110435","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2022,11,17]]}}}