{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,27]],"date-time":"2026-05-27T13:13:29Z","timestamp":1779887609665,"version":"3.53.1"},"reference-count":41,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2022,11,10]],"date-time":"2022-11-10T00:00:00Z","timestamp":1668038400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Kyoto University","award":["JPMJER1903"],"award-info":[{"award-number":["JPMJER1903"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Optimal transport is a mathematical tool that has been a widely used to measure the distance between two probability distributions. To mitigate the cubic computational complexity of the vanilla formulation of the optimal transport problem, regularized optimal transport has received attention in recent years, which is a convex program to minimize the linear transport cost with an added convex regularizer. Sinkhorn optimal transport is the most prominent one regularized with negative Shannon entropy, leading to densely supported solutions, which are often undesirable in light of the interpretability of transport plans. In this paper, we report that a deformed entropy designed by q-algebra, a popular generalization of the standard algebra studied in Tsallis statistical mechanics, makes optimal transport solutions supported sparsely. This entropy with a deformation parameter q interpolates the negative Shannon entropy (q=1) and the squared 2-norm (q=0), and the solution becomes more sparse as q tends to zero. Our theoretical analysis reveals that a larger q leads to a faster convergence when optimized with the Broyden\u2013Fletcher\u2013Goldfarb\u2013Shanno (BFGS) algorithm. In summary, the deformation induces a trade-off between the sparsity and convergence speed.<\/jats:p>","DOI":"10.3390\/e24111634","type":"journal-article","created":{"date-parts":[[2022,11,10]],"date-time":"2022-11-10T19:17:34Z","timestamp":1668107854000},"page":"1634","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Sparse Regularized Optimal Transport with Deformed q-Entropy"],"prefix":"10.3390","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4473-2604","authenticated-orcid":false,"given":"Han","family":"Bao","sequence":"first","affiliation":[{"name":"Graduate School of Informatics and The Hakubi Center for Advanced Research, Kyoto University, Kyoto 604-8103, Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Shinsaku","family":"Sakaue","sequence":"additional","affiliation":[{"name":"Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo 153-8505, Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2022,11,10]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Villani, C. (2009). Optimal Transport: Old and New, Springer.","DOI":"10.1007\/978-3-540-71050-9"},{"key":"ref_2","unstructured":"Shafieezadeh-Abadeh, S., Mohajerin Esfahani, P.M., and Kuhn, D. (2015). Distributionally robust logistic regression. Adv. Neural Inf. Process. Syst., 28."},{"key":"ref_3","unstructured":"Courty, N., Flamary, R., Habrard, A., and Rakotomamonjy, A. (2017). Joint distribution optimal transportation for domain adaptation. Adv. Neural Inf. Process. Syst., 30."},{"key":"ref_4","unstructured":"Arjovsky, M., Chintala, S., and Bottou, L. (2017, January 6\u201311). Wasserstein generative adversarial networks. Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia. PMLR."},{"key":"ref_5","unstructured":"Kusner, M., Sun, Y., Kolkin, N., and Weinberger, K. (2015, January 7\u20139). From word embeddings to document distances. Proceedings of the 32nd International Conference on Machine Learning, Lille, France. PMLR."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Swanson, K., Yu, L., and Lei, T. (2020, January 5\u201310). Rationalizing text matching: Learning sparse alignments via optimal transport. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online.","DOI":"10.18653\/v1\/2020.acl-main.496"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Otani, M., Togashi, R., Nakashima, Y., Rahtu, E., Heikkil\u00e4, J., and Satoh, S. (2022, January 21\u201324). Optimal correction cost for object detection evaluation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.02043"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Pele, O., and Werman, M. (October, January 29). Fast and robust Earth Mover\u2019s Distances. Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan.","DOI":"10.1109\/ICCV.2009.5459199"},{"key":"ref_9","first-page":"2292","article-title":"Sinkhorn distances: Lightspeed computation of optimal transport","volume":"26","author":"Cuturi","year":"2013","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_10","first-page":"590","article-title":"Regularized optimal transport and the rot mover\u2019s distance","volume":"19","author":"Dessein","year":"2018","journal-title":"J. Mach. Learn. Res."},{"key":"ref_11","unstructured":"Dvurechensky, P., Gasnikov, A., and Kroshnin, A. (2018, January 10\u201315). Computational optimal transport: Complexity by accelerated gradient descent is better than by Sinkhorn\u2019s algorithm. Proceedings of the 36th International Conference on Machine Learning, Stockholm, Sweden. PMLR."},{"key":"ref_12","first-page":"12304","article-title":"Tree-sliced variants of Wasserstein distances","volume":"32","author":"Le","year":"2019","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_13","unstructured":"Le, T., Nguyen, T., Phung, D., and Nguyen, V.A. (2022, January 28\u201330). Sobolev transport: A scalable metric for probability measures with graph metrics. Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, Online. PMLR."},{"key":"ref_14","first-page":"2053","article-title":"Learning with a Wasserstein loss","volume":"28","author":"Frogner","year":"2015","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_15","first-page":"6861","article-title":"Differentiable ranking and sorting using optimal transport","volume":"32","author":"Cuturi","year":"2019","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_16","first-page":"1","article-title":"Learning with Fenchel-Young losses","volume":"21","author":"Blondel","year":"2020","journal-title":"J. Mach. Learn. Res."},{"key":"ref_17","first-page":"147","article-title":"Tres observaciones sobre el algebra lineal","volume":"5","author":"Birkhoff","year":"1946","journal-title":"Univ. Nac. Tucum\u2019an Rev. Ser. A"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Brualdi, R.A. (2006). Combinatorial Matrix Classes, Cambridge University Press.","DOI":"10.1017\/CBO9780511721182"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Alvarez-Melis, D., and Jaakkola, T. (November, January 31). Gromov\u2013Wasserstein alignment of word embedding spaces. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium.","DOI":"10.18653\/v1\/D18-1214"},{"key":"ref_20","unstructured":"Blondel, M., Seguy, V., and Rolet, A. (2018, January 9\u201311). Smooth and sparse optimal transport. Proceedings of the 21st International Conference on Artificial Intelligence and Statistics, Canary Islands, Spain. PMLR."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1007\/BF01589116","article-title":"On the limited memory BFGS method for large scale optimization","volume":"45","author":"Liu","year":"1989","journal-title":"Math. Program."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1170","DOI":"10.3390\/e13061170","article-title":"Geometry of q-exponential family of probability distributions","volume":"13","author":"Amari","year":"2011","journal-title":"Entropy"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"479","DOI":"10.1007\/BF01016429","article-title":"Possible generalization of Boltzmann-Gibbs statistics","volume":"52","author":"Tsallis","year":"1988","journal-title":"J. Stat. Phys."},{"key":"ref_24","unstructured":"Powell, M.J.D. (1976, January 1). Some global convergence properties of a variable metric algorithm for minimization without exact line searches. Proceedings of the Nonlinear Programming, SIAM-AMS Proceedings, New York, NY, USA."},{"key":"ref_25","first-page":"1961","article-title":"Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration","volume":"30","author":"Altschuler","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"343","DOI":"10.2140\/pjm.1967.21.343","article-title":"Concerning nonnegative matrices and doubly stochastic matrices","volume":"21","author":"Sinkhorn","year":"1967","journal-title":"Pac. J. Math."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"641","DOI":"10.1137\/0114053","article-title":"The theory of max-min, with applications","volume":"14","author":"Danskin","year":"1966","journal-title":"SIAM J. Appl. Math."},{"key":"ref_28","unstructured":"Bao, H., and Sugiyama, M. (2021, January 13\u201315). Fenchel-Young losses with skewed entropies for class-posterior probability estimation. Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, San Diego, CA, USA."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1016\/S0378-4371(02)01018-X","article-title":"Deformed exponentials and logarithms in generalized thermostatistics","volume":"316","author":"Naudts","year":"2002","journal-title":"Phys. A Stat. Mech. Its Appl."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1143\/PTPS.162.79","article-title":"The unique non self-referential q-canonical distribution and the physical temperature derived from the maximum entropy principle in Tsallis statistics","volume":"162","author":"Suyari","year":"2006","journal-title":"Prog. Theor. Phys. Suppl."},{"key":"ref_31","first-page":"514","article-title":"t-Logistic regression","volume":"23","author":"Ding","year":"2010","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_32","unstructured":"Futami, F., Sato, I., and Sugiyama, M. (2017). Expectation propagation for t-exponential family using q-algebra. Adv. Neural Inf. Process. Syst., 30."},{"key":"ref_33","first-page":"15013","article-title":"Robust bi-tempered logistic loss based on bregman divergences","volume":"32","author":"Amid","year":"2019","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Martins, A.F., Figueiredo, M.A., Aguiar, P.M., Smith, N.A., and Xing, E.P. (2008, January 5\u20139). Nonextensive entropic kernels. Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland.","DOI":"10.21236\/ADA487475"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Muzellec, B., Nock, R., Patrini, G., and Nielsen, F. (2017, January 4\u20139). Tsallis regularized optimal transport and ecological inference. Proceedings of the 31st AAAI Conference on Artificial Intelligence, San Francisco, CA, USA.","DOI":"10.1609\/aaai.v31i1.10854"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"1171","DOI":"10.1137\/0724077","article-title":"Global convergence of a cass of quasi-Newton methods on convex problems","volume":"24","author":"Byrd","year":"1987","journal-title":"SIAM J. Numer. Anal."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"A1443","DOI":"10.1137\/16M1106018","article-title":"Stabilized sparse scaling algorithms for entropy regularized transport problems","volume":"41","author":"Schmitzer","year":"2019","journal-title":"SIAM J. Sci. Comput."},{"key":"ref_38","first-page":"1","article-title":"POT: Python optimal transport","volume":"22","author":"Flamary","year":"2021","journal-title":"J. Mach. Learn. Res."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1038\/s41592-019-0686-2","article-title":"SciPy 1.0: Fundamental algorithms for scientific computing in Python","volume":"17","author":"Virtanen","year":"2020","journal-title":"Nat. Methods"},{"key":"ref_40","unstructured":"Weed, J. (2018, January 5\u20139). An explicit analysis of the entropic penalty in linear programming. Proceedings of the the 31st Conference on Learning Theory, Stockholm, Sweden. PMLR."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Golub, G.H., and van Loan, C.F. (2013). Matrix Computations, The Johns Hopkins University Press.","DOI":"10.56021\/9781421407944"}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/24\/11\/1634\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:14:04Z","timestamp":1760145244000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/24\/11\/1634"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,10]]},"references-count":41,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2022,11]]}},"alternative-id":["e24111634"],"URL":"https:\/\/doi.org\/10.3390\/e24111634","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,11,10]]}}}