{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T15:30:36Z","timestamp":1772119836720,"version":"3.50.1"},"reference-count":36,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2024,8,13]],"date-time":"2024-08-13T00:00:00Z","timestamp":1723507200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,8,13]],"date-time":"2024-08-13T00:00:00Z","timestamp":1723507200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"crossref","award":["TRR 305"],"award-info":[{"award-number":["TRR 305"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"crossref","award":["458051812"],"award-info":[{"award-number":["458051812"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"crossref","award":["458051812"],"award-info":[{"award-number":["458051812"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"crossref","award":["460248186"],"award-info":[{"award-number":["460248186"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Stat Comput"],"published-print":{"date-parts":[[2024,10]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>The Kullback\u2013Leibler (KL) divergence is frequently 
used in data science. For discrete distributions on large state spaces, approximations of probability vectors may result in a few small negative entries, rendering the KL divergence undefined. We address this problem by introducing a parameterized family of substitute divergence measures, the shifted KL (sKL) divergence measures. Our approach is generic and does not increase the computational overhead. We show that the sKL divergence shares important theoretical properties with the KL divergence and discuss how its shift parameters should be chosen. If Gaussian noise is added to a probability vector, we prove that the average sKL divergence converges to the KL divergence for small enough noise. We also show that our method solves the problem of negative entries in an application from computational oncology, the optimization of Mutual Hazard Networks for cancer progression using tensor-train approximations.<\/jats:p>","DOI":"10.1007\/s11222-024-10480-y","type":"journal-article","created":{"date-parts":[[2024,8,13]],"date-time":"2024-08-13T13:02:21Z","timestamp":1723554141000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Taming numerical imprecision by adapting the KL divergence to negative 
probabilities"],"prefix":"10.1007","volume":"34","author":[{"given":"Simon","family":"Pfahler","sequence":"first","affiliation":[]},{"given":"Peter","family":"Georg","sequence":"additional","affiliation":[]},{"given":"Rudolf","family":"Schill","sequence":"additional","affiliation":[]},{"given":"Maren","family":"Klever","sequence":"additional","affiliation":[]},{"given":"Lars","family":"Grasedyck","sequence":"additional","affiliation":[]},{"given":"Rainer","family":"Spang","sequence":"additional","affiliation":[]},{"given":"Tilo","family":"Wettig","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,8,13]]},"reference":[{"key":"10480_CR1","doi-asserted-by":"publisher","DOI":"10.1007\/978-4-431-55978-8","volume-title":"Information geometry and its applications","author":"S-I Amari","year":"2016","unstructured":"Amari, S.-I.: Information geometry and its applications. Springer, Tokyo (2016). https:\/\/doi.org\/10.1007\/978-4-431-55978-8"},{"issue":"5","key":"10480_CR2","doi-asserted-by":"publisher","first-page":"281","DOI":"10.1016\/S0370-1573(01)00010-2","volume":"353","author":"R Alkofer","year":"2001","unstructured":"Alkofer, R., Smekal, L.: The infrared behaviour of QCD Green\u2019s functions: confinement, dynamical symmetry breaking, and hadrons as relativistic bound states. Phys. Rep. 353(5), 281 (2001). https:\/\/doi.org\/10.1016\/S0370-1573(01)00010-2","journal-title":"Phys. Rep."},{"issue":"4","key":"10480_CR3","doi-asserted-by":"publisher","first-page":"621","DOI":"10.1016\/j.sigpro.2012.09.003","volume":"93","author":"M Basseville","year":"2013","unstructured":"Basseville, M.: Divergence measures for statistical data processing\u2014an annotated bibliography. Signal Process. 93(4), 621 (2013). 
https:\/\/doi.org\/10.1016\/j.sigpro.2012.09.003","journal-title":"Signal Process."},{"issue":"4","key":"10480_CR4","doi-asserted-by":"publisher","first-page":"1619","DOI":"10.1140\/epjc\/s10052-011-1619-0","volume":"71","author":"Y Burnier","year":"2011","unstructured":"Burnier, Y., Laine, M., Mether, L.: A test on analytic continuation of thermal imaginary-time data. Eur. Phys. J. C 71(4), 1619 (2011). https:\/\/doi.org\/10.1140\/epjc\/s10052-011-1619-0","journal-title":"Eur. Phys. J. C"},{"issue":"7","key":"10480_CR5","doi-asserted-by":"publisher","first-page":"410","DOI":"10.1016\/j.tree.2010.04.001","volume":"25","author":"K Csill\u00e9ry","year":"2010","unstructured":"Csill\u00e9ry, K., Blum, M.G.B., Gaggiotti, O.E., Fran\u00e7ois, O.: Approximate Bayesian computation (ABC) in practice. Trends Ecol. Evol. 25(7), 410\u2013418 (2010). https:\/\/doi.org\/10.1016\/j.tree.2010.04.001","journal-title":"Trends Ecol. Evol."},{"issue":"3","key":"10480_CR6","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1371\/journal.pone.0283004","volume":"18","author":"J Chen","year":"2023","unstructured":"Chen, J.: Time hazard networks: incorporating temporal difference for oncogenetic analysis. PLoS ONE 18(3), 1 (2023). https:\/\/doi.org\/10.1371\/journal.pone.0283004","journal-title":"PLoS ONE"},{"issue":"4","key":"10480_CR7","doi-asserted-by":"publisher","first-page":"1272","DOI":"10.1137\/110859063","volume":"33","author":"EC Chi","year":"2012","unstructured":"Chi, E.C., Kolda, T.G.: On tensors, sparsity, and nonnegative factorizations. SIAM J. Matrix Anal. Appl. 33(4), 1272 (2012). https:\/\/doi.org\/10.1137\/110859063","journal-title":"SIAM J. Matrix Anal. Appl."},{"issue":"2","key":"10480_CR8","doi-asserted-by":"publisher","first-page":"187","DOI":"10.1111\/j.2517-6161.1972.tb00899.x","volume":"34","author":"DR Cox","year":"1972","unstructured":"Cox, D.R.: Regression models and life-tables. J. Roy. Stat. Soc.: Ser. B (Methodol.) 34(2), 187 (1972). 
https:\/\/doi.org\/10.1111\/j.2517-6161.1972.tb00899.x","journal-title":"J. Roy. Stat. Soc.: Ser. B (Methodol.)"},{"key":"10480_CR9","doi-asserted-by":"publisher","DOI":"10.1002\/0471200611","volume-title":"Elements of information theory","author":"TM Cover","year":"1991","unstructured":"Cover, T.M., Thomas, J.A.: Elements of information theory. Wiley, Hoboken (New Jersey) (1991). https:\/\/doi.org\/10.1002\/0471200611"},{"issue":"5","key":"10480_CR10","doi-asserted-by":"publisher","first-page":"2248","DOI":"10.1137\/140953289","volume":"36","author":"SV Dolgov","year":"2014","unstructured":"Dolgov, S.V., Savostyanov, D.V.: Alternating minimal energy methods for linear systems in higher dimensions. SIAM J. Sci. Comput. 36(5), 2248 (2014). https:\/\/doi.org\/10.1137\/140953289","journal-title":"SIAM J. Sci. Comput."},{"key":"10480_CR11","doi-asserted-by":"publisher","DOI":"10.1002\/9781118723203","volume-title":"Practical methods of optimization","author":"R Fletcher","year":"2000","unstructured":"Fletcher, R.: Practical methods of optimization. Wiley, Chichester (2000). https:\/\/doi.org\/10.1002\/9781118723203"},{"key":"10480_CR12","unstructured":"Georg, P.: Tensor Train Decomposition for solving high-dimensional Mutual Hazard Networks. PhD thesis, University of Regensburg (2022). https:\/\/epub.uni-regensburg.de\/53004"},{"issue":"1","key":"10480_CR13","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1007\/s00285-022-01846-9","volume":"86","author":"P Georg","year":"2022","unstructured":"Georg, P., Grasedyck, L., Klever, M., Schill, R., Spang, R., Wettig, T.: Low-rank tensor methods for Markov chains with applications to tumor progression models. J. Math. Biol. 86(1), 7 (2022). https:\/\/doi.org\/10.1007\/s00285-022-01846-9","journal-title":"J. Math. 
Biol."},{"issue":"1","key":"10480_CR14","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1002\/gamm.201310004","volume":"36","author":"L Grasedyck","year":"2013","unstructured":"Grasedyck, L., Kressner, D., Tobler, C.: A literature survey of low-rank tensor approximation techniques. GAMM-Mitteilungen 36(1), 53 (2013). https:\/\/doi.org\/10.1002\/gamm.201310004","journal-title":"GAMM-Mitteilungen"},{"key":"10480_CR15","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-35554-8","volume-title":"Tensor spaces and numerical tensor calculus. Springer series in computational mathematics","author":"W Hackbusch","year":"2019","unstructured":"Hackbusch, W.: Tensor spaces and numerical tensor calculus. Springer series in computational mathematics, vol. 57. Springer, Cham (2019). https:\/\/doi.org\/10.1007\/978-3-030-35554-8"},{"issue":"9","key":"10480_CR16","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevD.90.091501","volume":"90","author":"M Haas","year":"2014","unstructured":"Haas, M., Fister, L., Pawlowski, J.M.: Gluon spectral functions and transport coefficients in Yang\u2013Mills theory. Phys. Rev. D 90(9), 091501 (2014). https:\/\/doi.org\/10.1103\/PhysRevD.90.091501","journal-title":"Phys. Rev. D"},{"key":"10480_CR17","doi-asserted-by":"publisher","unstructured":"Hobson, M.P., Lasenby, A.N.: The entropic prior for distributions with positive and negative values. Mon. Not. R. Astron. Soc. 298(3), 905\u2013908 (1998). https:\/\/doi.org\/10.1046\/j.1365-8711.1998.01707.x","DOI":"10.1046\/j.1365-8711.1998.01707.x"},{"issue":"2","key":"10480_CR18","doi-asserted-by":"publisher","first-page":"708","DOI":"10.1021\/ar400244v","volume":"47","author":"JC Hoch","year":"2014","unstructured":"Hoch, J.C.: Nonuniform sampling and maximum entropy reconstruction in multidimensional NMR. Acc. Chem. Res. 47(2), 708 (2014). https:\/\/doi.org\/10.1021\/ar400244v","journal-title":"Acc. Chem. 
Res."},{"issue":"5","key":"10480_CR19","doi-asserted-by":"publisher","first-page":"1002","DOI":"10.1080\/10556788.2015.1009977","volume":"30","author":"S Hansen","year":"2015","unstructured":"Hansen, S., Plantenga, T., Kolda, T.G.: Newton-based optimization for Kullback\u2013Leibler nonnegative tensor factorizations. Optim. Methods Softw. 30(5), 1002 (2015). https:\/\/doi.org\/10.1080\/10556788.2015.1009977","journal-title":"Optim. Methods Softw."},{"issue":"2","key":"10480_CR20","doi-asserted-by":"publisher","first-page":"683","DOI":"10.1137\/100818893","volume":"34","author":"S Holtz","year":"2012","unstructured":"Holtz, S., Rohwedder, T., Schneider, R.: The alternating linear scheme for tensor optimization in the tensor train format. SIAM J. Sci. Comput. 34(2), 683 (2012). https:\/\/doi.org\/10.1137\/100818893","journal-title":"SIAM J. Sci. Comput."},{"issue":"1","key":"10480_CR21","doi-asserted-by":"publisher","first-page":"81","DOI":"10.1002\/mp.13257","volume":"46","author":"W Ha","year":"2019","unstructured":"Ha, W., Sidky, E.Y., Barber, R.F., Schmidt, T.G., Pan, X.: Estimating the spectrum in computed tomography via Kullback\u2013Leibler divergence constrained optimization. Med. Phys. 46(1), 81 (2019). https:\/\/doi.org\/10.1002\/mp.13257","journal-title":"Med. Phys."},{"key":"10480_CR22","unstructured":"Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization (2017). arxiv.org\/abs\/1412.6980"},{"issue":"1","key":"10480_CR23","doi-asserted-by":"publisher","first-page":"79","DOI":"10.1214\/aoms\/1177729694","volume":"22","author":"S Kullback","year":"1951","unstructured":"Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22(1), 79 (1951). https:\/\/doi.org\/10.1214\/aoms\/1177729694","journal-title":"Ann. Math. Stat."},{"key":"10480_CR24","unstructured":"Kingma, D.P., Welling, M.: Auto-encoding variational bayes (2022). 
arxiv.org\/abs\/1312.6114"},{"key":"10480_CR25","doi-asserted-by":"publisher","unstructured":"Luo, X.G., Kuipers, J., Beerenwinkel, N.: Joint inference of exclusivity patterns and recurrent trajectories from tumor mutation trees. Nat. Commun. 14(1), 3676 (2023). https:\/\/doi.org\/10.1038\/s41467-023-39400-w","DOI":"10.1038\/s41467-023-39400-w"},{"key":"10480_CR26","doi-asserted-by":"publisher","unstructured":"Lee, N., Phan, A.-H., Cong, F., Cichocki, A.: Nonnegative Tensor train decomposition for multi-domain feature extraction and clustering. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds.) Neural Information Processing, p. 87. Springer, Cham (2016). https:\/\/doi.org\/10.1007\/978-3-319-46675-0_10","DOI":"10.1007\/978-3-319-46675-0_10"},{"issue":"3","key":"10480_CR27","doi-asserted-by":"publisher","first-page":"197","DOI":"10.1038\/nrc1295","volume":"4","author":"F Michor","year":"2004","unstructured":"Michor, F., Iwasa, Y., Nowak, M.A.: Dynamics of cancer progression. Nat. Rev. Cancer 4(3), 197 (2004). https:\/\/doi.org\/10.1038\/nrc1295","journal-title":"Nat. Rev. Cancer"},{"key":"10480_CR28","unstructured":"Mathews, J., Walker, R.L.: Mathematical methods of physics. Addison-Wesley, New York (1970)"},{"key":"10480_CR29","doi-asserted-by":"publisher","unstructured":"Oseledets, I.V.: Tensor-Train Decomposition. SIAM J. Sci. Comput. 33(5), 2295 (2011). https:\/\/doi.org\/10.1137\/090752286","DOI":"10.1137\/090752286"},{"issue":"6","key":"10480_CR30","doi-asserted-by":"publisher","first-page":"1156","DOI":"10.1287\/opre.40.6.1156","volume":"40","author":"B Philippe","year":"1992","unstructured":"Philippe, B., Saad, Y., Stewart, W.J.: Numerical methods in markov chain modeling. Oper. Res. 40(6), 1156 (1992). https:\/\/doi.org\/10.1287\/opre.40.6.1156","journal-title":"Oper. 
Res."},{"issue":"2","key":"10480_CR31","doi-asserted-by":"publisher","first-page":"111","DOI":"10.1002\/env.3170050203","volume":"5","author":"P Paatero","year":"1994","unstructured":"Paatero, P., Tapper, U.: Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values. Environmetrics 5(2), 111 (1994). https:\/\/doi.org\/10.1002\/env.3170050203","journal-title":"Environmetrics"},{"issue":"5","key":"10480_CR32","doi-asserted-by":"publisher","DOI":"10.1103\/physrevd.95.056016","volume":"95","author":"A Rothkopf","year":"2017","unstructured":"Rothkopf, A.: Bayesian inference of nonpositive spectral functions in quantum field theory. Phys. Rev. D 95(5), 056016 (2017). https:\/\/doi.org\/10.1103\/physrevd.95.056016","journal-title":"Phys. Rev. D"},{"issue":"1","key":"10480_CR33","doi-asserted-by":"publisher","first-page":"241","DOI":"10.1093\/bioinformatics\/btz513","volume":"36","author":"R Schill","year":"2019","unstructured":"Schill, R., Solbrig, S., Wettig, T., Spang, R.: Modelling cancer progression using Mutual Hazard Networks. Bioinformatics 36(1), 241 (2019). https:\/\/doi.org\/10.1093\/bioinformatics\/btz513","journal-title":"Bioinformatics"},{"key":"10480_CR34","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.92.012120","volume":"92","author":"P Thomas","year":"2015","unstructured":"Thomas, P., Grima, R.: Approximate probability distributions of the master equation. Phys. Rev. E 92, 012120 (2015). https:\/\/doi.org\/10.1103\/PhysRevE.92.012120","journal-title":"Phys. Rev. E"},{"key":"10480_CR35","doi-asserted-by":"publisher","unstructured":"van Erven, T., Harremos, P.: R\u00e9nyi divergence and Kullback\u2013Leibler divergence. IEEE Trans. Inf. Theory 60(7), 3797 (2014). 
https:\/\/doi.org\/10.1109\/TIT.2014.2320500","DOI":"10.1109\/TIT.2014.2320500"},{"issue":"12","key":"10480_CR36","doi-asserted-by":"publisher","first-page":"1255","DOI":"10.1016\/S0167-8655(01)00070-8","volume":"22","author":"M Welling","year":"2001","unstructured":"Welling, M., Weber, M.: Positive tensor factorization. Pattern Recognition Lett. 22(12), 1255 (2001). https:\/\/doi.org\/10.1016\/S0167-8655(01)00070-8","journal-title":"Pattern Recognition Lett."}],"container-title":["Statistics and Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11222-024-10480-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11222-024-10480-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11222-024-10480-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,3]],"date-time":"2024-10-03T11:07:45Z","timestamp":1727953665000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11222-024-10480-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,13]]},"references-count":36,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2024,10]]}},"alternative-id":["10480"],"URL":"https:\/\/doi.org\/10.1007\/s11222-024-10480-y","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-3917579\/v1","asserted-by":"object"}]},"ISSN":["0960-3174","1573-1375"],"issn-type":[{"value":"0960-3174","type":"print"},{"value":"1573-1375","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,8,13]]},"assertion":[{"value":"1 February 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article 
History"}},{"value":"23 July 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 August 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no conflicts of interest to declare that are relevant to the content of this article.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"168"}}