{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,12]],"date-time":"2026-04-12T00:23:59Z","timestamp":1775953439220,"version":"3.50.1"},"reference-count":46,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2019,5,31]],"date-time":"2019-05-31T00:00:00Z","timestamp":1559260800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Efficient approximation lies at the heart of large-scale machine learning problems. In this paper, we propose a novel, robust maximum entropy algorithm, which is capable of dealing with hundreds of moments and allows for computationally efficient approximations. We showcase the usefulness of the proposed method, its equivalence to constrained Bayesian variational inference and demonstrate its superiority over existing approaches in two applications, namely, fast log determinant estimation and information-theoretic Bayesian optimisation.<\/jats:p>","DOI":"10.3390\/e21060551","type":"journal-article","created":{"date-parts":[[2019,5,31]],"date-time":"2019-05-31T11:59:56Z","timestamp":1559303996000},"page":"551","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":21,"title":["MEMe: An Accurate Maximum Entropy Method for Efficient Approximations in Large-Scale Machine Learning"],"prefix":"10.3390","volume":"21","author":[{"given":"Diego","family":"Granziol","sequence":"first","affiliation":[{"name":"Machine Learning Research Group, University of Oxford, Walton Well Rd, Oxford OX2 6ED, UK"},{"name":"Oxford-Man Institute of Quantitative Finance, Walton Well Rd, Oxford OX2 6ED, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Binxin","family":"Ru","sequence":"additional","affiliation":[{"name":"Machine Learning Research Group, 
University of Oxford, Walton Well Rd, Oxford OX2 6ED, UK"},{"name":"Oxford-Man Institute of Quantitative Finance, Walton Well Rd, Oxford OX2 6ED, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stefan","family":"Zohren","sequence":"additional","affiliation":[{"name":"Machine Learning Research Group, University of Oxford, Walton Well Rd, Oxford OX2 6ED, UK"},{"name":"Oxford-Man Institute of Quantitative Finance, Walton Well Rd, Oxford OX2 6ED, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaowen","family":"Dong","sequence":"additional","affiliation":[{"name":"Machine Learning Research Group, University of Oxford, Walton Well Rd, Oxford OX2 6ED, UK"},{"name":"Oxford-Man Institute of Quantitative Finance, Walton Well Rd, Oxford OX2 6ED, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael","family":"Osborne","sequence":"additional","affiliation":[{"name":"Machine Learning Research Group, University of Oxford, Walton Well Rd, Oxford OX2 6ED, UK"},{"name":"Oxford-Man Institute of Quantitative Finance, Walton Well Rd, Oxford OX2 6ED, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stephen","family":"Roberts","sequence":"additional","affiliation":[{"name":"Machine Learning Research Group, University of Oxford, Walton Well Rd, Oxford OX2 6ED, UK"},{"name":"Oxford-Man Institute of Quantitative Finance, Walton Well Rd, Oxford OX2 6ED, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2019,5,31]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Granziol, D., and Roberts, S.J. (2017, January 11\u201314). Entropic determinants of massive matrices. 
Proceedings of the 2017 IEEE International Conference on Big Data, Boston, MA, USA.","DOI":"10.1109\/BigData.2017.8257915"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"057701","DOI":"10.1103\/PhysRevE.71.057701","article-title":"Maximum entropy and the problem of moments: A stable algorithm","volume":"71","author":"Bandyopadhyay","year":"2005","journal-title":"Phys. Rev. E"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"2404","DOI":"10.1063\/1.526446","article-title":"Maximum entropy in the problem of moments","volume":"25","author":"Mead","year":"1984","journal-title":"J. Math. Phys."},{"key":"ref_4","unstructured":"Han, I., Malioutov, D., and Shin, J. (2015, January 6\u201311). Large-scale log-determinant computation through stochastic Chebyshev expansions. Proceedings of the 32nd International Conference on Machine Learning, Lille, France."},{"key":"ref_5","unstructured":"Dong, K., Eriksson, D., Nickisch, H., Bindel, D., and Wilson, A.G. (2017, January 4\u20139). Scalable Log Determinants for Gaussian Process Kernel Learning. Proceedings of the 31st Conference on Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1080\/10629360600569279","article-title":"Approximate implementation of the logarithm of the matrix determinant in Gaussian process regression","volume":"77","author":"Zhang","year":"2007","journal-title":"J. Stat. Comput. Simul."},{"key":"ref_7","unstructured":"Hern\u00e1ndez-Lobato, J.M., Hoffman, M.W., and Ghahramani, Z. (2014, January 8\u201313). Predictive entropy search for efficient global optimization of black-box functions. Proceedings of the 27th Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_8","unstructured":"Wang, Z., and Jegelka, S. (2017). Max-value Entropy Search for Efficient Bayesian Optimization. 
arXiv."},{"key":"ref_9","unstructured":"Ru, B., McLeod, M., Granziol, D., and Osborne, M.A. (2018, January 10\u201315). Fast Information-theoretic Bayesian Optimisation. Proceedings of the 2018 International Conference on Machine Learning, Stockholm, Sweden."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1007\/s10462-011-9236-8","article-title":"A tutorial on variational Bayesian inference","volume":"38","author":"Fox","year":"2012","journal-title":"Artif. Intell. Rev."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1561\/2200000044","article-title":"Determinantal Point Processes for Machine Learning","volume":"5","author":"Kulesza","year":"2012","journal-title":"Found. Trends Mach. Learn."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1115","DOI":"10.1103\/RevModPhys.85.1115","article-title":"Principles of Maximum Entropy and Maximum Caliber in Statistical Physics","volume":"85","author":"Ghosh","year":"2013","journal-title":"Rev. Mod. Phys."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"620","DOI":"10.1103\/PhysRev.106.620","article-title":"Information Theory and Statistical Mechanics","volume":"106","author":"Jaynes","year":"1957","journal-title":"Phys. Rev."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.physa.2016.02.069","article-title":"Application of the Maximum Relative Entropy method to the Physics of Ferromagnetic Materials","volume":"455","author":"Giffin","year":"2016","journal-title":"Phys. A Stat. Mech. Appl."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1007\/s00780-011-0167-7","article-title":"Maximum Entropy Distributions inferred from Option Portfolios on an Asset","volume":"16","author":"Neri","year":"2012","journal-title":"Financ. 
Stoch."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1063\/1.4819977","article-title":"The maximum entropy method of moments and Bayesian probability theory","volume":"1553","author":"Bretthorst","year":"2013","journal-title":"AIP Conf. Proc."},{"key":"ref_17","unstructured":"Beal, M.J. (2003). Variational Algorithms for Approximate Bayesian Inference. [Master\u2019s Thesis, University of London]."},{"key":"ref_18","unstructured":"Caticha, A. (2019, May 31). Entropic Inference and the Foundations of Physics (Monograph Commissioned by the 11th Brazilian Meeting on Bayesian Statistics-EBEB-2012). Available online: https:\/\/www.albany.edu\/physics\/ACaticha-EIFP-book.pdf."},{"key":"ref_19","unstructured":"Boyd, S.P., and Vandenberghe, L. (2009). Convex Optimization, Cambridge University Press."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"433","DOI":"10.1080\/03610919008812866","article-title":"A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines","volume":"19","author":"Hutchinson","year":"1990","journal-title":"Commun. Stat. Simul. Comput."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Skilling, J. (1989). The eigenvalues of mega-dimensional matrices. Maximum Entropy and Bayesian Methods, Springer.","DOI":"10.1007\/978-94-015-7860-8_48"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"432","DOI":"10.1093\/biostatistics\/kxm045","article-title":"Sparse inverse covariance estimation with the graphical lasso","volume":"9","author":"Friedman","year":"2008","journal-title":"Biostatistics"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Rasmussen, C.E., and Williams, C.K. (2006). Gaussian Processes for Machine Learning, The MIT Press.","DOI":"10.7551\/mitpress\/3206.001.0001"},{"key":"ref_24","unstructured":"MacKay, D.J. (2003). 
Information Theory, Inference and Learning Algorithms, Cambridge University Press."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1002\/wics.19","article-title":"Minimum volume ellipsoid","volume":"1","author":"Rousseeuw","year":"2009","journal-title":"Wiley Interdiscip. Rev. Comput. Stat."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"2099","DOI":"10.1109\/TSP.2006.874409","article-title":"Log-determinant relaxation for approximate inference in discrete Markov random fields","volume":"54","author":"Wainwright","year":"2006","journal-title":"IEEE Trans. Signal Process."},{"key":"ref_27","first-page":"749","article-title":"\u00dcber die Abgrenzung der Eigenwerte einer Matrix","volume":"6","author":"Gershgorin","year":"1931","journal-title":"Izvestija Akademii Nauk SSSR Serija Matematika"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1075","DOI":"10.1137\/16M1104974","article-title":"Fast Estimation of tr(f(A)) via Stochastic Lanczos Quadrature","volume":"38","author":"Ubaru","year":"2016","journal-title":"SIAM J. Matrix Anal. Appl."},{"key":"ref_29","unstructured":"Granziol, D., and Roberts, S. (2017). An Information and Field Theoretic approach to the Grand Canonical Ensemble. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Fitzsimons, J., Granziol, D., Cutajar, K., Osborne, M., Filippone, M., and Roberts, S. (2017). Entropic Trace Estimates for Log Determinants. arXiv.","DOI":"10.1007\/978-3-319-71249-9_20"},{"key":"ref_31","first-page":"1","article-title":"The University of Florida sparse matrix collection","volume":"38","author":"Davis","year":"2011","journal-title":"ACM Trans. Math. Softw."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"20150142","DOI":"10.1098\/rspa.2015.0142","article-title":"Probabilistic numerics and uncertainty in computations","volume":"471","author":"Hennig","year":"2015","journal-title":"Proc. R. Soc. 
A"},{"key":"ref_33","unstructured":"Bergstra, J.S., Bardenet, R., Bengio, Y., and K\u00e9gl, B. (2011, January 12\u201315). Algorithms for hyper-parameter optimization. Proceedings of the 24th International Conference on Neural Information Processing Systems, Granada, Spain."},{"key":"ref_34","unstructured":"Snoek, J., Larochelle, H., and Adams, R.P. (2012, January 3\u20136). Practical Bayesian optimization of machine learning algorithms. Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Thornton, C., Hutter, F., Hoos, H.H., and Leyton-Brown, K. (2013, January 11\u201314). Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms. Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, IL, USA.","DOI":"10.1145\/2487575.2487629"},{"key":"ref_36","unstructured":"Hoffman, M., Shahriari, B., and Freitas, N. (2014, January 22\u201324). On correlation and budget constraints in model-based bandit optimization with application to automatic machine learning. Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, Reykjavik, Iceland."},{"key":"ref_37","first-page":"944","article-title":"Automatic Gait Optimization with Gaussian Process Regression","volume":"7","author":"Lizotte","year":"2007","journal-title":"IJCAI"},{"key":"ref_38","first-page":"321","article-title":"Active policy learning for robot planning and exploration under uncertainty","volume":"3","author":"Doucet","year":"2007","journal-title":"Robotics Sci. Syst."},{"key":"ref_39","unstructured":"Azimi, J., Jalali, A., and Fern, X. (2012). Hybrid batch Bayesian optimization. arXiv."},{"key":"ref_40","unstructured":"Brochu, E., Cora, V.M., and de Freitas, N. (2010). 
A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv."},{"key":"ref_41","first-page":"1809","article-title":"Entropy search for information-efficient global optimization","volume":"13","author":"Hennig","year":"2012","journal-title":"J. Mach. Learn. Res."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Hershey, J.R., and Olsen, P.A. (2007, January 15\u201320). Approximating the Kullback Leibler divergence between Gaussian mixture models. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Honolulu, HI, USA.","DOI":"10.1109\/ICASSP.2007.366913"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Huber, M.F., Bailey, T., Durrant-Whyte, H., and Hanebeck, U.D. (2008, January 20\u201322). On entropy approximation for Gaussian mixture random vectors. Proceedings of the IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems, Seoul, South Korea.","DOI":"10.1109\/MFI.2008.4648062"},{"key":"ref_44","unstructured":"Molga, M., and Smutnicki, C. (2005, May 30). Test Functions for Optimization Needs. Available online: http:\/\/www.zsd.ict.pwr.wroc.pl\/files\/docs\/functions.pdf."},{"key":"ref_45","first-page":"1","article-title":"The global optimization problem. An introduction","volume":"2","author":"Dixon","year":"1978","journal-title":"Toward Glob. Optim."},{"key":"ref_46","unstructured":"Billingsley, P. (2012). 
Probability and Measure, Wiley."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/21\/6\/551\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:55:04Z","timestamp":1760187304000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/21\/6\/551"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,5,31]]},"references-count":46,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2019,6]]}},"alternative-id":["e21060551"],"URL":"https:\/\/doi.org\/10.3390\/e21060551","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,5,31]]}}}