{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,26]],"date-time":"2025-03-26T15:44:29Z","timestamp":1743003869353,"version":"3.40.3"},"publisher-location":"Cham","reference-count":52,"publisher":"Springer Nature Switzerland","isbn-type":[{"type":"print","value":"9783031264184"},{"type":"electronic","value":"9783031264191"}],"license":[{"start":{"date-parts":[[2023,1,1]],"date-time":"2023-01-01T00:00:00Z","timestamp":1672531200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,3,17]],"date-time":"2023-03-17T00:00:00Z","timestamp":1679011200000},"content-version":"vor","delay-in-days":75,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models. Although AIS is guaranteed to provide an unbiased estimate for any set of hyperparameters, common implementations rely on simple heuristics, such as geometric-average bridging distributions between the initial and target distributions, which affect estimation performance when the computation budget is limited. To reduce the number of sampling iterations, we present a parametric AIS process with flexible intermediary distributions defined by a residual density with respect to the geometric mean path. Our method allows parameter sharing between annealing distributions, the use of a fixed linear schedule for discretization, and amortization of hyperparameter selection in latent variable models. 
We assess the performance of Optimized-Path AIS for marginal likelihood estimation of deep generative models and compare it to more computationally intensive AIS.<\/jats:p>","DOI":"10.1007\/978-3-031-26419-1_11","type":"book-chapter","created":{"date-parts":[[2023,3,27]],"date-time":"2023-03-27T00:24:57Z","timestamp":1679876697000},"page":"174-190","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Optimization of\u00a0Annealed Importance Sampling Hyperparameters"],"prefix":"10.1007","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4902-165X","authenticated-orcid":false,"given":"Shirin","family":"Goshtasbpour","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8996-5076","authenticated-orcid":false,"given":"Fernando","family":"Perez-Cruz","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,3,17]]},"reference":[{"key":"11_CR1","unstructured":"Arbel, M., Matthews, A.G., Doucet, A.: Annealed flow transport Monte Carlo. arXiv preprint arXiv:2102.07501 (2021)"},{"issue":"1","key":"11_CR2","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1007\/s11222-010-9206-z","volume":"22","author":"G Behrens","year":"2012","unstructured":"Behrens, G., Friel, N., Hurn, M.: Tuning tempered transitions. Stat. Comput. 22(1), 65\u201378 (2012)","journal-title":"Stat. Comput."},{"issue":"518","key":"11_CR3","doi-asserted-by":"publisher","first-page":"859","DOI":"10.1080\/01621459.2017.1285773","volume":"112","author":"DM Blei","year":"2017","unstructured":"Blei, D.M., Kucukelbir, A., McAuliffe, J.D.: Variational inference: a review for statisticians. J. Am. Stat. Assoc. 112(518), 859\u2013877 (2017)","journal-title":"J. Am. Stat. Assoc."},{"key":"11_CR4","doi-asserted-by":"crossref","unstructured":"Brooks, S., Gelman, A., Jones, G., Meng, X.L.: Handbook of Markov Chain Monte Carlo. 
CRC Press, Boca Raton (2011)","DOI":"10.1201\/b10905"},{"key":"11_CR5","unstructured":"Burda, Y., Grosse, R., Salakhutdinov, R.: Importance weighted autoencoders. arXiv preprint arXiv:1509.00519 (2015)"},{"issue":"12","key":"11_CR6","doi-asserted-by":"publisher","first-page":"4028","DOI":"10.1016\/j.csda.2009.07.025","volume":"53","author":"B Calderhead","year":"2009","unstructured":"Calderhead, B., Girolami, M.: Estimating bayes factors via thermodynamic integration and population MCMC. Comput. Stat. Data Anal. 53(12), 4028\u20134045 (2009)","journal-title":"Comput. Stat. Data Anal."},{"key":"11_CR7","unstructured":"Caterini, A.L., Doucet, A., Sejdinovic, D.: Hamiltonian variational auto-encoder. arXiv preprint arXiv:1805.11328 (2018)"},{"issue":"3","key":"11_CR8","doi-asserted-by":"publisher","first-page":"411","DOI":"10.1111\/j.1467-9868.2006.00553.x","volume":"68","author":"P Del Moral","year":"2006","unstructured":"Del Moral, P., Doucet, A., Jasra, A.: Sequential Monte Carlo samplers. J. Roy. Stat. Soc. Ser. B (Stat. Methodol.) 68(3), 411\u2013436 (2006)","journal-title":"J. Roy. Stat. Soc. Ser. B (Stat. Methodol.)"},{"key":"11_CR9","unstructured":"Domke, J., Sheldon, D.: Importance weighting and variational inference. arXiv preprint arXiv:1808.09034 (2018)"},{"key":"11_CR10","unstructured":"Donahue, J., Kr\u00e4henb\u00fchl, P., Darrell, T.: Adversarial feature learning. arXiv preprint arXiv:1605.09782 (2016)"},{"key":"11_CR11","unstructured":"Elvira, V., Martino, L., Robert, C.P.: Rethinking the effective sample size. arXiv preprint arXiv:1809.04129 (2018)"},{"key":"11_CR12","unstructured":"Geffner, T., Domke, J.: MCMC variational inference via uncorrected Hamiltonian annealing. arXiv preprint arXiv:2107.04150 (2021)"},{"key":"11_CR13","doi-asserted-by":"crossref","unstructured":"Gelman, A., Meng, X.L.: Simulating normalizing constants: from importance sampling to bridge sampling to path sampling. Stat. Sci. 
13, 163\u2013185 (1998)","DOI":"10.1214\/ss\/1028905934"},{"key":"11_CR14","unstructured":"Goodfellow, I., et al.: Generative adversarial nets. In: Advances in Neural Information Processing Systems 27 (2014)"},{"key":"11_CR15","unstructured":"Grosse, R.B., Ancha, S., Roy, D.M.: Measuring the reliability of MCMC inference with bidirectional Monte Carlo. arXiv preprint arXiv:1606.02275 (2016)"},{"key":"11_CR16","unstructured":"Grosse, R.B., Ghahramani, Z., Adams, R.P.: Sandwiching the marginal likelihood using bidirectional Monte Carlo. arXiv preprint arXiv:1511.02543 (2015)"},{"key":"11_CR17","unstructured":"Grosse, R.B., Maddison, C.J., Salakhutdinov, R.: Annealing between distributions by averaging moments. In: Advances in Neural Information Processing Systems (NIPS), pp. 2769\u20132777. Citeseer (2013)"},{"key":"11_CR18","unstructured":"Gu, S., Ghahramani, Z., Turner, R.E.: Neural adaptive sequential Monte Carlo. arXiv preprint arXiv:1506.03338 (2015)"},{"key":"11_CR19","unstructured":"Hoffman, M.D.: Learning deep latent gaussian models with Markov chain Monte Carlo. In: International Conference on Machine Learning, pp. 1510\u20131519. PMLR (2017)"},{"key":"11_CR20","unstructured":"Huang, S., Makhzani, A., Cao, Y., Grosse, R.: Evaluating lossy compression rates of deep generative models. In: International Conference on Machine Learning, pp. 4444\u20134454. PMLR (2020)"},{"issue":"1","key":"11_CR21","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1111\/j.1467-9469.2010.00723.x","volume":"38","author":"A Jasra","year":"2011","unstructured":"Jasra, A., Stephens, D.A., Doucet, A., Tsagaris, T.: Inference for l\u00e9vy-driven stochastic volatility models via adaptive sequential Monte Carlo. Scand. J. Stat. 38(1), 1\u201322 (2011)","journal-title":"Scand. J. 
Stat."},{"key":"11_CR22","unstructured":"Johansen, A.M., Aston, J.A., Zhou, Y.: Towards automatic model comparison: An adaptive sequential Monte Carlo approach (2015)"},{"key":"11_CR23","unstructured":"Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Bengio, Y., LeCun, Y. (eds.) 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7\u20139 May 2015, Conference Track Proceedings (2015). http:\/\/arxiv.org\/abs\/1412.6980"},{"key":"11_CR24","unstructured":"Kingma, D.P., Welling, M.: Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 (2013)"},{"key":"11_CR25","unstructured":"Kingma, D.P., Salimans, T., Jozefowicz, R., Chen, X., Sutskever, I., Welling, M.: Improved variational inference with inverse autoregressive flow. In: Advances in Neural Information Processing Systems. vol. 29 (2016)"},{"key":"11_CR26","unstructured":"Kiwaki, T.: Variational optimization of annealing schedules. arXiv preprint arXiv:1502.05313 (2015)"},{"key":"11_CR27","doi-asserted-by":"crossref","unstructured":"Laubenfels, R.: Feynman-Kac Formulae: Genealogical and Interacting Particle Systems with Applications. Taylor & Francis, Milton Park (2005)","DOI":"10.1198\/jasa.2005.s52"},{"key":"11_CR28","unstructured":"Maddison, C.J., et al.: Filtering variational objectives. arXiv preprint arXiv:1705.09279 (2017)"},{"key":"11_CR29","unstructured":"Masrani, V., et al.: q-Paths: generalizing the geometric annealing path using power means. arXiv preprint arXiv:2107.00745 (2021)"},{"key":"11_CR30","unstructured":"Mnih, A., Rezende, D.: Variational inference for Monte Carlo objectives. In: International Conference on Machine Learning, pp. 2188\u20132196. PMLR (2016)"},{"key":"11_CR31","unstructured":"Naesseth, C., Linderman, S., Ranganath, R., Blei, D.: Variational sequential Monte Carlo. In: International Conference on Artificial Intelligence and Statistics, pp. 968\u2013977. 
PMLR (2018)"},{"key":"11_CR32","unstructured":"Naesseth, C., Lindsten, F., Schon, T.: Nested sequential Monte Carlo methods. In: International Conference on Machine Learning, pp. 1292\u20131301. PMLR (2015)"},{"key":"11_CR33","unstructured":"Naesseth, C., Ruiz, F., Linderman, S., Blei, D.: Reparameterization gradients through acceptance-rejection sampling algorithms. In: Artificial Intelligence and Statistics, pp. 489\u2013498. PMLR (2017)"},{"issue":"2","key":"11_CR34","doi-asserted-by":"publisher","first-page":"125","DOI":"10.1023\/A:1008923215028","volume":"11","author":"RM Neal","year":"2001","unstructured":"Neal, R.M.: Annealed importance sampling. Stat. Comput. 11(2), 125\u2013139 (2001)","journal-title":"Stat. Comput."},{"key":"11_CR35","unstructured":"Paszke, A., et al.: Automatic differentiation in pytorch (2017)"},{"key":"11_CR36","unstructured":"Ranganath, R., Gerrish, S., Blei, D.: Black box variational inference. In: Artificial intelligence and statistics, pp. 814\u2013822. PMLR (2014)"},{"key":"11_CR37","unstructured":"Rezende, D., Mohamed, S.: Variational inference with normalizing flows. In: International Conference on Machine Learning, pp. 1530\u20131538. PMLR (2015)"},{"key":"11_CR38","unstructured":"Rezende, D.J., Mohamed, S., Wierstra, D.: Stochastic backpropagation and approximate inference in deep generative models. In: International conference on machine learning, pp. 1278\u20131286. PMLR (2014)"},{"key":"11_CR39","unstructured":"Sajjadi, M.S., Bachem, O., Lucic, M., Bousquet, O., Gelly, S.: Assessing generative models via precision and recall. In: Advances in Neural Information Processing Systems 31 (2018)"},{"key":"11_CR40","doi-asserted-by":"crossref","unstructured":"Salakhutdinov, R., Murray, I.: On the quantitative analysis of deep belief networks. In: Proceedings of the 25th International Conference on Machine Learning, pp. 
872\u2013879 (2008)","DOI":"10.1145\/1390156.1390266"},{"key":"11_CR41","unstructured":"Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., Chen, X.: Improved techniques for training GANs. In: Advances in Neural Information Processing Systems 29 (2016)"},{"key":"11_CR42","unstructured":"Salimans, T., Kingma, D., Welling, M.: Markov chain Monte Carlo and variational inference: Bridging the gap. In: International Conference on Machine Learning, pp. 1218\u20131226. PMLR (2015)"},{"issue":"4","key":"11_CR43","doi-asserted-by":"publisher","first-page":"833","DOI":"10.1214\/06-BA127","volume":"1","author":"J Skilling","year":"2006","unstructured":"Skilling, J., et al.: Nested sampling for general bayesian computation. Bayesian Anal. 1(4), 833\u2013859 (2006)","journal-title":"Bayesian Anal."},{"key":"11_CR44","unstructured":"Theis, L., Oord, A.V.D., Bethge, M.: A note on the evaluation of generative models. arXiv preprint arXiv:1511.01844 (2015)"},{"key":"11_CR45","unstructured":"Thin, A., et al.: MetFlow: a new efficient method for bridging the gap between Markov chain Monte Carlo and variational inference. arXiv preprint arXiv:2002.12253 (2020)"},{"key":"11_CR46","unstructured":"Thin, A., Kotelevskii, N., Doucet, A., Durmus, A., Moulines, E., Panov, M.: Monte Carlo variational auto-encoders. In: International Conference on Machine Learning, pp. 10247\u201310257. PMLR (2021)"},{"key":"11_CR47","unstructured":"Tom, T.A.L.M.I., Wood, J.T.R.F.: Auto-encoding sequential Monte Carlo. stat 1050, 29 (2017)"},{"key":"11_CR48","unstructured":"Wolf, C., Karl, M., van der Smagt, P.: Variational inference with Hamiltonian Monte Carlo. arXiv preprint arXiv:1609.08203 (2016)"},{"key":"11_CR49","unstructured":"Wu, H., K\u00f6hler, J., No\u00e9, F.: Stochastic normalizing flows. arXiv preprint arXiv:2002.06707 (2020)"},{"key":"11_CR50","unstructured":"Wu, Y., Burda, Y., Salakhutdinov, R., Grosse, R.: On the quantitative analysis of decoder-based generative models. 
CoRR arXiv preprint arXiv:1611.04273 (2016)"},{"key":"11_CR51","unstructured":"Yu, Y., Zhang, W., Deng, Y.: Frechet inception distance (fid) for evaluating GANs"},{"key":"11_CR52","unstructured":"Zhang, G., Hsu, K., Li, J., Finn, C., Grosse, R.: Differentiable annealed importance sampling and the perils of gradient noise. arXiv preprint arXiv:2107.10211 (2021)"}],"container-title":["Lecture Notes in Computer Science","Machine Learning and Knowledge Discovery in Databases"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-26419-1_11","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,3,27]],"date-time":"2023-03-27T00:36:47Z","timestamp":1679877407000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-26419-1_11"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023]]},"ISBN":["9783031264184","9783031264191"],"references-count":52,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-26419-1_11","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"type":"print","value":"0302-9743"},{"type":"electronic","value":"1611-3349"}],"subject":[],"published":{"date-parts":[[2023]]},"assertion":[{"value":"17 March 2023","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"ECML PKDD","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Joint European Conference on Machine Learning and Knowledge Discovery in Databases","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Grenoble","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference 
Information"}},{"value":"France","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2022","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"19 September 2022","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"23 September 2022","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"22","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"ecml2022","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"https:\/\/2022.ecmlpkdd.org\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Double-blind","order":1,"name":"type","label":"Type","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"CMT","order":2,"name":"conference_management_system","label":"Conference Management System","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"1060","order":3,"name":"number_of_submissions_sent_for_review","label":"Number of Submissions Sent for Review","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"236","order":4,"name":"number_of_full_papers_accepted","label":"Number of Full Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference 
organizers)"}},{"value":"0","order":5,"name":"number_of_short_papers_accepted","label":"Number of Short Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"22% - The value is computed by the equation \"Number of Full Papers Accepted \/ Number of Submissions Sent for Review * 100\" and then rounded to a whole number.","order":6,"name":"acceptance_rate_of_full_papers","label":"Acceptance Rate of Full Papers","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"3-4","order":7,"name":"average_number_of_reviews_per_paper","label":"Average Number of Reviews per Paper","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"3-4","order":8,"name":"average_number_of_papers_per_reviewer","label":"Average Number of Papers per Reviewer","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"No","order":9,"name":"external_reviewers_involved","label":"External Reviewers Involved","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"17 demo track papers have been accepted from 28 submissions","order":10,"name":"additional_info_on_review_process","label":"Additional Info on Review Process","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}}]}}