{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,25]],"date-time":"2025-03-25T19:53:09Z","timestamp":1742932389401,"version":"3.40.3"},"publisher-location":"Cham","reference-count":42,"publisher":"Springer International Publishing","isbn-type":[{"type":"print","value":"9783030676636"},{"type":"electronic","value":"9783030676643"}],"license":[{"start":{"date-parts":[[2021,1,1]],"date-time":"2021-01-01T00:00:00Z","timestamp":1609459200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"},{"start":{"date-parts":[[2021,1,1]],"date-time":"2021-01-01T00:00:00Z","timestamp":1609459200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021]]},"DOI":"10.1007\/978-3-030-67664-3_26","type":"book-chapter","created":{"date-parts":[[2021,2,24]],"date-time":"2021-02-24T07:06:46Z","timestamp":1614150406000},"page":"431-446","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Automatic Tuning of Stochastic Gradient Descent with Bayesian Optimisation"],"prefix":"10.1007","author":[{"given":"Victor","family":"Picheny","sequence":"first","affiliation":[]},{"given":"Vincent","family":"Dutordoir","sequence":"additional","affiliation":[]},{"given":"Artem","family":"Artemev","sequence":"additional","affiliation":[]},{"given":"Nicolas","family":"Durrande","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,2,25]]},"reference":[{"key":"26_CR1","unstructured":"Andrychowicz, M., et al.: Learning to learn by gradient descent by gradient descent. In: Advances in Neural Information Processing Systems, pp. 3981\u20133989 (2016)"},{"key":"26_CR2","unstructured":"Baydin, A.G., Cornish, R., Rubio, D.M., Schmidt, M., Wood, F.: Online learning rate adaptation with hypergradient descent. arXiv preprint arXiv:1703.04782 (2017)"},{"key":"26_CR3","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"437","DOI":"10.1007\/978-3-642-35289-8_26","volume-title":"Neural Networks: Tricks of the Trade","author":"Y Bengio","year":"2012","unstructured":"Bengio, Y.: Practical recommendations for gradient-based training of deep architectures. In: Montavon, G., Orr, G.B., M\u00fcller, K.-R. (eds.) Neural Networks: Tricks of the Trade. LNCS, vol. 7700, pp. 437\u2013478. Springer, Heidelberg (2012). https:\/\/doi.org\/10.1007\/978-3-642-35289-8_26"},{"key":"26_CR4","unstructured":"Bergstra, J.S., Bardenet, R., Bengio, Y., K\u00e9gl, B.: Algorithms for hyper-parameter optimization. In: Advances in Neural Information Processing Systems, pp. 2546\u20132554 (2011)"},{"key":"26_CR5","unstructured":"Bogunovic, I., Scarlett, J., Jegelka, S., Cevher, V.: Adversarially robust optimization with Gaussian processes. In: Advances in Neural Information Processing Systems, pp. 5760\u20135770 (2018)"},{"key":"26_CR6","unstructured":"Bogunovic, I., Scarlett, J., Krause, A., Cevher, V.: Truncated variance reduction: a unified approach to Bayesian optimization and level-set estimation. In: Advances in Neural Information Processing Systems, pp. 1507\u20131515 (2016)"},{"key":"26_CR7","unstructured":"Chollet, F.: Keras implementation of ResNet for CIFAR. https:\/\/keras.io\/examples\/cifar10_resnet\/ (2009)"},{"key":"26_CR8","unstructured":"Domhan, T., Springenberg, J.T., Hutter, F.: Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. In: Twenty-Fourth International Joint Conference on Artificial Intelligence (2015)"},{"key":"26_CR9","unstructured":"Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12(7), 2121\u20132159 (2011)"},{"key":"26_CR10","unstructured":"Falkner, S., Klein, A., Hutter, F.: Bohb: Robust and efficient hyperparameter optimization at scale. arXiv preprint arXiv:1807.01774 (2018)"},{"issue":"1","key":"26_CR11","doi-asserted-by":"publisher","first-page":"490","DOI":"10.1137\/130949555","volume":"2","author":"D Ginsbourger","year":"2014","unstructured":"Ginsbourger, D., Baccou, J., Chevalier, C., Perales, F., Garland, N., Monerie, Y.: Bayesian adaptive reconstruction of profile optima and optimizers. SIAM\/ASA J. Uncertainty Quantification 2(1), 490\u2013510 (2014)","journal-title":"SIAM\/ASA J. Uncertainty Quantification"},{"key":"26_CR12","unstructured":"Gugger, S., Howard, J.: Adamw and super-convergence is now the fastest way to train neural nets (2018). https:\/\/www.fast.ai\/2018\/07\/02\/adam-weight-decay\/"},{"issue":"2","key":"26_CR13","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1162\/106365601750190398","volume":"9","author":"N Hansen","year":"2001","unstructured":"Hansen, N., Ostermeier, A.: Completely derandomized self-adaptation in evolution strategies. Evol. Comput. 9(2), 159\u2013195 (2001)","journal-title":"Evol. Comput."},{"key":"26_CR14","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770\u2013778 (2016)","DOI":"10.1109\/CVPR.2016.90"},{"key":"26_CR15","unstructured":"Hensman, J., Matthews, A.G.D.G., Ghahramani, Z.: Scalable variational Gaussian process classification. In: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics (2015)"},{"key":"26_CR16","unstructured":"Hoffman, M.D., Blei, D.M., Wang, C., Paisley, J.: Stochastic variational inference. Journal of Machine Learning Research (2013)"},{"key":"26_CR17","unstructured":"Kaufmann, E., Capp\u00e9, O., Garivier, A.: On Bayesian upper confidence bounds for bandit problems. In: Artificial Intelligence and Statistics, pp. 592\u2013600 (2012)"},{"key":"26_CR18","unstructured":"Klein, A., Falkner, S., Bartels, S., Hennig, P., Hutter, F.: Fast Bayesian optimization of machine learning hyperparameters on large datasets. In: International Conference on Artificial Intelligence and Statistics (AISTATS 2017), pp. 528\u2013536. PMLR (2017)"},{"key":"26_CR19","unstructured":"Klein, A., Falkner, S., Springenberg, J.T., Hutter, F.: Learning curve prediction with Bayesian neural networks. In: ICLR (2017)"},{"key":"26_CR20","unstructured":"Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. Technical report, Citeseer (2009)"},{"issue":"2","key":"26_CR21","doi-asserted-by":"publisher","first-page":"329","DOI":"10.1080\/0020718508961130","volume":"41","author":"I Leontaritis","year":"1985","unstructured":"Leontaritis, I., Billings, S.A.: Input-output parametric models for non-linear systems part ii: stochastic non-linear systems. Int. J. Control 41(2), 329\u2013344 (1985)","journal-title":"Int. J. Control"},{"issue":"185","key":"26_CR22","first-page":"1","volume":"18","author":"L Li","year":"2018","unstructured":"Li, L., Jamieson, K., DeSalvo, G., Rostamizadeh, A., Talwalkar, A.: Hyperband: a novel bandit-based approach to hyperparameter optimization. J. Mach. Learn. Res. 18(185), 1\u201352 (2018)","journal-title":"J. Mach. Learn. Res."},{"key":"26_CR23","unstructured":"Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. In: ICLR (2019)"},{"key":"26_CR24","unstructured":"Matthews, A.G.D.G., Hensman, J., Turner, R., Ghahramani, Z.: On sparse variational methods and the kullback-leibler divergence between stochastic Processes. J. Mach. Learn. Res. 51, 231\u2013239 (2016)"},{"key":"26_CR25","unstructured":"Matthews, A.G.D.G., et al.: Gpflow: a Gaussian process library using tensorflow. J. Mach. Learn. Res. 18(1), 1299\u20131304 (2017)"},{"key":"26_CR26","doi-asserted-by":"crossref","unstructured":"Nishida, K., Akimoto, Y.: PSA-CMA-ES: CMA-ES with population size adaptation. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 865\u2013872 (2018)","DOI":"10.1145\/3205455.3205467"},{"issue":"3","key":"26_CR27","doi-asserted-by":"publisher","first-page":"1074","DOI":"10.1016\/j.ejor.2018.03.017","volume":"270","author":"M Pearce","year":"2018","unstructured":"Pearce, M., Branke, J.: Continuous multi-task Bayesian optimisation with correlation. Eur. J. Oper. Res. 270(3), 1074\u20131085 (2018)","journal-title":"Eur. J. Oper. Res."},{"issue":"1","key":"26_CR28","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1137\/120882834","volume":"1","author":"V Picheny","year":"2013","unstructured":"Picheny, V., Ginsbourger, D.: A nonstationary space-time Gaussian process model for partially converged simulations. SIAM\/ASA J. Uncertainty Quantification 1(1), 57\u201378 (2013)","journal-title":"SIAM\/ASA J. Uncertainty Quantification"},{"key":"26_CR29","doi-asserted-by":"crossref","unstructured":"Poloczek, M., Wang, J., Frazier, P.I.: Warm starting Bayesian optimization. In: Proceedings of the 2016 Winter Simulation Conference, pp. 770\u2013781. IEEE Press (2016)","DOI":"10.1109\/WSC.2016.7822140"},{"key":"26_CR30","unstructured":"Reddi, S.J., Kale, S., Kumar, S.: On the convergence of ADAM and beyond. In: ICLR (2018)"},{"key":"26_CR31","unstructured":"Saul, A.D., Hensman, J., Vehtari, A., Lawrence, N.D., et al.: Chained Gaussian processes. In: AISTATS, pp. 1431\u20131440 (2016)"},{"issue":"1","key":"26_CR32","doi-asserted-by":"publisher","first-page":"148","DOI":"10.1109\/JPROC.2015.2494218","volume":"104","author":"B Shahriari","year":"2016","unstructured":"Shahriari, B., Swersky, K., Wang, Z., Adams, R.P., De Freitas, N.: Taking the human out of the loop: a review of Bayesian optimization. Proc. IEEE 104(1), 148\u2013175 (2016)","journal-title":"Proc. IEEE"},{"key":"26_CR33","doi-asserted-by":"crossref","unstructured":"Smith, L.N.: Cyclical learning rates for training neural networks. In: 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 464\u2013472. IEEE (2017)","DOI":"10.1109\/WACV.2017.58"},{"key":"26_CR34","doi-asserted-by":"crossref","unstructured":"Smith, L.N., Topin, N.: Super-convergence: very fast training of neural networks using large learning rates. In: Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications. vol. 11006, p. 1100612. International Society for Optics and Photonics (2019)","DOI":"10.1117\/12.2520589"},{"key":"26_CR35","unstructured":"Snoek, J., Larochelle, H., Adams, R.P.: Practical Bayesian optimization of machine learning algorithms. In: Advances in Neural Information Processing Systems, pp. 2951\u20132959 (2012)"},{"key":"26_CR36","unstructured":"Srinivas, N., Krause, A., Kakade, S., Seeger, M.: Gaussian Process optimization in the bandit setting: no regret and experimental design. In: Proceedings of the 27th International Conference on International Conference on Machine Learning, pp. 1015\u20131022. Omnipress (2010)"},{"key":"26_CR37","unstructured":"Swersky, K., Snoek, J., Adams, R.P.: Multi-task Bayesian optimization. In: Advances in Neural Information Processing Systems, pp. 2004\u20132012 (2013)"},{"key":"26_CR38","unstructured":"Swersky, K., Snoek, J., Adams, R.P.: Freeze-thaw Bayesian optimization. arXiv preprint arXiv:1406.3896 (2014)"},{"key":"26_CR39","unstructured":"Tieleman, T., Hinton, G.: Lecture 6.5-rmsprop: divide the gradient by a running average of its recent magnitude. COURSERA: Neural Netw. Mach. Learn. 4(2), 26\u201331 (2012)"},{"key":"26_CR40","unstructured":"Titsias, M.: Variational learning of inducing variables in sparse Gaussian processes. In: Artificial Intelligence and Statistics (2009)"},{"key":"26_CR41","unstructured":"van der Wilk, M., Dutordoir, V., John, S., Artemev, A., Adam, V., Hensman, J.: A framework for interdomain and multioutput Gaussian processes. arXiv:2003.01115 (2020). https:\/\/arxiv.org\/abs\/2003.01115"},{"key":"26_CR42","unstructured":"Wilson, J., Hutter, F., Deisenroth, M.: Maximizing acquisition functions for Bayesian optimization. In: Advances in Neural Information Processing Systems, pp. 9884\u20139895 (2018)"}],"container-title":["Lecture Notes in Computer Science","Machine Learning and Knowledge Discovery in Databases"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-030-67664-3_26","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,23]],"date-time":"2025-02-23T23:05:41Z","timestamp":1740351941000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-030-67664-3_26"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021]]},"ISBN":["9783030676636","9783030676643"],"references-count":42,"URL":"https:\/\/doi.org\/10.1007\/978-3-030-67664-3_26","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"type":"print","value":"0302-9743"},{"type":"electronic","value":"1611-3349"}],"subject":[],"published":{"date-parts":[[2021]]},"assertion":[{"value":"25 February 2021","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"ECML PKDD","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Joint European Conference on Machine Learning and Knowledge Discovery in Databases","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Ghent","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Belgium","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2020","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"14 September 2020","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"18 September 2020","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"ecml2020","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"https:\/\/ecmlpkdd2020.net\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Double-blind","order":1,"name":"type","label":"Type","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"EasyChair","order":2,"name":"conference_management_system","label":"Conference Management System","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"945","order":3,"name":"number_of_submissions_sent_for_review","label":"Number of Submissions Sent for Review","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"195","order":4,"name":"number_of_full_papers_accepted","label":"Number of Full Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"0","order":5,"name":"number_of_short_papers_accepted","label":"Number of Short Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"21% - The value is computed by the equation \"Number of Full Papers Accepted \/ Number of Submissions Sent for Review * 100\" and then rounded to a whole number.","order":6,"name":"acceptance_rate_of_full_papers","label":"Acceptance Rate of Full Papers","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"4,5","order":7,"name":"average_number_of_reviews_per_paper","label":"Average Number of Reviews per Paper","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"4,4","order":8,"name":"average_number_of_papers_per_reviewer","label":"Average Number of Papers per Reviewer","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"No","order":9,"name":"external_reviewers_involved","label":"External Reviewers Involved","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"The conference took place virtually due to the COVID-19 pandemic","order":10,"name":"additional_info_on_review_process","label":"Additional Info on Review Process","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}}]}}