{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,16]],"date-time":"2026-01-16T00:19:17Z","timestamp":1768522757357,"version":"3.49.0"},"reference-count":90,"publisher":"Association for Computing Machinery (ACM)","issue":"5","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2022,1]]},"abstract":"<jats:p>Hyper-parameter optimization is crucial for pushing the accuracy of a deep learning model to its limits. However, a hyper-parameter optimization job, referred to as a study, involves numerous trials of training a model using different training knobs, and therefore is very computation-heavy, typically taking hours and days to finish.<\/jats:p>\n          <jats:p>We observe that trials issued from hyper-parameter optimization algorithms often share common hyper-parameter sequence prefixes. Based on this observation, we propose Hippo, a hyper-parameter optimization system that reuses computation across trials to reduce the overall amount of computation significantly. Instead of treating each trial independently as in existing hyper-parameter optimization systems, Hippo breaks down the hyper-parameter sequences into stages and merges common stages to form a tree of stages (a stage tree). Hippo maintains an internal data structure, search plan, to manage the current status and history of a study, and employs a critical path based scheduler to minimize the overall study completion time. Hippo applies to not only single studies but multi-study scenarios as well. 
Evaluations show that Hippo's stage-based execution strategy outperforms trial-based methods for several models and hyper-parameter optimization algorithms, reducing end-to-end training time by up to 2.76\u00d7 (3.53\u00d7) and GPU-hours by up to 4.81\u00d7 (6.77\u00d7), for single (multiple) studies.<\/jats:p>","DOI":"10.14778\/3510397.3510402","type":"journal-article","created":{"date-parts":[[2022,5,18]],"date-time":"2022-05-18T22:23:10Z","timestamp":1652912590000},"page":"1038-1052","source":"Crossref","is-referenced-by-count":2,"title":["Hippo"],"prefix":"10.14778","volume":"15","author":[{"given":"Ahnjae","family":"Shin","sequence":"first","affiliation":[{"name":"Seoul National University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Joo Seong","family":"Jeong","sequence":"additional","affiliation":[{"name":"Seoul National University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Do Yoon","family":"Kim","sequence":"additional","affiliation":[{"name":"University of Michigan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Soyoung","family":"Jung","sequence":"additional","affiliation":[{"name":"Seoul National University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Byung-Gon","family":"Chun","sequence":"additional","affiliation":[{"name":"Seoul National University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,5,18]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi , Paul Barham , Jianmin Chen , Zhifeng Chen , Andy Davis , Jeffrey Dean , Matthieu Devin , Sanjay Ghemawat , Geoffrey Irving , Michael Isard , Manjunath Kudlur , Josh Levenberg , Rajat Monga , Sherry Moore , Derek G. 
Murray , Benoit Steiner , Paul Tucker , Vijay Vasudevan , Pete Warden , Martin Wicke , Yuan Yu , and Xiaoqiang Zheng . 2016 . TensorFlow: A System for Large-Scale Machine Learning . In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation ( Savannah, GA, USA) (OSDI'16). USENIX Association, USA, 265--283. Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A System for Large-Scale Machine Learning. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (Savannah, GA, USA) (OSDI'16). USENIX Association, USA, 265--283."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330701"},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of The 33rd International Conference on Machine Learning (Proceedings of Machine Learning Research), Maria Florina Balcan and Kilian Q. 
Weinberger (Eds.)","volume":"48","author":"Amodei Dario","year":"2016","unstructured":"Dario Amodei , Sundaram Ananthanarayanan , Rishita Anubhai , Jingliang Bai , Eric Battenberg , Carl Case , Jared Casper , Bryan Catanzaro , Qiang Cheng , Guoliang Chen , Jie Chen , Jingdong Chen , Zhijie Chen , Mike Chrzanowski , Adam Coates , Greg Diamos , Ke Ding , Niandong Du , Erich Elsen , Jesse Engel , Weiwei Fang , Linxi Fan , Christopher Fougner , Liang Gao , Caixia Gong , Awni Hannun , Tony Han , Lappi Johannes , Bing Jiang , Cai Ju , Billy Jun , Patrick LeGresley , Libby Lin , Junjie Liu , Yang Liu , Weigao Li , Xiangang Li , Dongpeng Ma , Sharan Narang , Andrew Ng , Sherjil Ozair , Yiping Peng , Ryan Prenger , Sheng Qian , Zongfeng Quan , Jonathan Raiman , Vinay Rao , Sanjeev Satheesh , David Seetapun , Shubho Sengupta , Kavya Srinet , Anuroop Sriram , Haiyuan Tang , Liliang Tang , Chong Wang , Jidong Wang , Kaifu Wang , Yi Wang , Zhijian Wang , Zhiqian Wang , Shuang Wu , Likai Wei , Bo Xiao , Wen Xie , Yan Xie , Dani Yogatama , Bin Yuan , Jun Zhan , and Zhenyao Zhu . 2016 . Deep Speech 2 : End-to-End Speech Recognition in English and Mandarin . In Proceedings of The 33rd International Conference on Machine Learning (Proceedings of Machine Learning Research), Maria Florina Balcan and Kilian Q. Weinberger (Eds.) , Vol. 48 . PMLR, New York, New York, USA, 173--182. 
https:\/\/proceedings.mlr.press\/v48\/amodei16.html Dario Amodei, Sundaram Ananthanarayanan, Rishita Anubhai, Jingliang Bai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Qiang Cheng, Guoliang Chen, Jie Chen, Jingdong Chen, Zhijie Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Ke Ding, Niandong Du, Erich Elsen, Jesse Engel, Weiwei Fang, Linxi Fan, Christopher Fougner, Liang Gao, Caixia Gong, Awni Hannun, Tony Han, Lappi Johannes, Bing Jiang, Cai Ju, Billy Jun, Patrick LeGresley, Libby Lin, Junjie Liu, Yang Liu, Weigao Li, Xiangang Li, Dongpeng Ma, Sharan Narang, Andrew Ng, Sherjil Ozair, Yiping Peng, Ryan Prenger, Sheng Qian, Zongfeng Quan, Jonathan Raiman, Vinay Rao, Sanjeev Satheesh, David Seetapun, Shubho Sengupta, Kavya Srinet, Anuroop Sriram, Haiyuan Tang, Liliang Tang, Chong Wang, Jidong Wang, Kaifu Wang, Yi Wang, Zhijian Wang, Zhiqian Wang, Shuang Wu, Likai Wei, Bo Xiao, Wen Xie, Yan Xie, Dani Yogatama, Bin Yuan, Jun Zhan, and Zhenyao Zhu. 2016. Deep Speech 2 : End-to-End Speech Recognition in English and Mandarin. In Proceedings of The 33rd International Conference on Machine Learning (Proceedings of Machine Learning Research), Maria Florina Balcan and Kilian Q. Weinberger (Eds.), Vol. 48. PMLR, New York, New York, USA, 173--182. https:\/\/proceedings.mlr.press\/v48\/amodei16.html"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV48630.2021.00305"},{"key":"e_1_2_1_5_1","volume-title":"6th International Conference on Learning Representations, ICLR","author":"Baydin Atilim Gunes","year":"2018","unstructured":"Atilim Gunes Baydin , Robert Cornish , David Mart\u00ednez-Rubio , Mark Schmidt , and Frank Wood . 2018. Online Learning Rate Adaptation with Hypergradient Descent . In 6th International Conference on Learning Representations, ICLR 2018 , Vancouver, BC , Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview .net. 
https:\/\/openreview.net\/forum?id=BkrsAzWAb Atilim Gunes Baydin, Robert Cornish, David Mart\u00ednez-Rubio, Mark Schmidt, and Frank Wood. 2018. Online Learning Rate Adaptation with Hypergradient Descent. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net. https:\/\/openreview.net\/forum?id=BkrsAzWAb"},{"key":"e_1_2_1_6_1","unstructured":"Kurian Benoy. 2020. Classifying Flowers with Fastai2. https:\/\/www.kaggle.com\/kurianbenoy\/classifying-flowers-with-fastai2\/notebook  Kurian Benoy. 2020. Classifying Flowers with Fastai2. https:\/\/www.kaggle.com\/kurianbenoy\/classifying-flowers-with-fastai2\/notebook"},{"key":"e_1_2_1_7_1","unstructured":"Xavier Bouthillier and Ga\u00ebl Varoquaux. 2020. Survey of machine-learning experimental methods at NeurIPS2019 and ICLR2020. Research Report. Inria Saclay Ile de France. https:\/\/hal.archives-ouvertes.fr\/hal-02447823  Xavier Bouthillier and Ga\u00ebl Varoquaux. 2020. Survey of machine-learning experimental methods at NeurIPS2019 and ICLR2020. Research Report. Inria Saclay Ile de France. https:\/\/hal.archives-ouvertes.fr\/hal-02447823"},{"key":"e_1_2_1_8_1","unstructured":"Tom B. Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell Sandhini Agarwal Ariel Herbert-Voss Gretchen Krueger Tom Henighan Rewon Child Aditya Ramesh Daniel M. Ziegler Jeffrey Wu Clemens Winter Christopher Hesse Mark Chen Eric Sigler Mateusz Litwin Scott Gray Benjamin Chess Jack Clark Christopher Berner Sam McCandlish Alec Radford Ilya Sutskever and Dario Amodei. 2020. Language Models are Few-Shot Learners. arXiv:2005.14165 [cs.CL]  Tom B. 
Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell Sandhini Agarwal Ariel Herbert-Voss Gretchen Krueger Tom Henighan Rewon Child Aditya Ramesh Daniel M. Ziegler Jeffrey Wu Clemens Winter Christopher Hesse Mark Chen Eric Sigler Mateusz Litwin Scott Gray Benjamin Chess Jack Clark Christopher Berner Sam McCandlish Alec Radford Ilya Sutskever and Dario Amodei. 2020. Language Models are Few-Shot Learners. arXiv:2005.14165 [cs.CL]"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2983323.2983669"},{"key":"e_1_2_1_10_1","unstructured":"Tianqi Chen Mu Li Yutian Li Min Lin Naiyan Wang Minjie Wang Tianjun Xiao Bing Xu Chiyuan Zhang and Zheng Zhang. 2015. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. arXiv:1512.01274 [cs.DC]  Tianqi Chen Mu Li Yutian Li Min Lin Naiyan Wang Minjie Wang Tianjun Xiao Bing Xu Chiyuan Zhang and Zheng Zhang. 2015. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. arXiv:1512.01274 [cs.DC]"},{"key":"e_1_2_1_11_1","unstructured":"Fran\u00e7ois Chollet et al. 2015. Keras. https:\/\/github.com\/fchollet\/keras.  Fran\u00e7ois Chollet et al. 2015. Keras. https:\/\/github.com\/fchollet\/keras."},{"key":"e_1_2_1_12_1","volume-title":"4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2--4, 2016, Conference Track Proceedings, Yoshua Bengio and Yann LeCun (Eds.). http:\/\/arxiv.org\/abs\/1511","author":"Clevert Djork-Arn\u00e9","year":"2016","unstructured":"Djork-Arn\u00e9 Clevert , Thomas Unterthiner , and Sepp Hochreiter . 2016 . Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) . In 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2--4, 2016, Conference Track Proceedings, Yoshua Bengio and Yann LeCun (Eds.). 
http:\/\/arxiv.org\/abs\/1511 .07289 Djork-Arn\u00e9 Clevert, Thomas Unterthiner, and Sepp Hochreiter. 2016. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs). In 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2--4, 2016, Conference Track Proceedings, Yoshua Bengio and Yann LeCun (Eds.). http:\/\/arxiv.org\/abs\/1511.07289"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.23919\/Eusipco47968.2020.9287362"},{"key":"e_1_2_1_14_1","volume-title":"Clipper: A Low-Latency Online Prediction Serving System. In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17)","author":"Crankshaw Daniel","year":"2017","unstructured":"Daniel Crankshaw , Xin Wang , Guilio Zhou , Michael J. Franklin , Joseph E. Gonzalez , and Ion Stoica . 2017 . Clipper: A Low-Latency Online Prediction Serving System. In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17) . USENIX Association, Boston, MA, 613--627. https:\/\/www.usenix.org\/conference\/nsdi17\/technical-sessions\/presentation\/crankshaw Daniel Crankshaw, Xin Wang, Guilio Zhou, Michael J. Franklin, Joseph E. Gonzalez, and Ion Stoica. 2017. Clipper: A Low-Latency Online Prediction Serving System. In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17). USENIX Association, Boston, MA, 613--627. https:\/\/www.usenix.org\/conference\/nsdi17\/technical-sessions\/presentation\/crankshaw"},{"key":"e_1_2_1_15_1","volume-title":"Lin (Eds.)","volume":"33","author":"Cubuk Ekin Dogus","year":"2020","unstructured":"Ekin Dogus Cubuk , Barret Zoph , Jon Shlens , and Quoc Le . 2020 . RandAugment: Practical Automated Data Augmentation with a Reduced Search Space. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H . Lin (Eds.) , Vol. 33 . Curran Associates, Inc. , 18613--18624. 
https:\/\/proceedings.neurips.cc\/paper\/2020\/file\/d85b63ef0ccb114d0a3bb7b7d808028f-Paper.pdf Ekin Dogus Cubuk, Barret Zoph, Jon Shlens, and Quoc Le. 2020. RandAugment: Practical Automated Data Augmentation with a Reduced Search Space. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.), Vol. 33. Curran Associates, Inc., 18613--18624. https:\/\/proceedings.neurips.cc\/paper\/2020\/file\/d85b63ef0ccb114d0a3bb7b7d808028f-Paper.pdf"},{"key":"e_1_2_1_16_1","volume-title":"Gibbons","author":"Cui Henggang","year":"2018","unstructured":"Henggang Cui , Gregory R. Ganger , and Phillip B . Gibbons . 2018 . MLtuner: System Support for Automatic Machine Learning Tuning . arXiv:1803.07445 [cs.LG] Henggang Cui, Gregory R. Ganger, and Phillip B. Gibbons. 2018. MLtuner: System Support for Automatic Machine Learning Tuning. arXiv:1803.07445 [cs.LG]"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3399579.3399870"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"e_1_2_1_19_1","unstructured":"Aditya Devarakonda Maxim Naumov and Michael Garland. 2018. AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks. arXiv:1712.02029 [cs.LG]  Aditya Devarakonda Maxim Naumov and Michael Garland. 2018. AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks. arXiv:1712.02029 [cs.LG]"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/n19-1423"},{"key":"e_1_2_1_21_1","volume-title":"Taylor","author":"Devries Terrance","year":"2017","unstructured":"Terrance Devries and Graham W . Taylor . 2017 . Improved Regularization of Convolutional Neural Networks with Cutout . arXiv:1708.04552 (2017). Terrance Devries and Graham W. Taylor. 2017. Improved Regularization of Convolutional Neural Networks with Cutout. 
arXiv:1708.04552 (2017)."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.5555\/2832581.2832731"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.14778\/2168651.2168659"},{"key":"e_1_2_1_24_1","unstructured":"Canva Engineering. 2021. Machine learning hyperparameter optimization with Argo. https:\/\/canvatechblog.com\/machine-learning-hyperparameter-optimization-with-argo-a60d70b1fc8c.  Canva Engineering. 2021. Machine learning hyperparameter optimization with Argo. https:\/\/canvatechblog.com\/machine-learning-hyperparameter-optimization-with-argo-a60d70b1fc8c."},{"key":"e_1_2_1_25_1","unstructured":"Lex Fridman Jack Terwilliger and Benedikt Jenik. 2019. DeepTraffic: Crowd-sourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation. arXiv:1801.02805 [cs.NE]  Lex Fridman Jack Terwilliger and Benedikt Jenik. 2019. DeepTraffic: Crowd-sourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation. arXiv:1801.02805 [cs.NE]"},{"key":"e_1_2_1_26_1","volume-title":"Yuan Tang, Ramdoot Pydipaty, and Amit Kumar Saha.","author":"George Johnu","year":"2020","unstructured":"Johnu George , Ce Gao , Richard Liu , Hou Gang Liu , Yuan Tang, Ramdoot Pydipaty, and Amit Kumar Saha. 2020 . A Scalable and Cloud-Native Hyperparameter Tuning System . arXiv:2006.02085 [cs.DC] Johnu George, Ce Gao, Richard Liu, Hou Gang Liu, Yuan Tang, Ramdoot Pydipaty, and Amit Kumar Saha. 2020. A Scalable and Cloud-Native Hyperparameter Tuning System. arXiv:2006.02085 [cs.DC]"},{"key":"e_1_2_1_27_1","volume-title":"John Elliot Karro, and D","author":"Golovin Daniel","year":"2017","unstructured":"Daniel Golovin , Benjamin Solnik , Subhodeep Moitra , Greg Kochanski , John Elliot Karro, and D . Sculley (Eds.). 2017 . Google Vizier : A Service for Black-Box Optimization . 
http:\/\/www.kdd.org\/kdd2017\/papers\/view\/google-vizier-a-service-for-black-box-optimization Daniel Golovin, Benjamin Solnik, Subhodeep Moitra, Greg Kochanski, John Elliot Karro, and D. Sculley (Eds.). 2017. Google Vizier: A Service for Black-Box Optimization. http:\/\/www.kdd.org\/kdd2017\/papers\/view\/google-vizier-a-service-for-black-box-optimization"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the 36th International Conference on Machine Learning (Proceedings of Machine Learning Research), Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.)","volume":"97","author":"Gong Linyuan","year":"2019","unstructured":"Linyuan Gong , Di He , Zhuohan Li , Tao Qin , Liwei Wang , and Tieyan Liu . 2019 . Efficient Training of BERT by Progressively Stacking . In Proceedings of the 36th International Conference on Machine Learning (Proceedings of Machine Learning Research), Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.) , Vol. 97 . PMLR, 2337--2346. https:\/\/proceedings.mlr.press\/v97\/gong19a.html Linyuan Gong, Di He, Zhuohan Li, Tao Qin, Liwei Wang, and Tieyan Liu. 2019. Efficient Training of BERT by Progressively Stacking. In Proceedings of the 36th International Conference on Machine Learning (Proceedings of Machine Learning Research), Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.), Vol. 97. PMLR, 2337--2346. https:\/\/proceedings.mlr.press\/v97\/gong19a.html"},{"key":"e_1_2_1_29_1","volume-title":"large minibatch sgd: Training imagenet in 1 hour. arXiv preprint arXiv:1706.02677","author":"Goyal Priya","year":"2017","unstructured":"Priya Goyal , Piotr Doll\u00e1r , Ross Girshick , Pieter Noordhuis , Lukasz Wesolowski , Aapo Kyrola , Andrew Tulloch , Yangqing Jia , and Kaiming He. 2017. Accurate , large minibatch sgd: Training imagenet in 1 hour. arXiv preprint arXiv:1706.02677 ( 2017 ). Priya Goyal, Piotr Doll\u00e1r, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. 2017. 
Accurate, large minibatch sgd: Training imagenet in 1 hour. arXiv preprint arXiv:1706.02677 (2017)."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.406"},{"key":"e_1_2_1_31_1","volume-title":"Nectar: Automatic Management of Data and Computation in Datacenters. In 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI 10)","author":"Gunda Pradeep Kumar","year":"2010","unstructured":"Pradeep Kumar Gunda , Lenin Ravindranath , Chandramohan A. Thekkath , Yuan Yu , and Li Zhuang . 2010 . Nectar: Automatic Management of Data and Computation in Datacenters. In 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI 10) . USENIX Association, Vancouver, BC. https:\/\/www.usenix.org\/conference\/osdi10\/nectar-automatic-management-data-and-computation-datacenters Pradeep Kumar Gunda, Lenin Ravindranath, Chandramohan A. Thekkath, Yuan Yu, and Li Zhuang. 2010. Nectar: Automatic Management of Data and Computation in Datacenters. In 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI 10). USENIX Association, Vancouver, BC. https:\/\/www.usenix.org\/conference\/osdi10\/nectar-automatic-management-data-and-computation-datacenters"},{"key":"e_1_2_1_32_1","volume-title":"Ng","author":"Hannun Awni","year":"2014","unstructured":"Awni Hannun , Carl Case , Jared Casper , Bryan Catanzaro , Greg Diamos , Erich Elsen , Ryan Prenger , Sanjeev Satheesh , Shubho Sengupta , Adam Coates , and Andrew Y . Ng . 2014 . Deep Speech : Scaling up end-to-end speech recognition. Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubho Sengupta, Adam Coates, and Andrew Y. Ng. 2014. 
Deep Speech: Scaling up end-to-end speech recognition."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.imu.2021.100709"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_35_1","first-page":"2","article-title":"Neural networks for machine learning lecture 6a overview of mini-batch gradient descent","volume":"14","author":"Hinton Geoffrey","year":"2012","unstructured":"Geoffrey Hinton , Nitish Srivastava , and Kevin Swersky . 2012 . Neural networks for machine learning lecture 6a overview of mini-batch gradient descent . Cited on 14 , 8 (2012), 2 . http:\/\/www.cs.toronto.edu\/~tijmen\/csc321\/slides\/lecture_slides_lec6.pdf. Geoffrey Hinton, Nitish Srivastava, and Kevin Swersky. 2012. Neural networks for machine learning lecture 6a overview of mini-batch gradient descent. Cited on 14, 8 (2012), 2. http:\/\/www.cs.toronto.edu\/~tijmen\/csc321\/slides\/lecture_slides_lec6.pdf.","journal-title":"Cited on"},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of the 36th International Conference on Machine Learning (Proceedings of Machine Learning Research), Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.)","volume":"97","author":"Ho Daniel","year":"2019","unstructured":"Daniel Ho , Eric Liang , Xi Chen , Ion Stoica , and Pieter Abbeel . 2019 . Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules . In Proceedings of the 36th International Conference on Machine Learning (Proceedings of Machine Learning Research), Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.) , Vol. 97 . PMLR, 2731--2741. https:\/\/proceedings.mlr.press\/v97\/ho19b.html Daniel Ho, Eric Liang, Xi Chen, Ion Stoica, and Pieter Abbeel. 2019. Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules. In Proceedings of the 36th International Conference on Machine Learning (Proceedings of Machine Learning Research), Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.), Vol. 97. 
PMLR, 2731--2741. https:\/\/proceedings.mlr.press\/v97\/ho19b.html"},{"key":"e_1_2_1_37_1","unstructured":"Elad Hoffer Berry Weinstein Itay Hubara Tal Ben-Nun Torsten Hoefler and Daniel Soudry. 2019. Mix & Match: training convnets with mixed image sizes for improved accuracy speed and scale resiliency. arXiv:1908.08986 [cs.CV]  Elad Hoffer Berry Weinstein Itay Hubara Tal Ben-Nun Torsten Hoefler and Daniel Soudry. 2019. Mix & Match: training convnets with mixed image sizes for improved accuracy speed and scale resiliency. arXiv:1908.08986 [cs.CV]"},{"key":"e_1_2_1_38_1","volume-title":"Training Imagenet in 3 hours for USD 25","author":"Howard Jeremy","year":"2018","unstructured":"Jeremy Howard . 2018. Training Imagenet in 3 hours for USD 25 ; and CIFAR10 for USD 0.26. https:\/\/www.fast.ai\/ 2018 \/04\/30\/dawnbench-fastai\/ Jeremy Howard. 2018. Training Imagenet in 3 hours for USD 25; and CIFAR10 for USD 0.26. https:\/\/www.fast.ai\/2018\/04\/30\/dawnbench-fastai\/"},{"key":"e_1_2_1_39_1","unstructured":"IAFOSS. 2018. Similarity DenseNet121 [0.805LB] kernel time limit. https:\/\/www.kaggle.com\/iafoss\/similarity-densenet121-0-805lb-kernel-time-limit\/notebook  IAFOSS. 2018. Similarity DenseNet121 [0.805LB] kernel time limit. https:\/\/www.kaggle.com\/iafoss\/similarity-densenet121-0-805lb-kernel-time-limit\/notebook"},{"key":"e_1_2_1_40_1","volume-title":"Proceedings of the 32nd International Conference on Machine Learning (Proceedings of Machine Learning Research), Francis Bach and David Blei (Eds.)","volume":"37","author":"Ioffe Sergey","year":"2015","unstructured":"Sergey Ioffe and Christian Szegedy . 2015 . Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift . In Proceedings of the 32nd International Conference on Machine Learning (Proceedings of Machine Learning Research), Francis Bach and David Blei (Eds.) , Vol. 37 . PMLR, Lille, France, 448--456. 
https:\/\/proceedings.mlr.press\/v37\/ioffe15.html Sergey Ioffe and Christian Szegedy. 2015. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proceedings of the 32nd International Conference on Machine Learning (Proceedings of Machine Learning Research), Francis Bach and David Blei (Eds.), Vol. 37. PMLR, Lille, France, 448--456. https:\/\/proceedings.mlr.press\/v37\/ioffe15.html"},{"key":"e_1_2_1_41_1","unstructured":"Max Jaderberg Valentin Dalibard Simon Osindero Wojciech M. Czarnecki Jeff Donahue Ali Razavi Oriol Vinyals Tim Green Iain Dunning Karen Simonyan Chrisantha Fernando and Koray Kavukcuoglu. 2017. Population Based Training of Neural Networks. arXiv:1711.09846 [cs.LG]  Max Jaderberg Valentin Dalibard Simon Osindero Wojciech M. Czarnecki Jeff Donahue Ali Razavi Oriol Vinyals Tim Green Iain Dunning Karen Simonyan Chrisantha Fernando and Koray Kavukcuoglu. 2017. Population Based Training of Neural Networks. arXiv:1711.09846 [cs.LG]"},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (Proceedings of Machine Learning Research), Arthur Gretton and Christian C. Robert (Eds.)","volume":"51","author":"Jamieson Kevin","year":"2016","unstructured":"Kevin Jamieson and Ameet Talwalkar . 2016 . Non-stochastic Best Arm Identification and Hyperparameter Optimization . In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (Proceedings of Machine Learning Research), Arthur Gretton and Christian C. Robert (Eds.) , Vol. 51 . PMLR, Cadiz, Spain, 240--248. https:\/\/proceedings.mlr.press\/v51\/jamieson16.html Kevin Jamieson and Ameet Talwalkar. 2016. Non-stochastic Best Arm Identification and Hyperparameter Optimization. In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (Proceedings of Machine Learning Research), Arthur Gretton and Christian C. Robert (Eds.), Vol. 51. 
PMLR, Cadiz, Spain, 240--248. https:\/\/proceedings.mlr.press\/v51\/jamieson16.html"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/2647868.2654889"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3190656"},{"key":"e_1_2_1_45_1","unstructured":"Kiran U Kamath. 2020. fastai MultiLabel Classification using Kfold CV. https:\/\/www.kaggle.com\/kirankamat\/fastai-multilabel-classification-using-kfold-cv\/notebook  Kiran U Kamath. 2020. fastai MultiLabel Classification using Kfold CV. https:\/\/www.kaggle.com\/kirankamat\/fastai-multilabel-classification-using-kfold-cv\/notebook"},{"key":"e_1_2_1_46_1","volume-title":"6th International Conference on Learning Representations, ICLR","author":"Karras Tero","year":"2018","unstructured":"Tero Karras , Timo Aila , Samuli Laine , and Jaakko Lehtinen . 2018. Progressive Growing of GANs for Improved Quality, Stability, and Variation . In 6th International Conference on Learning Representations, ICLR 2018 , Vancouver, BC , Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview .net. https:\/\/openreview.net\/forum?id=Hk99zCeAb Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. 2018. Progressive Growing of GANs for Improved Quality, Stability, and Variation. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net. https:\/\/openreview.net\/forum?id=Hk99zCeAb"},{"key":"e_1_2_1_47_1","volume-title":"Automated Learning Rate Scheduler for Large-batch Training. In 8th ICML Workshop on Automated Machine Learning (AutoML).","author":"Kim Chiheon","year":"2021","unstructured":"Chiheon Kim , Saehoon Kim , Jongmin Kim , Donghoon Lee , and Sungwoong Kim . 2021 . Automated Learning Rate Scheduler for Large-batch Training. In 8th ICML Workshop on Automated Machine Learning (AutoML). Chiheon Kim, Saehoon Kim, Jongmin Kim, Donghoon Lee, and Sungwoong Kim. 2021. 
Automated Learning Rate Scheduler for Large-batch Training. In 8th ICML Workshop on Automated Machine Learning (AutoML)."},{"key":"e_1_2_1_48_1","volume-title":"CHOPT: Automated Hyperparameter Optimization Framework for Cloud-Based Machine Learning Platforms. arXiv:1810.03527 [cs.LG]","author":"Kim Jinwoong","year":"2018","unstructured":"Jinwoong Kim , Minkyu Kim , Heungseok Park , Ernar Kusdavletov , Dongjun Lee , Adrian Kim , Ji-Hoon Kim , Jung-Woo Ha , and Nako Sung . 2018 . CHOPT: Automated Hyperparameter Optimization Framework for Cloud-Based Machine Learning Platforms. arXiv:1810.03527 [cs.LG] Jinwoong Kim, Minkyu Kim, Heungseok Park, Ernar Kusdavletov, Dongjun Lee, Adrian Kim, Ji-Hoon Kim, Jung-Woo Ha, and Nako Sung. 2018. CHOPT: Automated Hyperparameter Optimization Framework for Cloud-Based Machine Learning Platforms. arXiv:1810.03527 [cs.LG]"},{"key":"e_1_2_1_49_1","volume-title":"Kingma and Jimmy Ba","author":"Diederik","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Ba . 2015 . Adam : A Method for Stochastic Optimization. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7--9, 2015, Conference Track Proceedings, Yoshua Bengio and Yann LeCun (Eds .). http:\/\/arxiv.org\/abs\/1412.6980 Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7--9, 2015, Conference Track Proceedings, Yoshua Bengio and Yann LeCun (Eds.). http:\/\/arxiv.org\/abs\/1412.6980"},{"key":"e_1_2_1_50_1","volume-title":"Proceedings, Part V 16","author":"Kolesnikov Alexander","year":"2020","unstructured":"Alexander Kolesnikov , Lucas Beyer , Xiaohua Zhai , Joan Puigcerver , Jessica Yung , Sylvain Gelly , and Neil Houlsby . 2020 . Big transfer (bit): General visual representation learning. In Computer Vision-ECCV 2020: 16th European Conference, Glasgow, UK, August 23--28, 2020 , Proceedings, Part V 16 . 
Springer, 491--507. Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, and Neil Houlsby. 2020. Big transfer (bit): General visual representation learning. In Computer Vision-ECCV 2020: 16th European Conference, Glasgow, UK, August 23--28, 2020, Proceedings, Part V 16. Springer, 491--507."},{"key":"e_1_2_1_51_1","volume-title":"Learning Multiple Layers of Features from Tiny Images. Tech report","author":"Krizhevsky Alex","year":"2009","unstructured":"Alex Krizhevsky . 2009. Learning Multiple Layers of Features from Tiny Images. Tech report ( 2009 ). Alex Krizhevsky. 2009. Learning Multiple Layers of Features from Tiny Images. Tech report (2009)."},{"key":"e_1_2_1_52_1","unstructured":"Lak Lakshmanan and Wenzhe Li. 2018. Hyperparameter tuning on Google Cloud Platform is now faster and smarter. https:\/\/cloud.google.com\/blog\/products\/gcp\/hyperparameter-tuning-on-google-cloud-platform-is-now-faster-and-smarter.  Lak Lakshmanan and Wenzhe Li. 2018. Hyperparameter tuning on Google Cloud Platform is now faster and smarter. https:\/\/cloud.google.com\/blog\/products\/gcp\/hyperparameter-tuning-on-google-cloud-platform-is-now-faster-and-smarter."},{"key":"e_1_2_1_53_1","volume-title":"PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)","author":"Lee Yunseong","year":"2018","unstructured":"Yunseong Lee , Alberto Scolari , Byung-Gon Chun , Marco Domenico Santam-brogio, Markus Weimer , and Matteo Interlandi . 2018 . PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18) . USENIX Association, Carlsbad, CA, 611--626. https:\/\/www.usenix.org\/conference\/osdi18\/presentation\/lee Yunseong Lee, Alberto Scolari, Byung-Gon Chun, Marco Domenico Santam-brogio, Markus Weimer, and Matteo Interlandi. 2018. 
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). USENIX Association, Carlsbad, CA, 611--626. https:\/\/www.usenix.org\/conference\/osdi18\/presentation\/lee"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/2670979.2670985"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.5555\/3122009.3242042"},{"key":"e_1_2_1_56_1","volume-title":"Proceedings of Machine Learning and Systems, I. Dhillon, D. Papailiopoulos, and V. Sze (Eds.)","volume":"2","author":"Li Liam","year":"2020","unstructured":"Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Jonathan Ben-tzur, Moritz Hardt, Benjamin Recht, and Ameet Talwalkar. 2020. A System for Massively Parallel Hyperparameter Tuning. In Proceedings of Machine Learning and Systems, I. Dhillon, D. Papailiopoulos, and V. Sze (Eds.), Vol. 2. 230--246. https:\/\/proceedings.mlsys.org\/paper\/2020\/file\/f4b9ec30ad9f68f89b29639786cb62ef-Paper.pdf"},{"key":"e_1_2_1_57_1","volume-title":"Exploiting Reuse in Pipeline-Aware Hyperparameter Tuning. In Systems for ML Workshop at NeurIPS.","author":"Li Liam","year":"2018","unstructured":"Liam Li, Evan Sparks, Kevin Jamieson, and Ameet Talwalkar. 2018. Exploiting Reuse in Pipeline-Aware Hyperparameter Tuning. 
In Systems for ML Workshop at NeurIPS."},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357223.3362719"},{"key":"e_1_2_1_59_1","volume-title":"Tune: A Research Platform for Distributed Model Selection and Training. In ICML AutoML Workshop.","author":"Liaw Richard","year":"2018","unstructured":"Richard Liaw, Eric Liang, Robert Nishihara, Philipp Moritz, Joseph E. Gonzalez, and Ion Stoica. 2018. Tune: A Research Platform for Distributed Model Selection and Training. In ICML AutoML Workshop."},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/3462462.3468880"},{"key":"e_1_2_1_61_1","volume-title":"Themis: Fair and Efficient GPU Cluster Scheduling. In 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20)","author":"Mahajan Kshiteej","year":"2020","unstructured":"Kshiteej Mahajan, Arjun Balasubramanian, Arjun Singhvi, Shivaram Venkataraman, Aditya Akella, Amar Phanishayee, and Shuchi Chawla. 2020. Themis: Fair and Efficient GPU Cluster Scheduling. In 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20). USENIX Association, Santa Clara, CA, 289--304. https:\/\/www.usenix.org\/conference\/nsdi20\/presentation\/mahajan"},{"key":"e_1_2_1_62_1","unstructured":"Sam McCandlish Jared Kaplan Dario Amodei and OpenAI Dota Team. 2018. An Empirical Model of Large-Batch Training. 
arXiv:1812.06162 [cs.LG]"},{"key":"e_1_2_1_63_1","first-page":"2600241","article-title":"Docker","volume":"2600239","author":"Merkel Dirk","year":"2014","unstructured":"Dirk Merkel. 2014. Docker: Lightweight Linux Containers for Consistent Development and Deployment. Linux J. 2014, 239, Article 2 (March 2014). http:\/\/dl.acm.org\/citation.cfm?id=2600239.2600241","journal-title":"Lightweight Linux Containers for Consistent Development and Deployment. Linux"},{"key":"e_1_2_1_64_1","unstructured":"Microsoft. 2017. Neural Network Intelligence (NNI). https:\/\/github.com\/Microsoft\/nni"},{"key":"e_1_2_1_65_1","volume-title":"Ray: A Distributed Framework for Emerging AI Applications. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)","author":"Moritz Philipp","year":"2018","unstructured":"Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I. Jordan, and Ion Stoica. 2018. Ray: A Distributed Framework for Emerging AI Applications. 
In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). USENIX Association, Carlsbad, CA, 561--577. https:\/\/www.usenix.org\/conference\/osdi18\/presentation\/moritz"},{"key":"e_1_2_1_66_1","unstructured":"Anna Novikova. 2019. fast.ai starter with ResNet 50. https:\/\/www.kaggle.com\/demonplus\/fast-ai-starter-with-resnet-50\/notebook"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920906"},{"key":"e_1_2_1_68_1","volume-title":"High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas K\u00f6pf, Edward Z. 
Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8--14, 2019, Vancouver, BC, Canada, Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d'Alch\u00e9-Buc, Emily B. Fox, and Roman Garnett (Eds.). 8024--8035. https:\/\/proceedings.neurips.cc\/paper\/2019\/hash\/bdbca288fee7f92f2bfa9f7012727740-Abstract.html"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447548.3467098"},{"key":"e_1_2_1_70_1","unstructured":"Miguel Pinto. 2019. pneumothorax fastai U-Net. https:\/\/www.kaggle.com\/mnpinto\/pneumothorax-fastai-u-net\/notebook"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1145\/3135974.3135994"},{"key":"e_1_2_1_72_1","unstructured":"Inc Red Hat. 2020. GlusterFS. https:\/\/www.gluster.org\/"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.91"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2945397"},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV.2017.58"},{"key":"e_1_2_1_76_1","unstructured":"Leslie N. Smith. 2018. A disciplined approach to neural network hyperparameters: Part 1 - learning rate batch size momentum and weight decay. arXiv:1803.09820 [cs.LG]"},{"key":"e_1_2_1_77_1","volume-title":"Smith and Nicholay Topin","author":"Leslie","year":"2018","unstructured":"Leslie N. 
Smith and Nicholay Topin. 2018. Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates. arXiv:1708.07120 [cs.LG]"},{"key":"e_1_2_1_78_1","volume-title":"Le","author":"Smith Samuel L.","year":"2018","unstructured":"Samuel L. Smith, Pieter-Jan Kindermans, Chris Ying, and Quoc V. Le. 2018. Don't Decay the Learning Rate, Increase the Batch Size. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net. https:\/\/openreview.net\/forum?id=B1Yy1BxCZ"},{"key":"e_1_2_1_79_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2017.109"},{"key":"e_1_2_1_80_1","unstructured":"Danny Stoll J\u00f6rg K. H. Franke Diane Wagner Simon Selg and Frank Hutter. 2020. Hyperparameter Transfer Across Developer Adjustments. arXiv:2010.13117 [cs.LG]"},{"key":"e_1_2_1_81_1","volume-title":"NIPS Workshop on Machine Learning Systems (LearningSys).","author":"Sung Nako","year":"2017","unstructured":"
Nako Sung, Minkyu Kim, Hyunwoo Jo, Youngil Yang, Jinwoong Kim, Leonard Lausen, Youngkwan Kim, Gayoung Lee, Donghyun Kwak, Jung-Woo Ha, and Sunghun Kim. 2017. NSML: A Machine Learning Platform That Enables You to Focus on Your Models. In NIPS Workshop on Machine Learning Systems (LearningSys)."},{"key":"e_1_2_1_82_1","volume-title":"Proceedings of the 38th International Conference on Machine Learning (Proceedings of Machine Learning Research), Marina Meila and Tong Zhang (Eds.)","volume":"139","author":"Tan Mingxing","year":"2021","unstructured":"Mingxing Tan and Quoc Le. 2021. EfficientNetV2: Smaller Models and Faster Training. In Proceedings of the 38th International Conference on Machine Learning (Proceedings of Machine Learning Research), Marina Meila and Tong Zhang (Eds.), Vol. 139. PMLR, 10096--10106. https:\/\/proceedings.mlr.press\/v139\/tan21a.html"},{"key":"e_1_2_1_83_1","unstructured":"Eclipse Deeplearning4j Development Team. 2016. ND4J: Fast Scientific and Numerical Computing for the JVM. https:\/\/github.com\/eclipse\/deeplearning4j"},{"key":"e_1_2_1_84_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330756"},{"key":"e_1_2_1_85_1","doi-asserted-by":"publisher","DOI":"10.1145\/3290605.3300911"},{"key":"e_1_2_1_86_1","volume-title":"Network Morphism. In Proceedings of The 33rd International Conference on Machine Learning (Proceedings of Machine Learning Research), Maria Florina Balcan and Kilian Q. 
Weinberger (Eds.)","volume":"48","author":"Wei Tao","year":"2016","unstructured":"Tao Wei, Changhu Wang, Yong Rui, and Chang Wen Chen. 2016. Network Morphism. In Proceedings of The 33rd International Conference on Machine Learning (Proceedings of Machine Learning Research), Maria Florina Balcan and Kilian Q. Weinberger (Eds.), Vol. 48. PMLR, New York, New York, USA, 564--572. https:\/\/proceedings.mlr.press\/v48\/wei16.html"},{"key":"e_1_2_1_87_1","unstructured":"Yonghui Wu Mike Schuster Zhifeng Chen Quoc V Le Mohammad Norouzi Wolfgang Macherey Maxim Krikun Yuan Cao Qin Gao Klaus Macherey et al. 2016. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144 (2016)."},{"key":"e_1_2_1_88_1","doi-asserted-by":"publisher","DOI":"10.14778\/3297753.3297763"},{"key":"e_1_2_1_89_1","volume-title":"Adadelta: an adaptive learning rate method. arXiv preprint arXiv:1212.5701","author":"Zeiler Matthew D","year":"2012","unstructured":"Matthew D Zeiler. 2012. Adadelta: an adaptive learning rate method. arXiv preprint arXiv:1212.5701 (2012)."},{"key":"e_1_2_1_90_1","volume-title":"Proceedings of Machine Learning and Systems, A. Talwalkar, V. Smith, and M. 
Zaharia (Eds.)","volume":"1","author":"Zhang Jian","year":"2019","unstructured":"Jian Zhang and Ioannis Mitliagkas. 2019. YellowFin and the Art of Momentum Tuning. In Proceedings of Machine Learning and Systems, A. Talwalkar, V. Smith, and M. Zaharia (Eds.), Vol. 1. 289--308. https:\/\/proceedings.mlsys.org\/paper\/2019\/file\/b3e3e393c77e35a4a3f3cbd1e429b5dc-Paper.pdf"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3510397.3510402","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T09:22:56Z","timestamp":1672219376000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3510397.3510402"}},"subtitle":["sharing computations in hyper-parameter optimization"],"short-title":[],"issued":{"date-parts":[[2022,1]]},"references-count":90,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2022,1]]}},"alternative-id":["10.14778\/3510397.3510402"],"URL":"https:\/\/doi.org\/10.14778\/3510397.3510402","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2022,1]]}}}