{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T03:15:09Z","timestamp":1767928509231,"version":"3.49.0"},"publisher-location":"New York, NY, USA","reference-count":75,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,11,17]],"date-time":"2019-11-17T00:00:00Z","timestamp":1573948800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Department of Energy","award":["DE-AC02-05CH11231"],"award-info":[{"award-number":["DE-AC02-05CH11231"]}]},{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/N019474\/1"],"award-info":[{"award-number":["EP\/N019474\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000185","name":"Defense Advanced Research Projects Agency","doi-asserted-by":"publisher","award":["D3M Cooperative Agreement FA8750-17-2-0093"],"award-info":[{"award-number":["D3M Cooperative Agreement FA8750-17-2-0093"]}],"id":[{"id":"10.13039\/100000185","id-type":"DOI","asserted-by":"publisher"}]},{"name":"NERSC Big Data Center"},{"DOI":"10.13039\/501100002790","name":"Natural Sciences and Engineering Research Council of Canada","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100002790","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["ACI-1450310"],"award-info":[{"award-number":["ACI-1450310"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,11,17]]},"DOI":"10.1145\/3295500.3356180","type":"proceedings-article","created":{"date-parts":[[2019,11,7]],"date-time":"2019-11-07T19:43:22Z","timestamp":1573155802000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":22,"title":["Etalumis"],"prefix":"10.1145","author":[{"given":"Atilim G\u00fcne\u015f","family":"Baydin","sequence":"first","affiliation":[{"name":"University of Oxford"}]},{"given":"Lei","family":"Shao","sequence":"additional","affiliation":[{"name":"Intel Corporation"}]},{"given":"Wahid","family":"Bhimji","sequence":"additional","affiliation":[{"name":"Lawrence Berkeley National Laboratory"}]},{"given":"Lukas","family":"Heinrich","sequence":"additional","affiliation":[{"name":"CERN"}]},{"given":"Lawrence","family":"Meadows","sequence":"additional","affiliation":[{"name":"Intel Corporation"}]},{"given":"Jialin","family":"Liu","sequence":"additional","affiliation":[{"name":"Lawrence Berkeley National Laboratory"}]},{"given":"Andreas","family":"Munk","sequence":"additional","affiliation":[{"name":"University of British Columbia"}]},{"given":"Saeid","family":"Naderiparizi","sequence":"additional","affiliation":[{"name":"University of British Columbia"}]},{"given":"Bradley","family":"Gram-Hansen","sequence":"additional","affiliation":[{"name":"University of Oxford"}]},{"given":"Gilles","family":"Louppe","sequence":"additional","affiliation":[{"name":"University of Li\u00e8ge"}]},{"given":"Mingfei","family":"Ma","sequence":"additional","affiliation":[{"name":"Intel Corporation"}]},{"given":"Xiaohui","family":"Zhao","sequence":"additional","affiliation":[{"name":"Intel Corporation"}]},{"given":"Philip","family":"Torr","sequence":"additional","affiliation":[{"name":"University of Oxford"}]},{"given":"Victor","family":"Lee","sequence":"additional","affiliation":[{"name":"Intel Corporation"}]},{"given":"Kyle","family":"Cranmer","sequence":"additional","affiliation":[{"name":"New York University"}]},{"family":"Prabhat","sequence":"additional","affiliation":[{"name":"Lawrence Berkeley National Laboratory"}]},{"given":"Frank","family":"Wood","sequence":"additional","affiliation":[{"name":"University of British Columbia"}]}],"member":"320","published-online":{"date-parts":[[2019,11,17]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1140\/epjc\/s10052-017-4852-3"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1088\/1748-0221\/3\/08\/S08003"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1140\/epjc\/s10052-015-3543-1"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1140\/epjc\/s10052-016-4110-0"},{"key":"e_1_3_2_1_5_1","volume-title":"12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16)","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi , Paul Barham , Jianmin Chen , Zhifeng Chen , Andy Davis , Jeffrey Dean , Matthieu Devin , Sanjay Ghemawat , Geoffrey Irving , Michael Isard , 2016 . Tensorflow: A system for large-scale machine learning . In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) . 265--283. Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. Tensorflow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). 265--283."},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0168-9002(03)01368-8"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1088\/1475-7516\/2015\/08\/043"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/78.978374"},{"key":"e_1_3_2_1_9_1","first-page":"1","article-title":"Automatic differentiation in machine learning: a survey","volume":"18","author":"Baydin Atilim G\u00fcne\u015f","year":"2018","unstructured":"Atilim G\u00fcne\u015f Baydin , Barak A. Pearlmutter , Alexey Andreyevich Radul , and Jeffrey Mark Siskind . 2018 . Automatic differentiation in machine learning: a survey . Journal of Machine Learning Research (JMLR) 18 , 153 (2018), 1 -- 43 . http:\/\/jmlr.org\/papers\/v18\/17-468.html Atilim G\u00fcne\u015f Baydin, Barak A. Pearlmutter, Alexey Andreyevich Radul, and Jeffrey Mark Siskind. 2018. Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research (JMLR) 18, 153 (2018), 1--43. http:\/\/jmlr.org\/papers\/v18\/17-468.html","journal-title":"Journal of Machine Learning Research (JMLR)"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553380"},{"key":"e_1_3_2_1_11_1","volume-title":"Pyro: Deep universal probabilistic programming. Journal of Machine Learning Research","author":"Bingham Eli","year":"2018","unstructured":"Eli Bingham , Jonathan P Chen , Martin Jankowiak , Fritz Obermeyer , Neeraj Pradhan , Theofanis Karaletsos , Rohit Singh , Paul Szerlip , Paul Horsfall , and Noah D Goodman . 2018 . Pyro: Deep universal probabilistic programming. Journal of Machine Learning Research (2018). Eli Bingham, Jonathan P Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh, Paul Szerlip, Paul Horsfall, and Noah D Goodman. 2018. Pyro: Deep universal probabilistic programming. Journal of Machine Learning Research (2018)."},{"key":"e_1_3_2_1_13_1","volume-title":"Pattern Recognition and Machine Learning","author":"Bishop Christopher M","unstructured":"Christopher M Bishop . 2006. Pattern Recognition and Machine Learning . Springer . Christopher M Bishop. 2006. Pattern Recognition and Machine Learning. Springer."},{"key":"e_1_3_2_1_14_1","volume-title":"Mining gold from implicit models to improve likelihood-free inference. arXiv preprint arXiv:1805.12244","author":"Brehmer Johann","year":"2018","unstructured":"Johann Brehmer , Gilles Louppe , Juan Pavez , and Kyle Cranmer . 2018. Mining gold from implicit models to improve likelihood-free inference. arXiv preprint arXiv:1805.12244 ( 2018 ). Johann Brehmer, Gilles Louppe, Juan Pavez, and Kyle Cranmer. 2018. Mining gold from implicit models to improve likelihood-free inference. arXiv preprint arXiv:1805.12244 (2018)."},{"key":"e_1_3_2_1_15_1","volume-title":"Revisiting distributed synchronous SGD. arXiv preprint arXiv:1604.00981","author":"Chen Jianmin","year":"2016","unstructured":"Jianmin Chen , Xinghao Pan , Rajat Monga , Samy Bengio , and Rafal Jozefowicz . 2016. Revisiting distributed synchronous SGD. arXiv preprint arXiv:1604.00981 ( 2016 ). Jianmin Chen, Xinghao Pan, Rajat Monga, Samy Bengio, and Rafal Jozefowicz. 2016. Revisiting distributed synchronous SGD. arXiv preprint arXiv:1604.00981 (2016)."},{"key":"e_1_3_2_1_16_1","unstructured":"D. Das S. Avancha D. Mudigere K. Vaidynathan S. Sridharan D. Kalamkar B. Kaul and P. Dubey. 2016. Distributed Deep Learning Using Synchronous Stochastic Gradient Descent. ArXiv e-prints (Feb. 2016). arXiv:cs.DC\/1602.06709 D. Das S. Avancha D. Mudigere K. Vaidynathan S. Sridharan D. Kalamkar B. Kaul and P. Dubey. 2016. Distributed Deep Learning Using Synchronous Stochastic Gradient Descent. ArXiv e-prints (Feb. 2016). arXiv:cs.DC\/1602.06709"},{"key":"e_1_3_2_1_17_1","volume-title":"Proceedings of the 25th International Conference on Neural Information Processing Systems -","volume":"1","author":"Dean Jeffrey","unstructured":"Jeffrey Dean , Greg S. Corrado , Rajat Monga , Kai Chen , Matthieu Devin , Quoc V. Le , Mark Z. Mao , Marc'Aurelio Ranzato , Andrew Senior , Paul Tucker , Ke Yang , and Andrew Y. Ng . 2012. Large Scale Distributed Deep Networks . In Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1 (NIPS'12). Curran Associates Inc., USA, 1223--1231. http:\/\/dl.acm.org\/citation.cfm?id=2999134.2999271 Jeffrey Dean, Greg S. Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc'Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, and Andrew Y. Ng. 2012. Large Scale Distributed Deep Networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1 (NIPS'12). Curran Associates Inc., USA, 1223--1231. http:\/\/dl.acm.org\/citation.cfm?id=2999134.2999271"},{"key":"e_1_3_2_1_18_1","unstructured":"Adji Bousso Dieng Dustin Tran Rajesh Ranganath John Paisley and David Blei. 2017. Variational Inference via \\ chi Upper Bound Minimization. In Advances in Neural Information Processing Systems. 2732--2741. Adji Bousso Dieng Dustin Tran Rajesh Ranganath John Paisley and David Blei. 2017. Variational Inference via \\ chi Upper Bound Minimization. In Advances in Neural Information Processing Systems. 2732--2741."},{"key":"e_1_3_2_1_19_1","volume-title":"TensorFlow distributions. arXiv preprint arXiv:1711.10604","author":"Dillon Joshua V","year":"2017","unstructured":"Joshua V Dillon , Ian Langmore , Dustin Tran , Eugene Brevdo , Srinivas Vasudevan , Dave Moore , Brian Patton , Alex Alemi , Matt Hoffman , and Rif A Saurous . 2017. TensorFlow distributions. arXiv preprint arXiv:1711.10604 ( 2017 ). Joshua V Dillon, Ian Langmore, Dustin Tran, Eugene Brevdo, Srinivas Vasudevan, Dave Moore, Brian Patton, Alex Alemi, Matt Hoffman, and Rif A Saurous. 2017. TensorFlow distributions. arXiv preprint arXiv:1711.10604 (2017)."},{"key":"e_1_3_2_1_20_1","volume-title":"A comprehensive study of batch construction strategies for recurrent neural networks in mxnet. arXiv preprint arXiv:1705.02414","author":"Doetsch Patrick","year":"2017","unstructured":"Patrick Doetsch , Pavel Golik , and Hermann Ney . 2017. A comprehensive study of batch construction strategies for recurrent neural networks in mxnet. arXiv preprint arXiv:1705.02414 ( 2017 ). Patrick Doetsch, Pavel Golik, and Hermann Ney. 2017. A comprehensive study of batch construction strategies for recurrent neural networks in mxnet. arXiv preprint arXiv:1705.02414 (2017)."},{"key":"e_1_3_2_1_21_1","volume-title":"Proceedings of the 1st Annual Conference on Robot Learning. 1--16","author":"Dosovitskiy Alexey","year":"2017","unstructured":"Alexey Dosovitskiy , German Ros , Felipe Codevilla , Antonio Lopez , and Vladlen Koltun . 2017 . CARLA: An Open Urban Driving Simulator . In Proceedings of the 1st Annual Conference on Robot Learning. 1--16 . Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio Lopez, and Vladlen Koltun. 2017. CARLA: An Open Urban Driving Simulator. In Proceedings of the 1st Annual Conference on Robot Learning. 1--16."},{"key":"e_1_3_2_1_22_1","unstructured":"Arnaud Doucet and Adam M Johansen. 2009. A tutorial on particle filtering and smoothing: Fifteen years later. (2009). Arnaud Doucet and Adam M Johansen. 2009. A tutorial on particle filtering and smoothing: Fifteen years later. (2009)."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1088\/1748-0221\/13\/07\/P07027"},{"key":"e_1_3_2_1_24_1","volume-title":"Bayesian data analysis","author":"Gelman Andrew","unstructured":"Andrew Gelman , Hal S Stern , John B Carlin , David B Dunson , Aki Vehtari , and Donald B Rubin . 2013. Bayesian data analysis . Chapman and Hall\/CRC. Andrew Gelman, Hal S Stern, John B Carlin, David B Dunson, Aki Vehtari, and Donald B Rubin. 2013. Bayesian data analysis. Chapman and Hall\/CRC."},{"key":"e_1_3_2_1_25_1","volume-title":"Proceedings of the Annual Meeting of the Cognitive Science Society","volume":"36","author":"Gershman Samuel","year":"2014","unstructured":"Samuel Gershman and Noah Goodman . 2014 . Amortized inference in probabilistic reasoning . In Proceedings of the Annual Meeting of the Cognitive Science Society , Vol. 36 . Samuel Gershman and Noah Goodman. 2014. Amortized inference in probabilistic reasoning. In Proceedings of the Annual Meeting of the Cognitive Science Society, Vol. 36."},{"key":"e_1_3_2_1_26_1","volume-title":"Probabilistic machine learning and artificial intelligence. Nature 521, 7553","author":"Ghahramani Zoubin","year":"2015","unstructured":"Zoubin Ghahramani . 2015. Probabilistic machine learning and artificial intelligence. Nature 521, 7553 ( 2015 ), 452. Zoubin Ghahramani. 2015. Probabilistic machine learning and artificial intelligence. Nature 521, 7553 (2015), 452."},{"key":"e_1_3_2_1_27_1","unstructured":"B. Ginsburg I. Gitman and O. Kuchaiev. 2018. Layer-Wise Adaptive Rate Control for Training of Deep Networks. in preparation (2018). B. Ginsburg I. Gitman and O. Kuchaiev. 2018. Layer-Wise Adaptive Rate Control for Training of Deep Networks. in preparation (2018)."},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/0029-5582(61)90469-2"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1088\/1126-6708\/2009\/02\/007"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1038\/ng.3710"},{"key":"e_1_3_2_1_31_1","volume-title":"Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He.","author":"Goyal Priya","year":"2017","unstructured":"Priya Goyal , Piotr Dollar , Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. 2017 . Accurate, Large Minibatch SGD : Training ImageNet in 1 Hour . arXiv preprint arXiv:1706.02677v1 (2017). Priya Goyal, Piotr Dollar, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. 2017. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. arXiv preprint arXiv:1706.02677v1 (2017)."},{"key":"e_1_3_2_1_32_1","volume-title":"Introduction to elementary particles","author":"Griffiths David","unstructured":"David Griffiths . 2008. Introduction to elementary particles . John Wiley & Sons . David Griffiths. 2008. Introduction to elementary particles. John Wiley & Sons."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Ralf Herbrich Tom Minka and Thore Graepel. 2007. TrueSkill\u2122: a Bayesian skill rating system. In Advances in Neural Information Processing Systems. 569--576. Ralf Herbrich Tom Minka and Thore Graepel. 2007. TrueSkill\u2122: a Bayesian skill rating system. In Advances in Neural Information Processing Systems. 569--576.","DOI":"10.7551\/mitpress\/7503.003.0076"},{"key":"e_1_3_2_1_34_1","volume-title":"ZeroMQ: messaging for many applications","author":"Hintjens Pieter","unstructured":"Pieter Hintjens . 2013. ZeroMQ: messaging for many applications . O'Reilly Media, Inc. Pieter Hintjens. 2013. ZeroMQ: messaging for many applications. O'Reilly Media, Inc."},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.5555\/2567709.2502622"},{"key":"e_1_3_2_1_37_1","first-page":"1593","article-title":"The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo","volume":"15","author":"Hoffman Matthew D","year":"2014","unstructured":"Matthew D Hoffman and Andrew Gelman . 2014 . The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo . Journal of Machine Learning Research 15 , 1 (2014), 1593 -- 1623 . Matthew D Hoffman and Andrew Gelman. 2014. The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research 15, 1 (2014), 1593--1623.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"crossref","unstructured":"F. N. Iandola K. Ashraf M. W. Moskewicz and K. Keutzer. 2015. FireCaffe: near-linear acceleration of deep neural network training on compute clusters. ArXiv e-prints (Oct. 2015). arXiv:cs.CV\/1511.00175 F. N. Iandola K. Ashraf M. W. Moskewicz and K. Keutzer. 2015. FireCaffe: near-linear acceleration of deep neural network training on compute clusters. ArXiv e-prints (Oct. 2015). arXiv:cs.CV\/1511.00175","DOI":"10.1109\/CVPR.2016.284"},{"key":"e_1_3_2_1_39_1","volume-title":"On large-batch training for deep learning: Generalization gap and sharp minima. arXiv preprint arXiv:1609.04836","author":"Keskar Nitish Shirish","year":"2016","unstructured":"Nitish Shirish Keskar , Dheevatsa Mudigere , Jorge Nocedal , Mikhail Smelyanskiy , and Ping Tak Peter Tang . 2016. On large-batch training for deep learning: Generalization gap and sharp minima. arXiv preprint arXiv:1609.04836 ( 2016 ). Nitish Shirish Keskar, Dheevatsa Mudigere, Jorge Nocedal, Mikhail Smelyanskiy, and Ping Tak Peter Tang. 2016. On large-batch training for deep learning: Generalization gap and sharp minima. arXiv preprint arXiv:1609.04836 (2016)."},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSMP.2016.7583516"},{"key":"e_1_3_2_1_41_1","volume-title":"Adam: A Method for Stochastic Optimization. ArXiv e-prints (Dec.","author":"Kingma D. P.","year":"2014","unstructured":"D. P. Kingma and J. Ba . 2014 . Adam: A Method for Stochastic Optimization. ArXiv e-prints (Dec. 2014). arXiv:cs.LG\/1412.6980 D. P. Kingma and J. Ba. 2014. Adam: A Method for Stochastic Optimization. ArXiv e-prints (Dec. 2014). arXiv:cs.LG\/1412.6980"},{"key":"e_1_3_2_1_42_1","volume-title":"Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114","author":"Kingma Diederik P","year":"2013","unstructured":"Diederik P Kingma and Max Welling . 2013. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114 ( 2013 ). Diederik P Kingma and Max Welling. 2013. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114 (2013)."},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1143\/JPSJ.57.4126"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2018.00054"},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3126908.3126916"},{"key":"e_1_3_2_1_46_1","volume-title":"Inference for higher order probabilistic programs. Masters thesis","author":"Tuan Anh Le.","unstructured":"Tuan Anh Le. 2015. Inference for higher order probabilistic programs. Masters thesis , University of Oxford (2015) . Tuan Anh Le. 2015. Inference for higher order probabilistic programs. Masters thesis, University of Oxford (2015)."},{"key":"e_1_3_2_1_47_1","volume-title":"Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS) (Proceedings of Machine Learning Research)","volume":"54","author":"Le Tuan Anh","year":"2017","unstructured":"Tuan Anh Le , Atilim G\u00fcne\u015f Baydin , and Frank Wood . 2017 . Inference Compilation and Universal Probabilistic Programming . In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS) (Proceedings of Machine Learning Research) , Vol. 54 . PMLR, Fort Lauderdale, FL, USA, 1338--1348. Tuan Anh Le, Atilim G\u00fcne\u015f Baydin, and Frank Wood. 2017. Inference Compilation and Universal Probabilistic Programming. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS) (Proceedings of Machine Learning Research), Vol. 54. PMLR, Fort Lauderdale, FL, USA, 1338--1348."},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_3_2_1_49_1","volume-title":"Neural Information Processing Systems (NIPS) 2017 workshop on Deep Learning for Physical Sciences (DLPS)","author":"Casado Mario Lezcano","year":"2017","unstructured":"Mario Lezcano Casado , Atilim G\u00fcne\u015f Baydin , David Martinez Rubio , Tuan Anh Le , Frank Wood , Lukas Heinrich , Gilles Louppe , Kyle Cranmer , Wahid Bhimji , Karen Ng , and Prabhat. 2017 . Improvements to Inference Compilation for Probabilistic Programming in Large-Scale Scientific Simulators . In Neural Information Processing Systems (NIPS) 2017 workshop on Deep Learning for Physical Sciences (DLPS) , Long Beach, CA, US , December 8, 2017. Mario Lezcano Casado, Atilim G\u00fcne\u015f Baydin, David Martinez Rubio, Tuan Anh Le, Frank Wood, Lukas Heinrich, Gilles Louppe, Kyle Cranmer, Wahid Bhimji, Karen Ng, and Prabhat. 2017. Improvements to Inference Compilation for Probabilistic Programming in Large-Scale Scientific Simulators. In Neural Information Processing Systems (NIPS) 2017 workshop on Deep Learning for Physical Sciences (DLPS), Long Beach, CA, US, December 8, 2017."},{"key":"e_1_3_2_1_50_1","unstructured":"Linux man-pages project. 2019. Linux Programmer's Manual. http:\/\/man7.org\/linux\/man-pages\/index.html Linux man-pages project. 2019. Linux Programmer's Manual. http:\/\/man7.org\/linux\/man-pages\/index.html"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2018.00068"},{"key":"e_1_3_2_1_52_1","volume-title":"Neural Information Processing Systems (NIPS) 2017 workshop on Deep Learning At Supercomputer Scale","author":"Mathuriya Amrita","year":"2017","unstructured":"Amrita Mathuriya , Thorsten Kurth , Vivek Rane , Mustafa Mustafa , Lei Shao , Debbie Bard , Victor W Lee , 2017 . Scaling GRPC Tensorflow on 512 nodes of Cori Supercomputer . In Neural Information Processing Systems (NIPS) 2017 workshop on Deep Learning At Supercomputer Scale , Long Beach, CA, US , December 8, 2017. Amrita Mathuriya, Thorsten Kurth, Vivek Rane, Mustafa Mustafa, Lei Shao, Debbie Bard, Victor W Lee, et al. 2017. Scaling GRPC Tensorflow on 512 nodes of Cori Supercomputer. In Neural Information Processing Systems (NIPS) 2017 workshop on Deep Learning At Supercomputer Scale, Long Beach, CA, US, December 8, 2017."},{"key":"e_1_3_2_1_53_1","volume-title":"An Empirical Model of Large-Batch Training. arXiv preprint arXiv:1812.06162v1","author":"McCandlish Sam","year":"2018","unstructured":"Sam McCandlish , Jared Kaplan , and et.al Amodei, Dario. 2018. An Empirical Model of Large-Batch Training. arXiv preprint arXiv:1812.06162v1 ( 2018 ). Sam McCandlish, Jared Kaplan, and et.al Amodei, Dario. 2018. An Empirical Model of Large-Batch Training. arXiv preprint arXiv:1812.06162v1 (2018)."},{"key":"e_1_3_2_1_54_1","volume-title":"CoRR abs\/1811.05233","author":"Mikami Hiroaki","year":"2018","unstructured":"Hiroaki Mikami , Hisahiro Suganuma , Pongsakorn U.- Chupala , Yoshiki Tanaka , and Yuichi Kageyama . 2018. ImageNet\/ResNet-50 Training in 224 Seconds . CoRR abs\/1811.05233 ( 2018 ). arXiv:1811.05233 http:\/\/arxiv.org\/abs\/1811.05233 Hiroaki Mikami, Hisahiro Suganuma, Pongsakorn U.-Chupala, Yoshiki Tanaka, and Yuichi Kageyama. 2018. ImageNet\/ResNet-50 Training in 224 Seconds. CoRR abs\/1811.05233 (2018). arXiv:1811.05233 http:\/\/arxiv.org\/abs\/1811.05233"},{"key":"e_1_3_2_1_55_1","unstructured":"T. Minka J.M. Winn J.P. Guiver Y. Zaykov D. Fabian and J. Bronskill. 2018. \/Infer.NET 0.3. Microsoft Research Cambridge. http:\/\/dotnet.github.io\/infer. T. Minka J.M. Winn J.P. Guiver Y. Zaykov D. Fabian and J. Bronskill. 2018. \/Infer.NET 0.3. Microsoft Research Cambridge. http:\/\/dotnet.github.io\/infer."},{"key":"e_1_3_2_1_57_1","volume-title":"Handbook of Markov Chain Monte Carlo, Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng (Eds.).","author":"Neal Radford M.","unstructured":"Radford M. Neal . 2011. MCMC using Hamiltonian dynamics . In Handbook of Markov Chain Monte Carlo, Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng (Eds.). Vol. 2 . 2. Radford M. Neal. 2011. MCMC using Hamiltonian dynamics. In Handbook of Markov Chain Monte Carlo, Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng (Eds.). Vol. 2. 2."},{"key":"e_1_3_2_1_58_1","volume-title":"Proceedings of the 24th International Conference on Neural Information Processing Systems (NIPS'11)","author":"Niu Feng","unstructured":"Feng Niu , Benjamin Recht , Christopher Re , and Stephen J. Wright . 2011. HOG-WILD!: A Lock-free Approach to Parallelizing Stochastic Gradient Descent . In Proceedings of the 24th International Conference on Neural Information Processing Systems (NIPS'11) . Curran Associates Inc., USA, 693--701. http:\/\/dl.acm.org\/citation.cfm?id=2986459.2986537 Feng Niu, Benjamin Recht, Christopher Re, and Stephen J. Wright. 2011. HOG-WILD!: A Lock-free Approach to Parallelizing Stochastic Gradient Descent. In Proceedings of the 24th International Conference on Neural Information Processing Systems (NIPS'11). Curran Associates Inc., USA, 693--701. http:\/\/dl.acm.org\/citation.cfm?id=2986459.2986537"},{"key":"e_1_3_2_1_59_1","unstructured":"X. Pan J. Chen R. Monga S. Bengio and R. Jozefowicz. 2017. Revisiting Distributed Synchronous SGD. ArXiv e-prints (Feb. 2017). arXiv:cs.DC\/1702.05800 X. Pan J. Chen R. Monga S. Bengio and R. Jozefowicz. 2017. Revisiting Distributed Synchronous SGD. ArXiv e-prints (Feb. 2017). arXiv:cs.DC\/1702.05800"},{"key":"e_1_3_2_1_60_1","volume-title":"Advances in Neural Information Processing Systems 29","author":"Papamakarios George","unstructured":"George Papamakarios and Iain Murray . 2016. Fast e-free Inference of Simulation Models with Bayesian Conditional Density Estimation . In Advances in Neural Information Processing Systems 29 , D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett (Eds.). Curran Associates, Inc. , 1028--1036. George Papamakarios and Iain Murray. 2016. Fast e-free Inference of Simulation Models with Bayesian Conditional Density Estimation. In Advances in Neural Information Processing Systems 29, D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett (Eds.). Curran Associates, Inc., 1028--1036."},{"key":"e_1_3_2_1_61_1","volume-title":"NIPS 2017 Autodiff Workshop: The Future of Gradient-based Machine Learning Software and Techniques","author":"Paszke Adam","year":"2017","unstructured":"Adam Paszke , Sam Gross , Soumith Chintala , Gregory Chanan , Edward Yang , Zachary DeVito , Zeming Lin , Alban Desmaison , Luca Antiga , and Adam Lerer . 2017 . Automatic differentiation in PyTorch . In NIPS 2017 Autodiff Workshop: The Future of Gradient-based Machine Learning Software and Techniques , Long Beach, CA, US , December 9, 2017. Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. Automatic differentiation in PyTorch. In NIPS 2017 Autodiff Workshop: The Future of Gradient-based Machine Learning Software and Techniques, Long Beach, CA, US, December 9, 2017."},{"key":"e_1_3_2_1_62_1","volume-title":"An introduction to quantum field theory","author":"Peskin Michael E","unstructured":"Michael E Peskin . 2018. An introduction to quantum field theory . CRC Press . Michael E Peskin. 2018. An introduction to quantum field theory. CRC Press."},{"key":"e_1_3_2_1_63_1","volume-title":"Proceedings of the Eighth Nobel Symposium on Elementary Particle Theory, Relativistic Groups, and Analyticity","author":"Salam A","year":"1968","unstructured":"A Salam . 1968 . Proceedings of the Eighth Nobel Symposium on Elementary Particle Theory, Relativistic Groups, and Analyticity , Stockholm, Sweden , 1968. (1968). A Salam. 1968. Proceedings of the Eighth Nobel Symposium on Elementary Particle Theory, Relativistic Groups, and Analyticity, Stockholm, Sweden, 1968. (1968)."},{"key":"e_1_3_2_1_64_1","volume-title":"Dahl","author":"Shallue Christopher J.","year":"2018","unstructured":"Christopher J. Shallue , Jaehoom Lee , Joseph Antognini , Jascha Sohl-Dickstein , Roy Frostig , and George E . Dahl . 2018 . Measuring the Effects of Data Parallelism on Neural Network training. arXiv preprint arXiv:1811.03600v2 (2018). Christopher J. Shallue, Jaehoom Lee, Joseph Antognini, Jascha Sohl-Dickstein, Roy Frostig, and George E. Dahl. 2018. Measuring the Effects of Data Parallelism on Neural Network training. arXiv preprint arXiv:1811.03600v2 (2018)."},{"key":"e_1_3_2_1_65_1","volume-title":"Le","author":"Smith Samuel L","year":"2017","unstructured":"Samuel L Smith , Pieter-Jan Kindermans , Chris Ying , and Quoc V . Le . 2017 . Don't Decay the Learning Rate, Increase the Batch Size . arXiv preprint arXiv:1711.00489 (2017). Samuel L Smith, Pieter-Jan Kindermans, Chris Ying, and Quoc V. Le. 2017. Don't Decay the Learning Rate, Increase the Batch Size. arXiv preprint arXiv:1711.00489 (2017)."},{"key":"e_1_3_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1017\/S0031182008000371"},{"key":"e_1_3_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/2933575.2935313"},{"key":"e_1_3_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1002\/sdr.1474"},{"key":"e_1_3_2_1_69_1","unstructured":"Michael Teng and Frank Wood. 2018. Bayesian Distributed Stochastic Gradient Descent. In Advances in Neural Information Processing Systems. 6380--6390. Michael Teng and Frank Wood. 2018. Bayesian Distributed Stochastic Gradient Descent. In Advances in Neural Information Processing Systems. 6380--6390."},{"key":"e_1_3_2_1_70_1","volume-title":"Edward: A library for probabilistic modeling, inference, and criticism. arXiv preprint arXiv:1610.09787","author":"Tran Dustin","year":"2016","unstructured":"Dustin Tran , Alp Kucukelbir , Adji B Dieng , Maja Rudolph , Dawen Liang , and David M Blei . 2016 . Edward: A library for probabilistic modeling, inference, and criticism. arXiv preprint arXiv:1610.09787 (2016). Dustin Tran, Alp Kucukelbir, Adji B Dieng, Maja Rudolph, Dawen Liang, and David M Blei. 2016. Edward: A library for probabilistic modeling, inference, and criticism. arXiv preprint arXiv:1610.09787 (2016)."},{"key":"e_1_3_2_1_71_1","volume-title":"Article arXiv:1809.10756 (Sep","author":"van de Meent Jan-Willem","year":"2018","unstructured":"Jan-Willem van de Meent , Brooks Paige , Hongseok Yang , and Frank Wood . 2018. An Introduction to Probabilistic Programming. arXiv e-prints , Article arXiv:1809.10756 (Sep 2018 ). arXiv:stat.ML\/1809.10756 Jan-Willem van de Meent, Brooks Paige, Hongseok Yang, and Frank Wood. 2018. An Introduction to Probabilistic Programming. arXiv e-prints, Article arXiv:1809.10756 (Sep 2018). arXiv:stat.ML\/1809.10756"},{"key":"e_1_3_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1016\/0550-3213(72)90279-9"},{"key":"e_1_3_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevLett.19.1264"},{"key":"e_1_3_2_1_74_1","volume-title":"Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. 770--778","author":"Wingate David","year":"2011","unstructured":"David Wingate , Andreas Stuhlm\u00fcller , and Noah Goodman . 2011 . Lightweight implementations of probabilistic programming languages via transformational compilation . In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. 770--778 . David Wingate, Andreas Stuhlm\u00fcller, and Noah Goodman. 2011. Lightweight implementations of probabilistic programming languages via transformational compilation. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. 770--778."},{"key":"e_1_3_2_1_75_1","unstructured":"Yonghui Wu Mike Schuster Zhifeng Chen Quoc V Le Mohammad Norouzi Wolfgang Macherey Maxim Krikun Yuan Cao Qin Gao Klaus Macherey etal 2016. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144 (2016). Yonghui Wu Mike Schuster Zhifeng Chen Quoc V Le Mohammad Norouzi Wolfgang Macherey Maxim Krikun Yuan Cao Qin Gao Klaus Macherey et al. 2016. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144 (2016)."},{"key":"e_1_3_2_1_76_1","volume-title":"Large batch training of convolutional networks. arXiv preprint arXiv:1708.03888","author":"You Yang","year":"2017","unstructured":"Yang You , Igor Gitman , and Boris Ginsburg . 2017. Large batch training of convolutional networks. arXiv preprint arXiv:1708.03888 ( 2017 ). Yang You, Igor Gitman, and Boris Ginsburg. 2017. Large batch training of convolutional networks. arXiv preprint arXiv:1708.03888 (2017)."},{"key":"e_1_3_2_1_77_1","volume-title":"Asynchronous Stochastic Gradient Descent with Delay Compensation. ArXiv e-prints (Sept","author":"Zheng S.","year":"2016","unstructured":"S. Zheng , Q. Meng , T. Wang , W. Chen , N. Yu , Z.-M. Ma , and T.-Y. Liu . 2016. Asynchronous Stochastic Gradient Descent with Delay Compensation. ArXiv e-prints (Sept . 2016 ). arXiv:cs.LG\/1609.08326 S. Zheng, Q. Meng, T. Wang, W. Chen, N. Yu, Z.-M. Ma, and T.-Y. Liu. 2016. Asynchronous Stochastic Gradient Descent with Delay Compensation. ArXiv e-prints (Sept. 2016). arXiv:cs.LG\/1609.08326"}],"event":{"name":"SC '19: The International Conference for High Performance Computing, Networking, Storage, and Analysis","location":"Denver Colorado","acronym":"SC '19","sponsor":["SIGHPC ACM Special Interest Group on High Performance Computing, Special Interest Group on High Performance Computing","IEEE CS"]},"container-title":["Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3295500.3356180","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3295500.3356180","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3295500.3356180","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:02:13Z","timestamp":1750208533000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3295500.3356180"}},"subtitle":["bringing probabilistic programming to scientific simulators at scale"],"short-title":[],"issued":{"date-parts":[[2019,11,17]]},"references-count":75,"alternative-id":["10.1145\/3295500.3356180","10.1145\/3295500"],"URL":"https:\/\/doi.org\/10.1145\/3295500.3356180","relation":{},"subject":[],"published":{"date-parts":[[2019,11,17]]},"assertion":[{"value":"2019-11-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}