{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,24]],"date-time":"2025-10-24T16:46:41Z","timestamp":1761324401714,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":36,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,10,17]],"date-time":"2021-10-17T00:00:00Z","timestamp":1634428800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Swiss National Science Foundation","award":["200021_165749, 200020B_188696"],"award-info":[{"award-number":["200021_165749, 200020B_188696"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,10,18]]},"DOI":"10.1145\/3466752.3480057","type":"proceedings-article","created":{"date-parts":[[2021,10,17]],"date-time":"2021-10-17T19:16:55Z","timestamp":1634498215000},"page":"421-433","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Equinox: Training (for Free) on a Custom Inference Accelerator"],"prefix":"10.1145","author":[{"given":"Mario","family":"Drumond","sequence":"first","affiliation":[{"name":"CodeDepot, Switzerland"}]},{"given":"Louis","family":"Coulon","sequence":"additional","affiliation":[{"name":"EcoCloud, EPFL, Switzerland"}]},{"given":"Arash","family":"Pourhabibi","sequence":"additional","affiliation":[{"name":"EcoCloud, EPFL, Switzerland"}]},{"given":"Ahmet Caner","family":"Y\u00fcz\u00fcg\u00fcler","sequence":"additional","affiliation":[{"name":"EcoCloud, EPFL, Switzerland"}]},{"given":"Babak","family":"Falsafi","sequence":"additional","affiliation":[{"name":"EcoCloud, EPFL, Switzerland"}]},{"given":"Martin","family":"Jaggi","sequence":"additional","affiliation":[{"name":"EcoCloud, EPFL, 
Switzerland"}]}],"member":"320","published-online":{"date-parts":[[2021,10,17]]},"reference":[{"unstructured":"2010. NVIDIA T4 Tensor Core GPU.https:\/\/www.nvidia.com\/content\/dam\/en-zz\/Solutions\/Data-Center\/t4-tensor-core-datasheet.pdf. Accessed: 2019-01-07.  2010. NVIDIA T4 Tensor Core GPU.https:\/\/www.nvidia.com\/content\/dam\/en-zz\/Solutions\/Data-Center\/t4-tensor-core-datasheet.pdf. Accessed: 2019-01-07.","key":"e_1_3_2_1_1_1"},{"unstructured":"2017. Cloud TPU.https:\/\/cloud.google.com\/tpu. Accessed: 2018-01-31.  2017. Cloud TPU.https:\/\/cloud.google.com\/tpu. Accessed: 2018-01-31.","key":"e_1_3_2_1_2_1"},{"unstructured":"2017. Introduction to the IPU architecture.https:\/\/www.graphcore.ai\/nips2017_presentations. Accessed: 2019-08-06.  2017. Introduction to the IPU architecture.https:\/\/www.graphcore.ai\/nips2017_presentations. Accessed: 2019-08-06.","key":"e_1_3_2_1_3_1"},{"unstructured":"2018. NVIDIA Volta V100 GPU Accelerator.https:\/\/images.nvidia.com\/content\/technologies\/volta\/pdf\/tesla-volta-v100-datasheet-letter-fnl-web.pdf. Accessed: 2018-01-31.  2018. NVIDIA Volta V100 GPU Accelerator.https:\/\/images.nvidia.com\/content\/technologies\/volta\/pdf\/tesla-volta-v100-datasheet-letter-fnl-web.pdf. Accessed: 2018-01-31.","key":"e_1_3_2_1_4_1"},{"unstructured":"2018. Tearing Apart Google\u2019s TPU 3.0 AI coprocessor.https:\/\/www.nextplatform.com\/2018\/05\/10\/tearing-apart-googles-tpu-3-0-ai-coprocessor. Accessed: 2018-05-15.  2018. Tearing Apart Google\u2019s TPU 3.0 AI coprocessor.https:\/\/www.nextplatform.com\/2018\/05\/10\/tearing-apart-googles-tpu-3-0-ai-coprocessor. Accessed: 2018-05-15.","key":"e_1_3_2_1_5_1"},{"key":"e_1_3_2_1_6_1","volume-title":"Proceedings of the 12th Symposium on Operating System Design and Implementation (OSDI). 
265\u2013283","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi , Paul Barham , Jianmin Chen , Zhifeng Chen , Andy Davis , Jeffrey Dean , Matthieu Devin , Sanjay Ghemawat , Geoffrey Irving , Michael Isard , Manjunath Kudlur , Josh Levenberg , Rajat Monga , Sherry Moore , Derek\u00a0Gordon Murray , Benoit Steiner , Paul\u00a0 A. Tucker , Vijay Vasudevan , Pete Warden , Martin Wicke , Yuan Yu , and Xiaoqiang Zheng . 2016 . TensorFlow: A System for Large-Scale Machine Learning . In Proceedings of the 12th Symposium on Operating System Design and Implementation (OSDI). 265\u2013283 . Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek\u00a0Gordon Murray, Benoit Steiner, Paul\u00a0A. Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A System for Large-Scale Machine Learning. In Proceedings of the 12th Symposium on Operating System Design and Implementation (OSDI). 265\u2013283."},{"volume-title":"The Datacenter as a Computer: Designing Warehouse-Scale Machines","author":"Barroso Luiz\u00a0Andr\u00e9","unstructured":"Luiz\u00a0Andr\u00e9 Barroso , Urs H\u00f6lzle , and Parthasarathy Ranganathan . 2018. The Datacenter as a Computer: Designing Warehouse-Scale Machines , Third Edition. Morgan & Claypool Publishers . Luiz\u00a0Andr\u00e9 Barroso, Urs H\u00f6lzle, and Parthasarathy Ranganathan. 2018. The Datacenter as a Computer: Designing Warehouse-Scale Machines, Third Edition. Morgan & Claypool Publishers.","key":"e_1_3_2_1_7_1"},{"key":"e_1_3_2_1_8_1","volume-title":"Proceedings of the 14th Symposium on Networked Systems Design and Implementation (NSDI). 613\u2013627","author":"Crankshaw Daniel","year":"2017","unstructured":"Daniel Crankshaw , Xin\u00a0Wang 0066, Giulio Zhou , Michael\u00a0 J. Franklin , Joseph\u00a0 E. Gonzalez , and Ion Stoica . 2017 . 
Clipper: A Low-Latency Online Prediction Serving System . In Proceedings of the 14th Symposium on Networked Systems Design and Implementation (NSDI). 613\u2013627 . Daniel Crankshaw, Xin\u00a0Wang 0066, Giulio Zhou, Michael\u00a0J. Franklin, Joseph\u00a0E. Gonzalez, and Ion Stoica. 2017. Clipper: A Low-Latency Online Prediction Serving System. In Proceedings of the 14th Symposium on Networked Systems Design and Implementation (NSDI). 613\u2013627."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_9_1","DOI":"10.1145\/3297858.3304070"},{"unstructured":"William Dally. 2015. High Performance Hardware for Machine Learning.https:\/\/media.nips.cc\/Conferences\/2015\/tutorialslides\/Dally-NIPS-Tutorial-2015.pdf. Accessed: 2018-01-31.  William Dally. 2015. High Performance Hardware for Machine Learning.https:\/\/media.nips.cc\/Conferences\/2015\/tutorialslides\/Dally-NIPS-Tutorial-2015.pdf. Accessed: 2018-01-31.","key":"e_1_3_2_1_10_1"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_11_1","DOI":"10.1145\/2541940.2541941"},{"key":"e_1_3_2_1_12_1","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin , Ming-Wei Chang , Kenton Lee , and Kristina Toutanova . 2019 . BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding . In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019. 4171\u20134186. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019. 
4171\u20134186."},{"key":"e_1_3_2_1_14_1","volume-title":"Proceedings of the Thirty-second Conference on Neural Information Processing Systems (NeurIPS). 451\u2013461","author":"Drumond Mario","year":"2018","unstructured":"Mario Drumond , Tao Lin , Martin Jaggi , and Babak Falsafi . 2018 . Training DNNs with Hybrid Block Floating Point . In Proceedings of the Thirty-second Conference on Neural Information Processing Systems (NeurIPS). 451\u2013461 . Mario Drumond, Tao Lin, Martin Jaggi, and Babak Falsafi. 2018. Training DNNs with Hybrid Block Floating Point. In Proceedings of the Thirty-second Conference on Neural Information Processing Systems (NeurIPS). 451\u2013461."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_15_1","DOI":"10.1145\/2000064.2000108"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_16_1","DOI":"10.1109\/ISCA.2018.00012"},{"key":"e_1_3_2_1_17_1","volume-title":"Proceedings of the Thirty-second International Conference on Machine Learning (ICML). 1737\u20131746","author":"Gupta Suyog","year":"2015","unstructured":"Suyog Gupta , Ankur Agrawal , Kailash Gopalakrishnan , and Pritish Narayanan . 2015 . Deep Learning with Limited Numerical Precision . In Proceedings of the Thirty-second International Conference on Machine Learning (ICML). 1737\u20131746 . Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, and Pritish Narayanan. 2015. Deep Learning with Limited Numerical Precision. In Proceedings of the Thirty-second International Conference on Machine Learning (ICML). 1737\u20131746."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_18_1","DOI":"10.1109\/HPCA.2018.00059"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_19_1","DOI":"10.1109\/CVPR.2016.90"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_20_1","DOI":"10.1109\/PACT.2019.00034"},{"key":"e_1_3_2_1_21_1","volume-title":"A Domain-Specific Supercomputer for Training Deep Neural Networks. Commun. ACM","author":"Jouppi P.","year":"2020","unstructured":"Norman\u00a0 P. 
Jouppi , Doe\u00a0Hyun Yoon , George Kurian , Sheng Li , Nishant Patil , James Laudon , Cliff Young , and David Patterson . 2020. A Domain-Specific Supercomputer for Training Deep Neural Networks. Commun. ACM ( 2020 ), 67\u201378. Norman\u00a0P. Jouppi, Doe\u00a0Hyun Yoon, George Kurian, Sheng Li, Nishant Patil, James Laudon, Cliff Young, and David Patterson. 2020. A Domain-Specific Supercomputer for Training Deep Neural Networks. Commun. ACM (2020), 67\u201378."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_22_1","DOI":"10.1145\/3079856.3080246"},{"volume-title":"Learning Multiple Layers of Features from Tiny Images. Technical report","author":"Krizhevsky Alex","unstructured":"Alex Krizhevsky . 2009. Learning Multiple Layers of Features from Tiny Images. Technical report , University of Toronto(2009) . Alex Krizhevsky. 2009. Learning Multiple Layers of Features from Tiny Images. Technical report, University of Toronto(2009).","key":"e_1_3_2_1_23_1"},{"key":"e_1_3_2_1_24_1","volume-title":"Proceedings of the Thirty-first Conference on Neural Information Processing Systems (NIPS). 1742\u20131752","author":"K\u00f6ster Urs","year":"2017","unstructured":"Urs K\u00f6ster , Tristan Webb , Xin Wang , Marcel Nassar , Arjun\u00a0 K. Bansal , William Constable , Oguz Elibol , Stewart Hall , Luke Hornof , Amir Khosrowshahi , Carey Kloss , Ruby\u00a0 J. Pai , and Naveen Rao . 2017 . Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks . In Proceedings of the Thirty-first Conference on Neural Information Processing Systems (NIPS). 1742\u20131752 . Urs K\u00f6ster, Tristan Webb, Xin Wang, Marcel Nassar, Arjun\u00a0K. Bansal, William Constable, Oguz Elibol, Stewart Hall, Luke Hornof, Amir Khosrowshahi, Carey Kloss, Ruby\u00a0J. Pai, and Naveen Rao. 2017. Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks. In Proceedings of the Thirty-first Conference on Neural Information Processing Systems (NIPS). 
1742\u20131752."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_25_1","DOI":"10.1145\/2749469.2749475"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_26_1","DOI":"10.1109\/MICRO.2007.30"},{"doi-asserted-by":"crossref","unstructured":"Sharan Narang and Greg Diamos. 2017. Baidu DeepBench. https:\/\/doi.org\/10.1145\/1105734.1105748. 10.1145\/1105734.1105748","key":"#cr-split#-e_1_3_2_1_27_1.1","DOI":"10.1145\/1105734.1105748"},{"doi-asserted-by":"crossref","unstructured":"Sharan Narang and Greg Diamos. 2017. Baidu DeepBench. https:\/\/doi.org\/10.1145\/1105734.1105748.","key":"#cr-split#-e_1_3_2_1_27_1.2","DOI":"10.1145\/1105734.1105748"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_28_1","DOI":"10.5555\/2971808.2971811"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_29_1","DOI":"10.1145\/3132747.3132780"},{"volume-title":"Pushing the Limits of Narrow Precision Inferencing at Cloud Scale with Microsoft Floating Point. NeurIPS 2020","author":"Rouhani Bita","unstructured":"Bita Rouhani , Daniel Lo , Ritchie Zhao , Ming Liu , Jeremy Fowers , Kalin Ovtcharov , Anna Vinogradsky , Sarah Massengill , Lita Yang , Ray Bittner , Alessandro Forin , Haishan Zhu , Taesik Na , Prerak Patel , Shuai Che , Lok\u00a0Chand Koppaka , Xia Song , Subhojit Som , Kaustav Das , Saurabh Tiwary , Steve Reinhardt , Sitaram Lanka , Eric Chung , and Doug Burger . 2020. Pushing the Limits of Narrow Precision Inferencing at Cloud Scale with Microsoft Floating Point. NeurIPS 2020 vol. 33 , no. 4, pp. 100-107. (2020). Bita Rouhani, Daniel Lo, Ritchie Zhao, Ming Liu, Jeremy Fowers, Kalin Ovtcharov, Anna Vinogradsky, Sarah Massengill, Lita Yang, Ray Bittner, Alessandro Forin, Haishan Zhu, Taesik Na, Prerak Patel, Shuai Che, Lok\u00a0Chand Koppaka, Xia Song, Subhojit Som, Kaustav Das, Saurabh Tiwary, Steve Reinhardt, Sitaram Lanka, Eric Chung, and Doug Burger. 2020. Pushing the Limits of Narrow Precision Inferencing at Cloud Scale with Microsoft Floating Point. NeurIPS 2020 vol. 33, no. 4, pp. 
100-107. (2020).","key":"e_1_3_2_1_30_1"},{"unstructured":"Olga Russakovsky Jia Deng Hao Su Jonathan Krause Sanjeev Satheesh Sean Ma Zhiheng Huang Andrej Karpathy Aditya Khosla Michael\u00a0S. Bernstein Alexander\u00a0C. Berg and Fei-Fei Li. 2014. ImageNet Large Scale Visual Recognition Challenge. CoRR abs\/1409.0575(2014).  Olga Russakovsky Jia Deng Hao Su Jonathan Krause Sanjeev Satheesh Sean Ma Zhiheng Huang Andrej Karpathy Aditya Khosla Michael\u00a0S. Bernstein Alexander\u00a0C. Berg and Fei-Fei Li. 2014. ImageNet Large Scale Visual Recognition Challenge. CoRR abs\/1409.0575(2014).","key":"e_1_3_2_1_31_1"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_32_1","DOI":"10.1109\/JPROC.2017.2761740"},{"unstructured":"Kevin Tran. 2016. Start Your HBM\/2.5 D Design Today.  Kevin Tran. 2016. Start Your HBM\/2.5 D Design Today.","key":"e_1_3_2_1_33_1"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_34_1","DOI":"10.1145\/1105734.1105748"},{"key":"e_1_3_2_1_35_1","volume-title":"Proceedings of the Thirty-second Conference on Neural Information Processing Systems (NeurIPS). 7686\u20137695","author":"Wang Naigang","year":"2018","unstructured":"Naigang Wang , Jungwook Choi , Daniel Brand , Chia-Yu Chen , and Kailash Gopalakrishnan . 2018 . Training Deep Neural Networks with 8-bit Floating Point Numbers . In Proceedings of the Thirty-second Conference on Neural Information Processing Systems (NeurIPS). 7686\u20137695 . Naigang Wang, Jungwook Choi, Daniel Brand, Chia-Yu Chen, and Kailash Gopalakrishnan. 2018. Training Deep Neural Networks with 8-bit Floating Point Numbers. In Proceedings of the Thirty-second Conference on Neural Information Processing Systems (NeurIPS). 
7686\u20137695."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_36_1","DOI":"10.1109\/IISWC.2018.8573476"}],"event":{"sponsor":["SIGMICRO ACM Special Interest Group on Microarchitectural Research and Processing"],"acronym":"MICRO '21","name":"MICRO '21: 54th Annual IEEE\/ACM International Symposium on Microarchitecture","location":"Virtual Event Greece"},"container-title":["MICRO-54: 54th Annual IEEE\/ACM International Symposium on Microarchitecture"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3466752.3480057","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3466752.3480057","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:24:53Z","timestamp":1750195493000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3466752.3480057"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,17]]},"references-count":36,"alternative-id":["10.1145\/3466752.3480057","10.1145\/3466752"],"URL":"https:\/\/doi.org\/10.1145\/3466752.3480057","relation":{},"subject":[],"published":{"date-parts":[[2021,10,17]]},"assertion":[{"value":"2021-10-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}