{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:26:24Z","timestamp":1750220784788,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":32,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,6,15]],"date-time":"2020-06-15T00:00:00Z","timestamp":1592179200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,6,15]]},"DOI":"10.1145\/3394450.3397468","type":"proceedings-article","created":{"date-parts":[[2020,6,1]],"date-time":"2020-06-01T22:03:54Z","timestamp":1591049034000},"page":"20-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["On the challenges in programming mixed-precision deep neural networks"],"prefix":"10.1145","author":[{"given":"Ruizhe","family":"Zhao","sequence":"first","affiliation":[{"name":"Imperial College London, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wayne","family":"Luk","sequence":"additional","affiliation":[{"name":"Imperial College London, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chao","family":"Xiong","sequence":"additional","affiliation":[{"name":"Corerain Technologies, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xinyu","family":"Niu","sequence":"additional","affiliation":[{"name":"Corerain Technologies, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kuen Hung","family":"Tsoi","sequence":"additional","affiliation":[{"name":"Corerain Technologies, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,6,15]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3088525.3088527"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/857076.857077"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"crossref","unstructured":"Liang-Chieh Chen George Papandreou Iasonas Kokkinos Kevin Murphy and Alan L Yuille. 2017. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets Atrous Convolution and Fully Connected CRFs. IEEE transactions on pattern analysis and machine intelligence 40 4 (2017) 834\u2013848.","DOI":"10.1109\/TPAMI.2017.2699184"},{"volume-title":"TVM: End-to-End Optimization Stack for Deep Learning.","year":"2018","author":"Chen Tianqi","key":"e_1_3_2_1_4_1"},{"key":"e_1_3_2_1_5_1","unstructured":"Sharan Chetlur Cliff Woolley Philippe Vandermersch Jonathan Cohen John Tran Bryan Catanzaro and Evan Shelhamer. 2014. cuDNN: Efficient primitives for deep learning. arXiv preprint arXiv:1410.0759 (2014)."},{"volume-title":"HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision. In ICCV. 293\u2013302. arXiv","year":"2019","author":"Dong Zhen","key":"e_1_3_2_1_6_1"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/996841.996853"},{"volume-title":"JAX: Autograd and XLA. 
https:\/\/github.com\/google\/jax.","year":"2020","key":"e_1_3_2_1_8_1"},{"key":"e_1_3_2_1_9_1","unstructured":"Ga\u00ebl Guennebaud Beno\u00eet Jacob et al. 2010. Eigen v3. http:\/\/eigen.tuxfamily.org."},{"key":"e_1_3_2_1_10_1","unstructured":"Song Han Huizi Mao and William J. Dally. 2016. Deep Compression - Compressing Deep Neural Networks with Pruning Trained Quantization and Huffman Coding. In ICLR. arXiv: 1510.00149"},{"key":"e_1_3_2_1_11_1","unstructured":"Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2016. Deep Residual Learning for Image Recognition. In CVPR. arXiv: 1512.03385"},{"volume-title":"June 15, 2020","year":"2016","author":"He Kaiming","key":"e_1_3_2_1_12_1"},{"key":"e_1_3_2_1_13_1","unstructured":"Dhiraj Kalamkar Dheevatsa Mudigere Naveen Mellempudi Dipankar Das Kunal Banerjee Sasikanth Avancha Dharma Teja Vooturi Nataraj Jammalamadaka Jianyu Huang Hector Yuen Jiyan Yang Jongsoo Park Alexander Heinecke Evangelos Georganas Sudarshan Srinivasan Abhisek Kundu Misha Smelyanskiy Bharat Kaul and Pradeep Dubey. 2019. A Study of BFLOAT16 for Deep Learning Training. (2019)."},
{"key":"e_1_3_2_1_14_1","unstructured":"arXiv: 1905.12322"},{"volume-title":"Cheetah: Mixed Low-Precision Hardware &","year":"2019","author":"Langroudi Hamed F.","key":"e_1_3_2_1_15_1"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/358438.349320"},{"key":"e_1_3_2_1_17_1","volume-title":"The BSD conference","volume":"5","author":"Lattner Chris","year":"2008"},{"key":"e_1_3_2_1_18_1","unstructured":"Tsung Yi Lin Piotr Doll\u00e1r Ross Girshick Kaiming He Bharath Hariharan and Serge Belongie. 2017. Feature pyramid networks for object detection. In CVPR. arXiv: arXiv:1612.03144v2"},{"volume-title":"SSD: Single Shot MultiBox Detector. In ECCV. https:\/\/arxiv.org\/pdf\/1512.02325.pdf","year":"2016","author":"Liu Wei","key":"e_1_3_2_1_19_1"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.2011.68"},{"key":"e_1_3_2_1_21_1","unstructured":"Naveen Mellempudi Sudarshan Srinivasan Dipankar Das and Bharat Kaul. 2019. Mixed Precision Training With 8-bit Floating Point. (2019)."},{"key":"e_1_3_2_1_22_1","unstructured":"arXiv: 1905.12334 http:\/\/arxiv.org\/abs\/1905.12334"},{"key":"e_1_3_2_1_23_1","unstructured":"Numba. 2020. A Just-In-Time Compiler for Numerical Functions in Python. https:\/\/github.com\/numba\/numba."},
{"key":"e_1_3_2_1_24_1","unstructured":"Adam Paszke Sam Gross Francisco Massa Adam Lerer James Bradbury Gregory Chanan Trevor Killeen Zeming Lin Natalia Gimelshein Luca Antiga Alban Desmaison Andreas Kopf Edward Yang Zachary DeVito Martin Raison Alykhan Tejani Sasank Chilamkurthy Benoit Steiner Lu Fang Junjie Bai and Soumith Chintala. 2019. PyTorch: An Imperative Style High-Performance Deep Learning Library. In NeurIPS."},{"key":"e_1_3_2_1_25_1","unstructured":"PyTorch. 2020. TorchScript. https:\/\/pytorch.org\/docs\/stable\/jit.html."},{"key":"e_1_3_2_1_26_1","unstructured":"Joseph Redmon and Ali Farhadi. 2018. YOLOv3: An Incremental Improvement. (2018). arXiv: 1804.02767"},{"key":"e_1_3_2_1_27_1","unstructured":"Shaoqing Ren Kaiming He Ross Girshick and Jian Sun. 2015. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In NeurIPS. 91\u201399."},
{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-42999-1_15"},{"volume-title":"Glow: Graph Lowering Compiler Techniques for Neural Networks.","year":"2018","author":"Rotem Nadav","key":"e_1_3_2_1_29_1"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"crossref","unstructured":"Christian Szegedy Sergey Ioffe and Vincent Vanhoucke. 2017. Inception-v4 Inception-ResNet and the Impact of Residual Connections on Learning. In AAAI. arXiv: 1602.07261","DOI":"10.1609\/aaai.v31i1.11231"},{"key":"e_1_3_2_1_31_1","unstructured":"TensorFlow. 2020. TensorFlow graph optimization with Grappler. https:\/\/www.tensorflow.org\/guide\/graph_optimization."},{"volume-title":"DLVM: A modern compiler infrastructure for deep learning systems.","year":"2017","author":"Wei Richard","key":"e_1_3_2_1_32_1"}],"event":{"name":"PLDI '20: 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation","sponsor":["SIGPLAN ACM Special Interest Group on Programming Languages"],"location":"London UK","acronym":"PLDI '20"},"container-title":["Proceedings of the 4th ACM SIGPLAN International Workshop on Machine Learning and Programming 
Languages"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3394450.3397468","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3394450.3397468","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:41:37Z","timestamp":1750200097000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3394450.3397468"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,15]]},"references-count":32,"alternative-id":["10.1145\/3394450.3397468","10.1145\/3394450"],"URL":"https:\/\/doi.org\/10.1145\/3394450.3397468","relation":{},"subject":[],"published":{"date-parts":[[2020,6,15]]},"assertion":[{"value":"2020-06-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}