{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T00:20:52Z","timestamp":1777422052852,"version":"3.51.4"},"publisher-location":"New York, NY, USA","reference-count":51,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,11,8]],"date-time":"2020-11-08T00:00:00Z","timestamp":1604793600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,11,8]]},"DOI":"10.1145\/3368089.3417050","type":"proceedings-article","created":{"date-parts":[[2020,12,11]],"date-time":"2020-12-11T00:45:01Z","timestamp":1607647501000},"page":"1342-1352","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":100,"title":["Estimating GPU memory consumption of deep learning models"],"prefix":"10.1145","author":[{"given":"Yanjie","family":"Gao","sequence":"first","affiliation":[{"name":"Microsoft Research, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yu","family":"Liu","sequence":"additional","affiliation":[{"name":"Microsoft Research, China \/ National University of Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hongyu","family":"Zhang","sequence":"additional","affiliation":[{"name":"University of Newcastle, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhengxian","family":"Li","sequence":"additional","affiliation":[{"name":"Microsoft Research, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yonghao","family":"Zhu","sequence":"additional","affiliation":[{"name":"Microsoft Research, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haoxiang","family":"Lin","sequence":"additional","affiliation":[{"name":"Microsoft Research, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mao","family":"Yang","sequence":"additional","affiliation":[{"name":"Microsoft Research, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,11,8]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (Savannah, GA, USA) ( OSDI'16). USENIX Association, USA, 265-283","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi , Paul Barham , Jianmin Chen , Zhifeng Chen , and Andy et. al Davis . 2016 . TensorFlow: A System for Large-Scale Machine Learning . In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (Savannah, GA, USA) ( OSDI'16). USENIX Association, USA, 265-283 . Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, and Andy et.al Davis. 2016. TensorFlow: A System for Large-Scale Machine Learning. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (Savannah, GA, USA) ( OSDI'16). USENIX Association, USA, 265-283."},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1837855.1806671"},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1016\/0925-2312(93)90006-O"},{"key":"e_1_3_2_2_4_1","unstructured":"Amazon. 2019. Amazon SageMaker. https:\/\/aws.amazon.com\/sagemaker.  Amazon. 2019. Amazon SageMaker. https:\/\/aws.amazon.com\/sagemaker."},{"key":"e_1_3_2_2_5_1","unstructured":"Microsoft Azure. 2019. Microsoft Azure Machine Learning. https:\/\/azure. microsoft.com\/en-us\/ services\/machine-learning-service.  Microsoft Azure. 2019. Microsoft Azure Machine Learning. https:\/\/azure. 
microsoft.com\/en-us\/ services\/machine-learning-service."},{"key":"e_1_3_2_2_6_1","volume-title":"Understanding the Memory Consumption of the MiBench Embedded Benchmark","author":"Blin Antoine","unstructured":"Antoine Blin , C\u00e9dric Courtaud , Julien Sopena , Julia Lawall , and Gilles Muller . 2016. Understanding the Memory Consumption of the MiBench Embedded Benchmark . In Networked Systems, Parosh Aziz Abdulla and Carole Delporte-Gallet (Eds.). Springer International Publishing , Cham , 71-86. Antoine Blin, C\u00e9dric Courtaud, Julien Sopena, Julia Lawall, and Gilles Muller. 2016. Understanding the Memory Consumption of the MiBench Embedded Benchmark. In Networked Systems, Parosh Aziz Abdulla and Carole Delporte-Gallet (Eds.). Springer International Publishing, Cham, 71-86."},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1375634.1375655"},{"key":"e_1_3_2_2_8_1","unstructured":"Alfredo Canziani Adam Paszke and Eugenio Culurciello. 2017. An Analysis of Deep Neural Network Models for Practical Applications. ArXiv abs\/1605.07678 ( 2017 ).  Alfredo Canziani Adam Paszke and Eugenio Culurciello. 2017. An Analysis of Deep Neural Network Models for Practical Applications. ArXiv abs\/1605.07678 ( 2017 )."},{"key":"e_1_3_2_2_9_1","unstructured":"Tianqi Chen Mu Li Yutian Li Min Lin Naiyan Wang Minjie Wang Tianjun Xiao Bing Xu Chiyuan Zhang and Zheng Zhang. 2015. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. CoRR abs\/1512.01274 ( 2015 ). arXiv: 1512.01274 http:\/\/arxiv.org\/abs\/1512.01274  Tianqi Chen Mu Li Yutian Li Min Lin Naiyan Wang Minjie Wang Tianjun Xiao Bing Xu Chiyuan Zhang and Zheng Zhang. 2015. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. CoRR abs\/1512.01274 ( 2015 ). 
arXiv: 1512.01274 http:\/\/arxiv.org\/abs\/1512.01274"},{"key":"e_1_3_2_2_10_1","volume-title":"Proceedings of the 13th USENIX Conference on Operating Systems Design and Implementation (Carlsbad, CA, USA) ( OSDI'18). USENIX Association, USA, 579-594","author":"Chen Tianqi","year":"2018","unstructured":"Tianqi Chen , Thierry Moreau , Ziheng Jiang , Lianmin Zheng , Eddie Yan , Meghan Cowan , Haichen Shen , Leyuan Wang , Yuwei Hu , Luis Ceze , Carlos Guestrin , and Arvind Krishnamurthy . 2018 . TVM: An Automated End-to-End Optimizing Compiler for Deep Learning . In Proceedings of the 13th USENIX Conference on Operating Systems Design and Implementation (Carlsbad, CA, USA) ( OSDI'18). USENIX Association, USA, 579-594 . Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy. 2018. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In Proceedings of the 13th USENIX Conference on Operating Systems Design and Implementation (Carlsbad, CA, USA) ( OSDI'18). USENIX Association, USA, 579-594."},{"key":"e_1_3_2_2_11_1","unstructured":"Tianqi Chen Bing Xu Chiyuan Zhang and Carlos Guestrin. 2016. Training Deep Nets with Sublinear Memory Cost. CoRR abs\/1604.06174 ( 2016 ). arXiv: 1604. 06174  Tianqi Chen Bing Xu Chiyuan Zhang and Carlos Guestrin. 2016. Training Deep Nets with Sublinear Memory Cost. CoRR abs\/1604.06174 ( 2016 ). arXiv: 1604. 06174"},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2907950.2907955"},{"key":"e_1_3_2_2_13_1","volume-title":"ImageNet: A Large-Scale Hierarchical Image Database. In CVPR","author":"Deng J.","year":"2009","unstructured":"J. Deng , W. Dong , R. Socher , L.-J. Li , K. Li , and L. Fei-Fei . 2009 . ImageNet: A Large-Scale Hierarchical Image Database. In CVPR 2009 . J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. 2009. ImageNet: A Large-Scale Hierarchical Image Database. 
In CVPR 2009."},{"key":"e_1_3_2_2_14_1","volume-title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT.","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin , Ming-Wei Chang , Kenton Lee , and Kristina Toutanova . 2018 . BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT."},{"key":"e_1_3_2_2_15_1","unstructured":"Peng Gu. 2018. Memory management for tensorflow. https:\/\/github.com\/ miglopst\/cs263_spring2018\/wiki\/Memory-management-for-tensorflow  Peng Gu. 2018. Memory management for tensorflow. https:\/\/github.com\/ miglopst\/cs263_spring2018\/wiki\/Memory-management-for-tensorflow"},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2019.00113"},{"key":"e_1_3_2_2_17_1","unstructured":"Mark Harris. 2019. TensorFlow Graph Optimizations. ( 2019 ).  Mark Harris. 2019. TensorFlow Graph Optimizations. ( 2019 )."},{"key":"e_1_3_2_2_18_1","unstructured":"Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2015. Deep Residual Learning for Image Recognition. CoRR abs\/1512.03385 ( 2015 ). arXiv: 1512. 03385  Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2015. Deep Residual Learning for Image Recognition. CoRR abs\/1512.03385 ( 2015 ). arXiv: 1512. 03385"},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2019.00027"},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"crossref","unstructured":"Sepp Hochreiter and Jurgen Schmidhuber. 1997. Long Short-term Memory. Neural computation 9 (12 1997 ) 1735-80.  Sepp Hochreiter and Jurgen Schmidhuber. 1997. Long Short-term Memory. 
Neural computation 9 (12 1997 ) 1735-80.","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_2_2_21_1","volume-title":"Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Tallinn, Estonia) (ESEC\/FSE 2019 )","author":"Islam Md Johirul","unstructured":"Md Johirul Islam , Giang Nguyen , Rangeet Pan , and Hridesh Rajan . 2019. A Comprehensive Study on Deep Learning Bug Characteristics . In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Tallinn, Estonia) (ESEC\/FSE 2019 ) . Association for Computing Machinery , NY , USA, 510-520. Md Johirul Islam, Giang Nguyen, Rangeet Pan, and Hridesh Rajan. 2019. A Comprehensive Study on Deep Learning Bug Characteristics. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Tallinn, Estonia) (ESEC\/FSE 2019 ). Association for Computing Machinery, NY, USA, 510-520."},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3338906.3338936"},{"key":"e_1_3_2_2_23_1","volume-title":"Kingma and Jimmy Ba","author":"Diederik","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Ba . 2015 . Adam : A Method for Stochastic Optimization. In ICLR (Poster) . http:\/\/arxiv.org\/abs\/1412.6980 Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In ICLR (Poster). http:\/\/arxiv.org\/abs\/1412.6980"},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2017.2780222"},{"key":"e_1_3_2_2_25_1","unstructured":"Malmaud. 2020. TensorFlow Shape Infer. https:\/\/malmaud.github.io\/tfdocs\/ shape_inference.  Malmaud. 2020. TensorFlow Shape Infer. https:\/\/malmaud.github.io\/tfdocs\/ shape_inference."},{"key":"e_1_3_2_2_26_1","unstructured":"Chen Meng Minmin Sun Jun Yang Minghui Qiu and Yang Gu. 
2017. Training Deeper Models by GPU Memory Optimization on TensorFlow.  Chen Meng Minmin Sun Jun Yang Minghui Qiu and Yang Gu. 2017. Training Deeper Models by GPU Memory Optimization on TensorFlow."},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"crossref","unstructured":"Tim Menzies Zach Milton Burak Turhan Bojan Cukic Yue Jiang and Ay\u015fe Bener. 2010. Defect prediction from static code features: Current results limitations new approaches. Automated Software Engineering 17 4 ( 1 12 2010 ) 375-407.  Tim Menzies Zach Milton Burak Turhan Bojan Cukic Yue Jiang and Ay\u015fe Bener. 2010. Defect prediction from static code features: Current results limitations new approaches. Automated Software Engineering 17 4 ( 1 12 2010 ) 375-407.","DOI":"10.1007\/s10515-010-0069-5"},{"key":"e_1_3_2_2_28_1","volume-title":"2003 International Symposium on Empirical Software Engineering, 2003. ISESE 2003. Proceedings. 223-230","author":"Molokken K.","unstructured":"K. Molokken and M. Jorgensen . 2003. A review of software surveys on software effort estimation . In 2003 International Symposium on Empirical Software Engineering, 2003. ISESE 2003. Proceedings. 223-230 . K. Molokken and M. Jorgensen. 2003. A review of software surveys on software effort estimation. In 2003 International Symposium on Empirical Software Engineering, 2003. ISESE 2003. Proceedings. 223-230."},{"key":"e_1_3_2_2_29_1","unstructured":"MXNet. 2020. MXNet Memory Monger. https:\/\/github.com\/dmlc\/mxnetmemonger.  MXNet. 2020. MXNet Memory Monger. https:\/\/github.com\/dmlc\/mxnetmemonger."},{"key":"e_1_3_2_2_30_1","unstructured":"MXNet. 2020. MXNet symbol simple bind. https:\/\/beta.mxnet.io\/api\/symbol\/_autogen\/mxnet.symbol.Symbol.\\simple_bind.html.  MXNet. 2020. MXNet symbol simple bind. https:\/\/beta.mxnet.io\/api\/symbol\/_autogen\/mxnet.symbol.Symbol.\\simple_bind.html."},{"key":"e_1_3_2_2_31_1","unstructured":"Apache MXNet. 2019. The topological sorting algorithm for computation graphs in Apache MXNet. 
https:\/\/github.com\/apache\/incubator-mxnet\/blob\/1.6.0\/src\/ executor\/simple_partition_pass.h  Apache MXNet. 2019. The topological sorting algorithm for computation graphs in Apache MXNet. https:\/\/github.com\/apache\/incubator-mxnet\/blob\/1.6.0\/src\/ executor\/simple_partition_pass.h"},{"key":"e_1_3_2_2_32_1","unstructured":"NVIDIA. 2019. NVML API Reference Guide. https:\/\/docs.nvidia.com\/deploy\/ nvml-api\/index.html. ( 2019 ).  NVIDIA. 2019. NVML API Reference Guide. https:\/\/docs.nvidia.com\/deploy\/ nvml-api\/index.html. ( 2019 )."},{"key":"e_1_3_2_2_33_1","unstructured":"Nvidia. 2020. cudnnConvolutionFwdAlgo. https:\/\/docs.nvidia.com\/deeplearning\/ sdk\/cudnn-api\/index.html#cudnnConvolutionFwdAlgo_t.  Nvidia. 2020. cudnnConvolutionFwdAlgo. https:\/\/docs.nvidia.com\/deeplearning\/ sdk\/cudnn-api\/index.html#cudnnConvolutionFwdAlgo_t."},{"key":"e_1_3_2_2_34_1","unstructured":"ONNX. 2020. ONNX Shape Inference. https:\/\/github.com\/onnx\/onnx\/blob\/v1.7. 0\/docs\/ShapeInference.md.  ONNX. 2020. ONNX Shape Inference. https:\/\/github.com\/onnx\/onnx\/blob\/v1.7. 0\/docs\/ShapeInference.md."},{"key":"e_1_3_2_2_35_1","volume-title":"Bradbury","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke , Sam Gross , Francisco Massa , Adam Lerer , and James et al. Bradbury . 2019 . PyTorch: An Imperative Style, High-Performance Deep Learning Library . In Advances in Neural Information Processing Systems 32. Curran Associates, Inc., 8024-8035. Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, and James et al. Bradbury. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32. Curran Associates, Inc., 8024-8035."},{"key":"e_1_3_2_2_36_1","unstructured":"PyTorch. 2019. PyTorch: Control Flow + Weight Sharing. https:\/\/pytorch.org\/ tutorials\/beginner\/examples_nn\/dynamic_net.html.  PyTorch. 2019. PyTorch: Control Flow + Weight Sharing. 
https:\/\/pytorch.org\/ tutorials\/beginner\/examples_nn\/dynamic_net.html."},{"key":"e_1_3_2_2_37_1","volume-title":"The topological sorting algorithm for computation graphs in PyTorch. https:\/\/github.com\/pytorch\/pytorch\/blob\/v1.2.0\/cafe2\/core\/nomnigraph\/ include\/nomnigraph\/Graph\/TopoSort.h#L26","unstructured":"PyTorch. 2019. The topological sorting algorithm for computation graphs in PyTorch. https:\/\/github.com\/pytorch\/pytorch\/blob\/v1.2.0\/cafe2\/core\/nomnigraph\/ include\/nomnigraph\/Graph\/TopoSort.h#L26 PyTorch. 2019. The topological sorting algorithm for computation graphs in PyTorch. https:\/\/github.com\/pytorch\/pytorch\/blob\/v1.2.0\/cafe2\/core\/nomnigraph\/ include\/nomnigraph\/Graph\/TopoSort.h#L26"},{"key":"e_1_3_2_2_38_1","volume-title":"Keckler","author":"Rhu Minsoo","year":"2016","unstructured":"Minsoo Rhu , Natalia Gimelshein , Jason Clemons , Arslan Zulfiqar , and Stephen W . Keckler . 2016 . VDNN : Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design. In The 49th Annual IEEE\/ACM International Symposium on Microarchitecture (Taipei, Taiwan) (MICRO-49). IEEE Press , Article 18, 13 pages. Minsoo Rhu, Natalia Gimelshein, Jason Clemons, Arslan Zulfiqar, and Stephen W. Keckler. 2016. VDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design. In The 49th Annual IEEE\/ACM International Symposium on Microarchitecture (Taipei, Taiwan) (MICRO-49). IEEE Press, Article 18, 13 pages."},{"key":"e_1_3_2_2_39_1","unstructured":"Nikolay Sakharnykh. 2018. Everything you need to know about unified memory. NVIDIA GTC ( 2018 ).  Nikolay Sakharnykh. 2018. Everything you need to know about unified memory. NVIDIA GTC ( 2018 )."},{"key":"e_1_3_2_2_40_1","volume-title":"Horovod: fast and easy distributed deep learning in TensorFlow. CoRR abs\/","author":"Sergeev Alexander","year":"1802","unstructured":"Alexander Sergeev and Mike Del Balso . 2018. 
Horovod: fast and easy distributed deep learning in TensorFlow. CoRR abs\/ 1802 .05799 ( 2018 ). arXiv: 1802.05799 Alexander Sergeev and Mike Del Balso. 2018. Horovod: fast and easy distributed deep learning in TensorFlow. CoRR abs\/ 1802.05799 ( 2018 ). arXiv: 1802.05799"},{"key":"e_1_3_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2786805.2786845"},{"key":"e_1_3_2_2_42_1","volume-title":"3rd International Conference on Learning Representations, ICLR","author":"Simonyan Karen","year":"2015","unstructured":"Karen Simonyan and Andrew Zisserman . 2015. Very Deep Convolutional Networks for Large-Scale Image Recognition . In 3rd International Conference on Learning Representations, ICLR 2015 , San Diego, CA , USA, May 7-9, 2015, Conference Track Proceedings . http:\/\/arxiv.org\/abs\/1409.1556 Karen Simonyan and Andrew Zisserman. 2015. Very Deep Convolutional Networks for Large-Scale Image Recognition. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings. http:\/\/arxiv.org\/abs\/1409.1556"},{"key":"e_1_3_2_2_43_1","unstructured":"Christian Szegedy Vincent Vanhoucke Sergey Ioffe Jonathon Shlens and Zbigniew Wojna. 2015. Rethinking the Inception Architecture for Computer Vision. CoRR abs\/1512.00567 ( 2015 ). arXiv: 1512.00567 http:\/\/arxiv.org\/abs\/1512.00567  Christian Szegedy Vincent Vanhoucke Sergey Ioffe Jonathon Shlens and Zbigniew Wojna. 2015. Rethinking the Inception Architecture for Computer Vision. CoRR abs\/1512.00567 ( 2015 ). arXiv: 1512.00567 http:\/\/arxiv.org\/abs\/1512.00567"},{"key":"e_1_3_2_2_44_1","article-title":"Conceptual Data Model-Based Software Size Estimation for Information Systems","volume":"19","author":"Kuan Tan Hee Beng","year":"2009","unstructured":"Hee Beng Kuan Tan , Yuan Zhao , and Hongyu Zhang . 2009 . Conceptual Data Model-Based Software Size Estimation for Information Systems . ACM Trans. Softw. Eng. Methodol. 19 , 2, Article 4 (Oct. 
2009 ), 37 pages. Hee Beng Kuan Tan, Yuan Zhao, and Hongyu Zhang. 2009. Conceptual Data Model-Based Software Size Estimation for Information Systems. ACM Trans. Softw. Eng. Methodol. 19, 2, Article 4 (Oct. 2009 ), 37 pages.","journal-title":"ACM Trans. Softw. Eng. Methodol."},{"key":"e_1_3_2_2_45_1","first-page":"26","article-title":"rmsprop: Divide the Gradient by a Running Average of Its Recent Magnitude","volume":"4","author":"Tieleman T.","year":"2012","unstructured":"T. Tieleman and G. Hinton . 2012 . rmsprop: Divide the Gradient by a Running Average of Its Recent Magnitude . COURSERA: Neural Networks for Machine Learning , 4 , 26 - 31 . ( 2012 ). T. Tieleman and G. Hinton. 2012. rmsprop: Divide the Gradient by a Running Average of Its Recent Magnitude. COURSERA: Neural Networks for Machine Learning, 4, 26-31. ( 2012 ).","journal-title":"COURSERA: Neural Networks for Machine Learning"},{"key":"e_1_3_2_2_46_1","volume-title":"Liu","author":"Unnikrishnan Leena","year":"2000","unstructured":"Leena Unnikrishnan , Scott D. Stoller , and Yanhong A . Liu . 2000 . Automatic Accurate Stack Space and Heap Space Analysis for High-Level Languages. Technical Report. Indiana University . Leena Unnikrishnan, Scott D. Stoller, and Yanhong A. Liu. 2000. Automatic Accurate Stack Space and Heap Space Analysis for High-Level Languages. Technical Report. Indiana University."},{"key":"e_1_3_2_2_47_1","first-page":"143","volume-title":"Proceedings of the 31st Annual Design Automation Conference (San Diego, California, USA) ( DAC '94). Association for Computing Machinery","author":"Verbauwhede Ingrid M.","unstructured":"Ingrid M. Verbauwhede , Chris J. Scheers , and Jan M. Rabaey . 1994. Memory Estimation for High Level Synthesis . In Proceedings of the 31st Annual Design Automation Conference (San Diego, California, USA) ( DAC '94). Association for Computing Machinery , New York, NY, USA , 143 - 148 . Ingrid M. Verbauwhede, Chris J. Scheers, and Jan M. Rabaey. 1994. 
Memory Estimation for High Level Synthesis. In Proceedings of the 31st Annual Design Automation Conference (San Diego, California, USA) ( DAC '94). Association for Computing Machinery, New York, NY, USA, 143-148."},{"key":"e_1_3_2_2_48_1","volume-title":"Bowman","author":"Wang Alex","year":"2019","unstructured":"Alex Wang , Amanpreet Singh , Julian Michael , Felix Hill , Omer Levy , and Samuel R . Bowman . 2019 . GLUE : A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. In the Proceedings of ICLR. Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. 2019. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. In the Proceedings of ICLR."},{"key":"e_1_3_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178487.3178491"},{"key":"e_1_3_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.58337"},{"key":"e_1_3_2_2_51_1","volume-title":"Proceedings of the 42nd International Conference on Software Engineering (Seoul, Republic of Korea) ( ICSE '20). Association for Computing Machinery, NY, USA, 1159-1170","author":"Zhang Ru","year":"2020","unstructured":"Ru Zhang , Wencong Xiao , Hongyu Zhang , Yu Liu , Haoxiang Lin , and Mao Yang . 2020 . An Empirical Study on Program Failures of Deep Learning Jobs . In Proceedings of the 42nd International Conference on Software Engineering (Seoul, Republic of Korea) ( ICSE '20). Association for Computing Machinery, NY, USA, 1159-1170 . Ru Zhang, Wencong Xiao, Hongyu Zhang, Yu Liu, Haoxiang Lin, and Mao Yang. 2020. An Empirical Study on Program Failures of Deep Learning Jobs. In Proceedings of the 42nd International Conference on Software Engineering (Seoul, Republic of Korea) ( ICSE '20). 
Association for Computing Machinery, NY, USA, 1159-1170."}],"event":{"name":"ESEC\/FSE '20: 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering","location":"Virtual Event USA","acronym":"ESEC\/FSE '20","sponsor":["SIGSOFT ACM Special Interest Group on Software Engineering"]},"container-title":["Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3368089.3417050","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3368089.3417050","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:01:58Z","timestamp":1750197718000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3368089.3417050"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,11,8]]},"references-count":51,"alternative-id":["10.1145\/3368089.3417050","10.1145\/3368089"],"URL":"https:\/\/doi.org\/10.1145\/3368089.3417050","relation":{},"subject":[],"published":{"date-parts":[[2020,11,8]]},"assertion":[{"value":"2020-11-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}