{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T07:50:52Z","timestamp":1761897052021,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":52,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,5,13]],"date-time":"2019-05-13T00:00:00Z","timestamp":1557705600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,5,13]]},"DOI":"10.1145\/3317550.3321443","type":"proceedings-article","created":{"date-parts":[[2019,5,10]],"date-time":"2019-05-10T19:01:58Z","timestamp":1557514918000},"page":"184-191","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":26,"title":["A Case for Managed and Model-less Inference Serving"],"prefix":"10.1145","author":[{"given":"Neeraja J.","family":"Yadwadkar","sequence":"first","affiliation":[{"name":"Stanford University"}]},{"given":"Francisco","family":"Romero","sequence":"additional","affiliation":[{"name":"Stanford University"}]},{"given":"Qian","family":"Li","sequence":"additional","affiliation":[{"name":"Stanford University"}]},{"given":"Christos","family":"Kozyrakis","sequence":"additional","affiliation":[{"name":"Stanford University, Google"}]}],"member":"320","published-online":{"date-parts":[[2019,5,13]]},"reference":[{"volume-title":"2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS). 1815--1824","author":"AbdelBaky M.","key":"e_1_3_2_1_1_1","unstructured":"M. AbdelBaky , M. Zou , A. R. Zamani , E. Renart , J. Diaz-Montes , and M. Parashar . 2017. Computing in the Continuum: Combining Pervasive Devices and Services to Support Data-Driven Applications . 
In 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS). 1815--1824 . M. AbdelBaky, M. Zou, A. R. Zamani, E. Renart, J. Diaz-Montes, and M. Parashar. 2017. Computing in the Continuum: Combining Pervasive Devices and Services to Support Data-Driven Applications. In 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS). 1815--1824."},{"key":"e_1_3_2_1_2_1","volume-title":"SAND: Towards High-Performance Serverless Computing. In 2018 USENIX Annual Technical Conference (USENIX ATC 18)","author":"Akkus Istemi Ekin","year":"2018","unstructured":"Istemi Ekin Akkus , Ruichuan Chen , Ivica Rimac , Manuel Stein , Klaus Satzke , Andre Beck , Paarijaat Aditya , and Volker Hilt . 2018 . SAND: Towards High-Performance Serverless Computing. In 2018 USENIX Annual Technical Conference (USENIX ATC 18) . USENIX Association, Boston, MA, 923--935. https:\/\/www.usenix.org\/conference\/atc18\/presentation\/akkus Istemi Ekin Akkus, Ruichuan Chen, Ivica Rimac, Manuel Stein, Klaus Satzke, Andre Beck, Paarijaat Aditya, and Volker Hilt. 2018. SAND: Towards High-Performance Serverless Computing. In 2018 USENIX Annual Technical Conference (USENIX ATC 18). USENIX Association, Boston, MA, 923--935. https:\/\/www.usenix.org\/conference\/atc18\/presentation\/akkus"},{"key":"e_1_3_2_1_3_1","unstructured":"Amazon 2018. Amazon Elastic Inference. https:\/\/aws.amazon.com\/machine-learning\/elastic-inference\/.  Amazon 2018. Amazon Elastic Inference. https:\/\/aws.amazon.com\/machine-learning\/elastic-inference\/."},{"key":"e_1_3_2_1_4_1","unstructured":"Amazon 2018. Amazon SageMaker. https:\/\/aws.amazon.com\/sagemaker\/.  Amazon 2018. Amazon SageMaker. https:\/\/aws.amazon.com\/sagemaker\/."},{"key":"e_1_3_2_1_5_1","unstructured":"Amazon 2018. Amazon SageMaker Neo. https:\/\/aws.amazon.com\/sagemaker\/neo\/.  Amazon 2018. Amazon SageMaker Neo. 
https:\/\/aws.amazon.com\/sagemaker\/neo\/."},{"key":"e_1_3_2_1_6_1","volume-title":"Performance Analysis of Cloud Applications. In 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18)","author":"Ardelean Dan","year":"2018","unstructured":"Dan Ardelean , Amer Diwan , and Chandra Erdman . 2018 . Performance Analysis of Cloud Applications. In 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18) . USENIX Association, Renton, WA, 405--417. https:\/\/www.usenix.org\/conference\/nsdi18\/presentation\/ardelean Dan Ardelean, Amer Diwan, and Chandra Erdman. 2018. Performance Analysis of Cloud Applications. In 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18). USENIX Association, Renton, WA, 405--417. https:\/\/www.usenix.org\/conference\/nsdi18\/presentation\/ardelean"},{"key":"e_1_3_2_1_7_1","unstructured":"Mohammed Attia Younes Samih Ali Elkahky and Laura Kallmeyer. 2018. Multilingual Multi-class Sentiment Classification Using Convolutional Neural Networks. Miyazaki Japan 635--640. http:\/\/www.lrec-conf.org\/proceedings\/lrec2018\/pdf\/149.pdf  Mohammed Attia Younes Samih Ali Elkahky and Laura Kallmeyer. 2018. Multilingual Multi-class Sentiment Classification Using Convolutional Neural Networks. Miyazaki Japan 635--640. http:\/\/www.lrec-conf.org\/proceedings\/lrec2018\/pdf\/149.pdf"},{"key":"e_1_3_2_1_8_1","unstructured":"AWS 2018. AWS Lambda. https:\/\/aws.amazon.com\/lambda\/.  AWS 2018. AWS Lambda. https:\/\/aws.amazon.com\/lambda\/."},{"key":"e_1_3_2_1_9_1","volume-title":"An Analysis of Deep Neural Network Models for Practical Applications. CoRR abs\/1605.07678","author":"Canziani Alfredo","year":"2016","unstructured":"Alfredo Canziani , Adam Paszke , and Eugenio Culurciello . 2016. An Analysis of Deep Neural Network Models for Practical Applications. CoRR abs\/1605.07678 ( 2016 ). 
arXiv:1605.07678 http:\/\/arxiv.org\/abs\/1605.07678 Alfredo Canziani, Adam Paszke, and Eugenio Culurciello. 2016. An Analysis of Deep Neural Network Models for Practical Applications. CoRR abs\/1605.07678 (2016). arXiv:1605.07678 http:\/\/arxiv.org\/abs\/1605.07678"},{"key":"e_1_3_2_1_10_1","volume-title":"TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)","author":"Chen Tianqi","year":"2018","unstructured":"Tianqi Chen , Thierry Moreau , Ziheng Jiang , Lianmin Zheng , Eddie Yan , Haichen Shen , Meghan Cowan , Leyuan Wang , Yuwei Hu , Luis Ceze , Carlos Guestrin , and Arvind Krishnamurthy . 2018 . TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18) . USENIX Association, Carlsbad, CA, 578--594. https:\/\/www.usenix.org\/conference\/osdi18\/presentation\/chen Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy. 2018. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). USENIX Association, Carlsbad, CA, 578--594. https:\/\/www.usenix.org\/conference\/osdi18\/presentation\/chen"},{"key":"e_1_3_2_1_11_1","volume-title":"Clipper: A Low-Latency Online Prediction Serving System. In 14th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2017","author":"Crankshaw Daniel","year":"2017","unstructured":"Daniel Crankshaw , Xin Wang , Giulio Zhou , Michael J. Franklin , Joseph E. Gonzalez , and Ion Stoica . 2017 . Clipper: A Low-Latency Online Prediction Serving System. In 14th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2017 , Boston, MA, USA , March 27-29, 2017. 613--627. 
https:\/\/www.usenix.org\/conference\/nsdi17\/technical-sessions\/presentation\/crankshaw Daniel Crankshaw, Xin Wang, Giulio Zhou, Michael J. Franklin, Joseph E. Gonzalez, and Ion Stoica. 2017. Clipper: A Low-Latency Online Prediction Serving System. In 14th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2017, Boston, MA, USA, March 27-29, 2017. 613--627. https:\/\/www.usenix.org\/conference\/nsdi17\/technical-sessions\/presentation\/crankshaw"},{"key":"e_1_3_2_1_12_1","volume-title":"Jinjun Xiong, and Wen-Mei W. Hwu.","author":"Dakkak Abdul","year":"2018","unstructured":"Abdul Dakkak , Cheng Li , Simon Garcia De Gonzalo , Jinjun Xiong, and Wen-Mei W. Hwu. 2018 . TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function as a Service Environments. CoRR abs\/1811.09732 (2018). arXiv:1811.09732 http:\/\/arxiv.org\/abs\/1811.09732 Abdul Dakkak, Cheng Li, Simon Garcia De Gonzalo, Jinjun Xiong, and Wen-Mei W. Hwu. 2018. TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function as a Service Environments. CoRR abs\/1811.09732 (2018). arXiv:1811.09732 http:\/\/arxiv.org\/abs\/1811.09732"},{"key":"e_1_3_2_1_13_1","unstructured":"Facebook 2017. PyTorch. https:\/\/pytorch.org\/.  Facebook 2017. PyTorch. https:\/\/pytorch.org\/."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2018.00012"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2382553.2382556"},{"key":"e_1_3_2_1_16_1","unstructured":"Google 2018. Google Cloud Machine Learning Engine. https:\/\/cloud.google.com\/ml-engine\/.  Google 2018. Google Cloud Machine Learning Engine. https:\/\/cloud.google.com\/ml-engine\/."},{"key":"e_1_3_2_1_17_1","unstructured":"Google 2018. Google Compute Engine Pricing. https:\/\/cloud.google.com\/compute\/pricing.  Google 2018. Google Compute Engine Pricing. 
https:\/\/cloud.google.com\/compute\/pricing."},{"key":"e_1_3_2_1_18_1","unstructured":"Google 2018. TensorFlow - An open source machine learning framework for everyone. https:\/\/www.tensorflow.org.  Google 2018. TensorFlow - An open source machine learning framework for everyone. https:\/\/www.tensorflow.org."},{"key":"e_1_3_2_1_19_1","unstructured":"Google 2018. TensorFlow Serving for model deployment in production. https:\/\/www.tensorflow.org\/serving\/.  Google 2018. TensorFlow Serving for model deployment in production. https:\/\/www.tensorflow.org\/serving\/."},{"key":"e_1_3_2_1_20_1","volume-title":"Tiresias: A GPU Cluster Manager for Distributed Deep Learning. In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19)","author":"Gu Juncheng","year":"2019","unstructured":"Juncheng Gu , Mosharaf Chowdhury , Kang G. Shin , Yibo Zhu , Myeongjae Jeon , Junjie Qian , Hongqiang Liu , and Chuanxiong Guo . 2019 . Tiresias: A GPU Cluster Manager for Distributed Deep Learning. In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19) . USENIX Association, Boston, MA, 485--500. https:\/\/www.usenix.org\/conference\/nsdi19\/presentation\/gu Juncheng Gu, Mosharaf Chowdhury, Kang G. Shin, Yibo Zhu, Myeongjae Jeon, Junjie Qian, Hongqiang Liu, and Chuanxiong Guo. 2019. Tiresias: A GPU Cluster Manager for Distributed Deep Learning. In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19). USENIX Association, Boston, MA, 485--500. https:\/\/www.usenix.org\/conference\/nsdi19\/presentation\/gu"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3135974.3135993"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3135974.3135993"},{"key":"e_1_3_2_1_23_1","unstructured":"Song Han Jeff Pool John Tran and William Dally. 2015. Learning both weights and connections for efficient neural network. In Advances in neural information processing systems. 1135--1143.   
Song Han Jeff Pool John Tran and William Dally. 2015. Learning both weights and connections for efficient neural network. In Advances in neural information processing systems. 1135--1143."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2018.00059"},{"volume-title":"Multi-Scale Dense Networks for Resource Efficient Image Classification. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=Hk2aImxAb","author":"Huang Gao","key":"e_1_3_2_1_25_1","unstructured":"Gao Huang , Danlu Chen , Tianhong Li , Felix Wu , Laurens van der Maaten, and Kilian Weinberger. 2018 . Multi-Scale Dense Networks for Resource Efficient Image Classification. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=Hk2aImxAb Gao Huang, Danlu Chen, Tianhong Li, Felix Wu, Laurens van der Maaten, and Kilian Weinberger. 2018. Multi-Scale Dense Networks for Resource Efficient Image Classification. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=Hk2aImxAb"},{"key":"e_1_3_2_1_26_1","volume-title":"Squeezenet: Alexnet-level accuracy with 50x fewer parameters and < 0.5 mb model size","author":"Iandola Forrest N","year":"2016","unstructured":"Forrest N Iandola , Song Han , Matthew W Moskewicz , Khalid Ashraf , William J Dally , and Kurt Keutzer . 2016 . Squeezenet: Alexnet-level accuracy with 50x fewer parameters and < 0.5 mb model size. arXiv preprint arXiv:1602.07360 (2016). Forrest N Iandola, Song Han, Matthew W Moskewicz, Khalid Ashraf, William J Dally, and Kurt Keutzer. 2016. Squeezenet: Alexnet-level accuracy with 50x fewer parameters and < 0.5 mb model size. arXiv preprint arXiv:1602.07360 (2016)."},{"key":"e_1_3_2_1_27_1","unstructured":"Intel 2018. Intel Nervana Neural Network Processor. https:\/\/ai.intel.com\/nervana-nnp\/.  Intel 2018. Intel Nervana Neural Network Processor. 
https:\/\/ai.intel.com\/nervana-nnp\/."},{"key":"e_1_3_2_1_28_1","volume-title":"Dynamic Space-Time Scheduling for GPU Inference. In LearningSys Workshop at Neural Information Processing Systems","author":"Jain Paras","year":"2018","unstructured":"Paras Jain , Xiangxi Mo , Ajay Jain , Harikaran Subbaraj , Rehan Durrani , Alexey Tumanov , Joseph Gonzalez , and Ion Stoica . 2018 . Dynamic Space-Time Scheduling for GPU Inference. In LearningSys Workshop at Neural Information Processing Systems 2018. Paras Jain, Xiangxi Mo, Ajay Jain, Harikaran Subbaraj, Rehan Durrani, Alexey Tumanov, Joseph Gonzalez, and Ion Stoica. 2018. Dynamic Space-Time Scheduling for GPU Inference. In LearningSys Workshop at Neural Information Processing Systems 2018."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3230543.3230574"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3127479.3128601"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080246"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3037697.3037698"},{"key":"e_1_3_2_1_33_1","volume-title":"LIT: Block-wise Intermediate Representation Training for Model Compression. CoRR abs\/1810.01937","author":"Koratana Animesh","year":"2018","unstructured":"Animesh Koratana , Daniel Kang , Peter Bailis , and Matei Zaharia . 2018 . LIT: Block-wise Intermediate Representation Training for Model Compression. CoRR abs\/1810.01937 (2018). arXiv:1810.01937 http:\/\/arxiv.org\/abs\/1810.01937 Animesh Koratana, Daniel Kang, Peter Bailis, and Matei Zaharia. 2018. LIT: Block-wise Intermediate Representation Training for Model Compression. CoRR abs\/1810.01937 (2018). arXiv:1810.01937 http:\/\/arxiv.org\/abs\/1810.01937"},{"key":"e_1_3_2_1_34_1","volume-title":"Flex-point: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks. 
CoRR abs\/1711.02213","author":"K\u00f6ster Urs","year":"2017","unstructured":"Urs K\u00f6ster , Tristan Webb , Xin Wang , Marcel Nassar , Arjun K. Bansal , William Constable , Oguz Elibol , Stewart Hall , Luke Hornof , Amir Khosrowshahi , Carey Kloss , Ruby J. Pai , and Naveen Rao . 2017 . Flex-point: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks. CoRR abs\/1711.02213 (2017). arXiv:1711.02213 http:\/\/arxiv.org\/abs\/1711.02213 Urs K\u00f6ster, Tristan Webb, Xin Wang, Marcel Nassar, Arjun K. Bansal, William Constable, Oguz Elibol, Stewart Hall, Luke Hornof, Amir Khosrowshahi, Carey Kloss, Ruby J. Pai, and Naveen Rao. 2017. Flex-point: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks. CoRR abs\/1711.02213 (2017). arXiv:1711.02213 http:\/\/arxiv.org\/abs\/1711.02213"},{"key":"e_1_3_2_1_35_1","unstructured":"Microsoft 2018. Azure Machine Learning. https:\/\/docs.microsoft.com\/en-us\/azure\/machine-learning\/.  Microsoft 2018. Azure Machine Learning. https:\/\/docs.microsoft.com\/en-us\/azure\/machine-learning\/."},{"key":"e_1_3_2_1_36_1","volume-title":"Jordan","author":"Moritz Philipp","year":"2015","unstructured":"Philipp Moritz , Robert Nishihara , Ion Stoica , and Michael I . Jordan . 2015 . SparkNet: Training Deep Networks in Spark. CoRR abs\/1511.06051 (2015). arXiv:1511.06051 http:\/\/arxiv.org\/abs\/1511.06051 Philipp Moritz, Robert Nishihara, Ion Stoica, and Michael I. Jordan. 2015. SparkNet: Training Deep Networks in Spark. CoRR abs\/1511.06051 (2015). arXiv:1511.06051 http:\/\/arxiv.org\/abs\/1511.06051"},{"key":"e_1_3_2_1_37_1","unstructured":"MXNet 2017. Apache MXNet (Incubating) - A flexible and efficient library for deep learning. https:\/\/mxnet.apache.org\/.  MXNet 2017. Apache MXNet (Incubating) - A flexible and efficient library for deep learning. 
https:\/\/mxnet.apache.org\/."},{"volume-title":"Proceedings of the 24th International Conference on Neural Information Processing Systems (NIPS'11)","author":"Niu Feng","key":"e_1_3_2_1_38_1","unstructured":"Feng Niu , Benjamin Recht , Christopher Re , and Stephen J. Wright . 2011. HOGWILD!: A Lock-free Approach to Parallelizing Stochastic Gradient Descent . In Proceedings of the 24th International Conference on Neural Information Processing Systems (NIPS'11) . Curran Associates Inc., USA, 693--701. http:\/\/dl.acm.org\/citation.cfm?id=2986459.2986537 Feng Niu, Benjamin Recht, Christopher Re, and Stephen J. Wright. 2011. HOGWILD!: A Lock-free Approach to Parallelizing Stochastic Gradient Descent. In Proceedings of the 24th International Conference on Neural Information Processing Systems (NIPS'11). Curran Associates Inc., USA, 693--701. http:\/\/dl.acm.org\/citation.cfm?id=2986459.2986537"},{"key":"e_1_3_2_1_39_1","unstructured":"NVIDIA 2017. NVIDIA Tesla V100 Tensor Core GPU. https:\/\/www.nvidia.com\/en-us\/data-center\/tesla-v100\/.  NVIDIA 2017. NVIDIA Tesla V100 Tensor Core GPU. https:\/\/www.nvidia.com\/en-us\/data-center\/tesla-v100\/."},{"key":"e_1_3_2_1_40_1","unstructured":"NVIDIA 2018. NVIDIA TensorRT Inference Server. https:\/\/github.com\/NVIDIA\/tensorrt-inference-server.  NVIDIA 2018. NVIDIA TensorRT Inference Server. https:\/\/github.com\/NVIDIA\/tensorrt-inference-server."},{"key":"e_1_3_2_1_41_1","unstructured":"NVIDIA 2018. NVIDIA TensorRT: Programmable Inference Accelerator. https:\/\/developer.nvidia.com\/tensorrt.  NVIDIA 2018. NVIDIA TensorRT: Programmable Inference Accelerator. https:\/\/developer.nvidia.com\/tensorrt."},{"key":"e_1_3_2_1_42_1","volume-title":"SOCK: Rapid Task Provisioning with Serverless-Optimized Containers. 
In 2018 USENIX Annual Technical Conference (USENIX ATC 18)","author":"Oakes Edward","year":"2018","unstructured":"Edward Oakes , Leon Yang , Dennis Zhou , Kevin Houck , Tyler Harter , Andrea Arpaci-Dusseau , and Remzi Arpaci-Dusseau . 2018 . SOCK: Rapid Task Provisioning with Serverless-Optimized Containers. In 2018 USENIX Annual Technical Conference (USENIX ATC 18) . USENIX Association, Boston, MA, 57--70. https:\/\/www.usenix.org\/conference\/atc18\/presentation\/oakes Edward Oakes, Leon Yang, Dennis Zhou, Kevin Houck, Tyler Harter, Andrea Arpaci-Dusseau, and Remzi Arpaci-Dusseau. 2018. SOCK: Rapid Task Provisioning with Serverless-Optimized Containers. In 2018 USENIX Annual Technical Conference (USENIX ATC 18). USENIX Association, Boston, MA, 57--70. https:\/\/www.usenix.org\/conference\/atc18\/presentation\/oakes"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3243176.3243180"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2017.9"},{"key":"e_1_3_2_1_45_1","article-title":"CoCoA: A General Framework for Communication-Efficient Distributed Optimization","volume":"18","author":"Smith Virginia","year":"2017","unstructured":"Virginia Smith , Simone Forte , Chenxin Ma , Martin Tak\u00e1c , Michael I. Jordan , and Martin Jaggi . 2017 . CoCoA: A General Framework for Communication-Efficient Distributed Optimization . Journal of Machine Learning Research 18 (2017), 230:1--230:49. http:\/\/jmlr.org\/papers\/v18\/papers\/v18\/16-512.html Virginia Smith, Simone Forte, Chenxin Ma, Martin Tak\u00e1c, Michael I. Jordan, and Martin Jaggi. 2017. CoCoA: A General Framework for Communication-Efficient Distributed Optimization. Journal of Machine Learning Research 18 (2017), 230:1--230:49. 
http:\/\/jmlr.org\/papers\/v18\/papers\/v18\/16-512.html","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/2908080.2908105"},{"key":"e_1_3_2_1_47_1","volume-title":"19th Annual Conference of the International Speech Communication Association","author":"Velikovich Leonid","year":"2018","unstructured":"Leonid Velikovich , Ian Williams , Justin Scheiner , Petar S. Aleksic , Pedro J. Moreno , and Michael Riley . 2018 . Semantic Lattice Processing in Contextual Automatic Speech Recognition for Google Assistant. In Interspeech 2018 , 19th Annual Conference of the International Speech Communication Association , Hyderabad, India , 2-6 September 2018. 2222--2226. Leonid Velikovich, Ian Williams, Justin Scheiner, Petar S. Aleksic, Pedro J. Moreno, and Michael Riley. 2018. Semantic Lattice Processing in Contextual Automatic Speech Recognition for Google Assistant. In Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2-6 September 2018. 2222--2226."},{"key":"e_1_3_2_1_48_1","volume-title":"Peeking Behind the Curtains of Serverless Platforms. In 2018 USENIX Annual Technical Conference (USENIX ATC 18)","author":"Wang Liang","year":"2018","unstructured":"Liang Wang , Mengyuan Li , Yinqian Zhang , Thomas Ristenpart , and Michael Swift . 2018 . Peeking Behind the Curtains of Serverless Platforms. In 2018 USENIX Annual Technical Conference (USENIX ATC 18) . USENIX Association, Boston, MA, 133--146. https:\/\/www.usenix.org\/conference\/atc18\/presentation\/wang-liang Liang Wang, Mengyuan Li, Yinqian Zhang, Thomas Ristenpart, and Michael Swift. 2018. Peeking Behind the Curtains of Serverless Platforms. In 2018 USENIX Annual Technical Conference (USENIX ATC 18). USENIX Association, Boston, MA, 133--146. 
https:\/\/www.usenix.org\/conference\/atc18\/presentation\/wang-liang"},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.14778\/3282495.3282499"},{"key":"e_1_3_2_1_50_1","volume-title":"Gandiva: Introspective Cluster Scheduling for Deep Learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)","author":"Xiao Wencong","year":"2018","unstructured":"Wencong Xiao , Romil Bhardwaj , Ramachandran Ramjee , Muthian Sivathanu , Nipun Kwatra , Zhenhua Han , Pratyush Patel , Xuan Peng , Hanyu Zhao , Quanlu Zhang , Fan Yang , and Lidong Zhou . 2018 . Gandiva: Introspective Cluster Scheduling for Deep Learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18) . USENIX Association, Carlsbad, CA, 595--610. https:\/\/www.usenix.org\/conference\/osdi18\/presentation\/xiao Wencong Xiao, Romil Bhardwaj, Ramachandran Ramjee, Muthian Sivathanu, Nipun Kwatra, Zhenhua Han, Pratyush Patel, Xuan Peng, Hanyu Zhao, Quanlu Zhang, Fan Yang, and Lidong Zhou. 2018. Gandiva: Introspective Cluster Scheduling for Deep Learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). USENIX Association, Carlsbad, CA, 595--610. https:\/\/www.usenix.org\/conference\/osdi18\/presentation\/xiao"},{"key":"e_1_3_2_1_51_1","unstructured":"Xilinx 2018. Accelerating DNNs with Xilinx Alveo Accelerator Cards. https:\/\/www.xilinx.com\/support\/documentation\/white_papers\/wp504-accel-dnns.pdf.  Xilinx 2018. Accelerating DNNs with Xilinx Alveo Accelerator Cards. https:\/\/www.xilinx.com\/support\/documentation\/white_papers\/wp504-accel-dnns.pdf."},{"key":"e_1_3_2_1_52_1","volume-title":"Cache Telepathy: Leveraging Shared Resource Attacks to Learn DNN Architectures. CoRR abs\/1808.04761","author":"Yan Mengjia","year":"2018","unstructured":"Mengjia Yan , Christopher W. Fletcher , and Josep Torrellas . 2018 . Cache Telepathy: Leveraging Shared Resource Attacks to Learn DNN Architectures. 
CoRR abs\/1808.04761 (2018). arXiv:1808.04761 http:\/\/arxiv.org\/abs\/1808.04761 Mengjia Yan, Christopher W. Fletcher, and Josep Torrellas. 2018. Cache Telepathy: Leveraging Shared Resource Attacks to Learn DNN Architectures. CoRR abs\/1808.04761 (2018). arXiv:1808.04761 http:\/\/arxiv.org\/abs\/1808.04761"}],"event":{"name":"HotOS '19: Workshop on Hot Topics in Operating Systems","sponsor":["SIGOPS ACM Special Interest Group on Operating Systems"],"location":"Bertinoro Italy","acronym":"HotOS '19"},"container-title":["Proceedings of the Workshop on Hot Topics in Operating Systems"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3317550.3321443","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3317550.3321443","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:02:27Z","timestamp":1750208547000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3317550.3321443"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,5,13]]},"references-count":52,"alternative-id":["10.1145\/3317550.3321443","10.1145\/3317550"],"URL":"https:\/\/doi.org\/10.1145\/3317550.3321443","relation":{},"subject":[],"published":{"date-parts":[[2019,5,13]]},"assertion":[{"value":"2019-05-13","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}