{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,24]],"date-time":"2025-08-24T01:32:49Z","timestamp":1755999169157,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":30,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,11,7]],"date-time":"2022-11-07T00:00:00Z","timestamp":1667779200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"IITP","award":["RS-2022-00144309"],"award-info":[{"award-number":["RS-2022-00144309"]}]},{"name":"Korean Government (MSIP)","award":["NRF-2022R1A5A7000765, NRF-2020R1A2C1102544"],"award-info":[{"award-number":["NRF-2022R1A5A7000765, NRF-2020R1A2C1102544"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,11,7]]},"DOI":"10.1145\/3565382.3565878","type":"proceedings-article","created":{"date-parts":[[2022,11,22]],"date-time":"2022-11-22T23:51:27Z","timestamp":1669161087000},"page":"1-6","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["All-you-can-inference"],"prefix":"10.1145","author":[{"given":"Subin","family":"Park","sequence":"first","affiliation":[{"name":"Computer Science. Kookmin Univ., Seoul, South Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jaeghang","family":"Choi","sequence":"additional","affiliation":[{"name":"Computer Science. Kookmin Univ., Seoul, South Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kyungyong","family":"Lee","sequence":"additional","affiliation":[{"name":"Computer Science. 
Kookmin Univ., Seoul, South Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,11,22]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"12th USENIX symposium on operating systems design and implementation (OSDI 16)","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. TensorFlow: a system for large-scale machine learning. In 12th USENIX symposium on operating systems design and implementation (OSDI 16). 265--283."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC41405.2020.00073"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133850.3133855"},{"key":"e_1_3_2_1_4_1","unstructured":"AWS Blog. [n. d.]. AWS Lambda - Container Image Support. https:\/\/aws.amazon.com\/blogs\/aws\/new-for-aws-lambda-container-image-support\/"},{"key":"e_1_3_2_1_5_1","volume-title":"last accessed","author":"Blog AWS","year":"2020","unstructured":"AWS Blog. last accessed April. 2020. AWS Lambda announces Provisioned Concurrency. https:\/\/aws.amazon.com\/about-aws\/whats-new\/2019\/12\/aws-lambda-announces-provisioned-concurrency\/"},
{"key":"e_1_3_2_1_6_1","volume-title":"Asian Conference on Machine Learning","author":"Cai Ermao","year":"2017","unstructured":"Ermao Cai, Da-Cheng Juan, Dimitrios Stamoulis, and Diana Marculescu. 2017. Neuralpower: Predict and deploy energy-efficient convolutional neural networks. Asian Conference on Machine Learning (2017)."},{"key":"e_1_3_2_1_7_1","volume-title":"MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. CoRR abs\/1512.01274","author":"Chen Tianqi","year":"2015","unstructured":"Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. CoRR abs\/1512.01274 (2015). arXiv:1512.01274 http:\/\/arxiv.org\/abs\/1512.01274"},{"key":"e_1_3_2_1_8_1","volume-title":"TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)","author":"Chen Tianqi","year":"2018","unstructured":"Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy. 2018. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). USENIX Association, Carlsbad, CA, 578--594. https:\/\/www.usenix.org\/conference\/osdi18\/presentation\/chen"},
{"key":"e_1_3_2_1_9_1","unstructured":"Google Cloud. [n. d.]. Cloud Functions Image Build. https:\/\/cloud.google.com\/functions\/docs\/building"},{"key":"e_1_3_2_1_10_1","unstructured":"ONNX Runtime developers. 2021. ONNX Runtime. https:\/\/onnxruntime.ai\/."},{"key":"e_1_3_2_1_11_1","unstructured":"BentoML Documents. [n. d.]. What is BentoML? https:\/\/docs.bentoml.org\/en\/latest\/"},{"key":"e_1_3_2_1_12_1","unstructured":"Azure Functions. [n. d.]. Create a function on Linux using a custom container. https:\/\/docs.microsoft.com\/en-us\/azure\/azure-functions\/functions-create-function-linux-custom-image"},{"key":"e_1_3_2_1_13_1","volume-title":"Serverless Computing: One Step Forward, Two Steps Back. In 9th Biennial Conference on Innovative Data Systems Research, CIDR","author":"Hellerstein Joseph M.","year":"2019","unstructured":"Joseph M. Hellerstein, Jose M. Faleiro, Joseph Gonzalez, Johann Schleier-Smith, Vikram Sreekanti, Alexey Tumanov, and Chenggang Wu. 2019. Serverless Computing: One Step Forward, Two Steps Back. In 9th Biennial Conference on Innovative Data Systems Research, CIDR 2019, Asilomar, CA, USA, January 13-16, 2019, Online Proceedings. www.cidrdb.org. http:\/\/cidrdb.org\/cidr2019\/papers\/p119-hellerstein-cidr19.pdf"},
{"key":"e_1_3_2_1_14_1","unstructured":"AWS What is new. [n. d.]. AWS Lambda support for Amazon Elastic File System now generally available. https:\/\/aws.amazon.com\/about-aws\/whats-new\/2020\/06\/aws-lambda-support-for-amazon-elastic-file-system-now-generally-\/"},{"key":"e_1_3_2_1_15_1","volume-title":"Serving deep learning models in a serverless platform. CoRR abs\/1710.08460","author":"Ishakian Vatche","year":"2017","unstructured":"Vatche Ishakian, Vinod Muthusamy, and Aleksander Slominski. 2017. Serving deep learning models in a serverless platform. CoRR abs\/1710.08460 (2017). arXiv:1710.08460 http:\/\/arxiv.org\/abs\/1710.08460"},
{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLOUD.2019.00091"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357223.3365439"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10586-020-03103-4"},{"key":"e_1_3_2_1_19_1","volume-title":"Network Resource Isolation in Serverless Cloud Function Service. In 2019 IEEE 4th International Workshops on Foundations and Applications of Self* Systems (FAS*W).","author":"Kim Jeongchul","year":"2019","unstructured":"Jeongchul Kim, Jungae Park, and Kyungyong Lee. 2019. Network Resource Isolation in Serverless Cloud Function Service. In 2019 IEEE 4th International Workshops on Foundations and Applications of Self* Systems (FAS*W)."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3491204.3543506"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLOUD.2018.00062"},{"volume-title":"Evaluating Concurrent Executions of Multiple Function-as-a-Service Runtimes with MicroVM. In 2020 IEEE 13th International Conference on Cloud Computing (CLOUD).","author":"Park J.","key":"e_1_3_2_1_22_1","unstructured":"J. Park, H. Kim, and K. Lee. 2020. Evaluating Concurrent Executions of Multiple Function-as-a-Service Runtimes with MicroVM. In 2020 IEEE 13th International Conference on Cloud Computing (CLOUD)."},{"key":"e_1_3_2_1_23_1","volume-title":"Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019)."},{"key":"e_1_3_2_1_24_1","volume-title":"Proceedings of the International Conference on Learning Representations.","author":"Qi Hang","year":"2017","unstructured":"Hang Qi, Evan R. Sparks, and Ameet Talwalkar. 2017. Paleo: A Performance Model for Deep Neural Networks. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3406011"},{"key":"e_1_3_2_1_26_1","volume-title":"2020 USENIX Annual Technical Conference (USENIX ATC 20)","author":"Shahrad Mohammad","year":"2020","unstructured":"Mohammad Shahrad, Rodrigo Fonseca, Inigo Goiri, Gohar Chaudhry, Paul Batum, Jason Cooke, Eduardo Laureano, Colby Tresness, Mark Russinovich, and Ricardo Bianchini. 2020. Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider. In 2020 USENIX Annual Technical Conference (USENIX ATC 20). USENIX Association, 205--218. https:\/\/www.usenix.org\/conference\/atc20\/presentation\/shahrad"},
{"key":"e_1_3_2_1_27_1","volume-title":"Peeking Behind the Curtains of Serverless Platforms. In 2018 USENIX Annual Technical Conference (USENIX ATC 18)","author":"Wang Liang","year":"2018","unstructured":"Liang Wang, Mengyuan Li, Yinqian Zhang, Thomas Ristenpart, and Michael Swift. 2018. Peeking Behind the Curtains of Serverless Platforms. In 2018 USENIX Annual Technical Conference (USENIX ATC 18). USENIX Association, Boston, MA, 133--146. https:\/\/www.usenix.org\/conference\/atc18\/presentation\/wang-liang"},{"key":"e_1_3_2_1_28_1","volume-title":"Habitat: A Runtime-Based Computational Performance Predictor for Deep Neural Network Training. In 2021 USENIX Annual Technical Conference (USENIX ATC 21)","author":"Yu Geoffrey X.","year":"2021","unstructured":"Geoffrey X. Yu, Yubo Gao, Pavel Golikov, and Gennady Pekhimenko. 2021. Habitat: A Runtime-Based Computational Performance Predictor for Deep Neural Network Training. In 2021 USENIX Annual Technical Conference (USENIX ATC 21). USENIX Association, 503--521. https:\/\/www.usenix.org\/conference\/atc21\/presentation\/yu"},
{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3419111.3421280"},{"key":"e_1_3_2_1_30_1","volume-title":"SLO-Aware Machine Learning Inference Serving. In 2019 USENIX Annual Technical Conference (USENIX ATC 19)","author":"Zhang Chengliang","year":"2019","unstructured":"Chengliang Zhang, Minchen Yu, Wei Wang, and Feng Yan. 2019. MArk: Exploiting Cloud Services for Cost-Effective, SLO-Aware Machine Learning Inference Serving. In 2019 USENIX Annual Technical Conference (USENIX ATC 19). USENIX Association, Renton, WA, 1049--1062. https:\/\/www.usenix.org\/conference\/atc19\/presentation\/zhang-chengliang"}],
"event":{"name":"Middleware '22: 23rd International Middleware Conference","sponsor":["ACM Association for Computing Machinery","USENIX Assoc USENIX Assoc","IFIP"],"location":"Quebec Quebec City Canada","acronym":"Middleware '22"},"container-title":["Proceedings of the Eighth International Workshop on Serverless Computing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3565382.3565878","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3565382.3565878","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:37:12Z","timestamp":1750178232000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3565382.3565878"}},"subtitle":["serverless DNN model inference suite"],"short-title":[],"issued":{"date-parts":[[2022,11,7]]},"references-count":30,"alternative-id":["10.1145\/3565382.3565878","10.1145\/3565382"],"URL":"https:\/\/doi.org\/10.1145\/3565382.3565878","relation":{},"subject":[],"published":{"date-parts":[[2022,11,7]]},"assertion":[{"value":"2022-11-22","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}