{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T13:04:34Z","timestamp":1775912674767,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":66,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,5,18]],"date-time":"2021-05-18T00:00:00Z","timestamp":1621296000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"JUMP, a Semiconductor Research Corporation (SRC) program sponsored by DARPA"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,5,18]]},"DOI":"10.1145\/3450268.3453521","type":"proceedings-article","created":{"date-parts":[[2021,5,18]],"date-time":"2021-05-18T19:13:35Z","timestamp":1621365215000},"page":"80-92","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":19,"title":["Rim"],"prefix":"10.1145","author":[{"given":"Yitao","family":"Hu","sequence":"first","affiliation":[{"name":"University of Southern California"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Weiwu","family":"Pang","sequence":"additional","affiliation":[{"name":"University of Southern California"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaochen","family":"Liu","sequence":"additional","affiliation":[{"name":"University of Southern California"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rajrup","family":"Ghosh","sequence":"additional","affiliation":[{"name":"University of Southern California"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bongjun","family":"Ko","sequence":"additional","affiliation":[{"name":"IBM Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wei-Han","family":"Lee","sequence":"additional","affiliation":[{"name":"IBM 
Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ramesh","family":"Govindan","sequence":"additional","affiliation":[{"name":"University of Southern California"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,5,18]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"Apple: Apple's HTTP Live Streaming. https:\/\/developer.apple.com\/streaming\/.  Apple: Apple's HTTP Live Streaming. https:\/\/developer.apple.com\/streaming\/."},{"key":"e_1_3_2_1_2_1","unstructured":"AWS re:Invent 2018 Keynote. https:\/\/www.youtube.com\/watch?v=ZOIkOnW640A&ab_channel=AmazonWebServices.  AWS re:Invent 2018 Keynote. https:\/\/www.youtube.com\/watch?v=ZOIkOnW640A&ab_channel=AmazonWebServices."},{"key":"e_1_3_2_1_3_1","unstructured":"Edge Computing Market Size Share and Trends Analysis. https:\/\/www.grandviewresearch.com\/industry-analysis\/edge-computing-market.  Edge Computing Market Size Share and Trends Analysis. https:\/\/www.grandviewresearch.com\/industry-analysis\/edge-computing-market."},{"key":"e_1_3_2_1_4_1","unstructured":"Edge Computing Market Worth $43.4 Billion By 2027. https:\/\/www.grandviewresearch.com\/press-release\/global-edge-computing-market.  Edge Computing Market Worth $43.4 Billion By 2027. https:\/\/www.grandviewresearch.com\/press-release\/global-edge-computing-market."},{"key":"e_1_3_2_1_5_1","unstructured":"Multi-person Real-time Action Recognition Based-on Human Skeleton. https:\/\/github.com\/felixchenfy\/Realtime-Action-Recognition.  Multi-person Real-time Action Recognition Based-on Human Skeleton. https:\/\/github.com\/felixchenfy\/Realtime-Action-Recognition."},{"key":"e_1_3_2_1_6_1","unstructured":"NVIDIA System Management Interface. https:\/\/developer.nvidia.com\/nvidia-system-management-interface.  NVIDIA System Management Interface. 
https:\/\/developer.nvidia.com\/nvidia-system-management-interface."},{"key":"e_1_3_2_1_7_1","unstructured":"The Low Latency Live Streaming Landscape in 2019. https:\/\/mux.com\/blog\/the-low-latency-live-streaming-landscape-in-2019\/.  The Low Latency Live Streaming Landscape in 2019. https:\/\/mux.com\/blog\/the-low-latency-live-streaming-landscape-in-2019\/."},{"key":"e_1_3_2_1_8_1","unstructured":"The NVIDIA EGX Platform for Edge Computing. https:\/\/www.nvidia.com\/en-us\/datacenter\/products\/egx-edge-computing\/.  The NVIDIA EGX Platform for Edge Computing. https:\/\/www.nvidia.com\/en-us\/datacenter\/products\/egx-edge-computing\/."},{"key":"e_1_3_2_1_9_1","unstructured":"Video Conferencing Network Requirements. https:\/\/www.videonations.co.uk\/resources\/video-conferencing-news\/video-conferencing-network-requirements\/.  Video Conferencing Network Requirements. https:\/\/www.videonations.co.uk\/resources\/video-conferencing-news\/video-conferencing-network-requirements\/."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2189750.2150984"},{"key":"e_1_3_2_1_11_1","volume-title":"Proceedings of the Conference of the ACM Special Interest Group on Data Communication, SIGCOMM '18","author":"Akhtar Zahaib","year":"2018","unstructured":"Zahaib Akhtar , Yun S. Nam , Ramesh Govindan , Sanjay Rao , Jessica Chen , Ethan Katz-Bassett , Bruno M. Ribeiro , Jibin Zhan , and Hui Zhang . Oboe : Auto-Tuning Video ABR Algorithms to Network Conditions . In Proceedings of the Conference of the ACM Special Interest Group on Data Communication, SIGCOMM '18 , 2018 . Zahaib Akhtar, Yun S. Nam, Ramesh Govindan, Sanjay Rao, Jessica Chen, Ethan Katz-Bassett, Bruno M. Ribeiro, Jibin Zhan, and Hui Zhang. Oboe: Auto-Tuning Video ABR Algorithms to Network Conditions. In Proceedings of the Conference of the ACM Special Interest Group on Data Communication, SIGCOMM '18, 2018."},{"key":"e_1_3_2_1_12_1","unstructured":"https:\/\/aws.amazon.com\/ 2020.  
https:\/\/aws.amazon.com\/ 2020."},{"key":"e_1_3_2_1_13_1","first-page":"173","volume-title":"International conference on machine learning","author":"Amodei Dario","year":"2016","unstructured":"Dario Amodei , Sundaram Ananthanarayanan , Rishita Anubhai , Jingliang Bai , Eric Battenberg , Carl Case , Jared Casper , Bryan Catanzaro , Qiang Cheng , Guoliang Chen , Deep speech 2: End-to-end speech recognition in english and mandarin . In International conference on machine learning , pages 173 -- 182 , 2016 . Dario Amodei, Sundaram Ananthanarayanan, Rishita Anubhai, Jingliang Bai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Qiang Cheng, Guoliang Chen, et al. Deep speech 2: End-to-end speech recognition in english and mandarin. In International conference on machine learning, pages 173--182, 2016."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2019.2929257"},{"key":"e_1_3_2_1_15_1","volume-title":"A survey of model compression and acceleration for deep neural networks. CoRR, abs\/1710.09282","author":"Cheng Yu","year":"2017","unstructured":"Yu Cheng , Duo Wang , Pan Zhou , and Tao Zhang . A survey of model compression and acceleration for deep neural networks. CoRR, abs\/1710.09282 , 2017 . Yu Cheng, Duo Wang, Pan Zhou, and Tao Zhang. A survey of model compression and acceleration for deep neural networks. CoRR, abs\/1710.09282, 2017."},{"key":"e_1_3_2_1_16_1","volume-title":"Wav2letter: an end-to-end convnet-based speech recognition system. arXiv preprint arXiv:1609.03193","author":"Collobert Ronan","year":"2016","unstructured":"Ronan Collobert , Christian Puhrsch , and Gabriel Synnaeve . Wav2letter: an end-to-end convnet-based speech recognition system. arXiv preprint arXiv:1609.03193 , 2016 . Ronan Collobert, Christian Puhrsch, and Gabriel Synnaeve. Wav2letter: an end-to-end convnet-based speech recognition system. 
arXiv preprint arXiv:1609.03193, 2016."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/IoTDI49375.2020.00029"},{"key":"e_1_3_2_1_18_1","volume-title":"Inferline: Ml inference pipeline composition framework. arXiv preprint arXiv:1812.01776","author":"Crankshaw Daniel","year":"2018","unstructured":"Daniel Crankshaw , Gur-Eyal Sela , Corey Zumar , Xiangxi Mo , Joseph E Gonzalez , Ion Stoica , and Alexey Tumanov . Inferline: Ml inference pipeline composition framework. arXiv preprint arXiv:1812.01776 , 2018 . Daniel Crankshaw, Gur-Eyal Sela, Corey Zumar, Xiangxi Mo, Joseph E Gonzalez, Ion Stoica, and Alexey Tumanov. Inferline: Ml inference pipeline composition framework. arXiv preprint arXiv:1812.01776, 2018."},{"key":"e_1_3_2_1_19_1","first-page":"613","volume-title":"14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17)","author":"Crankshaw Daniel","year":"2017","unstructured":"Daniel Crankshaw , Xin Wang , Guilio Zhou , Michael J. Franklin , Joseph E. Gonzalez , and Ion Stoica . Clipper : A low-latency online prediction serving system . In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17) , pages 613 -- 627 , Boston, MA , March 2017 . USENIX Association. Daniel Crankshaw, Xin Wang, Guilio Zhou, Michael J. Franklin, Joseph E. Gonzalez, and Ion Stoica. Clipper: A low-latency online prediction serving system. In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17), pages 613--627, Boston, MA, March 2017. USENIX Association."},{"key":"e_1_3_2_1_20_1","first-page":"4","volume-title":"Proceedings of the Eleventh EuroSys Conference","author":"Cui Henggang","unstructured":"Henggang Cui , Hao Zhang , Gregory R Ganger , Phillip B Gibbons , and Eric P Xing . Geeps : Scalable deep learning on distributed gpus with a gpu-specialized parameter server . In Proceedings of the Eleventh EuroSys Conference , page 4 . ACM, 2016. 
Henggang Cui, Hao Zhang, Gregory R Ganger, Phillip B Gibbons, and Eric P Xing. Geeps: Scalable deep learning on distributed gpus with a gpu-specialized parameter server. In Proceedings of the Eleventh EuroSys Conference, page 4. ACM, 2016."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1327452.1327492"},{"key":"e_1_3_2_1_22_1","unstructured":"TensorFlow Documentation. Savedmodel warmup. https:\/\/www.tensorflow.org\/tfx\/serving\/saved_model_warmup.  TensorFlow Documentation. Savedmodel warmup. https:\/\/www.tensorflow.org\/tfx\/serving\/saved_model_warmup."},{"key":"e_1_3_2_1_23_1","volume-title":"Joint 3d face reconstruction and dense alignment with position map regression network. CoRR, abs\/1803.07835","author":"Feng Yao","year":"2018","unstructured":"Yao Feng , Fan Wu , Xiaohu Shao , Yanfeng Wang , and Xi Zhou . Joint 3d face reconstruction and dense alignment with position map regression network. CoRR, abs\/1803.07835 , 2018 . Yao Feng, Fan Wu, Xiaohu Shao, Yanfeng Wang, and Xi Zhou. Joint 3d face reconstruction and dense alignment with position map regression network. CoRR, abs\/1803.07835, 2018."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/945445.945450"},{"key":"e_1_3_2_1_25_1","unstructured":"https:\/\/cloud.google.com\/ 2020.  https:\/\/cloud.google.com\/ 2020."},{"key":"e_1_3_2_1_26_1","volume-title":"CUDA Pro Tip: Understand Fat Binaries and JIT Caching. https:\/\/devblogs.nvidia.com\/cuda-pro-tip-understand-fat-binaries-jit-caching\/","author":"Harris Mark","year":"2013","unstructured":"Mark Harris . CUDA Pro Tip: Understand Fat Binaries and JIT Caching. https:\/\/devblogs.nvidia.com\/cuda-pro-tip-understand-fat-binaries-jit-caching\/ , 2013 . Mark Harris. CUDA Pro Tip: Understand Fat Binaries and JIT Caching. 
https:\/\/devblogs.nvidia.com\/cuda-pro-tip-understand-fat-binaries-jit-caching\/, 2013."},{"key":"e_1_3_2_1_27_1","first-page":"41","volume-title":"Proceedings of the Fourteenth EuroSys Conference","author":"Holmes Connor","unstructured":"Connor Holmes , Daniel Mawhirter , Yuxiong He , Feng Yan , and Bo Wu. Grnn : Low-latency and scalable rnn inference on gpus . In Proceedings of the Fourteenth EuroSys Conference , page 41 . ACM, 2019. Connor Holmes, Daniel Mawhirter, Yuxiong He, Feng Yan, and Bo Wu. Grnn: Low-latency and scalable rnn inference on gpus. In Proceedings of the Fourteenth EuroSys Conference, page 41. ACM, 2019."},{"key":"e_1_3_2_1_28_1","volume-title":"Mobilenets: Efficient convolutional neural networks for mobile vision applications. CoRR, abs\/1704.04861","author":"Howard Andrew G.","year":"2017","unstructured":"Andrew G. Howard , Menglong Zhu , Bo Chen , Dmitry Kalenichenko , Weijun Wang , Tobias Weyand , Marco Andreetto , and Hartwig Adam . Mobilenets: Efficient convolutional neural networks for mobile vision applications. CoRR, abs\/1704.04861 , 2017 . Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. Mobilenets: Efficient convolutional neural networks for mobile vision applications. CoRR, abs\/1704.04861, 2017."},{"key":"e_1_3_2_1_29_1","first-page":"269","volume-title":"13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)","author":"Hsieh Kevin","year":"2018","unstructured":"Kevin Hsieh , Ganesh Ananthanarayanan , Peter Bodik , Shivaram Venkataraman , Paramvir Bahl , Matthai Philipose , Phillip B. Gibbons , and Onur Mutlu . Focus : Querying large video datasets with low latency and low cost . In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18) , pages 269 -- 286 , Carlsbad, CA , October 2018 . USENIX Association. 
Kevin Hsieh, Ganesh Ananthanarayanan, Peter Bodik, Shivaram Venkataraman, Paramvir Bahl, Matthai Philipose, Phillip B. Gibbons, and Onur Mutlu. Focus: Querying large video datasets with low latency and low cost. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), pages 269--286, Carlsbad, CA, October 2018. USENIX Association."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/IoTDI49375.2020.00023"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3072959.3092817"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3274808.3274813"},{"key":"e_1_3_2_1_33_1","volume-title":"Squeezenet: Alexnet-level accuracy with 50x fewer parameters and &lt;1mb model size. CoRR, abs\/1602.07360","author":"Iandola Forrest N.","year":"2016","unstructured":"Forrest N. Iandola , Matthew W. Moskewicz , Khalid Ashraf , Song Han , William J. Dally , and Kurt Keutzer . Squeezenet: Alexnet-level accuracy with 50x fewer parameters and &lt;1mb model size. CoRR, abs\/1602.07360 , 2016 . Forrest N. Iandola, Matthew W. Moskewicz, Khalid Ashraf, Song Han, William J. Dally, and Kurt Keutzer. Squeezenet: Alexnet-level accuracy with 50x fewer parameters and &lt;1mb model size. CoRR, abs\/1602.07360, 2016."},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1629575.1629601"},{"key":"e_1_3_2_1_35_1","volume-title":"The ooo vliw jit compiler for gpu inference. arXiv preprint arXiv:1901.10008","author":"Jain Paras","year":"2019","unstructured":"Paras Jain , Xiangxi Mo , Ajay Jain , Alexey Tumanov , Joseph E Gonzalez , and Ion Stoica . The ooo vliw jit compiler for gpu inference. arXiv preprint arXiv:1901.10008 , 2019 . Paras Jain, Xiangxi Mo, Ajay Jain, Alexey Tumanov, Joseph E Gonzalez, and Ion Stoica. The ooo vliw jit compiler for gpu inference. 
arXiv preprint arXiv:1901.10008, 2019."},{"key":"e_1_3_2_1_36_1","unstructured":"https:\/\/nvidia.github.io\/OpenSeq2Seq\/html\/speech-recognition\/jasper.html 2019.  https:\/\/nvidia.github.io\/OpenSeq2Seq\/html\/speech-recognition\/jasper.html 2019."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080246"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/354871.354874"},{"key":"e_1_3_2_1_39_1","first-page":"1097","volume-title":"Advances in neural information processing systems","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky , Ilya Sutskever , and Geoffrey E Hinton . Imagenet classification with deep convolutional neural networks . In Advances in neural information processing systems , pages 1097 -- 1105 , 2012 . Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pages 1097--1105, 2012."},{"key":"e_1_3_2_1_40_1","first-page":"611","volume-title":"13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)","author":"Lee Yunseong","year":"2018","unstructured":"Yunseong Lee , Alberto Scolari , Byung-Gon Chun , Marco Domenico Santambrogio , Markus Weimer , and Matteo Interlandi . PRETZEL : Opening the black box of machine learning prediction serving systems . In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18) , pages 611 -- 626 , Carlsbad, CA , October 2018 . USENIX Association. Yunseong Lee, Alberto Scolari, Byung-Gon Chun, Marco Domenico Santambrogio, Markus Weimer, and Matteo Interlandi. PRETZEL: Opening the black box of machine learning prediction serving systems. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), pages 611--626, Carlsbad, CA, October 2018. 
USENIX Association."},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"e_1_3_2_1_42_1","volume-title":"Proc. ACM Sensys","author":"Liu X.","year":"2019","unstructured":"X. Liu , P. Ghosh , O. Ulutan , K. Chan , B. S. Manjunath , and R. Govindan . Caesar: Cross-Camera Complex Activity Detection . In Proc. ACM Sensys , 2019 . X. Liu, P. Ghosh, O. Ulutan, K. Chan, B. S.Manjunath, and R. Govindan. Caesar: Cross-Camera Complex Activity Detection. In Proc. ACM Sensys, 2019."},{"key":"e_1_3_2_1_43_1","volume-title":"Mohammad Alizadeh. Neural Adaptive Video Streaming with Pensieve. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, SIGCOMM","author":"Mao Hongzi","year":"2017","unstructured":"Hongzi Mao , Ravi Netravali , and Mohammad Alizadeh. Neural Adaptive Video Streaming with Pensieve. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, SIGCOMM , 2017 . Hongzi Mao, Ravi Netravali, and Mohammad Alizadeh. Neural Adaptive Video Streaming with Pensieve. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, SIGCOMM, 2017."},{"key":"e_1_3_2_1_44_1","unstructured":"https:\/\/azure.microsoft.com\/en-us\/ 2020.  https:\/\/azure.microsoft.com\/en-us\/ 2020."},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3272127.3275075"},{"key":"e_1_3_2_1_46_1","unstructured":"https:\/\/developer.nvidia.com\/embedded\/jetson-tx2 2019.  https:\/\/developer.nvidia.com\/embedded\/jetson-tx2 2019."},{"key":"e_1_3_2_1_47_1","unstructured":"https:\/\/www.nvidia.com\/en-us\/autonomous-machines\/embedded-systems\/jetson-agx-xavier\/ 2020.  https:\/\/www.nvidia.com\/en-us\/autonomous-machines\/embedded-systems\/jetson-agx-xavier\/ 2020."},{"key":"e_1_3_2_1_48_1","unstructured":"https:\/\/nvidia.github.io\/OpenSeq2Seq\/html\/speech-recognition.html#models 2019.  
https:\/\/nvidia.github.io\/OpenSeq2Seq\/html\/speech-recognition.html#models 2019."},{"key":"e_1_3_2_1_49_1","unstructured":"https:\/\/developer.nvidia.com\/tensorrt 2019.  https:\/\/developer.nvidia.com\/tensorrt 2019."},{"key":"e_1_3_2_1_50_1","first-page":"3","volume-title":"Proceedings of the Thirteenth EuroSys Conference","author":"Peng Yanghua","unstructured":"Yanghua Peng , Yixin Bao , Yangrui Chen , Chuan Wu , and Chuanxiong Guo . Optimus : an efficient dynamic resource scheduler for deep learning clusters . In Proceedings of the Thirteenth EuroSys Conference , page 3 . ACM, 2018. Yanghua Peng, Yixin Bao, Yangrui Chen, Chuan Wu, and Chuanxiong Guo. Optimus: an efficient dynamic resource scheduler for deep learning clusters. In Proceedings of the Thirteenth EuroSys Conference, page 3. ACM, 2018."},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/1999995.2000000"},{"key":"e_1_3_2_1_52_1","volume-title":"Yolov3: an incremental improvement. arXiv","author":"Redmon Joseph","year":"2018","unstructured":"Joseph Redmon and Ali Farhadi . Yolov3: an incremental improvement. arXiv , 2018 . Joseph Redmon and Ali Farhadi. Yolov3: an incremental improvement. arXiv, 2018."},{"key":"e_1_3_2_1_53_1","volume-title":"Managed & model-less inference serving. arXiv preprint arXiv:1905.13348","author":"Romero Francisco","year":"2019","unstructured":"Francisco Romero , Qian Li , Neeraja J Yadwadkar , and Christos Kozyrakis . Infaas : Managed & model-less inference serving. arXiv preprint arXiv:1905.13348 , 2019 . Francisco Romero, Qian Li, Neeraja J Yadwadkar, and Christos Kozyrakis. Infaas: Managed & model-less inference serving. arXiv preprint arXiv:1905.13348, 2019."},{"key":"e_1_3_2_1_54_1","volume-title":"Numpywren: Serverless linear algebra. 
CoRR, abs\/1810.09679","author":"Shankar Vaishaal","year":"2018","unstructured":"Vaishaal Shankar , Karl Krauth , Qifan Pu , Eric Jonas , Shivaram Venkataraman , Ion Stoica , Benjamin Recht , and Jonathan Ragan-Kelley . Numpywren: Serverless linear algebra. CoRR, abs\/1810.09679 , 2018 . Vaishaal Shankar, Karl Krauth, Qifan Pu, Eric Jonas, Shivaram Venkataraman, Ion Stoica, Benjamin Recht, and Jonathan Ragan-Kelley. Numpywren: Serverless linear algebra. CoRR, abs\/1810.09679, 2018."},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/3341301.3359658"},{"key":"e_1_3_2_1_56_1","volume-title":"Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman . Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 , 2014 . Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014."},{"key":"e_1_3_2_1_57_1","unstructured":"https:\/\/www.stackpath.com\/products\/edge-computing\/ 2020.  https:\/\/www.stackpath.com\/products\/edge-computing\/ 2020."},{"key":"e_1_3_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.5555\/3298023.3298188"},{"key":"e_1_3_2_1_59_1","unstructured":"https:\/\/github.com\/tensorflow\/serving 2019.  https:\/\/github.com\/tensorflow\/serving 2019."},{"key":"e_1_3_2_1_60_1","volume-title":"Actor conditioned attention maps for video action detection. arXiv preprint arXiv:1812.11631","author":"Ulutan Oytun","year":"2018","unstructured":"Oytun Ulutan , Swati Rallapalli , Carlos Torres , Mudhakar Srivatsa , and BS Manjunath . Actor conditioned attention maps for video action detection. arXiv preprint arXiv:1812.11631 , 2018 . Oytun Ulutan, Swati Rallapalli, Carlos Torres, Mudhakar Srivatsa, and BS Manjunath. Actor conditioned attention maps for video action detection. 
arXiv preprint arXiv:1812.11631, 2018."},{"key":"e_1_3_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.515"},{"key":"e_1_3_2_1_62_1","unstructured":"https:\/\/enterprise.verizon.com\/business\/learn\/edge-computing\/5G-and-edge-computing\/ 2020.  https:\/\/enterprise.verizon.com\/business\/learn\/edge-computing\/5G-and-edge-computing\/ 2020."},{"key":"e_1_3_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1145\/2987443.2987453"},{"key":"e_1_3_2_1_64_1","volume-title":"Deep Cosine Metric Learning for Person Reidentification. CoRR, abs\/1812.00442","author":"Wojke Nicolai","year":"2018","unstructured":"Nicolai Wojke and Alex Bewley . Deep Cosine Metric Learning for Person Reidentification. CoRR, abs\/1812.00442 , 2018 . Nicolai Wojke and Alex Bewley. Deep Cosine Metric Learning for Person Reidentification. CoRR, abs\/1812.00442, 2018."},{"key":"e_1_3_2_1_65_1","first-page":"595","volume-title":"13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)","author":"Xiao Wencong","year":"2018","unstructured":"Wencong Xiao , Romil Bhardwaj , Ramachandran Ramjee , Muthian Sivathanu , Nipun Kwatra , Zhenhua Han , Pratyush Patel , Xuan Peng , Hanyu Zhao , Quanlu Zhang , Fan Yang , and Lidong Zhou . Gandiva : Introspective cluster scheduling for deep learning . In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18) , pages 595 -- 610 , Carlsbad, CA , October 2018 . USENIX Association. Wencong Xiao, Romil Bhardwaj, Ramachandran Ramjee, Muthian Sivathanu, Nipun Kwatra, Zhenhua Han, Pratyush Patel, Xuan Peng, Hanyu Zhao, Quanlu Zhang, Fan Yang, and Lidong Zhou. Gandiva: Introspective cluster scheduling for deep learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), pages 595--610, Carlsbad, CA, October 2018. 
USENIX Association."},{"key":"e_1_3_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/3302505.3310067"}],"event":{"name":"IoTDI '21: International Conference on Internet-of-Things Design and Implementation","location":"Charlottesville VA USA","acronym":"IoTDI '21","sponsor":["SIGBED ACM Special Interest Group on Embedded Systems","IEEE CS"]},"container-title":["Proceedings of the International Conference on Internet-of-Things Design and Implementation"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3450268.3453521","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3450268.3453521","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:46:59Z","timestamp":1750193219000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3450268.3453521"}},"subtitle":["Offloading Inference to the Edge"],"short-title":[],"issued":{"date-parts":[[2021,5,18]]},"references-count":66,"alternative-id":["10.1145\/3450268.3453521","10.1145\/3450268"],"URL":"https:\/\/doi.org\/10.1145\/3450268.3453521","relation":{},"subject":[],"published":{"date-parts":[[2021,5,18]]},"assertion":[{"value":"2021-05-18","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}