{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T01:53:19Z","timestamp":1773193999204,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":82,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,11,7]],"date-time":"2022-11-07T00:00:00Z","timestamp":1667779200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"United States Air Force Research Laboratory and the United States Air Force Artificial Intelligence Accelerator","award":["FA8750-19-2-1000"],"award-info":[{"award-number":["FA8750-19-2-1000"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,11,7]]},"DOI":"10.1145\/3542929.3563510","type":"proceedings-article","created":{"date-parts":[[2022,11,7]],"date-time":"2022-11-07T20:19:18Z","timestamp":1667852358000},"page":"173-189","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":65,"title":["MISO"],"prefix":"10.1145","author":[{"given":"Baolin","family":"Li","sequence":"first","affiliation":[{"name":"Northeastern University"}]},{"given":"Tirthak","family":"Patel","sequence":"additional","affiliation":[{"name":"Northeastern University"}]},{"given":"Siddharth","family":"Samsi","sequence":"additional","affiliation":[{"name":"MIT Lincoln Laboratory"}]},{"given":"Vijay","family":"Gadepally","sequence":"additional","affiliation":[{"name":"MIT Lincoln Laboratory"}]},{"given":"Devesh","family":"Tiwari","sequence":"additional","affiliation":[{"name":"Northeastern University"}]}],"member":"320","published-online":{"date-parts":[[2022,11,7]]},"reference":[{"issue":"18","key":"e_1_3_2_1_1_1","first-page":"59","article-title":"Candle\/supervisor: A workflow framework for machine learning applied to cancer research","volume":"19","author":"Wozniak Justin M","year":"2018","unstructured":"Justin M Wozniak , Rajeev Jain , Prasanna Balaprakasli , Jonathan Ozik , Nicholson T Collier , John Bauer , Fangfang Xia , Thomas Brettin , Rick Stevens , Jamaludin Mohd-Yusof , Candle\/supervisor: A workflow framework for machine learning applied to cancer research . BMC bioinformatics , 19 ( 18 ): 59 -- 69 , 2018 . Justin M Wozniak, Rajeev Jain, Prasanna Balaprakasli, Jonathan Ozik, Nicholson T Collier, John Bauer, Fangfang Xia, Thomas Brettin, Rick Stevens, Jamaludin Mohd-Yusof, et al. Candle\/supervisor: A workflow framework for machine learning applied to cancer research. BMC bioinformatics, 19(18):59--69, 2018.","journal-title":"BMC bioinformatics"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcp.2018.10.045"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/abbf9a"},{"key":"e_1_3_2_1_4_1","volume-title":"Benchmarking graph neural networks for materials chemistry. npj Computational Materials, 7(1):1--8","author":"Fung Victor","year":"2021","unstructured":"Victor Fung , Jiaxin Zhang , Eric Juarez , and Bobby G Sumpter . Benchmarking graph neural networks for materials chemistry. npj Computational Materials, 7(1):1--8 , 2021 . Victor Fung, Jiaxin Zhang, Eric Juarez, and Bobby G Sumpter. Benchmarking graph neural networks for materials chemistry. npj Computational Materials, 7(1):1--8, 2021."},{"key":"e_1_3_2_1_5_1","volume-title":"NVIDIA A100 Tensor Core GPU Datasheet","year":"2021","unstructured":"A100. NVIDIA A100 Tensor Core GPU Datasheet , 2021 . URL https:\/\/www.nvidia.com\/content\/dam\/en-zz\/Solutions\/Data-Center\/a100\/pdf\/nvidia-a100-datasheet-us-nvidia-1758950-r4-web.pdf. A100. NVIDIA A100 Tensor Core GPU Datasheet, 2021. URL https:\/\/www.nvidia.com\/content\/dam\/en-zz\/Solutions\/Data-Center\/a100\/pdf\/nvidia-a100-datasheet-us-nvidia-1758950-r4-web.pdf."},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5555\/2388996.2389090"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3458817.3476209"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3458817.3476181"},{"key":"e_1_3_2_1_9_1","volume-title":"Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky , Ilya Sutskever , and Geoffrey E Hinton . Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25 , 2012 . Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25, 2012."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3458817.3476223"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA47549.2020.00047"},{"key":"e_1_3_2_1_12_1","first-page":"947","volume-title":"2019 USENIX Annual Technical Conference (USENIX ATC 19)","author":"Jeon Myeongjae","year":"2019","unstructured":"Myeongjae Jeon , Shivaram Venkataraman , Amar Phanishayee , Junjie Qian , Wencong Xiao , and Fan Yang . Analysis of {Large-Scale} {Multi-Tenant} {GPU} clusters for {DNN} training workloads. In 2019 USENIX Annual Technical Conference (USENIX ATC 19) , pages 947 -- 960 , 2019 . Myeongjae Jeon, Shivaram Venkataraman, Amar Phanishayee, Junjie Qian, Wencong Xiao, and Fan Yang. Analysis of {Large-Scale} {Multi-Tenant} {GPU} clusters for {DNN} training workloads. In 2019 USENIX Annual Technical Conference (USENIX ATC 19), pages 947--960, 2019."},{"key":"e_1_3_2_1_13_1","first-page":"945","volume-title":"19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22)","author":"Weng Qizhen","year":"2022","unstructured":"Qizhen Weng , Wencong Xiao , Yinghao Yu , Wei Wang , Cheng Wang , Jian He , Yong Li , Liping Zhang , Wei Lin , and Yu Ding . MLaaS in the wild: Workload analysis and scheduling in Large-Scale heterogeneous GPU clusters . In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22) , pages 945 -- 960 , Renton, WA , April 2022 . USENIX Association. ISBN 978-1-939133-27-4. URL https:\/\/www.usenix.org\/conference\/nsdi22\/presentation\/weng. Qizhen Weng, Wencong Xiao, Yinghao Yu, Wei Wang, Cheng Wang, Jian He, Yong Li, Liping Zhang, Wei Lin, and Yu Ding. MLaaS in the wild: Workload analysis and scheduling in Large-Scale heterogeneous GPU clusters. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22), pages 945--960, Renton, WA, April 2022. USENIX Association. ISBN 978-1-939133-27-4. URL https:\/\/www.usenix.org\/conference\/nsdi22\/presentation\/weng."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA53966.2022.00093"},{"key":"e_1_3_2_1_15_1","volume-title":"NVIDIA Multi-Process Service","author":"MPS.","year":"2021","unstructured":"MPS. NVIDIA Multi-Process Service , 2021 . URL https:\/\/docs.nvidia.com\/deploy\/mps\/. MPS. NVIDIA Multi-Process Service, 2021. URL https:\/\/docs.nvidia.com\/deploy\/mps\/."},{"key":"e_1_3_2_1_16_1","volume-title":"NVIDIA Multi-Instance GPU User Guide","author":"MIG.","year":"2021","unstructured":"MIG. NVIDIA Multi-Instance GPU User Guide , 2021 . URL https:\/\/docs.nvidia.com\/datacenter\/tesla\/mig-user-guide\/. MIG. NVIDIA Multi-Instance GPU User Guide, 2021. URL https:\/\/docs.nvidia.com\/datacenter\/tesla\/mig-user-guide\/."},{"key":"e_1_3_2_1_17_1","volume-title":"AWS to offer NVIDIA A100 Tensor Core GPU-based Amazon EC2 instances","author":"AWS.","year":"2022","unstructured":"AWS. AWS to offer NVIDIA A100 Tensor Core GPU-based Amazon EC2 instances , 2022 . URL https:\/\/www.nvidia.com\/en-us\/data-center\/a100\/. AWS. AWS to offer NVIDIA A100 Tensor Core GPU-based Amazon EC2 instances, 2022. URL https:\/\/www.nvidia.com\/en-us\/data-center\/a100\/."},{"key":"e_1_3_2_1_18_1","volume-title":"A2 VMs now GA---the largest GPU cloud instances with NVIDIA A100 GPUs","year":"2022","unstructured":"Google. A2 VMs now GA---the largest GPU cloud instances with NVIDIA A100 GPUs , 2022 . URL https:\/\/cloud.google.com\/blog\/products\/compute\/a2-vms-with-nvidia-a100-gpus-are-ga. Google. A2 VMs now GA---the largest GPU cloud instances with NVIDIA A100 GPUs, 2022. URL https:\/\/cloud.google.com\/blog\/products\/compute\/a2-vms-with-nvidia-a100-gpus-are-ga."},{"key":"e_1_3_2_1_19_1","first-page":"2021","volume":"500","year":"2021","unstructured":"Top500 . Top 500 list November 2021 , 2021 . URL https:\/\/www.top500.org\/lists\/top500\/2021\/11\/. Top500. Top 500 list November 2021, 2021. URL https:\/\/www.top500.org\/lists\/top500\/2021\/11\/.","journal-title":"Top"},{"key":"e_1_3_2_1_20_1","volume-title":"Microsoft expands its AI-supercomputer lineup with general availability of the latest 80GB NVIDIA A100 GPUs in Azure","year":"2021","unstructured":"Microsoft. Microsoft expands its AI-supercomputer lineup with general availability of the latest 80GB NVIDIA A100 GPUs in Azure , 2021 . URL https:\/\/azure.microsoft.com\/en-us\/blog\/microsoft-expands-its-aisupercomputer-lineup-with-general-availability-of-the-latest-80gb-nvidia-a100-gpus-in-azure-claims\/. Microsoft. Microsoft expands its AI-supercomputer lineup with general availability of the latest 80GB NVIDIA A100 GPUs in Azure, 2021. URL https:\/\/azure.microsoft.com\/en-us\/blog\/microsoft-expands-its-aisupercomputer-lineup-with-general-availability-of-the-latest-80gb-nvidia-a100-gpus-in-azure-claims\/."},{"key":"e_1_3_2_1_21_1","first-page":"1263","volume-title":"International conference on machine learning","author":"Gilmer Justin","year":"2017","unstructured":"Justin Gilmer , Samuel S Schoenholz , Patrick F Riley , Oriol Vinyals , and George E Dahl . Neural message passing for quantum chemistry . In International conference on machine learning , pages 1263 -- 1272 . PMLR, 2017 . Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. Neural message passing for quantum chemistry. In International conference on machine learning, pages 1263--1272. PMLR, 2017."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPSW55747.2022.00124"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2008.44"},{"key":"e_1_3_2_1_24_1","volume-title":"Multiprogram throughput metrics: A systematic approach. ACM Transactions on Architecture and Code Optimization (TACO), 11(3):1--26","author":"Eyerman Stijn","year":"2014","unstructured":"Stijn Eyerman , Pierre Michaud , and Wouter Rogiest . Multiprogram throughput metrics: A systematic approach. ACM Transactions on Architecture and Code Optimization (TACO), 11(3):1--26 , 2014 . Stijn Eyerman, Pierre Michaud, and Wouter Rogiest. Multiprogram throughput metrics: A systematic approach. ACM Transactions on Architecture and Code Optimization (TACO), 11(3):1--26, 2014."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3190508.3190517"},{"key":"e_1_3_2_1_26_1","first-page":"485","volume-title":"16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19)","author":"Gu Juncheng","year":"2019","unstructured":"Juncheng Gu , Mosharaf Chowdhury , Kang G Shin , Yibo Zhu , Myeongjae Jeon , Junjie Qian , Hongqiang Liu , and Chuanxiong Guo . Tiresias : A {GPU} cluster manager for distributed deep learning . In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19) , pages 485 -- 500 , 2019 . Juncheng Gu, Mosharaf Chowdhury, Kang G Shin, Yibo Zhu, Myeongjae Jeon, Junjie Qian, Hongqiang Liu, and Chuanxiong Guo. Tiresias: A {GPU} cluster manager for distributed deep learning. In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19), pages 485--500, 2019."},{"key":"e_1_3_2_1_27_1","first-page":"481","volume-title":"14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20)","author":"Narayanan Deepak","year":"2020","unstructured":"Deepak Narayanan , Keshav Santhanam , Fiodar Kazhamiaka , Amar Phanishayee , and Matei Zaharia . {Heterogeneity-Aware} cluster scheduling policies for deep learning workloads . In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20) , pages 481 -- 498 , 2020 . Deepak Narayanan, Keshav Santhanam, Fiodar Kazhamiaka, Amar Phanishayee, and Matei Zaharia. {Heterogeneity-Aware} cluster scheduling policies for deep learning workloads. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20), pages 481--498, 2020."},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541940.2541941"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"e_1_3_2_1_30_1","first-page":"22009","article-title":"A storm event imagery dataset for deep learning applications in radar and satellite meteorology","volume":"33","author":"Veillette Mark","year":"2020","unstructured":"Mark Veillette , Siddharth Samsi , and Chris Mattioli . Sevir : A storm event imagery dataset for deep learning applications in radar and satellite meteorology . Advances in Neural Information Processing Systems , 33 : 22009 -- 22019 , 2020 . Mark Veillette, Siddharth Samsi, and Chris Mattioli. Sevir: A storm event imagery dataset for deep learning applications in radar and satellite meteorology. Advances in Neural Information Processing Systems, 33:22009--22019, 2020.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_31_1","volume-title":"Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980","author":"Kingma Diederik P","year":"2014","unstructured":"Diederik P Kingma and Jimmy Ba . Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 , 2014 . Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014."},{"key":"e_1_3_2_1_32_1","first-page":"230","article-title":"A system for massively parallel hyperparameter tuning","volume":"2","author":"Li Liam","year":"2020","unstructured":"Liam Li , Kevin Jamieson , Afshin Rostamizadeh , Ekaterina Gonina , Jonathan Ben-Tzur , Moritz Hardt , Benjamin Recht , and Ameet Talwalkar . A system for massively parallel hyperparameter tuning . Proceedings of Machine Learning and Systems , 2 : 230 -- 246 , 2020 . Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Jonathan Ben-Tzur, Moritz Hardt, Benjamin Recht, and Ameet Talwalkar. A system for massively parallel hyperparameter tuning. Proceedings of Machine Learning and Systems, 2:230--246, 2020.","journal-title":"Proceedings of Machine Learning and Systems"},{"key":"e_1_3_2_1_33_1","volume-title":"Tune: A research platform for distributed model selection and training. arXiv preprint arXiv:1807.05118","author":"Liaw Richard","year":"2018","unstructured":"Richard Liaw , Eric Liang , Robert Nishihara , Philipp Moritz , Joseph E Gonzalez , and Ion Stoica . Tune: A research platform for distributed model selection and training. arXiv preprint arXiv:1807.05118 , 2018 . Richard Liaw, Eric Liang, Robert Nishihara, Philipp Moritz, Joseph E Gonzalez, and Ion Stoica. Tune: A research platform for distributed model selection and training. arXiv preprint arXiv:1807.05118, 2018."},{"key":"e_1_3_2_1_34_1","volume-title":"NVIDIA A100 TENSOR CORE GPU","year":"2022","unstructured":"NVIDIA-A100. NVIDIA A100 TENSOR CORE GPU , 2022 . URL https:\/\/www.nvidia.com\/en-us\/data-center\/a100\/. NVIDIA-A100. NVIDIA A100 TENSOR CORE GPU, 2022. URL https:\/\/www.nvidia.com\/en-us\/data-center\/a100\/."},{"key":"e_1_3_2_1_35_1","first-page":"438","volume-title":"14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17)","author":"Ousterhout Amy","year":"2017","unstructured":"Amy Ousterhout , Jonathan Perry , Hari Balakrishnan , and Petr Lapukhov. Flexplane : An experimentation platform for resource management in datacenters . In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17) , pages 438 -- 451 , 2017 . Amy Ousterhout, Jonathan Perry, Hari Balakrishnan, and Petr Lapukhov. Flexplane: An experimentation platform for resource management in datacenters. In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17), pages 438--451, 2017."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_1_37_1","volume-title":"Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861","author":"Howard Andrew G","year":"2017","unstructured":"Andrew G Howard , Menglong Zhu , Bo Chen , Dmitry Kalenichenko , Weijun Wang , Tobias Weyand , Marco Andreetto , and Hartwig Adam . Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 , 2017 . Andrew G Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861, 2017."},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1423"},{"key":"e_1_3_2_1_39_1","volume-title":"Attention is all you need. Advances in neural information processing systems, 30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani , Noam Shazeer , Niki Parmar , Jakob Uszkoreit , Llion Jones , Aidan N Gomez , \u0141ukasz Kaiser , and Illia Polosukhin . Attention is all you need. Advances in neural information processing systems, 30 , 2017 . Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017."},{"key":"e_1_3_2_1_40_1","first-page":"173","volume-title":"International conference on machine learning","author":"Amodei Dario","year":"2016","unstructured":"Dario Amodei , Sundaram Ananthanarayanan , Rishita Anubhai , Jingliang Bai , Eric Battenberg , Carl Case , Jared Casper , Bryan Catanzaro , Qiang Cheng , Guoliang Chen , Deep speech 2: End-to-end speech recognition in english and mandarin . In International conference on machine learning , pages 173 -- 182 . PMLR, 2016 . Dario Amodei, Sundaram Ananthanarayanan, Rishita Anubhai, Jingliang Bai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Qiang Cheng, Guoliang Chen, et al. Deep speech 2: End-to-end speech recognition in english and mandarin. In International conference on machine learning, pages 173--182. PMLR, 2016."},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1162"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.244"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3458817.3476188"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3337821.3337905"},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPEC49654.2021.9622850"},{"key":"e_1_3_2_1_46_1","volume-title":"Hugging Face: The AI community building the future","year":"2022","unstructured":"Hugging-Face. Hugging Face: The AI community building the future ., 2022 . URL https:\/\/huggingface.co\/. Hugging-Face. Hugging Face: The AI community building the future., 2022. URL https:\/\/huggingface.co\/."},{"key":"e_1_3_2_1_47_1","volume-title":"keras-io","year":"2022","unstructured":"Keras. keras-io ., 2022 . URL https:\/\/github.com\/keras-team\/keras-io. Keras. keras-io., 2022. URL https:\/\/github.com\/keras-team\/keras-io."},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3458817.3476143"},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2499368.2451125"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2014.53"},{"key":"e_1_3_2_1_51_1","first-page":"1","volume-title":"Fabien Hermenier. Multi-Objective Job Placement in Clusters. In SC'15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis","author":"Blagodurov Sergey","year":"2015","unstructured":"Sergey Blagodurov , Alexandra Fedorova , Evgeny Vinnik , Tyler Dwyer , and Fabien Hermenier. Multi-Objective Job Placement in Clusters. In SC'15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis , pages 1 -- 12 . IEEE, 2015 . Sergey Blagodurov, Alexandra Fedorova, Evgeny Vinnik, Tyler Dwyer, and Fabien Hermenier. Multi-Objective Job Placement in Clusters. In SC'15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1--12. IEEE, 2015."},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/2980024.2872394"},{"key":"e_1_3_2_1_53_1","first-page":"598","volume-title":"Daniel Sanchez. Rubik: Fast Analytical Power Management for Latency-Critical Systems. In 2015 48th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO)","author":"Kasture Harshad","year":"2015","unstructured":"Harshad Kasture , Davide B Bartolini , Nathan Beckmann , and Daniel Sanchez. Rubik: Fast Analytical Power Management for Latency-Critical Systems. In 2015 48th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO) , pages 598 -- 610 . IEEE, 2015 . Harshad Kasture, Davide B Bartolini, Nathan Beckmann, and Daniel Sanchez. Rubik: Fast Analytical Power Management for Latency-Critical Systems. In 2015 48th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO), pages 598--610. IEEE, 2015."},{"key":"e_1_3_2_1_54_1","first-page":"121","volume-title":"Jos\u00e9 F Mart\u00ednez. SWAP: Effective Fine-Grain Management of Shared Last-Level Caches with Minimum Hardware Support. In 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA)","author":"Wang Xiaodong","year":"2017","unstructured":"Xiaodong Wang , Shuang Chen , Jeff Setter , and Jos\u00e9 F Mart\u00ednez. SWAP: Effective Fine-Grain Management of Shared Last-Level Caches with Minimum Hardware Support. In 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA) , pages 121 -- 132 . IEEE, 2017 . Xiaodong Wang, Shuang Chen, Jeff Setter, and Jos\u00e9 F Mart\u00ednez. SWAP: Effective Fine-Grain Management of Shared Last-Level Caches with Minimum Hardware Support. In 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA), pages 121--132. IEEE, 2017."},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541940.2541944"},{"key":"e_1_3_2_1_56_1","first-page":"104","volume-title":"Daniel Sanchez. KPart: A Hybrid Cache Partitioning-Sharing Technique for Commodity Multicores. In 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA)","author":"El-Sayed Nosayba","year":"2018","unstructured":"Nosayba El-Sayed , Anurag Mukkara , Po-An Tsai , Harshad Kasture , Xiaosong Ma , and Daniel Sanchez. KPart: A Hybrid Cache Partitioning-Sharing Technique for Commodity Multicores. In 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA) , pages 104 -- 117 . IEEE, 2018 . Nosayba El-Sayed, Anurag Mukkara, Po-An Tsai, Harshad Kasture, Xiaosong Ma, and Daniel Sanchez. KPart: A Hybrid Cache Partitioning-Sharing Technique for Commodity Multicores. In 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA), pages 104--117. IEEE, 2018."},{"key":"e_1_3_2_1_57_1","first-page":"13","volume-title":"Zhenlin Wang. DCAPS: Dynamic Cache Allocation with Partial Sharing. In Proceedings of the Thirteenth EuroSys Conference","author":"Xiang Yaocheng","unstructured":"Yaocheng Xiang , Xiaolin Wang , Zihui Huang , Zeyu Wang , Yingwei Luo , and Zhenlin Wang. DCAPS: Dynamic Cache Allocation with Partial Sharing. In Proceedings of the Thirteenth EuroSys Conference , page 13 . ACM, 2018. Yaocheng Xiang, Xiaolin Wang, Zihui Huang, Zeyu Wang, Yingwei Luo, and Zhenlin Wang. DCAPS: Dynamic Cache Allocation with Partial Sharing. In Proceedings of the Thirteenth EuroSys Conference, page 13. ACM, 2018."},{"key":"e_1_3_2_1_58_1","first-page":"14","volume-title":"Performance-Sensitive Infrastructure-as-a-Service. In Proceedings of the Thirteenth EuroSys Conference","author":"Xu Cong","unstructured":"Cong Xu , Karthick Rajamani , Alexandre Ferreira , Wesley Felter , Juan Rubio , and Yang Li. d Cat : Dynamic Cache Management for Efficient , Performance-Sensitive Infrastructure-as-a-Service. In Proceedings of the Thirteenth EuroSys Conference , page 14 . ACM, 2018. Cong Xu, Karthick Rajamani, Alexandre Ferreira, Wesley Felter, Juan Rubio, and Yang Li. dCat: Dynamic Cache Management for Efficient, Performance-Sensitive Infrastructure-as-a-Service. In Proceedings of the Thirteenth EuroSys Conference, page 14. ACM, 2018."},{"key":"e_1_3_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/3243176.3243211"},{"key":"e_1_3_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/3302424.3303963"},{"key":"e_1_3_2_1_61_1","first-page":"15","volume-title":"Boris Grot. Stretch: Balancing QoS and Throughput for Colocated Server Workloads on SMT Cores. In 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA)","author":"Margaritov Artemiy","year":"2019","unstructured":"Artemiy Margaritov , Siddharth Gupta , Rekai Gonzalez-Alberquilla , and Boris Grot. Stretch: Balancing QoS and Throughput for Colocated Server Workloads on SMT Cores. In 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA) , pages 15 -- 27 . IEEE, 2019 . Artemiy Margaritov, Siddharth Gupta, Rekai Gonzalez-Alberquilla, and Boris Grot. Stretch: Balancing QoS and Throughput for Colocated Server Workloads on SMT Cores. In 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA), pages 15--27. IEEE, 2019."},{"key":"e_1_3_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA47549.2020.00025"},{"key":"e_1_3_2_1_63_1","volume-title":"20th USENIX Conference on File and Storage Technologies (FAST 22)","author":"Yi Jifei","year":"2022","unstructured":"Jifei Yi , Benchao Dong , Mingkai Dong , Ruizhe Tong , and Haibo Chen . Mt2 : Memory bandwidth regulation on hybrid nvm\/dram platforms . In 20th USENIX Conference on File and Storage Technologies (FAST 22) , Santa Clara, CA , 2022 . Jifei Yi, Benchao Dong, Mingkai Dong, Ruizhe Tong, and Haibo Chen. Mt2: Memory bandwidth regulation on hybrid nvm\/dram platforms. In 20th USENIX Conference on File and Storage Technologies (FAST 22), Santa Clara, CA, 2022."},{"key":"e_1_3_2_1_64_1","first-page":"443","volume-title":"14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20)","author":"Gujarati Arpan","year":"2020","unstructured":"Arpan Gujarati , Reza Karimi , Safya Alzayat , Wei Hao , Antoine Kaufmann , Ymir Vigfusson , and Jonathan Mace . Serving {DNNs} like clockwork : Performance predictability from the bottom up . In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20) , pages 443 -- 462 , 2020 . Arpan Gujarati, Reza Karimi, Safya Alzayat, Wei Hao, Antoine Kaufmann, Ymir Vigfusson, and Jonathan Mace. Serving {DNNs} like clockwork: Performance predictability from the bottom up. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20), pages 443--462, 2020."},{"key":"e_1_3_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1109\/RTAS.2018.00028"},{"key":"e_1_3_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1109\/RTSS46320.2019.00042"},{"key":"e_1_3_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1109\/RTSS52674.2021.00048"},{"key":"e_1_3_2_1_68_1","first-page":"595","volume-title":"13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)","author":"Xiao Wencong","year":"2018","unstructured":"Wencong Xiao , Romil Bhardwaj , Ramachandran Ramjee , Muthian Sivathanu , Nipun Kwatra , Zhenhua Han , Pratyush Patel , Xuan Peng , Hanyu Zhao , Quanlu Zhang , : Introspective cluster scheduling for deep learning . In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18) , pages 595 -- 610 , 2018 . Wencong Xiao, Romil Bhardwaj, Ramachandran Ramjee, Muthian Sivathanu, Nipun Kwatra, Zhenhua Han, Pratyush Patel, Xuan Peng, Hanyu Zhao, Quanlu Zhang, et al. Gandiva: Introspective cluster scheduling for deep learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), pages 595--610, 2018."},{"key":"e_1_3_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/3342195.3387555"},{"key":"e_1_3_2_1_70_1","first-page":"533","volume-title":"14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20)","author":"Xiao Wencong","year":"2020","unstructured":"Wencong Xiao , Shiru Ren , Yong Li , Yang Zhang , Pengyang Hou , Zhi Li , Yihui Feng , Wei Lin , and Yangqing Jia . {AntMan} : Dynamic scaling on {GPU} clusters for deep learning . In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20) , pages 533 -- 548 , 2020 . Wencong Xiao, Shiru Ren, Yong Li, Yang Zhang, Pengyang Hou, Zhi Li, Yihui Feng, Wei Lin, and Yangqing Jia. {AntMan}: Dynamic scaling on {GPU} clusters for deep learning. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20), pages 533--548, 2020."},{"key":"e_1_3_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1145\/3419111.3421284"},{"key":"e_1_3_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2021.12.016"},{"key":"e_1_3_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1145\/3458817.3480853"},{"key":"e_1_3_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1145\/3503222.3507721"},{"key":"e_1_3_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1145\/3508036"},{"key":"e_1_3_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.1145\/3503221.3508423"},{"key":"e_1_3_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1145\/3453417.3453439"},{"key":"e_1_3_2_1_78_1","doi-asserted-by":"publisher","DOI":"10.14778\/2732967.2732976"},{"key":"e_1_3_2_1_79_1","doi-asserted-by":"publisher","DOI":"10.1145\/2637364.2592002"},{"key":"e_1_3_2_1_80_1","volume-title":"NVIDIA Hopper GPU Architecture","author":"NVIDIA.","year":"2022","unstructured":"NVIDIA. NVIDIA Hopper GPU Architecture , 2022 . URL https:\/\/www.nvidia.com\/en-us\/technologies\/hopper-architecture\/. NVIDIA. NVIDIA Hopper GPU Architecture, 2022. URL https:\/\/www.nvidia.com\/en-us\/technologies\/hopper-architecture\/."},{"key":"e_1_3_2_1_81_1","doi-asserted-by":"publisher","DOI":"10.1145\/3453417.3453432"},{"key":"e_1_3_2_1_82_1","volume-title":"OneAPI GPU Optimization Guide","year":"2022","unstructured":"Intel. OneAPI GPU Optimization Guide , 2022 . URL https:\/\/www.intel.com\/content\/dam\/develop\/external\/us\/en\/documents\/oneapi-gpu-optimization-guide.pdf. Intel. OneAPI GPU Optimization Guide, 2022. URL https:\/\/www.intel.com\/content\/dam\/develop\/external\/us\/en\/documents\/oneapi-gpu-optimization-guide.pdf."}],"event":{"name":"SoCC '22: ACM Symposium on Cloud Computing","location":"San Francisco California","acronym":"SoCC '22","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGOPS ACM Special Interest Group on Operating Systems"]},"container-title":["Proceedings of the 13th Symposium on Cloud Computing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3542929.3563510","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3542929.3563510","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:49:31Z","timestamp":1750182571000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3542929.3563510"}},"subtitle":["exploiting multi-instance GPU capability on multi-tenant GPU clusters"],"short-title":[],"issued":{"date-parts":[[2022,11,7]]},"references-count":82,"alternative-id":["10.1145\/3542929.3563510","10.1145\/3542929"],"URL":"https:\/\/doi.org\/10.1145\/3542929.3563510","relation":{},"subject":[],"published":{"date-parts":[[2022,11,7]]},"assertion":[{"value":"2022-11-07","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}