{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,28]],"date-time":"2025-08-28T02:40:01Z","timestamp":1756348801519,"version":"3.44.0"},"publisher-location":"New York, NY, USA","reference-count":65,"publisher":"ACM","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,9,8]]},"DOI":"10.1145\/3718958.3750496","type":"proceedings-article","created":{"date-parts":[[2025,8,27]],"date-time":"2025-08-27T16:54:11Z","timestamp":1756313651000},"page":"114-128","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["ByteDance Jakiro: Enabling RDMA and TCP over Virtual Private Cloud"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-5794-9682","authenticated-orcid":false,"given":"Yirui","family":"Liu","sequence":"first","affiliation":[{"name":"ByteDance, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-5793-2401","authenticated-orcid":false,"given":"Lidong","family":"Jiang","sequence":"additional","affiliation":[{"name":"ByteDance, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-7383-3451","authenticated-orcid":false,"given":"Deguo","family":"Li","sequence":"additional","affiliation":[{"name":"ByteDance, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2782-8508","authenticated-orcid":false,"given":"Daxiang","family":"Kang","sequence":"additional","affiliation":[{"name":"ByteDance, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-3911-5676","authenticated-orcid":false,"given":"Zhaoyang","family":"Wei","sequence":"additional","affiliation":[{"name":"ByteDance, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-9394-6089","authenticated-orcid":false,"given":"Yuqi","family":"Chai","sequence":"additional","affiliation":[{"name":"ByteDance, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-9367-2087","authenticated-orcid":false,"given":"Bin","family":"Niu","sequence":"additional","affiliation":[{"name":"ByteDance, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-2509-214X","authenticated-orcid":false,"given":"Ke","family":"Lin","sequence":"additional","affiliation":[{"name":"ByteDance, Hangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5688-7611","authenticated-orcid":false,"given":"Xiaoning","family":"Ding","sequence":"additional","affiliation":[{"name":"ByteDance, San Jose, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-8338-1390","authenticated-orcid":false,"given":"Jianwen","family":"Pi","sequence":"additional","affiliation":[{"name":"ByteDance, San Jose, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-8462-7295","authenticated-orcid":false,"given":"Hao","family":"Luo","sequence":"additional","affiliation":[{"name":"ByteDance, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,8,27]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"2014. Dynamically connected transport. (2014). https:\/\/www.openfabrics.org\/images\/eventpresos\/workshops2014\/DevWorkshop\/presos\/Monday\/pdf\/05_DC_Verbs.pdf"},{"key":"e_1_3_2_1_2_1","unstructured":"2017. Linux IOMMU Support. (2017). https:\/\/www.kernel.org\/doc\/Documentation\/Intel-IOMMU.txt"},{"key":"e_1_3_2_1_3_1","unstructured":"2023. Perftest package. (2023). https:\/\/enterprise-support.nvidia.com\/s\/article\/perftest-package"},{"key":"e_1_3_2_1_4_1","volume-title":"Reverie: Low Pass Filter-Based Switch Buffer Sharing for Datacenters with RDMA and TCP Traffic. In 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24)","author":"Addanki Vamsi","year":"2024","unstructured":"Vamsi Addanki, Wei Bai, Stefan Schmid, and Maria Apostolaki. 2024. Reverie: Low Pass Filter-Based Switch Buffer Sharing for Datacenters with RDMA and TCP Traffic. In 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24). USENIX Association, Santa Clara, CA, 651\u2013668. https:\/\/www.usenix.org\/conference\/nsdi24\/presentation\/addanki-reverie"},{"key":"e_1_3_2_1_5_1","unstructured":"Open Fabrics Alliance. 2019. Open Fabrics Interfaces. (2019). https:\/\/github.com\/ofiwg\/libfabric"},{"key":"e_1_3_2_1_6_1","unstructured":"AMD. 2024. AMD Pensando\u2122 Networking. (2024). https:\/\/www.amd.com\/en\/products\/accelerators\/pensando.html"},{"key":"e_1_3_2_1_7_1","unstructured":"InfiniBand Trade Association. 2010. InfiniBand Architecture Specification Release 1.2.1 Annex A16: RoCE. (2010)."},{"key":"e_1_3_2_1_8_1","unstructured":"InfiniBand Trade Association. 2014. InfiniBand Architecture Specification Release 1.2.1. (2014)."},{"key":"e_1_3_2_1_9_1","unstructured":"InfiniBand Trade Association. 2014. InfiniBand Architecture Specification Release 1.2.1 Annex A17: RoCEv2. (2014)."},{"key":"e_1_3_2_1_10_1","volume-title":"Empowering Azure Storage with RDMA. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23)","author":"Bai Wei","year":"2023","unstructured":"Wei Bai, Shanim Sainul Abdeen, Ankit Agrawal, Krishan Kumar Attre, Paramvir Bahl, Ameya Bhagat, Gowri Bhaskara, Tanya Brokhman, Lei Cao, Ahmad Cheema, Rebecca Chow, Jeff Cohen, Mahmoud Elhaddad, Vivek Ette, Igal Figlin, Daniel Firestone, Mathew George, Ilya German, Lakhmeet Ghai, Eric Green, Albert Greenberg, Manish Gupta, Randy Haagens, Matthew Hendel, Ridwan Howlader, Neetha John, Julia Johnstone, Tom Jolly, Greg Kramer, David Kruse, Ankit Kumar, Erica Lan, Ivan Lee, Avi Levy, Marina Lipshteyn, Xin Liu, Chen Liu, Guohan Lu, Yuemin Lu, Xiakun Lu, Vadim Makhervaks, Ulad Malashanka, David A. Maltz, Ilias Marinos, Rohan Mehta, Sharda Murthi, Anup Namdhari, Aaron Ogus, Jitendra Padhye, Madhav Pandya, Douglas Phillips, Adrian Power, Suraj Puri, Shachar Raindel, Jordan Rhee, Anthony Russo, Maneesh Sah, Ali Sheriff, Chris Sparacino, Ashutosh Srivastava, Weixiang Sun, Nick Swanson, Fuhou Tian, Lukasz Tomczyk, Vamsi Vadlamuri, Alec Wolman, Ying Xie, Joyce Yom, Lihua Yuan, Yanzhao Zhang, and Brian Zill. 2023. Empowering Azure Storage with RDMA. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23). USENIX Association, Boston, MA, 49\u201367. https:\/\/www.usenix.org\/conference\/nsdi23\/presentation\/bai"},{"key":"e_1_3_2_1_11_1","volume-title":"https:\/\/team.doubao.com\/en\/","author":"DOUBAO","year":"2024","unstructured":"ByteDance. 2024. DOUBAO TEAM. (2024). https:\/\/team.doubao.com\/en\/"},{"key":"e_1_3_2_1_12_1","unstructured":"ByteDance. 2024. volcengine MaaS platform. (2024). https:\/\/www.volcengine.com\/product\/ark"},{"key":"e_1_3_2_1_13_1","unstructured":"Amazon Elastic Compute Cloud. 2019. Elastic Fabric Adapter. (2019). https:\/\/docs.aws.amazon.com\/AWSEC2\/latest\/UserGuide\/efa.html"},{"key":"e_1_3_2_1_14_1","unstructured":"Amazon Virtual Private Cloud. 2024. What is Amazon VPC?. (2024). https:\/\/docs.aws.amazon.com\/vpc\/latest\/userguide\/what-is-amazon-vpc.html"},{"key":"e_1_3_2_1_15_1","unstructured":"Google Cloud. 2024. Virtual Private Cloud (VPC). (2024). https:\/\/cloud.google.com\/vpc?hl=en"},{"key":"e_1_3_2_1_16_1","unstructured":"Ultra Ethernet Consortium. 2023. The New Era Needs a New Network. (2023). https:\/\/ultraethernet.org\/"},{"key":"e_1_3_2_1_17_1","unstructured":"Ultra Ethernet Consortium. 2023. Overview of and Motivation for the Forthcoming Ultra Ethernet Consortium Specification. (2023). https:\/\/ultraethernet.org\/wp-content\/uploads\/sites\/20\/2023\/10\/23.07.12-UEC-1.0-Overview-FINAL-WITH-LOGO.pdf"},{"key":"e_1_3_2_1_18_1","volume-title":"Proceedings of the 15th USENIX Conference on Networked Systems Design and Implementation (NSDI'18)","author":"Dalton Michael","year":"2018","unstructured":"Michael Dalton, David Schultz, Jacob Adriaens, Ahsan Arefin, Anshuman Gupta, Brian Fahs, Dima Rubinstein, Enrique Cauich Zermeno, Erik Rubow, James Alexander Docauer, Jesse Alpert, Jing Ai, Jon Olson, Kevin DeCabooter, Marc De Kruijf, Nan Hua, Nathan Lewis, Nikhil Kasinadhuni, Riccardo Crepaldi, Srinivas Krishnan, Subbaiah Venkata, Yossi Richter, Uday Naik, and Amin Vahdat. 2018. Andromeda: performance, isolation, and velocity at scale in cloud network virtualization. In Proceedings of the 15th USENIX Conference on Networked Systems Design and Implementation (NSDI'18). USENIX Association, USA, 373\u2013387."},{"key":"e_1_3_2_1_19_1","volume-title":"https:\/\/docs.vmware.com\/en\/VMware-vSphere\/7.0\/com.vmware.vsphere.networking.doc\/GUID-CC021803-30EA-444D-BCBE-618E0D836B9F.html","author":"Docs Mware","year":"2019","unstructured":"VMware Docs. 2019. Single Root I\/O Virtualization (SR-IOV). (2019). https:\/\/docs.vmware.com\/en\/VMware-vSphere\/7.0\/com.vmware.vsphere.networking.doc\/GUID-CC021803-30EA-444D-BCBE-618E0D836B9F.html"},{"key":"e_1_3_2_1_20_1","volume-title":"VFP: A Virtual Switch Platform for Host SDN in the Public Cloud. In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17)","author":"Firestone Daniel","year":"2017","unstructured":"Daniel Firestone. 2017. VFP: A Virtual Switch Platform for Host SDN in the Public Cloud. In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17). USENIX Association, Boston, MA, 315\u2013328. https:\/\/www.usenix.org\/conference\/nsdi17\/technical-sessions\/presentation\/firestone"},{"key":"e_1_3_2_1_21_1","volume-title":"Azure Accelerated Networking: SmartNICs in the Public Cloud. In 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18)","author":"Firestone Daniel","year":"2018","unstructured":"Daniel Firestone, Andrew Putnam, Sambhrama Mundkur, Derek Chiou, Alireza Dabagh, Mike Andrewartha, Hari Angepat, Vivek Bhanu, Adrian Caulfield, Eric Chung, Harish Kumar Chandrappa, Somesh Chaturmohta, Matt Humphrey, Jack Lavier, Norman Lam, Fengfen Liu, Kalin Ovtcharov, Jitu Padhye, Gautham Popuri, Shachar Raindel, Tejas Sapre, Mark Shaw, Gabriel Silva, Madhan Sivakumar, Nisheeth Srivastava, Anshuman Verma, Qasim Zuhair, Deepak Bansal, Doug Burger, Kushagra Vaid, David A. Maltz, and Albert Greenberg. 2018. Azure Accelerated Networking: SmartNICs in the Public Cloud. In 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18). USENIX Association, Renton, WA, 51\u201366. https:\/\/www.usenix.org\/conference\/nsdi18\/presentation\/firestone"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/90.413212"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2934872.2934908"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3387514.3405849"},{"key":"e_1_3_2_1_25_1","unstructured":"Cunchen Hu Heyang Huang Junhao Hu Jiang Xu Xusheng Chen Tao Xie Chenxi Wang Sa Wang Yungang Bao Ninghui Sun and Yizhou Shan. 2024. MemServe: Context Caching for Disaggregated LLM Serving with Elastic Memory Pool. (2024). arXiv:cs.DC\/2406.17565 https:\/\/arxiv.org\/abs\/2406.17565"},{"key":"e_1_3_2_1_26_1","unstructured":"Edward J. Hu Yelong Shen Phillip Wallis Zeyuan Allen-Zhu Yuanzhi Li Shean Wang Lu Wang and Weizhu Chen. 2021. LoRA: Low-Rank Adaptation of Large Language Models. (2021). arXiv:cs.CL\/2106.09685"},{"key":"e_1_3_2_1_27_1","unstructured":"Intel. 2023. Open vSwitch with DPDK. (2023). https:\/\/docs.openvswitch.org\/en\/latest\/intro\/install\/dpdk\/"},{"key":"e_1_3_2_1_28_1","unstructured":"Intel. 2024. Introduction to Intel IPUs. (2024). https:\/\/www.intel.com\/content\/www\/us\/en\/products\/details\/network-io\/ipu.html"},{"key":"e_1_3_2_1_29_1","volume-title":"21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24)","author":"Jiang Ziheng","year":"2024","unstructured":"Ziheng Jiang, Haibin Lin, Yinmin Zhong, Qi Huang, Yangrui Chen, Zhi Zhang, Yanghua Peng, Xiang Li, Cong Xie, Shibiao Nong, Yulu Jia, Sun He, Hongmin Chen, Zhihao Bai, Qi Hou, Shipeng Yan, Ding Zhou, Yiyao Sheng, Zhuo Jiang, Haohan Xu, Haoran Wei, Zhang Zhang, Pengfei Nie, Leqi Zou, Sida Zhao, Liang Xiang, Zherui Liu, Zhe Li, Xiaoying Jia, Jianxi Ye, Xin Jin, and Xin Liu. 2024. MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs. In 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24). USENIX Association, Santa Clara, CA, 745\u2013760. https:\/\/www.usenix.org\/conference\/nsdi24\/presentation\/jiang-ziheng"},{"key":"e_1_3_2_1_30_1","volume-title":"Seth Elliott.","author":"Bruce","year":"2021","unstructured":"Bruce A. Mah Jeff Poskanzer Kaustubh Prabhu etc. Jon Dugan, Seth Elliott. 2021. iPerf - The ultimate speed test tool for TCP, UDP and SCTP. (2021). https:\/\/iperf.fr\/"},{"volume-title":"Dah-Ming Chiu, and Raj Jain.","year":"1987","key":"e_1_3_2_1_31_1","unstructured":"k. k. Ramakrishnan, Dah-Ming Chiu, and Raj Jain. 1987. Congestion Avoidance in Computer Networks with a Connectionless Network Layer. Part IV-A Selective Binary Feedback Scheme for General Topologies. Technical Report DEC-TR-509, Digital Equipment Corporation (1987)."},{"volume-title":"Design Guidelines for High Performance RDMA Systems. In 2016 USENIX Annual Technical Conference (USENIX ATC 16)","author":"Kalia Anuj","key":"e_1_3_2_1_32_1","unstructured":"Anuj Kalia, Michael Kaminsky, and David G. Andersen. 2016. Design Guidelines for High Performance RDMA Systems. In 2016 USENIX Annual Technical Conference (USENIX ATC 16). USENIX Association, Denver, CO, 437\u2013450. https:\/\/www.usenix.org\/conference\/atc16\/technical-sessions\/presentation\/kalia"},{"key":"e_1_3_2_1_33_1","volume-title":"FreeFlow: Software-based Virtual RDMA Networking for Containerized Clouds. In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19)","author":"Kim Daehyeok","year":"2019","unstructured":"Daehyeok Kim, Tianlong Yu, Hongqiang Harry Liu, Yibo Zhu, Jitu Padhye, Shachar Raindel, Chuanxiong Guo, Vyas Sekar, and Srinivasan Seshan. 2019. FreeFlow: Software-based Virtual RDMA Networking for Containerized Clouds. In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19). USENIX Association, Boston, MA, 113\u2013126. https:\/\/www.usenix.org\/conference\/nsdi19\/presentation\/kim"},{"key":"e_1_3_2_1_34_1","volume-title":"Understanding RDMA Microarchitecture Resources for Performance Isolation. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23)","author":"Kong Xinhao","year":"2023","unstructured":"Xinhao Kong, Jingrong Chen, Wei Bai, Yechen Xu, Mahmoud Elhaddad, Shachar Raindel, Jitendra Padhye, Alvin R. Lebeck, and Danyang Zhuo. 2023. Understanding RDMA Microarchitecture Resources for Performance Isolation. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23). USENIX Association, Boston, MA, 31\u201348. https:\/\/www.usenix.org\/conference\/nsdi23\/presentation\/kong"},{"key":"e_1_3_2_1_35_1","volume-title":"Collie: Finding Performance Anomalies in RDMA Subsystems. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22)","author":"Kong Xinhao","year":"2022","unstructured":"Xinhao Kong, Yibo Zhu, Huaping Zhou, Zhuo Jiang, Jianxi Ye, Chuanxiong Guo, and Danyang Zhuo. 2022. Collie: Finding Performance Anomalies in RDMA Subsystems. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22). USENIX Association, Renton, WA, 287\u2013305. https:\/\/www.usenix.org\/conference\/nsdi22\/presentation\/kong"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3651890.3672224"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3603269.3610849"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDCS57875.2023.00076"},{"key":"e_1_3_2_1_39_1","volume-title":"Harmonic: Hardware-assisted RDMA Performance Isolation for Public Clouds. In 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24)","author":"Lou Jiaqi","year":"2024","unstructured":"Jiaqi Lou, Xinhao Kong, Jinghan Huang, Wei Bai, Nam Sung Kim, and Danyang Zhuo. 2024. Harmonic: Hardware-assisted RDMA Performance Isolation for Public Clouds. In 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24). USENIX Association, Santa Clara, CA, 1479\u20131496. https:\/\/www.usenix.org\/conference\/nsdi24\/presentation\/lou"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.17487\/RFC7348"},{"key":"e_1_3_2_1_41_1","volume-title":"https:\/\/network.nvidia.com\/files\/doc-2020\/sb-asap2.pdf","author":"Mellanox","year":"2020","unstructured":"Mellanox. 2020. Mellanox ASAP2. (2020). https:\/\/network.nvidia.com\/files\/doc-2020\/sb-asap2.pdf"},{"key":"e_1_3_2_1_42_1","unstructured":"Mellanox. 2020. Mellanox ASAP2 Accelerated Switching and Packet Processing. (2020). https:\/\/network.nvidia.com\/files\/doc-2020\/sb-asap2.pdf"},{"key":"e_1_3_2_1_43_1","unstructured":"NVIDIA. 2020. NVIDIA Collective Communications Library. (2020). https:\/\/developer.nvidia.com\/nccl"},{"key":"e_1_3_2_1_44_1","unstructured":"NVIDIA. 2021. NVIDIA BLUEFIELD-3 DPU. (2021). https:\/\/www.nvidia.com\/content\/dam\/en-zz\/Solutions\/Data-Center\/documents\/datasheet-nvidia-bluefield-3-dpu.pdf"},{"key":"e_1_3_2_1_45_1","volume-title":"https:\/\/www.nvidia.com\/content\/dam\/en-zz\/Solutions\/networking\/ethernet-adapters\/connectx-7-datasheet-Final.pdf","author":"NVIDIA.","year":"2021","unstructured":"NVIDIA. 2021. NVIDIA CONNECTX-7. (2021). https:\/\/www.nvidia.com\/content\/dam\/en-zz\/Solutions\/networking\/ethernet-adapters\/connectx-7-datasheet-Final.pdf"},{"key":"e_1_3_2_1_46_1","volume-title":"LuoShen: A Hyper-Converged Programmable Gateway for Multi-Tenant Multi-Service Edge Clouds. In 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24)","author":"Pan Tian","year":"2024","unstructured":"Tian Pan, Kun Liu, Xionglie Wei, Yisong Qiao, Jun Hu, Zhiguo Li, Jun Liang, Tiesheng Cheng, Wenqiang Su, Jie Lu, Yuke Hong, Zhengzhong Wang, Zhi Xu, Chongjing Dai, Peiqiao Wang, Xuetao Jia, Jianyuan Lu, Enge Song, Jun Zeng, Biao Lyu, Ennan Zhai, Jiao Zhang, Tao Huang, Dennis Cai, and Shunmin Zhu. 2024. LuoShen: A Hyper-Converged Programmable Gateway for Multi-Tenant Multi-Service Edge Clouds. In 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24). USENIX Association, Santa Clara, CA, 877\u2013892. https:\/\/www.usenix.org\/conference\/nsdi24\/presentation\/pan"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3452296.3472889"},{"key":"e_1_3_2_1_48_1","volume-title":"Splitwise: Efficient generative LLM inference using phase splitting. In ISCA. https:\/\/www.microsoft.com\/en-us\/research\/publication\/splitwise-efficient-generative-llm-inference-using-phase-splitting\/","author":"Patel Pratyush","year":"2024","unstructured":"Pratyush Patel, Esha Choukse, Chaojie Zhang, Aashaka Shah, \u00cd\u00f1igo Goiri, Saeed Maleki, and Ricardo Bianchini. 2024. Splitwise: Efficient generative LLM inference using phase splitting. In ISCA. https:\/\/www.microsoft.com\/en-us\/research\/publication\/splitwise-efficient-generative-llm-inference-using-phase-splitting\/"},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2817817.2731200"},{"key":"e_1_3_2_1_50_1","volume-title":"Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving.","author":"Qin Ruoyu","year":"2024","unstructured":"Ruoyu Qin, Zheming Li, Weiran He, Mingxing Zhang, Yongwei Wu, Weimin Zheng, and Xinran Xu. 2024. Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving. (2024). arXiv:cs.DC\/2407.00079 https:\/\/arxiv.org\/abs\/2407.00079"},{"key":"e_1_3_2_1_51_1","volume-title":"Toward a Paravirtual vRDMA Device for VMware ESXi Guests. (12","author":"Ranadive Adit","year":"2012","unstructured":"Adit Ranadive and Bhavesh Davda. 2012. Toward a Paravirtual vRDMA Device for VMware ESXi Guests. (12 2012)."},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.91"},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/3098822.3098852"},{"key":"e_1_3_2_1_54_1","unstructured":"Amazon Web Service. 2021. Live media workflows on AWS. (2021). https:\/\/aws.amazon.com\/cn\/blogs\/media\/metfc-live-media-workflows-on-aws-to-compress-or-to-not-compress\/"},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2020.3016891"},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDCS57875.2023.00019"},{"key":"e_1_3_2_1_57_1","volume-title":"https:\/\/plm.sw.siemens.com\/en-US\/simcenter\/fluids-thermal-simulation\/star-ccm\/","author":"Simcenter","year":"2024","unstructured":"Siemens. 2024. Simcenter STAR-CCM+. (2024). https:\/\/plm.sw.siemens.com\/en-US\/simcenter\/fluids-thermal-simulation\/star-ccm\/"},{"key":"e_1_3_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3600006.3613145"},{"key":"e_1_3_2_1_59_1","volume-title":"SRNIC: A Scalable Architecture for RDMA NICs. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23)","author":"Wang Zilong","year":"2023","unstructured":"Zilong Wang, Layong Luo, Qingsong Ning, Chaoliang Zeng, Wenxue Li, Xinchen Wan, Peng Xie, Tao Feng, Ke Cheng, Xiongfei Geng, Tianhao Wang, Weicheng Ling, Kejia Huo, Pingbo An, Kui Ji, Shideng Zhang, Bin Xu, Ruiqing Feng, Tao Ding, Kai Chen, and Chuanxiong Guo. 2023. SRNIC: A Scalable Architecture for RDMA NICs. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23). USENIX Association, Boston, MA, 1\u201314. https:\/\/www.usenix.org\/conference\/nsdi23\/presentation\/wang-zilong"},{"key":"e_1_3_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/3603269.3604859"},{"key":"e_1_3_2_1_61_1","volume-title":"KRCORE: A Microsecond-scale RDMA Control Plane for Elastic Computing. In 2022 USENIX Annual Technical Conference (USENIX ATC 22)","author":"Wei Xingda","year":"2022","unstructured":"Xingda Wei, Fangming Lu, Rong Chen, and Haibo Chen. 2022. KRCORE: A Microsecond-scale RDMA Control Plane for Elastic Computing. In 2022 USENIX Annual Technical Conference (USENIX ATC 22). USENIX Association, Carlsbad, CA, 121\u2013136. https:\/\/www.usenix.org\/conference\/atc22\/presentation\/wei"},{"key":"e_1_3_2_1_62_1","unstructured":"AWS Whitepaper. 2024. The components of the Nitro System. (2024). https:\/\/docs.aws.amazon.com\/whitepapers\/latest\/security-design-of-aws-nitro-system\/the-components-of-the-nitro-system.html"},{"volume-title":"The Case for Enterprise-Ready Virtual Private Clouds. In Workshop on Hot Topics in Cloud Computing (HotCloud 09)","author":"Wood Timothy","key":"e_1_3_2_1_63_1","unstructured":"Timothy Wood, Prashant Shenoy, Alexandre Gerber, Jacobus Van der Merwe, and K.K. Ramakrishnan. 2009. The Case for Enterprise-Ready Virtual Private Clouds. In Workshop on Hot Topics in Cloud Computing (HotCloud 09). USENIX Association, San Diego, CA. https:\/\/www.usenix.org\/conference\/hotcloud-09\/case-enterprise-ready-virtual-private-clouds"},{"key":"e_1_3_2_1_64_1","volume-title":"Justitia: Software Multi-Tenancy in Hardware Kernel-Bypass Networks. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22)","author":"Zhang Yiwen","year":"2022","unstructured":"Yiwen Zhang, Yue Tan, Brent Stephens, and Mosharaf Chowdhury. 2022. Justitia: Software Multi-Tenancy in Hardware Kernel-Bypass Networks. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22). USENIX Association, Renton, WA, 1307\u20131326. https:\/\/www.usenix.org\/conference\/nsdi22\/presentation\/zhang-yiwen"},{"key":"e_1_3_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/2829988.2787484"}],"event":{"name":"SIGCOMM '25: ACM SIGCOMM 2025 Conference","sponsor":["SIGCOMM ACM Special Interest Group on Data Communication"],"location":"S\u00e3o Francisco Convent Coimbra Portugal","acronym":"SIGCOMM '25"},"container-title":["Proceedings of the ACM SIGCOMM 2025 Conference"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3718958.3750496","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,27]],"date-time":"2025-08-27T17:02:11Z","timestamp":1756314131000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3718958.3750496"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,27]]},"references-count":65,"alternative-id":["10.1145\/3718958.3750496","10.1145\/3718958"],"URL":"https:\/\/doi.org\/10.1145\/3718958.3750496","relation":{},"subject":[],"published":{"date-parts":[[2025,8,27]]},"assertion":[{"value":"2025-08-27","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}