{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,24]],"date-time":"2025-11-24T21:12:33Z","timestamp":1764018753623,"version":"3.45.0"},"publisher-location":"New York, NY, USA","reference-count":52,"publisher":"ACM","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,10,28]]},"DOI":"10.1145\/3730567.3764494","type":"proceedings-article","created":{"date-parts":[[2025,11,21]],"date-time":"2025-11-21T15:22:38Z","timestamp":1763738558000},"page":"944-951","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Congestion Patterns in a Large-scale RDMA Datacenter"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1331-1372","authenticated-orcid":false,"given":"Soudeh","family":"Ghorbani","sequence":"first","affiliation":[{"name":"Meta, New York, New York, USA and Johns Hopkins University, Baltimore, Maryland, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-1791-6325","authenticated-orcid":false,"given":"Yimeng","family":"Zhao","sequence":"additional","affiliation":[{"name":"Meta, New York, New York, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-6230-9396","authenticated-orcid":false,"given":"Srikanth","family":"Sundaresan","sequence":"additional","affiliation":[{"name":"Meta, Boston, Massachusetts, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2736-5694","authenticated-orcid":false,"given":"Ying","family":"Zhang","sequence":"additional","affiliation":[{"name":"Meta, Menlo Park, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5244-5034","authenticated-orcid":false,"given":"Yijing","family":"Zeng","sequence":"additional","affiliation":[{"name":"Meta, New York, New York, 
USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-0368-9829","authenticated-orcid":false,"given":"Abhigyan","family":"Sharma","sequence":"additional","affiliation":[{"name":"Meta, New York, New York, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-4794-9971","authenticated-orcid":false,"given":"Prashanth","family":"Kannan","sequence":"additional","affiliation":[{"name":"Meta, Menlo Park, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-8528-602X","authenticated-orcid":false,"given":"Cristian","family":"Lumezanu","sequence":"additional","affiliation":[{"name":"Meta, New York, New York, USA"}]}],"member":"320","published-online":{"date-parts":[[2025,11,21]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"crossref","unstructured":"Sepehr Abdous Erfan Sharafzadeh and Soudeh Ghorbani. 2021. Burst-tolerant datacenter networks with Vertigo. In CoNEXT.","DOI":"10.1145\/3485983.3494873"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"crossref","unstructured":"Sepehr Abdous Erfan Sharafzadeh and Soudeh Ghorbani. 2023. Practical Packet Deflection in Datacenters. In CoNEXT.","DOI":"10.1145\/3629147"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3544216.3544252"},{"key":"e_1_3_2_1_4_1","volume-title":"Harmony: A congestion-free datacenter architecture. In NSDI.","author":"Agarwal Saksham","year":"2024","unstructured":"Saksham Agarwal, Qizhe Cai, Rachit Agarwal, David Shmoys, and Amin Vahdat. 2024. Harmony: A congestion-free datacenter architecture. In NSDI."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"crossref","unstructured":"Saksham Agarwal Arvind Krishnamurthy and Rachit Agarwal. 2023. Host congestion control. In SIGCOMM.","DOI":"10.1145\/3603269.3604878"},{"key":"e_1_3_2_1_6_1","unstructured":"A Aggarwal S Savage and T Anderson. 2000. Understanding the performance of TCP pacing. 
In INFOCOM."},{"key":"e_1_3_2_1_7_1","volume-title":"Francis Matus, Rong Pan, Navindra Yadav, and George Varghese.","author":"Alizadeh Mohammad","year":"2014","unstructured":"Mohammad Alizadeh, Tom Edsall, Sarang Dharmapurikar, Ramanan Vaidyanathan, Kevin Chu, Andy Fingerhut, Vinh The Lam, Francis Matus, Rong Pan, Navindra Yadav, and George Varghese. 2014. CONGA: Distributed congestion-aware load balancing for datacenters. In SIGCOMM."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"crossref","unstructured":"Mohammad Alizadeh Albert Greenberg David A Maltz Jitendra Padhye Parveen Patel Balaji Prabhakar Sudipta Sengupta and Murari Sridharan. 2010. Data Center TCP (DCTCP). In SIGCOMM.","DOI":"10.1145\/1851182.1851192"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"crossref","unstructured":"Mohammad Alizadeh Shuang Yang Milad Sharif Sachin Katti Nick McKeown Balaji Prabhakar and Scott Shenker. 2013. pFabric: Minimal near-optimal datacenter transport. In SIGCOMM.","DOI":"10.1145\/2486001.2486031"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"crossref","unstructured":"Theophilus Benson Aditya Akella and David A Maltz. 2010a. Network traffic characteristics of data centers in the wild. In IMC.","DOI":"10.1145\/1879141.1879175"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1672308.1672325"},{"key":"e_1_3_2_1_12_1","volume-title":"Soheil Hassas Yeganeh, and Van Jacobson","author":"Cardwell Neal","year":"2016","unstructured":"Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Soheil Hassas Yeganeh, and Van Jacobson. 2016. BBR: Congestion-based congestion control. ACM Queue (2016)."},{"key":"e_1_3_2_1_13_1","volume-title":"Yaron Koral, Jennifer Rexford, Ori Rottenstreich, Steven A Monetti, and Tzuu-Yi Wang.","author":"Chen Xiaoqi","year":"2019","unstructured":"Xiaoqi Chen, Shir Landau Feibish, Yaron Koral, Jennifer Rexford, Ori Rottenstreich, Steven A Monetti, and Tzuu-Yi Wang. 2019. Fine-grained queue measurement in the data plane. 
In CoNEXT."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"crossref","unstructured":"Inho Cho Keon Jang and Dongsu Han. 2017. Credit-scheduled delay-bounded congestion control for datacenters. In SIGCOMM.","DOI":"10.1145\/3098822.3098840"},{"key":"e_1_3_2_1_15_1","unstructured":"Abhishek Dhamija Balasubramanian Madhavan Hechao Li Jie Meng Shrikrishna Khare Madhavi Rao Lawrence Brakmo Neil Spring Prashanth Kannan Srikanth Sundaresan et al. 2024. A large-scale deployment of DCTCP. In NSDI."},{"key":"e_1_3_2_1_16_1","volume-title":"Vikram Subramanya, Yeshaiahu Fainman, George Papen, and Amin Vahdat.","author":"Farrington Nathan","year":"2010","unstructured":"Nathan Farrington, George Porter, Sivasankar Radhakrishnan, Hamid Hajabdolali Bazzaz, Vikram Subramanya, Yeshaiahu Fainman, George Papen, and Amin Vahdat. 2010. Helios: A hybrid electrical\/optical switch architecture for modular datacenters. In SIGCOMM."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"crossref","unstructured":"S. Floyd and V. Jacobson. 1994. The synchronization of periodic routing messages. IEEE\/ACM Transactions on Networking (1994).","DOI":"10.1109\/90.298431"},{"key":"e_1_3_2_1_18_1","volume-title":"Guilherme Goes, Hany Morsy, Rohit Puri, Mohammad Riftadi, Ashmitha Jeevaraj Shetty, Jingyi Yang, et al.","author":"Gangidi Adithya","year":"2024","unstructured":"Adithya Gangidi, Rui Miao, Shengbao Zheng, Sai Jayesh Bondu, Guilherme Goes, Hany Morsy, Rohit Puri, Mohammad Riftadi, Ashmitha Jeevaraj Shetty, Jingyi Yang, et al., 2024. RDMA over Ethernet for distributed training at Meta scale. In SIGCOMM."},{"key":"e_1_3_2_1_19_1","unstructured":"Peter X Gao Akshay Narayan Gautam Kumar Rachit Agarwal Sylvia Ratnasamy and Scott Shenker. 2015. pHost: Distributed near-optimal datacenter transport over commodity network fabric. In CoNEXT."},{"key":"e_1_3_2_1_20_1","unstructured":"Yixiao Gao Qiang Li Lingbo Tang Yongqing Xi Pengcheng Zhang Wenwen Peng Bo Li Yaohui Wu Shaozong Liu Lei Yan et al. 2021. 
When cloud storage meets RDMA. In NSDI."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"crossref","unstructured":"Ehab Ghabashneh Yimeng Zhao Cristian Lumezanu Neil Spring Srikanth Sundaresan and Sanjay Rao. 2022. A microscopic view of bursts buffer contention and loss in data centers. In IMC.","DOI":"10.1145\/3517745.3561430"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3098822.3098839"},{"key":"e_1_3_2_1_23_1","volume-title":"Zhehua Wu, Sunghwan Yoo, et al.","author":"Gibson Dan","year":"2022","unstructured":"Dan Gibson, Hema Hariharan, Eric Lance, Moray McLaren, Behnam Montazeri, Arjun Singh, Stephen Wang, Hassan MG Wassel, Zhehua Wu, Sunghwan Yoo, et al., 2022. Aquila: A unified, low-latency fabric for datacenter networks. In NSDI."},{"key":"e_1_3_2_1_24_1","unstructured":"Chuanxiong Guo Haitao Wu Zhong Deng Gaurav Soni Jianxi Ye Jitu Padhye and Marina Lipshteyn. 2016. RDMA over commodity Ethernet at scale. In SIGCOMM."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"crossref","unstructured":"Mark Handley Costin Raiciu Alexandru Agache Andrei Voinescu Andrew W Moore Gianni Antichi and Marcin W\u00f3jcik. 2017. Re-architecting datacenter networks and stacks for low latency and high performance. In SIGCOMM.","DOI":"10.1145\/3098822.3098825"},{"key":"e_1_3_2_1_26_1","volume-title":"Presto: Edge-based load balancing for fast datacenter networks. In SIGCOMM.","author":"He Keqiang","year":"2015","unstructured":"Keqiang He, Eric Rozner, Kanak Agarwal, Wes Felter, John Carter, and Aditya Akella. 2015. Presto: Edge-based load balancing for fast datacenter networks. In SIGCOMM."},{"key":"e_1_3_2_1_27_1","volume-title":"Aeolus: A building block for proactive transport in datacenters. In SIGCOMM.","author":"Hu Shuihai","year":"2020","unstructured":"Shuihai Hu, Wei Bai, Gaoxiong Zeng, Zilong Wang, Baochen Qiao, Kai Chen, Kun Tan, and Yi Wang. 2020. Aeolus: A building block for proactive transport in datacenters. 
In SIGCOMM."},{"key":"e_1_3_2_1_28_1","unstructured":"Ziheng Jiang Haibin Lin Yinmin Zhong Qi Huang Yangrui Chen Zhi Zhang Yanghua Peng Xiang Li Cong Xie Shibiao Nong et al. 2024. MegaScale: Scaling large language model training to more than 10 000 GPUs. In NSDI."},{"key":"e_1_3_2_1_29_1","volume-title":"Ben Leong, and Boon Thau Loo.","author":"Joshi Raj","year":"2018","unstructured":"Raj Joshi, Ting Qu, Mun Choon Chan, Ben Leong, and Boon Thau Loo. 2018. BurstRadar: Practical real-time microburst monitoring for datacenter networks. In APSys."},{"key":"e_1_3_2_1_30_1","unstructured":"Srikanth Kandula Jitu Padhye and Victor Bahl. 2009a. Flyways to de-congest datacenter networks. In HotNets."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"crossref","unstructured":"Srikanth Kandula Sudipta Sengupta Albert Greenberg Parveen Patel and Ronnie Chaiken. 2009b. The nature of data center traffic: Measurements & analysis. In IMC.","DOI":"10.1145\/1644893.1644918"},{"key":"e_1_3_2_1_32_1","volume-title":"Xian Wu, Behnam Montazeri, Yaogong Wang, Kevin Springborn, Christopher Alfeld, Michael Ryan, David Wetherall, and Amin Vahdat.","author":"Kumar Gautam","year":"2020","unstructured":"Gautam Kumar, Nandita Dukkipati, Keon Jang, Hassan M G Wassel, Xian Wu, Behnam Montazeri, Yaogong Wang, Kevin Springborn, Christopher Alfeld, Michael Ryan, David Wetherall, and Amin Vahdat. 2020. Swift: Delay is simple and effective for congestion control in the datacenter. In SIGCOMM."},{"key":"e_1_3_2_1_33_1","first-page":"859","article-title":"Prevention of deadlocks and livelocks in lossless, backpressured packet networks","volume":"6","author":"Lee David","year":"2005","unstructured":"David Lee, S Jamaloddin Golestani, and Mark John Karol. 2005. Prevention of deadlocks and livelocks in lossless, backpressured packet networks. US Patent 6,859,435.","journal-title":"US Patent"},{"key":"e_1_3_2_1_34_1","unstructured":"Hwijoon Lim Wei Bai Yibo Zhu Youngmok Jung and Dongsu Han. 2021. 
Towards timeout-less transport in commodity datacenter networks. In EuroSys."},{"key":"e_1_3_2_1_35_1","unstructured":"Teng Ma Tao Ma Zhuo Song Jingxuan Li Huaixin Chang Kang Chen Hai Jiang and Yongwei Wu. 2019. X-RDMA: Effective RDMA middleware in large-scale production environments. In CLUSTER."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3230543.3230564"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"crossref","unstructured":"S Narayana A Sivaraman V Nathan P Goyal et al. 2017. Language-directed hardware design for network performance monitoring. In SIGCOMM.","DOI":"10.1145\/3098822.3098829"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2619239.2626309"},{"key":"e_1_3_2_1_39_1","unstructured":"Pawan Prakash Advait Dixit Y Charlie Hu and Ramana Kompella. 2012. The TCP outcast problem: Exposing unfairness in data center networks. In NSDI."},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"crossref","unstructured":"Kun Qian Yongqing Xi Jiamin Cao Jiaqi Gao Yichi Xu Yu Guan Binzhang Fu Xuemei Shi Fangbo Zhu Rui Miao et al. 2024. Alibaba HPN: A datacenter network for large language model training. In SIGCOMM.","DOI":"10.1145\/3651890.3672265"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"crossref","unstructured":"Arjun Roy Hongyi Zeng Jasmeet Bagga George Porter and Alex C Snoeren. 2015. Inside the social network's (datacenter) network. In SIGCOMM.","DOI":"10.1145\/2785956.2787472"},{"key":"e_1_3_2_1_42_1","unstructured":"Erfan Sharafzadeh Sepehr Abdous and Soudeh Ghorbani. 2023. Understanding the impact of host networking elements on traffic bursts. In NSDI."},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/2785956.2787508"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"crossref","unstructured":"Brent Stephens Alan L Cox Ankit Singla John Carter Colin Dixon and Wesley Felter. 2014. Practical DCB for improved data center networks. 
In INFOCOM.","DOI":"10.1109\/INFOCOM.2014.6848121"},{"key":"e_1_3_2_1_45_1","unstructured":"Erico Vanini Rong Pan Mohammad Alizadeh Parvin Taheri and Tom Edsall. 2017. Let it flow: Resilient asymmetric load balancing with flowlet switching. In NSDI."},{"key":"e_1_3_2_1_46_1","unstructured":"Weiyang Wang Moein Khazraee Zhizhen Zhong Manya Ghobadi Zhihao Jia Dheevatsa Mudigere Ying Zhang and Anthony Kewitsch. 2023. TopoOpt: Co-optimizing network topology and parallelization strategy for distributed training jobs. In NSDI."},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3387514.3405870"},{"key":"e_1_3_2_1_48_1","unstructured":"Zhuolong Yu Chuheng Hu Jingfeng Wu Xiao Sun Vladimir Braverman Mosharaf Chowdhury Zhenhua Liu and Xin Jin. 2021. Programmable packet scheduling with a single queue. In SIGCOMM."},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2592798.2592806"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"crossref","unstructured":"Qiao Zhang Vincent Liu Hongyi Zeng and Arvind Krishnamurthy. 2017. High resolution measurement of data center microbursts. In IMC.","DOI":"10.1145\/3131365.3131375"},{"key":"e_1_3_2_1_51_1","volume-title":"Mohamad Haj Yahia, and Ming Zhang","author":"Zhu Yibo","year":"2015","unstructured":"Yibo Zhu, Haggai Eran, Daniel Firestone, Chuanxiong Guo, Marina Lipshteyn, Yehonatan Liron, Jitendra Padhye, Shachar Raindel, Mohamad Haj Yahia, and Ming Zhang. 2015. Congestion control for large-scale RDMA deployments. In SIGCOMM."},{"key":"e_1_3_2_1_52_1","unstructured":"Yazhou Zu Alireza Ghaffarkhah Hoang-Vu Dang Brian Towles Steven Hand Safeen Huda Adekunle Bello Alexander Kolbasov Arash Rezaei Dayou Du et al. 2024. Resiliency at scale: Managing Google's TPUv4 machine learning supercomputer. 
In NSDI."}],"event":{"name":"IMC '25:ACM Internet Measurement Conference","sponsor":["SIGMETRICS ACM Special Interest Group on Measurement and Evaluation","SIGCOMM ACM Special Interest Group on Data Communication"],"location":"Madison WI USA"},"container-title":["Proceedings of the 2025 ACM Internet Measurement Conference"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3730567.3764494","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,24]],"date-time":"2025-11-24T21:06:31Z","timestamp":1764018391000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3730567.3764494"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,28]]},"references-count":52,"alternative-id":["10.1145\/3730567.3764494","10.1145\/3730567"],"URL":"https:\/\/doi.org\/10.1145\/3730567.3764494","relation":{},"subject":[],"published":{"date-parts":[[2025,10,28]]},"assertion":[{"value":"2025-11-21","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}