{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,14]],"date-time":"2026-02-14T10:23:48Z","timestamp":1771064628859,"version":"3.50.1"},"reference-count":46,"publisher":"Association for Computing Machinery (ACM)","issue":"CoNEXT4","license":[{"start":{"date-parts":[[2024,11,25]],"date-time":"2024-11-25T00:00:00Z","timestamp":1732492800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Netw."],"published-print":{"date-parts":[[2024,12]]},"abstract":"<jats:p>Together with the construction of RDMA networks for data center applications, the RDMA-coupled DCQCN dominates the RDMA Congestion Control (CC). However, DCQCN suffers severe performance problems in high-speed RDMA networks with modern high-performance distributed applications such as machine learning training. This paper presents RECC, inspired by both the latest emerging programmability of RDMA NICs (RNICs) and limitations in existing RDMA congestion control mechanisms. RECC comprehensively leverages RTT and ECN events from RNICs to handle congestion timely and precisely, along with a History-aware Burst Smooth mechanism to avoid wrong rate decisions under various traffic patterns. We implement RECC completely based on commercial RNICs without any modifications to switches, RDMA protocol stack, and applications. The results of microbenchmark testbed experiments and real Machine Learning (ML) workload experiments with hundreds of 200G RNICs show that RECC can significantly reduce network tail latency and pause duration by up to 64.4% and 95%, respectively, compared with DCQCN. 
In addition, large-scale simulations with realistic workloads demonstrate that RECC achieves comparable performance with HPCC.<\/jats:p>","DOI":"10.1145\/3696402","type":"journal-article","created":{"date-parts":[[2024,11,25]],"date-time":"2024-11-25T11:15:47Z","timestamp":1732533347000},"page":"1-18","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["RECC: Joint Congestion Control Based on RTT and ECN for High-speed RDMA Networks"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0009-0000-1673-8104","authenticated-orcid":false,"given":"Zirui","family":"Wan","sequence":"first","affiliation":[{"name":"Beijing University of Posts and Telecommunications, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5614-3420","authenticated-orcid":false,"given":"Jiao","family":"Zhang","sequence":"additional","affiliation":[{"name":"Beijing University of Posts and Telecommunications, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-4093-3187","authenticated-orcid":false,"given":"Haoran","family":"Wei","sequence":"additional","affiliation":[{"name":"ByteDance China, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6144-7899","authenticated-orcid":false,"given":"Zhuo","family":"Jiang","sequence":"additional","affiliation":[{"name":"ByteDance China, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4985-1645","authenticated-orcid":false,"given":"Xiaolong","family":"Zhong","sequence":"additional","affiliation":[{"name":"ByteDance China, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1357-3137","authenticated-orcid":false,"given":"Wenfei","family":"Wu","sequence":"additional","affiliation":[{"name":"Peking University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-1620-4110","authenticated-orcid":false,"given":"Huaping","family":"Zhou","sequence":"additional","affiliation":[{"name":"ByteDance China, Beijing, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7718-0669","authenticated-orcid":false,"given":"Tian","family":"Pan","sequence":"additional","affiliation":[{"name":"Beijing University of Posts and Telecommunications, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3545-1122","authenticated-orcid":false,"given":"Tao","family":"Huang","sequence":"additional","affiliation":[{"name":"Beijing University of Posts and Telecommunications, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2024,11,25]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"December","year":"2021","unstructured":"2021. Versa Technology Inc. 400G Ethernet: It's Here, and It's Huge, December 2021. www.versatek.com\/400gethernet-its-here-and-its-huge\/"},{"key":"e_1_2_1_2_1","volume-title":"PowerTCP: Pushing the Performance Limits of Datacenter Networks. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22)","author":"Addanki Vamsi","year":"2022","unstructured":"Vamsi Addanki, Oliver Michel, and Stefan Schmid. 2022. PowerTCP: Pushing the Performance Limits of Datacenter Networks. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22). 51--70."},{"key":"e_1_2_1_3_1","unstructured":"Wenxue Cheng Kun Qian Wanchun Jiang Tong Zhang and Fengyuan Ren. 2020. Re-architecting Congestion Management in Lossless Ethernet. In NSDI. 19--36."},{"key":"e_1_2_1_4_1","volume-title":"EFLOPS: Algorithm and System Co-design for a High Performance Distributed Training Platform. In HPCA. 610--622.","author":"Dong Jianbo","year":"2020","unstructured":"Jianbo Dong, Zheng Cao, Tao Zhang, Jianxi Ye, Shaochuang Wang, Fei Feng, Li Zhao, Xiaoyong Liu, Liuyihan Song, Liwei Peng, et al. 2020. EFLOPS: Algorithm and System Co-design for a High Performance Distributed Training Platform. In HPCA. 
610--622."},{"key":"e_1_2_1_5_1","first-page":"249","article-title":"Network Requirements for Resource Disaggregation","volume":"16","author":"Gao Peter Xiang","year":"2016","unstructured":"Peter Xiang Gao, Akshay Narayan, Sagar Karandikar, Joao Carreira, Sangjin Han, Rachit Agarwal, Sylvia Ratnasamy, and Scott Shenker. 2016. Network Requirements for Resource Disaggregation. In OSDI, Vol. 16. 249--264.","journal-title":"OSDI"},{"key":"e_1_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Peter X Gao Akshay Narayan Gautam Kumar Rachit Agarwal Sylvia Ratnasamy and Scott Shenker. 2015. phost: Distributed Near-optimal Datacenter Transport over Commodity Network Fabric. In CoNEXT. 1--12.","DOI":"10.1145\/2716281.2836086"},{"key":"e_1_2_1_7_1","unstructured":"Yixiao Gao Qiang Li Lingbo Tang Yongqing Xi Pengcheng Zhang Wenwen Peng Bo Li Yaohui Wu Shaozong Liu Lei Yan et al. 2021. When Cloud Storage Meets RDMA. In NSDI. 519--533."},{"key":"e_1_2_1_8_1","unstructured":"IETF Group. 2009. IEEE 802.1 Qbb - Priority-based Flow Control. https:\/\/1.ieee802.org\/dcb\/802--1qbb\/."},{"key":"e_1_2_1_9_1","doi-asserted-by":"crossref","unstructured":"Chuanxiong Guo Haitao Wu Zhong Deng Gaurav Soni Jianxi Ye Jitu Padhye and Marina Lipshteyn. 2016. RDMA over Commodity Ethernet at Scale. In SIGCOMM. 202--215.","DOI":"10.1145\/2934872.2934908"},{"key":"e_1_2_1_10_1","doi-asserted-by":"crossref","unstructured":"Mark Handley Costin Raiciu Alexandru Agache Andrei Voinescu Andrew W Moore Gianni Antichi and Marcin W\u00f3jcik. 2017. Re-architecting Datacenter Networks and Stacks for Low Latency and High Performance. In SIGCOMM. 29--42.","DOI":"10.1145\/3098822.3098825"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_12_1","volume-title":"Zhi Li Zhang, and Kai Zheng","author":"He Zhiqiang","year":"2020","unstructured":"Zhiqiang He, Dongyang Wang, Binzhang Fu, Kun Tan, Bei Hua, Zhi Li Zhang, and Kai Zheng. 2020. 
MasQ: RDMA for Virtual Private Cloud. In SIGCOMM. 1--14."},{"key":"e_1_2_1_13_1","volume-title":"Zixuan Wang, Yi Xu, Subramanya R Dulloor, Jishen Zhao, and Steven Swanson.","author":"Izraelevitz Joseph","year":"2019","unstructured":"Joseph Izraelevitz, Jian Yang, Lu Zhang, Juno Kim, Xiao Liu, Amirsaman Memaripour, Yun Joon Soh, Zixuan Wang, Yi Xu, Subramanya R Dulloor, Jishen Zhao, and Steven Swanson. 2019. Basic Performance Measurements of the Intel Optane DC Persistent Memory Module. arXiv preprint arXiv:1903.05714 (2019)."},{"key":"e_1_2_1_14_1","volume-title":"2019 USENIX Annual Technical Conference (USENIX ATC 19)","author":"Jeon Myeongjae","year":"2019","unstructured":"Myeongjae Jeon, Shivaram Venkataraman, Amar Phanishayee, Junjie Qian, Wencong Xiao, and Fan Yang. 2019. Analysis of Large-Scale Multi-Tenant GPU clusters for DNN training workloads. In 2019 USENIX Annual Technical Conference (USENIX ATC 19). 947--960."},{"key":"e_1_2_1_15_1","unstructured":"Xianyan Jia Shutao Song Wei He Yangzihao Wang Haidong Rong Feihu Zhou Liqiang Xie Zhenyu Guo Yuanzhou Yang Liwei Yu et al. 2018. Highly Scalable Deep Learning Training System with Mixed-Precision: Training Imagenet in Four Minutes. arXiv preprint arXiv:1807.11205 (2018)."},{"key":"e_1_2_1_16_1","unstructured":"Yimin Jiang Yibo Zhu Chang Lan Bairen Yi Yong Cui and Chuanxiong Guo. 2020. A Unified Architecture for Accelerating Distributed DNN Training in Heterogeneous GPU\/CPU Clusters. In OSDI. 463--479."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2901318.2901337"},{"key":"e_1_2_1_18_1","volume-title":"Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems 25","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. 
Advances in neural information processing systems 25 (2012)."},{"key":"e_1_2_1_19_1","volume-title":"Xian Wu, Behnam Montazeri, Yaogong Wang, Kevin Springborn, Christopher Alfeld, Michael Ryan, et al.","author":"Kumar Gautam","year":"2020","unstructured":"Gautam Kumar, Nandita Dukkipati, Keon Jang, Hassan MG Wassel, Xian Wu, Behnam Montazeri, Yaogong Wang, Kevin Springborn, Christopher Alfeld, Michael Ryan, et al. 2020. Swift: Delay is Simple and Effective for Congestion Control in the Datacenter. In SIGCOMM. 514--528."},{"key":"e_1_2_1_20_1","volume-title":"Yan Zhuang, Fei Feng, Lingbo Tang, Zheng Cao, Ming Zhang, Frank Kelly, Mohammad Alizadeh, et al.","author":"Li Yuliang","year":"2019","unstructured":"Yuliang Li, Rui Miao, Hongqiang Harry Liu, Yan Zhuang, Fei Feng, Lingbo Tang, Zheng Cao, Ming Zhang, Frank Kelly, Mohammad Alizadeh, et al. 2019. HPCC: High Precision Congestion Control. In SIGCOMM. 44--58."},{"key":"e_1_2_1_21_1","volume-title":"Themis: Fair and Efficient GPU Cluster Scheduling. In 17th USENIX Symposium on Networked Systems Design and Implementation.","author":"Mahajan Kshiteej","year":"2020","unstructured":"Kshiteej Mahajan, Arjun Balasubramanian, Arjun Singhvi, Shivaram Venkataraman, Aditya Akella, Amar Phanishayee, and Shuchi Chawla. 2020. Themis: Fair and Efficient GPU Cluster Scheduling. In 17th USENIX Symposium on Networked Systems Design and Implementation."},{"key":"e_1_2_1_22_1","unstructured":"Mellanox. 2020. Mellanox Adapters - Comparison Table. https:\/\/support.mellanox.com\/s\/article\/mellanox-adapters-- comparison-table."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2829988.2787510"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3230543.3230564"},{"key":"e_1_2_1_25_1","unstructured":"NVIDIA. 2020. ConnectX-6 DX. https:\/\/www.nvidia.com\/enus\/networking\/ethernet\/connectx-6-dx\/."},{"key":"e_1_2_1_26_1","unstructured":"NVIDIA. 2021. BlueField-2. 
https:\/\/resources.nvidia.com\/en-us-accelerated-networking-resource-library\/bluefield-2-dpu-datasheet?lx=LbHvpR&topic=networking-cloud."},{"key":"e_1_2_1_27_1","unstructured":"NVIDIA. 2022. ConnectX-7 400G Adapters. https:\/\/nvdam.widen.net\/s\/csf8rmnqwl\/infiniband-ethernet-datasheetconnectx-7-ds-nv-us-2544471."},{"key":"e_1_2_1_28_1","unstructured":"NVIDIA. 2022. nccl-tests. https:\/\/github.com\/nvidia\/nccl-tests."},{"key":"e_1_2_1_29_1","unstructured":"NVIDIA. 2023. DGX Platform. https:\/\/www.nvidia.com\/en-us\/data-center\/dgx-platform\/."},{"key":"e_1_2_1_30_1","doi-asserted-by":"crossref","unstructured":"Sreeram Potluri Khaled Hamidouche Akshay Venkatesh Devendar Bureddy and Dhabaleswar K Panda. 2013. Efficient Inter-node MPI Communication using GPUDirect RDMA for InfiniBand Clusters with NVIDIA GPUs. In ICPP. 80--89.","DOI":"10.1109\/ICPP.2013.17"},{"key":"e_1_2_1_31_1","volume-title":"CASSINI: Network-Aware Job Scheduling in Machine Learning Clusters. In NSDI 24.","author":"Rajasekaran Sudarsanan","year":"2024","unstructured":"Sudarsanan Rajasekaran, Manya Ghobadi, and Aditya Akella. 2024. CASSINI: Network-Aware Job Scheduling in Machine Learning Clusters. In NSDI 24."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3563766.3564115"},{"key":"e_1_2_1_33_1","volume-title":"20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23)","author":"Shah Aashaka","year":"2023","unstructured":"Aashaka Shah, Vijay Chidambaram, Meghan Cowan, Saeed Maleki, Madan Musuvathi, Todd Mytkowicz, Jacob Nelson, Olli Saarikivi, and Rachee Singh. 2023. TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23). 593--612."},{"key":"e_1_2_1_34_1","unstructured":"Jiaxin Shi Youyang Yao Rong Chen Haibo Chen and Feifei Li. 2016. Fast and Concurrent RDF Queries with RDMA-Based Distributed Graph Exploration. In OSDI. 
317--332."},{"key":"e_1_2_1_35_1","first-page":"428","article-title":"Programmable Congestion Control Communication Scheme","volume":"16","author":"Shpigelman Yuval","year":"2021","unstructured":"Yuval Shpigelman, Idan Burstein, Noam Bloch, Reut Zuck, and Roee Moyal. 2021. Programmable Congestion Control Communication Scheme. US Patent App. 16\/986,428.","journal-title":"US Patent App."},{"key":"e_1_2_1_36_1","volume-title":"Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_2_1_38_1","volume-title":"Tuning Target Delay for RTT-based Congestion Control. In 2022 IEEE 30th International Conference on Network Protocols (ICNP). IEEE, 1--11","author":"Tang Jian","year":"2022","unstructured":"Jian Tang, Tingting Xu, Camtu Nguyen, Xiaoliang Wang, Sanglu Lu, and Baoliu Ye. 2022. Tuning Target Delay for RTT-based Congestion Control. In 2022 IEEE 30th International Conference on Network Protocols (ICNP). IEEE, 1--11."},{"key":"e_1_2_1_39_1","volume-title":"10th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 18).","author":"Thomas Shelby","unstructured":"Shelby Thomas, Geoffrey M Voelker, and George Porter. 2018. CacheCloud: Towards Speed-of-light Datacenter Communication. In 10th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 18)."},{"key":"e_1_2_1_40_1","volume-title":"Exhaustive Study of Hierarchical AllReduce Patterns for Large Messages Between GPUs. In 2019 19th IEEE\/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID). IEEE, 430--439","author":"Ueno Yuichiro","year":"2019","unstructured":"Yuichiro Ueno and Rio Yokota. 2019. 
Exhaustive Study of Hierarchical AllReduce Patterns for Large Messages Between GPUs. In 2019 19th IEEE\/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID). IEEE, 430--439."},{"key":"e_1_2_1_41_1","doi-asserted-by":"crossref","unstructured":"Jilong Xue Youshan Miao Cheng Chen Ming Wu Lintao Zhang and Lidong Zhou. 2019. Fast Distributed Deep Learning over RDMA. In EuroSys. 1--14.","DOI":"10.1145\/3302424.3303975"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3106989.3107002"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2022.3161580"},{"key":"e_1_2_1_44_1","volume-title":"HierCC: Hierarchical RDMA Congestion Control. In 5th Asia-Pacific Workshop on Networking (APNet","author":"Zhang Jiao","year":"2021","unstructured":"Jiao Zhang, Yali Zhang, Zixuan Guan, Zirui Wan, Yinben Xia, Tian Pan, Tao Huang, Dezhi Tang, and Yun Lin. 2021. HierCC: Hierarchical RDMA Congestion Control. In 5th Asia-Pacific Workshop on Networking (APNet 2021). 29--36."},{"key":"e_1_2_1_45_1","volume-title":"PACC: Proactive and Accurate Congestion Feedback for RDMA Congestion Control. In IEEE INFOCOM 2022-IEEE Conference on Computer Communications. IEEE, 2228--2237","author":"Zhong Xiaolong","year":"2022","unstructured":"Xiaolong Zhong, Jiao Zhang, Yali Zhang, Zixuan Guan, and Zirui Wan. 2022. PACC: Proactive and Accurate Congestion Feedback for RDMA Congestion Control. In IEEE INFOCOM 2022-IEEE Conference on Computer Communications. 
IEEE, 2228--2237."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/2829988.2787484"}],"container-title":["Proceedings of the ACM on Networking"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3696402","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3696402","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,23]],"date-time":"2025-08-23T01:24:41Z","timestamp":1755912281000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3696402"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,25]]},"references-count":46,"journal-issue":{"issue":"CoNEXT4","published-print":{"date-parts":[[2024,12]]}},"alternative-id":["10.1145\/3696402"],"URL":"https:\/\/doi.org\/10.1145\/3696402","relation":{},"ISSN":["2834-5509"],"issn-type":[{"value":"2834-5509","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,11,25]]},"assertion":[{"value":"2024-11-25","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}