{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T14:39:44Z","timestamp":1775745584545,"version":"3.50.1"},"reference-count":63,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2024,6,14]],"date-time":"2024-06-14T00:00:00Z","timestamp":1718323200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Model. Comput. Simul."],"published-print":{"date-parts":[[2024,7,31]]},"abstract":"<jats:p>\n            Reinforcement Learning (RL) has gained significant momentum in the development of network protocols. However, RL-based protocols are still in their infancy, and substantial research is required to build deployable solutions. Developing a protocol based on RL is a complex and challenging process that involves several model design decisions and requires significant training and evaluation in real and simulated network topologies. Network simulators offer an efficient training environment for RL-based protocols because they are deterministic and can run in parallel. In this article, we introduce\n            <jats:italic>RayNet<\/jats:italic>\n            , a scalable and adaptable simulation platform for the development of RL-based network protocols. RayNet integrates OMNeT++, a fully programmable network simulator, with Ray\/RLlib, a scalable training platform for distributed RL. RayNet facilitates the methodical development of RL-based network protocols so that researchers can focus on the problem at hand and not on implementation details of the learning aspect of their research. 
We developed a simple RL-based congestion control approach as a proof of concept showcasing that RayNet can be a valuable platform for RL-based research in computer networks, enabling scalable training and evaluation. We compared RayNet with\n            <jats:italic>ns3-gym<\/jats:italic>\n            , a platform with similar objectives to RayNet, and showed that RayNet performs better in terms of how fast agents can collect experience in RL environments.\n          <\/jats:p>","DOI":"10.1145\/3653975","type":"journal-article","created":{"date-parts":[[2024,3,30]],"date-time":"2024-03-30T09:24:44Z","timestamp":1711790684000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["RayNet: A Simulation Platform for Developing Reinforcement Learning-Driven Network Protocols"],"prefix":"10.1145","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1535-7558","authenticated-orcid":false,"given":"Luca","family":"Giacomoni","sequence":"first","affiliation":[{"name":"University of Sussex, Brighton, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4168-6961","authenticated-orcid":false,"given":"Basil","family":"Benny","sequence":"additional","affiliation":[{"name":"University of Sussex, Brighton UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1298-7143","authenticated-orcid":false,"given":"George","family":"Parisis","sequence":"additional","affiliation":[{"name":"University of Sussex, Brighton UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,6,14]]},"reference":[{"key":"e_1_3_2_2_2","first-page":"632","volume-title":"Proceedings of ACM SIGCOMM","author":"Abbasloo Soheil","year":"2020","unstructured":"Soheil Abbasloo, Chen-Yu Yen, and H. Jonathan Chao. 2020. 
Classic meets modern: A pragmatic learning-based congestion control for the Internet. In Proceedings of ACM SIGCOMM. 632\u2013647."},{"issue":"3","key":"e_1_3_2_3_2","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1109\/90.929853","article-title":"TCP-Peach: A new congestion control scheme for satellite IP networks","volume":"9","author":"Akyildiz Ian F.","year":"2001","unstructured":"Ian F. Akyildiz, Giacomo Morabito, and Sergio Palazzo. 2001. TCP-Peach: A new congestion control scheme for satellite IP networks. IEEE\/ACM Transactions on Networking 9, 3 (2001), 307\u2013321.","journal-title":"IEEE\/ACM Transactions on Networking"},{"key":"e_1_3_2_4_2","first-page":"63","volume-title":"Proceedings of ACM SIGCOMM","author":"Alizadeh M.","year":"2010","unstructured":"M. Alizadeh, A. Greenberg, D. A. Maltz, J. Padhye, P. Patel, B. Prabhakar, S. Sengupta, and M. Sridharan. 2010. Data center TCP (DCTCP). In Proceedings of ACM SIGCOMM. 63\u201374."},{"issue":"8","key":"e_1_3_2_5_2","doi-asserted-by":"crossref","first-page":"8638","DOI":"10.1109\/TITS.2023.3250320","article-title":"VeSoNet: Traffic-aware content caching for vehicular social networks using deep reinforcement learning","volume":"24","author":"Aung N.","year":"2023","unstructured":"N. Aung, S. Dhelim, L. Chen, A. Lakas, W. Zhang, H. Ning, S. Chaib, and M. T. Kechadi. 2023. VeSoNet: Traffic-aware content caching for vehicular social networks using deep reinforcement learning. IEEE Transactions on Intelligent Transportation Systems 24, 8 (2023), 8638\u20138649.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"issue":"5","key":"e_1_3_2_6_2","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TSMC.1983.6313077","article-title":"Neuronlike adaptive elements that can solve difficult learning control problems","author":"Barto Andrew G.","year":"1983","unstructured":"Andrew G. Barto, Richard S. Sutton, and Charles W. Anderson. 1983. 
Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics 5 (1983), 834\u2013846.","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics"},{"key":"e_1_3_2_7_2","first-page":"24","volume-title":"Proceedings of ACM SIGCOMM","author":"Brakmo Lawrence S.","year":"1994","unstructured":"Lawrence S. Brakmo, Sean W. O\u2019Malley, and Larry L. Peterson. 1994. TCP Vegas: New techniques for congestion detection and avoidance. In Proceedings of ACM SIGCOMM. 24\u201335."},{"key":"e_1_3_2_8_2","unstructured":"G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba. 2016. OpenAI Gym. arXiv:1606.01540"},{"key":"e_1_3_2_9_2","first-page":"395","volume-title":"Proceedings of USENIX NSDI","author":"Dong Mo","year":"2015","unstructured":"Mo Dong, Qingxi Li, Doron Zarchy, P. Brighten Godfrey, and Michael Schapira. 2015. PCC: Re-architecting congestion control for consistent high performance. In Proceedings of USENIX NSDI. 395\u2013408."},{"key":"e_1_3_2_10_2","first-page":"343","volume-title":"Proceedings of USENIX NSDI","author":"Dong Mo","year":"2018","unstructured":"Mo Dong, Tong Meng, Doron Zarchy, Engin Arslan, Yossi Gilad, Brighten Godfrey, and Michael Schapira. 2018. PCC Vivace: Online-learning congestion control. In Proceedings of USENIX NSDI. 343\u2013356."},{"issue":"3","key":"e_1_3_2_11_2","doi-asserted-by":"crossref","first-page":"360","DOI":"10.1109\/TSMCA.2005.846390","article-title":"Using feedback in collaborative reinforcement learning to adaptively optimize MANET routing","volume":"35","author":"Dowling J.","year":"2005","unstructured":"J. Dowling, E. Curran, R. Cunningham, and V. Cahill. 2005. Using feedback in collaborative reinforcement learning to adaptively optimize MANET routing. 
IEEE Transactions on Systems, Man, and Cybernetics \u2014 Part A: Systems and Humans 35, 3 (2005), 360\u2013372.","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics \u2014 Part A: Systems and Humans"},{"key":"e_1_3_2_12_2","article-title":"Implementing reinforcement learning datacenter congestion control in NVIDIA NICs","author":"Fuhrer Benjamin","year":"2022","unstructured":"Benjamin Fuhrer, Yuval Shpigelman, Chen Tessler, Shie Mannor, Gal Chechik, Eitan Zahavi, and Gal Dalal. 2022. Implementing reinforcement learning datacenter congestion control in NVIDIA NICs. arXiv preprint arXiv:2207.02295 (2022).","journal-title":"arXiv preprint arXiv:2207.02295"},{"key":"e_1_3_2_13_2","volume-title":"Proceedings of ACM MSWIM","author":"Gaw\u0142owicz Piotr","year":"2019","unstructured":"Piotr Gaw\u0142owicz and Anatolij Zubow. 2019. ns-3 meets OpenAI Gym: The playground for machine learning in networking research. In Proceedings of ACM MSWIM."},{"issue":"5","key":"e_1_3_2_14_2","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1145\/1400097.1400105","article-title":"CUBIC: A new TCP-friendly high-speed TCP variant","volume":"42","author":"Ha Sangtae","year":"2008","unstructured":"Sangtae Ha, Injong Rhee, and Lisong Xu. 2008. CUBIC: A new TCP-friendly high-speed TCP variant. ACM SIGOPS Operating Systems Review 42, 5 (2008), 64\u201374.","journal-title":"ACM SIGOPS Operating Systems Review"},{"key":"e_1_3_2_15_2","first-page":"1861","volume-title":"Proceedings of ICML","author":"Haarnoja Tuomas","year":"2018","unstructured":"Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. 2018. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In Proceedings of ICML. 
1861\u20131870."},{"issue":"1","key":"e_1_3_2_16_2","first-page":"44","article-title":"Integrated networking, caching, and computing for connected vehicles: A deep reinforcement learning approach","volume":"67","author":"He Ying","year":"2017","unstructured":"Ying He, Nan Zhao, and Hongxi Yin. 2017. Integrated networking, caching, and computing for connected vehicles: A deep reinforcement learning approach. IEEE Transactions on Vehicular Technology 67, 1 (2017), 44\u201355.","journal-title":"IEEE Transactions on Vehicular Technology"},{"key":"e_1_3_2_17_2","article-title":"Distributed prioritized experience replay","author":"Horgan D.","year":"2018","unstructured":"D. Horgan, J. Quan, D. Budden, G. Barth-Maron, M. Hessel, H. Van Hasselt, and D. Silver. 2018. Distributed prioritized experience replay. arXiv preprint arXiv:1803.00933 (2018).","journal-title":"arXiv preprint arXiv:1803.00933"},{"key":"e_1_3_2_18_2","first-page":"1208","volume-title":"Proceedings of ACM Multimedia","author":"Huang Tianchi","year":"2018","unstructured":"Tianchi Huang, Rui-Xiao Zhang, Chao Zhou, and Lifeng Sun. 2018. QARC: Video quality aware rate control for real-time video streaming based on deep reinforcement learning. In Proceedings of ACM Multimedia. 1208\u20131216."},{"key":"e_1_3_2_19_2","first-page":"3050","volume-title":"Proceedings of ICML","author":"Jay Nathan","year":"2019","unstructured":"Nathan Jay, Noga Rotman, Brighten Godfrey, Michael Schapira, and Aviv Tamar. 2019. A deep reinforcement learning perspective on Internet congestion control. In Proceedings of ICML. 3050\u20133059."},{"issue":"3","key":"e_1_3_2_20_2","doi-asserted-by":"crossref","first-page":"1610","DOI":"10.1109\/TWC.2019.2894403","article-title":"Multi-agent reinforcement learning for efficient content caching in mobile D2D networks","volume":"18","author":"Jiang Wei","year":"2019","unstructured":"Wei Jiang, Gang Feng, Shuang Qin, Tak Shing Peter Yum, and Guohong Cao. 2019. 
Multi-agent reinforcement learning for efficient content caching in mobile D2D networks. IEEE Transactions on Wireless Communications 18, 3 (2019), 1610\u20131622.","journal-title":"IEEE Transactions on Wireless Communications"},{"key":"e_1_3_2_21_2","doi-asserted-by":"crossref","first-page":"142","DOI":"10.1016\/j.adhoc.2018.05.015","article-title":"Internet congestion control using the power metric: Keep the pipe just full, but no fuller","volume":"80","author":"Kleinrock Leonard","year":"2018","unstructured":"Leonard Kleinrock. 2018. Internet congestion control using the power metric: Keep the pipe just full, but no fuller. Ad Hoc Networks 80 (2018), 142\u2013157.","journal-title":"Ad Hoc Networks"},{"key":"e_1_3_2_22_2","first-page":"710","volume-title":"Proceedings of IEEE ICC","author":"Kliazovich Dzmitry","year":"2006","unstructured":"Dzmitry Kliazovich, Fabrizio Granelli, and Daniele Miorandi. 2006. TCP Westwood+ enhancement in high-speed long-distance networks. In Proceedings of IEEE ICC. 710\u2013715."},{"issue":"4","key":"e_1_3_2_23_2","doi-asserted-by":"crossref","first-page":"326","DOI":"10.1109\/JCN.2020.000022","article-title":"Self-adaptive power control with deep reinforcement learning for millimeter-wave Internet-of-Vehicles video caching","volume":"22","author":"Kwon Dohyun","year":"2020","unstructured":"Dohyun Kwon, Joongheon Kim, David A. Mohaisen, and Wonjun Lee. 2020. Self-adaptive power control with deep reinforcement learning for millimeter-wave Internet-of-Vehicles video caching. Journal of Communications and Networks 22, 4 (2020), 326\u2013337.","journal-title":"Journal of Communications and Networks"},{"key":"e_1_3_2_24_2","first-page":"1","volume-title":"Proceedings of IEEE ICC","author":"Lan Dehao","year":"2019","unstructured":"Dehao Lan, Xiaobin Tan, Jinyang Lv, Yang Jin, and Jian Yang. 2019. A deep reinforcement learning based congestion Control Mechanism for NDN. In Proceedings of IEEE ICC. 
1\u20137."},{"key":"e_1_3_2_25_2","first-page":"3053","volume-title":"Proceedings of ICML","author":"Liang E.","year":"2018","unstructured":"E. Liang, R. Liaw, R. Nishihara, P. Moritz, R. Fox, K. Goldberg, J. Gonzalez, M. Jordan, and I. Stoica. 2018. RLlib: Abstractions for distributed reinforcement learning. In Proceedings of ICML. 3053\u20133062."},{"key":"e_1_3_2_26_2","article-title":"Continuous control with deep reinforcement learning","author":"Lillicrap T. P.","year":"2015","unstructured":"T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra. 2015. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015).","journal-title":"arXiv preprint arXiv:1509.02971"},{"key":"e_1_3_2_27_2","doi-asserted-by":"crossref","first-page":"55916","DOI":"10.1109\/ACCESS.2019.2913776","article-title":"Reinforcement learning based routing in networks: Review and classification of approaches","volume":"7","author":"Mammeri Zoubir","year":"2019","unstructured":"Zoubir Mammeri. 2019. Reinforcement learning based routing in networks: Review and classification of approaches. IEEE Access 7 (2019), 55916\u201355950.","journal-title":"IEEE Access"},{"key":"e_1_3_2_28_2","first-page":"197","volume-title":"Proceedings of ACM SIGCOMM","author":"Mao Hongzi","year":"2017","unstructured":"Hongzi Mao, Ravi Netravali, and Mohammad Alizadeh. 2017. Neural adaptive video streaming with Pensieve. In Proceedings of ACM SIGCOMM. 197\u2013210."},{"key":"e_1_3_2_29_2","first-page":"287","volume-title":"Proceedings of ACM MobiCom","author":"Mascolo Saverio","year":"2001","unstructured":"Saverio Mascolo, Claudio Casetti, Mario Gerla, Medy Y. Sanadidi, and Ren Wang. 2001. TCP Westwood: Bandwidth estimation for enhanced transport over wireless links. In Proceedings of ACM MobiCom. 
287\u2013297."},{"issue":"12","key":"e_1_3_2_30_2","doi-asserted-by":"crossref","first-page":"6262","DOI":"10.1109\/TSP.2011.2165211","article-title":"Fast Reinforcement Learning for Energy-Efficient Wireless Communication","volume":"59","author":"Mastronarde Nicholas","year":"2011","unstructured":"Nicholas Mastronarde and Mihaela van der Schaar. 2011. Fast Reinforcement Learning for Energy-Efficient Wireless Communication. IEEE Transactions on Signal Processing 59, 12 (2011), 6262\u20136266.","journal-title":"IEEE Transactions on Signal Processing"},{"issue":"4","key":"e_1_3_2_31_2","doi-asserted-by":"crossref","first-page":"537","DOI":"10.1145\/2829988.2787510","article-title":"TIMELY: RTT-based congestion control for the datacenter","volume":"45","author":"Mittal R.","year":"2015","unstructured":"R. Mittal, V. T. Lam, N. Dukkipati, E. Blem, H. Wassel, M. Ghobadi, A. Vahdat, Y. Wang, D. Wetherall, and D. Zats. 2015. TIMELY: RTT-based congestion control for the datacenter. ACM SIGCOMM Computer Communication Review 45, 4 (2015), 537\u2013550.","journal-title":"ACM SIGCOMM Computer Communication Review"},{"key":"e_1_3_2_32_2","article-title":"Playing Atari with deep reinforcement learning","author":"Mnih V.","year":"2013","unstructured":"V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller. 2013. Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (2013).","journal-title":"arXiv preprint arXiv:1312.5602"},{"key":"e_1_3_2_33_2","first-page":"561","volume-title":"Proceedings of USENIX OSDI","author":"Moritz Philipp","year":"2018","unstructured":"Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I. Jordan, and Ion Stoica. 2018. Ray: A distributed framework for emerging AI applications. In Proceedings of USENIX OSDI. 
561\u2013577."},{"issue":"1","key":"e_1_3_2_34_2","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1109\/TWC.2018.2879433","article-title":"Deep multi-user reinforcement learning for distributed dynamic spectrum access","volume":"18","author":"Naparstek Oshri","year":"2018","unstructured":"Oshri Naparstek and Kobi Cohen. 2018. Deep multi-user reinforcement learning for distributed dynamic spectrum access. IEEE Transactions on Wireless Communications 18, 1 (2018), 310\u2013323.","journal-title":"IEEE Transactions on Wireless Communications"},{"key":"e_1_3_2_35_2","first-page":"969","volume-title":"Proceedings of IEEE\/CIC ICCC","author":"Nasehzadeh Ali","year":"2020","unstructured":"Ali Nasehzadeh and Ping Wang. 2020. A deep reinforcement learning-based caching strategy for Internet of Things. In Proceedings of IEEE\/CIC ICCC. 969\u2013974."},{"key":"e_1_3_2_36_2","first-page":"417","volume-title":"Proceedings of USENIX ATC","author":"Netravali R.","year":"2015","unstructured":"R. Netravali, A. Sivaraman, S. Das, A. Goyal, K. Winstein, J. Mickens, and H. Balakrishnan. 2015. Mahimahi: Accurate Record-and-Replay for HTTP. In Proceedings of USENIX ATC. 417\u2013429."},{"key":"e_1_3_2_37_2","article-title":"Deep reinforcement learning for cyber security","author":"Nguyen Thanh Thi","year":"2019","unstructured":"Thanh Thi Nguyen and Vijay Janapa Reddi. 2019. Deep reinforcement learning for cyber security. IEEE Transactions on Neural Networks and Learning Systems (2019).","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"issue":"4","key":"e_1_3_2_38_2","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1145\/2578901","article-title":"Multipath TCP","volume":"57","author":"Paasch Christoph","year":"2014","unstructured":"Christoph Paasch and Olivier Bonaventure. 2014. Multipath TCP. Commun. ACM 57, 4 (2014), 51\u201357.","journal-title":"Commun. 
ACM"},{"issue":"1","key":"e_1_3_2_39_2","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1109\/JIOT.2019.2945640","article-title":"Deep reinforcement learning for cooperative content caching in vehicular edge computing and networks","volume":"7","author":"Qiao Guanhua","year":"2020","unstructured":"Guanhua Qiao, Supeng Leng, Sabita Maharjan, Yan Zhang, and Nirwan Ansari. 2020. Deep reinforcement learning for cooperative content caching in vehicular edge computing and networks. IEEE Internet of Things Journal 7, 1 (2020), 247\u2013257.","journal-title":"IEEE Internet of Things Journal"},{"key":"e_1_3_2_40_2","first-page":"1","volume-title":"Proceedings of IEEE INFOCOM","author":"Sacco Alessio","year":"2021","unstructured":"Alessio Sacco, Matteo Flocco, Flavio Esposito, and Guido Marchetto. 2021. Owl: Congestion control with partially invisible networks via reinforcement learning. In Proceedings of IEEE INFOCOM. 1\u201310."},{"key":"e_1_3_2_41_2","article-title":"Proximal policy optimization algorithms","author":"Schulman John","year":"2017","unstructured":"John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. 2017. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017).","journal-title":"arXiv preprint arXiv:1707.06347"},{"key":"e_1_3_2_42_2","doi-asserted-by":"crossref","first-page":"3548","DOI":"10.1109\/ICC.2005.1495079","volume-title":"IEEE International Conference on Communications, 2005 (ICC 2005). 2005","volume":"5","author":"Shimonishi Hideyuki","year":"2005","unstructured":"Hideyuki Shimonishi, MY Sanadidi, and Mario Gerla. 2005. Improving efficiency-friendliness tradeoffs of TCP in wired-wireless combined networks. In IEEE International Conference on Communications, 2005 (ICC 2005). 2005, Vol. 5. 
IEEE, 3548\u20133552."},{"issue":"4","key":"e_1_3_2_43_2","doi-asserted-by":"crossref","first-page":"479","DOI":"10.1145\/2740070.2626324","article-title":"An experimental study of the learnability of congestion control","volume":"44","author":"Sivaraman Anirudh","year":"2014","unstructured":"Anirudh Sivaraman, Keith Winstein, Pratiksha Thaker, and Hari Balakrishnan. 2014. An experimental study of the learnability of congestion control. ACM SIGCOMM Computer Communication Review 44, 4 (2014), 479\u2013490.","journal-title":"ACM SIGCOMM Computer Communication Review"},{"key":"e_1_3_2_44_2","article-title":"Compound TCP: A scalable and TCP-friendly congestion control for high-speed networks","author":"Song Kun Tan Jingmin","year":"2006","unstructured":"Kun Tan, Jingmin Song, Qian Zhang, and Murari Sridharan. 2006. Compound TCP: A scalable and TCP-friendly congestion control for high-speed networks. Proceedings of PFLDnet 2006 (2006).","journal-title":"Proceedings of PFLDnet 2006"},{"key":"e_1_3_2_45_2","article-title":"A deep-reinforcement learning approach for software-defined networking routing optimization","author":"Stampa Giorgio","year":"2017","unstructured":"Giorgio Stampa, Marta Arias, David S\u00e1nchez-Charles, Victor Munt\u00e9s-Mulero, and Albert Cabellos. 2017. A deep-reinforcement learning approach for software-defined networking routing optimization. arXiv preprint arXiv:1709.07080 (2017).","journal-title":"arXiv preprint arXiv:1709.07080"},{"key":"e_1_3_2_46_2","first-page":"88","volume-title":"Proceedings of ACM SIGCOMM (Posters and Demos)","author":"Sun Penghao","year":"2019","unstructured":"Penghao Sun, Junfei Li, Zehua Guo, Yang Xu, Julong Lan, and Yuxiang Hu. 2019. SINET: Enabling scalable network routing with deep reinforcement learning on partial nodes. In Proceedings of ACM SIGCOMM (Posters and Demos). 
88\u201389."},{"key":"e_1_3_2_47_2","volume-title":"Reinforcement Learning: An Introduction","author":"Sutton Richard S.","year":"2018","unstructured":"Richard S. Sutton and Andrew G. Barto. 2018. Reinforcement Learning: An Introduction. MIT Press."},{"issue":"5","key":"e_1_3_2_48_2","doi-asserted-by":"crossref","first-page":"1031","DOI":"10.1109\/TNET.2006.883130","article-title":"REFWA: An efficient and fair congestion control scheme for LEO satellite networks","volume":"14","author":"Taleb Tarik","year":"2006","unstructured":"Tarik Taleb, Nei Kato, and Yoshiaki Nemoto. 2006. REFWA: An efficient and fair congestion control scheme for LEO satellite networks. IEEE\/ACM Transactions on Networking 14, 5 (2006), 1031\u20131044.","journal-title":"IEEE\/ACM Transactions on Networking"},{"key":"e_1_3_2_49_2","first-page":"12615","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"36","author":"Tessler Chen","year":"2022","unstructured":"Chen Tessler, Yuval Shpigelman, Gal Dalal, Amit Mandelbaum, Doron Haritan Kazakov, Benjamin Fuhrer, Gal Chechik, and Shie Mannor. 2022. Reinforcement learning for datacenter congestion control. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36. 12615\u201312621."},{"issue":"11","key":"e_1_3_2_50_2","doi-asserted-by":"crossref","first-page":"8693","DOI":"10.1109\/JIOT.2020.3040957","article-title":"Reinforcement learning for IoT security: A comprehensive survey","volume":"8","author":"Uprety Aashma","year":"2020","unstructured":"Aashma Uprety and Danda B. Rawat. 2020. Reinforcement learning for IoT security: A comprehensive survey. 
IEEE Internet of Things Journal 8, 11 (2020), 8693\u20138706.","journal-title":"IEEE Internet of Things Journal"},{"key":"e_1_3_2_51_2","first-page":"1","volume-title":"Proceedings of the 1st International Conference on Simulation Tools and Techniques for Communications, Networks and Systems & Workshops","author":"Varga Andr\u00e1s","year":"2008","unstructured":"Andr\u00e1s Varga and Rudolf Hornig. 2008. An overview of the OMNeT++ simulation environment. In Proceedings of the 1st International Conference on Simulation Tools and Techniques for Communications, Networks and Systems & Workshops. 1\u201310."},{"key":"e_1_3_2_52_2","first-page":"1","volume-title":"Proceedings of IEEE INFOCOM","author":"Wang Qi","year":"2021","unstructured":"Qi Wang, Jianmin Liu, Katia Jaffr\u00e8s-Runser, Yongqing Wang, Chentao He, Cunzhuang Liu, and Yongjun Xu. 2021. INCdeep: Intelligent network coding with deep reinforcement learning. In Proceedings of IEEE INFOCOM. 1\u201310."},{"issue":"2","key":"e_1_3_2_53_2","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1109\/TCCN.2018.2809722","article-title":"Deep reinforcement learning for dynamic multichannel access in wireless networks","volume":"4","author":"Wang Shangxing","year":"2018","unstructured":"Shangxing Wang, Hanpeng Liu, Pedro Henrique Gomes, and Bhaskar Krishnamachari. 2018. Deep reinforcement learning for dynamic multichannel access in wireless networks. IEEE Transactions on Cognitive Communications and Networking 4, 2 (2018), 257\u2013265.","journal-title":"IEEE Transactions on Cognitive Communications and Networking"},{"issue":"4","key":"e_1_3_2_54_2","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1145\/2534169.2486020","article-title":"TCP ex machina: Computer-generated congestion control","volume":"43","author":"Winstein Keith","year":"2013","unstructured":"Keith Winstein and Hari Balakrishnan. 2013. TCP ex machina: Computer-generated congestion control. 
ACM SIGCOMM Computer Communication Review 43, 4 (2013), 123\u2013134.","journal-title":"ACM SIGCOMM Computer Communication Review"},{"issue":"2","key":"e_1_3_2_55_2","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1109\/TNET.2012.2197411","article-title":"ICTCP: Incast Congestion Control for TCP in Data-Center Networks","volume":"21","author":"Wu Haitao","year":"2013","unstructured":"Haitao Wu, Zhenqian Feng, Chuanxiong Guo, and Yongguang Zhang. 2013. ICTCP: Incast Congestion Control for TCP in Data-Center Networks. IEEE\/ACM Transactions on Networking 21, 2 (2013), 345\u2013358.","journal-title":"IEEE\/ACM Transactions on Networking"},{"key":"e_1_3_2_56_2","volume-title":"Proceedings of CSAE","author":"Xu Chunlei","year":"2020","unstructured":"Chunlei Xu, Weijin Zhuang, and Hong Zhang. 2020. A deep-reinforcement learning approach for SDN routing optimization. In Proceedings of CSAE."},{"issue":"6","key":"e_1_3_2_57_2","doi-asserted-by":"crossref","first-page":"1325","DOI":"10.1109\/JSAC.2019.2904358","article-title":"Experience-driven congestion control: When multi-path TCP meets deep reinforcement learning","volume":"37","author":"Xu Zhiyuan","year":"2019","unstructured":"Zhiyuan Xu, Jian Tang, Chengxiang Yin, Yanzhi Wang, and Guoliang Xue. 2019. Experience-driven congestion control: When multi-path TCP meets deep reinforcement learning. IEEE Journal on Selected Areas in Communications 37, 6 (2019), 1325\u20131336.","journal-title":"IEEE Journal on Selected Areas in Communications"},{"issue":"1","key":"e_1_3_2_58_2","first-page":"360","article-title":"An actor-critic-based transfer learning framework for experience-driven networking","volume":"29","author":"Xu Zhiyuan","year":"2021","unstructured":"Zhiyuan Xu, Dejun Yang, Jian Tang, Yinan Tang, Tongtong Yuan, Yanzhi Wang, and Guoliang Xue. 2021. An actor-critic-based transfer learning framework for experience-driven networking. 
IEEE\/ACM Transactions on Networking 29, 1 (2021), 360\u2013371.","journal-title":"IEEE\/ACM Transactions on Networking"},{"key":"e_1_3_2_59_2","doi-asserted-by":"crossref","first-page":"64533","DOI":"10.1109\/ACCESS.2018.2877686","article-title":"DROM: Optimizing the routing in software-defined networks with deep reinforcement learning","volume":"6","author":"Yu Changhe","year":"2018","unstructured":"Changhe Yu, Julong Lan, Zehua Guo, and Yuxiang Hu. 2018. DROM: Optimizing the routing in software-defined networks with deep reinforcement learning. IEEE Access 6 (2018), 64533\u201364539.","journal-title":"IEEE Access"},{"issue":"11","key":"e_1_3_2_60_2","doi-asserted-by":"crossref","first-page":"7208","DOI":"10.1109\/TITS.2020.3003163","article-title":"A hybrid of deep reinforcement learning and local search for the vehicle routing problems","volume":"22","author":"Zhao Jiuxia","year":"2021","unstructured":"Jiuxia Zhao, Minjia Mao, Xi Zhao, and Jianhua Zou. 2021. A hybrid of deep reinforcement learning and local search for the vehicle routing problems. IEEE Transactions on Intelligent Transportation Systems 22, 11 (2021), 7208\u20137218.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"issue":"4","key":"e_1_3_2_61_2","doi-asserted-by":"crossref","first-page":"88","DOI":"10.1109\/MCOM.2019.1800603","article-title":"Routing for crowd management in smart cities: A deep reinforcement learning perspective","volume":"57","author":"Zhao Lei","year":"2019","unstructured":"Lei Zhao, Jiadai Wang, Jiajia Liu, and Nei Kato. 2019. Routing for crowd management in smart cities: A deep reinforcement learning perspective. 
IEEE Communications Magazine 57, 4 (2019), 88\u201393.","journal-title":"IEEE Communications Magazine"},{"issue":"1","key":"e_1_3_2_62_2","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1109\/TCCN.2020.2968326","article-title":"Deep reinforcement learning-based edge caching in wireless networks","volume":"6","author":"Zhong Chen","year":"2020","unstructured":"Chen Zhong, M. Cenk Gursoy, and Senem Velipasalar. 2020. Deep reinforcement learning-based edge caching in wireless networks. IEEE Transactions on Cognitive Communications and Networking 6, 1 (2020), 48\u201361.","journal-title":"IEEE Transactions on Cognitive Communications and Networking"},{"issue":"6","key":"e_1_3_2_63_2","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1109\/MNET.2018.1800109","article-title":"Deep reinforcement learning for mobile edge caching: Review, new features, and open issues","volume":"32","author":"Zhu Hao","year":"2018","unstructured":"Hao Zhu, Yang Cao, Wei Wang, Tao Jiang, and Shi Jin. 2018. Deep reinforcement learning for mobile edge caching: Review, new features, and open issues. IEEE Network 32, 6 (2018), 50\u201357.","journal-title":"IEEE Network"},{"issue":"2","key":"e_1_3_2_64_2","doi-asserted-by":"crossref","first-page":"2074","DOI":"10.1109\/JIOT.2018.2882583","article-title":"Caching transient data for Internet of Things: A deep reinforcement learning approach","volume":"6","author":"Zhu Hao","year":"2019","unstructured":"Hao Zhu, Yang Cao, Xiao Wei, Wei Wang, Tao Jiang, and Shi Jin. 2019. Caching transient data for Internet of Things: A deep reinforcement learning approach. 
IEEE Internet of Things Journal 6, 2 (2019), 2074\u20132083.","journal-title":"IEEE Internet of Things Journal"}],"container-title":["ACM Transactions on Modeling and Computer Simulation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3653975","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3653975","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:03:36Z","timestamp":1750291416000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3653975"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,14]]},"references-count":63,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,7,31]]}},"alternative-id":["10.1145\/3653975"],"URL":"https:\/\/doi.org\/10.1145\/3653975","relation":{},"ISSN":["1049-3301","1558-1195"],"issn-type":[{"value":"1049-3301","type":"print"},{"value":"1558-1195","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6,14]]},"assertion":[{"value":"2022-12-14","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-01-16","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-06-14","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}