{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,29]],"date-time":"2025-09-29T08:20:25Z","timestamp":1759134025806,"version":"3.41.0"},"reference-count":32,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2018,8,30]],"date-time":"2018-08-30T00:00:00Z","timestamp":1535587200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research","award":["DE-AC02-06CH11357"],"award-info":[{"award-number":["DE-AC02-06CH11357"]}]},{"DOI":"10.13039\/100006602","name":"Air Force Research Laboratory","doi-asserted-by":"crossref","award":["FA8750-15-2-0078"],"award-info":[{"award-number":["FA8750-15-2-0078"]}],"id":[{"id":"10.13039\/100006602","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Model. Comput. Simul."],"published-print":{"date-parts":[[2018,10,31]]},"abstract":"<jats:p>As supercomputers approach exascale performance, the increased number of processors translates to an increased demand on the underlying network interconnect. The slim fly network topology, a new low-diameter, low-latency, and low-cost interconnection network, is gaining interest as one possible solution for next-generation supercomputing interconnect systems. In this article, we present a high-fidelity slim fly packet-level model leveraging the Rensselaer Optimistic Simulation System (ROSS) and Co-Design of Exascale Storage (CODES) frameworks. We validate the model with published work before scaling the network size up to an unprecedented 1 million compute nodes and confirming that the slim fly observes peak network throughput at extreme scale. 
In addition to synthetic workloads, we evaluate large-scale slim fly models with real communication workloads from applications in the Design Forward program with over 110,000 MPI processes. We show strong scaling of the slim fly model on an Intel cluster achieving a peak network packet transfer rate of 2.3 million packets per second and processing over 7 billion discrete events using 128 MPI tasks. Enabled by the strong performance capabilities of the model, we perform a detailed application trace and routing protocol performance study. Through analysis of metrics such as packet latency, hop count, and congestion, we find that the slim fly network is able to leverage simple minimal routing and achieve the same performance as more complex adaptive routing for tested DOE benchmark applications.<\/jats:p>","DOI":"10.1145\/3203406","type":"journal-article","created":{"date-parts":[[2018,9,4]],"date-time":"2018-09-04T12:37:30Z","timestamp":1536064650000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Modeling Large-Scale Slim Fly Networks Using Parallel Discrete-Event Simulation"],"prefix":"10.1145","volume":"28","author":[{"given":"Noah","family":"Wolfe","sequence":"first","affiliation":[{"name":"Rensselaer Polytechnic Institute, Troy, NY"}]},{"given":"Misbah","family":"Mubarak","sequence":"additional","affiliation":[{"name":"Argonne National Laboratory, Lemont, IL"}]},{"given":"Christopher D.","family":"Carothers","sequence":"additional","affiliation":[{"name":"Rensselaer Polytechnic Institute, Troy, NY"}]},{"given":"Robert B.","family":"Ross","sequence":"additional","affiliation":[{"name":"Argonne National Laboratory, Lemont, IL"}]},{"given":"Philip H.","family":"Carns","sequence":"additional","affiliation":[{"name":"Argonne National Laboratory, Lemont, IL"}]}],"member":"320","published-online":{"date-parts":[[2018,8,30]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Parallel 
Processing Workshops (Euro-Par\u201915)","volume":"9523","author":"Acun Bilge","unstructured":"Bilge Acun, Nikhil Jain, Abhinav Bhatele, Misbah Mubarak, Christopher D. Carothers, and Laxmikant V. Kale. 2015. Preliminary evaluation of a parallel trace replay tool for HPC network simulations. In Parallel Processing Workshops (Euro-Par\u201915), Sascha Hunold, Alexandru Costan, Domingo Gim\u00e9nez, Alexandru Iosup, Laura Ricci, Mar\u00eda Engracia G\u00f3mez Requena, Vittorio Scarano, Ana Lucia Varbanescu, Stephen L. Scott, Stefan Lankes, Josef Weidendorfer, and Michael Alexander (Eds.). Lecture Notes in Computer Science, Vol. 9523. 
Springer International Publishing, 417--429."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/215399.215427"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2486092.2486134"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2014.34"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.5555\/336146.336157"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0743-7315(02)00004-7"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/347823.347828"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.127260"},{"key":"e_1_2_1_9_1","volume-title":"Accessed","author":"Department of Energy.","year":"2012","unstructured":"Department of Energy. 2012. Design Forward - Exascale Initiative. Retrieved from http:\/\/www.exascaleinitiative.org\/design-forward. Accessed Dec. 31, 2015."},{"key":"e_1_2_1_10_1","unstructured":"Department of Energy. 2015. AMR Box Lib. Retrieved from https:\/\/ccse.lbl.gov\/BoxLib\/."},{"volume-title":"2016 IEEE International Conference on Cluster Computing (CLUSTER\u201916)","author":"Groves Taylor","key":"e_1_2_1_11_1","unstructured":"Taylor Groves, Ryan E. Grant, Scott Hemmer, Simon Hammond, Michael Levenhagen, and Dorian C. Arnold. 2016. (SAI) stalled, active and idle: Characterizing power and performance of large-scale dragonfly networks. 
In 2016 IEEE International Conference on Cluster Computing (CLUSTER\u201916). 50--59."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jctb.2003.07.002"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2807591.2807652"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1394608.1382129"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2769458.2769474"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1006\/jctb.1998.1828"},{"key":"e_1_2_1_18_1","volume-title":"Moore graphs and beyond: A survey of the degree\/diameter problem. Electronic Journal of Combinatorics {Electronic Only} DS14","author":"Miller Mirka","year":"2005","unstructured":"Mirka Miller and Jozef Siran. 2005. Moore graphs and beyond: A survey of the degree\/diameter problem. Electronic Journal of Combinatorics {Electronic Only} DS14 (2005), Retrieved from http:\/\/eudml.org\/doc\/125462."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.4108\/ICST.SIMUTOOLS2009.5521"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.Companion.2012.56"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601381.2601383"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2016.2543725"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/151261.151266"},{"key":"e_1_2_1_25_1","unstructured":"Michael Papka Paul Messina Richard Coffey and Cristina Drugan. 2015. Argonne Leadership Computing Facility 2014 Annual Report. 
Retrieved from https:\/\/www.alcf.anl.gov\/files\/alcfannreport2014.pdf."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1964218.1964225"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1810085.1810120"},{"key":"e_1_2_1_28_1","series-title":"Lecture Notes in Computer Science","volume-title":"A case for epidemic fault detection and group membership in HPC storage systems","author":"Snyder Shane","unstructured":"Shane Snyder, Philip Carns, Jonathan Jenkins, Kevin Harms, Robert Ross, Misbah Mubarak, and Christopher Carothers. 2015a. A case for epidemic fault detection and group membership in HPC storage systems. In High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, Stephen A. Jarvis, Steven A. Wright, and Simon D. Hammond (Eds.). Lecture Notes in Computer Science, Vol. 8966. 
Springer International Publishing, 237--248."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2832087.2832091"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1137\/0211027"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.5555\/1416222.1416290"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1006\/jpdc.1996.0099"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2901378.2901389"},{"volume-title":"International Symposium on Cluster, Cloud and Grid Computing (CCGrid\u201917)","author":"Wolfe Noah","key":"e_1_2_1_34_1","unstructured":"Noah Wolfe, Misbah Mubarak, Nikhil Jain, Jens Domke, Abhinav Bhatele, Christopher D. Carothers, and Robert B. Ross. 2017. Methods for effective utilization of multi-rail fat-tree interconnects. 
In International Symposium on Cluster, Cloud and Grid Computing (CCGrid\u201917)."}],"container-title":["ACM Transactions on Modeling and Computer Simulation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3203406","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3203406","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:39:23Z","timestamp":1750210763000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3203406"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,8,30]]},"references-count":32,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2018,10,31]]}},"alternative-id":["10.1145\/3203406"],"URL":"https:\/\/doi.org\/10.1145\/3203406","relation":{},"ISSN":["1049-3301","1558-1195"],"issn-type":[{"type":"print","value":"1049-3301"},{"type":"electronic","value":"1558-1195"}],"subject":[],"published":{"date-parts":[[2018,8,30]]},"assertion":[{"value":"2017-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-03-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-08-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}