{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,25]],"date-time":"2026-01-25T04:35:28Z","timestamp":1769315728829,"version":"3.49.0"},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"14","license":[{"start":{"date-parts":[[2024,6,7]],"date-time":"2024-06-07T00:00:00Z","timestamp":1717718400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,6,7]],"date-time":"2024-06-07T00:00:00Z","timestamp":1717718400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"BBVA foundation and Becas Leonardo","award":["IN[20]_TIC_TIC_0042"],"award-info":[{"award-number":["IN[20]_TIC_TIC_0042"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Supercomput"],"published-print":{"date-parts":[[2024,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The InfiniBand (IB) interconnection technology is widely used in the networks of modern supercomputers and data centers. Among other advantages, the IB-based network devices allow for building multiple network topologies, and the IB control software (subnet manager) supports several routing engines suitable for the most common topologies. However, the implementation of some novel topologies in IB-based networks may be difficult if suitable routing algorithms are not supported, or if the IB switch or NIC architectures are not directly applicable for that topology. This work describes the implementation of the network topology known as KNS in a real HPC cluster using an IB network. As far as we know, this is the first implementation of this topology in an IB-based system. In more detail, we have implemented the KNS routing algorithm in the OpenSM software distribution of the subnet manager, and we have adapted the available IB-based switches to the particular structure of this topology. We have evaluated the correctness of our implementation through experiments in the real cluster, using well-known benchmarks. The obtained results, which match the expected performance for the KNS topology, show that this topology can be implemented in IB-based clusters as an alternative to other interconnection patterns.\n<\/jats:p>","DOI":"10.1007\/s11227-024-06214-6","type":"journal-article","created":{"date-parts":[[2024,6,7]],"date-time":"2024-06-07T11:03:02Z","timestamp":1717758182000},"page":"21306-21338","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Implementation and testing of a KNS topology in an InfiniBand cluster"],"prefix":"10.1007","volume":"80","author":[{"given":"Gabriel","family":"Gomez-Lopez","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jesus","family":"Escudero-Sahuquillo","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pedro J.","family":"Garcia","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Francisco J.","family":"Quiles","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,6,7]]},"reference":[{"key":"6214_CR1","doi-asserted-by":"publisher","first-page":"514","DOI":"10.1109\/ACCESS.2014.2325029","volume":"2","author":"X Chen","year":"2014","unstructured":"Chen X, Lin X (2014) Big data deep learning: challenges and perspectives. IEEE Access 2:514\u2013525. https:\/\/doi.org\/10.1109\/ACCESS.2014.2325029","journal-title":"IEEE Access"},{"key":"6214_CR2","unstructured":"Top500.org: Top 500 list. https:\/\/www.top500.org. Accessed 4 June 2024 (2024)"},{"key":"6214_CR3","doi-asserted-by":"publisher","unstructured":"Rocher-Gonzalez J, Escudero-Sahuquillo J, Garc\u00eda PJ, Quiles FJ (2017) On the Impact of Routing Algorithms in the Effectiveness of Queuing Schemes in High-Performance Interconnection Networks. In: 25th IEEE Annual Symposium on High-Performance Interconnects, HOTI 2017, Santa Clara, CA, USA, August 28-30, 2017, pp. 65\u201372. IEEE Computer Society, USA. https:\/\/doi.org\/10.1109\/HOTI.2017.16","DOI":"10.1109\/HOTI.2017.16"},{"key":"6214_CR4","doi-asserted-by":"publisher","first-page":"46","DOI":"10.1016\/J.JPDC.2020.07.009","volume":"147","author":"J Rocher-Gonzalez","year":"2021","unstructured":"Rocher-Gonzalez J, Escudero-Sahuquillo J, Garc\u00eda PJ, Quiles FJ, Mora G (2021) towards an efficient combination of adaptive routing and queuing schemes in fat-tree topologies. J Parallel Distrib Comput 147:46\u201363. https:\/\/doi.org\/10.1016\/J.JPDC.2020.07.009","journal-title":"J Parallel Distrib Comput"},{"key":"6214_CR5","doi-asserted-by":"publisher","unstructured":"Rocher-Gonz\u00e1lez J, Gran EG, Reinemo S, Skeie T, Escudero-Sahuquillo J, Garc\u00eda PJ, Flor FJQ (2022) Adaptive routing in InfiniBand Hardware. In: 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2022, Taormina, Italy, May 16\u201319, 463\u2013472. IEEE, USA (2022). https:\/\/doi.org\/10.1109\/CCGRID54584.2022.00056","DOI":"10.1109\/CCGRID54584.2022.00056"},{"key":"6214_CR6","volume-title":"Interconnection networks","author":"J Duato","year":"2003","unstructured":"Duato J, Yalamanchili S, Ni L (2003) Interconnection networks. Elsevier Science, San Francisco"},{"key":"6214_CR7","volume-title":"Principles and practices of interconnection networks","author":"WJ Dally","year":"2004","unstructured":"Dally WJ, Towles BP (2004) Principles and practices of interconnection networks. Elsevier, San Francisco"},{"issue":"10","key":"6214_CR8","doi-asserted-by":"publisher","first-page":"892","DOI":"10.1109\/TC.1985.6312192","volume":"34","author":"CE Leiserson","year":"1985","unstructured":"Leiserson CE (1985) Fat-trees: universal networks for hardware-efficient supercomputing. IEEE Trans Comput 34(10):892\u2013901. https:\/\/doi.org\/10.1109\/TC.1985.6312192","journal-title":"IEEE Trans Comput"},{"key":"6214_CR9","doi-asserted-by":"publisher","unstructured":"Singh A, Ong J, Agarwal A, Anderson G, Armistead A, Bannon R, Boving S, Desai G, Felderman B, Germano P, Kanagala A, Provost J, Simmons J, Tanda E, Wanderer J, H\u00f6lzle U, Stuart S, Vahdat A (2015) Jupiter rising: a decade of clos topologies and centralized control in Google\u2019s datacenter network. In: Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication. SIGCOMM \u201915, pp. 183\u2013197. Association for Computing Machinery, New York, NY, USA . https:\/\/doi.org\/10.1145\/2785956.2787508","DOI":"10.1145\/2785956.2787508"},{"issue":"1","key":"6214_CR10","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1109\/MM.2011.98","volume":"32","author":"Y Ajima","year":"2012","unstructured":"Ajima Y, Inoue T, Hiramoto S, Takagi Y, Shimizu T (2012) The Tofu Interconnect. IEEE Micro 32(1):21\u201331. https:\/\/doi.org\/10.1109\/MM.2011.98","journal-title":"IEEE Micro"},{"key":"6214_CR11","doi-asserted-by":"publisher","unstructured":"Rodriguez G, Minkenberg C, Beivide R, Luijten RP, Labarta J, Valero M (2009) Oblivious routing schemes in extended generalized fat tree networks. In: 2009 IEEE International Conference on Cluster Computing and Workshops, pp 1\u20138 . https:\/\/doi.org\/10.1109\/CLUSTR.2009.5289145","DOI":"10.1109\/CLUSTR.2009.5289145"},{"key":"6214_CR12","doi-asserted-by":"publisher","unstructured":"Desai N, Balaji P, Sadayappan P, Islam M (2008) Are nonblocking networks really needed for high-end-computing workloads? In: 2008 IEEE International Conference on Cluster Computing, pp 152\u2013159 . https:\/\/doi.org\/10.1109\/CLUSTR.2008.4663766","DOI":"10.1109\/CLUSTR.2008.4663766"},{"key":"6214_CR13","doi-asserted-by":"publisher","unstructured":"Kim J, Dally WJ, Scott S, Abts D (2008) Technology-driven, highly-scalable dragonfly topology. In: Proceedings of the 35th Annual International Symposium on Computer Architecture. ISCA \u201908, pp 77\u201388. IEEE Computer Society, USA . https:\/\/doi.org\/10.1109\/ISCA.2008.19","DOI":"10.1109\/ISCA.2008.19"},{"key":"6214_CR14","doi-asserted-by":"publisher","unstructured":"Flajslik M, Borch E, Parker MA (2018) MegaFly: a topology for Exascale systems. In: High performance computing: 33rd international conference, ISC High Performance 2018, Frankfurt, Germany, June 24\u201328, 2018, Proceedings 33, pp 289\u2013310. Springer, Cham . https:\/\/doi.org\/10.1007\/978-3-319-92040-5_15","DOI":"10.1007\/978-3-319-92040-5_15"},{"key":"6214_CR15","doi-asserted-by":"publisher","unstructured":"Pe\u00f1aranda R, G\u00f3mez C, G\u00f3mez ME, L\u00f3pez P, Duato J (2012) A new family of hybrid topologies for large-scale interconnection networks. In: 2012 IEEE 11th International Symposium on Network Computing and Applications, pp 220\u2013227 .https:\/\/doi.org\/10.1109\/NCA.2012.22","DOI":"10.1109\/NCA.2012.22"},{"key":"6214_CR16","doi-asserted-by":"publisher","unstructured":"Yebenes\u00a0Segura P, Escudero-Sahuquillo J, Gomez C, Garcia PJ, Quiles FJ, Duato J (2013) BBQ: a straightforward queuing scheme to reduce hol-blocking in high-performance hybrid networks. In: Euro-Par 2013 Parallel Processing: 19th International Conference, Aachen, Germany, August 26-30, 2013. Proceedings 19, pp 699\u2013712. Springer, Berlin. https:\/\/doi.org\/10.1007\/978-3-642-40047-6_70","DOI":"10.1007\/978-3-642-40047-6_70"},{"issue":"3","key":"6214_CR17","doi-asserted-by":"publisher","first-page":"1035","DOI":"10.1007\/s11227-016-1640-z","volume":"72","author":"R Pe\u00f1aranda","year":"2016","unstructured":"Pe\u00f1aranda R, G\u00f3mez C, G\u00f3mez ME, L\u00f3pez P, Duato J (2016) The k-ary n-direct s-indirect family of topologies for large-scale interconnection networks. J Supercomput 72(3):1035\u20131062. https:\/\/doi.org\/10.1007\/s11227-016-1640-z","journal-title":"J Supercomput"},{"key":"6214_CR18","volume-title":"InfiniBand network architecture","author":"T Shanley","year":"2003","unstructured":"Shanley T (2003) InfiniBand network architecture. Addison-Wesley, Boston"},{"key":"6214_CR19","unstructured":"Mellanox Technologies: Mellanox OFED for Linux User Manual. Mellanox OFED for Linux User Manual, Rev 2.0-3.0.0 ed., Sunnyvale, CA, USA (2013)"},{"issue":"2","key":"6214_CR20","doi-asserted-by":"publisher","first-page":"217","DOI":"10.1002\/cpe.1527","volume":"22","author":"E Zahavi","year":"2010","unstructured":"Zahavi E, Johnson G, Kerbyson DJ, Lang M (2010) Optimized InfiniBand$$^{TM}$$ fat-tree routing for shift all-to-all communication patterns. Concurr Comput Pract Exp 22(2):217\u2013231. https:\/\/doi.org\/10.1002\/cpe.1527","journal-title":"Concurr Comput Pract Exp"},{"key":"6214_CR21","doi-asserted-by":"publisher","unstructured":"Sullivan H, Bashkow TR (1977) A large scale, homogenous, fully distributed Parallel machine, I. In: Proceedings of the 4th Annual Symposium on Computer Architecture. ISCA \u201977, pp 105\u2013117. Association for Computing Machinery, New York, NY, USA . https:\/\/doi.org\/10.1145\/800255.810659","DOI":"10.1145\/800255.810659"},{"issue":"5","key":"6214_CR22","doi-asserted-by":"publisher","first-page":"547","DOI":"10.1109\/TC.1987.1676939","volume":"C\u201336","author":"WJ Dally","year":"1987","unstructured":"Dally WJ, Seitz CL (1987) Deadlock-free message routing in multiprocessor interconnection networks. IEEE Trans Comput C\u201336(5):547\u2013553. https:\/\/doi.org\/10.1109\/TC.1987.1676939","journal-title":"IEEE Trans Comput"},{"key":"6214_CR23","doi-asserted-by":"publisher","unstructured":"Hoefler T, Schneider T, Lumsdaine A (2009) Optimized routing for large-scale InfiniBand networks. In: 2009 17th IEEE Symposium on High Performance Interconnects, pp 103\u2013111 . https:\/\/doi.org\/10.1109\/HOTI.2009.9","DOI":"10.1109\/HOTI.2009.9"},{"key":"6214_CR24","doi-asserted-by":"publisher","unstructured":"Domke J, Hoefler T, Nagel WE (2011) Deadlock-free oblivious routing for arbitrary topologies. In: 2011 IEEE International Parallel & Distributed Processing Symposium, pp 616\u2013627 . https:\/\/doi.org\/10.1109\/IPDPS.2011.65","DOI":"10.1109\/IPDPS.2011.65"},{"key":"6214_CR25","doi-asserted-by":"crossref","unstructured":"Luszczek PR, Bailey DH, Dongarra JJ, Kepner J, Lucas RF, Rabenseifner R, Takahashi D (2006) The HPC challenge (HPCC) benchmark suite. In: Proceedings of the 2006 ACM\/IEEE conference on supercomputing, 213, 1","DOI":"10.1145\/1188455.1188677"},{"key":"6214_CR26","doi-asserted-by":"publisher","unstructured":"Dongarra J, Luszczek P (2011) In: Padua, D. (ed.) LINPACK benchmark, pp 1033\u20131036. Springer, Boston. https:\/\/doi.org\/10.1007\/978-0-387-09766-4_155","DOI":"10.1007\/978-0-387-09766-4_155"},{"issue":"1","key":"6214_CR27","doi-asserted-by":"publisher","first-page":"30","DOI":"10.1093\/nsr\/nwv084","volume":"3","author":"J Dongarra","year":"2016","unstructured":"Dongarra J, Heroux MA, Luszczek P (2016) A new metric for ranking high-performance computing systems. Natl Sci Rev 3(1):30\u201335. https:\/\/doi.org\/10.1093\/nsr\/nwv084","journal-title":"Natl Sci Rev"},{"key":"6214_CR28","first-page":"45","volume":"19","author":"RC Murphy","year":"2010","unstructured":"Murphy RC, Wheeler KB, Barrett BW, Ang JA (2010) Introducing the graph 500. Cray Users Group (CUG) 19:45\u201374","journal-title":"Cray Users Group (CUG)"},{"key":"6214_CR29","doi-asserted-by":"publisher","unstructured":"Hoefler T, Mehlan T, Lumsdaine A, Rehm W (2007) Netgauge: a network performance measurement framework. In: High Performance Computing and Communications: Third International Conference, HPCC 2007, Houston, USA, September 26\u201328, 2007. Proceedings 3, pp 659\u2013671. Springer, Berlin. https:\/\/doi.org\/10.1007\/978-3-540-75444-2_62","DOI":"10.1007\/978-3-540-75444-2_62"},{"key":"6214_CR30","doi-asserted-by":"publisher","unstructured":"Sancho JC, Robles A, Duato J (2001) Effective strategy to compute forwarding tables for InfiniBand networks. In: International Conference on Parallel Processing, 2001, pp 48\u201357 .https:\/\/doi.org\/10.1109\/ICPP.2001.952046","DOI":"10.1109\/ICPP.2001.952046"},{"issue":"1","key":"6214_CR31","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1109\/MM.2019.2949280","volume":"40","author":"G Maglione-Mathey","year":"2020","unstructured":"Maglione-Mathey G, Escudero-Sahuquillo J, Garcia PJ, Quiles FJ, Duato J (2020) Path2SL: leveraging Infiniband resources to reduce head-of-line blocking in fat trees. IEEE Micro 40(1):8\u201314. https:\/\/doi.org\/10.1109\/MM.2019.2949280","journal-title":"IEEE Micro"},{"issue":"1","key":"6214_CR32","doi-asserted-by":"publisher","first-page":"107","DOI":"10.1006\/jpdc.1995.1011","volume":"24","author":"SM Bhandarkar","year":"1995","unstructured":"Bhandarkar SM, Arabnia HR (1995) The Hough transform on a reconfigurable multi-ring network. J Parallel Distrib Comput 24(1):107\u2013114. https:\/\/doi.org\/10.1006\/jpdc.1995.1011","journal-title":"J Parallel Distrib Comput"},{"key":"6214_CR33","doi-asserted-by":"publisher","unstructured":"Das R, Eachempati S, Mishra AK, Narayanan V, Das CR (2009) Design and evaluation of a hierarchical on-chip interconnect for next-generation CMPs. In: 2009 IEEE 15th International Symposium on High Performance Computer Architecture, pp 175\u2013186 . https:\/\/doi.org\/10.1109\/HPCA.2009.4798252","DOI":"10.1109\/HPCA.2009.4798252"},{"issue":"7","key":"6214_CR34","doi-asserted-by":"publisher","first-page":"701","DOI":"10.1109\/71.940745","volume":"12","author":"Y Yang","year":"2001","unstructured":"Yang Y, Funahashi A, Jouraku A, Nishi H, Amano H, Sueyoshi T (2001) Recursive diagonal torus: an interconnection network for massively parallel computers. IEEE Trans Parallel Distrib Syst 12(7):701\u2013715. https:\/\/doi.org\/10.1109\/71.940745","journal-title":"IEEE Trans Parallel Distrib Syst"},{"key":"6214_CR35","doi-asserted-by":"publisher","unstructured":"Guo C, Lu G, Li D, Wu H, Zhang X, Shi Y, Tian C, Zhang Y, Lu S (2009) BCube: a high performance, server-centric network architecture for modular data centers. In: Proceedings of the ACM SIGCOMM 2009 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, Barcelona, Spain, August 16\u201321, 2009. SIGCOMM \u201909, pp 63\u201374. Association for Computing Machinery, New York, NY, USA . https:\/\/doi.org\/10.1145\/1592568.1592577","DOI":"10.1145\/1592568.1592577"}],"container-title":["The Journal of Supercomputing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-024-06214-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11227-024-06214-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-024-06214-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,2]],"date-time":"2024-08-02T14:03:54Z","timestamp":1722607434000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11227-024-06214-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,7]]},"references-count":35,"journal-issue":{"issue":"14","published-print":{"date-parts":[[2024,9]]}},"alternative-id":["6214"],"URL":"https:\/\/doi.org\/10.1007\/s11227-024-06214-6","relation":{},"ISSN":["0920-8542","1573-0484"],"issn-type":[{"value":"0920-8542","type":"print"},{"value":"1573-0484","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6,7]]},"assertion":[{"value":"9 May 2024","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 June 2024","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}