{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,4]],"date-time":"2026-04-04T17:26:15Z","timestamp":1775323575689,"version":"3.50.1"},"reference-count":18,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2017,1,11]],"date-time":"2017-01-11T00:00:00Z","timestamp":1484092800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGARCH Comput. Archit. News"],"published-print":{"date-parts":[[2017,1,11]]},"abstract":"<jats:p>FPGA-centric clouds and clusters provide direct and programmable interconnects (DPI) with obvious benefits for communication latency and bandwidth. One rarely studied aspect of DPI is that they facilitate application-aware routing: if communication patterns are static and known a priori, as is usually the case, then judicious routing can reduce congestion, latency, and the hardware required. In this study we explore applying the method of offline\/static routing to collective operations, in particular, multicast and reduction. An entirely new communication infrastructure is proposed and implemented, including switch design and routing algorithm. A substantial improvement in performance is obtained, especially for multicast. 
We believe that this is one of the few general offline\/static routing solutions for real HPC clusters, and FPGA-centric clusters in particular.<\/jats:p>","DOI":"10.1145\/3039902.3039904","type":"journal-article","created":{"date-parts":[[2017,1,17]],"date-time":"2017-01-17T13:42:08Z","timestamp":1484660528000},"page":"2-7","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":18,"title":["Collective Communication on FPGA Clusters with Static Scheduling"],"prefix":"10.1145","volume":"44","author":[{"given":"Jiayi","family":"Sheng","sequence":"first","affiliation":[{"name":"Boston University, Boston, MA"}]},{"given":"Qingqing","family":"Xiong","sequence":"additional","affiliation":[{"name":"Boston University, Boston, MA"}]},{"given":"Chen","family":"Yang","sequence":"additional","affiliation":[{"name":"Boston University, Boston, MA"}]},{"given":"Martin C.","family":"Herbordt","sequence":"additional","affiliation":[{"name":"Boston University, Boston, MA"}]}],"member":"320","published-online":{"date-parts":[[2017,1,11]]},"reference":[{"key":"e_1_2_1_1_1","first-page":"13","volume-title":"Int. Symp. on Computer Architecture","author":"Putnam A.","year":"2014","unstructured":"A. Putnam, et al., \"A reconfigurable fabric for accelerating large-scale datacenter services,\" in Proc. Int. Symp. on Computer Architecture, 2014, pp. 13--24."},{"key":"e_1_2_1_2_1","volume-title":"Extreme Computing Conf.","author":"George A.","year":"2016","unstructured":"A. George, M. Herbordt, H. Lam, A. Lawande, J. Sheng, and C. 
Yang, \"Novo-G#: A Community Resource for Exploring Large-Scale Reconfigurable Computing Through Direct and Programmable Interconnects,\" in IEEE High Perf. Extreme Computing Conf., 2016."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/165123.165124"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342014552086"},{"key":"e_1_2_1_5_1","unstructured":"Mellanox, \"Mellanox Introduces Programmable Network Adapter Product Line with Application Acceleration Engine,\" http:\/\/ir.mellanox.com\/releasedetail.cfm?ReleaseID=883814, accessed 11\/9\/2015, 2015."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2011.12.002"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2629470"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2015.42"},{"key":"e_1_2_1_9_1","volume-title":"IEEE Symp. on Field Programmable Custom Computing Machines","author":"Kapre N.","year":"2016","unstructured":"N. Kapre, \"Marathon: Statically-Scheduled Conflict-Free Routing on FPGA Overlay NoCs,\" in Proc. IEEE Symp. on Field Programmable Custom Computing Machines, 2016."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/FCCM.2006.55"},{"key":"e_1_2_1_12_1","volume-title":"Extreme Computing Conf.","author":"Sheng J.","year":"2014","unstructured":"J. Sheng, B. Humphries, H. Zhang, and M. Herbordt, \"Design of 3D FFTs with FPGA Clusters,\" in IEEE High Perf. 
Extreme Computing Conf., 2014."},{"key":"e_1_2_1_13_1","volume-title":"Highly Efficient and Reconfigurable Technologies","author":"Sheng J.","year":"2015","unstructured":"J. Sheng, C. Yang, and M. Herbordt, \"Towards Low-Latency Communication on FPGA Clusters with 3D FFT Case Study,\" in Proc. Highly Efficient and Reconfigurable Technologies, 2015."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2011.219"},{"key":"e_1_2_1_15_1","volume-title":"Principles and Practices of Interconnection Networks","author":"Dally W.","year":"2004","unstructured":"W. Dally and B. Towles, Principles and Practices of Interconnection Networks. Elsevier, 2004."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2008.12"},{"key":"e_1_2_1_17_1","first-page":"355","volume-title":"IEEE 15th Int. Symp. High Performance Computer Architecture","author":"Abad P.","year":"2009","unstructured":"P. Abad, V. Puente, and J.-A. Gregorio, \"MRR: Enabling Fully Adaptive Multicast Routing for CMP Interconnection Networks,\" in Proc. IEEE 15th Int. Symp. High Performance Computer Architecture, 2009, pp. 
355--366."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/NOCS.2009.5071446"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155630"}],"container-title":["ACM SIGARCH Computer Architecture News"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3039902.3039904","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3039902.3039904","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:36:31Z","timestamp":1750217791000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3039902.3039904"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,1,11]]},"references-count":18,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2017,1,11]]}},"alternative-id":["10.1145\/3039902.3039904"],"URL":"https:\/\/doi.org\/10.1145\/3039902.3039904","relation":{},"ISSN":["0163-5964"],"issn-type":[{"value":"0163-5964","type":"print"}],"subject":[],"published":{"date-parts":[[2017,1,11]]},"assertion":[{"value":"2017-01-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}