{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:12:46Z","timestamp":1750306366377,"version":"3.41.0"},"reference-count":17,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2016,4,22]],"date-time":"2016-04-22T00:00:00Z","timestamp":1461283200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGARCH Comput. Archit. News"],"published-print":{"date-parts":[[2016,4,22]]},"abstract":"<jats:p>A hardware local essential tree (LET) generator used in an N-body simulation is implemented on the FPGA of PEACH2 (PCI Express Adaptive Communication Hub ver2), a low latency switching hub for high performance GPU clusters. By using the pipelined on-the-fly execution with a multipole acceptance criterion judging module and a data updating module, the generation performance is 2.2 times faster than that with the CPU. When data communication is considered, the performance was 7.2 times as the case with the CPU.<\/jats:p>","DOI":"10.1145\/2927964.2927966","type":"journal-article","created":{"date-parts":[[2016,4,25]],"date-time":"2016-04-25T19:51:13Z","timestamp":1461613873000},"page":"3-8","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":11,"title":["Off-Loading LET Generation to PEACH2"],"prefix":"10.1145","volume":"43","author":[{"given":"Chiharu","family":"Tsuruta","sequence":"first","affiliation":[{"name":"Keio University, Yokohama, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yohei","family":"Miki","sequence":"additional","affiliation":[{"name":"University of Tsukuba, Tsukuba, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Takuya","family":"Kuhara","sequence":"additional","affiliation":[{"name":"Keio University, Yokohama, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hideharu","family":"Amano","sequence":"additional","affiliation":[{"name":"Keio University, Yokohama, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Masayuki","family":"Umemura","sequence":"additional","affiliation":[{"name":"University of Tsukuba, Tsukuba, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2016,4,22]]},"reference":[{"key":"e_1_2_1_1_1","first-page":"575","volume-title":"Proceedings of the International Conference on Field Programmable Logic and Application (FPL '12)","author":"Alagic A","year":"2012","unstructured":"A Alagic and H. Amano . Performance analysis of fully-adaptable CRC accelerators on an FPGA . In Proceedings of the International Conference on Field Programmable Logic and Application (FPL '12) , pages 575 -- 578 , Sept 2012 . A Alagic and H. Amano. Performance analysis of fully-adaptable CRC accelerators on an FPGA. In Proceedings of the International Conference on Field Programmable Logic and Application (FPL '12), pages 575--578, Sept 2012."},{"volume-title":"o. T","author":"U.","key":"e_1_2_1_2_1","unstructured":"U. o. T . Center for Computational Sciences . http:\/\/www.ccs.tsukuba.ac.jp\/. U. o. T. Center for Computational Sciences. http:\/\/www.ccs.tsukuba.ac.jp\/."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2008.2008741"},{"key":"e_1_2_1_4_1","first-page":"134","volume-title":"Proceedings of the Twenty-Sixth Hawaii International Conference on System Sciences","author":"Hashimoto S.","year":"1993","unstructured":"Ebisuzaki, Toshikazu; Fukushige, T.; Taiji, Makoto; Makino, J.; Sugimoto, D.; Ito, T.; Okumura, S. K.; Hashimoto , E.; Tomida, K.; Miyakawa, N. GRAPE : special purpose computer for simulations of many-body systems . In Proceedings of the Twenty-Sixth Hawaii International Conference on System Sciences , pages 134 -- 143 , Jan. 1993 . Ebisuzaki, Toshikazu; Fukushige, T.; Taiji, Makoto; Makino, J.; Sugimoto, D.; Ito, T.; Okumura, S. K.; Hashimoto, E.; Tomida, K.; Miyakawa, N. GRAPE: special purpose computer for simulations of many-body systems. In Proceedings of the Twenty-Sixth Hawaii International Conference on System Sciences, pages 134--143, Jan. 1993."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/FPL.2014.6927487"},{"key":"e_1_2_1_6_1","first-page":"58","volume-title":"Implementation and Performance Evaluation of Astrophysical Tree-code for GPU Clusters","author":"Ogiya Go","year":"2013","unstructured":"Go Ogiya , Yohei Miki , Taisuki Boku , Masao Mori , Naohito Nakasato . Implementation and Performance Evaluation of Astrophysical Tree-code for GPU Clusters . In Information Processing Society of Japan Vol . 6, pages 58 -- 70 , April 2013 . Go Ogiya, Yohei Miki, Taisuki Boku, Masao Mori, Naohito Nakasato. Implementation and Performance Evaluation of Astrophysical Tree-code for GPU Clusters. In Information Processing Society of Japan Vol. 6, pages 58--70, April 2013."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/HOTI.2013.15"},{"key":"e_1_2_1_8_1","volume-title":"Springer-Verlag Berlin and Heildelberg GmBH and Co. K","author":"Sagan Hans","year":"1994","unstructured":"Hans Sagan , Space-Filing Curves . Springer-Verlag Berlin and Heildelberg GmBH and Co. K , 1994 . Hans Sagan, Space-Filing Curves. Springer-Verlag Berlin and Heildelberg GmBH and Co. K, 1994."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2014.10"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcp.2011.12.024"},{"key":"e_1_2_1_11_1","first-page":"446","volume-title":"Nature","author":"Barnes Josh","year":"1986","unstructured":"Josh Barnes , Piet Hut . A hierarchical O(NlogN) force-calculation algorithm . In Nature , pages 446 -- 449 , December 1986 . Josh Barnes, Piet Hut. A hierarchical O(NlogN) force-calculation algorithm. In Nature, pages 446--449, December 1986."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASAP.2010.5540976"},{"key":"e_1_2_1_13_1","first-page":"180","volume-title":"Proceedings of the International Conference on Application Specific Array Processors","author":"D.","year":"1990","unstructured":"Makino, J.; Ito, T.; Ebisuzaki, Toshikazu; Sugimoto, D. GRAPE : a special-purpose computer for N-body problems . In Proceedings of the International Conference on Application Specific Array Processors , pages 180 -- 189 , Sep 1990 . Makino, J.; Ito, T.; Ebisuzaki, Toshikazu; Sugimoto, D. GRAPE: a special-purpose computer for N-body problems. In Proceedings of the International Conference on Application Specific Array Processors, pages 180--189, Sep 1990."},{"key":"e_1_2_1_14_1","first-page":"570","volume-title":"John K. Salmon. Astrophysical N-body Simulations Using Hierarchical Tree Data Structures. In Proceedings of the 1992 ACM\/IEEE Conference on Supercomputing","author":"Warren Michael S.","year":"1992","unstructured":"Michael S. Warren , John K. Salmon. Astrophysical N-body Simulations Using Hierarchical Tree Data Structures. In Proceedings of the 1992 ACM\/IEEE Conference on Supercomputing , pages 570 -- 577 , Sep 1992 . Michael S. Warren, John K. Salmon. Astrophysical N-body Simulations Using Hierarchical Tree Data Structures. In Proceedings of the 1992 ACM\/IEEE Conference on Supercomputing, pages 570--577, Sep 1992."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/169627.169640"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1006\/jcph.1994.1050"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2693714.2693716"}],"container-title":["ACM SIGARCH Computer Architecture News"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2927964.2927966","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2927964.2927966","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:56:21Z","timestamp":1750222581000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2927964.2927966"}},"subtitle":["A Switching Hub for High Performance GPU Clusters"],"short-title":[],"issued":{"date-parts":[[2016,4,22]]},"references-count":17,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2016,4,22]]}},"alternative-id":["10.1145\/2927964.2927966"],"URL":"https:\/\/doi.org\/10.1145\/2927964.2927966","relation":{},"ISSN":["0163-5964"],"issn-type":[{"type":"print","value":"0163-5964"}],"subject":[],"published":{"date-parts":[[2016,4,22]]},"assertion":[{"value":"2016-04-22","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}