{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T16:19:49Z","timestamp":1761581989729,"version":"3.38.0"},"reference-count":22,"publisher":"SAGE Publications","issue":"6","license":[{"start":{"date-parts":[[2020,7,27]],"date-time":"2020-07-27T00:00:00Z","timestamp":1595808000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2020,11]]},"abstract":"<jats:p> In this paper, we report the implementation and measured performance of our extreme-scale whole planetary ring simulation code on Sunway TaihuLight and two PEZY-SC2 systems: Shoubu System B and Gyoukou. The numerical algorithm is the parallel Barnes-Hut tree algorithm, which has been used in many large-scale astrophysical particle-based simulations. Our implementation is based on our FDPS framework. However, the extremely large numbers of cores of the systems used (10 M on TaihuLight and 16 M on Gyoukou) and their relatively poor memory and network bandwidth pose new challenges. We describe the new algorithms introduced to achieve high efficiency on machines with low memory bandwidth. The measured performance is 47.9, 10.6 PF, and 1.01PF on TaihuLight, Gyoukou and Shoubu System B (efficiency 40%, 23.5% and 35.5%). The current code is developed for the simulation of planetary rings, but most of the new algorithms are useful for other simulations, and are now available in the FDPS framework. <\/jats:p>","DOI":"10.1177\/1094342020943652","type":"journal-article","created":{"date-parts":[[2020,7,27]],"date-time":"2020-07-27T12:26:57Z","timestamp":1595852817000},"page":"615-628","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":6,"title":["Implementation and performance of Barnes-hut n-body algorithm on extreme-scale heterogeneous many-core architectures"],"prefix":"10.1177","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2635-9575","authenticated-orcid":false,"given":"Masaki","family":"Iwasawa","sequence":"first","affiliation":[{"name":"National Institute of Technology, Matsue College, Matsue, Japan"},{"name":"Graduate School of Science, Kobe University, Kobe, Japan"},{"name":"RIKEN Center for Computational Science, Kobe, Japan"}]},{"given":"Daisuke","family":"Namekata","sequence":"additional","affiliation":[{"name":"RIKEN Center for Computational Science, Kobe, Japan"}]},{"given":"Ryo","family":"Sakamoto","sequence":"additional","affiliation":[{"name":"PEZY Computing K. K., Tokyo, Japan"}]},{"given":"Takashi","family":"Nakamura","sequence":"additional","affiliation":[{"name":"Preferred Networks Inc., Tokyo, Japan"}]},{"given":"Yasuyuki","family":"Kimura","sequence":"additional","affiliation":[{"name":"ExaScaler Inc., Tokyo, Japan"}]},{"given":"Keigo","family":"Nitadori","sequence":"additional","affiliation":[{"name":"RIKEN Center for Computational Science, Kobe, Japan"}]},{"given":"Long","family":"Wang","sequence":"additional","affiliation":[{"name":"RIKEN Center for Computational Science, Kobe, Japan"},{"name":"Department of Astronomy, School of Science, The University of Tokyo, Tokyo, Japan"}]},{"given":"Miyuki","family":"Tsubouchi","sequence":"additional","affiliation":[{"name":"RIKEN Center for Computational Science, Kobe, Japan"}]},{"given":"Jun","family":"Makino","sequence":"additional","affiliation":[{"name":"Graduate School of Science, Kobe University, Kobe, Japan"}]},{"given":"Zhao","family":"Liu","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Technology, Tsinghua University, Beijing, China"},{"name":"National Supercomputing Center in Wuxi, Wuxi, China"}]},{"given":"Haohuan","family":"Fu","sequence":"additional","affiliation":[{"name":"Department of Earth System Science, Tsinghua University, Beijing, China"}]},{"given":"Guangwen","family":"Yang","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Technology, Tsinghua University, Beijing, China"}]}],"member":"179","published-online":{"date-parts":[[2020,7,27]]},"reference":[{"key":"bibr1-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1038\/324446a0"},{"key":"bibr2-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1016\/0021-9991(90)90232-P"},{"key":"bibr3-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2014.10"},{"key":"bibr4-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1145\/113379.113380"},{"key":"bibr5-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1109\/CLUSTER.2018.00021"},{"key":"bibr6-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1007\/s11432-016-5588-7"},{"key":"bibr7-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1145\/1654059.1654123"},{"key":"bibr8-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1088\/0004-637X\/788\/1\/27"},{"key":"bibr9-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2012.3"},{"key":"bibr10-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1093\/pasj\/psw053"},{"key":"bibr11-1094342020943652","first-page":"621","volume":"43","author":"Makino J","year":"1991","journal-title":"Publications of the Astronomical Society of Japan"},{"key":"bibr12-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1093\/pasj\/56.3.521"},{"key":"bibr13-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1086\/173068"},{"key":"bibr14-1094342020943652","doi-asserted-by":"publisher","DOI":"10.3847\/2041-8213\/aa6256"},{"key":"bibr15-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1093\/pasj\/psy062"},{"key":"bibr16-1094342020943652","unstructured":"Portegies Zwart S, B\u00e9dorf J (2014) Computational gravitational dynamics with modern numerical accelerators. ArXiv e-prints."},{"key":"bibr17-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1093\/mnras\/stt152"},{"key":"bibr18-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1051\/0004-6361\/201118085"},{"volume-title":"Using Parallel Computers for Very Large N-Body Simulations: Shell Formation Using 180 K Particles","year":"1990","author":"Salmon J","key":"bibr19-1094342020943652"},{"key":"bibr20-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1109\/SUPERC.1992.236647"},{"key":"bibr21-1094342020943652","doi-asserted-by":"publisher","DOI":"10.1086\/114690"},{"key":"bibr22-1094342020943652","doi-asserted-by":"crossref","unstructured":"Zebker HA, Marouf EA, Tyler GL (1985) Saturn\u2019s rings: Particle size distributions for thin layer models. Icarus 64(3): 531\u2013548.","DOI":"10.1016\/0019-1035(85)90074-0"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342020943652","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342020943652","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342020943652","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,1]],"date-time":"2025-03-01T22:56:47Z","timestamp":1740869807000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342020943652"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,27]]},"references-count":22,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2020,11]]}},"alternative-id":["10.1177\/1094342020943652"],"URL":"https:\/\/doi.org\/10.1177\/1094342020943652","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"type":"print","value":"1094-3420"},{"type":"electronic","value":"1741-2846"}],"subject":[],"published":{"date-parts":[[2020,7,27]]}}}