{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T21:07:13Z","timestamp":1776373633101,"version":"3.51.2"},"publisher-location":"New York, NY, USA","reference-count":22,"publisher":"ACM","funder":[{"name":"UT-Battelle, LLC, under contract DE-AC05-00OR22725","award":[""],"award-info":[{"award-number":[""]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,11,16]]},"DOI":"10.1145\/3712285.3759870","type":"proceedings-article","created":{"date-parts":[[2025,11,12]],"date-time":"2025-11-12T16:04:47Z","timestamp":1762963487000},"page":"935-948","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Distributed Cross-Channel Hierarchical Aggregation for Foundation Models"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7734-3349","authenticated-orcid":false,"given":"Aristeidis","family":"Tsaris","sequence":"first","affiliation":[{"name":"Oak Ridge National Laboratory (ORNL), Knoxville, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1682-4309","authenticated-orcid":false,"given":"Isaac","family":"Lyngaas","sequence":"additional","affiliation":[{"name":"Oak Ridge National Laboratory (ORNL), Knoxville, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8092-7433","authenticated-orcid":false,"given":"John","family":"Lagergren","sequence":"additional","affiliation":[{"name":"Oak Ridge National Laboratory (ORNL), Knoxville, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7165-2095","authenticated-orcid":false,"given":"Mohamed","family":"Wahib","sequence":"additional","affiliation":[{"name":"RIKEN Center for Computational Science (R-CCS), Kobe, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1995-9479","authenticated-orcid":false,"given":"Larry","family":"York","sequence":"additional","affiliation":[{"name":"Oak 
Ridge National Laboratory (ORNL), Knoxville, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0292-5715","authenticated-orcid":false,"given":"Prasanna","family":"Balaprakash","sequence":"additional","affiliation":[{"name":"Oak Ridge National Laboratory (ORNL), Knoxville, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5162-9843","authenticated-orcid":false,"given":"Dan","family":"Lu","sequence":"additional","affiliation":[{"name":"Oak Ridge National Laboratory (ORNL), Knoxville, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0099-1559","authenticated-orcid":false,"given":"Feiyi","family":"Wang","sequence":"additional","affiliation":[{"name":"Oak Ridge National Laboratory (ORNL), Knoxville, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6545-1943","authenticated-orcid":false,"given":"Xiao","family":"Wang","sequence":"additional","affiliation":[{"name":"Oak Ridge National Laboratory (ORNL), Knoxville, USA"}]}],"member":"320","published-online":{"date-parts":[[2025,11,15]]},"reference":[{"key":"e_1_3_3_2_2_2","unstructured":"[n. d.]. The Frontier supercomputer. https:\/\/www.olcf.ornl.gov\/frontier\/."},{"key":"e_1_3_3_2_3_2","doi-asserted-by":"publisher","unstructured":"2020. xESMF: Universal Regridder for Geospatial Data. 10.5281\/zenodo.4294774","DOI":"10.5281\/zenodo.4294774"},{"key":"e_1_3_3_2_4_2","unstructured":"Cristian Bodnar Wessel\u00a0P. Bruinsma Ana Lucic Megan Stanley Anna Vaughan Johannes Brandstetter Patrick Garvan Maik Riechert Jonathan\u00a0A. Weyn Haiyu Dong Jayesh\u00a0K. Gupta Kit Thambiratnam Alexander\u00a0T. Archibald Chun-Chieh Wu Elizabeth Heider Max Welling Richard\u00a0E. Turner and Paris Perdikaris. 2024. A Foundation Model for the Earth System. arxiv:https:\/\/arXiv.org\/abs\/2405.13063\u00a0[physics.ao-ph] https:\/\/arxiv.org\/abs\/2405.13063"},{"key":"e_1_3_3_2_5_2","unstructured":"Keumgang Cha Junghoon Seo and Taekyung Lee. 2023. A Billion-scale Foundation Model for Remote Sensing Images. 
arxiv:https:\/\/arXiv.org\/abs\/2304.05215\u00a0[cs.CV]"},{"key":"e_1_3_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01567"},{"key":"e_1_3_3_2_7_2","unstructured":"Tri Dao. 2023. FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning. arxiv:https:\/\/arXiv.org\/abs\/2307.08691\u00a0[cs.LG]"},{"key":"e_1_3_3_2_8_2","unstructured":"Mostafa Dehghani Josip Djolonga Basil Mustafa Piotr Padlewski Jonathan Heek Justin Gilmer Andreas Steiner Mathilde Caron Robert Geirhos Ibrahim Alabdulmohsin Rodolphe Jenatton Lucas Beyer Michael Tschannen Anurag Arnab Xiao Wang Carlos Riquelme Matthias Minderer Joan Puigcerver Utku Evci Manoj Kumar Sjoerd van Steenkiste Gamaleldin\u00a0F. Elsayed Aravindh Mahendran Fisher Yu Avital Oliver Fantine Huot Jasmijn Bastings Mark\u00a0Patrick Collier Alexey Gritsenko Vighnesh Birodkar Cristina Vasconcelos Yi Tay Thomas Mensink Alexander Kolesnikov Filip Paveti\u0107 Dustin Tran Thomas Kipf Mario Lu\u010di\u0107 Xiaohua Zhai Daniel Keysers Jeremiah Harmsen and Neil Houlsby. 2023. Scaling Vision Transformers to 22 Billion Parameters. arxiv:https:\/\/arXiv.org\/abs\/2302.05442\u00a0[cs.CV]"},{"key":"e_1_3_3_2_9_2","doi-asserted-by":"publisher","unstructured":"V. Eyring S. Bony G.\u00a0A. Meehl C.\u00a0A. Senior B. Stevens R.\u00a0J. Stouffer and K.\u00a0E. Taylor. 2016. Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization. Geoscientific Model Development 9 5 (2016) 1937\u20131958. 10.5194\/gmd-9-1937-2016","DOI":"10.5194\/gmd-9-1937-2016"},{"key":"e_1_3_3_2_10_2","unstructured":"William Fedus Barret Zoph and Noam Shazeer. 2022. Switch transformers: scaling to trillion parameter models with simple and efficient sparsity. J. Mach. Learn. Res. 23 1 Article 120 (jan 2022) 39\u00a0pages."},{"key":"e_1_3_3_2_11_2","unstructured":"Kaiming He Xinlei Chen Saining Xie Yanghao Li Piotr Doll\u00e1r and Ross Girshick. 2021. 
Masked Autoencoders Are Scalable Vision Learners. arxiv:https:\/\/arXiv.org\/abs\/2111.06377\u00a0[cs.CV] https:\/\/arxiv.org\/abs\/2111.06377"},{"key":"e_1_3_3_2_12_2","doi-asserted-by":"publisher","unstructured":"Hans Hersbach Bill Bell Paul Berrisford Shoji Hirahara Andr\u00e1s Hor\u00e1nyi Joaqu\u00edn Mu\u00f1oz-Sabater Julien Nicolas Carole Peubey Raluca Radu Dinand Schepers Adrian Simmons Cornel Soci Saleh Abdalla Xavier Abellan Gianpaolo Balsamo Peter Bechtold Gionata Biavati Jean Bidlot Massimo Bonavita Giovanna De\u00a0Chiara Per Dahlgren Dick Dee Michail Diamantakis Rossana Dragani Johannes Flemming Richard Forbes Manuel Fuentes Alan Geer Leo Haimberger Sean Healy Robin\u00a0J. Hogan El\u00edas H\u00f3lm Marta Janiskov\u00e1 Sarah Keeley Patrick Laloyaux Philippe Lopez Cristina Lupu Gabor Radnoti Patricia de Rosnay Iryna Rozum Freja Vamborg Sebastien Villaume and Jean-No\u00ebl Th\u00e9paut. 2020. The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society 146 730 (2020) 1999\u20132049. 10.1002\/qj.3803 arXiv:https:\/\/rmets.onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/qj.3803","DOI":"10.1002\/qj.3803"},{"key":"e_1_3_3_2_13_2","unstructured":"Andrew Jaegle Felix Gimeno Andrew Brock Andrew Zisserman Oriol Vinyals and Joao Carreira. 2021. Perceiver: General Perception with Iterative Attention. arxiv:https:\/\/arXiv.org\/abs\/2103.03206\u00a0[cs.CV] https:\/\/arxiv.org\/abs\/2103.03206"},{"key":"e_1_3_3_2_14_2","unstructured":"Oak Ridge\u00a0National Laboratory. 2025. Advanced Plant Phenotyping Laboratory | ORNL. https:\/\/www.ornl.gov\/appl Accessed: 2025-04-10."},{"key":"e_1_3_3_2_15_2","unstructured":"Ze Liu Yutong Lin Yue Cao Han Hu Yixuan Wei Zheng Zhang Stephen Lin and Baining Guo. 2021. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. 
arxiv:https:\/\/arXiv.org\/abs\/2103.14030\u00a0[cs.CV] https:\/\/arxiv.org\/abs\/2103.14030"},{"key":"e_1_3_3_2_16_2","unstructured":"Tung Nguyen Johannes Brandstetter Ashish Kapoor Jayesh\u00a0K Gupta and Aditya Grover. 2023. Climax: A foundation model for weather and climate. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2301.10343 (2023)."},{"key":"e_1_3_3_2_17_2","unstructured":"Tung Nguyen Rohan Shah Hritik Bansal Troy Arcomano Romit Maulik Veerabhadra Kotamarthi Ian Foster Sandeep Madireddy and Aditya Grover. 2024. Scaling transformer neural networks for skillful and reliable medium-range weather forecasting. arxiv:https:\/\/arXiv.org\/abs\/2312.03876\u00a0[physics.ao-ph] https:\/\/arxiv.org\/abs\/2312.03876"},{"key":"e_1_3_3_2_18_2","unstructured":"Xiao Wang Siyan Liu Aristeidis Tsaris Jong-Youl Choi Ashwin Aji Ming Fan Wei Zhang Junqi Yin Moetasim Ashfaq Dan Lu and Prasanna Balaprakash. 2024. ORBIT: Oak Ridge Base Foundation Model for Earth System Predictability. arxiv:https:\/\/arXiv.org\/abs\/2404.14712\u00a0[physics.ao-ph] https:\/\/arxiv.org\/abs\/2404.14712"},{"key":"e_1_3_3_2_19_2","unstructured":"Zhitong Xiong Yi Wang Fahong Zhang and Xiao\u00a0Xiang Zhu. 2024. One for All: Toward Unified Foundation Models for Earth Vision. arxiv:https:\/\/arXiv.org\/abs\/2401.07527\u00a0[cs.CV]"},{"key":"e_1_3_3_2_20_2","unstructured":"Peng Xu Xiatian Zhu and David\u00a0A. Clifton. 2023. Multimodal Learning with Transformers: A Survey. arxiv:https:\/\/arXiv.org\/abs\/2206.06488\u00a0[cs.CV] https:\/\/arxiv.org\/abs\/2206.06488"},{"key":"e_1_3_3_2_21_2","doi-asserted-by":"publisher","unstructured":"Zongyin Yang Tom Albrow-Owen Weiwei Cai and Tawfique Hasan. 2021. Miniaturization of optical spectrometers. Science 371 6528 (2021) eabe0722. 
10.1126\/science.abe0722 arXiv:https:\/\/www.science.org\/doi\/pdf\/10.1126\/science.abe0722","DOI":"10.1126\/science.abe0722"},{"key":"e_1_3_3_2_22_2","doi-asserted-by":"publisher","unstructured":"Fengming Yuan Dali Wang Shih-Chieh Kao Michele Thornton Daniel Ricciuto Verity Salmon Colleen Iversen Peter Schwartz and Peter Thornton. 2023. An ultrahigh-resolution E3SM land model simulation framework and its first application to the Seward Peninsula in Alaska. Journal of Computational Science 73 (2023) 102145. 10.1016\/j.jocs.2023.102145","DOI":"10.1016\/j.jocs.2023.102145"},{"key":"e_1_3_3_2_23_2","unstructured":"Yanli Zhao Andrew Gu Rohan Varma Liang Luo Chien-Chin Huang Min Xu Less Wright Hamid Shojanazeri Myle Ott Sam Shleifer Alban Desmaison Can Balioglu Pritam Damania Bernard Nguyen Geeta Chauhan Yuchen Hao Ajit Mathews and Shen Li. 2023. PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel. arxiv:https:\/\/arXiv.org\/abs\/2304.11277\u00a0[cs.DC] https:\/\/arxiv.org\/abs\/2304.11277"}],"event":{"name":"SC '25: The International Conference for High Performance Computing, Networking, Storage and Analysis","location":"St. 
Louis MO USA","acronym":"SC '25","sponsor":["SIGHPC ACM Special Interest Group on High Performance Computing, Special Interest Group on High Performance Computing"]},"container-title":["Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3712285.3759870","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T18:49:15Z","timestamp":1773254955000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3712285.3759870"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,15]]},"references-count":22,"alternative-id":["10.1145\/3712285.3759870","10.1145\/3712285"],"URL":"https:\/\/doi.org\/10.1145\/3712285.3759870","relation":{},"subject":[],"published":{"date-parts":[[2025,11,15]]},"assertion":[{"value":"2025-11-15","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}