{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T12:25:13Z","timestamp":1773318313131,"version":"3.50.1"},"reference-count":53,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2025,12,30]],"date-time":"2025-12-30T00:00:00Z","timestamp":1767052800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"funder":[{"DOI":"10.13039\/501100012190","name":"Ministry of Science and Higher Education of the Russian Federation","doi-asserted-by":"publisher","award":["075-00269-25-00"],"award-info":[{"award-number":["075-00269-25-00"]}],"id":[{"id":"10.13039\/501100012190","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100007251","name":"National Research University Higher School of Economics","doi-asserted-by":"publisher","award":["Basic Research Program"],"award-info":[{"award-number":["Basic Research Program"]}],"id":[{"id":"10.13039\/501100007251","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100006769","name":"Russian Science Foundation","doi-asserted-by":"publisher","award":["20-71-10127"],"award-info":[{"award-number":["20-71-10127"]}],"id":[{"id":"10.13039\/501100006769","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2026,3]]},"abstract":"<jats:p>One of the most important aspects of supercomputer development in the post-Moore era is the interconnect technologies that allow one to unite a multitude of processing elements into a well-synchronized computing system. Novel types of supercomputer interconnect require careful benchmarking and compliance with the requirements of modern hardware trends. GPU-based heterogeneous computing is one of the most important current avenues for building high performance computing systems, and the support of GPU-aware MPI technology is a requirement for any competitive interconnect. In this paper, we describe a UCX API based GPU-aware MPI implementation for the Angara interconnect. Performance analysis for peer-to-peer, MPI_Bcast and MPI_Reduce operations is presented, as well as for the rocHPL benchmark and for a typical biomolecular model within the LAMMPS molecular dynamics code. The deployment of the Desmos supercomputer equipped with both Angara and InfiniBand FDR allows us to make an accurate comparison of these two types of interconnect using the latter as a reference.<\/jats:p>","DOI":"10.1177\/10943420251411961","type":"journal-article","created":{"date-parts":[[2025,12,30]],"date-time":"2025-12-30T14:13:34Z","timestamp":1767104014000},"page":"240-253","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":0,"title":["Towards performance analysis of GPU-aware MPI over Angara interconnect"],"prefix":"10.1177","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7381-5759","authenticated-orcid":false,"given":"Timur","family":"Ismagilov","sequence":"first","affiliation":[{"name":"Department of Multiscale Supercomputer Modelling, Joint Institute for High Temperatures of RAS"}]},{"given":"Anatoly","family":"Mukosey","sequence":"additional","affiliation":[{"name":"Department of Multiscale Supercomputer Modelling, Joint Institute for High Temperatures of RAS"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-2478-2484","authenticated-orcid":false,"given":"Felix","family":"Smirnov","sequence":"additional","affiliation":[{"name":"International Laboratory for Supercomputer Atomistic Modelling and Multiscale Analysis, Tikhonov Moscow Institute of Electronics and Mathematics, HSE University"}]},{"given":"Vladislav","family":"Galigerov","sequence":"additional","affiliation":[{"name":"Department of Multiscale Supercomputer Modelling, Joint Institute for High Temperatures of RAS"}]},{"given":"Yuri","family":"Grishichkin","sequence":"additional","affiliation":[{"name":"Department of Multiscale Supercomputer Modelling, Joint Institute for High Temperatures of RAS"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5349-3991","authenticated-orcid":false,"given":"Vladimir","family":"Stegailov","sequence":"additional","affiliation":[{"name":"Department of Multiscale Supercomputer Modelling, Joint Institute for High Temperatures of RAS"},{"name":"International Laboratory for Supercomputer Atomistic Modelling and Multiscale Analysis, Tikhonov Moscow Institute of Electronics and Mathematics, HSE University"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1156-893X","authenticated-orcid":false,"given":"Alexey","family":"Timofeev","sequence":"additional","affiliation":[{"name":"Department of Multiscale Supercomputer Modelling, Joint Institute for High Temperatures of RAS"},{"name":"International Laboratory for Supercomputer Atomistic Modelling and Multiscale Analysis, Tikhonov Moscow Institute of Electronics and Mathematics, HSE University"}]}],"member":"179","published-online":{"date-parts":[[2025,12,30]]},"reference":[{"key":"e_1_3_3_2_1","doi-asserted-by":"publisher","DOI":"10.1177\/10943420241277839"},{"key":"e_1_3_3_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLUSTER.2018.00090"},{"key":"e_1_3_3_4_1","doi-asserted-by":"publisher","DOI":"10.1134\/S1995080218090081"},{"key":"e_1_3_3_5_1","doi-asserted-by":"publisher","DOI":"10.1051\/epjconf\/202429510006"},{"key":"e_1_3_3_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3236367.3236381"},{"key":"e_1_3_3_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2019.03.005"},{"key":"e_1_3_3_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSR59073.2023.00037"},{"key":"e_1_3_3_9_1","first-page":"64","article-title":"Adaptive routing system for the domestic interconnect SMPO-10G","volume":"3","author":"Basalov VG","year":"2012","unstructured":"Basalov VG, Vyalukhin VM (2012) Adaptive routing system for the domestic interconnect SMPO-10G. VANT. Ser.: Mat. Mod. Fiz. Proc 3: 64\u201370.","journal-title":"VANT. Ser.: Mat. Mod. Fiz. Proc"},{"key":"e_1_3_3_10_1","doi-asserted-by":"publisher","DOI":"10.1177\/10943420241265936"},{"key":"e_1_3_3_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/HOTI.2015.22"},{"key":"e_1_3_3_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/40.342015"},{"key":"e_1_3_3_13_1","doi-asserted-by":"publisher","DOI":"10.3390\/computers12090173"},{"key":"e_1_3_3_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2023.01.005"},{"key":"e_1_3_3_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2018.2867222"},{"key":"e_1_3_3_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3649510"},{"key":"e_1_3_3_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC41405.2020.00039"},{"key":"e_1_3_3_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-66057-4_9"},{"key":"e_1_3_3_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/MASCOTS53633.2021.9614285"},{"key":"e_1_3_3_20_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.7188"},{"key":"e_1_3_3_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1088149.1088161"},{"key":"e_1_3_3_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-22941-1_31"},{"key":"e_1_3_3_23_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.7895"},{"key":"e_1_3_3_24_1","volume-title":"HECBioSim HPC Benchmarking Suite","author":"HECBioSim","year":"2025","unstructured":"HECBioSim (2025) HECBioSim HPC Benchmarking Suite. https:\/\/www.hecbiosim.ac.uk\/access-hpc\/hpc-benchmarking-suite. Online; accessed 20-October-2025."},{"key":"e_1_3_3_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2007.370475"},{"key":"e_1_3_3_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-85697-6_17"},{"key":"e_1_3_3_27_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-22941-1_43"},{"key":"e_1_3_3_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2008.19"},{"key":"e_1_3_3_29_1","doi-asserted-by":"publisher","DOI":"10.1177\/10943420211008288"},{"key":"e_1_3_3_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPSW.2018.00019"},{"key":"e_1_3_3_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3141865.3141869"},{"key":"e_1_3_3_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2019.2928289"},{"key":"e_1_3_3_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC59245.2023.00024"},{"key":"e_1_3_3_34_1","doi-asserted-by":"publisher","DOI":"10.3390\/electronics11091369"},{"key":"e_1_3_3_35_1","first-page":"85","volume-title":"Competence in High Performance Computing 2010: Proceedings of an International Conference on Competence in High Performance Computing, June 2010, Schloss Schwetzingen","author":"Mey DA","year":"2012","unstructured":"Mey DA, Biersdorf S, Bischof C, et al. (2012) Score-p: a unified performance measurement system for petascale applications In: Competence in High Performance Computing 2010: Proceedings of an International Conference on Competence in High Performance Computing, June 2010, Schloss Schwetzingen. Springer, 85\u201397."},{"key":"e_1_3_3_36_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2021.102831"},{"issue":"3","key":"e_1_3_3_37_1","first-page":"40","article-title":"Simulation of collective operations hardware support for angara interconnect","volume":"4","author":"Mukosey AV","year":"2015","unstructured":"Mukosey AV, Semenov AS, Simonov AS (2015) Simulation of collective operations hardware support for angara interconnect. Vestnik Yuzhno-Ural\u2019skogo Gosudarstvennogo Universiteta. Seriya\u201d Vychislitelnaya Matematika i Informatika 4(3): 40\u201355.","journal-title":"Vestnik Yuzhno-Ural\u2019skogo Gosudarstvennogo Universiteta. Seriya\u201d Vychislitelnaya Matematika i Informatika"},{"key":"e_1_3_3_38_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2023.104765"},{"key":"e_1_3_3_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLUSTER51413.2022.00077"},{"key":"e_1_3_3_40_1","first-page":"113","volume-title":"2009 International Conference on Reconfigurable Computing and FPGAs","author":"N\u00fcssle M","year":"2009","unstructured":"N\u00fcssle M, Geib B, Fr\u00f6ning H, et al. (2009) An fpga-based custom high performance interconnection network In: 2009 International Conference on Reconfigurable Computing and FPGAs. IEEE, 113\u2013118."},{"key":"e_1_3_3_41_1","doi-asserted-by":"publisher","DOI":"10.1177\/10943420231213013"},{"key":"e_1_3_3_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/40.988689"},{"key":"e_1_3_3_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2013.17"},{"key":"e_1_3_3_44_1","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/bxae143"},{"issue":"3","key":"e_1_3_3_45_1","first-page":"41","article-title":"The l-csc cluster: optimizing power efficiency to become the greenest supercomputer in the world in the green500 list of November 2014","volume":"2","author":"Rohr D","year":"2015","unstructured":"Rohr D, Neskovic G, Lindenstruth V (2015) The l-csc cluster: optimizing power efficiency to become the greenest supercomputer in the world in the green500 list of November 2014. Supercomput. Front. Innov.: International Journal 2(3): 41\u201348.","journal-title":"Supercomput. Front. Innov.: International Journal"},{"key":"e_1_3_3_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPSW50202.2020.00147"},{"key":"e_1_3_3_47_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-66471-8_17"},{"key":"e_1_3_3_48_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342019826667"},{"key":"e_1_3_3_49_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cpc.2021.108171"},{"key":"e_1_3_3_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSC.2023.3337662"},{"key":"e_1_3_3_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3578245.3583715"},{"key":"e_1_3_3_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/3392717.3392752"},{"key":"e_1_3_3_53_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2021.102837"},{"key":"e_1_3_3_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3524059.3532388"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/10943420251411961","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/10943420251411961","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/10943420251411961","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T18:40:44Z","timestamp":1773254444000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/10943420251411961"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,30]]},"references-count":53,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2026,3]]}},"alternative-id":["10.1177\/10943420251411961"],"URL":"https:\/\/doi.org\/10.1177\/10943420251411961","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"value":"1094-3420","type":"print"},{"value":"1741-2846","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12,30]]}}}