{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T03:45:50Z","timestamp":1772163950120,"version":"3.50.1"},"reference-count":28,"publisher":"World Scientific Pub Co Pte Ltd","issue":"02","funder":[{"name":"The National Key R&D Program of China","award":["2020YFB1710200"],"award-info":[{"award-number":["2020YFB1710200"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Parallel Process. Lett."],"published-print":{"date-parts":[[2024,6]]},"abstract":"<jats:p>SpGEMM (General Sparse Matrix-Matrix Multiplication) is one of the kernels of an algebraic multi-grid method, graph algorithm, and solving linear equations. Due to the non-uniformity of some sparse matrices, the existing parallel SpGEMM algorithms suffer from load imbalance, lead to a decrease in computational efficiency. This paper proposes a new algorithm, SPMSD (SpGEMM Based on Minimum Standard Deviation). The algorithm is developed based on a hash table and partition strategy. First, the number of intermediate results in the matrix is divided into multiple blocks based on a new partition strategy to ensure the minimum standard deviation among blocks. Second, the input matrix is transformed according to the result of the partition strategy. Finally, SPMSD performs the parallel computing of SpGEMM based on the advantages of fast insertion and also fast access storage of the hash table and the calculation process controls the insertion and merging of intermediate results according to the offset to avoid the shortage of atomic operations. These experiments indicate the execution of SPMSD is faster than the existing cuSPARSE libraries by 7.4x. Compared with the Out of Core method, SPMSD improves the computational performance by 1.2x, SPMSD memory utilization is decreased by 0.19x.<\/jats:p>","DOI":"10.1142\/s012962642450004x","type":"journal-article","created":{"date-parts":[[2024,5,28]],"date-time":"2024-05-28T00:45:37Z","timestamp":1716857137000},"source":"Crossref","is-referenced-by-count":1,"title":["SPMSD: An Partitioning-Strategy for Parallel General Sparse Matrix-Matrix Multiplication on GPU"],"prefix":"10.1142","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7996-7545","authenticated-orcid":false,"given":"Huanyu","family":"Cui","sequence":"first","affiliation":[{"name":"College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1738-7937","authenticated-orcid":false,"given":"Nianbin","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5185-8387","authenticated-orcid":false,"given":"Qilong","family":"Han","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0223-8181","authenticated-orcid":false,"given":"Ye","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"219","published-online":{"date-parts":[[2024,5,27]]},"reference":[{"key":"S012962642450004XBIB001","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2018.2864729"},{"key":"S012962642450004XBIB002","doi-asserted-by":"publisher","DOI":"10.1109\/ScalA.2018.00011"},{"key":"S012962642450004XBIB003","doi-asserted-by":"publisher","DOI":"10.1145\/3229710.3229720"},{"key":"S012962642450004XBIB005","doi-asserted-by":"publisher","DOI":"10.1145\/3079079.3079105"},{"key":"S012962642450004XBIB006","first-page":"370","volume-title":"IEEE 28th International Parallel and Distributed Processing Symposium","author":"Liu W. F."},{"key":"S012962642450004XBIB007","first-page":"58","volume-title":"IEEE International Symposium on High-Performance Computer Architecture (HPCA)","author":"Qin E."},{"key":"S012962642450004XBIB009","first-page":"94","volume-title":"Proceedings of the ACM International Conference on Supercomputing","author":"Xie Z."},{"issue":"3","key":"S012962642450004XBIB010","first-page":"1","volume":"4","author":"Akbudak K.","year":"2018","journal-title":"ACM Transactions on Parallel Computing (TOPC)"},{"key":"S012962642450004XBIB011","first-page":"1","volume-title":"Proceedings of the Fourteenth EuroSys Conference, 2019 46th International Conference on Parallel Processing (ICPP)","author":"Jamour F."},{"key":"S012962642450004XBIB012","first-page":"1","author":"Cui H. Y.","year":"2022","journal-title":"The Journal of\u200aSupercomputing"},{"key":"S012962642450004XBIB013","first-page":"101","volume-title":"2017 46th International Conference on Parallel Processing (ICPP)","author":"Nagasaka Y."},{"key":"S012962642450004XBIB014","first-page":"362","volume-title":"Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming","author":"Parger M."},{"key":"S012962642450004XBIB015","first-page":"392","volume-title":"2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","author":"Xia Y."},{"key":"S012962642450004XBIB016","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2020.10.036"},{"key":"S012962642450004XBIB017","first-page":"327","volume-title":"Proceedings of the International Multiconference on Computer Science and Information Technology","author":"Martone M."},{"key":"S012962642450004XBIB018","doi-asserted-by":"publisher","DOI":"10.3390\/fi13020036"},{"key":"S012962642450004XBIB019","doi-asserted-by":"publisher","DOI":"10.1137\/0613024"},{"key":"S012962642450004XBIB020","doi-asserted-by":"publisher","DOI":"10.1145\/3350755.3400216"},{"key":"S012962642450004XBIB021","first-page":"1","volume-title":"Proceedings of the 2016 International Conference on Supercomputing","author":"Anh N. Q. Pham"},{"key":"S012962642450004XBIB022","first-page":"93","volume-title":"2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","author":"Deveci M."},{"key":"S012962642450004XBIB023","doi-asserted-by":"publisher","DOI":"10.1137\/130948811"},{"key":"S012962642450004XBIB024","doi-asserted-by":"publisher","DOI":"10.1016\/j.compeleceng.2020.106848"},{"key":"S012962642450004XBIB025","doi-asserted-by":"publisher","DOI":"10.1137\/110848244"},{"key":"S012962642450004XBIB026","first-page":"925","volume-title":"2020 IEEE 36th International Conference on Data Engineering (ICDE)","author":"Lee J."},{"key":"S012962642450004XBIB027","first-page":"48","volume-title":"High Performance Computing: 30th International Conference, ISC High Performance","author":"Patwary M. M. A."},{"key":"S012962642450004XBIB030","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2020.3000708"},{"key":"S012962642450004XBIB031","first-page":"61","volume-title":"CCF National Conference on Computer Engineering and Technology","author":"Guo S.","year":"2015"},{"key":"S012962642450004XBIB034","doi-asserted-by":"publisher","DOI":"10.1177\/1094342021990738"}],"container-title":["Parallel Processing Letters"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S012962642450004X","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,31]],"date-time":"2024-05-31T06:00:58Z","timestamp":1717135258000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S012962642450004X"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,27]]},"references-count":28,"journal-issue":{"issue":"02","published-print":{"date-parts":[[2024,6]]}},"alternative-id":["10.1142\/S012962642450004X"],"URL":"https:\/\/doi.org\/10.1142\/s012962642450004x","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-2052543\/v1","asserted-by":"object"}]},"ISSN":["0129-6264","1793-642X"],"issn-type":[{"value":"0129-6264","type":"print"},{"value":"1793-642X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,27]]},"article-number":"2450004"}}