{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T15:34:02Z","timestamp":1772724842772,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":59,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,6,26]],"date-time":"2019-06-26T00:00:00Z","timestamp":1561507200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Key Research and Development Program of China","award":["2017YFB0202105, 2016YFB0201305, 2016YFB0200803, 2016YFB0200300"],"award-info":[{"award-number":["2017YFB0202105, 2016YFB0201305, 2016YFB0200803, 2016YFB0200300"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61521092, 91430218, 31327901, 61472395, 61432018"],"award-info":[{"award-number":["61521092, 91430218, 31327901, 61472395, 61432018"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,6,26]]},"DOI":"10.1145\/3330345.3330354","type":"proceedings-article","created":{"date-parts":[[2019,6,18]],"date-time":"2019-06-18T12:14:30Z","timestamp":1560860070000},"page":"94-105","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":55,"title":["IA-SpGEMM"],"prefix":"10.1145","author":[{"given":"Zhen","family":"Xie","sequence":"first","affiliation":[{"name":"Chinese Academy of Sciences and University of Chinese Academy of Sciences"}]},{"given":"Guangming","family":"Tan","sequence":"additional","affiliation":[{"name":"Chinese Academy of Sciences and University of Chinese Academy of Sciences"}]},{"given":"Weifeng","family":"Liu","sequence":"additional","affiliation":[{"name":"China 
University of Petroleum, Beijing"}]},{"given":"Ninghui","family":"Sun","sequence":"additional","affiliation":[{"name":"Chinese Academy of Sciences and University of Chinese Academy of Sciences"}]}],"member":"320","published-online":{"date-parts":[[2019,6,26]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-38750-0_12"},{"key":"e_1_3_2_1_2_1","volume-title":"Optimizing LOBPCG: Sparse Matrix Loop and Data Transformations in Action. In International Workshop on Languages and Compilers for Parallel Computing. Springer, 218--232","author":"Ahmad Khalid","year":"2016","unstructured":"Khalid Ahmad, Anand Venkat, and Mary Hall. 2016. Optimizing LOBPCG: Sparse Matrix Loop and Data Transformations in Action. In International Workshop on Languages and Compilers for Parallel Computing. Springer, 218--232."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2597652.2597678"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2014.11.001"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1137\/15M104253X"},{"key":"e_1_3_2_1_6_1","volume-title":"HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks. Nucleic acids research 46, 6","author":"Azad Ariful","year":"2018","unstructured":"Ariful Azad, Georgios A Pavlopoulos, Christos A Ouzounis, Nikos C Kyrpides, and Aydin Bulu\u00e7. 2018. 
HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks. Nucleic acids research 46, 6 (2018), e33--e33."},{"key":"e_1_3_2_1_7_1","unstructured":"David Bader. {n. d.}. Graph BLAS Forum. https:\/\/graphblas.org. ({n. d.})."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2755573.2755613"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1137\/15M1028807"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1137\/110838844"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1654059.1654078"},{"key":"e_1_3_2_1_12_1","unstructured":"L. Breiman J. Friedman C.J. Stone and R.A. Olshen. 1984. Classification and Regression Trees. Taylor & Francis. https:\/\/books.google.com\/books?id=JwQx-WOmSyQC"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1137\/110848244"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.sigpro.2012.09.011"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1837853.1693471"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2699470"},{"key":"e_1_3_2_1_17_1","volume-title":"Proceedings of the GPU Technology Conference","volume":"3","author":"Demouth Julien","year":"2012","unstructured":"Julien Demouth. 2012. Sparse matrix-matrix multiplication on the gpu. In Proceedings of the GPU Technology Conference, Vol. 
3."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPSW.2017.8"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2018.06.009"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1137\/0613024"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCSE.2008.45"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.81"},{"key":"e_1_3_2_1_23_1","volume-title":"Proceedings of the fourteenth international conference on artificial intelligence and statistics. 315--323","author":"Glorot Xavier","year":"2011","unstructured":"Xavier Glorot, Antoine Bordes, and Yoshua Bengio. 2011. Deep sparse rectifier neural networks. In Proceedings of the fourteenth international conference on artificial intelligence and statistics. 315--323."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1137\/130948811"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/355791.355796"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1137\/130930352"},{"key":"e_1_3_2_1_28_1","unstructured":"Alex Krizhevsky Ilya Sutskever and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems. 
1097--1105."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_3_2_1_30_1","volume-title":"Proceedings of the 1988 connectionist models summer school","volume":"1","author":"LeCun Yann","year":"1988","unstructured":"Yann LeCun, D Touresky, G Hinton, and T Sejnowski. 1988. A theoretical framework for back-propagation. In Proceedings of the 1988 connectionist models summer school, Vol. 1. CMU, Pittsburgh, Pa: Morgan Kaufmann, 21--28."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3126908.3126931"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2499370.2462181"},{"key":"e_1_3_2_1_33_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 806--814","author":"Liu Baoyuan","year":"2015","unstructured":"Baoyuan Liu, Min Wang, Hassan Foroosh, Marshall Tappen, and Marianna Pensky. 2015. Sparse convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 
806--814."},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.4244"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2014.47"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2751205.2751209"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2015.06.010"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2015.04.004"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-11515-8_10"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3229710.3229720"},{"key":"e_1_3_2_1_42_1","volume-title":"High-Performance and Memory-Saving Sparse General Matrix-Matrix Multiplication for NVIDIA Pascal GPU. In 2017 46th International Conference on Parallel Processing (ICPP). IEEE, 101--110","author":"Nagasaka Yusuke","year":"2017","unstructured":"Yusuke Nagasaka, Akira Nukada, and Satoshi Matsuoka. 2017. High-Performance and Memory-Saving Sparse General Matrix-Matrix Multiplication for NVIDIA Pascal GPU. In 2017 46th International Conference on Parallel Processing (ICPP). IEEE, 101--110."},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-20119-1_4"},{"key":"e_1_3_2_1_44_1","volume-title":"Intl. Workshop on GPUs and Scientific Applications. 51--56","author":"Rupp Karl","year":"2010","unstructured":"Karl Rupp, Florian Rudolf, and Josef Weinbub. 2010. ViennaCL-a high level linear algebra library for GPUs and multi-core CPUs. In Intl. 
Workshop on GPUs and Scientific Applications. 51--56."},{"key":"e_1_3_2_1_45_1","volume-title":"SPARSKIT: A basic tool kit for sparse matrix computations.","author":"Saad Youcef","year":"1990","unstructured":"Youcef Saad. 1990. SPARSKIT: A basic tool kit for sparse matrix computations. (1990)."},{"key":"e_1_3_2_1_46_1","volume-title":"Artificial Neural Networks-ICANN","author":"Scherer Dominik","year":"2010","unstructured":"Dominik Scherer, Andreas M\u00fcller, and Sven Behnke. 2010. Evaluation of pooling operations in convolutional architectures for object recognition. In Artificial Neural Networks-ICANN 2010. Springer, 92--101."},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2014.09.003"},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/2751205.2751244"},{"key":"e_1_3_2_1_49_1","volume-title":"An interactive system for combinatorial scientific computing with an emphasis on programmer productivity","author":"Shah Viral B","unstructured":"Viral B Shah. 2007. An interactive system for combinatorial scientific computing with an emphasis on programmer productivity. University of California, Santa Barbara."},{"key":"e_1_3_2_1_50_1","volume-title":"Proceedings of the 16th Annual Workshop on Circuits, Systems and Signal Processing, ProRisc","volume":"2005","author":"Smailbegovic FS","year":"2005","unstructured":"FS Smailbegovic, Georgi N Gaydadjiev, and Stamatis Vassiliadis. 2005. Sparse matrix storage format. 
In Proceedings of the 16th Annual Workshop on Circuits, Systems and Signal Processing, ProRisc, Vol. 2005. 445--448."},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/2304576.2304624"},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/3218823"},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/2813885.2738003"},{"key":"e_1_3_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.5555\/3014904.3014959"},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178487.3178513"},{"key":"e_1_3_2_1_56_1","volume-title":"A study on several machine-learning methods for classification of malignant and benign clustered microcalcifications","author":"Wei Liyang","year":"2005","unstructured":"Liyang Wei, Yongyi Yang, Robert M Nishikawa, and Yulei Jiang. 2005. A study on several machine-learning methods for classification of malignant and benign clustered microcalcifications. 
IEEE transactions on medical imaging 24, 3 (2005), 371--380."},{"key":"e_1_3_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/2555243.2555255"},{"key":"e_1_3_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.5555\/982792.982828"},{"key":"e_1_3_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10590-1_53"},{"key":"e_1_3_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/3018743.3018755"},{"key":"e_1_3_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178487.3178495"}],"event":{"name":"ICS '19: 2019 International Conference on Supercomputing","location":"Phoenix Arizona","acronym":"ICS '19","sponsor":["SIGARCH ACM Special Interest Group on Computer Architecture"]},"container-title":["Proceedings of the ACM International Conference on Supercomputing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3330345.3330354","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3330345.3330354","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:54:05Z","timestamp":1750204445000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3330345.3330354"}},"subtitle":["an input-aware auto-tuning framework for parallel sparse matrix-matrix multiplication"],"short-title":[],"issued":{"date-parts":[[2019,6,26]]},"references-count":59,"alternative-id":["10.1145\/3330345.3330354","10.1145\/3330345"],"URL":"https:\/\/doi.org\/10.1145\/3330345.3330354","relation":{},"subject":[],"published":{"date-parts":[[2019,6,26]]},"assertion":[{"value":"2019-06-26","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}