{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T15:33:49Z","timestamp":1772724829749,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":51,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,2,16]],"date-time":"2019-02-16T00:00:00Z","timestamp":1550275200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000185","name":"Defense Advanced Research Projects Agency","doi-asserted-by":"publisher","award":["D16PC00183"],"award-info":[{"award-number":["D16PC00183"]}],"id":[{"id":"10.13039\/100000185","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["1404995,1513120,1629548,1645599,1816793"],"award-info":[{"award-number":["1404995,1513120,1629548,1645599,1816793"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,2,16]]},"DOI":"10.1145\/3293883.3295712","type":"proceedings-article","created":{"date-parts":[[2019,2,5]],"date-time":"2019-02-05T20:44:12Z","timestamp":1549399452000},"page":"300-314","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":137,"title":["Adaptive sparse tiling for sparse matrix multiplication"],"prefix":"10.1145","author":[{"given":"Changwan","family":"Hong","sequence":"first","affiliation":[{"name":"The Ohio State University"}]},{"given":"Aravind","family":"Sukumaran-Rajam","sequence":"additional","affiliation":[{"name":"The Ohio State University"}]},{"given":"Israt","family":"Nisa","sequence":"additional","affiliation":[{"name":"The Ohio State 
University"}]},{"given":"Kunal","family":"Singh","sequence":"additional","affiliation":[{"name":"The Ohio State University"}]},{"given":"P.","family":"Sadayappan","sequence":"additional","affiliation":[{"name":"The Ohio State University"}]}],"member":"320","published-online":{"date-parts":[[2019,2,16]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"2018. The API reference guide for cuSPARSE the CUDA sparse matrixlibrary.(v9.1 ed.). http:\/\/docs.nvidia.com\/cuda\/cusparse\/index.html.  2018. The API reference guide for cuSPARSE the CUDA sparse matrixlibrary.(v9.1 ed.). http:\/\/docs.nvidia.com\/cuda\/cusparse\/index.html."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2014.125"},{"key":"e_1_3_2_1_3_1","volume-title":"Proceedings of the Symposium on High Performance Computing. Society for Computer Simulation International, 75--82","author":"Anzt Hartwig","year":"2015","unstructured":"Hartwig Anzt , Stanimire Tomov , and Jack Dongarra . 2015 . Accelerating the LOBPCG method on GPUs using a blocked sparse matrix vector product . In Proceedings of the Symposium on High Performance Computing. Society for Computer Simulation International, 75--82 . Hartwig Anzt, Stanimire Tomov, and Jack Dongarra. 2015. Accelerating the LOBPCG method on GPUs using a blocked sparse matrix vector product. In Proceedings of the Symposium on High Performance Computing. Society for Computer Simulation International, 75--82."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1137\/040608088"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1654059.1654078"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2008.4536313"},{"key":"e_1_3_2_1_7_1","volume-title":"Design of the GraphBLAS API for C. In Parallel and Distributed Processing Symposium Workshops (IPDPSW)","author":"Buluc Aydin","year":"2017","unstructured":"Aydin Buluc , Tim Mattson , Scott McMillan , Jos\u00e9 Moreira , and Carl Yang . 2017 . 
Design of the GraphBLAS API for C. In Parallel and Distributed Processing Symposium Workshops (IPDPSW) , 2017 IEEE International. IEEE, 643--652. Aydin Buluc, Tim Mattson, Scott McMillan, Jos\u00e9 Moreira, and Carl Yang. 2017. Design of the GraphBLAS API for C. In Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2017 IEEE International. IEEE, 643--652."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2011.73"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342007083801"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/829514.830525"},{"key":"e_1_3_2_1_11_1","volume-title":"BigLearn workshop, NIPS. 117","author":"Canny John","year":"2013","unstructured":"John Canny and Huasha Zhao . 2013 . Bidmach: Large-scale learning with zero memory allocation . In BigLearn workshop, NIPS. 117 . John Canny and Huasha Zhao. 2013. Bidmach: Large-scale learning with zero memory allocation. In BigLearn workshop, NIPS. 117."},{"key":"e_1_3_2_1_12_1","volume-title":"Cusp: Generic parallel algorithms for sparse matrix and graph computations","author":"Dalton Steven","year":"2014","unstructured":"Steven Dalton , Nathan Bell , Luke Olson , and Michael Garland . 2014 . Cusp: Generic parallel algorithms for sparse matrix and graph computations , 2014. Version 0.5. 0 (2014). Steven Dalton, Nathan Bell, Luke Olson, and Michael Garland. 2014. Cusp: Generic parallel algorithms for sparse matrix and graph computations, 2014. Version 0.5. 0 (2014)."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2049662.2049663"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2014.68"},{"key":"e_1_3_2_1_15_1","unstructured":"Ga\u00ebl Guennebaud Beno\u0131t Jacob Philip Avery Abraham Bachrach Sebastien Barthelemy etal 2010. Eigen v3.  Ga\u00ebl Guennebaud Beno\u0131t Jacob Philip Avery Abraham Bachrach Sebastien Barthelemy et al. 2010. 
Eigen v3."},{"key":"e_1_3_2_1_16_1","volume-title":"Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149","author":"Han Song","year":"2015","unstructured":"Song Han , Huizi Mao , and William J Dally . 2015. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149 ( 2015 ). Song Han, Huizi Mao, and William J Dally. 2015. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149 (2015)."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3208040.3208062"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.5555\/305219.305248"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.5555\/520033.858246"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133901"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2009.263"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/HiPC.2017.00039"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2751205.2751209"},{"key":"e_1_3_2_1_24_1","volume-title":"Architectures and Processors (ASAP), 2015 IEEE 26th International Conference on. IEEE, 82--89","author":"Liu Yongchao","year":"2015","unstructured":"Yongchao Liu and Bertil Schmidt . 2015 . LightSpMV: Faster CSR-based sparse matrix-vector multiplication on CUDA-enabled GPUs. In Application-specific Systems , Architectures and Processors (ASAP), 2015 IEEE 26th International Conference on. IEEE, 82--89 . Yongchao Liu and Bertil Schmidt. 2015. LightSpMV: Faster CSR-based sparse matrix-vector multiplication on CUDA-enabled GPUs. In Application-specific Systems, Architectures and Processors (ASAP), 2015 IEEE 26th International Conference on. 
IEEE, 82--89."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2628071.2628087"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2503210.2503268"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2851141.2851190"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-11515-8_10"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1137\/S00361445003820"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/bxt038"},{"key":"e_1_3_2_1_31_1","volume-title":"Hai Li, Yiran Chen, and Pradeep Dubey.","author":"Park Jongsoo","year":"2016","unstructured":"Jongsoo Park , Sheng Li , Wei Wen , Ping Tak Peter Tang , Hai Li, Yiran Chen, and Pradeep Dubey. 2016 . Faster cnns with direct sparse convolutions and guided pruning. arXiv preprint arXiv:1608.01409 (2016). Jongsoo Park, Sheng Li, Wei Wen, Ping Tak Peter Tang, Hai Li, Yiran Chen, and Pradeep Dubey. 2016. Faster cnns with direct sparse convolutions and guided pruning. arXiv preprint arXiv:1608.01409 (2016)."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.micpro.2011.05.005"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2014.07.006"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPEC.2016.7761634"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079079.3079086"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2304576.2304624"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2011.53"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2503210.2503234"},{"key":"e_1_3_2_1_39_1","unstructured":"Michalis K Titsias. 2008. The infinite gamma-Poisson feature model. In Advances in Neural Information Processing Systems. 1513--1520.   Michalis K Titsias. 2008. The infinite gamma-Poisson feature model. 
In Advances in Neural Information Processing Systems . 1513--1520."},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.1658"},{"key":"e_1_3_2_1_41_1","series-title":"In Journal of Physics: Conference Series","volume-title":"OSKI: A library of automatically tuned sparse matrix kernels","author":"Vuduc Richard","year":"2005","unstructured":"Richard Vuduc , James W Demmel , and Katherine A Yelick . 2005 . OSKI: A library of automatically tuned sparse matrix kernels . In Journal of Physics: Conference Series , Vol. 16 . IOP Publishing , 521. Richard Vuduc, James W Demmel, and Katherine A Yelick. 2005. OSKI: A library of automatically tuned sparse matrix kernels. In Journal of Physics: Conference Series, Vol. 16. IOP Publishing, 521."},{"key":"e_1_3_2_1_42_1","unstructured":"Joerg Walter Mathias Koch etal 2006. uBLAS. Boost C++ software library available from http:\/\/www.boost.org\/doc\/libs (2006).  Joerg Walter Mathias Koch et al. 2006. uBLAS. Boost C++ software library available from http:\/\/www.boost.org\/doc\/libs (2006)."},{"key":"e_1_3_2_1_43_1","volume-title":"High-Performance Computing on the Intel\u00ae Xeon Phi\u2122","author":"Wang Endong","unstructured":"Endong Wang , Qing Zhang , Bo Shen , Guangyong Zhang , Xiaowei Lu , Qing Wu , and Yajuan Wang . 2014. Intel math kernel library . In High-Performance Computing on the Intel\u00ae Xeon Phi\u2122 . Springer , 167--188. Endong Wang, Qing Zhang, Bo Shen, Guangyong Zhang, Xiaowei Lu, Qing Wu, and Yajuan Wang. 2014. Intel math kernel library. In High-Performance Computing on the Intel\u00ae Xeon Phi\u2122. 
Springer, 167--188."},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2915220"},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/1362622.1362674"},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3168818"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/2692916.2555255"},{"key":"e_1_3_2_1_48_1","volume-title":"Design Principles for Sparse Matrix Multiplication on the GPU. arXiv preprint arXiv:1803.08601","author":"Yang Carl","year":"2018","unstructured":"Carl Yang , Aydin Buluc , and John D Owens . 2018. Design Principles for Sparse Matrix Multiplication on the GPU. arXiv preprint arXiv:1803.08601 ( 2018 ). Carl Yang, Aydin Buluc, and John D Owens. 2018. Design Principles for Sparse Matrix Multiplication on the GPU. arXiv preprint arXiv:1803.08601 (2018)."},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCSE.2012.28"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1137\/080733243"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/2783258.2783416"}],"event":{"name":"PPoPP '19: 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming","location":"Washington District of Columbia","acronym":"PPoPP '19","sponsor":["SIGPLAN ACM Special Interest Group on Programming Languages","SIGHPC ACM Special Interest Group on High Performance Computing, Special Interest Group on High Performance Computing"]},"container-title":["Proceedings of the 24th Symposium on Principles and Practice of Parallel 
Programming"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3293883.3295712","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3293883.3295712","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3293883.3295712","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:01:47Z","timestamp":1750208507000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3293883.3295712"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,2,16]]},"references-count":51,"alternative-id":["10.1145\/3293883.3295712","10.1145\/3293883"],"URL":"https:\/\/doi.org\/10.1145\/3293883.3295712","relation":{},"subject":[],"published":{"date-parts":[[2019,2,16]]},"assertion":[{"value":"2019-02-16","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}