{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:12:34Z","timestamp":1750219954369,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":29,"publisher":"ACM","license":[{"start":{"date-parts":[[2023,6,14]],"date-time":"2023-06-14T00:00:00Z","timestamp":1686700800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,6,14]]},"DOI":"10.1145\/3597031.3597057","type":"proceedings-article","created":{"date-parts":[[2023,7,19]],"date-time":"2023-07-19T20:12:13Z","timestamp":1689797533000},"page":"107-113","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["cuSCNN: an Efficient CUDA Implementation of Sparse CNNs"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8555-7331","authenticated-orcid":false,"given":"Mohamed A.","family":"Elgammal","sequence":"first","affiliation":[{"name":"University of Toronto, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0676-5867","authenticated-orcid":false,"given":"Omar Mohamed","family":"Awad","sequence":"additional","affiliation":[{"name":"University of Toronto, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-1189-2259","authenticated-orcid":false,"given":"Isak Edo","family":"Vivancos","sequence":"additional","affiliation":[{"name":"University of Toronto, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7768-367X","authenticated-orcid":false,"given":"Andreas","family":"Moshovos","sequence":"additional","affiliation":[{"name":"University of Toronto, 
Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0528-6493","authenticated-orcid":false,"given":"Vaughn","family":"Betz","sequence":"additional","affiliation":[{"name":"University of Toronto, Canada"}]}],"member":"320","published-online":{"date-parts":[[2023,7,19]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Tensorflow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). 265\u2013283.","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, 2016. Tensorflow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). 265\u2013283."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001138"},{"key":"e_1_3_2_1_3_1","volume-title":"Return of the Devil in the Details: Delving Deep into Convolutional Nets. CoRR abs\/1405.3531","author":"Chatfield Ken","year":"2014","unstructured":"Ken Chatfield, Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. 2014. Return of the Devil in the Details: Delving Deep into Convolutional Nets. 
CoRR abs\/1405.3531 (2014)."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.3390\/rs13224712"},{"key":"e_1_3_2_1_5_1","first-page":"7","article-title":"Escoin","volume":"4","author":"Chen Xuhao","year":"2019","unstructured":"Xuhao Chen. 2019. Escoin: Efficient Sparse Convolutional Neural Network Inference on GPUs. Matrix 4, 5 (2019), 7\u20138.","journal-title":"Efficient Sparse Convolutional Neural Network Inference on GPUs. Matrix"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5201\/ipol.2019.274"},{"key":"e_1_3_2_1_8_1","volume-title":"HPIPE: Heterogeneous layer-pipelined and sparse-aware CNN inference for FPGAs. arXiv preprint arXiv:2007.10451","author":"Hall Mathew","year":"2020","unstructured":"Mathew Hall and Vaughn Betz. 2020. HPIPE: Heterogeneous layer-pipelined and sparse-aware CNN inference for FPGAs. arXiv preprint arXiv:2007.10451 (2020)."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/s40747-021-00428-4"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2647868.2654889"},
{"key":"e_1_3_2_1_11_1","volume-title":"Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky, Ilya Sutskever, and Geoffrey\u00a0E. Hinton. 2012. ImageNet Classification with Deep Convolutional Neural Networks. In Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012, Lake Tahoe, Nevada, United States. 1106\u20131114. http:\/\/papers.nips.cc\/paper\/4824-imagenet-classification-with-deep-convolutional-neural-networks"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCI.2017.2671360"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2001.937655"},{"key":"e_1_3_2_1_14_1","volume-title":"Pruning convolutional neural networks for resource efficient inference. arXiv preprint arXiv:1611.06440","author":"Molchanov Pavlo","year":"2016","unstructured":"Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, and Jan Kautz. 2016. Pruning convolutional neural networks for resource efficient inference. arXiv preprint arXiv:1611.06440 (2016)."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/SLT54892.2023.10022739"},{"key":"e_1_3_2_1_16_1","volume-title":"Audio-visual speech recognition using deep learning. Applied intelligence 42","author":"Noda Kuniaki","year":"2015","unstructured":"Kuniaki Noda, Yuki Yamaguchi, Kazuhiro Nakadai, Hiroshi\u00a0G Okuno, and Tetsuya Ogata. 2015. Audio-visual speech recognition using deep learning. 
Applied intelligence 42 (2015), 722\u2013737."},{"key":"e_1_3_2_1_17_1","first-page":"31","article-title":"Cublas library. NVIDIA Corporation, Santa Clara","volume":"15","author":"Nvidia CUDA","year":"2008","unstructured":"CUDA Nvidia. 2008. Cublas library. NVIDIA Corporation, Santa Clara, California 15, 27 (2008), 31.","journal-title":"California"},{"key":"e_1_3_2_1_18_1","volume-title":"Cusparse library","author":"Nvidia CUDA","year":"2014","unstructured":"CUDA Nvidia. 2014. Cusparse library. NVIDIA Corporation, Santa Clara, California (2014)."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2020.2979670"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-17795-9_10"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080254"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3140659.3080254"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3410463.3414648"},{"key":"e_1_3_2_1_24_1","volume-title":"ImageNet Large Scale Visual Recognition Challenge. arXiv:1409.0575 [cs] (Sept","author":"Russakovsky Olga","year":"2014","unstructured":"Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander\u00a0C. Berg, and Li Fei-Fei. 2014. ImageNet Large Scale Visual Recognition Challenge. arXiv:1409.0575 [cs] (Sept. 2014). 
arXiv:1409.0575."},{"volume-title":"HPIPE-NX: Leveraging tensor blocks for high-performance CNN inference acceleration on FPGAs. Ph.\u00a0D. Dissertation","author":"Stan Marius\u00a0Octavian","key":"e_1_3_2_1_25_1","unstructured":"Marius\u00a0Octavian Stan. 2022. HPIPE-NX: Leveraging tensor blocks for high-performance CNN inference acceleration on FPGAs. Ph.\u00a0D. Dissertation. University of Toronto (Canada)."},{"key":"e_1_3_2_1_26_1","volume-title":"A survey of deep learning techniques for neural machine translation. arXiv preprint arXiv:2002.07526","author":"Yang Shuoheng","year":"2020","unstructured":"Shuoheng Yang, Yuxin Wang, and Xiaowen Chu. 2020. A survey of deep learning techniques for neural machine translation. arXiv preprint arXiv:2002.07526 (2020)."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33015676"},{"key":"e_1_3_2_1_28_1","volume-title":"Learning Deep CNN Denoiser Prior for Image Restoration. In IEEE Conference on Computer Vision and Pattern Recognition. 3929\u20133938","author":"Zhang Kai","year":"2017","unstructured":"Kai Zhang, Wangmeng Zuo, Shuhang Gu, and Lei Zhang. 2017. Learning Deep CNN Denoiser Prior for Image Restoration. In IEEE Conference on Computer Vision and Pattern Recognition. 
3929\u20133938."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2016.7783723"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.3390\/electronics10101187"}],"event":{"name":"HEART 2023: The International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies 2023","acronym":"HEART 2023","location":"Kusatsu Japan"},"container-title":["Proceedings of the 13th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3597031.3597057","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3597031.3597057","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:49:05Z","timestamp":1750182545000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3597031.3597057"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,14]]},"references-count":29,"alternative-id":["10.1145\/3597031.3597057","10.1145\/3597031"],"URL":"https:\/\/doi.org\/10.1145\/3597031.3597057","relation":{},"subject":[],"published":{"date-parts":[[2023,6,14]]},"assertion":[{"value":"2023-07-19","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}