{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,1]],"date-time":"2025-10-01T15:17:00Z","timestamp":1759331820166,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":25,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,11,2]],"date-time":"2020-11-02T00:00:00Z","timestamp":1604275200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Science Foundation","award":["1817037, 1725447, 1730309"],"award-info":[{"award-number":["1817037, 1725447, 1730309"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,11,2]]},"DOI":"10.1145\/3400302.3415610","type":"proceedings-article","created":{"date-parts":[[2020,12,18]],"date-time":"2020-12-18T01:17:48Z","timestamp":1608254268000},"page":"1-9","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":13,"title":["fuseGNN"],"prefix":"10.1145","author":[{"given":"Zhaodong","family":"Chen","sequence":"first","affiliation":[{"name":"University of California"}]},{"given":"Mingyu","family":"Yan","sequence":"additional","affiliation":[{"name":"University of California"}]},{"given":"Maohua","family":"Zhu","sequence":"additional","affiliation":[{"name":"University of California"}]},{"given":"Lei","family":"Deng","sequence":"additional","affiliation":[{"name":"University of California"}]},{"given":"Guoqi","family":"Li","sequence":"additional","affiliation":[{"name":"Tsinghua University"}]},{"given":"Shuangchen","family":"Li","sequence":"additional","affiliation":[{"name":"Alibaba Group"}]},{"given":"Yuan","family":"Xie","sequence":"additional","affiliation":[{"name":"University of California"}]}],"member":"320","published-online":{"date-parts":[[2020,12,17]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"C CUDA. [n.d.]. Best Practices Guide-CUDA Toolkit Documentation.  C CUDA. [n.d.]. Best Practices Guide-CUDA Toolkit Documentation."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1687399.1687501"},{"key":"e_1_3_2_1_3_1","volume-title":"Fast graph representation learning with PyTorch Geometric. arXiv preprint arXiv:1903.02428","author":"Fey Matthias","year":"2019","unstructured":"Matthias Fey and Jan Eric Lenssen . 2019. Fast graph representation learning with PyTorch Geometric. arXiv preprint arXiv:1903.02428 ( 2019 ). Matthias Fey and Jan Eric Lenssen. 2019. Fast graph representation learning with PyTorch Geometric. arXiv preprint arXiv:1903.02428 (2019)."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-015-1483-z"},{"key":"e_1_3_2_1_5_1","first-page":"135","article-title":"Cache-based control of atomic operations in conjunction with an external ALU block","volume":"8","author":"Glasco David B","year":"2012","unstructured":"David B Glasco , Peter B Holmqvist , George R Lynch , Patrick R Marchand , Karan Mehra , and James Roberts . 2012 . Cache-based control of atomic operations in conjunction with an external ALU block . US Patent 8 , 135 ,926. David B Glasco, Peter B Holmqvist, George R Lynch, Patrick R Marchand, Karan Mehra, and James Roberts. 2012. Cache-based control of atomic operations in conjunction with an external ALU block. US Patent 8,135,926.","journal-title":"US Patent"},{"key":"e_1_3_2_1_6_1","unstructured":"Will Hamilton Zhitao Ying and Jure Leskovec. 2017. Inductive representation learning on large graphs. In Advances in neural information processing systems. 1024--1034.  Will Hamilton Zhitao Ying and Jure Leskovec. 2017. Inductive representation learning on large graphs. In Advances in neural information processing systems. 1024--1034."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2458523.2458525"},{"key":"e_1_3_2_1_8_1","unstructured":"Mark Harris et al. 2007. Optimizing parallel reduction in CUDA. Nvidia developer technology 2 4 (2007) 70.  Mark Harris et al. 2007. Optimizing parallel reduction in CUDA. Nvidia developer technology 2 4 (2007) 70."},{"key":"e_1_3_2_1_9_1","volume-title":"Dissecting the nvidia volta gpu architecture via microbenchmarking. arXiv preprint arXiv:1804.06826","author":"Jia Zhe","year":"2018","unstructured":"Zhe Jia , Marco Maggioni , Benjamin Staiger , and Daniele P Scarpazza . 2018. Dissecting the nvidia volta gpu architecture via microbenchmarking. arXiv preprint arXiv:1804.06826 ( 2018 ). Zhe Jia, Marco Maggioni, Benjamin Staiger, and Daniele P Scarpazza. 2018. Dissecting the nvidia volta gpu architecture via microbenchmarking. arXiv preprint arXiv:1804.06826 (2018)."},{"key":"e_1_3_2_1_10_1","volume-title":"Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907","author":"Kipf Thomas N","year":"2016","unstructured":"Thomas N Kipf and Max Welling . 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 ( 2016 ). Thomas N Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)."},{"key":"e_1_3_2_1_11_1","volume-title":"Deepgcns: Making gcns go as deep as cnns. arXiv preprint arXiv:1910.06849","author":"Li Guohao","year":"2019","unstructured":"Guohao Li , Matthias M\u00fcller , Guocheng Qian , Itzel C Delgadillo , Abdulellah Abualshour , Ali Thabet , and Bernard Ghanem . 2019 . Deepgcns: Making gcns go as deep as cnns. arXiv preprint arXiv:1910.06849 (2019). Guohao Li, Matthias M\u00fcller, Guocheng Qian, Itzel C Delgadillo, Abdulellah Abualshour, Ali Thabet, and Bernard Ghanem. 2019. Deepgcns: Making gcns go as deep as cnns. arXiv preprint arXiv:1910.06849 (2019)."},{"key":"e_1_3_2_1_12_1","unstructured":"Lingxiao Ma Zhi Yang Youshan Miao Jilong Xue Ming Wu Lidong Zhou and Yafei Dai. 2019. Neugraph: parallel deep neural network computation on large graphs. In 2019 {USENIX} Annual Technical Conference ({USENIX}{ATC} 19). 443--458.  Lingxiao Ma Zhi Yang Youshan Miao Jilong Xue Ming Wu Lidong Zhou and Yafei Dai. 2019. Neugraph: parallel deep neural network computation on large graphs. In 2019 { USENIX } Annual Technical Conference ( { USENIX }{ ATC } 19). 443--458."},{"key":"e_1_3_2_1_13_1","unstructured":"M Naumov LS Chien P Vandermersch and U Kapasi. [n.d.]. Cusparse library.  M Naumov LS Chien P Vandermersch and U Kapasi. [n.d.]. Cusparse library."},{"key":"e_1_3_2_1_14_1","first-page":"31","article-title":"Cublas library. NVIDIA Corporation, Santa Clara","volume":"15","author":"Nvidia CUDA","year":"2008","unstructured":"CUDA Nvidia . 2008 . Cublas library. NVIDIA Corporation, Santa Clara , California 15 , 27 (2008), 31 . CUDA Nvidia. 2008. Cublas library. NVIDIA Corporation, Santa Clara, California 15, 27 (2008), 31.","journal-title":"California"},{"key":"e_1_3_2_1_15_1","volume-title":"4th GPU Technology Conf.(GTC'13)","author":"Nyland Lars","year":"2013","unstructured":"Lars Nyland and Stephen Jones . 2013 . Understanding and using atomic memory operations . In 4th GPU Technology Conf.(GTC'13) , March. Lars Nyland and Stephen Jones. 2013. Understanding and using atomic memory operations. In 4th GPU Technology Conf.(GTC'13), March."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCAD.2011.6105404"},{"key":"e_1_3_2_1_17_1","volume-title":"Attention-based graph neural network for semi-supervised learning. arXiv preprint arXiv:1803.03735","author":"Thekumparampil Kiran K","year":"2018","unstructured":"Kiran K Thekumparampil , Chong Wang , Sewoong Oh , and Li-Jia Li. 2018. Attention-based graph neural network for semi-supervised learning. arXiv preprint arXiv:1803.03735 ( 2018 ). Kiran K Thekumparampil, Chong Wang, Sewoong Oh, and Li-Jia Li. 2018. Attention-based graph neural network for semi-supervised learning. arXiv preprint arXiv:1803.03735 (2018)."},{"key":"e_1_3_2_1_18_1","volume-title":"Graph attention networks. arXiv preprint arXiv:1710.10903","author":"Veli\u010dkovi\u0107 Petar","year":"2017","unstructured":"Petar Veli\u010dkovi\u0107 , Guillem Cucurull , Arantxa Casanova , Adriana Romero , Pietro Lio , and Yoshua Bengio . 2017. Graph attention networks. arXiv preprint arXiv:1710.10903 ( 2017 ). Petar Veli\u010dkovi\u0107, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903 (2017)."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"crossref","unstructured":"Guibin Wang YiSong Lin and Wei Yi. 2010. Kernel fusion: An effective method for better power efficiency on multithreaded GPU. In 2010 IEEE\/ACM Int'l Conference on Green Computing and Communications & Int'l Conference on Cyber Physical and Social Computing. IEEE 344--350.  Guibin Wang YiSong Lin and Wei Yi. 2010. Kernel fusion: An effective method for better power efficiency on multithreaded GPU. In 2010 IEEE\/ACM Int'l Conference on Green Computing and Communications & Int'l Conference on Cyber Physical and Social Computing. IEEE 344--350.","DOI":"10.1109\/GreenCom-CPSCom.2010.102"},{"key":"e_1_3_2_1_20_1","volume-title":"Deep Graph Library: Towards Efficient and Scalable Deep Learning on Graphs. ICLR Workshop on Representation Learning on Graphs and Manifolds (2019","author":"Wang Minjie","year":"2019","unstructured":"Minjie Wang , Lingfan Yu , Da Zheng , Quan Gan , Yu Gai , Zihao Ye , Mufei Li , Jinjing Zhou , Qi Huang , Chao Ma , Ziyue Huang , Qipeng Guo , Hao Zhang , Haibin Lin , Junbo Zhao , Jinyang Li , Alexander J Smola , and Zheng Zhang . 2019 . Deep Graph Library: Towards Efficient and Scalable Deep Learning on Graphs. ICLR Workshop on Representation Learning on Graphs and Manifolds (2019 ). https:\/\/arxiv.org\/abs\/1909.01315 Minjie Wang, Lingfan Yu, Da Zheng, Quan Gan, Yu Gai, Zihao Ye, Mufei Li, Jinjing Zhou, Qi Huang, Chao Ma, Ziyue Huang, Qipeng Guo, Hao Zhang, Haibin Lin, Junbo Zhao, Jinyang Li, Alexander J Smola, and Zheng Zhang. 2019. Deep Graph Library: Towards Efficient and Scalable Deep Learning on Graphs. ICLR Workshop on Representation Learning on Graphs and Manifolds (2019). https:\/\/arxiv.org\/abs\/1909.01315"},{"key":"e_1_3_2_1_21_1","volume-title":"Simplifying Graph Convolutional Networks. In International Conference on Machine Learning. 6861--6871","author":"Wu Felix","year":"2019","unstructured":"Felix Wu , Amauri Souza , Tianyi Zhang , Christopher Fifty , Tao Yu , and Kilian Weinberger . 2019 . Simplifying Graph Convolutional Networks. In International Conference on Machine Learning. 6861--6871 . Felix Wu, Amauri Souza, Tianyi Zhang, Christopher Fifty, Tao Yu, and Kilian Weinberger. 2019. Simplifying Graph Convolutional Networks. In International Conference on Machine Learning. 6861--6871."},{"key":"e_1_3_2_1_22_1","volume-title":"International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=ryGs6iA5Km","author":"Xu Keyulu","year":"2019","unstructured":"Keyulu Xu , Weihua Hu , Jure Leskovec , and Stefanie Jegelka . 2019 . How Powerful are Graph Neural Networks? . In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=ryGs6iA5Km Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2019. How Powerful are Graph Neural Networks?. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=ryGs6iA5Km"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2020.2970395"},{"key":"e_1_3_2_1_24_1","volume-title":"HyGCN: A GCN Accelerator with Hybrid Architecture. In 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA).","author":"Yan Mingyu","year":"2020","unstructured":"Mingyu Yan , Lei Deng , Xing Hu , Ling Liang , Yujing Feng , Xiaochun Ye , Zhimin Zhang , Dongrui Fan , and Yuan Xie . 2020 . HyGCN: A GCN Accelerator with Hybrid Architecture. In 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA). Mingyu Yan, Lei Deng, Xing Hu, Ling Liang, Yujing Feng, Xiaochun Ye, Zhimin Zhang, Dongrui Fan, and Yuan Xie. 2020. HyGCN: A GCN Accelerator with Hybrid Architecture. In 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA)."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCAD.2011.6105323"}],"event":{"name":"ICCAD '20: IEEE\/ACM International Conference on Computer-Aided Design","sponsor":["SIGDA ACM Special Interest Group on Design Automation","IEEE CAS","IEEE CEDA","IEEE CS"],"location":"Virtual Event USA","acronym":"ICCAD '20"},"container-title":["Proceedings of the 39th International Conference on Computer-Aided Design"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3400302.3415610","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3400302.3415610","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3400302.3415610","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:31:40Z","timestamp":1750195900000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3400302.3415610"}},"subtitle":["accelerating graph convolutional neural network training on GPGPU"],"short-title":[],"issued":{"date-parts":[[2020,11,2]]},"references-count":25,"alternative-id":["10.1145\/3400302.3415610","10.1145\/3400302"],"URL":"https:\/\/doi.org\/10.1145\/3400302.3415610","relation":{},"subject":[],"published":{"date-parts":[[2020,11,2]]},"assertion":[{"value":"2020-12-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}