{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T14:45:18Z","timestamp":1776955518755,"version":"3.51.4"},"publisher-location":"New York, NY, USA","reference-count":57,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,6,11]],"date-time":"2022-06-11T00:00:00Z","timestamp":1654905600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100002418","name":"Intel Corporation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100002418","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"NSF (National Science Foundation)","doi-asserted-by":"publisher","award":["CCF 1725734, CNS 1909999, CNS 1942888, CCF 2028861"],"award-info":[{"award-number":["CCF 1725734, CNS 1909999, CNS 1942888, CCF 2028861"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,6,18]]},"DOI":"10.1145\/3470496.3527403","type":"proceedings-article","created":{"date-parts":[[2022,5,31]],"date-time":"2022-05-31T19:06:01Z","timestamp":1654023961000},"page":"916-931","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":28,"title":["Graphite"],"prefix":"10.1145","author":[{"given":"Zhangxiaowen","family":"Gong","sequence":"first","affiliation":[{"name":"University of Illinois at Urbana-Champaign and Intel Labs"}]},{"given":"Houxiang","family":"Ji","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign"}]},{"given":"Yao","family":"Yao","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign"}]},{"given":"Christopher 
W.","family":"Fletcher","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign"}]},{"given":"Christopher J.","family":"Hughes","sequence":"additional","affiliation":[{"name":"Intel Labs"}]},{"given":"Josep","family":"Torrellas","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign"}]}],"member":"320","published-online":{"date-parts":[[2022,6,11]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3352460.3358305"},{"key":"e_1_3_2_1_2_1","unstructured":"Dario Amodei Rishita Anubhai Eric Battenberg Carl Case Jared Casper Bryan Catanzaro Jingdong Chen Mike Chrzanowski Adam Coates Greg Diamos Erich Elsen Jesse Engel Linxi Fan Christopher Fougner Tony Han Awni Hannun Billy Jun Patrick LeGresley Libby Lin Sharan Narang Andrew Ng Sherjil Ozair Ryan Prenger Jonathan Raiman Sanjeev Satheesh David Seetapun Shubho Sengupta Yi Wang Zhiqian Wang Chong Wang Bo Xiao Dani Yogatama Jun Zhan and Zhenyao Zhu. 2015. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin. arXiv:1512.02595 [cs.CL]  Dario Amodei Rishita Anubhai Eric Battenberg Carl Case Jared Casper Bryan Catanzaro Jingdong Chen Mike Chrzanowski Adam Coates Greg Diamos Erich Elsen Jesse Engel Linxi Fan Christopher Fougner Tony Han Awni Hannun Billy Jun Patrick LeGresley Libby Lin Sharan Narang Andrew Ng Sherjil Ozair Ryan Prenger Jonathan Raiman Sanjeev Satheesh David Seetapun Shubho Sengupta Yi Wang Zhiqian Wang Chong Wang Bo Xiao Dani Yogatama Jun Zhan and Zhenyao Zhu. 2015. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin. arXiv:1512.02595 [cs.CL]"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3085572"},{"key":"e_1_3_2_1_4_1","volume-title":"The GAP benchmark suite. arXiv preprint arXiv:1508.03619","author":"Beamer Scott","year":"2015","unstructured":"Scott Beamer , Krste Asanovi\u0107 , and David Patterson . 2015. The GAP benchmark suite. 
arXiv preprint arXiv:1508.03619 ( 2015 ). Scott Beamer, Krste Asanovi\u0107, and David Patterson. 2015. The GAP benchmark suite. arXiv preprint arXiv:1508.03619 (2015)."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/ReConFig.2015.7393317"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1963405.1963488"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447786.3456233"},{"key":"e_1_3_2_1_8_1","first-page":"1","article-title":"Sniper: Exploring the Level of Abstraction for Scalable and Accurate Parallel Multi-Core Simulations. In International Conference for High Performance Computing","volume":"52","author":"Carlson Trevor E.","year":"2011","unstructured":"Trevor E. Carlson , Wim Heirman , and Lieven Eeckhout . 2011 . Sniper: Exploring the Level of Abstraction for Scalable and Accurate Parallel Multi-Core Simulations. In International Conference for High Performance Computing , Networking, Storage and Analysis (SC). 52 : 1 -- 52 :12. Trevor E. Carlson, Wim Heirman, and Lieven Eeckhout. 2011. Sniper: Exploring the Level of Abstraction for Scalable and Accurate Parallel Multi-Core Simulations. In International Conference for High Performance Computing, Networking, Storage and Analysis (SC). 52:1--52:12.","journal-title":"Networking, Storage and Analysis (SC)."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2049662.2049670"},{"key":"e_1_3_2_1_10_1","volume-title":"Global Neighbor Sampling for Mixed CPU-GPU Training on Giant Graphs. arXiv preprint arXiv:2106.06150","author":"Dong Jialin","year":"2021","unstructured":"Jialin Dong , Da Zheng , Lin F Yang , and George Karypis . 2021. Global Neighbor Sampling for Mixed CPU-GPU Training on Giant Graphs. arXiv preprint arXiv:2106.06150 ( 2021 ). Jialin Dong, Da Zheng, Lin F Yang, and George Karypis. 2021. Global Neighbor Sampling for Mixed CPU-GPU Training on Giant Graphs.
arXiv preprint arXiv:2106.06150 (2021)."},{"key":"e_1_3_2_1_11_1","volume-title":"Fast Graph Representation Learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds.","author":"Fey Matthias","unstructured":"Matthias Fey and Jan E. Lenssen . 2019 . Fast Graph Representation Learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds. Matthias Fey and Jan E. Lenssen. 2019. Fast Graph Representation Learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds."},{"key":"e_1_3_2_1_12_1","volume-title":"Protein interface prediction using graph convolutional networks. Ph. D. Dissertation","author":"Fout Alex M","unstructured":"Alex M Fout . 2017. Protein interface prediction using graph convolutional networks. Ph. D. Dissertation . Colorado State University . Alex M Fout. 2017. Protein interface prediction using graph convolutional networks. Ph. D. Dissertation. Colorado State University."},{"key":"e_1_3_2_1_13_1","volume-title":"15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21)","author":"Gandhi Swapnil","year":"2021","unstructured":"Swapnil Gandhi and Anand Padmanabha Iyer . 2021 . P3: Distributed Deep Graph Learning at Scale . In 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21) . USENIX Association, 551--568. https:\/\/www.usenix.org\/conference\/osdi21\/presentation\/gandhi Swapnil Gandhi and Anand Padmanabha Iyer. 2021. P3: Distributed Deep Graph Learning at Scale. In 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21). USENIX Association, 551--568. https:\/\/www.usenix.org\/conference\/osdi21\/presentation\/gandhi"},{"key":"e_1_3_2_1_14_1","volume-title":"Few-shot learning with graph neural networks. arXiv preprint arXiv:1711.04043","author":"Garcia Victor","year":"2017","unstructured":"Victor Garcia and Joan Bruna . 2017. 
Few-shot learning with graph neural networks. arXiv preprint arXiv:1711.04043 ( 2017 ). Victor Garcia and Joan Bruna. 2017. Few-shot learning with graph neural networks. arXiv preprint arXiv:1711.04043 (2017)."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO50266.2020.00079"},{"key":"e_1_3_2_1_16_1","volume-title":"Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures. In SC18: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 830--841","author":"Georganas Evangelos","year":"2018","unstructured":"Evangelos Georganas , Sasikanth Avancha , Kunal Banerjee , Dhiraj Kalamkar , Greg Henry , Hans Pabst , and Alexander Heinecke . 2018 . Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures. In SC18: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 830--841 . Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke. 2018. Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures. In SC18: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 830--841."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3410463.3414655"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO50266.2020.00070"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2016.7783759"},{"key":"e_1_3_2_1_20_1","volume-title":"Knowledge transfer for out-of-knowledge-base entities: A graph neural network approach. arXiv preprint arXiv:1706.05674","author":"Hamaguchi Takuo","year":"2017","unstructured":"Takuo Hamaguchi , Hidekazu Oiwa , Masashi Shimbo , and Yuji Matsumoto . 2017. Knowledge transfer for out-of-knowledge-base entities: A graph neural network approach. arXiv preprint arXiv:1706.05674 ( 2017 ). 
Takuo Hamaguchi, Hidekazu Oiwa, Masashi Shimbo, and Yuji Matsumoto. 2017. Knowledge transfer for out-of-knowledge-base entities: A graph neural network approach. arXiv preprint arXiv:1706.05674 (2017)."},{"key":"e_1_3_2_1_21_1","volume-title":"Proceedings of the 31st International Conference on Neural Information Processing Systems. 1025--1035","author":"Hamilton William L","year":"2017","unstructured":"William L Hamilton , Rex Ying , and Jure Leskovec . 2017 . Inductive representation learning on large graphs . In Proceedings of the 31st International Conference on Neural Information Processing Systems. 1025--1035 . William L Hamilton, Rex Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. In Proceedings of the 31st International Conference on Neural Information Processing Systems. 1025--1035."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2016.83"},{"key":"e_1_3_2_1_23_1","volume-title":"Open graph benchmark: Datasets for machine learning on graphs. arXiv preprint arXiv:2005.00687","author":"Hu Weihua","year":"2020","unstructured":"Weihua Hu , Matthias Fey , Marinka Zitnik , Yuxiao Dong , Hongyu Ren , Bowen Liu , Michele Catasta , and Jure Leskovec . 2020. Open graph benchmark: Datasets for machine learning on graphs. arXiv preprint arXiv:2005.00687 ( 2020 ). Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. 2020. Open graph benchmark: Datasets for machine learning on graphs. arXiv preprint arXiv:2005.00687 (2020)."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3437801.3441585"},{"key":"e_1_3_2_1_25_1","unstructured":"Intel. 2019. Intel Data Streaming Accelerator Architecture Specification.  Intel. 2019. 
Intel Data Streaming Accelerator Architecture Specification."},{"key":"e_1_3_2_1_26_1","first-page":"187","article-title":"Improving the Accuracy, Scalability, and Performance of Graph Neural Networks with Roc","volume":"2","author":"Jia Zhihao","year":"2020","unstructured":"Zhihao Jia , Sina Lin , Mingyu Gao , Matei Zaharia , and Alex Aiken . 2020 . Improving the Accuracy, Scalability, and Performance of Graph Neural Networks with Roc . Proceedings of Machine Learning and Systems 2 (2020), 187 -- 198 . Zhihao Jia, Sina Lin, Mingyu Gao, Matei Zaharia, and Alex Aiken. 2020. Improving the Accuracy, Scalability, and Performance of Graph Neural Networks with Roc. Proceedings of Machine Learning and Systems 2 (2020), 187--198.","journal-title":"Proceedings of Machine Learning and Systems"},{"key":"e_1_3_2_1_27_1","volume-title":"Proceedings of the Workshop on Resource-Constrained Machine Learning (ReCoML","author":"Kiningham Kevin","year":"2020","unstructured":"Kevin Kiningham , Philip Levis , and Christopher R\u00e9 . 2020 . GReTA: Hardware Optimized Graph Processing for GNNs . In Proceedings of the Workshop on Resource-Constrained Machine Learning (ReCoML 2020). Kevin Kiningham, Philip Levis, and Christopher R\u00e9. 2020. GReTA: Hardware Optimized Graph Processing for GNNs. In Proceedings of the Workshop on Resource-Constrained Machine Learning (ReCoML 2020)."},{"key":"e_1_3_2_1_28_1","volume-title":"GRIP: A Graph Neural Network Accelerator Architecture. arXiv preprint arXiv:2007.13828","author":"Kiningham Kevin","year":"2020","unstructured":"Kevin Kiningham , Christopher Re , and Philip Levis . 2020 . GRIP: A Graph Neural Network Accelerator Architecture. arXiv preprint arXiv:2007.13828 (2020). Kevin Kiningham, Christopher Re, and Philip Levis. 2020. GRIP: A Graph Neural Network Accelerator Architecture. arXiv preprint arXiv:2007.13828 (2020)."},{"key":"e_1_3_2_1_29_1","volume-title":"Semi-supervised classification with graph convolutional networks. 
arXiv preprint arXiv:1609.02907","author":"Kipf Thomas N","year":"2016","unstructured":"Thomas N Kipf and Max Welling . 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 ( 2016 ). Thomas N Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)."},{"key":"e_1_3_2_1_30_1","volume-title":"Hinton","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky , Ilya Sutskever , and Geoffrey E . Hinton . 2012 . Imagenet Classification with Deep Convolutional Neural Networks. In Advances in Neural Information Processing Systems (NIPS) . Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. Imagenet Classification with Deep Convolutional Neural Networks. In Advances in Neural Information Processing Systems (NIPS)."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00936"},{"key":"e_1_3_2_1_32_1","volume-title":"DeeperGCN: All You Need to Train Deeper GCNs. arXiv preprint arXiv:2006.07739","author":"Li Guohao","year":"2020","unstructured":"Guohao Li , Chenxin Xiong , Ali Thabet , and Bernard Ghanem . 2020. DeeperGCN: All You Need to Train Deeper GCNs. arXiv preprint arXiv:2006.07739 ( 2020 ). Guohao Li, Chenxin Xiong, Ali Thabet, and Bernard Ghanem. 2020. DeeperGCN: All You Need to Train Deeper GCNs. arXiv preprint arXiv:2006.07739 (2020)."},{"key":"e_1_3_2_1_33_1","volume-title":"EnGN: A High-Throughput and Energy-Efficient Accelerator for Large Graph Neural Networks","author":"Liang Shengwen","year":"2020","unstructured":"Shengwen Liang , Ying Wang , Cheng Liu , Lei He , LI Huawei , Dawen Xu , and Xiaowei Li. 2020. EnGN: A High-Throughput and Energy-Efficient Accelerator for Large Graph Neural Networks . IEEE Trans. Comput . ( 2020 ). Shengwen Liang, Ying Wang, Cheng Liu, Lei He, LI Huawei, Dawen Xu, and Xiaowei Li. 2020. 
EnGN: A High-Throughput and Energy-Efficient Accelerator for Large Graph Neural Networks. IEEE Trans. Comput. (2020)."},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403076"},{"key":"e_1_3_2_1_35_1","volume-title":"NeuGraph: Parallel Deep Neural Network Computation on Large Graphs. In 2019 USENIX Annual Technical Conference (USENIX ATC 19)","author":"Ma Lingxiao","year":"2019","unstructured":"Lingxiao Ma , Zhi Yang , Youshan Miao , Jilong Xue , Ming Wu , Lidong Zhou , and Yafei Dai . 2019 . NeuGraph: Parallel Deep Neural Network Computation on Large Graphs. In 2019 USENIX Annual Technical Conference (USENIX ATC 19) . 443--458. Lingxiao Ma, Zhi Yang, Youshan Miao, Jilong Xue, Ming Wu, Lidong Zhou, and Yafei Dai. 2019. NeuGraph: Parallel Deep Neural Network Computation on Large Graphs. In 2019 USENIX Annual Technical Conference (USENIX ATC 19). 443--458."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3458817.3480856"},{"key":"e_1_3_2_1_37_1","volume-title":"Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv preprint arXiv:1511.06434","author":"Radford Alec","year":"2015","unstructured":"Alec Radford , Luke Metz , and Soumith Chintala . 2015. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv preprint arXiv:1511.06434 ( 2015 ). Alec Radford, Luke Metz, and Soumith Chintala. 2015. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv preprint arXiv:1511.06434 (2015)."},{"key":"e_1_3_2_1_38_1","volume-title":"FusedMM: A Unified SDDMM-SpMM Kernel for Graph Embedding and Graph Neural Networks. In 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 256--266","author":"Rahman Md Khaledur","year":"2021","unstructured":"Md Khaledur Rahman , Majedul Haque Sujon , and Ariful Azad . 2021 . 
FusedMM: A Unified SDDMM-SpMM Kernel for Graph Embedding and Graph Neural Networks. In 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 256--266 . Md Khaledur Rahman, Majedul Haque Sujon, and Ariful Azad. 2021. FusedMM: A Unified SDDMM-SpMM Kernel for Graph Embedding and Graph Neural Networks. In 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 256--266."},{"key":"e_1_3_2_1_39_1","first-page":"18482","article-title":"GCN Meets GPU: Decoupling \"When to Sample\" from \"How to Sample","volume":"33","author":"Ramezani Morteza","year":"2020","unstructured":"Morteza Ramezani , Weilin Cong , Mehrdad Mahdavi , Anand Sivasubramaniam , and Mahmut Kandemir . 2020 . GCN Meets GPU: Decoupling \"When to Sample\" from \"How to Sample \". Advances in Neural Information Processing Systems 33 (2020), 18482 -- 18492 . Morteza Ramezani, Weilin Cong, Mehrdad Mahdavi, Anand Sivasubramaniam, and Mahmut Kandemir. 2020. GCN Meets GPU: Decoupling \"When to Sample\" from \"How to Sample\". Advances in Neural Information Processing Systems 33 (2020), 18482--18492.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_40_1","unstructured":"Andres Rodriguez Wei Li Jason Dai Frank Zhang Jiong Gong and Chong Yu. 2017. Intel Processors for Deep Learning Training. https:\/\/software.intel.com\/content\/www\/us\/en\/develop\/articles\/intel-processors-for-deep-learning-training.html  Andres Rodriguez Wei Li Jason Dai Frank Zhang Jiong Gong and Chong Yu. 2017. Intel Processors for Deep Learning Training. https:\/\/software.intel.com\/content\/www\/us\/en\/develop\/articles\/intel-processors-for-deep-learning-training.html"},{"key":"e_1_3_2_1_41_1","volume-title":"International Conference on Machine Learning. 
PMLR, 4470--4479","author":"Sanchez-Gonzalez Alvaro","year":"2018","unstructured":"Alvaro Sanchez-Gonzalez , Nicolas Heess , Jost Tobias Springenberg , Josh Merel , Martin Riedmiller , Raia Hadsell , and Peter Battaglia . 2018 . Graph Networks as Learnable Physics Engines for Inference and Control . In International Conference on Machine Learning. PMLR, 4470--4479 . Alvaro Sanchez-Gonzalez, Nicolas Heess, Jost Tobias Springenberg, Josh Merel, Martin Riedmiller, Raia Hadsell, and Peter Battaglia. 2018. Graph Networks as Learnable Physics Engines for Inference and Control. In International Conference on Machine Learning. PMLR, 4470--4479."},{"key":"e_1_3_2_1_42_1","unstructured":"Lattice Semiconductor. 2015. Scatter-Gather Direct Memory Access Controller IP Core User Guide.  Lattice Semiconductor. 2015. Scatter-Gather Direct Memory Access Controller IP Core User Guide."},{"key":"e_1_3_2_1_43_1","volume-title":"Xbyak: JIT assembler for x86(IA32), x64(AMD64, x86-64) by C++. https:\/\/github.com\/herumi\/xbyak.","author":"Shigeo Mitsunari","year":"2021","unstructured":"Mitsunari Shigeo . 2021 . Xbyak: JIT assembler for x86(IA32), x64(AMD64, x86-64) by C++. https:\/\/github.com\/herumi\/xbyak. Mitsunari Shigeo. 2021. Xbyak: JIT assembler for x86(IA32), x64(AMD64, x86-64) by C++. https:\/\/github.com\/herumi\/xbyak."},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1038\/nature16961"},{"key":"e_1_3_2_1_45_1","volume-title":"Dropout: A Simple Way to Prevent Neural Networks from Overfitting. The journal of machine learning research 15, 1","author":"Srivastava Nitish","year":"2014","unstructured":"Nitish Srivastava , Geoffrey Hinton , Alex Krizhevsky , Ilya Sutskever , and Ruslan Salakhutdinov . 2014 . Dropout: A Simple Way to Prevent Neural Networks from Overfitting. The journal of machine learning research 15, 1 (2014), 1929--1958. Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. 
Dropout: A Simple Way to Prevent Neural Networks from Overfitting. The journal of machine learning research 15, 1 (2014), 1929--1958."},{"key":"e_1_3_2_1_46_1","unstructured":"Dean Takahashi. 2018. Gadi Singer interview - How Intel designs processors in the AI era. https:\/\/venturebeat.com\/2018\/09\/09\/gadi-singer-interview-how-intel-designs-processors-in-the-ai-era\/  Dean Takahashi. 2018. Gadi Singer interview - How Intel designs processors in the AI era. https:\/\/venturebeat.com\/2018\/09\/09\/gadi-singer-interview-how-intel-designs-processors-in-the-ai-era\/"},{"key":"e_1_3_2_1_47_1","volume-title":"15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21)","author":"Thorpe John","year":"2021","unstructured":"John Thorpe , Yifan Qiao , Jonathan Eyolfson , Shen Teng , Guanzhou Hu , Zhihao Jia , Jinliang Wei , Keval Vora , Ravi Netravali , Miryung Kim , and Guoqing Harry Xu . 2021 . Dorylus: Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads . In 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21) . USENIX Association, 495--514. https:\/\/www.usenix.org\/conference\/osdi21\/presentation\/thorpe John Thorpe, Yifan Qiao, Jonathan Eyolfson, Shen Teng, Guanzhou Hu, Zhihao Jia, Jinliang Wei, Keval Vora, Ravi Netravali, Miryung Kim, and Guoqing Harry Xu. 2021. Dorylus: Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads. In 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21). USENIX Association, 495--514. https:\/\/www.usenix.org\/conference\/osdi21\/presentation\/thorpe"},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447786.3456229"},{"key":"e_1_3_2_1_49_1","volume-title":"Highly-Performant Package for Graph Neural Networks. 
arXiv preprint arXiv:1909.01315","author":"Wang Minjie","year":"2019","unstructured":"Minjie Wang , Da Zheng , Zihao Ye , Quan Gan , Mufei Li , Xiang Song , Jinjing Zhou , Chao Ma , Lingfan Yu , Yu Gai , Tianjun Xiao , Tong He , George Karypis , Jinyang Li , and Zheng Zhang . 2019. Deep Graph Library: A Graph-Centric , Highly-Performant Package for Graph Neural Networks. arXiv preprint arXiv:1909.01315 ( 2019 ). Minjie Wang, Da Zheng, Zihao Ye, Quan Gan, Mufei Li, Xiang Song, Jinjing Zhou, Chao Ma, Lingfan Yu, Yu Gai, Tianjun Xiao, Tong He, George Karypis, Jinyang Li, and Zheng Zhang. 2019. Deep Graph Library: A Graph-Centric, Highly-Performant Package for Graph Neural Networks. arXiv preprint arXiv:1909.01315 (2019)."},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/2851141.2851145"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2020.2978386"},{"key":"e_1_3_2_1_52_1","unstructured":"Xilinx. 2019. AXI DMA v7.1 LogiCORE IP Product Guide.  Xilinx. 2019. AXI DMA v7.1 LogiCORE IP Product Guide."},{"key":"e_1_3_2_1_53_1","unstructured":"Koichi Yamada Wei Li and Pradeep Dubey. 2020. Intel's MLPerf Results Show Robust CPU-Based Training Performance For a Range of Workloads. https:\/\/www.intel.com\/content\/www\/us\/en\/artificial-intelligence\/posts\/intels-mlperf-results.html  Koichi Yamada Wei Li and Pradeep Dubey. 2020. Intel's MLPerf Results Show Robust CPU-Based Training Performance For a Range of Workloads. https:\/\/www.intel.com\/content\/www\/us\/en\/artificial-intelligence\/posts\/intels-mlperf-results.html"},{"key":"e_1_3_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA47549.2020.00012"},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219890"},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.aiopen.2021.01.001"},{"key":"e_1_3_2_1_57_1","volume-title":"AliGraph: A Comprehensive Graph Neural Network Platform. 
arXiv preprint arXiv:1902.08730","author":"Zhu Rong","year":"2019","unstructured":"Rong Zhu , Kun Zhao , Hongxia Yang , Wei Lin , Chang Zhou , Baole Ai , Yong Li , and Jingren Zhou . 2019. AliGraph: A Comprehensive Graph Neural Network Platform. arXiv preprint arXiv:1902.08730 ( 2019 ). Rong Zhu, Kun Zhao, Hongxia Yang, Wei Lin, Chang Zhou, Baole Ai, Yong Li, and Jingren Zhou. 2019. AliGraph: A Comprehensive Graph Neural Network Platform. arXiv preprint arXiv:1902.08730 (2019)."}],"event":{"name":"ISCA '22: The 49th Annual International Symposium on Computer Architecture","location":"New York, New York","acronym":"ISCA '22","sponsor":["SIGARCH ACM Special Interest Group on Computer Architecture","IEEE CS TCCA IEEE CS technical committee on computer architecture"]},"container-title":["Proceedings of the 49th Annual International Symposium on Computer Architecture"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3470496.3527403","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3470496.3527403","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3470496.3527403","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:28Z","timestamp":1750188628000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3470496.3527403"}},"subtitle":["optimizing graph neural networks on CPUs through cooperative software-hardware 
techniques"],"short-title":[],"issued":{"date-parts":[[2022,6,11]]},"references-count":57,"alternative-id":["10.1145\/3470496.3527403","10.1145\/3470496"],"URL":"https:\/\/doi.org\/10.1145\/3470496.3527403","relation":{},"subject":[],"published":{"date-parts":[[2022,6,11]]},"assertion":[{"value":"2022-06-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}