{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T15:22:42Z","timestamp":1772119362189,"version":"3.50.1"},"reference-count":53,"publisher":"Association for Computing Machinery (ACM)","issue":"5s","license":[{"start":{"date-parts":[[2023,9,9]],"date-time":"2023-09-09T00:00:00Z","timestamp":1694217600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"National Science Foundation","award":["CCF-1813370 and CCF-2006788"],"award-info":[{"award-number":["CCF-1813370 and CCF-2006788"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2023,10,31]]},"abstract":"<jats:p>\n            Graph neural networks (GNNs) have emerged as a powerful approach for modelling and learning from graph-structured data. Multiple fields have since benefitted enormously from the capabilities of GNNs, such as recommendation systems, social network analysis, drug discovery, and robotics. However, accelerating and efficiently processing GNNs require a unique approach that goes beyond conventional artificial neural network accelerators, due to the substantial computational and memory requirements of GNNs. The slowdown of scaling in CMOS platforms also motivates a search for alternative implementation substrates. In this paper, we present\n            <jats:italic>GHOST<\/jats:italic>\n            , the first silicon-photonic hardware accelerator for GNNs.\n            <jats:italic>GHOST<\/jats:italic>\n            efficiently alleviates the costs associated with both vertex-centric and edge-centric operations. 
It implements separately the three main stages involved in running GNNs in the optical domain, allowing it to be used for the inference of various widely used GNN models and architectures, such as graph convolution networks and graph attention networks. Our simulation studies indicate that\n            <jats:italic>GHOST<\/jats:italic>\n            exhibits at least 10.2 \u00d7 better throughput and 3.8 \u00d7 better energy efficiency when compared to GPU, TPU, CPU and multiple state-of-the-art GNN hardware accelerators.\n          <\/jats:p>","DOI":"10.1145\/3609097","type":"journal-article","created":{"date-parts":[[2023,9,9]],"date-time":"2023-09-09T13:33:18Z","timestamp":1694266398000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["GHOST: A Graph Neural Network Accelerator using Silicon Photonics"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0009-0006-0376-8754","authenticated-orcid":false,"given":"Salma","family":"Afifi","sequence":"first","affiliation":[{"name":"Colorado State University"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3619-1465","authenticated-orcid":false,"given":"Febin","family":"Sunny","sequence":"additional","affiliation":[{"name":"Colorado State University"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7814-2370","authenticated-orcid":false,"given":"Amin","family":"Shafiee","sequence":"additional","affiliation":[{"name":"Colorado State University"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4930-2985","authenticated-orcid":false,"given":"Mahdi","family":"Nikdast","sequence":"additional","affiliation":[{"name":"Colorado State University"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0846-0066","authenticated-orcid":false,"given":"Sudeep","family":"Pasricha","sequence":"additional","affiliation":[{"name":"Colorado State 
University"}]}],"member":"320","published-online":{"date-parts":[[2023,9,9]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1038\/323533a0"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.aiopen.2021.01.001"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2020.2978386"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/3477141"},{"issue":"2166","key":"e_1_3_1_7_2","doi-asserted-by":"crossref","first-page":"20190061","DOI":"10.1098\/rsta.2019.0061","article-title":"The future of computing beyond Moore's law","volume":"378","author":"Shalf John","year":"2020","unstructured":"John Shalf. 2020. The future of computing beyond Moore's law. Philosophical Transactions of the Royal Society A 378, 2166 (2020), 20190061.","journal-title":"Philosophical Transactions of the Royal Society A"},{"issue":"4","key":"e_1_3_1_8_2","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1109\/MDAT.2020.2982628","article-title":"A survey of silicon photonics for energy-efficient manycore computing","volume":"37","author":"Pasricha Sudeep","year":"2021","unstructured":"Sudeep Pasricha and Mahdi Nikdast. 2021. A survey of silicon photonics for energy-efficient manycore computing. IEEE Design & Test 37, 4 (2021), 60\u201381.","journal-title":"IEEE Design & Test"},{"key":"e_1_3_1_9_2","first-page":"1069","volume-title":"58th ACM\/IEEE Design Automation Conference (DAC)","author":"Sunny Febin","year":"2021","unstructured":"Febin Sunny, Asif Mirza, Mahdi Nikdast, and Sudeep Pasricha. 2021. CrossLight: A cross-layer optimized silicon photonic neural network accelerator. 58th ACM\/IEEE Design Automation Conference (DAC). 
1069\u20131074."},{"issue":"1","key":"e_1_3_1_10_2","first-page":"1","article-title":"Digital electronics and analog photonics for convolutional neural networks (DEAP-CNNs)","volume":"26","author":"Bangari Viraj","year":"2020","unstructured":"Viraj Bangari et al. 2020. Digital electronics and analog photonics for convolutional neural networks (DEAP-CNNs). IEEE Journal of Quantum Electronics 26, 1 (2020), 1\u201313.","journal-title":"IEEE Journal of Quantum Electronics"},{"key":"e_1_3_1_11_2","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1109\/ISVLSI54635.2022.00030","volume-title":"IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","author":"Sunny Febin","year":"2022","unstructured":"Febin Sunny, Mahdi Nikdast, and Sudeep Pasricha. 2022. RecLight: A recurrent neural network accelerator with integrated silicon photonics. IEEE Computer Society Annual Symposium on VLSI (ISVLSI). 98\u2013103."},{"key":"e_1_3_1_12_2","first-page":"15","volume-title":"ACM Great Lakes Symposium on VLSI (GLSVLSI)","author":"Afifi Salma","year":"2023","unstructured":"Salma Afifi, Febin Sunny, Mahdi Nikdast, and Sudeep Pasricha. 2023. TRON: Transformer neural network acceleration with non-coherent silicon photonics. ACM Great Lakes Symposium on VLSI (GLSVLSI). 15\u201321."},{"key":"e_1_3_1_13_2","first-page":"214","volume-title":"IEEE\/ACM Asia & South Pacific Design Automation Conference (ASPDAC)","author":"Sunny Febin","year":"2022","unstructured":"Febin Sunny, Mahdi Nikdast, and Sudeep Pasricha. 2022. SONIC: A sparse neural network inference accelerator with silicon photonics for energy-efficient deep learning. IEEE\/ACM Asia & South Pacific Design Automation Conference (ASPDAC). 
214\u2013219."},{"key":"e_1_3_1_14_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3476988","article-title":"ROBIN: A robust optical binary neural network accelerator","author":"Sunny Febin","year":"2021","unstructured":"Febin Sunny, Asif Mirza, Mahdi Nikdast, and Sudeep Pasricha. 2021. ROBIN: A robust optical binary neural network accelerator. ACM Transactions on Embedded Computing Systems. 1\u201324.","journal-title":"ACM Transactions on Embedded Computing Systems"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.2008.2005605"},{"key":"e_1_3_1_16_2","unstructured":"Thomas N. Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907."},{"key":"e_1_3_1_17_2","first-page":"1024","article-title":"Inductive representation learning on large graphs","volume":"30","author":"Hamilton Will","year":"2017","unstructured":"Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. Advances in Neural Information Processing Systems 30 (2017), 1024\u20131034.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_18_2","unstructured":"Keyulu Xu Weihua Hu Jure Leskovec and Stefanie Jegelka. 2018. How powerful are graph neural networks? arXiv preprint arXiv:1810.00826."},{"key":"e_1_3_1_19_2","unstructured":"Petar Velickovic et al. 2017. Graph Attention Networks. arXiv preprint arXiv:1710.10903."},{"key":"e_1_3_1_20_2","unstructured":"Matthias Fey and Jan Eric Lenssen. 2019. Fast graph representation learning with PyTorch Geometric. arXiv preprint arXiv:1903.02428."},{"key":"e_1_3_1_21_2","article-title":"Deep graph library: Towards efficient and scalable deep learning on graphs","author":"Wang Minjie Yu","year":"2019","unstructured":"Minjie Yu Wang. 2019. Deep graph library: Towards efficient and scalable deep learning on graphs. 
ICLR Workshop on Representation Learning on Graphs and Manifolds.","journal-title":"ICLR Workshop on Representation Learning on Graphs and Manifolds"},{"key":"e_1_3_1_22_2","first-page":"443","article-title":"NeuGraph: Parallel deep neural network computation on large graphs","author":"Ma Lingxiao","year":"2019","unstructured":"Lingxiao Ma. 2019. NeuGraph: Parallel deep neural network computation on large graphs. USENIX Annual Technical Conference. 443\u2013458.","journal-title":"USENIX Annual Technical Conference"},{"key":"e_1_3_1_23_2","volume-title":"Workshop on Resource-Constrained Machine Learning (ReCoML 2020)","author":"Kiningham Kevin","year":"2020","unstructured":"Kevin Kiningham, Philip Levis, and Christopher R\u00e9. 2020. Greta: Hardware optimized graph processing for gnns. Workshop on Resource-Constrained Machine Learning (ReCoML 2020)."},{"key":"e_1_3_1_24_2","first-page":"1","volume-title":"ACM\/IEEE Design Automation Conference (DAC)","author":"Auten Adam","year":"2020","unstructured":"Adam Auten, Matthew Tomei, and Rakesh Kumar. 2020. Hardware acceleration of graph neural networks. ACM\/IEEE Design Automation Conference (DAC). 1\u20136."},{"key":"e_1_3_1_25_2","unstructured":"Shengwen Liang et al. 2019. EnGN: A high-throughput and energy-efficient accelerator for large graph neural networks. arXiv preprint arXiv:1909.00155."},{"key":"e_1_3_1_26_2","first-page":"15","volume-title":"IEEE International Symposium on High Performance Computer Architecture (HPCA)","author":"Yan Mingyu","year":"2020","unstructured":"Mingyu Yan. 2020. Hygcn: A gcn accelerator with hybrid architecture. IEEE International Symposium on High Performance Computer Architecture (HPCA). 
15\u201329."},{"issue":"4","key":"e_1_3_1_27_2","doi-asserted-by":"crossref","first-page":"914","DOI":"10.1109\/TC.2022.3197083","article-title":"GRIP: A graph neural network accelerator architecture","volume":"72","author":"Kiningham Kevin","year":"2022","unstructured":"Kevin Kiningham, Philip Levis, and Christopher R\u00e9. 2022. GRIP: A graph neural network accelerator architecture. IEEE Transactions on Computers 72, 4 (2022), 914\u2013925.","journal-title":"IEEE Transactions on Computers"},{"key":"e_1_3_1_28_2","first-page":"469","volume-title":"59th ACM\/IEEE Design Automation Conference (DAC)","author":"Liu Cong","year":"2022","unstructured":"Cong Liu. 2022. ReGNN: A ReRAM-based heterogeneous architecture for general graph neural networks. 59th ACM\/IEEE Design Automation Conference (DAC). 469\u2013474."},{"key":"e_1_3_1_29_2","first-page":"1667","article-title":"ReGraphX: NoC-enabled 3D heterogeneous ReRAM architecture for training graph neural networks","author":"Iqbal Arka Aqeeb","year":"2021","unstructured":"Aqeeb Iqbal Arka. 2021. ReGraphX: NoC-enabled 3D heterogeneous ReRAM architecture for training graph neural networks. Design, Automation & Test in Europe Conference & Exhibition (DATE). 1667\u20131672.","journal-title":"Design, Automation & Test in Europe Conference & Exhibition (DATE)"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/3459009"},{"key":"e_1_3_1_31_2","first-page":"539","volume-title":"Great Lakes Symposium on VLSI (GLSVLSI)","author":"Sunny Febin","year":"2023","unstructured":"Febin Sunny, Mahdi Nikdast, and Sudeep Pasricha. 2023. Cross-Layer design for AI acceleration with non-coherent optical computing. Great Lakes Symposium on VLSI (GLSVLSI). 
539\u2013544."},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2021.3132555"},{"issue":"9","key":"e_1_3_1_33_2","doi-asserted-by":"crossref","first-page":"1800275","DOI":"10.1002\/lpor.201800275","article-title":"PWM-Driven thermally tunable silicon microring resonators: design, fabrication, and characterization","volume":"13","author":"Pintus Paolo","year":"2019","unstructured":"Paolo Pintus. 2019. PWM-Driven thermally tunable silicon microring resonators: design, fabrication, and characterization. Laser & Photonics Reviews 13, 9 (2019), 1800275.","journal-title":"Laser & Photonics Reviews"},{"issue":"8","key":"e_1_3_1_34_2","doi-asserted-by":"crossref","first-page":"1688","DOI":"10.1109\/JLT.2015.2510282","article-title":"A hybrid barium titanate\u2013silicon photonics platform for ultraefficient electro-optic tuning","volume":"34","author":"Abel Stefan","year":"2016","unstructured":"Stefan Abel. 2016. A hybrid barium titanate\u2013silicon photonics platform for ultraefficient electro-optic tuning. Journal of Lightwave Technology 34, 8 (2016), 1688\u20131693.","journal-title":"Journal of Lightwave Technology"},{"key":"e_1_3_1_35_2","first-page":"7","article-title":"Study on in-chip phase locked high brightness bottom emitting Talbot-VCSELs array","volume":"11562","author":"Wang Congcong","year":"2020","unstructured":"Congcong Wang, Chong Li, Jingjing Dai, and Zhiyong Wang. 2020. Study on in-chip phase locked high brightness bottom emitting Talbot-VCSELs array. 
Advanced Laser Technology and Application (AOPC) 11562 (2020), 7\u201312.","journal-title":"Advanced Laser Technology and Application (AOPC)"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPHOT.2019.2941960"},{"issue":"4","key":"e_1_3_1_37_2","doi-asserted-by":"crossref","first-page":"1325","DOI":"10.1109\/JLT.2019.2892512","article-title":"Canceling thermal cross-talk effects in photonic integrated circuits","volume":"37","author":"Milanizadeh Maziyar","year":"2019","unstructured":"Maziyar Milanizadeh, Douglas Aguiar, Andrea Melloni, and Francesco Morichetti. 2019. Canceling thermal cross-talk effects in photonic integrated circuits. Journal of Lightwave Technology 37, 4 (2019), 1325\u20131332.","journal-title":"Journal of Lightwave Technology"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1002\/lpor.201100017"},{"key":"e_1_3_1_39_2","unstructured":"Online: https:\/\/www.ansys.com\/products\/photonics. Last accessed: 3\/19\/2023."},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/3446212"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.2011.2161771"},{"key":"e_1_3_1_42_2","first-page":"45","volume-title":"IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)","author":"Wei Zhigang","year":"2020","unstructured":"Zhigang Wei, Aman Arora, Pragenesh Patel, and Lizy John. 2020. Design space exploration for softmax implementations. IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP). 45\u201352."},{"key":"e_1_3_1_43_2","unstructured":"HP Labs : CACTI. [Online]: https:\/\/www.hpl.hp.com\/research\/cacti\/"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.vlsi.2017.02.002"},{"key":"e_1_3_1_45_2","unstructured":"Intel: HBM2. 
[online]: https:\/\/www.intel.com\/content\/www\/us\/en\/docs\/programmable\/683189\/21-3-19-6-1\/high-bandwidth-memory-hbm2-dram-bandwidth.html"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2020.2973991"},{"issue":"14","key":"e_1_3_1_47_2","doi-asserted-by":"crossref","first-page":"1623","DOI":"10.1364\/OL.29.001623","article-title":"Ultralow-loss 3-dB photonic crystal waveguide splitter","volume":"29","author":"Hagedorn Frandsen Lars","year":"2004","unstructured":"Lars Hagedorn Frandsen. 2004. Ultralow-loss 3-dB photonic crystal waveguide splitter. Optics letters 29, 14 (2004), 1623\u20131625.","journal-title":"Optics letters"},{"issue":"4","key":"e_1_3_1_48_2","first-page":"1","article-title":"High-efficiency ultra-broadband multi-tip edge couplers for integration of distributed feedback laser with silicon-on-insulator waveguide","volume":"11","author":"Tu Yi-Chou","year":"2019","unstructured":"Yi-Chou Tu, Po-Han Fu, and Ding-Wei Huang. 2019. High-efficiency ultra-broadband multi-tip edge couplers for integration of distributed feedback laser with silicon-on-insulator waveguide. IEEE Photonics Journal 11, 4 (2019), 1\u201313.","journal-title":"IEEE Photonics Journal"},{"key":"e_1_3_1_49_2","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1109\/ASPDAC.2011.5722211","volume-title":"Asia and South Pacific Design Automation Conference (ASP-DAC 2011)","author":"Pasricha Sudeep","year":"2011","unstructured":"Sudeep Pasricha and Shirish Bahirat. 2011. OPAL: A multi-layer hybrid photonic NoC for 3D ICs. Asia and South Pacific Design Automation Conference (ASP-DAC 2011). 345\u2013350."},{"key":"e_1_3_1_50_2","first-page":"48","volume-title":"IEEE Optical Interconnects Conference (OI)","author":"Jayatilleka Hasitha","year":"2015","unstructured":"Hasitha Jayatilleka. 2015. Crosstalk limitations of microring-resonator based WDM demultiplexers on SOI. IEEE Optical Interconnects Conference (OI). 
48\u201349."},{"issue":"7","key":"e_1_3_1_51_2","first-page":"2312","article-title":"A 3 mW 6-bit 4 GS\/s subranging ADC with subrange-dependent embedded references","volume":"68","author":"Yang Chung-Ming","year":"2021","unstructured":"Chung-Ming Yang and Tai-Haur Kuo. 2021. A 3 mW 6-bit 4 GS\/s subranging ADC with subrange-dependent embedded references. IEEE Transactions on Circuits and Systems II: Express Briefs 68, 7 (2021), 2312\u20132316.","journal-title":"IEEE Transactions on Circuits and Systems II: Express Briefs"},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2013.2279571"},{"key":"e_1_3_1_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2023.3241146"},{"key":"e_1_3_1_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNANO.2022.3223915"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3609097","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3609097","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:48:58Z","timestamp":1750182538000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3609097"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,9]]},"references-count":53,"journal-issue":{"issue":"5s","published-print":{"date-parts":[[2023,10,31]]}},"alternative-id":["10.1145\/3609097"],"URL":"https:\/\/doi.org\/10.1145\/3609097","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"value":"1539-9087","type":"print"},{"value":"1558-3465","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,9]]},"assertion":[{"value":"2023-03-22","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2023-06-30","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-09-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}