{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:32:02Z","timestamp":1772137922140,"version":"3.50.1"},"reference-count":73,"publisher":"IOP Publishing","issue":"1","license":[{"start":{"date-parts":[[2024,2,23]],"date-time":"2024-02-23T00:00:00Z","timestamp":1708646400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,2,23]],"date-time":"2024-02-23T00:00:00Z","timestamp":1708646400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/iopscience.iop.org\/info\/page\/text-and-data-mining"}],"content-domain":{"domain":["iopscience.iop.org"],"crossmark-restriction":false},"short-container-title":["Neuromorph. Comput. Eng."],"published-print":{"date-parts":[[2024,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency when performing inference with deep learning workloads. Error backpropagation is presently regarded as the most effective method for training SNNs, but in a twist of irony, when training on modern graphics processing units this becomes more expensive than non-spiking networks. The emergence of Graphcore\u2019s intelligence processing units (IPUs) balances the parallelized nature of deep learning workloads with the sequential, reusable, and sparsified nature of operations prevalent when training SNNs. IPUs adopt multi-instruction multi-data parallelism by running individual processing threads on smaller data blocks, which is a natural fit for the sequential, non-vectorized steps required to solve spiking neuron dynamical state equations. We present an IPU-optimized release of our custom SNN Python package,\n                    <jats:italic>snnTorch<\/jats:italic>\n                    , which exploits fine-grained parallelism by utilizing low-level, pre-compiled custom operations to accelerate irregular and sparse data access patterns that are characteristic of training SNN workloads. We provide a rigorous performance assessment across a suite of commonly used spiking neuron models, and propose methods to further reduce training run-time via half-precision training. By amortizing the cost of sequential processing into vectorizable population codes, we ultimately demonstrate the potential for integrating domain-specific accelerators with the next generation of neural networks.\n                  <\/jats:p>","DOI":"10.1088\/2634-4386\/ad2373","type":"journal-article","created":{"date-parts":[[2024,1,29]],"date-time":"2024-01-29T17:21:06Z","timestamp":1706548866000},"page":"014004","update-policy":"https:\/\/doi.org\/10.1088\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Exploiting deep learning accelerators for neuromorphic workloads"],"prefix":"10.1088","volume":"4","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-7743-0961","authenticated-orcid":true,"given":"Pao-Sheng Vincent","family":"Sun","sequence":"first","affiliation":[]},{"given":"Alexander","family":"Titterton","sequence":"additional","affiliation":[]},{"given":"Anjlee","family":"Gopiani","sequence":"additional","affiliation":[]},{"given":"Tim","family":"Santos","sequence":"additional","affiliation":[]},{"given":"Arindam","family":"Basu","sequence":"additional","affiliation":[]},{"given":"Wei D","family":"Lu","sequence":"additional","affiliation":[]},{"given":"Jason K","family":"Eshraghian","sequence":"additional","affiliation":[]}],"member":"266","published-online":{"date-parts":[[2024,2,23]]},"reference":[{"key":"ncead2373bib1","article-title":"High performance convolutional neural networks for document processing","author":"Chellapilla","year":"2006"},{"key":"ncead2373bib2","doi-asserted-by":"publisher","first-page":"1311","DOI":"10.1016\/j.patcog.2004.01.013","article-title":"GPU implementation of neural networks","volume":"37","author":"Oh","year":"2004","journal-title":"Pattern Recognit."},{"key":"ncead2373bib3","first-page":"pp 133","article-title":"Understanding the efficiency of GPU algorithms for matrix-matrix multiplication","author":"Fatahalian","year":"2004"},{"key":"ncead2373bib4","article-title":"Flexible, high performance convolutional neural networks for image classification","author":"Ciresan","year":"2011"},{"key":"ncead2373bib5","first-page":"pp 1097","article-title":"Imagenet classification with deep convolutional neural networks","volume":"vol 25","author":"Krizhevsky","year":"2012"},{"key":"ncead2373bib6","doi-asserted-by":"publisher","first-page":"51","DOI":"10.1109\/MSP.2019.2931595","article-title":"Surrogate gradient learning in spiking neural networks: bringing the power of gradient-based optimization to spiking neural networks","volume":"36","author":"Neftci","year":"2019","journal-title":"IEEE Signal Process. Mag."},{"key":"ncead2373bib7","doi-asserted-by":"publisher","first-page":"544","DOI":"10.1016\/j.neuron.2009.07.018","article-title":"Generating coherent patterns of activity from chaotic neural networks","volume":"63","author":"Sussillo","year":"2009","journal-title":"Neuron"},{"key":"ncead2373bib8","doi-asserted-by":"publisher","first-page":"1297","DOI":"10.1021\/nl904092h","article-title":"Nanoscale memristor device as synapse in neuromorphic systems","volume":"10","author":"Jo","year":"2010","journal-title":"Nano Lett."},{"key":"ncead2373bib9","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41467-021-24260-z","article-title":"Avalanches and edge-of-chaos learning in neuromorphic nanowire networks","volume":"12","author":"Hochstetter","year":"2021","journal-title":"Nat. Commun."},{"key":"ncead2373bib10","doi-asserted-by":"publisher","first-page":"1659","DOI":"10.1016\/S0893-6080(97)00011-7","article-title":"Networks of spiking neurons: the third generation of neural network models","volume":"10","author":"Maass","year":"1997","journal-title":"Neural Netw."},{"key":"ncead2373bib11","doi-asserted-by":"publisher","first-page":"99","DOI":"10.3389\/fncom.2015.00099","article-title":"Unsupervised learning of digit recognition using spike-timing-dependent plasticity","volume":"9","author":"Diehl","year":"2015","journal-title":"Front. Comput. Neurosci."},{"key":"ncead2373bib12","doi-asserted-by":"publisher","first-page":"167","DOI":"10.3109\/0954898X.2012.730170","article-title":"Simulating spiking neural networks on GPU","volume":"23","author":"Brette","year":"2012","journal-title":"Netw. Comput. Neural Syst."},{"key":"ncead2373bib13","doi-asserted-by":"publisher","first-page":"1138","DOI":"10.1109\/TBCAS.2020.3036081","article-title":"Hardware implementation of deep network accelerators towards healthcare and biomedical applications","volume":"14","author":"Azghadi","year":"2020","journal-title":"IEEE Trans. Biomed. Circuits Syst."},{"key":"ncead2373bib14","first-page":"pp 1","article-title":"Accelerated simulation of spiking neural networks using GPUs","author":"Fidjeland","year":"2010"},{"key":"ncead2373bib15","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1109\/MNANO.2022.3141443","article-title":"Memristor-based binarized spiking neural networks: challenges and applications","volume":"16","author":"Eshraghian","year":"2022","journal-title":"IEEE Nanotechnol. Mag."},{"key":"ncead2373bib16","doi-asserted-by":"publisher","first-page":"2295","DOI":"10.1109\/JPROC.2017.2761740","article-title":"Efficient processing of deep neural networks: a tutorial and survey","volume":"105","author":"Sze","year":"2017","journal-title":"Proc. IEEE"},{"key":"ncead2373bib17","first-page":"pp 1","article-title":"In-datacenter performance analysis of a tensor processing unit","author":"Jouppi","year":"2017"},{"key":"ncead2373bib18","doi-asserted-by":"publisher","first-page":"5135","DOI":"10.1109\/TCSI.2022.3206262","article-title":"APTPU: approximate computing based tensor processing unit","volume":"69","author":"Elbtity","year":"2022","journal-title":"IEEE Trans. Circuits Syst. I"},{"key":"ncead2373bib19","first-page":"pp 145","article-title":"Think fast: a tensor streaming processor (TSP) for accelerating deep learning workloads","author":"Abts","year":"2020"},{"key":"ncead2373bib20","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1016\/S0925-2312(01)00658-0","article-title":"Error-backpropagation in temporally encoded networks of spiking neurons","volume":"48","author":"Bohte","year":"2002","journal-title":"Neurocomputing"},{"key":"ncead2373bib21","article-title":"Pytorch: an imperative style, high-performance deep learning library","volume":"vol 32","author":"Paszke","year":"2019"},{"key":"ncead2373bib22","first-page":"pp 265","article-title":"TensorFlow: a system for large-scale machine learning","author":"Abadi","year":"2016"},{"key":"ncead2373bib23","article-title":"Compiling machine learning programs via high-level tracing","volume":"vol 4","author":"Frostig","year":"2018"},{"key":"ncead2373bib24","article-title":"Spiking deep networks with LIF neurons","author":"Hunsberger","year":"2015"},{"key":"ncead2373bib25","first-page":"pp 1419","article-title":"SLAYER: spike layer error reassignment in time","author":"Shrestha","year":"2018"},{"key":"ncead2373bib26","article-title":"Long short-term memory and learning-to-learn in networks of spiking neurons","author":"Bellec","year":"2018"},{"key":"ncead2373bib27","article-title":"Spiking neural network for nonlinear regression","author":"Henkes","year":"2022"},{"key":"ncead2373bib28","doi-asserted-by":"publisher","first-page":"11441","DOI":"10.1073\/pnas.1604850113","article-title":"Convolutional networks for fast, energy-efficient neuromorphic computing","volume":"113","author":"Esser","year":"2016","journal-title":"Proc. Natl Acad. Sci."},{"key":"ncead2373bib29","article-title":"Gradient descent for spiking neural networks","author":"Huh","year":"2017"},{"key":"ncead2373bib30","article-title":"Generalization of back propagation to recurrent and higher order neural networks","author":"Pineda","year":"1987"},{"key":"ncead2373bib31","doi-asserted-by":"publisher","first-page":"1550","DOI":"10.1109\/5.58337","article-title":"Backpropagation through time: what it does and how to do it","volume":"78","author":"Werbos","year":"1990","journal-title":"Proc. IEEE"},{"key":"ncead2373bib32","doi-asserted-by":"publisher","first-page":"82","DOI":"10.1109\/MM.2018.112130359","article-title":"Loihi: a neuromorphic manycore processor with on-chip learning","volume":"38","author":"Davies","year":"2018","journal-title":"IEEE Micro"},{"key":"ncead2373bib33","first-page":"pp 254","article-title":"Efficient neuromorphic signal processing with Loihi 2","author":"Orchard","year":"2021"},{"key":"ncead2373bib34","doi-asserted-by":"publisher","first-page":"668","DOI":"10.1126\/science.1254642","article-title":"A million spiking-neuron integrated circuit with a scalable communication network and interface","volume":"345","author":"Merolla","year":"2014","journal-title":"Am. Assoc. Adv. Sci."},{"key":"ncead2373bib35","doi-asserted-by":"publisher","first-page":"1537","DOI":"10.1109\/TCAD.2015.2474396","article-title":"TrueNorth: design and tool flow of a 65 mW 1 million neuron programmable neurosynaptic chip","volume":"34","author":"Akopyan","year":"2015","journal-title":"IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst."},{"key":"ncead2373bib36","doi-asserted-by":"publisher","first-page":"699","DOI":"10.1109\/JPROC.2014.2313565","article-title":"Neurogrid: a mixed-analog-digital multichip system for large-scale neural simulations","volume":"102","author":"Benjamin","year":"2014","journal-title":"Proc. IEEE"},{"key":"ncead2373bib37","first-page":"pp 2849","article-title":"SpiNNaker: mapping neural networks onto a massively-parallel chip multiprocessor","author":"Khan","year":"2008"},{"key":"ncead2373bib38","doi-asserted-by":"publisher","first-page":"652","DOI":"10.1109\/JPROC.2014.2304638","article-title":"The SpiNNaker project","volume":"102","author":"Furber","year":"2014","journal-title":"Proc. IEEE"},{"key":"ncead2373bib39","first-page":"pp 240","article-title":"Shenjing: a low power reconfigurable neuromorphic accelerator with partial-sum and spike networks-on-chip","author":"Wang","year":"2020"},{"key":"ncead2373bib40","first-page":"pp 1","article-title":"RENO: a high-efficient reconfigurable neuromorphic computing accelerator design","author":"Liu","year":"2015"},{"key":"ncead2373bib41","doi-asserted-by":"publisher","first-page":"617","DOI":"10.1109\/TCSI.2016.2529279","article-title":"Harmonica: a framework of heterogeneous computing systems with memristor-based neuromorphic computing accelerators","volume":"63","author":"Liu","year":"2016","journal-title":"IEEE Trans. Circuits Syst. I"},{"key":"ncead2373bib42","doi-asserted-by":"publisher","first-page":"1009","DOI":"10.1109\/TCAD.2017.2729466","article-title":"MNSIM: simulation platform for memristor-based neuromorphic computing system","volume":"37","author":"Xia","year":"2017","journal-title":"IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst."},{"key":"ncead2373bib43","doi-asserted-by":"publisher","first-page":"270","DOI":"10.1162\/neco.1989.1.2.270","article-title":"A learning algorithm for continually running fully recurrent neural networks","volume":"1","author":"Williams","year":"1989","journal-title":"Neural Comput."},{"key":"ncead2373bib44","first-page":"pp 1","article-title":"ReckOn: a 28nm sub-mm2 task-agnostic spiking recurrent neural network processor enabling on-chip learning over second-long timescales","volume":"vol 65","author":"Frenkel","year":"2022"},{"key":"ncead2373bib45","doi-asserted-by":"publisher","first-page":"3625","DOI":"10.1038\/s41467-020-17236-y","article-title":"A solution to the learning dilemma for recurrent networks of spiking neurons","volume":"11","author":"Bellec","year":"2020","journal-title":"Nat. Commun."},{"key":"ncead2373bib46","doi-asserted-by":"publisher","first-page":"899","DOI":"10.1162\/neco_a_01367","article-title":"The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks","volume":"33","author":"Zenke","year":"2021","journal-title":"Neural Comput."},{"key":"ncead2373bib47","author":"Griewank","year":"2008"},{"key":"ncead2373bib48","doi-asserted-by":"crossref","DOI":"10.21203\/rs.3.rs-701752\/v1","article-title":"The backpropagation algorithm implemented on spiking neuromorphic hardware","author":"Renner","year":"2021"},{"key":"ncead2373bib49","article-title":"Biograd: biologically plausible gradient-based learning for spiking neural networks","author":"Tang","year":"2021"},{"key":"ncead2373bib50","doi-asserted-by":"publisher","first-page":"208","DOI":"10.1145\/242224.242490","article-title":"A bridging model for parallel computation, communication and I\/O","volume":"28","author":"Cormen","year":"1996","journal-title":"ACM Computing Surveys (CSUR)"},{"key":"ncead2373bib51","first-page":"pp 497","article-title":"Memory bandwidth contention: communication vs computation tradeoffs in supercomputers with multicore architectures","author":"Langguth","year":"2018"},{"key":"ncead2373bib52","first-page":"pp 291","article-title":"iPUG: accelerating breadth-first graph traversals using manycore graphcore IPUs","author":"Burchard","year":"2021"},{"key":"ncead2373bib53","doi-asserted-by":"publisher","DOI":"10.3389\/fninf.2021.659005","article-title":"PyGeNN: a Python library for GPU-enhanced neural networks","volume":"15","author":"Knight","year":"2021","journal-title":"Front. Neuroinform."},{"key":"ncead2373bib54","article-title":"Training spiking neural networks using lessons from deep learning","author":"Eshraghian","year":"2021"},{"key":"ncead2373bib55","article-title":"Spikingjelly","author":"Fang","year":"2020"},{"key":"ncead2373bib56","doi-asserted-by":"publisher","first-page":"89","DOI":"10.3389\/fninf.2018.00089","article-title":"BindsNET: a machine learning-oriented spiking neural networks library in Python","volume":"12","author":"Hazan","year":"2018","journal-title":"Front. Neuroinform."},{"key":"ncead2373bib57","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.4422025","article-title":"Norse\u2014a deep learning library for spiking neural networks","author":"Pehle","year":"2021"},{"key":"ncead2373bib58","first-page":"pp 8","article-title":"Efficient GPU training of LSNNs using eProp","author":"Knight","year":"2022"},{"key":"ncead2373bib59","author":"Dayan","year":"2005"},{"key":"ncead2373bib60","first-page":"620","article-title":"Recherches quantitatives sur l\u2019excitation electrique des nerfs traitee comme une polarization","volume":"9","author":"Lapique","year":"1907","journal-title":"J. Physiol. Pathol."},{"key":"ncead2373bib61","article-title":"Neural networks for machine learning","author":"Hinton","year":"2012"},{"key":"ncead2373bib62","article-title":"The MNIST database of handwritten digits","author":"LeCun","year":"1998"},{"key":"ncead2373bib63","article-title":"Learning multiple layers of features from tiny images","author":"Krizhevsky","year":"2009"},{"key":"ncead2373bib64","article-title":"Adam: a method for stochastic optimization","author":"Kingma","year":"2014"},{"key":"ncead2373bib65","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41467-021-26022-3","article-title":"Neural heterogeneity promotes robust learning","volume":"12","author":"Perez-Nieves","year":"2021","journal-title":"Nat. Commun."},{"key":"ncead2373bib66","doi-asserted-by":"publisher","first-page":"1015","DOI":"10.1109\/JETCAS.2023.3330432","article-title":"To spike or not to spike: a digital hardware perspective on deep learning acceleration","volume":"13","author":"Ottati","year":"2023","journal-title":"IEEE J. Emerg. Top. Circuits Syst."},{"key":"ncead2373bib67","doi-asserted-by":"publisher","first-page":"10464","DOI":"10.1523\/JNEUROSCI.18-24-10464.1998","article-title":"Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength and postsynaptic cell type","volume":"18","author":"Bi","year":"1998","journal-title":"J. Neurosci."},{"key":"ncead2373bib68","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1001080","article-title":"Spike-based population coding and working memory","volume":"7","author":"Boerlin","year":"2011","journal-title":"PLoS Comput. Biol."},{"key":"ncead2373bib69","first-page":"pp 182","article-title":"What is the other 85 percent of V1 doing","volume":"vol 23","author":"Olshausen","year":"2006"},{"key":"ncead2373bib70","first-page":"pp 1311","article-title":"Direct training for spiking neural networks: faster, larger, better","volume":"vol 33","author":"Wu","year":"2019"},{"key":"ncead2373bib71","doi-asserted-by":"crossref","DOI":"10.5244\/C.30.87","article-title":"Wide residual networks","author":"Zagoruyko","year":"2016"},{"key":"ncead2373bib72","article-title":"Speck: a smart event-based vision sensor with a low latency 327K neuron convolutional neuronal network processing pipeline","author":"Richter","year":"2023"},{"key":"ncead2373bib73","article-title":"Neuromorphic intermediate representation: a unified instruction set for interoperable brain-inspired computing","author":"Pedersen","year":"2023"}],"container-title":["Neuromorphic Computing and Engineering"],"original-title":[],"link":[{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad2373","content-type":"text\/html","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad2373\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad2373","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad2373\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad2373\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad2373\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad2373\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad2373\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,23]],"date-time":"2024-02-23T04:02:29Z","timestamp":1708660949000},"score":1,"resource":{"primary":{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad2373"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,23]]},"references-count":73,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,2,23]]},"published-print":{"date-parts":[[2024,3,1]]}},"URL":"https:\/\/doi.org\/10.1088\/2634-4386\/ad2373","relation":{"has-review":[{"id-type":"doi","id":"10.1088\/2634-4386\/AD2373\/v2\/review1","asserted-by":"object"},{"id-type":"doi","id":"10.1088\/2634-4386\/AD2373\/v1\/review2","asserted-by":"object"},{"id-type":"doi","id":"10.1088\/2634-4386\/AD2373\/v1\/review1","asserted-by":"object"},{"id-type":"doi","id":"10.1088\/2634-4386\/AD2373\/v2\/review2","asserted-by":"object"},{"id-type":"doi","id":"10.1088\/2634-4386\/AD2373\/v1\/decision1","asserted-by":"object"},{"id-type":"doi","id":"10.1088\/2634-4386\/AD2373\/v2\/response1","asserted-by":"object"},{"id-type":"doi","id":"10.1088\/2634-4386\/AD2373\/v3\/decision1","asserted-by":"object"},{"id-type":"doi","id":"10.1088\/2634-4386\/AD2373\/v3\/response1","asserted-by":"object"},{"id-type":"doi","id":"10.1088\/2634-4386\/AD2373\/v2\/decision1","asserted-by":"object"}]},"ISSN":["2634-4386"],"issn-type":[{"value":"2634-4386","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,23]]},"assertion":[{"value":"Exploiting deep learning accelerators for neuromorphic workloads","name":"article_title","label":"Article Title"},{"value":"Neuromorphic Computing and Engineering","name":"journal_title","label":"Journal Title"},{"value":"paper","name":"article_type","label":"Article Type"},{"value":"\u00a9 2024 The Author(s). Published by IOP Publishing Ltd","name":"copyright_information","label":"Copyright Information"},{"value":"2023-09-09","name":"date_received","label":"Date Received","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2024-01-29","name":"date_accepted","label":"Date Accepted","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2024-02-23","name":"date_epub","label":"Online publication date","group":{"name":"publication_dates","label":"Publication dates"}}]}}