{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T22:46:41Z","timestamp":1777675601667,"version":"3.51.4"},"reference-count":43,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2020,4,3]],"date-time":"2020-04-03T00:00:00Z","timestamp":1585872000000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"Swiss ETH board"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2020,7]]},"abstract":"<jats:p>Big science initiatives are trying to reconstruct and model the brain by attempting to simulate brain tissue at larger scales and with increasingly more biological detail than previously thought possible. The exponential growth of parallel computer performance has been supporting these developments, and at the same time maintainers of neuroscientific simulation code have strived to optimally and efficiently exploit new hardware features. Current state-of-the-art software for the simulation of biological networks has so far been developed using performance engineering practices, but a thorough analysis and modeling of the computational and performance characteristics, especially in the case of morphologically detailed neuron simulations, is lacking. Other computational sciences have successfully used analytic performance engineering, which is based on \u201cwhite-box,\u201d that is, first-principles performance models, to gain insight on the computational properties of simulation kernels, aid developers in performance optimizations and eventually drive codesign efforts, but to our knowledge a model-based performance analysis of neuron simulations has not yet been conducted. We present a detailed study of the shared-memory performance of morphologically detailed neuron simulations based on the Execution-Cache-Memory performance model. We demonstrate that this model can deliver accurate predictions of the runtime of almost all the kernels that constitute the neuron models under investigation. The gained insight is used to identify the main governing mechanisms underlying performance bottlenecks in the simulation. The implications of this analysis on the optimization of neural simulation software and eventually codesign of future hardware architectures are discussed. In this sense, our work represents a valuable conceptual and quantitative contribution to understanding the performance properties of biological networks simulations.<\/jats:p>","DOI":"10.1177\/1094342020912528","type":"journal-article","created":{"date-parts":[[2020,4,3]],"date-time":"2020-04-03T07:09:25Z","timestamp":1585897765000},"page":"428-449","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":7,"title":["Analytic performance modeling and analysis of detailed neuron simulations"],"prefix":"10.1177","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1027-485X","authenticated-orcid":false,"given":"Francesco","family":"Cremonesi","sequence":"first","affiliation":[{"name":"Blue Brain Project, Brain Mind Institute, \u00c9cole polytechnique f\u00e9d\u00e9rale de Lausanne (EPFL), Campus Biotech, Geneva, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Georg","family":"Hager","sequence":"additional","affiliation":[{"name":"Erlangen Regional Computing Center, Friedrich-Alexander Universit\u00e4t Erlangen-N\u00fcrnberg, Erlangen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gerhard","family":"Wellein","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Friedrich-Alexander Universit\u00e4t Erlangen-N\u00fcrnberg, Erlangen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Felix","family":"Sch\u00fcrmann","sequence":"additional","affiliation":[{"name":"Blue Brain Project, Brain Mind Institute, \u00c9cole polytechnique f\u00e9d\u00e9rale de Lausanne (EPFL), Campus Biotech, Geneva, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2020,4,3]]},"reference":[{"key":"bibr1-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1145\/1362622.1362627"},{"key":"bibr2-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1007\/978-94-007-3858-4_7"},{"key":"bibr3-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1007\/s10827-007-0038-6"},{"key":"bibr4-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1145\/2503210.2503277"},{"key":"bibr5-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511541612"},{"key":"bibr6-1094342020912528","unstructured":"Cremonesi F, et al. (2019) Reproducibility appendix for paper on modeling Blue Brain Project kernels with ECM. Available at: https:\/\/github.com\/RRZE-HPC\/BBP-ECM-RA\/releases\/tag\/2019-01-16."},{"key":"bibr7-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-55224-3_74"},{"key":"bibr8-1094342020912528","volume-title":"Instruction Tables: Lists of Instruction Latencies, Throughputs and Micro-Operation Breakdowns for Intel, AMD and VIA CPUs","author":"Fog A","year":"2017"},{"key":"bibr9-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9781107447615"},{"key":"bibr10-1094342020912528","unstructured":"Gruber T, Eitzinger J (2018) LIKWID: a multicore performance tool suite. Available at: http:\/\/tiny.cc\/LIKWID (accessed 3 September 2020)."},{"key":"bibr11-1094342020912528","unstructured":"Hager G, Eitzinger J, Hornich J, et al. (2018) Applying the execution-cache-memory model: current state of practice. Available at: https:\/\/sc18.supercomputing.org\/proceedings\/tech_poster\/tech_poster_pages\/post152.html (accessed 3 September 2020)."},{"key":"bibr12-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.3180"},{"key":"bibr13-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-56702-0_1"},{"key":"bibr14-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1016\/0020-7101(84)90008-4"},{"key":"bibr15-1094342020912528","doi-asserted-by":"publisher","DOI":"10.3389\/fncom.2011.00049"},{"key":"bibr16-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1162\/089976600300015475"},{"key":"bibr17-1094342020912528","unstructured":"Hofmann J, Alappat CL, Hager G, et al. (2019) Bridging the architecture gap: abstracting performance-relevant properties of modern server processors. Corr abs\/1907.00048. Available at: http:\/\/arxiv.org\/abs\/1907.00048 (accessed 3 September 2020)."},{"key":"bibr18-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-92040-5_2"},{"key":"bibr19-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-58667-0_16"},{"key":"bibr20-1094342020912528","unstructured":"Intel (2017) Intel Architecture Code Analyzer. Available at: https:\/\/software.intel.com\/en-us\/articles\/intel-architecture-code-analyzer (accessed 3 September 2020)."},{"key":"bibr21-1094342020912528","unstructured":"Intel (2018) Intel 64 and IA-32 Architectures Optimization Reference Manual. Available at: http:\/\/www.intel.com\/content\/dam\/www\/public\/us\/en\/documents\/manuals\/64-ia-32-architectures-optimization-manual.pdf (accessed 3 September 2020)."},{"key":"bibr22-1094342020912528","doi-asserted-by":"publisher","DOI":"10.3389\/fninf.2017.00030"},{"key":"bibr23-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0712231105"},{"key":"bibr24-1094342020912528","doi-asserted-by":"publisher","DOI":"10.3389\/fninf.2018.00002"},{"key":"bibr25-1094342020912528","doi-asserted-by":"publisher","DOI":"10.3389\/fninf.2011.00015"},{"key":"bibr26-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-41321-1_19"},{"key":"bibr27-1094342020912528","doi-asserted-by":"crossref","unstructured":"Laukemann J, Hammer J, Hager G, et al. (2019) Automatic throughput and critical path analysis of x86 and arm assembly kernels. DOI:10.1109\/PMBS49563.2019.00006. To be published.","DOI":"10.1109\/PMBS49563.2019.00006"},{"key":"bibr28-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1109\/PMBS.2018.8641578"},{"key":"bibr29-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1016\/j.cell.2015.09.029"},{"key":"bibr30-1094342020912528","first-page":"19","volume":"2","author":"McCalpin JD","year":"1995","journal-title":"IEEE Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter"},{"key":"bibr31-1094342020912528","unstructured":"NEC (2018) NEC SX-Aurora TSUBASA\u2014Vector Engine. Available at: https:\/\/www.nec.com\/en\/global\/solutions\/hpc\/sx\/vector_engine.html (accessed 3 September 2020)."},{"key":"bibr32-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1016\/j.amc.2005.02.028"},{"key":"bibr33-1094342020912528","volume-title":"Society for Neuroscience Annual Meeting","author":"Peyser A","year":"2015"},{"key":"bibr34-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1093\/cercor\/bhs358"},{"key":"bibr35-1094342020912528","doi-asserted-by":"publisher","DOI":"10.3389\/fncir.2015.00044"},{"key":"bibr36-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1005179"},{"key":"bibr37-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1145\/2751205.2751240"},{"key":"bibr38-1094342020912528","unstructured":"Thomas LH (1949) Elliptic problems in linear difference equations over a network. New York: Watson Sci. Comput. Lab. Rept., Columbia University, p. 1."},{"key":"bibr39-1094342020912528","doi-asserted-by":"publisher","DOI":"10.3389\/fninf.2017.00046"},{"key":"bibr40-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-14390-8_64"},{"key":"bibr41-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1109\/ICPPW.2010.38"},{"key":"bibr42-1094342020912528","doi-asserted-by":"publisher","DOI":"10.1145\/1498765.1498785"},{"key":"bibr43-1094342020912528","doi-asserted-by":"publisher","DOI":"10.3389\/fninf.2014.00076"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342020912528","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342020912528","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342020912528","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T08:15:58Z","timestamp":1777450558000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342020912528"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,4,3]]},"references-count":43,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,7]]}},"alternative-id":["10.1177\/1094342020912528"],"URL":"https:\/\/doi.org\/10.1177\/1094342020912528","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"value":"1094-3420","type":"print"},{"value":"1741-2846","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,4,3]]}}}