{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T15:03:38Z","timestamp":1773414218983,"version":"3.50.1"},"reference-count":31,"publisher":"IOP Publishing","issue":"3","license":[{"start":{"date-parts":[[2024,7,10]],"date-time":"2024-07-10T00:00:00Z","timestamp":1720569600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,7,10]],"date-time":"2024-07-10T00:00:00Z","timestamp":1720569600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/iopscience.iop.org\/info\/page\/text-and-data-mining"}],"funder":[{"name":"EBRAINS-Italy IR00011 PNRR Project","award":["CUP B51E22000150006"],"award-info":[{"award-number":["CUP B51E22000150006"]}]}],"content-domain":{"domain":["iopscience.iop.org"],"crossmark-restriction":false},"short-container-title":["Neuromorph. Comput. Eng."],"published-print":{"date-parts":[[2024,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Reinforcement learning (RL) faces substantial challenges when applied to real-life problems, primarily stemming from the scarcity of available data due to limited interactions with the environment. This limitation is exacerbated by the fact that RL often demands a considerable volume of data for effective learning. The complexity escalates further when implementing RL in recurrent spiking networks, where inherent noise introduced by spikes adds a layer of difficulty. Life-long learning machines must inherently resolve the plasticity-stability paradox. Striking a balance between acquiring new knowledge and maintaining stability is crucial for artificial agents. To address this challenge, we draw inspiration from machine learning technology and introduce a biologically plausible implementation of proximal policy optimization, referred to as lf-cs (learning fast changing slow). Our approach results in two notable advancements: firstly, the capacity to assimilate new information into a new policy without requiring alterations to the current policy; and secondly, the capability to replay experiences without experiencing policy divergence. Furthermore, when contrasted with other experience replay techniques, our method demonstrates the added advantage of being computationally efficient in an online setting. We demonstrate that the proposed methodology enhances the efficiency of learning, showcasing its potential impact on neuromorphic and real-world applications.<\/jats:p>","DOI":"10.1088\/2634-4386\/ad5c96","type":"journal-article","created":{"date-parts":[[2024,6,27]],"date-time":"2024-06-27T22:26:09Z","timestamp":1719527169000},"page":"034002","update-policy":"https:\/\/doi.org\/10.1088\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Learning fast while changing slow in spiking neural networks"],"prefix":"10.1088","volume":"4","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9958-2551","authenticated-orcid":true,"given":"Cristiano","family":"Capone","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4520-5950","authenticated-orcid":false,"given":"Paolo","family":"Muratore","sequence":"additional","affiliation":[]}],"member":"266","published-online":{"date-parts":[[2024,7,10]]},"reference":[{"key":"ncead5c96bib1","doi-asserted-by":"publisher","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ncead5c96bib2","doi-asserted-by":"publisher","first-page":"108","DOI":"10.1016\/j.neunet.2019.08.009","article-title":"Improved robustness of reinforcement learning policies upon conversion to spiking neuronal network platforms applied to atari breakout game","volume":"120","author":"Patel","year":"2019","journal-title":"Neural Netw."},{"key":"ncead5c96bib3","first-page":"pp 2016","article-title":"Deep reinforcement learning with population-coded spiking neural network for continuous control","author":"Tang","year":"2021"},{"key":"ncead5c96bib4","doi-asserted-by":"publisher","DOI":"10.3389\/fnbot.2022.1075647","article-title":"Toward robust and scalable deep spiking reinforcement learning","volume":"16","author":"Akl","year":"2023","journal-title":"Front. Neurorobot."},{"key":"ncead5c96bib5","doi-asserted-by":"publisher","first-page":"899","DOI":"10.1162\/neco_a_01367","article-title":"The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks","volume":"33","author":"Zenke","year":"2021","journal-title":"Neural Comput."},{"key":"ncead5c96bib6","doi-asserted-by":"publisher","first-page":"1468","DOI":"10.1162\/neco.2007.19.6.1468","article-title":"Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity","volume":"19","author":"Florian","year":"2007","journal-title":"Neural Comput."},{"key":"ncead5c96bib7","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1003024","article-title":"Reinforcement learning using a continuous time actor-critic framework with spiking neurons","volume":"9","author":"Fr\u00e9maux","year":"2013","journal-title":"PLoS Comput. Biol."},{"key":"ncead5c96bib8","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41467-020-17236-y","article-title":"A solution to the learning dilemma for recurrent networks of spiking neurons","volume":"11","author":"Bellec","year":"2020","journal-title":"Nat. Commun."},{"key":"ncead5c96bib9","doi-asserted-by":"publisher","first-page":"230","DOI":"10.1038\/s42256-021-00311-4","article-title":"Optimized spiking neurons can classify images with high accuracy through temporal coding with two spikes","volume":"3","author":"St\u00f6ckl","year":"2021","journal-title":"Nat. Mach. Intell."},{"key":"ncead5c96bib10","doi-asserted-by":"publisher","first-page":"38","DOI":"10.3389\/fncom.2014.00038","article-title":"Stochastic variational learning in recurrent spiking networks","volume":"8","author":"Jimenez Rezende","year":"2014","journal-title":"Front. Comput. Neurosci."},{"key":"ncead5c96bib11","doi-asserted-by":"publisher","DOI":"10.7554\/eLife.28295","article-title":"Predicting non-linear dynamics by stable local learning in a recurrent spiking neural network","volume":"6","author":"Gilra","year":"2017","journal-title":"Elife"},{"key":"ncead5c96bib12","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1010221","article-title":"Error-based or target-based? a unified framework for learning in recurrent spiking networks","volume":"18","author":"Capone","year":"2022","journal-title":"PLoS Comput. Biol."},{"key":"ncead5c96bib13","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0247014","article-title":"Target spike patterns enable efficient and biologically plausible learning for complex temporal tasks","volume":"16","author":"Muratore","year":"2021","journal-title":"PLoS One"},{"key":"ncead5c96bib14","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0191527","article-title":"full-force: a target-based method for training recurrent networks","volume":"13","author":"DePasquale","year":"2018","journal-title":"PLoS One"},{"key":"ncead5c96bib15","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0220547","article-title":"Training dynamically balanced excitatory-inhibitory networks","volume":"14","author":"Ingrosso","year":"2019","journal-title":"PLoS One"},{"key":"ncead5c96bib16","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-019-45525-0","article-title":"Sleep-like slow oscillations improve visual classification through synaptic homeostasis and memory association in a thalamo-cortical model","volume":"9","author":"Capone","year":"2019","journal-title":"Sci. Rep."},{"key":"ncead5c96bib17","doi-asserted-by":"publisher","first-page":"6543","DOI":"10.1038\/s41598-023-32410-0","article-title":"Dendrites help mitigate the plasticity-stability dilemma","volume":"13","author":"Wilmes","year":"2023","journal-title":"Sci. Rep."},{"key":"ncead5c96bib18","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.2220743120","article-title":"Beyond spiking networks: The computational advantages of dendritic amplification and input segregation","volume":"120","author":"Capone","year":"2023","journal-title":"Proc. Natl Acad. Sci."},{"key":"ncead5c96bib19","first-page":"pp 1928","article-title":"Asynchronous methods for deep reinforcement learning","volume":"vol 48","author":"Mnih","year":"2016"},{"key":"ncead5c96bib20","first-page":"pp 1889","article-title":"Trust region policy optimization","author":"Schulman","year":"2015"},{"key":"ncead5c96bib21","article-title":"Sample efficient actor-critic with experience replay","author":"Wang","year":"2016"},{"key":"ncead5c96bib22","article-title":"Proximal policy optimization algorithms","author":"Schulman","year":"2017"},{"key":"ncead5c96bib23","first-page":"pp 1928","article-title":"Asynchronous methods for deep reinforcement learning","author":"Mnih","year":"2016"},{"key":"ncead5c96bib24","article-title":"Openai gym","author":"Brockman","year":"2016"},{"key":"ncead5c96bib25","article-title":"Towards biologically plausible dreaming and planning","author":"Capone","year":"2022"},{"key":"ncead5c96bib26","author":"Sutton","year":"2018"},{"key":"ncead5c96bib27","article-title":"Adam: a method for stochastic optimization","author":"Kingma","year":"2014"},{"key":"ncead5c96bib28","article-title":"Evolving connectivity for recurrent spiking neural networks","author":"Wang","year":"2023"},{"key":"ncead5c96bib29","first-page":"pp 1","article-title":"An event-driven recurrent spiking neural network architecture for efficient inference on FPGA","author":"Sankaran","year":"2022"},{"key":"ncead5c96bib30","doi-asserted-by":"publisher","first-page":"496","DOI":"10.1016\/j.neunet.2021.09.010","article-title":"Combining stdp and binary networks for reinforcement learning from images and sparse rewards","volume":"144","author":"Chevtchenko","year":"2021","journal-title":"Neural Netw."},{"key":"ncead5c96bib31","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.2218173120","article-title":"Brain-inspired neural circuit evolution for spiking neural networks","volume":"120","author":"Shen","year":"2023","journal-title":"Proc. Natl Acad. Sci."}],"container-title":["Neuromorphic Computing and Engineering"],"original-title":[],"link":[{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad5c96","content-type":"text\/html","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad5c96\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad5c96","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad5c96\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad5c96\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad5c96\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad5c96\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad5c96\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,10]],"date-time":"2024-07-10T10:27:55Z","timestamp":1720607275000},"score":1,"resource":{"primary":{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad5c96"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,10]]},"references-count":31,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2024,7,10]]},"published-print":{"date-parts":[[2024,9,1]]}},"URL":"https:\/\/doi.org\/10.1088\/2634-4386\/ad5c96","relation":{},"ISSN":["2634-4386"],"issn-type":[{"value":"2634-4386","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,10]]},"assertion":[{"value":"Learning fast while changing slow in spiking neural networks","name":"article_title","label":"Article Title"},{"value":"Neuromorphic Computing and Engineering","name":"journal_title","label":"Journal Title"},{"value":"paper","name":"article_type","label":"Article Type"},{"value":"\u00a9 2024 The Author(s). Published by IOP Publishing Ltd","name":"copyright_information","label":"Copyright Information"},{"value":"2023-12-31","name":"date_received","label":"Date Received","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2024-06-27","name":"date_accepted","label":"Date Accepted","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2024-07-10","name":"date_epub","label":"Online publication date","group":{"name":"publication_dates","label":"Publication dates"}}]}}