{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,8]],"date-time":"2026-05-08T07:14:49Z","timestamp":1778224489482,"version":"3.51.4"},"reference-count":82,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T00:00:00Z","timestamp":1773100800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T00:00:00Z","timestamp":1773100800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100005366","name":"University of Oslo","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100005366","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Biol Cybern"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>While reinforcement learning algorithms have made significant progress in solving multi-armed bandit problems, they often lack biological plausibility in architecture and dynamics. Here, we propose a bio-inspired neural model based on interacting populations of rate neurons, drawing inspiration from the orbitofrontal cortex and anterior cingulate cortex. Our model reports robust performance across various stochastic bandit problems, matching the effectiveness of standard algorithms such as Thompson Sampling and UCB. Notably, the model exhibits adaptive behavior: employing greedy strategies in low-uncertainty situations while increasing exploratory behavior as uncertainty rises. Through evolutionary optimization, the model\u2019s hyperparameters converged to values that align with the principles of synaptic mechanisms, particularly in terms of synapse-dependent neural activity and learning rate adaptation. These findings suggest that biologically-inspired computational architectures can achieve competitive performance while providing insights into neural mechanisms of decision-making under uncertainty.<\/jats:p>","DOI":"10.1007\/s00422-026-01037-5","type":"journal-article","created":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T05:15:23Z","timestamp":1773119723000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["A bio-inspired minimal model for non-stationary K-armed bandits"],"prefix":"10.1007","volume":"120","author":[{"given":"Krubeal","family":"Danieli","sequence":"first","affiliation":[]},{"given":"Mikkel Elle","family":"Lepper\u00f8d","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2026,3,10]]},"reference":[{"key":"1037_CR1","unstructured":"Agrawal S, Goyal N (2012) Analysis of Thompson Sampling for the Multi-armed Bandit Problem. In Proceedings of the 25th Annual Conference on Learning Theory, pages 39.1\u201339.26. JMLR Workshop and Conference Proceedings"},{"issue":"3","key":"1037_CR2","doi-asserted-by":"publisher","first-page":"301","DOI":"10.1086\/208517","volume":"12","author":"CT Allen","year":"1985","unstructured":"Allen CT, Madden TJ (1985) A Closer Look at Classical Conditioning. Journal of Consumer Research 12(3):301\u2013315","journal-title":"Journal of Consumer Research"},{"key":"1037_CR3","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1016\/j.neunet.2021.01.026","volume":"138","author":"A Apicella","year":"2021","unstructured":"Apicella A, Donnarumma F, Isgr\u00f2 F, Prevete R (2021) A survey on modern trainable activation functions. Neural Netw 138:14\u201332","journal-title":"Neural Netw"},{"key":"1037_CR4","first-page":"9","volume":"4","author":"P Ariel","year":"2012","unstructured":"Ariel P, Hoppa MB, Ryan TA (2012) Intrinsic variability in Pv, RRP size, Ca(2+) channel repertoire, and presynaptic potentiation in individual synaptic boutons. Frontiers in Synaptic Neuroscience 4:9","journal-title":"Frontiers in Synaptic Neuroscience"},{"issue":"2","key":"1037_CR5","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1023\/A:1013689704352","volume":"47","author":"P Auer","year":"2002","unstructured":"Auer P, Cesa-Bianchi N, Fischer P (2002) Finite-time Analysis of the Multiarmed Bandit Problem. Mach Learn 47(2):235\u201356","journal-title":"Mach Learn"},{"issue":"3","key":"1037_CR6","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1004164","volume":"11","author":"BB Averbeck","year":"2015","unstructured":"Averbeck BB (2015) Theory of Choice in Bandit, Information Sampling and Foraging Tasks. PLoS Comput Biol 11(3):e1004164","journal-title":"PLoS Comput Biol"},{"issue":"1","key":"1037_CR7","doi-asserted-by":"publisher","first-page":"1891","DOI":"10.1038\/s41467-018-04397-0","volume":"9","author":"BM Babayan","year":"2018","unstructured":"Babayan BM, Uchida N, Gershman SJ (2018) Belief state representation in the dopamine system. Nat Commun 9(1):1891","journal-title":"Nat Commun"},{"issue":"5","key":"1037_CR8","doi-asserted-by":"publisher","first-page":"311","DOI":"10.1038\/nrn.2017.35","volume":"18","author":"DR Bach","year":"2017","unstructured":"Bach DR, Dayan P (2017) Algorithms for survival: A comparative perspective on emotions. Nat Rev Neurosci 18(5):311\u2013319","journal-title":"Nat Rev Neurosci"},{"key":"1037_CR9","unstructured":"Baldassarre G, Parisi D (2000) Classical and instrumental conditioning: From laboratory phenomena to integrated mechanisms for adaptation. Meyer et al, pages 131\u2013139"},{"issue":"9","key":"1037_CR10","doi-asserted-by":"publisher","first-page":"1575","DOI":"10.1038\/s41593-023-01407-3","volume":"26","author":"ZZ Balewski","year":"2023","unstructured":"Balewski ZZ, Elston TW, Knudsen EB, Wallis JD (2023) Value dynamics affect choice preparation during decision-making. Nat Neurosci 26(9):1575\u20131583","journal-title":"Nat Neurosci"},{"key":"1037_CR11","doi-asserted-by":"crossref","unstructured":"Ban Y, He J, Cook CB (2021) Multi-facet Contextual Bandits: A Neural Network Perspective","DOI":"10.1145\/3447548.3467299"},{"key":"1037_CR12","doi-asserted-by":"crossref","unstructured":"Bartol TM, Bromer C, Kinney J, Chirillo MA, Bourne JN, Harris KM, Sejnowski TJ (2015) Hippocampal Spine Head Sizes Are Highly Precise","DOI":"10.1101\/016329"},{"issue":"9","key":"1037_CR13","doi-asserted-by":"publisher","first-page":"1214","DOI":"10.1038\/nn1954","volume":"10","author":"TEJ Behrens","year":"2007","unstructured":"Behrens TEJ, Woolrich MW, Walton ME, Rushworth MFS (2007) Learning the value of information in an uncertain world. Nat Neurosci 10(9):1214\u20131221","journal-title":"Nat Neurosci"},{"key":"1037_CR14","unstructured":"Besbes O, Gur Y, Zeevi A (2014) Stochastic Multi-Armed-Bandit Problem with Non-stationary Rewards. In Advances in Neural Information Processing Systems, volume\u00a027. Curran Associates, Inc.,"},{"key":"1037_CR15","doi-asserted-by":"publisher","first-page":"11","DOI":"10.3389\/fnsyn.2013.00011","volume":"5","author":"AV Blackman","year":"2013","unstructured":"Blackman AV, Abrahamsson T, Costa RP, Lalanne T, Jesper Sj\u00f6str\u00f6m P (2013) Target-cell-specific short-term plasticity in local circuits. Frontiers in Synaptic Neuroscience 5:11","journal-title":"Frontiers in Synaptic Neuroscience"},{"issue":"4","key":"1037_CR16","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pbio.0040092","volume":"4","author":"DA Butts","year":"2006","unstructured":"Butts DA, Goldman MS (2006) Tuning Curves, Neuronal Variability, and Sensory Coding. PLoS Biol 4(4):e92","journal-title":"PLoS Biol"},{"issue":"1","key":"1037_CR17","doi-asserted-by":"publisher","first-page":"29","DOI":"10.1007\/s10827-013-0486-0","volume":"37","author":"S Carroll","year":"2014","unstructured":"Carroll S, Josi\u0107 K, Kilpatrick ZP (2014) Encoding certainty in bump attractors. J Comput Neurosci 37(1):29\u201348","journal-title":"J Comput Neurosci"},{"issue":"3","key":"1037_CR18","doi-asserted-by":"publisher","first-page":"380","DOI":"10.3390\/e23030380","volume":"23","author":"E Cavenaghi","year":"2021","unstructured":"Cavenaghi E, Sottocornola G, Stella F, Zanker M (2021) Non Stationary Multi-Armed Bandit: Empirical Evaluation of a New Concept Drift-Aware Algorithm. Entropy 23(3):380","journal-title":"Entropy"},{"issue":"6","key":"1037_CR19","doi-asserted-by":"publisher","first-page":"693","DOI":"10.1038\/nn.2123","volume":"11","author":"AK Churchland","year":"2008","unstructured":"Churchland AK, Kiani R, Shadlen MN (2008) Decision-making with multiple alternatives. Nat Neurosci 11(6):693\u2013702","journal-title":"Nat Neurosci"},{"issue":"6","key":"1037_CR20","doi-asserted-by":"publisher","first-page":"927","DOI":"10.1016\/j.conb.2012.05.007","volume":"22","author":"P Cisek","year":"2012","unstructured":"Cisek P (2012) Making decisions through a distributed consensus. Curr Opin Neurobiol 22(6):927\u2013936","journal-title":"Curr Opin Neurobiol"},{"issue":"1","key":"1037_CR21","doi-asserted-by":"publisher","first-page":"269","DOI":"10.1146\/annurev.neuro.051508.135409","volume":"33","author":"P Cisek","year":"2010","unstructured":"Cisek P, Kalaska JF (2010) Neural Mechanisms for Interacting with a World Full of Action Choices. Annu Rev Neurosci 33(1):269\u2013298","journal-title":"Annu Rev Neurosci"},{"issue":"1","key":"1037_CR22","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1038\/sj.npp.1301559","volume":"33","author":"A Citri","year":"2008","unstructured":"Citri A, Malenka RC (2008) Synaptic Plasticity: Multiple Forms, Functions, and Mechanisms. Neuropsychopharmacology 33(1):18\u201341","journal-title":"Neuropsychopharmacology"},{"issue":"4","key":"1037_CR23","doi-asserted-by":"publisher","first-page":"429","DOI":"10.3758\/CABN.8.4.429","volume":"8","author":"P Dayan","year":"2008","unstructured":"Dayan P, Daw ND (2008) Decision theory, reinforcement learning, and the brain. Cognitive Affective & Behavioral Neuroscience 8(4):429\u2013453","journal-title":"Cognitive Affective & Behavioral Neuroscience"},{"key":"1037_CR24","doi-asserted-by":"publisher","first-page":"e03005","DOI":"10.7554\/eLife.03005","volume":"3","author":"J Drugowitsch","year":"2014","unstructured":"Drugowitsch J, DeAngelis GC, Klier EM, Angelaki DE, Pouget A (2014) Optimal multisensory decision-making in a reaction-time task. Elife 3:e03005","journal-title":"Elife"},{"key":"1037_CR25","doi-asserted-by":"crossref","unstructured":"Esnaola-Acebes JM, Roxin A, Wimmer K (2021) Bump attractor dynamics underlying stimulus integration in perceptual estimation tasks. BioRxiv","DOI":"10.1101\/2021.03.15.434192"},{"key":"1037_CR26","doi-asserted-by":"crossref","unstructured":"Faisal AA (2012) Noise in Neurons and Other Constraints. In N.\u00a0Le\u00a0Nov\u00e8re, editor, Computational Systems Neurobiology, pages 227\u2013257. Springer Netherlands, Dordrecht","DOI":"10.1007\/978-94-007-3858-4_8"},{"issue":"2","key":"1037_CR27","doi-asserted-by":"publisher","first-page":"401","DOI":"10.1016\/j.neuron.2017.03.044","volume":"94","author":"S Farashahi","year":"2017","unstructured":"Farashahi S, Donahue CH, Khorsand P, Seo H, Lee D, Soltani A (2017) Metaplasticity as a Neural Substrate for Adaptive Learning and Choice under Uncertainty. Neuron 94(2):401-414.e6","journal-title":"Neuron"},{"issue":"2","key":"1037_CR28","doi-asserted-by":"publisher","first-page":"300","DOI":"10.1037\/0033-295X.113.2.300","volume":"113","author":"MJ Frank","year":"2006","unstructured":"Frank MJ, Claus ED (2006) Anatomy of a decision: Striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal. Psychol Rev 113(2):300\u2013326","journal-title":"Psychol Rev"},{"key":"1037_CR29","doi-asserted-by":"publisher","first-page":"431","DOI":"10.3389\/fnins.2017.00431","volume":"11","author":"S Funahashi","year":"2017","unstructured":"Funahashi S (2017) Prefrontal Contribution to Decision-Making under Free-Choice Conditions. Front Neurosci 11:431","journal-title":"Front Neurosci"},{"issue":"4","key":"1037_CR30","doi-asserted-by":"publisher","first-page":"599","DOI":"10.1016\/j.neuron.2005.02.001","volume":"45","author":"S Fusi","year":"2005","unstructured":"Fusi S, Drew PJ, Abbott LF (2005) Cascade Models of Synaptically Stored Memories. Neuron 45(4):599\u2013611","journal-title":"Neuron"},{"key":"1037_CR31","unstructured":"Garivier A, Moulines E (2008) On Upper-Confidence Bound Policies for Non-Stationary Bandit Problems"},{"issue":"2","key":"1037_CR32","doi-asserted-by":"publisher","first-page":"148","DOI":"10.1111\/j.2517-6161.1979.tb01068.x","volume":"41","author":"JC Gittins","year":"1979","unstructured":"Gittins JC (1979) Bandit Processes and Dynamic Allocation Indices. J Roy Stat Soc Series B (Methodological) 41(2):148\u2013177","journal-title":"J Roy Stat Soc Series B (Methodological)"},{"issue":"2","key":"1037_CR33","doi-asserted-by":"publisher","first-page":"245","DOI":"10.1016\/j.neuron.2017.06.011","volume":"95","author":"D Hassabis","year":"2017","unstructured":"Hassabis D, Kumaran D, Summerfield C, Botvinick M (2017) Neuroscience-Inspired Artificial Intelligence. Neuron 95(2):245\u2013258","journal-title":"Neuron"},{"issue":"1","key":"1037_CR34","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1162\/evco.2007.15.1.1","volume":"15","author":"C Igel","year":"2007","unstructured":"Igel C, Hansen N, Roth S (2007) Covariance Matrix Adaptation for Multi-objective Optimization. Evol Comput 15(1):1\u201328","journal-title":"Evol Comput"},{"key":"1037_CR35","doi-asserted-by":"publisher","first-page":"e18073","DOI":"10.7554\/eLife.18073","volume":"5","author":"K Iigaya","year":"2016","unstructured":"Iigaya K (2016) Adaptive learning and decision-making under uncertainty by metaplastic synapses guided by a surprise detection system. Elife 5:e18073","journal-title":"Elife"},{"key":"1037_CR36","doi-asserted-by":"publisher","first-page":"e13747","DOI":"10.7554\/eLife.13747","volume":"5","author":"K Iigaya","year":"2016","unstructured":"Iigaya K, Story GW, Kurth-Nelson Z, Dolan RJ, Dayan P (2016) The modulation of savouring by prediction error and its effects on choice. Elife 5:e13747","journal-title":"Elife"},{"issue":"1","key":"1037_CR37","doi-asserted-by":"publisher","first-page":"34","DOI":"10.1007\/s42113-020-00083-x","volume":"4","author":"JB Inglis","year":"2021","unstructured":"Inglis JB, Valentin VV, Gregory Ashby F (2021) Modulation of Dopamine for Adaptive Learning: A Neurocomputational Model. Computational brain & behavior 4(1):34\u201352","journal-title":"Computational brain & behavior"},{"key":"1037_CR38","volume-title":"and R\u00e9mi Munos","author":"E Kaufmann","year":"2012","unstructured":"Kaufmann E, Korda N (2012) and R\u00e9mi Munos. An Asymptotically Optimal Finite Time Analysis, Thompson Sampling"},{"issue":"2","key":"1037_CR39","doi-asserted-by":"publisher","DOI":"10.1101\/cshperspect.a016824","volume":"8","author":"MB Kennedy","year":"2016","unstructured":"Kennedy MB (2016) Synaptic Signaling in Learning and Memory. Cold Spring Harb Perspect Biol 8(2):a016824","journal-title":"Cold Spring Harb Perspect Biol"},{"issue":"3","key":"1037_CR40","doi-asserted-by":"publisher","first-page":"297","DOI":"10.1037\/a0023575","volume":"125","author":"SW Kennerley","year":"2011","unstructured":"Kennerley SW, Walton ME (2011) Decision Making and Reward in Frontal Cortex. Behav Neurosci 125(3):297\u2013317","journal-title":"Behav Neurosci"},{"key":"1037_CR41","doi-asserted-by":"crossref","unstructured":"Khamassi M, Enel P, Dominey PF, Procyk E (2013) Chapter 22 - Medial prefrontal cortex and the adaptive regulation of reinforcement learning parameters. In V.\u00a0S.\u00a0Chandrasekhar Pammi and Narayanan Srinivasan, editors, Progress in Brain Research, volume 202 of Decision Making, pages 441\u2013464. Elsevier","DOI":"10.1016\/B978-0-444-62604-2.00022-8"},{"issue":"6","key":"1037_CR42","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1005630","volume":"13","author":"P Khorsand","year":"2017","unstructured":"Khorsand P, Soltani A (2017) Optimal structure of metaplasticity for adaptive learning. PLoS Comput Biol 13(6):e1005630","journal-title":"PLoS Comput Biol"},{"issue":"7","key":"1037_CR43","doi-asserted-by":"publisher","first-page":"3202","DOI":"10.1523\/JNEUROSCI.2532-12.2013","volume":"33","author":"MC Klein-Fl\u00fcgge","year":"2013","unstructured":"Klein-Fl\u00fcgge MC, Barron HC, Brodersen KH, Dolan RJ, Behrens TEJ (2013) Segregated Encoding of Reward-Identity and Stimulus-Reward Associations in Human Orbitofrontal Cortex. J Neurosci 33(7):3202\u20133211","journal-title":"J Neurosci"},{"issue":"10","key":"1037_CR44","doi-asserted-by":"publisher","first-page":"1280","DOI":"10.1038\/nn.4382","volume":"19","author":"N Kolling","year":"2016","unstructured":"Kolling N, Wittmann MK, Behrens TEJ, Boorman ED, Mars RB, Rushworth MFS (2016) Value, search, persistence and model updating in anterior cingulate cortex. Nat Neurosci 19(10):1280\u20131285","journal-title":"Nat Neurosci"},{"key":"1037_CR45","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1016\/j.conb.2015.08.001","volume":"35","author":"RS Larsen","year":"2015","unstructured":"Larsen RS, Jesper Sj\u00f6str\u00f6m P (2015) Synapse-type-specific plasticity in local circuits. Curr Opin Neurobiol 35:127\u2013135","journal-title":"Curr Opin Neurobiol"},{"key":"1037_CR46","doi-asserted-by":"publisher","first-page":"1062678","DOI":"10.3389\/fncom.2022.1062678","volume":"16","author":"J Lee","year":"2022","unstructured":"Lee J, Jo J, Lee B, Lee J-H, Yoon S (2022) Brain-inspired Predictive Coding Improves the Performance of Machine Challenging Tasks. Front Comput Neurosci 16:1062678","journal-title":"Front Comput Neurosci"},{"key":"1037_CR47","doi-asserted-by":"publisher","first-page":"230","DOI":"10.54097\/0v0q9842","volume":"94","author":"J Liu","year":"2024","unstructured":"Liu J (2024) Comprehensive Exploration and Implementation of Multi-Armed Bandit Algorithms Across Various Domains. Highlights in Science Engineering and Technology 94:230\u2013235","journal-title":"Highlights in Science Engineering and Technology"},{"key":"1037_CR48","volume-title":"and Max Tegmark","author":"Z Liu","year":"2023","unstructured":"Liu Z, Gan E (2023) and Max Tegmark. Brain-Inspired Modular Training for Mechanistic Interpretability, Seeing is Believing"},{"issue":"5","key":"1037_CR49","doi-asserted-by":"publisher","first-page":"1864","DOI":"10.1523\/JNEUROSCI.4920-12.2013","volume":"33","author":"C-H Luk","year":"2013","unstructured":"Luk C-H, Wallis JD (2013) Choice Coding in Frontal Cortex during Stimulus-Guided or Action-Guided Decision-Making. J Neurosci 33(5):1864\u20131871","journal-title":"J Neurosci"},{"key":"1037_CR50","doi-asserted-by":"publisher","first-page":"75","DOI":"10.3389\/fncir.2016.00075","volume":"10","author":"E Marcos","year":"2016","unstructured":"Marcos E, Genovesio A (2016) Determining Monkey Free Choice Long before the Choice Is Made: The Principal Role of Prefrontal Neurons Involved in Both Decision and Motor Processes. Frontiers in Neural Circuits 10:75","journal-title":"Frontiers in Neural Circuits"},{"issue":"1","key":"1037_CR51","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1146\/annurev.psych.57.102904.190143","volume":"58","author":"A Martin","year":"2007","unstructured":"Martin A (2007) The Representation of Object Concepts in the Brain. Annu Rev Psychol 58(1):25\u201345","journal-title":"Annu Rev Psychol"},{"issue":"7","key":"1037_CR52","doi-asserted-by":"publisher","first-page":"991","DOI":"10.1111\/j.1460-9568.2011.07982.x","volume":"35","author":"MA McDannald","year":"2012","unstructured":"McDannald MA, Takahashi YK, Lopatina N, Pietras BW, Jones JL, Schoenbaum G (2012) Model-based learning and the contribution of the orbitofrontal cortex to the model-free world. Eur J Neurosci 35(7):991\u2013996","journal-title":"Eur J Neurosci"},{"issue":"1","key":"1037_CR53","doi-asserted-by":"publisher","first-page":"47","DOI":"10.1007\/s00422-018-0768-8","volume":"113","author":"P Miller","year":"2019","unstructured":"Miller P, Cannon J (2019) Combined mechanisms of neural firing rate homeostasis. Biol Cybern 113(1):47\u201359","journal-title":"Biol Cybern"},{"key":"1037_CR54","doi-asserted-by":"crossref","unstructured":"Niv Y, Joel D, Meilijson I, Ruppin E (2002) Evolution of Reinforcement Learning in Uncertain Environments: A Simple Explanation for Complex Foraging Behaviors. International Society for Adaptive Behavior","DOI":"10.1177\/10597123020101001"},{"issue":"1","key":"1037_CR55","doi-asserted-by":"publisher","first-page":"6","DOI":"10.1186\/1744-9081-1-6","volume":"1","author":"Y Niv","year":"2005","unstructured":"Niv Y, Duff MO, Dayan P (2005) Dopamine, uncertainty and TD learning. Behav Brain Funct 1(1):6","journal-title":"Behav Brain Funct"},{"key":"1037_CR56","doi-asserted-by":"publisher","first-page":"60738","DOI":"10.1109\/ACCESS.2022.3179968","volume":"10","author":"JD Nunes","year":"2022","unstructured":"Nunes JD, Carvalho M, Carneiro D, Cardoso JS (2022) Spiking Neural Networks: A Survey. IEEE Access 10:60738\u201360764","journal-title":"IEEE Access"},{"issue":"8","key":"1037_CR57","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1008080","volume":"16","author":"GK Ocker","year":"2020","unstructured":"Ocker GK, Buice MA (2020) Flexible neural connectivity under constraints on total connection strength. PLoS Comput Biol 16(8):e1008080","journal-title":"PLoS Comput Biol"},{"key":"1037_CR58","unstructured":"Qi H, Guo F, Zhu L (2023) Forced Exploration in Bandit Problems"},{"issue":"5","key":"1037_CR59","doi-asserted-by":"publisher","first-page":"758","DOI":"10.1016\/j.neuron.2013.05.030","volume":"78","author":"S Ratt\u00e9","year":"2013","unstructured":"Ratt\u00e9 S, Hong S, De Schutter E, Prescott SA (2013) Impact of Neuronal Properties on Network Coding: Roles of Spike Initiation Dynamics and Robust Synchrony Transfer. Neuron 78(5):758\u2013772","journal-title":"Neuron"},{"issue":"7","key":"1037_CR60","doi-asserted-by":"publisher","first-page":"973","DOI":"10.1038\/nn.4320","volume":"19","author":"EL Rich","year":"2016","unstructured":"Rich EL, Wallis JD (2016) Decoding subjective decisions from orbitofrontal cortex. Nat Neurosci 19(7):973\u2013980","journal-title":"Nat Neurosci"},{"issue":"5","key":"1037_CR61","doi-asserted-by":"publisher","first-page":"1201","DOI":"10.1007\/s00429-023-02644-9","volume":"228","author":"ET Rolls","year":"2023","unstructured":"Rolls ET (2023) Emotion, motivation, decision-making, the orbitofrontal cortex, anterior cingulate cortex, and the amygdala. Brain Structure & Function 228(5):1201\u20131257","journal-title":"Brain Structure & Function"},{"issue":"3","key":"1037_CR62","doi-asserted-by":"publisher","first-page":"266","DOI":"10.1176\/appi.neuropsych.11060139","volume":"24","author":"MH Rosenbloom","year":"2012","unstructured":"Rosenbloom MH, Schmahmann JD, Price BH (2012) The Functional Neuroanatomy of Decision-Making. J Neuropsychiatry Clin Neurosci 24(3):266\u2013277","journal-title":"J Neuropsychiatry Clin Neurosci"},{"issue":"19","key":"1037_CR63","doi-asserted-by":"publisher","first-page":"3076","DOI":"10.1016\/j.neuron.2022.07.024","volume":"110","author":"S Safavi","year":"2022","unstructured":"Safavi S, Dayan P (2022) Multistability, perceptual value, and internal foraging. Neuron 110(19):3076\u20133090","journal-title":"Neuron"},{"issue":"5","key":"1037_CR64","doi-asserted-by":"publisher","first-page":"781","DOI":"10.1162\/neco_a_01659","volume":"36","author":"M Samavat","year":"2024","unstructured":"Samavat M, Bartol TM, Harris KM, Sejnowski TJ (2024) Synaptic Information Storage Capacity Measured With Information Theory. Neural Comput 36(5):781\u2013802","journal-title":"Neural Comput"},{"issue":"2","key":"1037_CR65","doi-asserted-by":"publisher","DOI":"10.1063\/5.0186054","volume":"2","author":"S Schmidgall","year":"2024","unstructured":"Schmidgall S, Ziaei R, Achterberg J, Louis Kirsch S, Hajiseyedrazi P, Eshraghian J (2024) Brain-inspired learning in artificial neural networks: A review. APL Machine Learning 2(2):021501","journal-title":"APL Machine Learning"},{"issue":"1","key":"1037_CR66","doi-asserted-by":"publisher","first-page":"23","DOI":"10.31887\/DCNS.2016.18.1\/wschultz","volume":"18","author":"W Schultz","year":"2016","unstructured":"Schultz W (2016) Dopamine reward prediction error coding. Dialogues Clin Neurosci 18(1):23\u201332","journal-title":"Dialogues Clin Neurosci"},{"key":"1037_CR67","doi-asserted-by":"publisher","DOI":"10.1016\/j.cogpsych.2019.101261","volume":"119","author":"E Schulz","year":"2020","unstructured":"Schulz E, Franklin NT, Gershman SJ (2020) Finding structure in multi-armed bandits. Cogn Psychol 119:101261","journal-title":"Cogn Psychol"},{"key":"1037_CR68","doi-asserted-by":"publisher","first-page":"126","DOI":"10.3389\/fnint.2010.00126","volume":"4","author":"T Singh","year":"2010","unstructured":"Singh T, McDannald MA, Haney RZ, Cerri DH, Schoenbaum G (2010) Nucleus Accumbens Core and Shell are Necessary for Reinforcer Devaluation Effects on Pavlovian Conditioned Responding. Front Integr Neurosci 4:126","journal-title":"Front Integr Neurosci"},{"key":"1037_CR69","doi-asserted-by":"publisher","first-page":"92","DOI":"10.1016\/j.bandc.2015.11.003","volume":"112","author":"MW Spratling","year":"2017","unstructured":"Spratling MW (2017) A review of predictive coding algorithms. Brain Cogn 112:92\u201397","journal-title":"Brain Cogn"},{"issue":"3","key":"1037_CR70","doi-asserted-by":"publisher","first-page":"616","DOI":"10.1016\/j.neuron.2018.03.036","volume":"98","author":"CK Starkweather","year":"2018","unstructured":"Starkweather CK, Gershman SJ, Uchida N (2018) The Medial Prefrontal Cortex Shapes Dopamine Reward Prediction Errors under State Uncertainty. Neuron 98(3):616-629.e6","journal-title":"Neuron"},{"issue":"3","key":"1037_CR71","doi-asserted-by":"publisher","first-page":"168","DOI":"10.1016\/j.jmp.2008.11.002","volume":"53","author":"M Steyvers","year":"2009","unstructured":"Steyvers M, Lee MD, Wagenmakers E-J (2009) A Bayesian analysis of human decision-making on bandit problems. J Math Psychol 53(3):168\u2013179","journal-title":"J Math Psychol"},{"issue":"4\u20136","key":"1037_CR72","doi-asserted-by":"publisher","first-page":"523","DOI":"10.1016\/S0893-6080(02)00046-1","volume":"15","author":"RE Suri","year":"2002","unstructured":"Suri RE (2002) TD models of reward predictive responses in dopamine neurons. Neural Netw 15(4\u20136):523\u2013533","journal-title":"Neural Netw"},{"key":"1037_CR73","unstructured":"Sutton RS, Barto AG (1998) The Reinforcement Learning Problem. In Reinforcement Learning: An Introduction, pages 51\u201385. MIT Press"},{"issue":"2","key":"1037_CR74","doi-asserted-by":"publisher","first-page":"591","DOI":"10.1016\/j.neuron.2015.03.019","volume":"86","author":"S Suzuki","year":"2015","unstructured":"Suzuki S, Adachi R, Dunne S, Bossaerts P, O\u2019Doherty JP (2015) Neural mechanisms underlying human consensus decision-making. Neuron 86(2):591\u2013602","journal-title":"Neuron"},{"key":"1037_CR75","series-title":"volume 6359","first-page":"203","volume-title":"KI 2010: Advances in Artificial Intelligence","author":"M Tokic","year":"2010","unstructured":"Tokic M (2010) Adaptive $$\\varepsilon $$-Greedy Exploration in Reinforcement Learning Based on Value Differences. In: Dillmann R, Beyerer J, Hanebeck UD, Schultz T (eds) KI 2010: Advances in Artificial Intelligence. volume 6359. Springer, Berlin Heidelberg, Berlin, Heidelberg, pp 203\u2013210"},{"key":"1037_CR76","series-title":"volume 7006","first-page":"335","volume-title":"KI 2011: Advances in Artificial Intelligence","author":"M Tokic","year":"2011","unstructured":"Tokic M, Palm G (2011) Value-Difference Based Exploration: Adaptive Control between Epsilon-Greedy and Softmax. In: Bach J, Edelkamp S (eds) KI 2011: Advances in Artificial Intelligence. volume 7006. Springer, Berlin Heidelberg, Berlin, Heidelberg, pp 335\u2013346"},{"issue":"1","key":"1037_CR77","doi-asserted-by":"publisher","first-page":"2371","DOI":"10.1038\/s41467-020-15766-z","volume":"11","author":"MS Tomov","year":"2020","unstructured":"Tomov MS, Truong VQ, Hundia RA, Gershman SJ (2020) Dissociable neural correlates of uncertainty underlie different exploration strategies. Nat Commun 11(1):2371","journal-title":"Nat Commun"},{"issue":"Suppl 2","key":"1037_CR78","doi-asserted-by":"publisher","first-page":"T142","DOI":"10.1016\/j.neuroimage.2007.03.029","volume":"36","author":"ME Walton","year":"2007","unstructured":"Walton ME, Croxson PL, Behrens TEJ, Kennerley SW, Rushworth MFS (2007) Adaptive Decision Making and Value in the Anterior Cingulate Cortex. Neuroimage 36(Suppl 2):T142\u2013T154","journal-title":"Neuroimage"},{"issue":"10","key":"1037_CR79","doi-asserted-by":"publisher","first-page":"2039","DOI":"10.1109\/JAS.2024.124806","volume":"11","author":"XH Wen","year":"2024","unstructured":"Wen XH, Zhou MC (2024) Evolution and Role of Optimizers in Training Deep Learning Models. IEEE\/CAA Journal of Automatica Sinica 11(10):2039\u20132042","journal-title":"IEEE\/CAA Journal of Automatica Sinica"},{"issue":"2","key":"1037_CR80","doi-asserted-by":"publisher","first-page":"86","DOI":"10.1111\/j.1467-8721.2008.00554.x","volume":"17","author":"S Yantis","year":"2008","unstructured":"Yantis S (2008) The Neural Basis of Selective Attention. Curr Dir Psychol Sci 17(2):86\u201390","journal-title":"Curr Dir Psychol Sci"},{"issue":"8","key":"1037_CR81","doi-asserted-by":"publisher","first-page":"2108","DOI":"10.1162\/NECO_a_00473","volume":"25","author":"H You","year":"2013","unstructured":"You H, Wang D-H (2013) Dynamics of Multiple-Choice Decision Making. Neural Comput 25(8):2108\u20132145","journal-title":"Neural Comput"},{"key":"1037_CR82","unstructured":"Zhang S, Yu AJ (2013) Forgetful Bayes and myopic planning: Human learning and decision-making in a bandit setting. In Advances in Neural Information Processing Systems, volume\u00a026. Curran Associates, Inc.,"}],"container-title":["Biological Cybernetics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00422-026-01037-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00422-026-01037-5","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00422-026-01037-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,5,8]],"date-time":"2026-05-08T07:03:47Z","timestamp":1778223827000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00422-026-01037-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,10]]},"references-count":82,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2026,4]]}},"alternative-id":["1037"],"URL":"https:\/\/doi.org\/10.1007\/s00422-026-01037-5","relation":{},"ISSN":["1432-0770"],"issn-type":[{"value":"1432-0770","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,10]]},"assertion":[{"value":"15 July 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 January 2026","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 March 2026","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no Conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"8"}}