{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T20:46:15Z","timestamp":1770756375376,"version":"3.50.0"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1013879","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2026,2,9]],"date-time":"2026-02-09T00:00:00Z","timestamp":1770595200000}}],"reference-count":49,"publisher":"Public Library of Science (PLoS)","issue":"2","license":[{"start":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T00:00:00Z","timestamp":1769990400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100008510","name":"University of Maryland","doi-asserted-by":"publisher","award":["Start-up funds"],"award-info":[{"award-number":["Start-up funds"]}],"id":[{"id":"10.13039\/100008510","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100008510","name":"University of Maryland","doi-asserted-by":"publisher","award":["Start-up funds"],"award-info":[{"award-number":["Start-up funds"]}],"id":[{"id":"10.13039\/100008510","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000025","name":"National Institute of Mental Health","doi-asserted-by":"publisher","award":["R00MH123669"],"award-info":[{"award-number":["R00MH123669"]}],"id":[{"id":"10.13039\/100000025","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>When receiving a reward after a sequence of multiple events, how do we determine which event caused the reward? This problem, known as temporal credit assignment, can be difficult for humans to solve given the temporal uncertainty in the environment. Research to date has attempted to isolate dimensions of delay and reward during decision-making, but algorithmic solutions to temporal learning problems and the effect of uncertainty on learning remain underexplored. To further our understanding, we adapted a reward learning task that creates a temporal credit assignment problem by combining sequentially delayed rewards, intervening events, and varying uncertainty via the amount of information presented during feedback. Using computational modeling, two learning strategies were developed: an eligibility trace, whereby previously selected actions are updated as a function of the temporal sequence, and a tabular update, whereby only systematically related past actions (rather than unrelated intervening events) are updated. We hypothesized that reduced information uncertainty would correlate with increased use of the tabular strategy, given the model\u2019s capacity to incorporate additional feedback information. Both models effectively learned the task, and predicted choices made by participants (N\u2009=\u2009142) as well as specific behavioral signatures of credit assignment. Consistent with our hypothesis, the tabular model outperformed the eligibility model under low information uncertainty, as evidenced by more accurate predictions of participants\u2019 behavior and an increase in tabular weight. These findings provide new insights into the mechanisms implemented by humans to solve temporal credit assignment and adapt their strategy in varying environments.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1013879","type":"journal-article","created":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T18:53:33Z","timestamp":1770058413000},"page":"e1013879","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":0,"title":["Information uncertainty influences learning strategy from sequentially delayed rewards"],"prefix":"10.1371","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0009-0005-3949-6087","authenticated-orcid":true,"given":"Sean R.","family":"Maulhardt","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alec","family":"Solway","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Caroline J.","family":"Charpentier","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"340","published-online":{"date-parts":[[2026,2,2]]},"reference":[{"issue":"2","key":"pcbi.1013879.ref001","doi-asserted-by":"crossref","first-page":"85","DOI":"10.3758\/BF03333113","article-title":"A neuronal model of classical conditioning","volume":"16","author":"A Harry Klopf","year":"1988","journal-title":"Psychobiology"},{"issue":"5","key":"pcbi.1013879.ref002","doi-asserted-by":"crossref","first-page":"1077","DOI":"10.1016\/j.cub.2022.01.025","article-title":"The role of state uncertainty in the dynamics of dopamine","volume":"32","author":"JG Mikhael","year":"2022","journal-title":"Curr Biol"},{"key":"pcbi.1013879.ref003","volume-title":"Temporal credit assignment in reinforcement learning","author":"RS Sutton","year":"1984"},{"issue":"1","key":"pcbi.1013879.ref004","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1109\/JRPROC.1961.287775","article-title":"Steps toward Artificial Intelligence","volume":"49","author":"M Minsky","year":"1961","journal-title":"Proc IRE"},{"issue":"3","key":"pcbi.1013879.ref005","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1037\/0033-2909.117.3.363","article-title":"Assessment of the Rescorla-Wagner model","volume":"117","author":"RR Miller","year":"1995","journal-title":"Psychol Bull"},{"issue":"3","key":"pcbi.1013879.ref006","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1016\/j.jmp.2008.12.005","article-title":"Reinforcement learning in the brain","volume":"53","author":"Y Niv","year":"2009","journal-title":"J Math Psychol"},{"key":"pcbi.1013879.ref007","first-page":"64","article-title":"A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement","volume-title":"Classical Conditioning II","author":"RA Rescorla","year":"1972"},{"issue":"5306","key":"pcbi.1013879.ref008","doi-asserted-by":"crossref","first-page":"1593","DOI":"10.1126\/science.275.5306.1593","article-title":"A neural substrate of prediction and reward","volume":"275","author":"W Schultz","year":"1997","journal-title":"Science"},{"key":"pcbi.1013879.ref009","first-page":"355","article-title":"A temporal-difference model of classical conditioning","author":"RS Sutton","year":"1987"},{"issue":"2","key":"pcbi.1013879.ref010","doi-asserted-by":"crossref","first-page":"131","DOI":"10.3758\/s13415-011-0027-0","article-title":"Learning from delayed feedback: neural responses in temporal credit assignment","volume":"11","author":"MM Walsh","year":"2011","journal-title":"Cogn Affect Behav Neurosci"},{"issue":"6","key":"pcbi.1013879.ref011","doi-asserted-by":"crossref","first-page":"1204","DOI":"10.1016\/j.neuron.2011.02.027","article-title":"Model-based influences on humans\u2019 choices and striatal prediction errors","volume":"69","author":"ND Daw","year":"2011","journal-title":"Neuron"},{"issue":"1","key":"pcbi.1013879.ref012","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1037\/a0030844","article-title":"Retrospective revaluation in sequential decision making: a tale of two systems","volume":"143","author":"SJ Gershman","year":"2014","journal-title":"J Exp Psychol Gen"},{"issue":"1","key":"pcbi.1013879.ref013","doi-asserted-by":"crossref","first-page":"5738","DOI":"10.1038\/s41467-019-13632-1","article-title":"Task complexity interacts with state-space uncertainty in the arbitration between model-based and model-free learning","volume":"10","author":"D Kim","year":"2019","journal-title":"Nat Commun"},{"issue":"3","key":"pcbi.1013879.ref014","doi-asserted-by":"crossref","first-page":"687","DOI":"10.1016\/j.neuron.2013.11.028","article-title":"Neural computations underlying arbitration between model-based and model-free learning","volume":"81","author":"SW Lee","year":"2014","journal-title":"Neuron"},{"issue":"1","key":"pcbi.1013879.ref015","doi-asserted-by":"crossref","first-page":"750","DOI":"10.1038\/s41467-019-08662-8","article-title":"Retrospective model-based inference guides model-free credit assignment","volume":"10","author":"R Moran","year":"2019","journal-title":"Nat Commun"},{"key":"pcbi.1013879.ref016","doi-asserted-by":"crossref","unstructured":"Acuna DE, Schrater P. Structure learning in human sequential decision-making. 2010;6(12).","DOI":"10.1371\/journal.pcbi.1001003"},{"key":"pcbi.1013879.ref017","doi-asserted-by":"crossref","unstructured":"Agogino AK, Tumer K. Unifying temporal and structural credit assignment problems. 2004.","DOI":"10.65109\/ISIS9131"},{"key":"pcbi.1013879.ref018","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1146\/annurev-psych-122414-033625","article-title":"Reinforcement Learning and Episodic Memory in Humans and Animals: An Integrative Framework","volume":"68","author":"SJ Gershman","year":"2017","journal-title":"Annu Rev Psychol"},{"issue":"1","key":"pcbi.1013879.ref019","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1023\/A:1018012322525","article-title":"Reinforcement learning with replacing eligibility traces","volume":"22","author":"SP Singh","year":"1996","journal-title":"Mach Learn"},{"key":"pcbi.1013879.ref020","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1016\/B978-1-55860-377-6.50072-4","article-title":"TD Models: Modeling the World at a Mixture of Time Scales.","volume-title":"Machine Learning Proceedings 1995","author":"RS Sutton","year":"1995"},{"issue":"3","key":"pcbi.1013879.ref021","doi-asserted-by":"crossref","first-page":"615","DOI":"10.1037\/0278-7393.33.3.615","article-title":"Working memory, attention control, and the N-back task: a question of construct validity","volume":"33","author":"MJ Kane","year":"2007","journal-title":"J Exp Psychol Learn Mem Cogn"},{"issue":"3","key":"pcbi.1013879.ref022","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1016\/j.pharmthera.2012.02.004","article-title":"Excessive discounting of delayed reinforcers as a trans-disease process contributing to addiction and other disease-related vulnerabilities: emerging evidence","volume":"134","author":"WK Bickel","year":"2012","journal-title":"Pharmacol Ther"},{"issue":"21","key":"pcbi.1013879.ref023","doi-asserted-by":"crossref","first-page":"5796","DOI":"10.1523\/JNEUROSCI.4246-06.2007","article-title":"Time discounting for primary rewards","volume":"27","author":"SM McClure","year":"2007","journal-title":"J Neurosci"},{"issue":"3","key":"pcbi.1013879.ref024","doi-asserted-by":"crossref","first-page":"343","DOI":"10.1016\/S0376-6357(04)00140-8","article-title":"Measuring state changes in human delay discounting: an experiential discounting task","volume":"67","author":"B Reynolds","year":"2004","journal-title":"Behav Processes"},{"issue":"3","key":"pcbi.1013879.ref025","doi-asserted-by":"crossref","first-page":"362","DOI":"10.1002\/jeab.18","article-title":"The costs of delay: waiting versus postponing in intertemporal choice","volume":"99","author":"F Paglieri","year":"2013","journal-title":"J Exp Anal Behav"},{"issue":"50","key":"pcbi.1013879.ref026","doi-asserted-by":"crossref","first-page":"15669","DOI":"10.1523\/JNEUROSCI.2799-09.2009","article-title":"Serotonin Affects Association of Aversive Outcomes to Past Actions","volume":"29","author":"SC Tanaka","year":"2009","journal-title":"J Neurosci"},{"issue":"4","key":"pcbi.1013879.ref027","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2016884118","article-title":"Human subjects exploit a cognitive map for credit assignment","volume":"118","author":"R Moran","year":"2021","journal-title":"Proc Natl Acad Sci U S A"},{"key":"pcbi.1013879.ref028","volume-title":"Reinforcement Learning and Causal Models","author":"SJ Gershman","year":"2017"},{"issue":"2","key":"pcbi.1013879.ref029","doi-asserted-by":"crossref","first-page":"466","DOI":"10.1037\/a0033455","article-title":"Navigating complex decision spaces: Problems and paradigms in sequential choice","volume":"140","author":"MM Walsh","year":"2014","journal-title":"Psychol Bull"},{"key":"pcbi.1013879.ref030","article-title":"Characterizing heterogeneity in human reinforcement learning and the arbitration of behavioral control [Internet]","author":"J Cockburn","year":"2024","journal-title":"PsyArXiv"},{"key":"pcbi.1013879.ref031","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1016\/j.neubiorev.2020.10.022","article-title":"Why and how the brain weights contributions from a mixture of experts","volume":"123","author":"JP O\u2019Doherty","year":"2021","journal-title":"Neurosci Biobehav Rev"},{"issue":"1","key":"pcbi.1013879.ref032","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1080\/09548980902759086","article-title":"Prospective and retrospective temporal difference learning","volume":"20","author":"P Dayan","year":"2009","journal-title":"Netw Comput Neural Syst"},{"issue":"4","key":"pcbi.1013879.ref033","doi-asserted-by":"crossref","first-page":"1494","DOI":"10.3758\/s13428-016-0809-y","article-title":"Evaluating significance in linear mixed-effects models in R","volume":"49","author":"SG Luke","year":"2017","journal-title":"Behav Res Methods"},{"issue":"4","key":"pcbi.1013879.ref034","doi-asserted-by":"crossref","first-page":"585","DOI":"10.1016\/j.neuron.2010.04.016","article-title":"States versus Rewards: Dissociable Neural Prediction Error Signals Underlying Model-Based and Model-Free Reinforcement Learning","volume":"66","author":"J Gl\u00e4scher","year":"2010","journal-title":"Neuron"},{"key":"pcbi.1013879.ref035","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.47463","article-title":"One-shot learning and behavioral eligibility traces in sequential decision making","volume":"8","author":"MP Lehmann","year":"2019","journal-title":"Elife"},{"issue":"3","key":"pcbi.1013879.ref036","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1016\/j.cognition.2009.03.013","article-title":"Short-term gains, long-term pains: How cues about state aid learning in dynamic environments","volume":"113","author":"TM Gureckis","year":"2009","journal-title":"Cognition"},{"key":"pcbi.1013879.ref037","article-title":"Credit Assignment: Challenges and Opportunities in Developing Human-like AI Agents [Internet]","author":"TN Nguyen","year":"2023","journal-title":"arXiv"},{"key":"pcbi.1013879.ref038","doi-asserted-by":"crossref","unstructured":"Bruckner R, Heekeren HR, Nassar MR. Understanding Learning Through Uncertainty and Bias [Internet]. 2022 [cited 2024 Apr 29]. Available from: https:\/\/osf.io\/xjkbg","DOI":"10.31234\/osf.io\/xjkbg"},{"issue":"10","key":"pcbi.1013879.ref039","doi-asserted-by":"crossref","DOI":"10.1038\/s41562-020-0905-y","article-title":"Humans primarily use model-based inference in the two-stage task","volume":"4","author":"C Feher Da Silva","year":"2020","journal-title":"Nat Hum Behav"},{"key":"pcbi.1013879.ref040","volume-title":"Perseverative Behavioral Sequences Aid Long-Term Credit Assignment. In: Conference on Cognitive Computational Neuroscience","author":"S Bruinsma","year":"2024"},{"key":"pcbi.1013879.ref041","first-page":"4105","volume-title":"Bandits with delayed, aggregated anonymous feedback. In: International Conference on Machine Learning. PMLR","author":"C Pike-Burke","year":"2018"},{"issue":"7","key":"pcbi.1013879.ref042","doi-asserted-by":"crossref","first-page":"1024","DOI":"10.1111\/j.1460-9568.2011.07980.x","article-title":"How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis","volume":"35","author":"AGE Collins","year":"2012","journal-title":"Eur J Neurosci"},{"key":"pcbi.1013879.ref043","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1186\/1744-9081-1-6","article-title":"Dopamine, uncertainty and TD learning","volume":"1","author":"Y Niv","year":"2005","journal-title":"Behav Brain Funct"},{"issue":"1","key":"pcbi.1013879.ref044","doi-asserted-by":"crossref","first-page":"3574","DOI":"10.1038\/s41598-020-80593-7","article-title":"Dissociation between asymmetric value updating and perseverance in human reinforcement learning","volume":"11","author":"M Sugawara","year":"2021","journal-title":"Sci Rep"},{"key":"pcbi.1013879.ref045","doi-asserted-by":"crossref","DOI":"10.4135\/9781036231378","volume-title":"Building Experiments in PsychoPy","author":"J Pierce","year":"2022"},{"key":"pcbi.1013879.ref046","first-page":"526","volume-title":"Reinforcement learning: an introduction","author":"RS Sutton","year":"2018","edition":"2"},{"key":"pcbi.1013879.ref047","article-title":"Observe and Look Further: Achieving Consistent Performance on Atari [Internet]","author":"T Pohlen","year":"2018","journal-title":"arXiv"},{"key":"pcbi.1013879.ref048","volume-title":"Learning values across many orders of magnitude. In: 30th Conference","author":"H van Hasselt","year":"2016"},{"issue":"6","key":"pcbi.1013879.ref049","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v040.i06","article-title":"DEoptim: An R Package for Global Optimization by Differential Evolution","volume":"40","author":"KM Mullen","year":"2011","journal-title":"J Stat Softw"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1013879","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2026,2,9]],"date-time":"2026-02-09T00:00:00Z","timestamp":1770595200000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1013879","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,9]],"date-time":"2026-02-09T19:01:28Z","timestamp":1770663688000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1013879"}},"subtitle":[],"editor":[{"given":"Alireza","family":"Soltani","sequence":"first","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2026,2,2]]},"references-count":49,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2026,2,2]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1013879","relation":{},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,2]]}}}