{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,18]],"date-time":"2026-06-18T20:31:21Z","timestamp":1781814681329,"version":"3.54.5"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1012404","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2024,9,16]],"date-time":"2024-09-16T00:00:00Z","timestamp":1726444800000}}],"reference-count":39,"publisher":"Public Library of Science (PLoS)","issue":"9","license":[{"start":{"date-parts":[[2024,9,4]],"date-time":"2024-09-04T00:00:00Z","timestamp":1725408000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100023564","name":"Einstein Center for Neurosciences Berlin","doi-asserted-by":"publisher","award":["PhD scholarship, no number"],"award-info":[{"award-number":["PhD scholarship, no number"]}],"id":[{"id":"10.13039\/501100023564","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["390523135"],"award-info":[{"award-number":["390523135"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Humans tend to give more weight to information confirming their beliefs than to information that disconfirms them. Nevertheless, this apparent irrationality has been shown to improve individual decision-making under uncertainty. However, little is known about this bias\u2019 impact on decision-making in a social context. Here, we investigate the conditions under which confirmation bias is beneficial or detrimental to decision-making under social influence. To do so, we develop a Collective Asymmetric Reinforcement Learning (CARL) model in which artificial agents observe others\u2019 actions and rewards, and update this information asymmetrically. We use agent-based simulations to study how confirmation bias affects collective performance on a two-armed bandit task, and how resource scarcity, group size and bias strength modulate this effect. We find that a confirmation bias benefits group learning across a wide range of resource-scarcity conditions. Moreover, we discover that, past a critical bias strength, resource abundance favors the emergence of two different performance regimes, one of which is suboptimal. In addition, we find that this regime bifurcation comes with polarization in small groups of agents. Overall, our results suggest the existence of an optimal, moderate level of confirmation bias for decision-making in a social context.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1012404","type":"journal-article","created":{"date-parts":[[2024,9,4]],"date-time":"2024-09-04T17:49:12Z","timestamp":1725472152000},"page":"e1012404","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":13,"title":["Moderate confirmation bias enhances decision-making in groups of reinforcement-learning agents"],"prefix":"10.1371","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7640-283X","authenticated-orcid":true,"given":"Cl\u00e9mence","family":"Bergerot","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Wolfram","family":"Barfuss","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Pawel","family":"Romanczuk","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"340","published-online":{"date-parts":[[2024,9,4]]},"reference":[{"issue":"1","key":"pcbi.1012404.ref001","doi-asserted-by":"crossref","first-page":"40391","DOI":"10.1038\/srep40391","article-title":"Modeling confirmation bias and polarization","volume":"7","author":"M. Del Vicario","year":"2017","journal-title":"Scientific reports"},{"issue":"2","key":"pcbi.1012404.ref002","doi-asserted-by":"crossref","first-page":"523","DOI":"10.1093\/pubmed\/fdac128","article-title":"Confirmation bias and vaccine-related beliefs in the time of COVID-19","volume":"45","author":"E. Malthouse","year":"2023","journal-title":"Journal of Public Health"},{"key":"pcbi.1012404.ref003","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1016\/j.techsoc.2018.06.002","article-title":"Does social media increase racist behavior? An examination of confirmation bias theory","volume":"55","author":"A. Alsaad","year":"2018","journal-title":"Technology in Society"},{"issue":"4","key":"pcbi.1012404.ref004","doi-asserted-by":"crossref","first-page":"500","DOI":"10.1177\/00936502211028049","article-title":"Confirmation bias and the persistence of misinformation on climate change","volume":"49","author":"Y. Zhou","year":"2022","journal-title":"Communication Research"},{"key":"pcbi.1012404.ref005","volume-title":"Reinforcement learning: an introduction","author":"R. S. Sutton","year":"2018"},{"issue":"7","key":"pcbi.1012404.ref006","doi-asserted-by":"crossref","first-page":"607","DOI":"10.1016\/j.tics.2022.04.005","article-title":"The computational roots of positivity and confirmation biases in reinforcement learning","volume":"26","author":"S. Palminteri","year":"2022","journal-title":"Trends in Cognitive Sciences"},{"issue":"11","key":"pcbi.1012404.ref007","doi-asserted-by":"crossref","first-page":"1215","DOI":"10.1038\/s41562-019-0714-3","article-title":"Flexible combination of reward information across primates","volume":"3","author":"S. Farashahi","year":"2019","journal-title":"Nature human behaviour"},{"key":"pcbi.1012404.ref008","doi-asserted-by":"crossref","first-page":"e61387","DOI":"10.7554\/eLife.61387","article-title":"Impaired adaptation of learning to contingency volatility in internalizing psychopathology","volume":"9","author":"C. Gagne","year":"2020","journal-title":"Elife"},{"issue":"4","key":"pcbi.1012404.ref009","doi-asserted-by":"crossref","first-page":"0067","DOI":"10.1038\/s41562-017-0067","article-title":"Behavioural and neural characterization of optimistic reinforcement learning","volume":"1","author":"G. Lefebvre","year":"2017","journal-title":"Nature Human Behaviour"},{"key":"pcbi.1012404.ref010","doi-asserted-by":"crossref","first-page":"218","DOI":"10.1016\/j.neunet.2021.05.030","article-title":"The asymmetric learning rates of murine exploratory behavior in sparse reward environments","volume":"143","author":"H. Ohta","year":"2021","journal-title":"Neural Networks"},{"issue":"8","key":"pcbi.1012404.ref011","doi-asserted-by":"crossref","first-page":"e1005684","DOI":"10.1371\/journal.pcbi.1005684","article-title":"Confirmation bias in human reinforcement learning: Evidence from counterfactual feedback processing","volume":"13","author":"S. Palminteri","year":"2017","journal-title":"PLoS computational biology"},{"issue":"3","key":"pcbi.1012404.ref012","doi-asserted-by":"crossref","first-page":"e13330","DOI":"10.1111\/desc.13330","article-title":"Confirmatory reinforcement learning changes with age during adolescence","volume":"26","author":"G. Chierchia","year":"2023","journal-title":"Developmental science"},{"issue":"10","key":"pcbi.1012404.ref013","doi-asserted-by":"crossref","first-page":"1067","DOI":"10.1038\/s41562-020-0919-5","article-title":"Information about action outcomes differentially affects learning from self-determined versus imposed choices","volume":"4","author":"V. Chambon","year":"2020","journal-title":"Nature Human Behaviour"},{"issue":"6","key":"pcbi.1012404.ref014","doi-asserted-by":"crossref","first-page":"711","DOI":"10.1007\/s00422-013-0571-5","article-title":"Adaptive properties of differential learning rates for positive and negative outcomes","volume":"107","author":"R. D. Caz\u00e9","year":"2013","journal-title":"Biological cybernetics"},{"key":"pcbi.1012404.ref015","article-title":"Optimal reinforcement learning with asymmetric updating in volatile environments: a simulation study","author":"M. R. Kandroodi","year":"2021","journal-title":"bioRxiv"},{"key":"pcbi.1012404.ref016","article-title":"Confirmation bias optimizes reward learning","author":"T. Tarantola","year":"2021","journal-title":"BioRxiv"},{"issue":"2","key":"pcbi.1012404.ref017","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1162\/neco_a_01455","article-title":"A normative account of confirmation bias during reinforcement learning","volume":"34","author":"G. Lefebvre","year":"2022","journal-title":"Neural computation"},{"issue":"1","key":"pcbi.1012404.ref018","doi-asserted-by":"crossref","first-page":"5493","DOI":"10.1038\/s41598-020-62085-w","article-title":"A minimalistic model of bias, polarization and misinformation in social networks","volume":"10","author":"O. Sikder","year":"2020","journal-title":"Scientific reports"},{"issue":"1","key":"pcbi.1012404.ref019","doi-asserted-by":"crossref","first-page":"31834","DOI":"10.1038\/srep31834","article-title":"Emergence of metapopulations and echo chambers in mobile agents","volume":"6","author":"M. Starnini","year":"2016","journal-title":"Scientific reports"},{"issue":"2","key":"pcbi.1012404.ref020","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1017\/psa.2023.176","article-title":"Can confirmation bias improve group learning?","volume":"91","author":"N. Gabriel","year":"2024","journal-title":"Philosophy of Science"},{"key":"pcbi.1012404.ref021","doi-asserted-by":"crossref","unstructured":"Panait, L., Sullivan, K., & Luke, S. (2006, May). Lenient learners in cooperative multiagent systems. In Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems (pp. 801-803).","DOI":"10.1145\/1160633.1160776"},{"key":"pcbi.1012404.ref022","doi-asserted-by":"crossref","unstructured":"Matignon, L., Laurent, G. J., & Le Fort-Piat, N. (2007, October). Hysteretic q-learning: an algorithm for decentralized reinforcement learning in cooperative multi-agent teams. In 2007 IEEE\/RSJ International Conference on Intelligent Robots and Systems (pp. 64-69). IEEE.","DOI":"10.1109\/IROS.2007.4399095"},{"issue":"2","key":"pcbi.1012404.ref023","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1016\/S0004-3702(02)00121-2","article-title":"Multiagent learning using a variable learning rate","volume":"136","author":"M. Bowling","year":"2002","journal-title":"Artificial intelligence"},{"key":"pcbi.1012404.ref024","doi-asserted-by":"crossref","unstructured":"Kapetanakis, S., & Kudenko, D. (2002, April). Improving on the reinforcement learning of coordination in cooperative multi-agent systems. In Proceedings of the Second Symposium on Adaptive Agents and Multi-agent Systems (AISB02).","DOI":"10.1007\/3-540-44826-8_2"},{"key":"pcbi.1012404.ref025","unstructured":"Matignon, L., Laurent, G. J., & Le Fort-Piat, N. (2008, May). A study of FMQ heuristic in cooperative multi-agent games. In The 7th International Conference on Autonomous Agents and Multiagent Systems. Workshop 10: Multi-Agent Sequential Decision Making in Uncertain Multi-Agent Domains, aamas\u2019 08. (Vol. 1, pp. 77-91)."},{"issue":"2017","key":"pcbi.1012404.ref026","doi-asserted-by":"crossref","first-page":"20232011","DOI":"10.1098\/rspb.2023.2011","article-title":"The roots of polarization in the individual reward system","volume":"291","author":"G. Lefebvre","year":"2024","journal-title":"Proceedings of the Royal Society B"},{"issue":"4","key":"pcbi.1012404.ref027","doi-asserted-by":"crossref","first-page":"043305","DOI":"10.1103\/PhysRevE.99.043305","article-title":"Deterministic limit of temporal difference reinforcement learning for stochastic games","volume":"99","author":"W. Barfuss","year":"2019","journal-title":"Physical Review E"},{"issue":"5","key":"pcbi.1012404.ref028","doi-asserted-by":"crossref","first-page":"523","DOI":"10.1287\/orsc.12.5.523.10092","article-title":"Adaptation as information restriction: The hot stove effect","volume":"12","author":"J. Denrell","year":"2001","journal-title":"Organization science"},{"issue":"2","key":"pcbi.1012404.ref029","doi-asserted-by":"crossref","first-page":"398","DOI":"10.1037\/0033-295X.114.2.398","article-title":"Interdependent sampling and social influence","volume":"114","author":"J. Denrell","year":"2007","journal-title":"Psychological review"},{"key":"pcbi.1012404.ref030","doi-asserted-by":"crossref","first-page":"e75308","DOI":"10.7554\/eLife.75308","article-title":"Conformist social learning leads to self-organised prevention against adverse bias in risky decision making","volume":"11","author":"W. Toyokawa","year":"2022","journal-title":"Elife"},{"key":"pcbi.1012404.ref031","volume-title":"The enigma of reason","author":"H. Mercier","year":"2017"},{"issue":"1","key":"pcbi.1012404.ref032","doi-asserted-by":"crossref","first-page":"1309","DOI":"10.1038\/s41598-023-27672-7","article-title":"Intrinsic fluctuations of reinforcement learning promote cooperation","volume":"13","author":"W. Barfuss","year":"2023","journal-title":"Scientific reports"},{"issue":"5","key":"pcbi.1012404.ref033","doi-asserted-by":"crossref","first-page":"983","DOI":"10.1016\/0042-6989(92)90040-P","article-title":"The information capacity of visual attention","volume":"32","author":"P. Verghese","year":"1992","journal-title":"Vision research"},{"issue":"3","key":"pcbi.1012404.ref034","doi-asserted-by":"crossref","first-page":"034409","DOI":"10.1103\/PhysRevE.105.034409","article-title":"Modeling the effects of environmental and perceptual uncertainty using deterministic reinforcement learning dynamics with partial observability","volume":"105","author":"W. Barfuss","year":"2022","journal-title":"Physical Review E"},{"issue":"4","key":"pcbi.1012404.ref035","doi-asserted-by":"crossref","first-page":"873","DOI":"10.1162\/neco.2008.12-06-420","article-title":"The diffusion decision model: theory and data for two-choice decision tasks","volume":"20","author":"R. Ratcliff","year":"2008","journal-title":"Neural computation"},{"issue":"29","key":"pcbi.1012404.ref036","doi-asserted-by":"crossref","first-page":"eabb0266","DOI":"10.1126\/sciadv.abb0266","article-title":"Wise or mad crowds? The cognitive mechanisms underlying information cascades","volume":"6","author":"A. N. Tump","year":"2020","journal-title":"Science Advances"},{"issue":"8","key":"pcbi.1012404.ref037","doi-asserted-by":"crossref","first-page":"e1010442","DOI":"10.1371\/journal.pcbi.1010442","article-title":"Avoiding costly mistakes in groups: the evolution of error management in collective decision making","volume":"18","author":"A. N. Tump","year":"2022","journal-title":"PLOS Computational Biology"},{"issue":"1","key":"pcbi.1012404.ref038","doi-asserted-by":"crossref","first-page":"3574","DOI":"10.1038\/s41598-020-80593-7","article-title":"Dissociation between asymmetric value updating and perseverance in human reinforcement learning","volume":"11","author":"M. Sugawara","year":"2021","journal-title":"Scientific reports"},{"issue":"1","key":"pcbi.1012404.ref039","doi-asserted-by":"crossref","DOI":"10.1177\/2053951715621086","article-title":"The unlikely encounter between von Foerster and Snowden: When second-order cybernetics sheds light on societal impacts of Big Data","volume":"3","author":"D. Chavalarias","year":"2016","journal-title":"Big Data & Society"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1012404","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2024,9,16]],"date-time":"2024-09-16T00:00:00Z","timestamp":1726444800000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1012404","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,16]],"date-time":"2024-09-16T18:05:26Z","timestamp":1726509926000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1012404"}},"subtitle":[],"editor":[{"given":"Stefano","family":"Palminteri","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2024,9,4]]},"references-count":39,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2024,9,4]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1012404","relation":{"new_version":[{"id-type":"doi","id":"10.1371\/journal.pcbi.1012404","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,9,4]]}}}