{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,31]],"date-time":"2025-12-31T10:55:17Z","timestamp":1767178517516,"version":"build-2238731810"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1013271","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2025,9,12]],"date-time":"2025-09-12T00:00:00Z","timestamp":1757635200000}}],"reference-count":85,"publisher":"Public Library of Science (PLoS)","issue":"9","license":[{"start":{"date-parts":[[2025,9,5]],"date-time":"2025-09-05T00:00:00Z","timestamp":1757030400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100018694","name":"HORIZON EUROPE Marie Sklodowska-Curie Actions","doi-asserted-by":"publisher","award":["945304"],"award-info":[{"award-number":["945304"]}],"id":[{"id":"10.13039\/100018694","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001665","name":"Agence Nationale de la Recherche","doi-asserted-by":"publisher","award":["ANR-17-EURE-0017"],"award-info":[{"award-number":["ANR-17-EURE-0017"]}],"id":[{"id":"10.13039\/501100001665","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001665","name":"Agence Nationale de la Recherche","doi-asserted-by":"publisher","award":["ANR-10-IDEX-0001-02"],"award-info":[{"award-number":["ANR-10-IDEX-0001-02"]}],"id":[{"id":"10.13039\/501100001665","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Humans can spontaneously detect complex algebraic structures. Historically, two opposing views explain this ability, at the root of language and music acquisition. Some argue for the existence of an innate and specific mechanism. Others argue that this ability emerges from experience: i.e. 
when generic learning principles continuously process sensory inputs. These two views, however, remain difficult to test experimentally. Here, we use deep learning models to evaluate the factors that lead to the spontaneous detection of algebraic structures in the auditory modality. Specifically, we use self-supervised learning to train multiple deep learning models with variable amounts of natural sounds (environmental sounds) and\/or cultural sounds (speech or music) to evaluate the impact of these stimuli. We then expose these models to the experimental paradigms classically used to evaluate the processing of algebraic structures. Like humans, these models spontaneously detect repeated sequences, probabilistic chunks, and complex algebraic structures. Also like humans, this ability diminishes with structure complexity. Importantly, this ability can emerge from experience alone: the more the models are exposed to natural sounds, the more they spontaneously detect increasingly complex structures. Finally, this ability does not emerge in models pretrained only on speech, and emerges more rapidly in models pretrained with music than in models pretrained with environmental sounds. 
Overall, our study provides an operational framework to clarify sufficient built-in and acquired principles that model humans\u2019 advanced capacity to detect algebraic structures in sounds.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1013271","type":"journal-article","created":{"date-parts":[[2025,9,5]],"date-time":"2025-09-05T17:58:06Z","timestamp":1757095086000},"page":"e1013271","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":0,"title":["The detection of algebraic auditory structures emerges with self-supervised learning"],"prefix":"10.1371","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1094-266X","authenticated-orcid":true,"given":"Pierre","family":"Orhan","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yves","family":"Boubenec","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jean-R\u00e9mi","family":"King","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"340","published-online":{"date-parts":[[2025,9,5]]},"reference":[{"issue":"2","key":"pcbi.1013271.ref001","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1016\/j.cognition.2005.04.006","article-title":"The nature of the language faculty and its implications for evolution of language (reply to Fitch, Hauser, and Chomsky)","volume":"97","author":"R Jackendoff","year":"2005","journal-title":"Cognition"},{"key":"pcbi.1013271.ref002","volume-title":"First mit press paperback edition ed","author":"RC Berwick","year":"2017"},{"issue":"7","key":"pcbi.1013271.ref003","doi-asserted-by":"crossref","first-page":"674","DOI":"10.1038\/nn1082","article-title":"Language, music, syntax and the brain","volume":"6","author":"AD Patel","year":"2003","journal-title":"Nat 
Neurosci"},{"issue":"1","key":"pcbi.1013271.ref004","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1016\/j.neuron.2015.09.019","article-title":"The neural representation of sequences: from transition probabilities to algebraic patterns and linguistic trees","volume":"88","author":"S Dehaene","year":"2015","journal-title":"Neuron"},{"key":"pcbi.1013271.ref005","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.65566","article-title":"Distinct higher-order representations of natural sounds in human and ferret auditory cortex","volume":"10","author":"A Landemard","year":"2021","journal-title":"Elife"},{"key":"pcbi.1013271.ref006","volume-title":"How the mind works","author":"S Pinker","year":"1999"},{"key":"pcbi.1013271.ref007","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/1187.001.0001","volume-title":"The algebraic mind: integrating connectionism and cognitive science","author":"GF Marcus","year":"2001"},{"key":"pcbi.1013271.ref008","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.84376","article-title":"Brain-imaging evidence for compression of binary sound sequences in human memory","volume":"12","author":"F Al Roumi","year":"2023","journal-title":"Elife"},{"issue":"16","key":"pcbi.1013271.ref009","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2023123118","article-title":"Sensitivity to geometric shape regularity in humans and baboons: a putative signature of human singularity","volume":"118","author":"M Sabl\u00e9-Meyer","year":"2021","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"9","key":"pcbi.1013271.ref010","doi-asserted-by":"crossref","first-page":"751","DOI":"10.1016\/j.tics.2022.06.010","article-title":"Symbols and mental programs: a hypothesis about human singularity","volume":"26","author":"S Dehaene","year":"2022","journal-title":"Trends Cogn Sci"},{"issue":"31","key":"pcbi.1013271.ref011","doi-asserted-by":"crossref","first-page":"10687","DOI":"10.1073\/pnas.0802631105","article-title":"The discovery of structural 
form","volume":"105","author":"C Kemp","year":"2008","journal-title":"Proc Natl Acad Sci U S A"},{"key":"pcbi.1013271.ref012","unstructured":"Chomsky N. The minimalist program. Cambridge, Mass: The MIT Press; 1995."},{"key":"pcbi.1013271.ref013","unstructured":"Bates E, Benigni L, Bretherton I, Camaioni L, Volterrra V. The emergence of symbols: cognition and communication in infancy. New York: Academic Press; 1979."},{"issue":"5294","key":"pcbi.1013271.ref014","doi-asserted-by":"crossref","first-page":"1926","DOI":"10.1126\/science.274.5294.1926","article-title":"Statistical learning by 8-month-old infants","volume":"274","author":"JR Saffran","year":"1996","journal-title":"Science"},{"issue":"5593","key":"pcbi.1013271.ref015","doi-asserted-by":"crossref","first-page":"604","DOI":"10.1126\/science.1072901","article-title":"Signal-driven computations in speech processing","volume":"298","author":"M Pe\u00f1a","year":"2002","journal-title":"Science"},{"issue":"24","key":"pcbi.1013271.ref016","doi-asserted-by":"crossref","first-page":"15822","DOI":"10.1073\/pnas.232472899","article-title":"Statistical learning of new visual feature combinations by infants","volume":"99","author":"J Fiser","year":"2002","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"4","key":"pcbi.1013271.ref017","doi-asserted-by":"crossref","first-page":"486","DOI":"10.1038\/nn.3331","article-title":"Neural representations of events arise from temporal community structure","volume":"16","author":"AC Schapiro","year":"2013","journal-title":"Nat Neurosci"},{"issue":"1","key":"pcbi.1013271.ref018","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1002\/hipo.22523","article-title":"Statistical learning of temporal community structure in the hippocampus","volume":"26","author":"AC Schapiro","year":"2016","journal-title":"Hippocampus"},{"key":"pcbi.1013271.ref019","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.86430","article-title":"Humans parsimoniously represent auditory sequences by 
pruning and completing the underlying network structure","volume":"12","author":"L Benjamin","year":"2023","journal-title":"Elife"},{"issue":"12","key":"pcbi.1013271.ref020","doi-asserted-by":"crossref","first-page":"2544","DOI":"10.1016\/j.clinph.2007.04.026","article-title":"The mismatch negativity (MMN) in basic research of central auditory processing: a review","volume":"118","author":"R N\u00e4\u00e4t\u00e4nen","year":"2007","journal-title":"Clin Neurophysiol"},{"issue":"5","key":"pcbi.1013271.ref021","doi-asserted-by":"crossref","first-page":"1672","DOI":"10.1073\/pnas.0809667106","article-title":"Neural signature of the conscious processing of auditory regularities","volume":"106","author":"TA Bekinschtein","year":"2009","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"51","key":"pcbi.1013271.ref022","doi-asserted-by":"crossref","first-page":"20754","DOI":"10.1073\/pnas.1117807108","article-title":"Evidence for a hierarchy of predictions and prediction errors in human cortex","volume":"108","author":"C Wacongne","year":"2011","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"5","key":"pcbi.1013271.ref023","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.1508523113","article-title":"Brain responses in humans reveal ideal observer-like sensitivity to complex acoustic patterns","volume":"113","author":"N Barascud","year":"2016","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"11","key":"pcbi.1013271.ref024","doi-asserted-by":"crossref","first-page":"3665","DOI":"10.1523\/JNEUROSCI.5003-11.2012","article-title":"A neuronal model of predictive coding accounting for the mismatch negativity","volume":"32","author":"C Wacongne","year":"2012","journal-title":"J Neurosci"},{"key":"pcbi.1013271.ref025","unstructured":"Pearce MT. The construction and evaluation of statistical models of melodic structure in music perception and composition. City University London; 2005. 
https:\/\/openaccess.city.ac.uk\/id\/eprint\/8459\/"},{"issue":"11","key":"pcbi.1013271.ref026","article-title":"PPM-Decay: a computational model of auditory prediction with memory decay","volume":"16","author":"PMC Harrison","year":"2020","journal-title":"PLoS Comput Biol"},{"key":"pcbi.1013271.ref027","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.56073","article-title":"Long-term implicit memory for sequential auditory patterns in humans","volume":"9","author":"R Bianco","year":"2020","journal-title":"Elife"},{"key":"pcbi.1013271.ref028","unstructured":"Lake B, Baroni M. Generalization without systematicity: on the compositional skills of sequence-to-sequence recurrent networks. In: Proceedings of the 35th International Conference on Machine Learning. 2018. p. 2873\u201382. https:\/\/proceedings.mlr.press\/v80\/lake18a.html"},{"issue":"7985","key":"pcbi.1013271.ref029","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1038\/s41586-023-06668-3","article-title":"Human-like systematic generalization through a meta-learning neural network","volume":"623","author":"BM Lake","year":"2023","journal-title":"Nature"},{"key":"pcbi.1013271.ref030","doi-asserted-by":"crossref","first-page":"757","DOI":"10.1613\/jair.1.11674","article-title":"Compositionality decomposed: how do neural networks generalise?","volume":"67","author":"D Hupkes","year":"2020","journal-title":"Journal of Artificial Intelligence Research"},{"key":"pcbi.1013271.ref031","doi-asserted-by":"crossref","unstructured":"Lakretz Y, Kruszewski G, Desbordes T, Hupkes D, Dehaene S, Baroni M. The emergence of number and syntax units in LSTM language models. arXiv preprint 2019. 
http:\/\/arxiv.org\/abs\/1903.07435.","DOI":"10.18653\/v1\/N19-1002"},{"key":"pcbi.1013271.ref032","doi-asserted-by":"crossref","first-page":"521","DOI":"10.1162\/tacl_a_00115","article-title":"Assessing the ability of LSTMs to learn syntax-sensitive dependencies","volume":"4","author":"T Linzen","year":"2016","journal-title":"TACL"},{"issue":"8","key":"pcbi.1013271.ref033","doi-asserted-by":"crossref","first-page":"348","DOI":"10.1016\/j.tics.2010.06.002","article-title":"Letting structure emerge: connectionist and dynamical systems approaches to cognition","volume":"14","author":"JL McClelland","year":"2010","journal-title":"Trends in Cognitive Sciences"},{"key":"pcbi.1013271.ref034","article-title":"An explanation of in-context learning as implicit Bayesian inference","author":"SM Xie","year":"2022","journal-title":"arXiv preprint"},{"key":"pcbi.1013271.ref035","unstructured":"Chan S, Santoro A, Lampinen A, Wang J, Singh A, Richemond P. Data distributional properties drive emergent in-context learning in transformers. In: Advances in Neural Information Processing Systems. 2022. p. 18878\u201391. https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2022\/hash\/77c6ccacfd9962e2307fc64680fc5ace-Abstract-Conference.html"},{"key":"pcbi.1013271.ref036","unstructured":"Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, et al. Language models are few-shot learners. In: Advances in Neural Information Processing Systems. 2020. p. 1877\u2013901. 
https:\/\/papers.nips.cc\/paper\/2020\/hash\/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html"},{"key":"pcbi.1013271.ref037","first-page":"22628","article-title":"How well do unsupervised learning algorithms model human real-time and life-long learning?","volume":"35","author":"C Zhuang","year":"2022","journal-title":"Adv Neural Inf Process Syst"},{"issue":"3","key":"pcbi.1013271.ref038","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2014196118","article-title":"Unsupervised neural network models of the ventral visual stream","volume":"118","author":"C Zhuang","year":"2021","journal-title":"Proc Natl Acad Sci U S A"},{"key":"pcbi.1013271.ref039","unstructured":"Millet J, Caucheteux C, Orhan P, Boubenec Y, Gramfort A, Dunbar E. Toward a realistic model of speech processing in the brain with self-supervised learning. In: Advances in Neural Information Processing Systems. 2022. p. 33428\u201343. https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2022\/hash\/d81ecfc8fb18e833a3fa0a35d92532b8-Abstract-Conference.html"},{"issue":"9","key":"pcbi.1013271.ref040","doi-asserted-by":"crossref","first-page":"657","DOI":"10.1038\/s41562-017-0186-2","article-title":"Letter perception emerges from unsupervised deep learning and recycling of natural image features","volume":"1","author":"A Testolin","year":"2017","journal-title":"Nat Hum Behav"},{"key":"pcbi.1013271.ref041","article-title":"wav2vec 2.0: a framework for self-supervised learning of speech representations","author":"A Baevski","year":"2020","journal-title":"arXiv preprint"},{"key":"pcbi.1013271.ref042","doi-asserted-by":"crossref","unstructured":"Panayotov V, Chen G, Povey D, Khudanpur S. Librispeech: An ASR corpus based on public domain audio books. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2015. p. 5206\u201310. 
https:\/\/ieeexplore.ieee.org\/document\/7178964","DOI":"10.1109\/ICASSP.2015.7178964"},{"key":"pcbi.1013271.ref043","article-title":"FMA: a dataset for music analysis","author":"M Defferrard","year":"2017","journal-title":"arXiv preprint"},{"key":"pcbi.1013271.ref044","doi-asserted-by":"crossref","unstructured":"Gemmeke JF, Ellis DPW, Freedman D, Jansen A, Lawrence W, Moore RC, et al. Audio set: an ontology and human-labeled dataset for audio events. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2017. p. 776\u201380. https:\/\/ieeexplore.ieee.org\/document\/7952261","DOI":"10.1109\/ICASSP.2017.7952261"},{"issue":"1","key":"pcbi.1013271.ref045","article-title":"Two distinct dynamic modes subtend the detection of unexpected sounds","volume":"9","author":"J-R King","year":"2014","journal-title":"PLoS One"},{"key":"pcbi.1013271.ref046","doi-asserted-by":"crossref","first-page":"726","DOI":"10.1016\/j.neuroimage.2013.07.013","article-title":"Single-trial decoding of auditory novelty responses facilitates the detection of residual consciousness","volume":"83","author":"JR King","year":"2013","journal-title":"Neuroimage"},{"issue":"2","key":"pcbi.1013271.ref047","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1016\/j.cognition.2014.03.013","article-title":"A hierarchy of cortical responses to sequence violations in three-month-old infants","volume":"132","author":"A Basirat","year":"2014","journal-title":"Cognition"},{"issue":"11","key":"pcbi.1013271.ref048","doi-asserted-by":"crossref","first-page":"4203","DOI":"10.1093\/cercor\/bhu143","article-title":"Event-related potential, time-frequency, and functional connectivity facets of local and global auditory novelty processing: an intracranial study in humans","volume":"25","author":"I El Karoui","year":"2015","journal-title":"Cereb 
Cortex"},{"issue":"39","key":"pcbi.1013271.ref049","doi-asserted-by":"crossref","first-page":"13389","DOI":"10.1523\/JNEUROSCI.2227-12.2012","article-title":"Repetition suppression and expectation suppression are dissociable in time in early auditory evoked fields","volume":"32","author":"A Todorovic","year":"2012","journal-title":"J Neurosci"},{"issue":"4","key":"pcbi.1013271.ref050","doi-asserted-by":"crossref","first-page":"1127","DOI":"10.1523\/JNEUROSCI.3165-13.2014","article-title":"A hierarchy of responses to auditory regularities in the macaque brain","volume":"34","author":"L Uhrig","year":"2014","journal-title":"J Neurosci"},{"key":"pcbi.1013271.ref051","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.74653","article-title":"Constructing the hierarchy of predictive auditory sequences in the marmoset brain","volume":"11","author":"Y Jiang","year":"2022","journal-title":"Elife"},{"key":"pcbi.1013271.ref052","article-title":"Speech language models lack important brain-relevant semantics","author":"SR Oota","year":"2023","journal-title":"arXiv preprint"},{"issue":"39","key":"pcbi.1013271.ref053","doi-asserted-by":"crossref","first-page":"16428","DOI":"10.1073\/pnas.1112937108","article-title":"Functional specificity for high-level linguistic processing in the human brain","volume":"108","author":"E Fedorenko","year":"2011","journal-title":"Proc Natl Acad Sci U S A"},{"key":"pcbi.1013271.ref054","article-title":"Robust speech recognition via large-scale weak supervision","author":"A Radford","year":"2022","journal-title":"arXiv preprint"},{"key":"pcbi.1013271.ref055","article-title":"High fidelity neural audio compression","author":"A D\u00e9fossez","year":"2022","journal-title":"arXiv preprint"},{"key":"pcbi.1013271.ref056","article-title":"Concurrent encoding of precision and prediction error in unfolding auditory patterns: insights from MEG","author":"M Hu","year":"2023","journal-title":"arXiv 
preprint"},{"issue":"1","key":"pcbi.1013271.ref057","doi-asserted-by":"crossref","first-page":"464","DOI":"10.1121\/1.4807641","article-title":"The detection of repetitions in noise before and after perceptual learning","volume":"134","author":"TR Agus","year":"2013","journal-title":"J Acoust Soc Am"},{"issue":"4","key":"pcbi.1013271.ref058","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1016\/j.neuron.2010.04.014","article-title":"Rapid formation of robust auditory memories: insights from noise","volume":"66","author":"TR Agus","year":"2010","journal-title":"Neuron"},{"issue":"4","key":"pcbi.1013271.ref059","doi-asserted-by":"crossref","first-page":"2219","DOI":"10.1121\/1.5007730","article-title":"Auditory memory for random time patterns","volume":"142","author":"H Kang","year":"2017","journal-title":"The Journal of the Acoustical Society of America"},{"issue":"1043","key":"pcbi.1013271.ref060","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1098\/rspb.1970.0040","article-title":"A theory for cerebral neocortex","volume":"176","author":"D Marr","year":"1970","journal-title":"Proc R Soc Lond B Biol Sci"},{"issue":"3","key":"pcbi.1013271.ref061","doi-asserted-by":"crossref","first-page":"551","DOI":"10.1016\/0306-4522(89)90423-5","article-title":"Two-stage model of memory trace formation: a role for \u201cnoisy\u201d brain states","volume":"31","author":"G Buzs\u00e1ki","year":"1989","journal-title":"Neuroscience"},{"issue":"7","key":"pcbi.1013271.ref062","first-page":"310","article-title":"The hippocampus: hub of brain network communication for memory","volume":"15","author":"FP Battaglia","year":"2011","journal-title":"Trends Cogn Sci"},{"issue":"1791","key":"pcbi.1013271.ref063","first-page":"20141000","article-title":"Representations of specific acoustic patterns in the auditory cortex and hippocampus","volume":"281","author":"S Kumar","year":"2014","journal-title":"Proc Biol 
Sci"},{"issue":"15","key":"pcbi.1013271.ref064","doi-asserted-by":"crossref","first-page":"1966","DOI":"10.1016\/j.cub.2015.06.035","article-title":"Representation of numerical and sequential patterns in macaque and human brains","volume":"25","author":"L Wang","year":"2015","journal-title":"Curr Biol"},{"key":"pcbi.1013271.ref065","article-title":"Parallel mechanisms signal a hierarchy of sequence structure violations in the auditory cortex","author":"S Jamali","year":"2024","journal-title":"arXiv preprint"},{"key":"pcbi.1013271.ref066","unstructured":"Poole KC. How does the brain extract acoustic patterns? A behavioural and neural study. UCL (University College London). 2023. https:\/\/discovery.ucl.ac.uk\/id\/eprint\/10173385\/"},{"issue":"2","key":"pcbi.1013271.ref067","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1044\/2016_AJSLP-15-0169","article-title":"Mapping the early language environment using all-day recordings and automated analysis","volume":"26","author":"J Gilkerson","year":"2017","journal-title":"Am J Speech Lang Pathol"},{"key":"pcbi.1013271.ref068","unstructured":"Hewitt J, Manning CD. A structural probe for finding syntax in word representations. In: Proceedings of the 2019 Conference of the North. 2019. p. 4129\u201338. 
http:\/\/aclweb.org\/anthology\/N19-1419"},{"issue":"12","key":"pcbi.1013271.ref069","doi-asserted-by":"crossref","first-page":"2213","DOI":"10.1038\/s41593-023-01468-4","article-title":"Dissecting neural computations in the human auditory pathway using deep neural networks for speech","volume":"26","author":"Y Li","year":"2023","journal-title":"Nat Neurosci"},{"key":"pcbi.1013271.ref070","article-title":"Self-supervised models of audio effectively explain human cortical responses to speech","author":"AR Vaidya","year":"2022","journal-title":"arXiv preprint"},{"issue":"1","key":"pcbi.1013271.ref071","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1038\/s41562-021-01244-z","article-title":"Deep neural network models of sound localization reveal how perception is adapted to real-world environments","volume":"6","author":"A Francl","year":"2022","journal-title":"Nat Hum Behav"},{"issue":"12","key":"pcbi.1013271.ref072","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pbio.3002366","article-title":"Many but not all deep neural network audio models capture brain responses and exhibit correspondence between model stages and brain regions","volume":"21","author":"G Tuckute","year":"2023","journal-title":"PLoS Biol"},{"issue":"1","key":"pcbi.1013271.ref073","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1038\/s41467-023-44516-0","article-title":"Spontaneous emergence of rudimentary music detectors in deep neural networks","volume":"15","author":"G Kim","year":"2024","journal-title":"Nat Commun"},{"key":"pcbi.1013271.ref074","unstructured":"LeCun Y, Boser B, Denker J, Henderson D, Howard R, Hubbard W, et al. Handwritten digit recognition with a back-propagation network. In: Advances in Neural Information Processing Systems. 1989. 
https:\/\/proceedings.neurips.cc\/paper\/1989\/hash\/53c3bce66e43be4f209556518c2fcb54-Abstract.html"},{"key":"pcbi.1013271.ref075","doi-asserted-by":"crossref","unstructured":"Wolf T, Debut L, Sanh V, Chaumond J, Delangue C, Moi A, et al. Transformers: state-of-the-art natural language processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Online: Association for Computational Linguistics; 2020. p. 38\u201345. https:\/\/aclanthology.org\/2020.emnlp-demos.6","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"pcbi.1013271.ref076","doi-asserted-by":"crossref","unstructured":"Wesker T, Meyer B, Wagener K, Anem\u00fcller J, Mertins A, Kollmeier B. Oldenburg logatome speech corpus (OLLO) for speech recognition experiments with humans and machines. In: Interspeech 2005. ISCA; 2005. p. 1273\u20136. https:\/\/www.isca-speech.org\/archive\/interspeech_2005\/wesker05_interspeech.html","DOI":"10.21437\/Interspeech.2005-485"},{"issue":"5","key":"pcbi.1013271.ref077","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1109\/TSA.2002.800560","article-title":"Musical genre classification of audio signals","volume":"10","author":"G Tzanetakis","year":"2002","journal-title":"IEEE Transactions on Speech and Audio Processing"},{"issue":"2","key":"pcbi.1013271.ref078","first-page":"147","article-title":"The GTZAN dataset: Its contents, its faults, their effects on evaluation, and its future use","volume":"43","author":"BL Sturm","year":"2014","journal-title":"Journal of New Music Research"},{"key":"pcbi.1013271.ref079","doi-asserted-by":"crossref","unstructured":"Piczak KJ. ESC. In: Proceedings of the 23rd ACM international conference on Multimedia. 2015. p. 1015\u20138. https:\/\/doi.org\/10.1145\/2733373.2806390","DOI":"10.1145\/2733373.2806390"},{"key":"pcbi.1013271.ref080","unstructured":"McFee B, McVicar M, Faronbi D, Roman I, Gover M, Balke S, et al. librosa\/librosa: 0.10.1. Zenodo; 2023. 
https:\/\/zenodo.org\/record\/8252662"},{"key":"pcbi.1013271.ref081","article-title":"Estimating the carbon footprint of BLOOM, a 176B parameter language model","author":"AS Luccioni","year":"2022","journal-title":"arXiv preprint"},{"issue":"1","key":"pcbi.1013271.ref082","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pcbi.1008598","article-title":"A theory of memory for binary sequences: evidence for a mental compression algorithm in humans","volume":"17","author":"S Planton","year":"2021","journal-title":"PLoS Comput Biol"},{"key":"pcbi.1013271.ref083","doi-asserted-by":"crossref","unstructured":"Raffel C. Learning-Based Methods for Comparing Sequences, with Applications to Audio-to-MIDI Alignment and Matching. 2016. Columbia University. https:\/\/academiccommons.columbia.edu\/doi\/10.7916\/D8N58MHV","DOI":"10.1109\/ICASSP.2016.7471641"},{"issue":"1","key":"pcbi.1013271.ref084","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1109\/TIT.1976.1055501","article-title":"On the complexity of finite sequences","volume":"22","author":"A Lempel","year":"1976","journal-title":"IEEE Transactions on Information Theory"},{"issue":"4","key":"pcbi.1013271.ref085","doi-asserted-by":"crossref","first-page":"1203","DOI":"10.1007\/s10910-008-9512-2","article-title":"Normalized Lempel-Ziv complexity and its application in bio-sequence analysis","volume":"46","author":"Y Zhang","year":"2008","journal-title":"J Math Chem"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1013271","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2025,9,12]],"date-time":"2025-09-12T00:00:00Z","timestamp":1757635200000}}],"container-title":["PLOS Computational 
Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1013271","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,12]],"date-time":"2025-09-12T18:06:54Z","timestamp":1757700414000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1013271"}},"subtitle":[],"editor":[{"given":"Yuanning","family":"Li","sequence":"first","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2025,9,5]]},"references-count":85,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2025,9,5]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1013271","relation":{},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,5]]}}}