{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,28]],"date-time":"2026-05-28T00:35:15Z","timestamp":1779928515875,"version":"3.53.1"},"reference-count":76,"publisher":"MIT Press","license":[{"start":{"date-parts":[[2023,8,16]],"date-time":"2023-08-16T00:00:00Z","timestamp":1692144000000},"content-version":"vor","delay-in-days":227,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,8,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>While natural languages differ widely in both canonical word order and word order flexibility, their word orders still follow shared cross-linguistic statistical patterns, often attributed to functional pressures. In the effort to identify these pressures, prior work has compared real and counterfactual word orders. Yet one functional pressure has been overlooked in such investigations: The uniform information density (UID) hypothesis, which holds that information should be spread evenly throughout an utterance. Here, we ask whether a pressure for UID may have influenced word order patterns cross-linguistically. To this end, we use computational models to test whether real orders lead to greater information uniformity than counterfactual orders. In our empirical study of 10 typologically diverse languages, we find that: (i) among SVO languages, real word orders consistently have greater uniformity than reverse word orders, and (ii) only linguistically implausible counterfactual orders consistently exceed the uniformity of real orders. 
These findings are compatible with a pressure for information uniformity in the development and usage of natural languages.<\/jats:p>","DOI":"10.1162\/tacl_a_00589","type":"journal-article","created":{"date-parts":[[2023,8,16]],"date-time":"2023-08-16T19:47:24Z","timestamp":1692215244000},"page":"1048-1065","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":10,"title":["A Cross-Linguistic Pressure for Uniform Information Density in Word Order"],"prefix":"10.1162","volume":"11","author":[{"given":"Thomas Hikaru","family":"Clark","sequence":"first","affiliation":[{"name":"MIT, USA. thclark@mit.edu"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Clara","family":"Meister","sequence":"additional","affiliation":[{"name":"ETH Z\u00fcrich, Switzerland. meistecl@inf.ethz.ch"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Tiago","family":"Pimentel","sequence":"additional","affiliation":[{"name":"University of Cambridge, UK. tp472@cam.ac.uk"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Michael","family":"Hahn","sequence":"additional","affiliation":[{"name":"Saarland University, Germany. mhahn@lst.uni-saarland.de"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ryan","family":"Cotterell","sequence":"additional","affiliation":[{"name":"ETH Z\u00fcrich, Switzerland. ryan.cotterell@inf.ethz.ch"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Richard","family":"Futrell","sequence":"additional","affiliation":[{"name":"UC Irvine, USA. rfutrell@uci.edu"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Roger","family":"Levy","sequence":"additional","affiliation":[{"name":"MIT, USA. 
rplevy@mit.edu"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"281","published-online":{"date-parts":[[2023,8,15]]},"reference":[{"issue":"1","key":"2023081619465415300_bib1","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1177\/00238309040470010201","article-title":"The smooth signal redundancy hypothesis: A functional explanation for relationships between redundancy, prosodic prominence, and duration in spontaneous speech","volume":"47","author":"Aylett","year":"2004","journal-title":"Language and Speech"},{"issue":"5","key":"2023081619465415300_bib2","doi-asserted-by":"publisher","first-page":"1178","DOI":"10.1037\/a0024194","article-title":"In search of on-line locality effects in sentence comprehension","volume":"37","author":"Bartek","year":"2011","journal-title":"Journal of Experimental Psychology: Learning, Memory, and Cognition"},{"key":"2023081619465415300_bib3","first-page":"174","article-title":"Testing the processing hypothesis of word order variation using a probabilistic language model","volume-title":"Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity","author":"Bloem","year":"2016"},{"key":"2023081619465415300_bib4","article-title":"Evidence for availability effects on speaker choice in the Russian comparative alternation","volume-title":"Proceedings of the 44th Annual Meeting of the Cognitive Science Society","author":"Clark","year":"2022"},{"issue":"5","key":"2023081619465415300_bib5","doi-asserted-by":"publisher","first-page":"651","DOI":"10.1007\/s10936-013-9273-3","article-title":"Information density and dependency length as complementary cognitive models","volume":"43","author":"Collins","year":"2014","journal-title":"Journal of Psycholinguistic Research"},{"key":"2023081619465415300_bib6","doi-asserted-by":"publisher","first-page":"8440","DOI":"10.18653\/v1\/2020.acl-main.747","article-title":"Unsupervised cross-lingual representation learning at scale","volume-title":"Proceedings 
of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Conneau","year":"2020"},{"key":"2023081619465415300_bib7","article-title":"Order of subject, object and verb (v2020.3)","volume-title":"The World Atlas of Language Structures Online","author":"Dryer","year":"2013"},{"key":"2023081619465415300_bib8","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2212.10502","article-title":"A measure-theoretic characterization of tight language models","author":"Li","year":"2022","journal-title":"arXiv preprint arXiv: 2212.10502v1"},{"issue":"2","key":"2023081619465415300_bib9","doi-asserted-by":"publisher","first-page":"143","DOI":"10.1017\/S0272263102002024","article-title":"Frequency effects in language processing","volume":"24","author":"Ellis","year":"2002","journal-title":"Studies in Second Language Acquisition"},{"issue":"3","key":"2023081619465415300_bib10","first-page":"400","article-title":"Konstanz im Kurzzeitged\u00e4chtnis\u2014Konstanz im sprachlichen Informationsflu\u00df","volume":"27","author":"Fenk","year":"1980","journal-title":"Zeitschrift f\u00fcr experimentelle und angewandte Psychologie"},{"issue":"5 Pt 2","key":"2023081619465415300_bib11","doi-asserted-by":"publisher","first-page":"056135","DOI":"10.1103\/PhysRevE.70.056135","article-title":"Euclidean distance between syntactically linked words","volume":"70","author":"Ferrer-i-Cancho","year":"2004","journal-title":"Physical Review E"},{"key":"2023081619465415300_bib12","doi-asserted-by":"publisher","first-page":"311","DOI":"10.1016\/j.physa.2017.10.048","article-title":"Are crossing dependencies really scarce?","volume":"493","author":"Ferrer-i-Cancho","year":"2018","journal-title":"Physica A: Statistical Mechanics and its Applications"},{"key":"2023081619465415300_bib13","article-title":"Speaking rationally: Uniform information density as an optimal strategy for language production","volume-title":"Proceedings of the Cognitive Science 
Society","author":"Frank","year":"2008"},{"key":"2023081619465415300_bib14","doi-asserted-by":"publisher","first-page":"215","DOI":"10.1016\/j.cognition.2014.11.022","article-title":"Cross-linguistic gestures reflect typological universals: A subject-initial, verb-final bias in speakers of diverse languages","volume":"136","author":"Futrell","year":"2015","journal-title":"Cognition"},{"key":"2023081619465415300_bib15","first-page":"688","article-title":"Noisy-context surprisal as a human sentence processing cost model","volume-title":"Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers","author":"Futrell","year":"2017"},{"issue":"2","key":"2023081619465415300_bib16","doi-asserted-by":"publisher","first-page":"371","DOI":"10.1353\/lan.2020.0024","article-title":"Dependency locality as an explanatory principle for word order","volume":"96","author":"Futrell","year":"2020","journal-title":"Language"},{"issue":"33","key":"2023081619465415300_bib17","doi-asserted-by":"publisher","first-page":"10336","DOI":"10.1073\/pnas.1502134112","article-title":"Large-scale evidence of dependency length minimization in 37 languages","volume":"112","author":"Futrell","year":"2015","journal-title":"Proceedings of the National Academy of Sciences, U.S.A."},{"key":"2023081619465415300_bib18","doi-asserted-by":"publisher","first-page":"3","DOI":"10.18653\/v1\/W19-7703","article-title":"Syntactic dependencies correspond to word pairs with high mutual information","volume-title":"Proceedings of the Fifth International Conference on Dependency Linguistics (Depling, SyntaxFest 2019)","author":"Futrell","year":"2019"},{"key":"2023081619465415300_bib19","volume-title":"Die Sprachwissenschaft, ihre Aufgaben, Methoden, und bisherigen Ergebnisse","author":"von der 
Gabelentz","year":"1901","edition":"2nd"},{"key":"2023081619465415300_bib20","doi-asserted-by":"publisher","first-page":"199","DOI":"10.3115\/1073083.1073117","article-title":"Entropy rate constancy in text","volume-title":"Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics","author":"Genzel","year":"2002"},{"key":"2023081619465415300_bib21","doi-asserted-by":"publisher","first-page":"66","DOI":"10.18653\/v1\/w18-6008","article-title":"SUD or surface-syntactic universal dependencies: An annotation scheme near-isomorphic to UD","volume-title":"Proceedings of the Second Workshop on Universal Dependencies, UDW@EMNLP 2018, Brussels, Belgium, November 1, 2018","author":"Gerdes","year":"2018"},{"issue":"1","key":"2023081619465415300_bib22","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/S0010-0277(98)00034-1","article-title":"Linguistic complexity: Locality of syntactic dependencies","volume":"68","author":"Gibson","year":"1998","journal-title":"Cognition"},{"key":"2023081619465415300_bib23","first-page":"95","article-title":"The dependency locality theory: A distance-based theory of linguistic complexity","volume-title":"Image, Language, Brain: Papers from the First Mind Articulation Project Symposium","author":"Gibson","year":"2000"},{"issue":"5","key":"2023081619465415300_bib24","doi-asserted-by":"publisher","first-page":"389","DOI":"10.1016\/j.tics.2019.02.003","article-title":"How efficiency shapes human language","volume":"23","author":"Gibson","year":"2019","journal-title":"Trends in Cognitive Sciences"},{"issue":"7","key":"2023081619465415300_bib25","doi-asserted-by":"publisher","first-page":"1079","DOI":"10.1177\/0956797612463705","article-title":"A noisy-channel account of crosslinguistic word-order variation","volume":"24","author":"Gibson","year":"2013","journal-title":"Psychological 
Science"},{"key":"2023081619465415300_bib26","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1510.02823","article-title":"Human languages order information efficiently","author":"Gildea","year":"2015","journal-title":"arXiv preprint arXiv:1510.02823"},{"key":"2023081619465415300_bib27","first-page":"184","article-title":"Optimizing grammars for minimum dependency length","volume-title":"Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics","author":"Gildea","year":"2007"},{"issue":"2","key":"2023081619465415300_bib28","doi-asserted-by":"publisher","first-page":"286","DOI":"10.1111\/j.1551-6709.2009.01073.x","article-title":"Do grammars minimize dependency length?","volume":"34","author":"Gildea","year":"2010","journal-title":"Cognitive Science"},{"issue":"27","key":"2023081619465415300_bib29","doi-asserted-by":"publisher","first-page":"9163","DOI":"10.1073\/pnas.0710060105","article-title":"The natural order of events: How speakers of different languages represent events nonverbally","volume":"105","author":"Goldin-Meadow","year":"2008","journal-title":"Proceedings of the National Academy of Sciences, U.S.A."},{"key":"2023081619465415300_bib30","volume-title":"Some Universals of Grammar with Particular Reference to the Order of Meaningful Elements","author":"Greenberg","year":"1963"},{"issue":"2","key":"2023081619465415300_bib31","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1207\/s15516709cog0000_7","article-title":"Consequences of the serial nature of linguistic input for sentential complexity","volume":"29","author":"Grodner","year":"2005","journal-title":"Cognitive Science"},{"key":"2023081619465415300_bib32","first-page":"2440","article-title":"Wiki-40b: Multilingual language model dataset","volume-title":"Proceedings of the 12th Language Resources and Evaluation 
Conference","author":"Guo","year":"2020"},{"issue":"5","key":"2023081619465415300_bib33","doi-asserted-by":"publisher","first-page":"2347","DOI":"10.1073\/pnas.1910923117","article-title":"Universals of word order reflect optimization of grammars for efficient communication","volume":"117","author":"Hahn","year":"2020","journal-title":"Proceedings of the National Academy of Sciences, U.S.A."},{"issue":"24","key":"2023081619465415300_bib34","doi-asserted-by":"publisher","first-page":"e2122604119","DOI":"10.1073\/pnas.2122604119","article-title":"Crosslinguistic word order variation reflects evolutionary pressures of dependency and information locality","volume":"119","author":"Hahn","year":"2022","journal-title":"Proceedings of the National Academy of Sciences, U.S.A."},{"key":"2023081619465415300_bib35","doi-asserted-by":"publisher","first-page":"1","DOI":"10.3115\/1073336.1073357","article-title":"A probabilistic Earley parser as a psycholinguistic model","volume-title":"Second Meeting of the North American Chapter of the Association for Computational Linguistics","author":"Hale","year":"2001"},{"key":"2023081619465415300_bib36","doi-asserted-by":"publisher","first-page":"75","DOI":"10.1075\/la.132.04has","article-title":"Parametric versus functional explanations of syntactic universals","volume-title":"The Limits of Syntactic Variation","author":"Haspelmath","year":"2008"},{"issue":"2","key":"2023081619465415300_bib37","first-page":"223","article-title":"A parsing theory of word order universals","volume":"21","author":"Hawkins","year":"1990","journal-title":"Linguistic Inquiry"},{"key":"2023081619465415300_bib38","volume-title":"A Performance Theory of Order and Constituency","author":"Hawkins","year":"1994"},{"key":"2023081619465415300_bib39","doi-asserted-by":"publisher","DOI":"10.1093\/acprof:oso\/9780199252695.001.0001","volume-title":"Efficiency and Complexity in 
Grammars","author":"Hawkins","year":"2004"},{"key":"2023081619465415300_bib40","doi-asserted-by":"publisher","DOI":"10.1093\/acprof:oso\/9780199664993.001.0001","volume-title":"Cross-linguistic Variation and Efficiency","author":"Hawkins","year":"2014"},{"key":"2023081619465415300_bib41","doi-asserted-by":"publisher","DOI":"10.31234\/osf.io\/qjnpv","article-title":"The plausibility of sampling as an algorithmic theory of sentence processing","author":"Hoover","year":"2022","journal-title":"PsyArXiv preprint PsyArXiv:qjnpv"},{"issue":"1","key":"2023081619465415300_bib42","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1016\/j.cogpsych.2010.02.002","article-title":"Redundancy and reduction: Speakers manage syntactic information density","volume":"61","author":"Jaeger","year":"2010","journal-title":"Cognitive Psychology"},{"issue":"3","key":"2023081619465415300_bib43","doi-asserted-by":"publisher","first-page":"323","DOI":"10.1002\/wcs.126","article-title":"On language \u2018utility\u2019: Processing complexity and communicative efficiency","volume":"2","author":"Jaeger","year":"2011","journal-title":"Wiley Interdisciplinary Reviews: Cognitive Science"},{"key":"2023081619465415300_bib44","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1412.6980","article-title":"Adam: A method for stochastic optimization","author":"Kingma","year":"2017","journal-title":"arXiv preprint arXiv:1412.6980"},{"key":"2023081619465415300_bib45","first-page":"1318","article-title":"Translationese and its dialects","volume-title":"Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies","author":"Koppel","year":"2011"},{"key":"2023081619465415300_bib46","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-14568-1_3","volume-title":"Projective Dependency 
Structures","author":"Kuhlmann","year":"2010"},{"issue":"3","key":"2023081619465415300_bib47","doi-asserted-by":"publisher","first-page":"1126","DOI":"10.1016\/j.cognition.2007.05.006","article-title":"Expectation-based syntactic comprehension","volume":"106","author":"Levy","year":"2008","journal-title":"Cognition"},{"key":"2023081619465415300_bib48","doi-asserted-by":"publisher","first-page":"684","DOI":"10.31234\/osf.io\/4cgxh","article-title":"Communicative efficiency, uniform information density, and the rational speech act theory","volume-title":"Proceedings for the 40th Annual Meeting of the Cognitive Science Society","author":"Levy","year":"2018"},{"key":"2023081619465415300_bib49","article-title":"Speakers optimize information density through syntactic reduction","volume-title":"Advances in Neural Information Processing Systems","author":"Levy","year":"2006"},{"key":"2023081619465415300_bib50","first-page":"849","article-title":"Speakers optimize information density through syntactic reduction","volume":"19","author":"Levy","year":"2007","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"2","key":"2023081619465415300_bib51","doi-asserted-by":"publisher","first-page":"159","DOI":"10.17791\/jcs.2008.9.2.159","article-title":"Dependency distance as a metric of language comprehension difficulty","volume":"9","author":"Liu","year":"2008","journal-title":"Journal of Cognitive Science"},{"key":"2023081619465415300_bib52","doi-asserted-by":"publisher","first-page":"22","DOI":"10.1016\/j.cogpsych.2016.06.002","article-title":"Limits on lexical prediction during reading","volume":"88","author":"Luke","year":"2016","journal-title":"Cognitive Psychology"},{"key":"2023081619465415300_bib53","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2201.12911","article-title":"Experimentally measuring the redundancy of grammatical cues in transitive clauses","author":"Mahowald","year":"2022","journal-title":"arXiv preprint 
arXiv:2201.12911"},{"issue":"2","key":"2023081619465415300_bib54","doi-asserted-by":"publisher","first-page":"255","DOI":"10.1162\/coli_a_00402","article-title":"Universal Dependencies","volume":"47","author":"de Marneffe","year":"2021","journal-title":"Computational Linguistics"},{"key":"2023081619465415300_bib55","first-page":"1585","article-title":"Why are some word orders more common than others? A Uniform Information Density account","volume-title":"Advances in Neural Information Processing Systems","author":"Maurits","year":"2010"},{"key":"2023081619465415300_bib56","doi-asserted-by":"publisher","first-page":"963","DOI":"10.18653\/v1\/2021.emnlp-main.74","article-title":"Revisiting the Uniform Information Density hypothesis","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Meister","year":"2021"},{"key":"2023081619465415300_bib57","doi-asserted-by":"publisher","first-page":"48","DOI":"10.18653\/v1\/N19-4009","article-title":"fairseq: A fast, extensible toolkit for sequence modeling","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)","author":"Ott","year":"2019"},{"issue":"9","key":"2023081619465415300_bib58","doi-asserted-by":"publisher","first-page":"3526","DOI":"10.1073\/pnas.1012551108","article-title":"Word lengths are optimized for efficient communication","volume":"108","author":"Piantadosi","year":"2011","journal-title":"Proceedings of the National Academy of Sciences, U.S.A."},{"key":"2023081619465415300_bib59","doi-asserted-by":"publisher","first-page":"31","DOI":"10.18653\/v1\/2021.eacl-main.3","article-title":"Disambiguatory signals are stronger in word-initial positions","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main 
Volume","author":"Pimentel","year":"2021"},{"key":"2023081619465415300_bib60","doi-asserted-by":"publisher","first-page":"949","DOI":"10.18653\/v1\/2021.eacl-main.3","article-title":"A surprisal\u2013duration trade-off across and within the world\u2019s languages","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Pimentel","year":"2021"},{"key":"2023081619465415300_bib61","doi-asserted-by":"publisher","first-page":"4426","DOI":"10.18653\/v1\/2021.naacl-main.350","article-title":"How (non-)optimal is the lexicon?","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Pimentel","year":"2021"},{"key":"2023081619465415300_bib62","doi-asserted-by":"publisher","first-page":"3532","DOI":"10.18653\/v1\/N19-1356","article-title":"Studying the inductive biases of RNNs with synthetic variations of natural languages","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Ravfogel","year":"2019"},{"key":"2023081619465415300_bib63","doi-asserted-by":"publisher","first-page":"95","DOI":"10.1075\/bjl.1.05rij","article-title":"Word order universals revisited: The principle of head proximity","volume":"1","author":"Rijkhoff","year":"1986","journal-title":"Belgian Journal of Linguistics"},{"issue":"1","key":"2023081619465415300_bib64","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1515\/ling.1990.28.1.5","article-title":"Explaining word order in the noun phrase","volume":"28","author":"Rijkhoff","year":"1990","journal-title":"Linguistics"},{"key":"2023081619465415300_bib65","doi-asserted-by":"publisher","first-page":"1715","DOI":"10.18653\/v1\/P16-1162","article-title":"Neural machine translation of rare words with subword 
units","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Sennrich","year":"2016"},{"key":"2023081619465415300_bib66","doi-asserted-by":"publisher","DOI":"10.31234\/osf.io\/4hyna","article-title":"Large-scale evidence for logarithmic effects of word predictability on reading time","author":"Shain","year":"2022","journal-title":"PsyArXiv preprint PsyArXiv:4hyna"},{"issue":"3","key":"2023081619465415300_bib67","doi-asserted-by":"publisher","first-page":"379","DOI":"10.1002\/j.1538-7305.1948.tb01338.x","article-title":"A mathematical theory of communication","volume":"27","author":"Shannon","year":"1948","journal-title":"Bell System Technical Journal"},{"issue":"3","key":"2023081619465415300_bib68","doi-asserted-by":"publisher","first-page":"302","DOI":"10.1016\/j.cognition.2013.02.013","article-title":"The effect of word predictability on reading time is logarithmic","volume":"128","author":"Smith","year":"2013","journal-title":"Cognition"},{"key":"2023081619465415300_bib69","doi-asserted-by":"publisher","first-page":"88","DOI":"10.18653\/v1\/K17-3009","article-title":"Tokenizing, POS tagging, lemmatizing and parsing UD 2.0 with UDPipe","volume-title":"Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies","author":"Straka","year":"2017"},{"key":"2023081619465415300_bib70","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1146\/annurev-linguistics-011817-045617","article-title":"Minimizing syntactic dependency lengths: Typological\/cognitive universal?","volume":"4","author":"Temperley","year":"2018","journal-title":"Annual Review of Linguistics"},{"key":"2023081619465415300_bib71","article-title":"Attention is all you need","volume-title":"Advances in Neural Information Processing 
Systems","author":"Vaswani","year":"2017"},{"key":"2023081619465415300_bib72","doi-asserted-by":"publisher","first-page":"454","DOI":"10.18653\/v1\/2021.acl-long.38","article-title":"Examining the inductive bias of neural language models with artificial languages","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"White","year":"2021"},{"issue":"s3","key":"2023081619465415300_bib73","doi-asserted-by":"publisher","DOI":"10.1515\/lingvan-2019-0070","article-title":"Do dependency lengths explain constraints on crossing dependencies?","volume":"7","author":"Yadav","year":"2021","journal-title":"Linguistics Vanguard"},{"key":"2023081619465415300_bib74","doi-asserted-by":"publisher","first-page":"1997","DOI":"10.18653\/v1\/N18-1181","article-title":"Comparing theories of speaker choice using a model of classifier production in Mandarin Chinese","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)","author":"Zhan","year":"2018"},{"key":"2023081619465415300_bib75","volume-title":"The Psycho-biology of Language: An Introduction to Dynamic Philology","author":"Zipf","year":"1935"},{"key":"2023081619465415300_bib76","doi-asserted-by":"publisher","first-page":"4809","DOI":"10.18653\/v1\/2020.emnlp-main.390","article-title":"Please mind the root: Decoding arborescences for dependency parsing","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Zmigrod","year":"2020"}],"container-title":["Transactions of the Association for Computational 
Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00589\/2154495\/tacl_a_00589.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00589\/2154495\/tacl_a_00589.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,8,16]],"date-time":"2023-08-16T19:47:41Z","timestamp":1692215261000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/doi\/10.1162\/tacl_a_00589\/117221\/A-Cross-Linguistic-Pressure-for-Uniform"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023]]},"references-count":76,"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00589","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023]]},"published":{"date-parts":[[2023]]}}}