{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,6]],"date-time":"2026-04-06T23:34:55Z","timestamp":1775518495423,"version":"3.50.1"},"reference-count":70,"publisher":"MIT Press","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Transactions of the Association for Computational Linguistics"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:p>We present methods for calculating a measure of phonotactic complexity\u2014bits per phoneme\u2014 that permits a straightforward cross-linguistic comparison. When given a word, represented as a sequence of phonemic segments such as symbols in the international phonetic alphabet, and a statistical model trained on a sample of word types from the language, we can approximately measure bits per phoneme using the negative log-probability of that word under the model. This simple measure allows us to compare the entropy across languages, giving insight into how complex a language\u2019s phonotactics is. Using a collection of 1016 basic concept words across 106 languages, we demonstrate a very strong negative correlation of \u2212 0.74 between bits per phoneme and the average length of words.<\/jats:p>","DOI":"10.1162\/tacl_a_00296","type":"journal-article","created":{"date-parts":[[2020,1,23]],"date-time":"2020-01-23T19:20:38Z","timestamp":1579807238000},"page":"1-18","source":"Crossref","is-referenced-by-count":14,"title":["Phonotactic Complexity and Its Trade-offs"],"prefix":"10.1162","volume":"8","author":[{"given":"Tiago","family":"Pimentel","sequence":"first","affiliation":[{"name":"University of Cambridge."}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Brian","family":"Roark","sequence":"additional","affiliation":[{"name":"Google."}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ryan","family":"Cotterell","sequence":"additional","affiliation":[{"name":"University of Cambridge."}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"281","reference":[{"issue":"1","key":"bib1","first-page":"31","volume":"18","author":"Brown Peter F.","year":"1992","journal-title":"Computational Linguistics"},{"key":"bib2","first-page":"34","author":"Colin Cherry E.","year":"1953","journal-title":"Language"},{"issue":"2","key":"bib3","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1017\/S0022226700001134","volume":"1","author":"Chomsky Noam","year":"1965","journal-title":"Journal of Linguistics"},{"issue":"2","key":"bib4","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1017\/S0305000903005579","volume":"30","author":"Coady Jeffry A.","year":"2003","journal-title":"Journal of Child Language"},{"issue":"3","key":"bib5","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1016\/j.jecp.2004.07.004","volume":"89","author":"Coady Jeffry A.","year":"2004","journal-title":"Journal of Experimental Child Psychology"},{"key":"bib6","doi-asserted-by":"crossref","first-page":"1182","DOI":"10.18653\/v1\/P17-1109","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Cotterell Ryan","year":"2017"},{"key":"bib7","first-page":"536","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)","author":"Cotterell Ryan","year":"2018"},{"issue":"9","key":"bib8","doi-asserted-by":"crossref","first-page":"eaaw2594","DOI":"10.1126\/sciadv.aaw2594","volume":"5","author":"Coup\u00e9 Christophe","year":"2019","journal-title":"Science Advances"},{"key":"bib9","volume-title":"Elements of Information Theory","author":"Cover Thomas M.","year":"2012"},{"key":"bib10","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1016\/j.cognition.2017.02.001","volume":"163","author":"Dautriche Isabelle","year":"2017","journal-title":"Cognition"},{"key":"bib11","volume-title":"First International Workshop on Computational Linguistics for Uralic Languages","author":"Dellert Johannes","year":"2015"},{"key":"bib12","unstructured":"Johannes Dellert. 2017. Information-Theoretic Causal Inference of Lexical Flow. Ph.D. thesis, University of T\u00fcbingen."},{"key":"bib13","unstructured":"Johannes Dellert and Gerhard J\u00e4ger. 2017. NorthEuraLex (version 0.9). http:\/\/northeuralex.org\/"},{"key":"bib14","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00047"},{"issue":"3","key":"bib15","doi-asserted-by":"crossref","first-page":"1155","DOI":"10.1016\/j.cognition.2007.11.009","volume":"107","author":"Goldrick Matthew","year":"2008","journal-title":"Cognition"},{"key":"bib16","volume-title":"Proceedings of the Stockholm Workshop on Variation within Optimality Theory","author":"Goldwater Sharon","year":"2003"},{"key":"bib17","first-page":"459","volume-title":"Advances in Neural Information Processing Systems","author":"Goldwater Sharon","year":"2006"},{"key":"bib18","unstructured":"Kyle Gorman. 2013. Generative phonotactics. Ph.D. thesis, University of Pennsylvania."},{"key":"bib19","volume-title":"Language universals, with special reference to feature hierarchies","author":"Greenberg Joseph","year":"1966"},{"key":"bib20","volume-title":"Universals of Human Language. Vol. 2: Phonology","author":"Greenberg Joseph H.","year":"1978"},{"key":"bib21","first-page":"1","volume-title":"Proceedings of the 2nd meeting of the North American Chapter of the Association for Computational Linguistics","author":"Hale John","year":"2001"},{"issue":"2","key":"bib22","volume":"4","author":"Hall Kathleen Currie","year":"2018","journal-title":"Linguistics Vanguard"},{"key":"bib23","volume-title":"The Sound Pattern of Russian","author":"Halle Morris","year":"1959"},{"key":"bib24","first-page":"294","volume-title":"Linguistic Theory and Psychological Reality","author":"Halle Morris","year":"1978"},{"key":"bib25","doi-asserted-by":"publisher","DOI":"10.1162\/ling.2008.39.3.379"},{"key":"bib26","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"bib27","volume-title":"A manual of phonology","author":"Hockett Charles Francis","year":"1955"},{"key":"bib28","volume-title":"A course in modern linguistics","author":"Hockett Charles Francis","year":"1958"},{"key":"bib29","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1093\/acprof:oso\/9780199573745.003.0002","volume-title":"Origins of sound change: Approaches to phonologization","author":"Hume Elizabeth","year":"2013"},{"key":"bib30","first-page":"1211","author":"Ishwaran Hemant","year":"2003","journal-title":"Statistica Sinica"},{"key":"bib31","first-page":"467","volume-title":"Architectures, Rules, and Preferences: Variations on Themes by Joan W. Bresnan","author":"J\u00e4ger Gerhard","year":"2007"},{"key":"bib32","volume-title":"Proceedings of the Workshop on Pattern Recognition in Practice, 1980","author":"Jelinek Frederick","year":"1980"},{"issue":"3","key":"bib33","first-page":"331","volume":"20","author":"Kaplan Ronald M.","year":"1994","journal-title":"Computational Linguistics"},{"key":"bib34","doi-asserted-by":"crossref","first-page":"284","DOI":"10.18653\/v1\/P18-1027","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Khandelwal Urvashi","year":"2018"},{"key":"bib35","volume-title":"Proceedings of the 11th Language Resources and Evaluation Conference","author":"Kirov Christo","year":"2018"},{"issue":"3","key":"bib36","doi-asserted-by":"crossref","first-page":"1126","DOI":"10.1016\/j.cognition.2007.05.006","volume":"106","author":"Levy Roger","year":"2008","journal-title":"Cognition"},{"key":"bib37","first-page":"62","author":"Lindblom Bj\u00f6rn","year":"1988","journal-title":"Language, Speech, and Mind"},{"issue":"1","key":"bib38","first-page":"106","volume":"10","author":"Maddieson Ian","year":"2006","journal-title":"Linguistic Typology"},{"key":"bib39","first-page":"85","volume-title":"Approaches to phonological complexity","author":"Maddieson Ian","year":"2009"},{"key":"bib40","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511753459","volume-title":"Patterns of Sounds","author":"Maddieson Ian","year":"1984"},{"issue":"8","key":"bib41","doi-asserted-by":"crossref","first-page":"3116","DOI":"10.1111\/cogs.12689","volume":"42","author":"Mahowald Kyle","year":"2018","journal-title":"Cognitive Science"},{"key":"bib42","volume-title":"\u00c9conomie des changements phon\u00e9tiques","author":"Martinet Andr\u00e9","year":"1955"},{"issue":"2","key":"bib43","first-page":"125","volume":"5","author":"McWhorter John","year":"2001","journal-title":"Linguistic Typology"},{"key":"bib44","author":"Merity Stephen","year":"2018","journal-title":"arXiv preprint arXiv:1803.08240"},{"key":"bib45","doi-asserted-by":"crossref","first-page":"4975","DOI":"10.18653\/v1\/P19-1491","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Mielke Sebastian J.","year":"2019"},{"key":"bib46","first-page":"11","volume-title":"FinEst Linguistics, Proceedings of the Annual Finnish and Estonian Conference of Linguistics","author":"Miestamo Matti","year":"2006"},{"key":"bib47","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1075\/slcs.94.04mie","volume-title":"Language complexity: Typology, contact, change","author":"Miestamo Matti","year":"2008"},{"key":"bib48","volume-title":"Eleventh Annual Conference of the International Speech Communication Association","author":"Mikolov Tom\u00e1\u0161","year":"2010"},{"key":"bib49","first-page":"217","volume-title":"Measuring grammatical complexity","author":"Moran Steven","year":"2014"},{"key":"bib50","volume-title":"PHOIBLE Online","author":"Moran Steven","year":"2014"},{"key":"bib51","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1515\/ling.1995.33.2.359","volume":"33","author":"Nettle Daniel","year":"1995","journal-title":"Linguistics"},{"issue":"3","key":"bib52","doi-asserted-by":"crossref","first-page":"539","DOI":"10.1353\/lan.2011.0057","volume":"87","author":"Pellegrino Fran\u00e7ois","year":"2011","journal-title":"Language"},{"issue":"9","key":"bib53","doi-asserted-by":"crossref","first-page":"3526","DOI":"10.1073\/pnas.1012551108","volume":"108","author":"Piantadosi Steven T.","year":"2011","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"3","key":"bib54","doi-asserted-by":"crossref","first-page":"280","DOI":"10.1016\/j.cognition.2011.10.004","volume":"122","author":"Piantadosi Steven T.","year":"2012","journal-title":"Cognition"},{"key":"bib55","first-page":"2582","volume-title":"The 31st Annual Meeting of the Cognitive Science Society (CogSci09)","author":"Piantadosi Steven T.","year":"2009"},{"issue":"2","key":"bib56","volume":"4","author":"Priva Uriel Cohen","year":"2018","journal-title":"Linguistics Vanguard"},{"issue":"1","key":"bib57","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1515\/LINGTY.2006.001","volume":"10","author":"Shosted Ryan K.","year":"2006","journal-title":"Linguistic Typology"},{"key":"bib58","first-page":"2951","volume-title":"Advances in Neural Information Processing Systems","author":"Snoek Jasper","year":"2012"},{"key":"bib59","volume-title":"Eighth European Conference on Speech Communication and Technology (Eurospeech)","author":"van Son R.J.J.H.","year":"2003"},{"key":"bib60","doi-asserted-by":"crossref","first-page":"393","DOI":"10.2307\/411542","author":"Stanley Richard","year":"1967","journal-title":"Language"},{"issue":"6","key":"bib61","doi-asserted-by":"crossref","first-page":"1321","DOI":"10.1044\/1092-4388(2001\/103)","volume":"44","author":"Storkel Holly L.","year":"2001","journal-title":"Journal of Speech, Language, and Hearing Research"},{"issue":"6","key":"bib62","doi-asserted-by":"crossref","first-page":"1312","DOI":"10.1044\/1092-4388(2003\/102)","volume":"46","author":"Storkel Holly L.","year":"2003","journal-title":"Journal of Speech, Language, and Hearing Research"},{"issue":"6","key":"bib63","doi-asserted-by":"crossref","first-page":"1175","DOI":"10.1044\/1092-4388(2006\/085)","volume":"49","author":"Storkel Holly L.","year":"2006","journal-title":"Journal of Speech, Language, and Hearing Research"},{"issue":"2","key":"bib64","doi-asserted-by":"crossref","first-page":"497","DOI":"10.3758\/BRM.42.2.497","volume":"42","author":"Storkel Holly L.","year":"2010","journal-title":"Behavior Research Methods"},{"issue":"2","key":"bib65","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1080\/01690961003787609","volume":"26","author":"Storkel Holly L.","year":"2011","journal-title":"Language and Cognitive Processes"},{"key":"bib66","volume-title":"Thirteenth Annual Conference of the International Speech Communication Association","author":"Sundermeyer Martin","year":"2012"},{"issue":"2","key":"bib67","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1086\/464321","volume":"21","author":"Swadesh Morris","year":"1955","journal-title":"International Journal of American Linguistics"},{"key":"bib68","volume-title":"Grundz\u00fcge der phonologie","author":"Trubetzkoy Nikola\u00ef Sergeyevich","year":"1938"},{"issue":"3","key":"bib69","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1006\/jmla.1998.2618","volume":"40","author":"Vitevitch Michael S.","year":"1999","journal-title":"Journal of Memory and Language"},{"key":"bib70","volume-title":"The Psycho-Biology of Language: An Introduction to Dynamic Philology","author":"Zipf George Kingsley","year":"1935"}],"container-title":["Transactions of the Association for Computational Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/tacl_a_00296","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,10,12]],"date-time":"2022-10-12T17:12:30Z","timestamp":1665594750000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/43538"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12]]},"references-count":70,"alternative-id":["10.1162\/tacl_a_00296"],"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00296","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,12]]}}}