{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,19]],"date-time":"2025-05-19T16:28:57Z","timestamp":1747672137927},"reference-count":136,"publisher":"MIT Press","issue":"3","license":[{"start":{"date-parts":[[2021,6,30]],"date-time":"2021-06-30T00:00:00Z","timestamp":1625011200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,11,3]]},"abstract":"<jats:p>Taxonomies of writing systems since Gelb (1952) have classified systems based on what the written symbols represent: if they represent words or morphemes, they are logographic; if syllables, syllabic; if segments, alphabetic; and so forth. Sproat (2000) and Rogers (2005) broke with tradition by splitting the logographic and phonographic aspects into two dimensions, with logography being graded rather than a categorical distinction. A system could be syllabic, and highly logographic; or alphabetic, and mostly non-logographic. This accords better with how writing systems actually work, but neither author proposed a method for measuring logography.<\/jats:p><jats:p>In this article we propose a novel measure of the degree of logography that uses an attention-based sequence-to-sequence model trained to predict the spelling of a token from its pronunciation in context. In an ideal phonographic system, the model should need to attend to only the current token in order to compute how to spell it, and this would show in the attention matrix activations. In contrast, with a logographic system, where a given pronunciation might correspond to several different spellings, the model would need to attend to a broader context. The ratio of the activation outside the token and the total activation forms the basis of our measure. 
We compare this with a simple lexical measure, and an entropic measure, as well as several other neural models, and argue that on balance our attention-based measure accords best with intuition about how logographic various systems are.<\/jats:p><jats:p>Our work provides the first quantifiable measure of the notion of logography that accords with linguistic intuition and, we argue, provides better insight into what this notion means.<\/jats:p>","DOI":"10.1162\/coli_a_00409","type":"journal-article","created":{"date-parts":[[2021,6,30]],"date-time":"2021-06-30T19:10:58Z","timestamp":1625080258000},"page":"477-528","update-policy":"http:\/\/dx.doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":7,"title":["The Taxonomy of Writing Systems: How to Measure How Logographic a System Is"],"prefix":"10.1162","volume":"47","author":[{"given":"Richard","family":"Sproat","sequence":"first","affiliation":[{"name":"Search Google, Japan. rws@google.com"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alexander","family":"Gutkin","sequence":"additional","affiliation":[{"name":"Research & Machine Intelligence, Google, UK. 
agutkin@google.com"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"281","published-online":{"date-parts":[[2021,11,3]]},"reference":[{"key":"2021111022501968800_bib1","article-title":"TensorFlow: Large-scale machine learning on heterogeneous distributed systems","volume":"abs\/1603.04467","author":"Abadi","year":"2016","journal-title":"CoRR"},{"key":"2021111022501968800_bib2","doi-asserted-by":"crossref","first-page":"393","DOI":"10.1017\/9781316155752.017","article-title":"Learning to read Finnish","volume-title":"Learning to Read Across Languages and Writing Systems","author":"Aro","year":"2017"},{"key":"2021111022501968800_bib3","article-title":"Machine translation by jointly learning to align and translate","volume-title":"Proceedings of 3rd International Conference on Learning Representations (ICLR)","author":"Bahdanau","year":"2015"},{"key":"2021111022501968800_bib4","article-title":"An empirical evaluation of generic convolutional and recurrent networks for sequence modeling","author":"Bai","year":"2018","journal-title":"arXiv preprint arXiv:1803.01271"},{"key":"2021111022501968800_bib5","doi-asserted-by":"crossref","first-page":"2664","DOI":"10.18653\/v1\/2020.emnlp-main.211","article-title":"Losing heads in the lottery: Pruning transformer attention in neural machine translation","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Behnke","year":"2020"},{"key":"2021111022501968800_bib6","doi-asserted-by":"crossref","first-page":"73","DOI":"10.18653\/v1\/W16-0508","article-title":"Predicting the spelling difficulty of words for language learners","volume-title":"Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications","author":"Beinborn","year":"2016"},{"key":"2021111022501968800_bib7","first-page":"312","article-title":"Modern Hebrew","volume-title":"The Semitic 
Languages","author":"Berman","year":"1997"},{"issue":"1","key":"2021111022501968800_bib8","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1006\/jecp.1993.1001","article-title":"The effect of oral and written language input on children\u2019s phonological awareness: A cross-linguistic study","volume":"55","author":"Caravolas","year":"1993","journal-title":"Journal of Experimental Child Psychology"},{"key":"2021111022501968800_bib9","doi-asserted-by":"crossref","first-page":"1724","DOI":"10.3115\/v1\/D14-1179","article-title":"Learning phrase representations using RNN encoder\u2013decoder for statistical machine translation","volume-title":"Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Cho","year":"2014"},{"key":"2021111022501968800_bib10","volume-title":"The Sound Pattern of English","author":"Chomsky","year":"1968"},{"issue":"2","key":"2021111022501968800_bib11","doi-asserted-by":"crossref","first-page":"375","DOI":"10.1007\/s10579-014-9287-y","article-title":"A massively parallel corpus: The Bible in 100 languages","volume":"49","author":"Christodoulopoulos","year":"2015","journal-title":"Language Resources and Evaluation"},{"key":"2021111022501968800_bib12","doi-asserted-by":"crossref","first-page":"276","DOI":"10.18653\/v1\/W19-4828","article-title":"What does BERT look at? 
An analysis of BERT\u2019s attention","volume-title":"Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP","author":"Clark","year":"2019"},{"key":"2021111022501968800_bib13","volume-title":"Writing Systems of the World","author":"Coulmas","year":"1989"},{"volume-title":"Writing Systems: An Introduction to Their Linguistic Analysis","year":"2003","author":"Coulmas","key":"2021111022501968800_bib14"},{"key":"2021111022501968800_bib15","doi-asserted-by":"crossref","first-page":"2978","DOI":"10.18653\/v1\/P19-1285","article-title":"Transformer-XL: Attentive language models beyond a fixed-length context","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL)","author":"Dai","year":"2019"},{"key":"2021111022501968800_bib16","first-page":"141","article-title":"Methods of decipherment","volume-title":"The World\u2019s Writing Systems","author":"Daniels","year":"1996"},{"key":"2021111022501968800_bib17","first-page":"3","article-title":"The study of writing systems","volume-title":"The World\u2019s Writing Systems","author":"Daniels","year":"1996"},{"volume-title":"An Exploration of Writing","year":"2018","author":"Daniels","key":"2021111022501968800_bib18"},{"key":"2021111022501968800_bib19","first-page":"933","article-title":"Language modeling with gated convolutional networks","volume-title":"Proceedings of the 34th International Conference on Machine Learning (ICML)","author":"Dauphin","year":"2017"},{"key":"2021111022501968800_bib20","doi-asserted-by":"crossref","DOI":"10.1515\/9780824841621","volume-title":"Visible Speech: The Diverse Oneness of Writing Systems","author":"DeFrancis","year":"1989"},{"volume-title":"Reading in the Brain","year":"2009","author":"Dehaene","key":"2021111022501968800_bib21"},{"key":"2021111022501968800_bib22","article-title":"Universal transformers","volume-title":"7th 
International Conference on Learning Representations (ICLR)","author":"Dehghani","year":"2019"},{"key":"2021111022501968800_bib23","first-page":"1","article-title":"Latent alignment and variational attention","volume-title":"32nd Conference on Neural Information Processing Systems (NeurIPS)","author":"Deng","year":"2018"},{"key":"2021111022501968800_bib24","doi-asserted-by":"crossref","first-page":"399","DOI":"10.18653\/v1\/P16-1038","article-title":"Grapheme-to-phoneme models for (almost) any language","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Deri","year":"2016"},{"key":"2021111022501968800_bib25","first-page":"4171","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Devlin","year":"2019"},{"volume-title":"The Alphabet: A Key to the History of Mankind","year":"1958","author":"Diringer","key":"2021111022501968800_bib26"},{"issue":"165","key":"2021111022501968800_bib27","first-page":"8","article-title":"Korean ro\u0306manizatio\u0306n: Is it finally time for the library of congress to stop promoting Mccune-Reischauer and adopt the Revised Romanization scheme?","volume":"2017","author":"Doll","year":"2017","journal-title":"Journal of East Asian Libraries"},{"volume-title":"Alphabetic Labyrinth: The Letters in History and Imagination","year":"1995","author":"Drucker","key":"2021111022501968800_bib28"},{"issue":"3","key":"2021111022501968800_bib29","first-page":"1","article-title":"The Kestrel TTS text normalization system","volume":"21","author":"Ebden","year":"2014","journal-title":"Natural Language Engineering"},{"key":"2021111022501968800_bib30","first-page":"31","article-title":"Literacy acquisition in Danish: A deep 
orthography in cross-linguistic light","volume-title":"Handbook of Orthography and Literacy","author":"Elbro","year":"2006"},{"key":"2021111022501968800_bib31","first-page":"301","article-title":"Beneath the surface of developmental dyslexia","volume-title":"Surface Dyslexia: Cognitive and Neuropsychological Studies of Phonological Reading","author":"Frith","year":"1985"},{"key":"2021111022501968800_bib32","first-page":"1243","article-title":"Convolutional sequence to sequence learning","volume-title":"Proceedings of the 34th International Conference on Machine Learning","author":"Gehring","year":"2017"},{"volume-title":"A Study of Writing","year":"1952","author":"Gelb","key":"2021111022501968800_bib33"},{"volume-title":"Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems","year":"2019","author":"G\u00e9ron","key":"2021111022501968800_bib34"},{"issue":"1","key":"2021111022501968800_bib35","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1111\/j.1467-968X.1979.tb00359.x","article-title":"The alloglottography of Old Persian","volume":"77","author":"Gershevitch","year":"1979","journal-title":"Transactions of the Philological Society"},{"key":"2021111022501968800_bib36","first-page":"249","article-title":"Understanding the difficulty of training deep feedforward neural networks","volume-title":"Proceedings of the 13th International Conference on Artificial Intelligence and Statistics","author":"Glorot","year":"2010"},{"key":"2021111022501968800_bib37","first-page":"1349","article-title":"Improving homograph disambiguation with supervised machine learning","volume-title":"Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)","author":"Gorman","year":"2018"},{"key":"2021111022501968800_bib38","first-page":"58563","article-title":"Chinese Gigaword","volume":"1","author":"Graff","year":"2005","journal-title":"LDC Catalog No.: 
LDC2003T09"},{"key":"2021111022501968800_bib39","article-title":"Hyperbolic attention networks","volume-title":"7th International Conference on Learning Representations (ICLR)","author":"G\u00fcl\u00e7ehre","year":"2019"},{"key":"2021111022501968800_bib40","first-page":"569","article-title":"Ethiopic writing","volume-title":"The World\u2019s Writing Systems","author":"Haile","year":"1996"},{"volume-title":"The Kodansha Kanji Learner\u2019s Dictionary: Revised and Expanded","year":"2013","author":"Halpern","key":"2021111022501968800_bib41"},{"key":"2021111022501968800_bib42","doi-asserted-by":"crossref","DOI":"10.1163\/9789004352223","volume-title":"Sinography: The Borrowing and Adaptation of the Chinese Script","author":"Handel","year":"2019"},{"volume-title":"Signs of Writing","year":"1995","author":"Harris","key":"2021111022501968800_bib43"},{"issue":"1","key":"2021111022501968800_bib44","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1016\/S0306-4573(00)00024-8","article-title":"Aspects of Swedish morphology and semantics from the perspective of mono- and cross-language information retrieval","volume":"37","author":"Hedlund","year":"2001","journal-title":"Information Processing & Management"},{"key":"2021111022501968800_bib45","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9781139020411","volume-title":"Matrix Analysis","author":"Horn","year":"2012"},{"key":"2021111022501968800_bib46","doi-asserted-by":"crossref","first-page":"533","DOI":"10.4324\/9780429025563-21","article-title":"Pre-modern Hebrew: Biblical Hebrew","volume-title":"The Semitic Languages","author":"Hornkohl","year":"2019"},{"key":"2021111022501968800_bib47","unstructured":"Horodeck, Richard. 1987. The Role of Sound in Reading and Writing Kanji. Ph.D. 
thesis, Cornell University, Ithaca, NY."},{"volume-title":"Vozniknovenie i razvitie pisma \/ \u0412\u043e\u0437\u043d\u0438\u043a\u043d\u043e\u0432\u0435\u043d\u0438\u0435 \u0438 \u0420\u0430\u0437\u0432\u0438\u0442\u0438\u0435 \u041f\u0438\u0441\u044c\u043c\u0430","year":"1965","author":"Istrin","key":"2021111022501968800_bib48"},{"key":"2021111022501968800_bib49","unstructured":"Jenkins, John H., Richard Cook, and Ken Lunde. 2020. Unicode Han database (UniHan), Unicode Consortium. Standard Annex #38, Unicode Consortium. Revision 29."},{"issue":"1","key":"2021111022501968800_bib50","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1075\/wll.14.1.04joy","article-title":"The significance of the morphographic principle for the classification of writing systems","volume":"14","author":"Joyce","year":"2011","journal-title":"Written Language and Literacy"},{"key":"2021111022501968800_bib51","volume-title":"Speech and Language Processing","author":"Jurafsky","year":"2009","edition":"2nd edition."},{"issue":"3","key":"2021111022501968800_bib52","doi-asserted-by":"publisher","first-page":"263","DOI":"10.1111\/j.1467-9450.2005.00456.x","article-title":"Orthography as a handicap? 
A direct comparison of spelling acquisition in Danish and Icelandic","volume":"46","author":"Juul","year":"2005","journal-title":"Scandinavian Journal of Psychology"},{"key":"2021111022501968800_bib53","article-title":"Depthwise separable convolutions for neural machine translation","volume-title":"Proceedings of the 6th International Conference on Learning Representations (ICLR)","author":"Kaiser","year":"2018"},{"key":"2021111022501968800_bib54","article-title":"Neural machine translation in linear time","author":"Kalchbrenner","year":"2016","journal-title":"arXiv preprint arXiv:1610.10099"},{"key":"2021111022501968800_bib55","first-page":"218","article-title":"Korean writing","volume-title":"The World\u2019s Writing Systems","author":"King","year":"1996"},{"key":"2021111022501968800_bib56","article-title":"Adam: A method for stochastic optimization","author":"Kingma","year":"2014","journal-title":"arXiv preprint arXiv:1412.6980"},{"article-title":"LanguageNet Grapheme-to-Phoneme Transducers","year":"2018","author":"Kirchhoff","key":"2021111022501968800_bib57"},{"key":"2021111022501968800_bib58","article-title":"Reformer: The efficient Transformer","volume-title":"8th International Conference on Learning Representations (ICLR)","author":"Kitaev","year":"2020"},{"key":"2021111022501968800_bib59","doi-asserted-by":"crossref","first-page":"28","DOI":"10.18653\/v1\/W17-3204","article-title":"Six challenges for neural machine translation","volume-title":"Proceedings of the First Workshop on Neural Machine Translation","author":"Koehn","year":"2017"},{"volume-title":"Computational Analysis of Present-day American English","year":"1967","author":"Ku\u010dera","key":"2021111022501968800_bib60"},{"issue":"1-2","key":"2021111022501968800_bib61","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1515\/aofo-2016-0018","article-title":"Sumerograms and akkadograms in Hittite: Ideograms, logograms, allograms or 
heterograms?","volume":"43","author":"Kudrinski","year":"2016","journal-title":"Altorientalische Forschungen"},{"key":"2021111022501968800_bib62","first-page":"4223","article-title":"Massively multilingual pronunciation modeling with WikiPron","volume-title":"Proceedings of The 12th Language Resources and Evaluation Conference (LREC)","author":"Lee","year":"2020"},{"key":"2021111022501968800_bib63","doi-asserted-by":"crossref","first-page":"1412","DOI":"10.18653\/v1\/D15-1166","article-title":"Effective approaches to attention-based neural machine translation","volume-title":"Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Luong","year":"2015"},{"volume-title":"Information Theory, Inference and Learning Algorithms","year":"2003","author":"MacKay","key":"2021111022501968800_bib64"},{"issue":"1","key":"2021111022501968800_bib65","first-page":"1","article-title":"The Shona writing system: An analysis of its problems and possible solutions","volume":"29","author":"Magwa","year":"2002","journal-title":"Zambezia"},{"key":"2021111022501968800_bib66","first-page":"200","article-title":"Modern Chinese writing","volume-title":"The World\u2019s Writing Systems","author":"Mair","year":"1996"},{"issue":"1843","key":"2021111022501968800_bib67","article-title":"Spelling acquisition in English and Italian: A cross-linguistic study.","volume":"6","author":"Marinelli","year":"2015","journal-title":"Frontiers in Psychology"},{"key":"2021111022501968800_bib68","unstructured":"Matsunaga, Sachiko. 1994. The Linguistic and Psycholinguistic Nature of Kanji: Do Kanji Represent and Trigger only Meanings? Ph.D. 
thesis, University of Hawaii, Honolulu."},{"key":"2021111022501968800_bib69","first-page":"1","article-title":"Are sixteen heads really better than one?","volume-title":"33rd Conference on Neural Information Processing Systems (NeurIPS)","author":"Michel","year":"2019"},{"key":"2021111022501968800_bib70","doi-asserted-by":"crossref","first-page":"2536","DOI":"10.21437\/Interspeech.2017-1436","article-title":"Multitask sequence-to-sequence models for grapheme-to-phoneme conversion","volume-title":"Proceedings Interspeech 2017","author":"Milde","year":"2017"},{"key":"2021111022501968800_bib71","first-page":"2204","article-title":"Recurrent models of visual attention","volume-title":"Proceedings of Neural Information Processing Systems (NIPS)","author":"Mnih","year":"2014"},{"volume-title":"The Triumph of the Alphabet","year":"1953","author":"Moorehouse","key":"2021111022501968800_bib72"},{"key":"2021111022501968800_bib73","first-page":"2791","article-title":"Measuring and improving faithfulness of attention in neural machine translation","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume","author":"Moradi","year":"2021"},{"key":"2021111022501968800_bib74","first-page":"2710","article-title":"Epitran: Precision G2P for many languages","volume-title":"Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)","author":"Mortensen","year":"2018"},{"article-title":"Leksikalsk Database for Svensk","year":"2011","author":"Nasjonalbiblioteket","key":"2021111022501968800_bib75"},{"key":"2021111022501968800_bib76","doi-asserted-by":"crossref","DOI":"10.1163\/9789004665583","volume-title":"Early History of the Alphabet: An Introduction to West Semitic Epigraphy and Palaeography","author":"Naveh","year":"1982"},{"key":"2021111022501968800_bib77","first-page":"2723","article-title":"Word-based partial annotation for efficient corpus 
construction","volume-title":"The 7th International Conference on Language Resources and Evaluation (LREC)","author":"Neubig","year":"2010"},{"key":"2021111022501968800_bib78","first-page":"529","article-title":"Pointwise prediction for robust, adaptable Japanese morphological analysis","volume-title":"Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies","author":"Neubig","year":"2011"},{"issue":"3","key":"2021111022501968800_bib79","doi-asserted-by":"publisher","first-page":"516","DOI":"10.3758\/BF03195598","article-title":"Lexique 2: A new French lexical database","volume":"36","author":"New","year":"2004","journal-title":"Behavior Research Methods, Instruments, & Computers"},{"issue":"6","key":"2021111022501968800_bib80","doi-asserted-by":"publisher","first-page":"907","DOI":"10.1017\/S1351324915000315","article-title":"Phonetisaurus: Exploring grapheme-to-phoneme conversion with joint n-gram models in the WFST framework","volume":"22","author":"Novak","year":"2016","journal-title":"Natural Language Engineering"},{"key":"2021111022501968800_bib81","first-page":"88","article-title":"Epigraphic Semitic scripts","volume-title":"The World\u2019s Writing Systems","author":"O\u2019Connor","year":"1996"},{"key":"2021111022501968800_bib82","first-page":"48","article-title":"fairseq: A fast, extensible toolkit for sequence modeling","volume-title":"Proceedings of the 17th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT): Demonstrations","author":"Ott","year":"2019"},{"key":"2021111022501968800_bib83","first-page":"117","article-title":"Quantitative methods for classifying writing systems","volume-title":"Proceedings of the North American Chapter of the Association for Computational 
Linguistics","author":"Penn","year":"2006"},{"key":"2021111022501968800_bib84","doi-asserted-by":"crossref","DOI":"10.4324\/9781410604583","volume-title":"Learning to Spell: Research, Theory, and Practice Across Languages","author":"Perfetti","year":"1997"},{"key":"2021111022501968800_bib85","doi-asserted-by":"crossref","first-page":"63","DOI":"10.18653\/v1\/2020.sigmorphon-1.4","article-title":"One-size-fits-all multilingual models","volume-title":"Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology","author":"Peters","year":"2020"},{"volume-title":"The Story of Decipherment: From Egyptian Hieroglyphs to Linear B","year":"1975","author":"Pope","key":"2021111022501968800_bib86"},{"key":"2021111022501968800_bib87","volume-title":"The Story of Decipherment: From Egyptian Hieroglyphs to Maya Script","author":"Pope","year":"1999","edition":"revised edition"},{"key":"2021111022501968800_bib88","first-page":"2837","article-title":"Online and linear-time attention by enforcing monotonic alignments","volume-title":"Proceedings of the 34th International Conference on Machine Learning (ICML)","author":"Raffel","year":"2017"},{"key":"2021111022501968800_bib89","doi-asserted-by":"crossref","first-page":"4225","DOI":"10.1109\/ICASSP.2015.7178767","article-title":"Grapheme-to-phoneme conversion using Long Short-Term Memory recurrent neural networks","volume-title":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","author":"Rao","year":"2015"},{"key":"2021111022501968800_bib90","first-page":"339","article-title":"Hebrew orthography and literacy","volume-title":"Handbook of Orthography and Literacy","author":"Ravid","year":"2005"},{"key":"2021111022501968800_bib91","first-page":"3031","article-title":"Attention can reflect syntactic structure (if you let it)","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: 
Main Volume","author":"Ravishankar","year":"2021"},{"key":"2021111022501968800_bib92","first-page":"43","article-title":"Smoothed marginal distribution constraints for language modeling","volume-title":"Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Roark","year":"2013"},{"key":"2021111022501968800_bib93","first-page":"61","article-title":"The OpenGrm open-source finite-state grammar software libraries","volume-title":"Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL)","author":"Roark","year":"2012"},{"key":"2021111022501968800_bib94","volume-title":"The Story of Writing: Alphabets, Hieroglyphs & Pictographs","author":"Robinson","year":"2007","edition":"2nd edition."},{"key":"2021111022501968800_bib95","doi-asserted-by":"publisher","first-page":"842","DOI":"10.1162\/tacl_a_00349","article-title":"A primer in BERTology: What we know about how BERT works","volume":"8","author":"Rogers","year":"2021","journal-title":"Transactions of the Association for Computational Linguistics"},{"volume-title":"Writing Systems: A Linguistic Approach","year":"2005","author":"Rogers","key":"2021111022501968800_bib96"},{"key":"2021111022501968800_bib97","first-page":"33","article-title":"Writing in another tongue: Alloglottography in the Ancient Near East","volume-title":"Margins of Writing, Origins of Cultures","author":"Rubio","year":"2006"},{"key":"2021111022501968800_bib98","doi-asserted-by":"crossref","DOI":"10.1075\/loall.10","volume-title":"Somali","author":"Saeed","year":"1999"},{"key":"2021111022501968800_bib99","first-page":"373","article-title":"Brahmi and Kharoshthi","volume-title":"The World\u2019s Writing Systems","author":"Salomon","year":"1996"},{"volume-title":"Writing Systems","year":"1985","author":"Sampson","key":"2021111022501968800_bib100"},{"key":"2021111022501968800_bib101","volume-title":"Writing 
Systems","author":"Sampson","year":"2012","edition":"2nd edition."},{"volume-title":"An Historical Grammar of Japanese","year":"1928","author":"Sansom","key":"2021111022501968800_bib102"},{"issue":"11","key":"2021111022501968800_bib103","doi-asserted-by":"publisher","first-page":"2673","DOI":"10.1109\/78.650093","article-title":"Bidirectional recurrent neural networks","volume":"45","author":"Schuster","year":"1997","journal-title":"IEEE Transactions on Signal Processing"},{"issue":"3","key":"2021111022501968800_bib104","doi-asserted-by":"publisher","first-page":"379","DOI":"10.1002\/j.1538-7305.1948.tb01338.x","article-title":"A mathematical theory of communication","volume":"27","author":"Shannon","year":"1948","journal-title":"The Bell System Technical Journal"},{"issue":"1","key":"2021111022501968800_bib105","doi-asserted-by":"publisher","first-page":"50","DOI":"10.1002\/j.1538-7305.1951.tb01366.x","article-title":"Prediction and entropy of printed English","volume":"30","author":"Shannon","year":"1951","journal-title":"The Bell System Technical Journal"},{"issue":"4","key":"2021111022501968800_bib106","doi-asserted-by":"publisher","first-page":"584","DOI":"10.1037\/0033-2909.134.4.584","article-title":"On the Anglocentricities of current reading research and practice: The perils of overreliance on an \u201coutlier\u201d orthography.","volume":"134","author":"Share","year":"2008","journal-title":"Psychological Bulletin"},{"key":"2021111022501968800_bib107","first-page":"239","article-title":"Yi script","volume-title":"The World\u2019s Writing Systems","author":"Shi","year":"1996"},{"key":"2021111022501968800_bib108","article-title":"Do RNN states encode abstract phonological processes?","author":"Silfverberg","year":"2021","journal-title":"arXiv preprint arXiv:2104.00789"},{"key":"2021111022501968800_bib109","first-page":"515","article-title":"Aramaic scripts for Iranian languages","volume-title":"The World\u2019s Writing 
Systems","author":"Skjaervo","year":"1996"},{"key":"2021111022501968800_bib110","first-page":"209","article-title":"Japanese writing","volume-title":"The World\u2019s Writing Systems","author":"Smith","year":"1996"},{"volume-title":"A Computational Theory of Writing Systems","year":"2000","author":"Sproat","key":"2021111022501968800_bib111"},{"issue":"2","key":"2021111022501968800_bib112","doi-asserted-by":"publisher","first-page":"457","DOI":"10.1353\/lan.2014.0042","article-title":"A statistical comparison of written language and nonlinguistic symbol systems","volume":"90","author":"Sproat","year":"2014","journal-title":"Language"},{"key":"2021111022501968800_bib113","article-title":"English orthography among the writing systems of the world","volume-title":"The Routledge Handbook of the English Writing System","author":"Sproat","year":"2016"},{"issue":"1","key":"2021111022501968800_bib114","first-page":"1929","article-title":"Dropout: A simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"Journal of Machine Learning Research"},{"key":"2021111022501968800_bib115","first-page":"3104","article-title":"Sequence to sequence learning with neural networks","volume-title":"Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS\u201914)","author":"Sutskever","year":"2014"},{"key":"2021111022501968800_bib116","doi-asserted-by":"crossref","first-page":"26","DOI":"10.18653\/v1\/W18-6304","article-title":"An analysis of attention mechanisms: The case of word sense disambiguation in neural machine translation","volume-title":"Proceedings of the Third Conference on Machine Translation: Research Papers","author":"Tang","year":"2018"},{"key":"2021111022501968800_bib117","first-page":"271","article-title":"The Greek alphabet","volume-title":"The World\u2019s Writing 
Systems","author":"Threatte","year":"1996"},{"issue":"1","key":"2021111022501968800_bib118","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1080\/10888438.2016.1251437","article-title":"First-and second-language learnability explained by orthographic depth and orthographic learning: A \u201cnatural\u201d Scandinavian experiment","volume":"21","author":"van Daal","year":"2017","journal-title":"Scientific Studies of Reading"},{"key":"2021111022501968800_bib119","doi-asserted-by":"crossref","DOI":"10.4324\/9781315755946","volume-title":"The Bantu Languages","author":"van de Velde","year":"2019","edition":"2nd edition."},{"key":"2021111022501968800_bib120","article-title":"WaveNet: A generative model for raw audio","author":"van den Oord","year":"2016","journal-title":"arXiv preprint arXiv:1609.03499"},{"key":"2021111022501968800_bib121","first-page":"5998","article-title":"Attention is all you need","volume-title":"Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017)","author":"Vaswani","year":"2017"},{"key":"2021111022501968800_bib122","doi-asserted-by":"crossref","first-page":"63","DOI":"10.18653\/v1\/W19-4808","article-title":"Analyzing the structure of attention in a transformer language model","volume-title":"Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP","author":"Vig","year":"2019"},{"key":"2021111022501968800_bib123","doi-asserted-by":"crossref","first-page":"1264","DOI":"10.18653\/v1\/P18-1117","article-title":"Context-aware neural machine translation learns anaphora resolution","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Voita","year":"2018"},{"key":"2021111022501968800_bib124","doi-asserted-by":"crossref","first-page":"5797","DOI":"10.18653\/v1\/P19-1580","article-title":"Analyzing multihead self-attention: Specialized heads do the heavy lifting, the rest can be 
pruned","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Voita","year":"2019"},{"key":"2021111022501968800_bib125","first-page":"5776","article-title":"MiniLM: Deep self-attention distillation for task-agnostic compression of pre-trained transformers","volume-title":"34th Conference on Neural Information Processing Systems (NeurIPS)","author":"Wang","year":"2020"},{"key":"2021111022501968800_bib126","first-page":"684","article-title":"SAMPA computer readable phonetic alphabet","volume-title":"Handbook of Standards and Resources for Spoken Language Systems","author":"Wells","year":"1997"},{"key":"2021111022501968800_bib127","doi-asserted-by":"crossref","first-page":"11","DOI":"10.18653\/v1\/D19-1002","article-title":"Attention is not not explanation","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Wiegreffe","year":"2019"},{"issue":"3","key":"2021111022501968800_bib128","doi-asserted-by":"publisher","first-page":"219","DOI":"10.1016\/0010-0277(91)90026-Z","article-title":"The relationship of phonemic awareness to reading acquisition: More consequence than precondition but still important","volume":"40","author":"Wimmer","year":"1991","journal-title":"Cognition"},{"issue":"4","key":"2021111022501968800_bib129","doi-asserted-by":"publisher","first-page":"1085","DOI":"10.1109\/18.87000","article-title":"The zero-frequency problem: Estimating the probabilities of novel events in adaptive text compression","volume":"37","author":"Witten","year":"1991","journal-title":"IEEE Transactions on Information Theory"},{"volume-title":"Visible Language: Inventions of Writing in the Ancient Middle East and Beyond","year":"2010","author":"Woods","key":"2021111022501968800_bib130"},{"key":"2021111022501968800_bib131","first-page":"2048","article-title":"Show, attend 
and tell: Neural image caption generation with visual attention","volume-title":"International Conference on Machine Learning (ICML)","author":"Xu","year":"2015"},{"key":"2021111022501968800_bib132","doi-asserted-by":"crossref","first-page":"2095","DOI":"10.21437\/Interspeech.2019-1954","article-title":"Transformer based grapheme-to-phoneme conversion","volume-title":"Proceedings of Interspeech 2019","author":"Yolchuyeva","year":"2019"},{"key":"2021111022501968800_bib133","article-title":"Multi-scale context aggregation by dilated convolutions","volume-title":"Proceedings of the 4th International Conference on Learning Representations (ICLR)","author":"Yu","year":"2016"},{"key":"2021111022501968800_bib134","unstructured":"Yu Shiwen 2002. The grammatical knowledge-base of contemporary Chinese \u2013 a complete specification (second version). Technical report, Tsinghua University Press, Beijing."},{"key":"2021111022501968800_bib135","doi-asserted-by":"publisher","first-page":"293","DOI":"10.1162\/coli_a_00349","article-title":"Neural models of text normalization","volume":"45","author":"Zhang","year":"2019","journal-title":"Computational Linguistics"},{"key":"2021111022501968800_bib136","article-title":"Inducing language-agnostic multilingual representations","author":"Zhao","year":"2020","journal-title":"arXiv.org"}],"container-title":["Computational 
Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/coli\/article-pdf\/47\/3\/477\/1971897\/coli_a_00409.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/coli\/article-pdf\/47\/3\/477\/1971897\/coli_a_00409.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,3]],"date-time":"2024-09-03T01:12:16Z","timestamp":1725325936000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/coli\/article\/47\/3\/477\/102776\/The-Taxonomy-of-Writing-Systems-How-to-Measure-How"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11]]},"references-count":136,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2021,11,3]]},"published-print":{"date-parts":[[2021,11,3]]}},"URL":"https:\/\/doi.org\/10.1162\/coli_a_00409","relation":{},"ISSN":["0891-2017","1530-9312"],"issn-type":[{"type":"print","value":"0891-2017"},{"type":"electronic","value":"1530-9312"}],"subject":[],"published-other":{"date-parts":[[2021,11]]},"published":{"date-parts":[[2021,11]]}}}