{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,20]],"date-time":"2025-10-20T18:03:28Z","timestamp":1760983408250,"version":"3.41.0"},"reference-count":32,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2010,3,1]],"date-time":"2010-03-01T00:00:00Z","timestamp":1267401600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Transactions on Asian Language Information Processing"],"published-print":{"date-parts":[[2010,3]]},"abstract":"<jats:p>\n            Allomorphic variation, or form variation among morphs with the same meaning, is a stumbling block to morphological induction (MI). To address this problem, we present a hybrid approach that uses a small amount of linguistic knowledge in the form of orthographic rewrite rules to help refine an existing MI-produced segmentation. Using rules, we derive underlying analyses of morphs---generalized with respect to contextual spelling differences---from an existing surface morph segmentation, and from these we learn a morpheme-level segmentation. To learn morphemes, we have extended the Morfessor segmentation algorithm [Creutz and Lagus 2004; 2005; 2006] by using rules to infer possible underlying analyses from surface segmentations. A segmentation produced by Morfessor Categories-MAP Software v. 0.9.2 is used as input to our procedure and as a baseline that we evaluate against. To suggest analyses for our procedure, a set of language-specific orthographic rules is needed. Our procedure has yielded promising improvements for English and Turkish over the baseline approach when tested on the Morpho Challenge 2005 and 2007 style evaluations. 
On the Morpho Challenge 2007 test evaluation, we report gains over the current best unsupervised contestant for Turkish, where our technique shows a 2.5% absolute\n            <jats:italic>F<\/jats:italic>\n            -score improvement.\n          <\/jats:p>","DOI":"10.1145\/1731035.1731038","type":"journal-article","created":{"date-parts":[[2010,3,30]],"date-time":"2010-03-30T12:32:23Z","timestamp":1269952343000},"page":"1-38","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Inducing Morphemes Using Light Knowledge"],"prefix":"10.1145","volume":"9","author":[{"given":"Michael","family":"Tepper","sequence":"first","affiliation":[{"name":"Department of Linguistics, University of Washington"}]},{"given":"Fei","family":"Xia","sequence":"additional","affiliation":[{"name":"Department of Linguistics, University of Washington"}]}],"member":"320","published-online":{"date-parts":[[2010,3]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Working Notes for the Workshop on Cross-Language Evaluation Forum (CLEF\u201907)","author":"Bernhard D.","year":"2007","unstructured":"Bernhard, D. 2007. Simple morpheme labeling in unsupervised morpheme analysis. In Working Notes for the Workshop on Cross-Language Evaluation Forum (CLEF\u201907)."},{"key":"e_1_2_1_2_1","unstructured":"J. Res. Sci. Comput. Eng. 2006 The revised wordframe model for the Filipino language"},{"key":"e_1_2_1_3_1","volume-title":"Row: New York.","author":"Chomsky N.","year":"1968","unstructured":"Chomsky, N. and Halle, M. 1968. The sound pattern of English. 
Harper & Row: New York."},{"volume-title":"Proceedings of the 7th Meeting of the ACL Special Interest Group in Computational Phonology (SIGPHON\u201904)","author":"Creutz M.","key":"e_1_2_1_4_1","unstructured":"Creutz, M. and Lagus, K. 2004. Induction of a simple morphology for highly inflecting languages. In Proceedings of the 7th Meeting of the ACL Special Interest Group in Computational Phonology (SIGPHON\u201904). 43--51."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.3115\/1118647.1118650"},{"volume-title":"Proceedings of the 7th Meeting of the ACL Special Interest Group in Computational Phonology (SIGPHON). 43--51","author":"Creutz M.","key":"e_1_2_1_7_1","unstructured":"Creutz, M. and Lagus, K. 2004. Induction of a simple morphology for highly inflecting languages. In Proceedings of the 7th Meeting of the ACL Special Interest Group in Computational Phonology (SIGPHON). 43--51."},{"volume-title":"Proceedings of the International and Interdisciplinary Conference on Adaptive Knowledge Representation and Reasoning (AKRR\u201905)","author":"Creutz M.","key":"e_1_2_1_8_1","unstructured":"Creutz, M. and Lagus, K. 2005. Inducing the morphological lexicon of a natural language from unannotated text. 
In Proceedings of the International and Interdisciplinary Conference on Adaptive Knowledge Representation and Reasoning (AKRR\u201905). 106--113."},{"volume-title":"Proceedings of the PASCAL Challenge Workshop on Unsupervised Segmentation of Words into Morphemes (PASCAL\u201906)","author":"Creutz M.","key":"e_1_2_1_9_1","unstructured":"Creutz, M. and Lagus, K. 2006. Morfessor in the morpho challenge. In Proceedings of the PASCAL Challenge Workshop on Unsupervised Segmentation of Words into Morphemes (PASCAL\u201906)."},{"volume-title":"Proceedings of the Human Language Technology Conference\/North American Chapter of the Association for Computational Linguistics (HLT-NAACL\u201907)","author":"Dasgupta S.","key":"e_1_2_1_10_1","unstructured":"Dasgupta, S. and Ng, V. 2007. High performance, language-independent morphological segmentation. In Proceedings of the Human Language Technology Conference\/North American Chapter of the Association for Computational Linguistics (HLT-NAACL\u201907)."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/1603899.1603952"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-6393(97)00048-4"},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the Association for Computational Learning (ACL\u201907)","author":"Demberg V.","year":"2007","unstructured":"Demberg, V. 2007. 
A language-independent unsupervised model for morphological segmentation. In Proceedings of the Association for Computational Learning (ACL\u201907)."},{"key":"e_1_2_1_15_1","volume-title":"Turkish: A comprehensive grammar","author":"G\u00f6ksel A.","year":"2005","unstructured":"G\u00f6ksel, A. and Kerslake, C. 2005. Turkish: A comprehensive grammar. Routledge: London."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1162\/089120101750300490"},{"volume-title":"Methods in structural linguistics","author":"Harris Z. S.","key":"e_1_2_1_17_1","unstructured":"Harris, Z. S. 1951. Methods in structural linguistics. University of Chicago Press."},{"key":"e_1_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Huddleston R. and Pullum G. K. 2001. The Cambridge Grammar of the English Language. Cambridge University Press.","DOI":"10.1017\/9781316423530"},{"volume-title":"Linguistic Society of America Meeting Handbook","author":"Kaplan R. M.","key":"e_1_2_1_19_1","unstructured":"Kaplan, R. M. and Kay, M. 1981. Phonological rules and finite-state transducers. In Linguistic Society of America Meeting Handbook. New York."},{"key":"e_1_2_1_20_1","first-page":"3","article-title":"Regular models of phonological rule systems","volume":"20","author":"Kaplan R. M.","year":"1994","unstructured":"Kaplan, R. M. and Kay, M. 1994. Regular models of phonological rule systems. Comput. Linguist. 20, 3.","journal-title":"Comput. 
Linguist."},{"volume-title":"Proceedings of the European Summer School in Logic, Language, and Information (ESSLLI\u201901)","author":"Karttunen L.","key":"e_1_2_1_21_1","unstructured":"Karttunen, L. and Beesley, K. R. 2001. A short history of two-level morphology. In Proceedings of the European Summer School in Logic, Language, and Information (ESSLLI\u201901)."},{"key":"e_1_2_1_22_1","unstructured":"Karttunen L. and Beesley K. R. 2005. Twenty-five years of finite-state morphology. In Inquiries into Words Constraints and Contexts: Festschrift for Kimmo Koskenniemi on his 60th Birthday. CSLI Publications."},{"volume-title":"Proceedings of the PASCAL Challenge Workshop on Unsupervised Segmentation of Words into Morphemes (PASCAL\u201906)","author":"Keshava S.","key":"e_1_2_1_23_1","unstructured":"Keshava, S. and Pitler, E. 2006. A simpler, intuitive approach to morpheme induction. In Proceedings of the PASCAL Challenge Workshop on Unsupervised Segmentation of Words into Morphemes (PASCAL\u201906)."},{"volume-title":"Two-Level Morphology: A General Computational Model for Word-Form Recognition and Production","author":"Koskenniemi K.","key":"e_1_2_1_24_1","unstructured":"Koskenniemi, K. 1983. 
Two-Level Morphology: A General Computational Model for Word-Form Recognition and Production. University of Helsinki, Helsinki, Finland."},{"volume-title":"International Encyclopedia of Linguistics, E-Reference Ed.","author":"Koskenniemi K.","key":"e_1_2_1_25_1","unstructured":"Koskenniemi, K. 2003. Computational morphology. In International Encyclopedia of Linguistics, E-Reference Ed., W. J. Frawley Ed., Oxford University Press."},{"volume-title":"Working Notes for the CLEF Workshop.","author":"Kurimo M.","key":"e_1_2_1_26_1","unstructured":"Kurimo, M., Creutz, M., and Varjokallio, M. 2007. Unsupervised morpheme analysis evaluation by a comparison to a linguistic gold standard -- Morpho Challenge 2007. In Working Notes for the CLEF Workshop."},{"volume-title":"Proceedings of the PASCAL Challenge Workshop on Unsupervised Segmentation of Words into Morphemes.","author":"Kurimo M.","key":"e_1_2_1_27_1","unstructured":"Kurimo, M., Creutz, M., Varjokallio, M., Arisoy, E., and Sara\u00e7lar, M. 2006. Unsupervised segmentation of words into morphemes -- Morpho Challenge 2005, an introduction and evaluation report. In Proceedings of the PASCAL Challenge Workshop on Unsupervised Segmentation of Words into Morphemes."},{"key":"e_1_2_1_28_1","unstructured":"Lewis G. 1967. Turkish Grammar. 
Oxford University Press."},{"volume-title":"Working Notes for the CLEF Workshop.","author":"Monson C.","key":"e_1_2_1_29_1","unstructured":"Monson, C., Carbonell, J., Lavie, A., and Levin, L. 2008. Paramor and morpho challenge 2008. In Working Notes for the CLEF Workshop."},{"volume-title":"Proceedings of the 4th International Conference on Intel. Data Analysis (IDA). 238--247","author":"Peng F.","key":"e_1_2_1_30_1","unstructured":"Peng, F. and Schuurmans, D. 2001. A hierarchical EM approach to word segmentation. In Proceedings of the 4th International Conference on Intel. Data Analysis (IDA). 238--247."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073336.1073360"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073012.1073075"},{"volume-title":"The Oxford Handbook of Computational Linguistics","author":"Trost H.","key":"e_1_2_1_33_1","unstructured":"Trost, H. 2003. Morphology. In The Oxford Handbook of Computational Linguistics. 
Oxford University Press, 25--47."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.3115\/1075218.1075245"}],"container-title":["ACM Transactions on Asian Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1731035.1731038","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1731035.1731038","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T12:41:27Z","timestamp":1750250487000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1731035.1731038"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,3]]},"references-count":32,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2010,3]]}},"alternative-id":["10.1145\/1731035.1731038"],"URL":"https:\/\/doi.org\/10.1145\/1731035.1731038","relation":{},"ISSN":["1530-0226","1558-3430"],"issn-type":[{"type":"print","value":"1530-0226"},{"type":"electronic","value":"1558-3430"}],"subject":[],"published":{"date-parts":[[2010,3]]},"assertion":[{"value":"2009-03-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2009-08-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-03-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}