{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:28:41Z","timestamp":1750220921728,"version":"3.41.0"},"reference-count":43,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2019,5,31]],"date-time":"2019-05-31T00:00:00Z","timestamp":1559260800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2020,1,31]]},"abstract":"<jats:p>This article presents a comprehensive study on two primary tasks in Burmese (Myanmar) morphological analysis: tokenization and part-of-speech (POS) tagging. Twenty thousand Burmese sentences of newswire are annotated with two-layer tokenization and POS-tagging information, as one component of the Asian Language Treebank Project. The annotated corpus has been released under a CC BY-NC-SA license, and it is the largest open-access database of annotated Burmese when this manuscript was prepared in 2017. Detailed descriptions of the preparation, refinement, and features of the annotated corpus are provided in the first half of the article. Facilitated by the annotated corpus, experiment-based investigations are presented in the second half of the article, wherein the standard sequence-labeling approach of conditional random fields and a long short-term memory (LSTM)-based recurrent neural network (RNN) are applied and discussed. We obtained several general conclusions, covering the effect of joint tokenization and POS-tagging and importance of ensemble from the viewpoint of stabilizing the performance of LSTM-based RNN. 
This study provides a solid basis for further studies on Burmese processing.<\/jats:p>","DOI":"10.1145\/3325885","type":"journal-article","created":{"date-parts":[[2019,6,3]],"date-time":"2019-06-03T12:23:16Z","timestamp":1559564596000},"page":"1-34","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Towards Burmese (Myanmar) Morphological Analysis"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7523-208X","authenticated-orcid":false,"given":"Chenchen","family":"Ding","sequence":"first","affiliation":[{"name":"ASTREC, National Institute of Information and Communications Technology, Kyoto, Japan"}]},{"given":"Hnin Thu Zar","family":"Aye","sequence":"additional","affiliation":[{"name":"University of Computer Studies, Yangon, Myanmar"}]},{"given":"Win Pa","family":"Pa","sequence":"additional","affiliation":[{"name":"University of Computer Studies, Yangon, Myanmar"}]},{"given":"Khin Thandar","family":"Nwet","sequence":"additional","affiliation":[{"name":"University of Computer Studies, Yangon, Myanmar"}]},{"given":"Khin Mar","family":"Soe","sequence":"additional","affiliation":[{"name":"University of Computer Studies, Yangon, Myanmar"}]},{"given":"Masao","family":"Utiyama","sequence":"additional","affiliation":[{"name":"ASTREC, National Institute of Information and Communications Technology, Kyoto, Japan"}]},{"given":"Eiichiro","family":"Sumita","sequence":"additional","affiliation":[{"name":"ASTREC, National Institute of Information and Communications Technology, Kyoto, Japan"}]}],"member":"320","published-online":{"date-parts":[[2019,5,31]]},"reference":[{"volume-title":"Proceedings of the ICACTE. 
233--237","year":"2010","author":"Mon Aye Myat","key":"e_1_2_1_1_1"},{"volume-title":"Le pr\u00e9dicat en birman parl\u00e9","author":"Bernot Denise","key":"e_1_2_1_3_1"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D15-1141"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1179"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2078186"},{"volume-title":"Department of the Myanmar Language Commission. 2014. Myanmar-English Dictionary (Myanma-anggalip Abidan)","edition":"12","key":"e_1_2_1_7_1"},{"volume-title":"Department of the Myanmar Language Commission. 2016. Myanmar Grammar (Myanma Sadda)","edition":"3","key":"e_1_2_1_8_1"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3276773"},{"volume-title":"Proceedings of the PACLING. 227--238","year":"2017","author":"Ding Chenchen","key":"e_1_2_1_10_1"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2846095"},{"volume-title":"Proceedings of the SIGHAN. 
123--133","year":"2005","author":"Emerson Thomas","key":"e_1_2_1_12_1"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.3115\/977035.977059"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00054"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1162\/153244303768966139"},{"key":"e_1_2_1_16_1","first-page":"249","article-title":"Understanding the difficulty of training deep feedforward neural networks","volume":"9","author":"Glorot Xavier","year":"2010","journal-title":"Proceedings of the AISTATS (PMLR)"},{"volume-title":"LSTM: A search space odyssey","year":"2017","author":"Greff Klaus","key":"e_1_2_1_17_1"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"volume-title":"Proceedings of the CICLING.","year":"2017","author":"War Htike Khin War","key":"e_1_2_1_20_1"},{"volume-title":"Proceedings of the ICLR.","year":"2014","author":"Kingma Diederik","key":"e_1_2_1_21_1"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073336.1073361"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.5715\/jnlp.9.5_3"},{"volume-title":"Proceedings of the EMNLP. 
230--237","year":"2004","author":"Kudo Taku","key":"e_1_2_1_24_1"},{"volume-title":"Pereira","year":"2001","author":"Lafferty John","key":"e_1_2_1_25_1"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1101"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.5555\/972470.972475"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2010-343"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2700051"},{"key":"e_1_2_1_30_1","unstructured":"Graham Neubig Chris Dyer Yoav Goldberg Austin Matthews Waleed Ammar Antonios Anastasopoulos Miguel Ballesteros David Chiang Daniel Clothiaux Trevor Cohn Kevin Duh Manaal Faruqui Cynthia Gan Dan Garrette Yangfeng Ji Lingpeng Kong Adhiguna Kuncoro Gaurav Kumar Chaitanya Malaviya Paul Michel Yusuke Oda Matthew Richardson Naomi Saphra Swabha Swayamdipta and Pengcheng Yin. 2017. DyNet: The dynamic neural network toolkit. arXiv:1701.03980 (2017)."},{"volume-title":"Proceedings of the ACL-HLT. 529--533","year":"2011","author":"Neubig Graham","key":"e_1_2_1_31_1"},{"key":"e_1_2_1_32_1","unstructured":"Hideki Ogura Hanae Koiso Yumi Fujiike Sayaka Miyauchi Hikari Konishi and Yutaka Hara. 2011. JC-D-10-05-01 and JC-D-10-05-02. Retrieved from http:\/\/pj.ninjal.ac.jp\/corpus_center\/bccwj\/doc\/report\/JC-D-10-05-01.pdf; http:\/\/pj.ninjal.ac.jp\/corpus_center\/bccwj\/doc\/report\/JC-D-10-05-02.pdf (in Japanese)."},
{"key":"e_1_2_1_33_1","unstructured":"John Okell and Anna Allott. 2001. Burmese\/Myanmar Dictionary of Grammatical Forms. Routledge."},{"volume-title":"Proceedings of the LREC. 2089--2096","year":"2012","author":"Petrov Slav","key":"e_1_2_1_34_1"},{"volume-title":"Marcus","year":"1999","author":"Ramshaw Lance A.","key":"e_1_2_1_35_1"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSDA.2016.7918974"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.5555\/2627435.2670313"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/W15-1511"},{"volume-title":"Le","year":"2014","author":"Sutskever Ilya","key":"e_1_2_1_39_1"},{"volume-title":"Treebanks","author":"Taylor Ann","key":"e_1_2_1_40_1"},{"volume-title":"Proceedings of the PACLIC. 130--139","year":"2011","author":"Zin Thet Thet","key":"e_1_2_1_41_1"},{"volume-title":"Proceedings of the EACL. 32--37","year":"2012","author":"Hlaing Tin Htay","key":"e_1_2_1_42_1"},{"key":"e_1_2_1_43_1","unstructured":"Sato Toshinori. 2015. Neologism dictionary based on the language resources on the Web for Mecab. 
Retrieved from https:\/\/github.com\/neologd\/mecab-ipadic-neologd."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.3115\/1075218.1075260"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/1781134.1781135"}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3325885","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3325885","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:53:08Z","timestamp":1750204388000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3325885"}},"subtitle":["Syllable-based Tokenization and Part-of-speech Tagging"],"short-title":[],"issued":{"date-parts":[[2019,5,31]]},"references-count":43,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,1,31]]}},"alternative-id":["10.1145\/3325885"],"URL":"https:\/\/doi.org\/10.1145\/3325885","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"type":"print","value":"2375-4699"},{"type":"electronic","value":"2375-4702"}],"subject":[],"published":{"date-parts":[[2019,5,31]]},"assertion":[{"value":"2017-09-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-04-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-05-31","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}