{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,16]],"date-time":"2026-03-16T20:25:20Z","timestamp":1773692720920,"version":"3.50.1"},"reference-count":50,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2023,6,17]],"date-time":"2023-06-17T00:00:00Z","timestamp":1686960000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Hankuk University of Foreign Studies Research Fund","award":["2023"],"award-info":[{"award-number":["2023"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2023,6,30]]},"abstract":"<jats:p>End-to-end neural network-based approaches have recently demonstrated significant improvements in natural language processing (NLP). However, in the NLP application such as assistant systems, NLP components are still processed to extract results using a pipeline paradigm. The pipeline-based concept has issues with error propagation. In Korean, morphological analysis and part-of-speech (POS) tagging step, incorrectly analyzing POS tags for a sentence containing spacing errors negatively affects other modules behind the POS module. Hence, we present a multi-task learning-based POS tagging neural model for Korean with word spacing challenges. When we apply this model to the Korean morphological analysis and POS tagging, we get findings that are robust to word spacing errors. 
We adopt syllable-level input and output formats, as well as a simple structure for ELECTRA and RNN-CRF models for multi-task learning, and we achieve a good performance 98.30 of F1, better than previous studies on the Sejong corpus test set.<\/jats:p>","DOI":"10.1145\/3591206","type":"journal-article","created":{"date-parts":[[2023,4,5]],"date-time":"2023-04-05T11:59:52Z","timestamp":1680695992000},"page":"1-13","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Robust Multi-task Learning-based Korean POS Tagging to Overcome Word Spacing Errors"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5386-0483","authenticated-orcid":false,"given":"Cheoneum","family":"Park","sequence":"first","affiliation":[{"name":"SK Telecom and Hyundai Motor Company"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7826-5226","authenticated-orcid":false,"given":"Juae","family":"Kim","sequence":"additional","affiliation":[{"name":"Hankuk University of Foreign Studies and Hyundai Motor Company"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,6,17]]},"reference":[{"key":"e_1_3_2_2_2","first-page":"195","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Ar\u0131k Sercan \u00d6.","year":"2017","unstructured":"Sercan \u00d6. Ar\u0131k, Mike Chrzanowski, Adam Coates, Gregory Diamos, Andrew Gibiansky, Yongguo Kang, Xian Li, John Miller, Andrew Ng, Jonathan Raiman et\u00a0al. 2017. Deep Voice: Real-time neural text-to-speech. In Proceedings of the International Conference on Machine Learning. PMLR, 195\u2013204."},{"key":"e_1_3_2_3_2","unstructured":"Dzmitry Bahdanau Kyunghyun Cho and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. 
In 3rd International Conference on Learning Representations (ICLR\u201915) . May 7-9 2015. Conference Track Proceedings San Diego CA."},{"key":"e_1_3_2_4_2","doi-asserted-by":"crossref","unstructured":"Michael Braun Anja Mainz Ronee Chadowitz Bastian Pfleging and Florian Alt. 2019. At your service: Designing voice assistant personalities to improve automotive user interfaces. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI\u201919) . Association for Computing Machinery 1\u201311.","DOI":"10.1145\/3290605.3300270"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007379606734"},{"key":"e_1_3_2_6_2","first-page":"2965","volume-title":"Proceedings of the 27th International Conference on Computational Linguistics","author":"Changpinyo Soravit","year":"2018","unstructured":"Soravit Changpinyo, Hexiang Hu, and Fei Sha. 2018. Multi-task learning for sequence tagging: An empirical study. In Proceedings of the 27th International Conference on Computational Linguistics. Association for Computational Linguistics, 2965\u20132977."},{"key":"e_1_3_2_7_2","first-page":"3872","volume-title":"Proceedings of the IEEE International Conference on Big Data (Big Data)","author":"Choi Jihun","year":"2016","unstructured":"Jihun Choi, Jonghem Youn, and Sang-goo Lee. 2016. A grapheme-level approach for constructing a Korean morphological analyzer without linguistic knowledge. In Proceedings of the IEEE International Conference on Big Data (Big Data). 3872\u20133879."},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.5555\/1690299.1690325"},{"key":"e_1_3_2_9_2","doi-asserted-by":"crossref","first-page":"36","DOI":"10.18653\/v1\/W17-4105","volume-title":"Proceedings of the 1st Workshop on Subword and Character Level Models in NLP","author":"Choi Sanghyuk","year":"2017","unstructured":"Sanghyuk Choi, Taeuk Kim, Jinseok Seol, and Sang-goo Lee. 2017. A syllable-based technique for word embeddings of Korean words. 
In Proceedings of the 1st Workshop on Subword and Character Level Models in NLP. Association for Computational Linguistics, 36\u201340."},{"key":"e_1_3_2_10_2","unstructured":"Junyoung Chung Caglar Gulcehre KyungHyun Cho and Yoshua Bengio. 2014. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. (2014). arxiv:cs.NE\/1412.3555."},{"key":"e_1_3_2_11_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Clark Kevin","year":"2020","unstructured":"Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. 2020. ELECTRA: Pre-training text encoders as discriminators rather than generators. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/1390156.1390177"},{"key":"e_1_3_2_13_2","first-page":"4171","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 4171\u20134186."},{"key":"e_1_3_2_14_2","volume-title":"Proceedings of the 32nd Annual Conference on Human and Cognitive Language Technology","author":"Do Soojong","year":"2020","unstructured":"Soojong Do, Cheoneum Park, Cheongjae Lee, Kyuyeol Han, and Mirye Lee. 2020. Syllable-based Korean named entity recognition and slot filling with ELECTRA. 
In Proceedings of the 32nd Annual Conference on Human and Cognitive Language Technology."},{"key":"e_1_3_2_15_2","first-page":"199","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Dyer Chris","year":"2016","unstructured":"Chris Dyer, Adhiguna Kuncoro, Miguel Ballesteros, and Noah A. Smith. 2016. Recurrent neural network grammars. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 199\u2013209."},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10590-004-7693-4"},{"key":"e_1_3_2_17_2","first-page":"6381","volume-title":"Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","author":"He Y.","year":"2019","unstructured":"Y. He, T. N. Sainath, R. Prabhavalkar, I. McGraw, R. Alvarez, D. Zhao, D. Rybach, A. Kannan, Y. Wu, R. Pang, Q. Liang, D. Bhatia, Y. Shangguan, B. Li, G. Pundak, K. C. Sim, T. Bagby, S. Chang, K. Rao, and A. Gruenstein. 2019. Streaming end-to-end speech recognition for mobile devices. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 6381\u20136385."},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_2_19_2","first-page":"443","volume-title":"Proceedings of the Conference on Korea Software Congress","author":"Hwang HyunSun","year":"2016","unstructured":"HyunSun Hwang and ChangKi Lee. 2016. Korean morphological analysis using sequence-to-sequence learning with copying mechanism. In Proceedings of the Conference on Korea Software Congress. 
443\u2013445."},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-009-5108-8"},{"key":"e_1_3_2_21_2","first-page":"1","volume-title":"Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP","author":"Kann Katharina","year":"2018","unstructured":"Katharina Kann, Johannes Bjerva, Isabelle Augenstein, Barbara Plank, and Anders S\u00f8gaard. 2018. Character-level supervision for low-resource POS tagging. In Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP. 1\u201311."},{"key":"e_1_3_2_22_2","first-page":"145","volume-title":"Proceedings of the 6th International Joint Conference on Natural Language Processing","author":"Kim Youngsam","year":"2013","unstructured":"Youngsam Kim and Hyopil Shin. 2013. Romanization-based approach to morphological analysis in Korean SMS text processing. In Proceedings of the 6th International Joint Conference on Natural Language Processing. 145\u2013152."},{"key":"e_1_3_2_23_2","unstructured":"Diederik P. Kingma and Jimmy Ba. 2017. Adam: A Method for Stochastic Optimization. (2017). arxiv:cs.LG\/1412.6980."},{"key":"e_1_3_2_24_2","first-page":"911","volume-title":"Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers","author":"Kuru Onur","year":"2016","unstructured":"Onur Kuru, Ozan Arkan Can, and Deniz Yuret. 2016. CharNER: Character-level named entity recognition. In Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers. The COLING 2016 Organizing Committee, 911\u2013921."},{"issue":"12","key":"e_1_3_2_25_2","first-page":"826","article-title":"Joint models for korean word spacing and POS tagging using structural SVM","volume":"40","author":"Lee Changki","year":"2013","unstructured":"Changki Lee. 2013. Joint models for korean word spacing and POS tagging using structural SVM. In J. KISS: Softw. Applic. 40, 12 (2013), 826\u2013832.","journal-title":"J. KISS: Softw. 
Applic"},{"issue":"5","key":"e_1_3_2_26_2","doi-asserted-by":"crossref","first-page":"945","DOI":"10.1109\/TASL.2009.2019922","article-title":"Probabilistic modeling of Korean morphology","volume":"17","author":"Lee Do-Gil","year":"2009","unstructured":"Do-Gil Lee and Hae-Chang Rim. 2009. Probabilistic modeling of Korean morphology. IEEE Trans. Aud., Speech, Lang. Process. 17, 5 (2009), 945\u2013955.","journal-title":"IEEE Trans. Aud., Speech, Lang. Process."},{"key":"e_1_3_2_27_2","first-page":"257","volume-title":"J. KISS: Softw. Applic","author":"Lee Jae Sung","year":"2011","unstructured":"Jae Sung Lee. 2011. Three-step probabilistic model for Korean morphological analysis. In J. KISS: Softw. Applic. 257\u2013268."},{"key":"e_1_3_2_28_2","doi-asserted-by":"crossref","unstructured":"Tao Lei. 2021. When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute. (2021). arxiv:cs.CL\/2102.12459.","DOI":"10.18653\/v1\/2021.emnlp-main.602"},{"key":"e_1_3_2_29_2","first-page":"4470","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing","author":"Lei Tao","year":"2018","unstructured":"Tao Lei, Yu Zhang, Sida I. Wang, Hui Dai, and Yoav Artzi. 2018. Simple recurrent units for highly parallelizable recurrence. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 4470\u20134481."},{"key":"e_1_3_2_30_2","first-page":"25","volume-title":"Proceedings of the Korean Information Science Society Conference","author":"Lim Dong-Hee","year":"2006","unstructured":"Dong-Hee Lim, Seung-Shik Kang, and Du-Seong Chang. 2006. Word spacing error correction for the postprocessing of speech recognition. In Proceedings of the Korean Information Science Society Conference. 
25\u201327."},{"key":"e_1_3_2_31_2","first-page":"238","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Luo Wencan","year":"2016","unstructured":"Wencan Luo and Fan Yang. 2016. An empirical study of automatic Chinese word segmentation for spoken language understanding and named entity recognition. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 238\u2013248."},{"key":"e_1_3_2_32_2","volume-title":"Proceedings of the 4th International Conference on Learning Representations","author":"Luong Minh-Thang","year":"2016","unstructured":"Minh-Thang Luong, Quoc V. Le, Ilya Sutskever, Oriol Vinyals, and Lukasz Kaiser. 2016. Multi-task sequence to sequence learning. In Proceedings of the 4th International Conference on Learning Representations."},{"key":"e_1_3_2_33_2","first-page":"1403","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics","author":"Ma Xuezhe","year":"2018","unstructured":"Xuezhe Ma, Zecong Hu, Jingzhou Liu, Nanyun Peng, Graham Neubig, and Eduard Hovy. 2018. Stack-pointer networks for dependency parsing. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 1403\u20131414."},{"key":"e_1_3_2_34_2","first-page":"2482","volume-title":"Proceedings of the 27th International Conference on Computational Linguistics","author":"Matteson Andrew","year":"2018","unstructured":"Andrew Matteson, Chanhee Lee, Youngbum Kim, and Heuiseok Lim. 2018. Rich character-level information for Korean morphological analysis and part-of-speech tagging. In Proceedings of the 27th International Conference on Computational Linguistics. 
Association for Computational Linguistics, 2482\u20132492."},{"key":"e_1_3_2_35_2","first-page":"371","volume-title":"Proceedings of the Conference on Korea Software Congress","author":"Min Jinwoo","year":"2020","unstructured":"Jinwoo Min, Seung-Hoon Na, Jong-Hoon Shin, and Young-Kil Kim. 2020. Stack pointer network for Korean morphological analysis. In Proceedings of the Conference on Korea Software Congress. 371\u2013373."},{"issue":"3","key":"e_1_3_2_36_2","article-title":"Conditional random fields for Korean morpheme segmentation and POS tagging","volume":"14","author":"Na Seung-Hoon","year":"2015","unstructured":"Seung-Hoon Na. 2015. Conditional random fields for Korean morpheme segmentation and POS tagging. ACM Trans. Asian Low-Resour. Lang. Inf. Process. 14, 3 (June 2015).","journal-title":"ACM Trans. Asian Low-Resour. Lang. Inf. Process."},{"key":"e_1_3_2_37_2","first-page":"600","volume-title":"Proceedings of the Annual Conference on Human and Language Technology","author":"Park Keunyoung","year":"2018","unstructured":"Keunyoung Park, Kyungduk Kim, and Inho Kang. 2018. Jam-packing Korean sentence classification method robust for spacing errors. In Proceedings of the Annual Conference on Human and Language Technology. 600\u2013604."},{"key":"e_1_3_2_38_2","first-page":"133","volume-title":"Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing","author":"Park Kyubyong","year":"2020","unstructured":"Kyubyong Park, Joohong Lee, Seongbo Jang, and Dawoon Jung. 2020. An empirical study of tokenization strategies for various Korean NLP tasks. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing. 
Association for Computational Linguistics, 133\u2013142."},{"key":"e_1_3_2_39_2","volume-title":"Proceedings of the 3rd Workshop on Very Large Corpora","author":"Ramshaw Lance","year":"1995","unstructured":"Lance Ramshaw and Mitch Marcus. 1995. Text chunking using transformation-based learning. In Proceedings of the 3rd Workshop on Very Large Corpora."},{"key":"e_1_3_2_40_2","article-title":"An overview of multi-task learning in deep neural networks","author":"Ruder Sebastian","year":"2017","unstructured":"Sebastian Ruder. 2017. An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098 (2017).","journal-title":"arXiv preprint arXiv:1706.05098"},{"key":"e_1_3_2_41_2","first-page":"3535","volume-title":"Proceedings of the 20th Annual Conference of the International Speech Communication Association","author":"Sharma Dravyansh","year":"2019","unstructured":"Dravyansh Sharma, Melissa Wilson, and Antoine Bruguier. 2019. Better morphology prediction for better speech systems. In Proceedings of the 20th Annual Conference of the International Speech Communication Association. ISCA, 3535\u20133539."},{"issue":"3","key":"e_1_3_2_42_2","doi-asserted-by":"crossref","first-page":"327","DOI":"10.19066\/cogsci.2011.22.3.005","article-title":"Syllable-based POS tagging without korean morphological analysis","volume":"22","author":"Shim Kwang-Seob","year":"2011","unstructured":"Kwang-Seob Shim. 2011. Syllable-based POS tagging without korean morphological analysis. Korean J. Cognit. Sci. 22, 3 (2011), 327\u2013345.","journal-title":"Korean J. Cognit. Sci."},{"key":"e_1_3_2_43_2","doi-asserted-by":"crossref","first-page":"1436","DOI":"10.18653\/v1\/D19-1150","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Song Hyun-Je","year":"2019","unstructured":"Hyun-Je Song and Seong-Bae Park. 2019. 
Korean morphological analysis with tied sequence-to-sequence multi-task model. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, 1436\u20131441."},{"key":"e_1_3_2_44_2","volume-title":"Advances in Neural Information Processing Systems","author":"Sutskever Ilya","year":"2014","unstructured":"Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems, Vol. 27. Curran Associates, Inc."},{"key":"e_1_3_2_45_2","first-page":"4784","volume-title":"Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","author":"Tachibana H.","year":"2018","unstructured":"H. Tachibana, K. Uenoyama, and S. Aihara. 2018. Efficiently trainable text-to-speech system based on deep convolutional networks with guided attention. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 4784\u20134788."},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.15388\/infedu.2020.21"},{"key":"e_1_3_2_47_2","volume-title":"Advances in Neural Information Processing Systems","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, Vol. 30. 
Curran Associates, Inc."},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNSE.2022.3151502"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.image.2021.116319"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2022.05.004"},{"key":"e_1_3_2_51_2","first-page":"1529","volume-title":"Proceedings of the IEEE International Conference on Computer Vision (ICCV)","author":"Zheng S.","year":"2015","unstructured":"S. Zheng, S. Jayasumana, B. Romera-Paredes, V. Vineet, Z. Su, D. Du, C. Huang, and P. H. S. Torr. 2015. Conditional random fields as recurrent neural networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV). 1529\u20131537."}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3591206","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3591206","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:47:46Z","timestamp":1750178866000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3591206"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,17]]},"references-count":50,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2023,6,30]]}},"alternative-id":["10.1145\/3591206"],"URL":"https:\/\/doi.org\/10.1145\/3591206","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"value":"2375-4699","type":"print"},{"value":"2375-4702","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,17]]},"assertion":[{"value":"2022-03-27","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2023-03-31","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-06-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}