{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,27]],"date-time":"2025-12-27T10:04:27Z","timestamp":1766829867576,"version":"3.40.5"},"reference-count":96,"publisher":"Cambridge University Press (CUP)","issue":"2","license":[{"start":{"date-parts":[[2022,2,21]],"date-time":"2022-02-21T00:00:00Z","timestamp":1645401600000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":["cambridge.org"],"crossmark-restriction":true},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2023,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Deep learning approaches are superior in natural language processing due to their ability to extract informative features and patterns from languages. The two most successful neural architectures are LSTM and transformers, used in large pretrained language models such as BERT. While cross-lingual approaches are on the rise, most current natural language processing techniques are designed and applied to English, and less-resourced languages are lagging behind. In morphologically rich languages, information is conveyed through morphology, for example, through affixes modifying stems of words. The existing neural approaches do not explicitly use the information on word morphology. We analyse the effect of adding morphological features to LSTM and BERT models. As a testbed, we use three tasks available in many less-resourced languages: named entity recognition (NER), dependency parsing (DP) and comment filtering (CF). We construct baselines involving LSTM and BERT models, which we adjust by adding additional input in the form of part of speech (POS) tags and universal features. We compare the models across several languages from different language families. Our results suggest that adding morphological features has mixed effects depending on the quality of features and the task. The features improve the performance of LSTM-based models on the NER and DP tasks, while they do not benefit the performance on the CF task. For BERT-based models, the added morphological features only improve the performance on DP when they are of high quality (i.e., manually checked) while not showing any practical improvement when they are predicted. Even for high-quality features, the improvements are less pronounced in language-specific BERT variants compared to massively multilingual BERT models. As in NER and CF datasets manually checked features are not available, we only experiment with predicted features and find that they do not cause any practical improvement in performance.<\/jats:p>","DOI":"10.1017\/s1351324922000080","type":"journal-article","created":{"date-parts":[[2022,2,21]],"date-time":"2022-02-21T13:54:40Z","timestamp":1645451680000},"page":"360-385","update-policy":"https:\/\/doi.org\/10.1017\/policypage","source":"Crossref","is-referenced-by-count":6,"title":["Enhancing deep neural networks with morphological information"],"prefix":"10.1017","volume":"29","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7852-2357","authenticated-orcid":false,"given":"Matej","family":"Klemen","sequence":"first","affiliation":[]},{"given":"Luka","family":"Krsnik","sequence":"additional","affiliation":[]},{"given":"Marko","family":"Robnik-\u0160ikonja","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2022,2,21]]},"reference":[{"key":"S1351324922000080_ref1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.conll-1.6"},{"key":"S1351324922000080_ref49","unstructured":"Marton, Y. , Habash, N. and Rambow, O. (2010). Improving Arabic dependency parsing with lexical and inflectional morphological features. In Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages, pp. 13\u201321."},{"key":"S1351324922000080_ref19","doi-asserted-by":"publisher","DOI":"10.1145\/3200947.3208069"},{"key":"S1351324922000080_ref58","unstructured":"Nivre, J. (2003). An efficient algorithm for projective dependency parsing. In Proceedings of the Eighth International Conference on Parsing Technologies, pp. 149\u2013160."},{"key":"S1351324922000080_ref96","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.findings-emnlp.399"},{"key":"S1351324922000080_ref87","unstructured":"Virtanen, A. , Kanerva, J. , Ilo, R. , Luoma, J. , Luotolahti, J. , Salakoski, T. , Ginter, F. and Pyysalo, S. (2019). Multilingual is not enough: BERT for Finnish. ArXiv 1912.07076."},{"key":"S1351324922000080_ref14","article-title":"Amnesic probing: Behavioral explanation with amnesic counterfactuals","volume":"9","author":"Elazar","year":"2021","journal-title":"Trans. Assoc. Comput. Ling."},{"key":"S1351324922000080_ref80","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1452"},{"key":"S1351324922000080_ref5","article-title":"Enriching word vectors with subword information","volume":"5","author":"Bojanowski","year":"2017","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"S1351324922000080_ref68","unstructured":"Scheffler, T. , Haegert, E. , Pornavalai, S. and Sasse, M.L. (2018). Feature explorations for hate speech classification. In 14th Conference on Natural Language Processing KONVENS 2018, vol. 6, p. 8."},{"key":"S1351324922000080_ref84","unstructured":"Van Hee, C. , Lefever, E. , Verhoeven, B. , Mennes, J. , Desmet, B. , De Pauw, G. , Daelemans, W. and Hoste, V. (2015). Detection and fine-grained classification of cyberbullying events. In Proceedings of the International Conference Recent Advances in Natural Language Processing, pp. 672\u2013680."},{"key":"S1351324922000080_ref22","unstructured":"G\u00fcng\u00f6r, O. , Yldz, E. , \u00dcsk\u00fcdarli, S. and G\u00fcng\u00f6r, T. (2017). Morphological embeddings for named entity recognition in morphologically rich languages. arXiv preprint arXiv:1706.00506."},{"key":"S1351324922000080_ref2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-3712"},{"key":"S1351324922000080_ref93","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S19-2010"},{"key":"S1351324922000080_ref52","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-31372-2_24"},{"key":"S1351324922000080_ref60","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/K18-2024"},{"key":"S1351324922000080_ref67","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.semeval-1.271"},{"key":"S1351324922000080_ref76","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-45510-5_20"},{"key":"S1351324922000080_ref95","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-60450-9_15"},{"key":"S1351324922000080_ref34","article-title":"Simple and accurate dependency parsing using bidirectional LSTM feature representations","volume":"4","author":"Kiperwasser","year":"2016","journal-title":"Trans. Assoc. Comput. Ling."},{"key":"S1351324922000080_ref59","unstructured":"Nivre, J. , Abrams, M. , Agi\u0107, \u017d. , Ahrenberg, L. , Aleksandravi\u010di\u016bt\u0117, G. , Antonsen, L. , Aplonova, K. , Aranzabe, M. , Arutie, G. , Asahara, M. , et al. (2020). Universal Dependencies 2.6. Available at http:\/\/hdl.handle.net\/11234\/1-2988. LINDAT\/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (\u00daFAL), Faculty of Mathematics and Physics, Charles University."},{"key":"S1351324922000080_ref92","unstructured":"Yang, Z. , Salakhutdinov, R. and Cohen, W. (2016). Multi-task cross-lingual sequence tagging from scratch. ArXiv:1603.06270."},{"key":"S1351324922000080_ref43","unstructured":"Li, Z. , Cai, J. , He, S. and Zhao, H. (2018). Seq2seq dependency parsing. In Proceedings of the 27th International Conference on Computational Linguistics, pp. 3203\u20133214."},{"key":"S1351324922000080_ref48","doi-asserted-by":"publisher","DOI":"10.26615\/978-954-452-049-6_062"},{"key":"S1351324922000080_ref63","doi-asserted-by":"crossref","unstructured":"Pires, T. , Schlinger, E. and Garrette, D. (2019). How multilingual is multilingual BERT? In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 4996\u20135001.","DOI":"10.18653\/v1\/P19-1493"},{"key":"S1351324922000080_ref20","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.iwpt-1.13"},{"key":"S1351324922000080_ref74","unstructured":"Starostin, A. , Bocharov, V.V. , Alexeeva, S. , Bodrova, A.A. , Chuchunkov, A. , Dzhumaev, Sh.Sh. , Efimenko, I. , Granovsky, D.V. , Khoroshevsky, V.F. , Krylova, I.V. , Nikolaeva, M. , Smurov, I. and Toldova, S. (2016). FactRuEval 2016: Evaluation of named entity recognition and fact extraction systems for Russian. In Annual International Conference \u201cDialogue\u201c."},{"key":"S1351324922000080_ref35","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1279"},{"key":"S1351324922000080_ref71","unstructured":"Seker, A. , Bandel, E. , Bareket, D. , Brusilovsky, I. , Greenfeld, R.S. and Tsarfaty, R. (2021). AlephBERT: A Hebrew large pre-trained language model to start-off your Hebrew NLP application with. ArXiv 2104.04052."},{"key":"S1351324922000080_ref57","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-demos.1"},{"key":"S1351324922000080_ref70","unstructured":"Seeker, W. and Kuhn, J. (2011). On the role of explicit morphological feature representation in syntactic dependency parsing for German. In Proceedings of the 12th International Conference on Parsing Technologies, pp. 58\u201362."},{"key":"S1351324922000080_ref89","article-title":"Critical values and probability levels for the Wilcoxon rank sum test and the Wilcoxon signed rank test","volume":"1","author":"Wilcoxon","year":"1970","journal-title":"Selected Tables Math. Stat."},{"key":"S1351324922000080_ref83","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58323-1_11"},{"key":"S1351324922000080_ref39","unstructured":"Kuru, O. , Can, O.A. and Yuret, D. (2016). CharNER: Character-level named entity recognition. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pp. 911\u2013921."},{"key":"S1351324922000080_ref25","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"S1351324922000080_ref40","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N16-1030"},{"key":"S1351324922000080_ref46","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-4825"},{"key":"S1351324922000080_ref78","unstructured":"Taghizadeh, N. , Borhanifard, Z. , Pour, M.G. , Farhoodi, M. , Mahmoudi, M. , Azimzadeh, M. and Faili, H. (2019). NSURL-2019 task 7: Named entity recognition for Farsi. In Proceedings of The First International Workshop on NLP Solutions for Under Resourced Languages (NSURL 2019) co-located with ICNLSP 2019 - Short Papers, pp. 9\u201315."},{"key":"S1351324922000080_ref17","article-title":"A survey on automatic detection of hate speech in text","volume":"51","author":"Fortuna","year":"2018","journal-title":"ACM Comput. Surv."},{"key":"S1351324922000080_ref26","unstructured":"Huang, Z. , Xu, W. and Yu, K. (2015). Bidirectional LSTM-CRF models for sequence tagging. ArXiv, abs\/1508.01991."},{"key":"S1351324922000080_ref28","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1237"},{"key":"S1351324922000080_ref29","article-title":"SpanBERT: Improving pre-training by representing and predicting spans","volume":"8","author":"Joshi","year":"2020","journal-title":"Trans. Assoc. Comput. Ling."},{"key":"S1351324922000080_ref64","doi-asserted-by":"crossref","unstructured":"Qi, P. , Zhang, Y. , Zhang, Y. , Bolton, J. and Manning, C.D. (2020). Stanza: A Python natural language processing toolkit for many human languages. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations.","DOI":"10.18653\/v1\/2020.acl-demos.14"},{"key":"S1351324922000080_ref10","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W15-3904"},{"key":"S1351324922000080_ref65","article-title":"A primer in BERTology: What we know about how BERT works","volume":"8","author":"Rogers","year":"2020","journal-title":"Trans. Assoc. Comput. Ling."},{"key":"S1351324922000080_ref72","doi-asserted-by":"crossref","unstructured":"Shtovba, S. , Shtovba, O. and Petrychko, M. (2019). Detection of social network toxic comments with usage of syntactic dependencies in the sentences. In Proceedings of the Second International Workshop on Computer Modeling and Intelligent Systems, pp. 313\u2013323.","DOI":"10.32782\/cmis\/2353-25"},{"key":"S1351324922000080_ref79","unstructured":"Tanvir, H. , Kittask, C. , Eiche, S. and Sirts, K. (2021). EstBERT: A pretrained language-specific BERT for Estonian. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), pp. 11\u201319."},{"key":"S1351324922000080_ref85","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1278"},{"key":"S1351324922000080_ref37","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1277"},{"key":"S1351324922000080_ref36","unstructured":"Krek, S. , Dobrovoljc, K. , Erjavec, T. , Mo\u017ee, S. , Ledinek, N. , Holz, N. , Zupan, K. , Gantar, P. , Kuzman, T. , \u010cibej, J. , Holdt, \u0160. A. , Kav\u010di\u010d, T. , \u0160krjanec, I. , Marko, D. , Jezer\u0161ek, L. and Zajc, A. (2019). Training corpus ssj500k 2.2. Available at http:\/\/hdl.handle.net\/11356\/1210. Slovenian language resource repository CLARIN.SI."},{"key":"S1351324922000080_ref7","doi-asserted-by":"publisher","DOI":"10.1145\/1390156.1390177"},{"key":"S1351324922000080_ref16","doi-asserted-by":"publisher","DOI":"10.1007\/s11063-021-10528-4"},{"key":"S1351324922000080_ref38","unstructured":"Kuratov, Y. and Arkhipov, M. (2019). Adaptation of deep bidirectional multilingual transformers for Russian language. In Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference \u201cDialogue 2019\u201c."},{"key":"S1351324922000080_ref4","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-70939-8_13"},{"key":"S1351324922000080_ref15","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0256175"},{"key":"S1351324922000080_ref69","unstructured":"Seddah, D. , Koebler, S. and Tsarfaty, R. (eds). (2010). Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages."},{"key":"S1351324922000080_ref54","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.socialnlp-1.4"},{"key":"S1351324922000080_ref32","unstructured":"Kapo\u010di\u016bt\u0117-Dzikien\u0117, J. , Nivre, J. and Krupavi\u010dius, A. (2013). Lithuanian dependency parsing with rich morphological features. In Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages, pp. 12\u201321."},{"key":"S1351324922000080_ref31","doi-asserted-by":"publisher","DOI":"10.4135\/9781849208499"},{"key":"S1351324922000080_ref61","first-page":"0","article-title":"Towards named entity annotation of Latvian national library corpus","volume":"247","author":"Paikens","year":"2012","journal-title":"Front. Artif. Intell. Appl."},{"key":"S1351324922000080_ref23","unstructured":"Haji\u010d, J. and Zeman, D. (eds). (2017). Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. Association for Computational Linguistics."},{"key":"S1351324922000080_ref56","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/K18-2008"},{"key":"S1351324922000080_ref77","first-page":"1","article-title":"The probable error of a mean","year":"1908","journal-title":"Biometrika"},{"key":"S1351324922000080_ref75","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/K18-2020"},{"key":"S1351324922000080_ref45","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/K18-2014"},{"key":"S1351324922000080_ref88","doi-asserted-by":"publisher","DOI":"10.1109\/29.21701"},{"volume-title":"Speech and Language Processing","year":"2009","author":"Jurafsky","key":"S1351324922000080_ref30"},{"key":"S1351324922000080_ref6","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1082"},{"key":"S1351324922000080_ref3","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D15-1041"},{"key":"S1351324922000080_ref66","doi-asserted-by":"publisher","DOI":"10.1007\/s10579-019-09471-7"},{"key":"S1351324922000080_ref90","doi-asserted-by":"publisher","DOI":"10.1145\/3038912.3052591"},{"key":"S1351324922000080_ref33","unstructured":"Khallash, M. , Hadian, A. and Minaei-Bidgoli, B. (2013). An empirical study on the effect of morphological and lexical features in Persian dependency parsing. In Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages, pp. 97\u2013107."},{"key":"S1351324922000080_ref55","unstructured":"Nemeskey, D.M. (2021). Introducing huBERT. In XVII. Magyar Sz\u00e1m\u00edt\u00f3g\u00e9pes Nyelv\u00e9szeti Konferencia (MSZNY2021)."},{"key":"S1351324922000080_ref82","unstructured":"Tkachenko, A. , Petmanson, T. and Laur, S. (2013). Named entity recognition in Estonian. In Proceedings of the 4th Biennial International Workshop on Balto-Slavic Natural Language Processing, pp. 78\u201383."},{"key":"S1351324922000080_ref47","unstructured":"Ljube\u0161i\u0107, N. , Agi\u0107, \u017d. , Klubi\u010dka, F. , Batanovi\u0107, V. and Erjavec, T. (2018). Training corpus hr500k 1.0. Available at http:\/\/hdl.handle.net\/11356\/1183. Slovenian language resource repository CLARIN.SI."},{"key":"S1351324922000080_ref8","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1198"},{"key":"S1351324922000080_ref18","doi-asserted-by":"publisher","DOI":"10.26615\/978-954-452-049-6_036"},{"key":"S1351324922000080_ref50","doi-asserted-by":"publisher","DOI":"10.3115\/1220575.1220641"},{"key":"S1351324922000080_ref51","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.sigtyp-1.10"},{"key":"S1351324922000080_ref9","unstructured":"Devlin, J. , Chang, M.-W. , Lee, K. and Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171\u20134186."},{"key":"S1351324922000080_ref24","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S19-2116"},{"key":"S1351324922000080_ref42","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/K17-3022"},{"key":"S1351324922000080_ref62","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P15-1031"},{"key":"S1351324922000080_ref91","unstructured":"Yamada, H. and Matsumoto, Y. (2003). Statistical dependency analysis with support vector machines. In Proceedings of the Eighth International Conference on Parsing Technologies, pp. 195\u2013206."},{"key":"S1351324922000080_ref12","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/K17-3002"},{"key":"S1351324922000080_ref21","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324918000281"},{"key":"S1351324922000080_ref27","doi-asserted-by":"crossref","unstructured":"Jawahar, G. , Sagot, B. and Seddah, D. (2019). What does BERT learn about the structure of language? In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 3651\u20133657.","DOI":"10.18653\/v1\/P19-1356"},{"key":"S1351324922000080_ref41","unstructured":"Levow, G.-A. (2006). The third international Chinese language processing bakeoff: Word segmentation and named entity recognition. In Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing, pp. 108\u2013117."},{"key":"S1351324922000080_ref73","doi-asserted-by":"publisher","DOI":"10.26615\/978-954-452-056-4_127"},{"key":"S1351324922000080_ref44","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i05.6351"},{"key":"S1351324922000080_ref81","doi-asserted-by":"publisher","DOI":"10.3115\/1119176.1119195"},{"key":"S1351324922000080_ref86","unstructured":"Vaswani, A. , Shazeer, N. , Parmar, N. , Uszkoreit, J. , Jones, L. , Gomez, A.N. , Kaiser, \u0141. and Polosukhin, I. (2017). Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6000\u20136010."},{"key":"S1351324922000080_ref53","unstructured":"Mohseni, M. and Tebbifakhr, A. (2019). MorphoBERT: A Persian NER system with BERT and morphological analysis. In Proceedings of The First International Workshop on NLP Solutions for Under Resourced Languages (NSURL 2019) co-located with ICNLSP 2019 - Short Papers, pp. 23\u201330."},{"key":"S1351324922000080_ref94","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.semeval-1.188"},{"key":"S1351324922000080_ref13","unstructured":"Edmiston, D. (2020). A systematic analysis of morphological content in BERT models for multiple languages. arXiv:2004.03032."},{"key":"S1351324922000080_ref11","unstructured":"Dozat, T. and Manning, C.D. (2016). Deep biaffine attention for neural dependency parsing. In Proceedings on International Conference on Learning Representation."}],"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324922000080","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,3,13]],"date-time":"2023-03-13T04:19:42Z","timestamp":1678681182000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324922000080\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,21]]},"references-count":96,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,3]]}},"alternative-id":["S1351324922000080"],"URL":"https:\/\/doi.org\/10.1017\/s1351324922000080","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"type":"print","value":"1351-3249"},{"type":"electronic","value":"1469-8110"}],"subject":[],"published":{"date-parts":[[2022,2,21]]},"assertion":[{"value":"\u00a9 The Author(s), 2022. Published by Cambridge University Press","name":"copyright","label":"Copyright","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}}]}}