{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,13]],"date-time":"2025-05-13T22:00:27Z","timestamp":1747173627433,"version":"3.40.5"},"reference-count":73,"publisher":"Cambridge University Press (CUP)","issue":"5","license":[{"start":{"date-parts":[[2023,8,11]],"date-time":"2023-08-11T00:00:00Z","timestamp":1691712000000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["cambridge.org"],"crossmark-restriction":true},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2024,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>One of the most interesting aspects of natural language is how texts cohere, which involves the pragmatic or semantic relations that hold between clauses (addition, cause-effect, conditional, similarity), referred to as discourse relations. A focus on the identification and classification of discourse relations appears as an imperative challenge to be resolved to support tasks such as text summarization, dialogue systems, and machine translation that need information above the clause level. Despite the recent interest in discourse relations in well-known languages such as English, data and experiments are still needed for typologically different and less-resourced languages. We report the most comprehensive investigation of shallow discourse parsing in Turkish, focusing on two main sub-tasks: identification of discourse relation realization types and the sense classification of explicit and implicit relations. The work is based on the approach of fine-tuning a pre-trained language model (BERT) as an encoder and classifying the encoded data with neural network-based classifiers. We firstly identify the discourse relation realization type that holds in a given text, if there is any. 
Then, we move on to the sense classification of the identified explicit and implicit relations. In addition to in-domain experiments on a held-out test set from the Turkish Discourse Bank (TDB 1.2), we also report the out-domain performance of our models in order to evaluate their generalization abilities, using the Turkish part of the TED Multilingual Discourse Bank. Finally, we explore the effect of multilingual data aggregation on the classification of relation realization type through a cross-lingual experiment. The results suggest that our models perform relatively well despite the limited size of the TDB 1.2 and that there are language-specific aspects of detecting the types of discourse relation realization. We believe that the findings are important both in providing insights regarding the performance of modern language models in a typologically different language and in the low-resource scenario, given that the TDB 1.2 is 1\/20th of the Penn Discourse TreeBank in terms of the total number of relations.<\/jats:p>","DOI":"10.1017\/s1351324923000359","type":"journal-article","created":{"date-parts":[[2023,8,11]],"date-time":"2023-08-11T09:59:07Z","timestamp":1691747947000},"page":"1009-1034","update-policy":"https:\/\/doi.org\/10.1017\/policypage","source":"Crossref","is-referenced-by-count":0,"title":["Toward a shallow discourse parser for 
Turkish"],"prefix":"10.1017","volume":"30","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0419-3327","authenticated-orcid":false,"given":"Ferhat","family":"Kutlu","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9248-0141","authenticated-orcid":false,"given":"Deniz","family":"Zeyrek","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7020-8275","authenticated-orcid":false,"given":"Murathan","family":"Kurfal\u0131","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2023,8,11]]},"reference":[{"key":"S1351324923000359_ref6","doi-asserted-by":"publisher","DOI":"10.1007\/s10579-022-09614-3"},{"key":"S1351324923000359_ref66","doi-asserted-by":"crossref","unstructured":"Zeyrek, D. and Kurfal\u0131, M. (2017). TDB 1.1: Extensions on Turkish Discourse Bank. In Proceedings of the 11th Linguistic Annotation Workshop, LAW@EACL 2017, April 3, 2017, Valencia, Spain. ACL, pp. 76\u201381.","DOI":"10.18653\/v1\/W17-0809"},{"key":"S1351324923000359_ref8","first-page":"467","article-title":"Class-based n-gram models of natural language","volume":"18","author":"Brown","year":"1992","journal-title":"Computational Linguistics,"},{"key":"S1351324923000359_ref45","unstructured":"Prasad, R. , Miltsakaki, E. , Dinesh, N. , Lee, A. and Joshi, A. (2008). The Penn Discourse TreeBank 2.0 Annotation Manual. Technical report, Institute for Research in Cognitive Science, University of Pennsylvania."},{"key":"S1351324923000359_ref30","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D15-1166"},{"volume-title":"Proceedings of the Third Turkish Symposium on Artificial Intelligence and Artificial Neural Networks.","year":"1994","author":"Oflazer","key":"S1351324923000359_ref37"},{"key":"S1351324923000359_ref71","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.codi-main.10"},{"key":"S1351324923000359_ref57","doi-asserted-by":"crossref","unstructured":"Warrens, M.J. (2014). 
New Interpretations of Cohen\u2019s Kappa, Journal of Mathematics, Hindawi Publishing Corporation. Available at https:\/\/www.hindawi.com\/journals\/jmath\/2014\/203907\/.","DOI":"10.1155\/2014\/203907"},{"key":"S1351324923000359_ref64","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-3308"},{"key":"S1351324923000359_ref25","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/K17-1034"},{"key":"S1351324923000359_ref70","unstructured":"Zeyrek, D. and Webber, B. (2008). A discourse resource for Turkish: Annotating discourse connectives in the METU corpus. In Proceedings of the 6th Workshop on Asian Language Resources."},{"key":"S1351324923000359_ref11","unstructured":"CoNLL 2015 Shared Task: Shallow Discourse Parsing (2015). (accessed September 30, 2020), Available at: https:\/\/www.cs.brandeis.edu\/~clp\/conll15st\/"},{"key":"S1351324923000359_ref16","doi-asserted-by":"publisher","DOI":"10.1162\/coli.2008.07-017-R1-06-83"},{"key":"S1351324923000359_ref12","unstructured":"Dauphin, Y.N. , T\u00fcr, G. , T\u00fcr, D.H. and Heck, L.P. (2014). Zero-shot learning and clustering for semantic utterance classification. In 2nd International Conference on Learning Representations, ICLR 2014, April 14-16, 2014, Banff, AB, Canada, Conference Track Proceedings, Conference Track Proceedings."},{"key":"S1351324923000359_ref20","unstructured":"Grandini, M. , Bagli, E. and Visani, G. (2020). Metrics for Multi-Class Classification: An Overview, ArXiv. \/abs\/2008.05756."},{"key":"S1351324923000359_ref41","doi-asserted-by":"publisher","DOI":"10.3115\/1690219.1690241"},{"key":"S1351324923000359_ref17","doi-asserted-by":"publisher","DOI":"10.1037\/h0031619"},{"key":"S1351324923000359_ref19","unstructured":"Gopalan, S. and Devi, S.L. (2016). BioDCA Identifier: a system for automatic identification of discourse connective and arguments from biomedical text. 
In Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining, BioTxtM@COLING 2016, December 12, 2016, Osaka, Japan, pp. 89\u201398."},{"key":"S1351324923000359_ref44","unstructured":"Prasad, R. , Dinesh, N. , Lee, A. , Miltsakaki, E. , Robaldo, L. , Joshi, A. and Webber, B. (2008). The Penn Discourse Treebank 2.0. In Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008, Marrakech, Morocco."},{"key":"S1351324923000359_ref55","unstructured":"Vaswani, A. , Shazeer, N. , Parmar, N. , Uszkoreit, J. , Jones, L. , Gomez, A.N. , Kaiser, L. and Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 2017, Long Beach, CA, USA, pp. 5998\u20136008."},{"key":"S1351324923000359_ref13","unstructured":"Devlin, J. , Chang, M.W. , Lee, K. and Toutanova, K. (2019). BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the ACL: Human Language Technologies, NAACL-HLT 2019, Volume 1 (Long and Short Papers), June 2-7, 2019, Minneapolis, MN, USA, ACL, pp. 4171\u20134186."},{"key":"S1351324923000359_ref61","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/K15-2001"},{"key":"S1351324923000359_ref47","unstructured":"Prasad, R. , Webber, B. and Lee, A. (2018). Discourse annotation in the PDTB: the next generation. In Proceedings 14th Joint ACL - ISO Workshop on Interoperable Semantic Annotation, August 2018, Santa Fe, New Mexico, USA. ACL, pp. 87\u201397."},{"key":"S1351324923000359_ref10","unstructured":"Caselli, T. and \u00dcst\u00fcn, A. (2019). There and back again: Cross-lingual transfer learning for event detection. In Proceedings of the Sixth Italian Conference on Computational Linguistics, CEUR 2019, Trento, Italy, CEUR Workshop Proceedings (CEUR-WS.org), vol. 
2481."},{"key":"S1351324923000359_ref39","first-page":"71","article-title":"The Proposition Bank: an annotated corpus of semantic roles","volume":"31","author":"Palmer","year":"2005","journal-title":"ACL"},{"key":"S1351324923000359_ref27","doi-asserted-by":"publisher","DOI":"10.3115\/1699510.1699555"},{"key":"S1351324923000359_ref7","unstructured":"Bos, J. (2013). The Groningen Meaning Bank. In Proceedings of the Joint Symposium on Semantic Processing Textual Inference and Structures in Corpora, JSSP 2013, November 20-22, 2013, Trento, Italy, ACL, p. 2."},{"key":"S1351324923000359_ref35","unstructured":"Muller, P. , Braud, C. and Morey, M. (2019). ToNy: Contextual embeddings for accurate multilingual discourse segmentation of full documents. In Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019, June 2019, Minneapolis, MN. ACL, pp. 115\u2013124."},{"key":"S1351324923000359_ref4","unstructured":"Bahdanau, D. , Cho, K. and Bengio, Y. (2015). Neural Machine Translation by Jointly Learning to Align and Translate. 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings."},{"key":"S1351324923000359_ref18","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D15-1070"},{"key":"S1351324923000359_ref32","doi-asserted-by":"crossref","unstructured":"Marcu, D. and Echihabi, A. (2002). An unsupervised approach to recognizing discourse relations. In Proceedings of the 40th Annual Meeting of ACL, July 6-12, 2002, Philadelphia, PA, USA. ACL, pp. 
368\u2013375.","DOI":"10.3115\/1073083.1073145"},{"key":"S1351324923000359_ref56","doi-asserted-by":"publisher","DOI":"10.1145\/3293318"},{"key":"S1351324923000359_ref52","doi-asserted-by":"publisher","DOI":"10.1515\/cllt.2010.009"},{"key":"S1351324923000359_ref43","first-page":"87","volume-title":"Coling 2008: Companion Volume: Posters","author":"Pitler","year":"2008"},{"key":"S1351324923000359_ref68","doi-asserted-by":"publisher","DOI":"10.1007\/s10579-019-09445-9"},{"key":"S1351324923000359_ref53","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/W14-1105"},{"key":"S1351324923000359_ref48","doi-asserted-by":"publisher","DOI":"10.1136\/amiajnl-2011-000775"},{"key":"S1351324923000359_ref22","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.480"},{"key":"S1351324923000359_ref23","unstructured":"Kishimoto, Y. , Murawaki, Y. and Kurohashi, S. (2020). Adapting BERT to implicit discourse relation classification with a focus on discourse connectives. In European Language Resources Association, Proceedings of The 12th Language Resources and Evaluation Conference, LREC 2020, May 11-16, 2020, Marseille, France, pp.1152\u20131158, European Language Resources Association."},{"key":"S1351324923000359_ref72","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-90165-7_8"},{"key":"S1351324923000359_ref31","unstructured":"Ma, Y. , Cambria, E. and Gao, S. (2016). Label embedding for zero-shot fine-grained named entity typing. In COLING 2016, 26th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, December 11-16, 2016, Osaka, Japan. ACL, pp. 171\u2013180."},{"key":"S1351324923000359_ref3","doi-asserted-by":"publisher","DOI":"10.1007\/978-94-011-1715-9"},{"key":"S1351324923000359_ref42","doi-asserted-by":"publisher","DOI":"10.3115\/1667583.1667589"},{"key":"S1351324923000359_ref50","doi-asserted-by":"crossref","unstructured":"Schuster, S. , Gupta, S. , Shah, R. and Lewis, M. (2019). 
Cross-lingual transfer learning for multilingual task oriented dialog. In (Long and Short Papers), Proceedings of the 2019 Conference of the North American Chapter of ACL: Human Language Technologies, NAACL-HLT 2019, June 2-7, 2019, Minneapolis, MN, USA, 1, pp. 3795\u20133805, (Long and Short Papers).","DOI":"10.18653\/v1\/N19-1380"},{"key":"S1351324923000359_ref54","first-page":"413","volume-title":"Studies in Computational Intelligence","author":"Tall\u00f3n-Ballesteros","year":"2014"},{"key":"S1351324923000359_ref38","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-90165-7"},{"key":"S1351324923000359_ref15","unstructured":"DISRPT21 (Discourse Unit Segmentation Across Formalisms 2021) (2021). (accessed December 30, 2021). Available at: https:\/\/sites.google.com\/georgetown.edu\/disrpt2021."},{"key":"S1351324923000359_ref14","unstructured":"DISRPT19 (Discourse Unit Segmentation Across Formalisms 2019) (2019). (accessed September 30, 2020). Available at: https:\/\/sites.google.com\/view\/disrpt2019\/shared-task."},{"key":"S1351324923000359_ref34","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/K15-2009"},{"key":"S1351324923000359_ref46","first-page":"921","article-title":"Reflections on the Penn Discourse Treebank: comparable corpora, and complementary annotation","volume":"40","author":"Prasad","year":"2014","journal-title":"ACL"},{"key":"S1351324923000359_ref40","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-1037"},{"key":"S1351324923000359_ref36","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1442"},{"key":"S1351324923000359_ref21","first-page":"339","article-title":"Google\u2019s multilingual neural machine translation system: enabling zero-shot translation","volume":"5","author":"Johnson","year":"2017","journal-title":"Transactions of the ACL"},{"key":"S1351324923000359_ref73","doi-asserted-by":"publisher","DOI":"10.3233\/SW-223011"},{"key":"S1351324923000359_ref29","unstructured":"Loshchilov, I. and Hutter, F. (2017). 
Decoupled weight decay regularization. ArXiv.\/abs\/1711.05101."},{"key":"S1351324923000359_ref33","doi-asserted-by":"crossref","unstructured":"Muermans, T.C. and Kosseim, L. (2022). A BERT-based approach for multilingual discourse connective detection. In Proceedings, Natural Language Processing and Information Systems: 27th International Conference on Applications of Natural Language to Information Systems, NLDB 2022, vol. 13286. Valencia, Spain: Springer Nature, pp. 449.","DOI":"10.1007\/978-3-031-08473-7_41"},{"key":"S1351324923000359_ref60","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W16-1704"},{"key":"S1351324923000359_ref69","unstructured":"Zeyrek, D. , Mendes, A. and Kurfal\u0131, M. (2018). Multilingual Extension of PDTB-Style Annotation: The Case of TED Multilingual Discourse Bank. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association (ELRA)."},{"key":"S1351324923000359_ref1","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324919000627"},{"key":"S1351324923000359_ref28","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324912000307"},{"key":"S1351324923000359_ref9","doi-asserted-by":"publisher","DOI":"10.1016\/0895-4356(93)90018-V"},{"key":"S1351324923000359_ref2","unstructured":"Alsaif, A. and Markert, K. (2011). Modelling discourse relations for Arabic. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, EMNLP, John McIntyre Conference Centre, A meeting of SIGDAT, a Special Interest Group of the ACL, 27-31 July 2011, Edinburgh, UK, pp. 
736\u2013747."},{"key":"S1351324923000359_ref24","first-page":"159","article-title":"The measurement of observer agreement for categorical data","volume":"33","author":"Landis","year":"1977","journal-title":"Wiley, International Biometric Society"},{"key":"S1351324923000359_ref59","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-0411"},{"key":"S1351324923000359_ref62","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D15-1027"},{"key":"S1351324923000359_ref65","unstructured":"Zeyrek, D. and Er, M.E. (2022). A description of Turkish Discourse Bank 1.2 and an examination of common dependencies in Turkish Discourse. In Proceedings of The International Conference and Workshop on Agglutinative Language Technologies as a challenge of Natural Language Processing, ALTNLP\u201922, June 7-8, 2022, Koper, Slovenia, ceur-ws.org\/Vol-3315\/paper04.pdf."},{"key":"S1351324923000359_ref51","doi-asserted-by":"publisher","DOI":"10.3233\/SW-170253"},{"key":"S1351324923000359_ref26","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.codi-1.14"},{"key":"S1351324923000359_ref67","unstructured":"Zeyrek, D. and Kurfal\u0131, M. (2018). An assessment of explicit inter- and intra-sentential discourse connectives in Turkish Discourse Bank. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation, LREC 2018, May 7-12, 2018, Miyazaki, Japan. European Language Resources Association (ELRA)."},{"key":"S1351324923000359_ref49","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/E14-1068"},{"key":"S1351324923000359_ref5","doi-asserted-by":"crossref","unstructured":"Baker, C.F. , Fillmore, C.J. and Lowe, J.B. (1998). The Berkeley FrameNet project. In Proceedings of 36th Annual Meeting of the ACL and 17th International Conference on Computational Linguistics, COLING-ACL\u201998, August 10-14, 1998, Universit\u00e9 de Montr\u00e9al, Morgan Kaufmann Publishers\/ACL, pp. 
86\u201390.","DOI":"10.3115\/980845.980860"},{"key":"S1351324923000359_ref63","doi-asserted-by":"crossref","unstructured":"Zeldes, A. , Das, D. , Maziero, E.G. , Antonio, J.D. and Iruskieta, M. (2019). The DISRPT 2019 shared task on elementary discourse unit segmentation and connective detection. In Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019, June 2019, Minneapolis, MN. ACL, pp. 97\u2013104.","DOI":"10.18653\/v1\/W19-2701"},{"key":"S1351324923000359_ref58","unstructured":"Webber, B. , Prasad, R. , Alan, L. and Joshi, A. (2019). The Penn Discourse Treebank 3.0 Annotation Manual. Technical report. Institute for Research in Cognitive Science, University of Pennsylvania."}],"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324923000359","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,8]],"date-time":"2024-11-08T01:04:45Z","timestamp":1731027885000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324923000359\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,11]]},"references-count":73,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2024,9]]}},"alternative-id":["S1351324923000359"],"URL":"https:\/\/doi.org\/10.1017\/s1351324923000359","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"type":"print","value":"1351-3249"},{"type":"electronic","value":"1469-8110"}],"subject":[],"published":{"date-parts":[[2023,8,11]]},"assertion":[{"value":"\u00a9 The Author(s), 2023. 
Published by Cambridge University Press","name":"copyright","label":"Copyright","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}},{"value":"This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http:\/\/creativecommons.org\/licenses\/by\/4.0\/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.","name":"license","label":"License","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}},{"value":"This content has been made available to all.","name":"free","label":"Free to read"}]}}