{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T21:28:48Z","timestamp":1777152528214,"version":"3.51.4"},"reference-count":78,"publisher":"Cambridge University Press (CUP)","issue":"3","license":[{"start":{"date-parts":[[2021,11,2]],"date-time":"2021-11-02T00:00:00Z","timestamp":1635811200000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["cambridge.org"],"crossmark-restriction":true},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2023,5]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In this study, we investigate the process of generating single-sentence representations for the purpose of Dialogue Act (DA) classification, including several aspects of text pre-processing and input representation which are often overlooked or underreported within the literature, for example, the number of words to keep in the vocabulary or input sequences. We assess each of these with respect to two DA-labelled corpora, using a range of supervised models, which represent those most frequently applied to the task. Additionally, we compare context-free word embedding models with that of transfer learning via pre-trained language models, including several based on the transformer architecture, such as Bidirectional Encoder Representations from Transformers (BERT) and XLNET, which have thus far not been widely explored for the DA classification task. Our findings indicate that these text pre-processing considerations do have a statistically significant effect on classification accuracy. Notably, we found that viable input sequence lengths, and vocabulary sizes, can be much smaller than is typically used in DA classification experiments, yielding no significant improvements beyond certain thresholds. 
We also show that in some cases the contextual sentence representations generated by language models do not reliably outperform supervised methods. Though BERT, and its derivative models, do represent a significant improvement over supervised approaches, and much of the previous work on DA classification.<\/jats:p>","DOI":"10.1017\/s1351324921000310","type":"journal-article","created":{"date-parts":[[2021,11,2]],"date-time":"2021-11-02T09:41:38Z","timestamp":1635846098000},"page":"794-823","update-policy":"https:\/\/doi.org\/10.1017\/policypage","source":"Crossref","is-referenced-by-count":9,"title":["Sentence encoding for Dialogue Act classification"],"prefix":"10.1017","volume":"29","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6084-4406","authenticated-orcid":false,"given":"Nathan","family":"Duran","sequence":"first","affiliation":[]},{"given":"Steve","family":"Battle","sequence":"additional","affiliation":[]},{"given":"Jim","family":"Smith","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2021,11,2]]},"reference":[{"key":"S1351324921000310_ref11","doi-asserted-by":"crossref","unstructured":"Cer, D. , Yang, Y. , Kong, S.-Y. , Hua, N. , Limtiaco, N. , John, R.S. , Constant, N. , Guajardo-Cespedes, M. , Yuan, S. , Tar, C. , Sung, Y.-H. , Strope, B. and Kurzweil, R. (2018). Universal Sentence Encoder. arXiv.","DOI":"10.18653\/v1\/D18-2029"},{"key":"S1351324921000310_ref49","unstructured":"Louwerse, M. and Crossley, S. (2006). Dialog act classification using N-gram algorithms. In FLAIRS Conference 2006, Melbourne Beach, Australia, pp. 
758\u2013763."},{"key":"S1351324921000310_ref35","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-1062"},{"key":"S1351324921000310_ref55","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1162"},{"key":"S1351324921000310_ref7","first-page":"1137","article-title":"A neural probabilistic language model","volume":"3","author":"Bengio","year":"2003","journal-title":"Journal of Machine Learning Research"},{"key":"S1351324921000310_ref50","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D15-1166"},{"key":"S1351324921000310_ref2","unstructured":"Amanova, D. , Petukhova, V. and Klakow, D. (2016). Creating annotated dialogue resources: Cross-domain dialogue act classification. In 9th International Conference on Language Resources and Evaluation, pp. 111\u2013117."},{"key":"S1351324921000310_ref73","doi-asserted-by":"publisher","DOI":"10.1109\/BigData.2018.8622245"},{"key":"S1351324921000310_ref8","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2018-2527"},{"key":"S1351324921000310_ref75","unstructured":"Webb, N. and Hepple, M. (2005). Dialogue act classification based on intra-utterance features. In Proceedings of the AAAI Workshop on Spoken Language Understanding."},{"key":"S1351324921000310_ref32","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/E17-2068"},{"key":"S1351324921000310_ref69","first-page":"25","article-title":"The HCRC map task corpus: Natural dialogue for speech recognition","volume":"34","author":"Thompson","year":"1991","journal-title":"Language and Speech"},{"key":"S1351324921000310_ref30","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"S1351324921000310_ref41","unstructured":"Lan, Z. , Chen, M. , Goodman, S. , Gimpel, K. , Sharma, P. and Soricut, R. (2019). ALBERT: A lite BERT for self-supervised learning of language representations. In ICLR 2020."},{"key":"S1351324921000310_ref62","unstructured":"Rojas-Barahona, L.M. , Gasic, M. , Mrk\u0160i\u0106, N. , Su, P.-H. , Ultes, S. , Wen, T.-H. and Young, S. 
(2016). Exploiting sentence and context representations in deep neural models for spoken language understanding. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics, Osaka, Japan, pp. 258\u2013267."},{"key":"S1351324921000310_ref14","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/W14-4012"},{"key":"S1351324921000310_ref57","unstructured":"Radford, A. , Jozefowicz, R. and Sutskever, I. (2017). Learning to generate reviews and discovering sentiment. arXiv."},{"key":"S1351324921000310_ref45","doi-asserted-by":"crossref","unstructured":"Li, R. , Lin, C. , Collinson, M. , Li, X. and Chen, G. (2018). A dual-attention hierarchical recurrent neural network for dialogue act classification. arXiv.","DOI":"10.18653\/v1\/K19-1036"},{"key":"S1351324921000310_ref37","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1181"},{"key":"S1351324921000310_ref21","doi-asserted-by":"publisher","DOI":"10.1162\/089976698300017197"},{"key":"S1351324921000310_ref61","unstructured":"Ribeiro, E. , Ribeiro, R. and Martins De Matos, D. (2015). The influence of context on dialog act recognition. arXiv."},{"key":"S1351324921000310_ref44","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-2050"},{"key":"S1351324921000310_ref19","first-page":"73","article-title":"Sentence length in education research articles: A comparison between anglophone and Turkish authors","volume":"13","author":"Deveci","year":"2019","journal-title":"Linguistics Journal"},{"key":"S1351324921000310_ref36","unstructured":"Keizer, S. (2001). A Bayesian approach to dialogue act classification. In BI-DIALOG 2001: Proc. of the 5th Workshop on Formal Semantics and Pragmatics of Dialogue, pp. 210\u2013218."},{"key":"S1351324921000310_ref15","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1179"},{"key":"S1351324921000310_ref10","unstructured":"Bouckaert, R.R. (2003). Choosing between two learning algorithms based on calibrated tests. 
In Proceedings, Twentieth International Conference on Machine Learning, volume 1, pp. 51\u201358."},{"key":"S1351324921000310_ref26","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.1992.225858"},{"key":"S1351324921000310_ref51","unstructured":"Mikolov, T. , Yih, W.-T. and Zweig, G. (2013). Linguistic regularities in continuous space word representations. Proceedings of NAACL-HLT (June), 746\u2013751."},{"key":"S1351324921000310_ref3","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-5946"},{"key":"S1351324921000310_ref59","doi-asserted-by":"publisher","DOI":"10.1109\/ICSLP.1996.607446"},{"key":"S1351324921000310_ref18","first-page":"1","article-title":"Statistical comparisons of classifiers over multiple data sets","volume":"7","author":"Dem\u0161ar","year":"2006","journal-title":"Journal of Machine Learning Research"},{"key":"S1351324921000310_ref13","doi-asserted-by":"publisher","DOI":"10.1145\/3209978.3209997"},{"key":"S1351324921000310_ref29","doi-asserted-by":"crossref","unstructured":"Henderson, M. , Casanueva, I. , Mrk\u0160i\u0106, N. , Su, P.-H. , Tsung, Hsien and Vuli\u0106, I. (2019). ConveRT: Efficient and accurate conversational representations from transformers. arXiv.","DOI":"10.18653\/v1\/2020.findings-emnlp.196"},{"key":"S1351324921000310_ref42","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-3002"},{"key":"S1351324921000310_ref64","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9781139173438"},{"key":"S1351324921000310_ref46","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2017.7953251"},{"key":"S1351324921000310_ref56","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1202"},{"key":"S1351324921000310_ref4","volume-title":"How To Do Things With Words","author":"Austin","year":"1962"},{"key":"S1351324921000310_ref16","doi-asserted-by":"crossref","unstructured":"Cuay\u00e1huitl, H. , Yu, S. , Williamson, A. and Carse, J. (2016). Deep reinforcement learning for multi-domain dialogue systems. 
In NIPS Workshop on Deep Reinforcement Learning, Barcelona, Spain, pp. 1\u20139.","DOI":"10.1109\/IJCNN.2017.7966275"},{"key":"S1351324921000310_ref78","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-demos.30"},{"key":"S1351324921000310_ref53","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W17-5530"},{"key":"S1351324921000310_ref34","unstructured":"Kalchbrenner, N. and Blunsom, P. (2013). Recurrent convolutional neural networks for discourse compositionality. In Proceedings of the Workshop on Continuous Vector Space Models and their Compositionality, Sofia, Bulgaria: Association for Computational Linguistics, pp. 119\u2013126."},{"key":"S1351324921000310_ref58","unstructured":"Radford, A. , Wu, J. , Child, R. , Luan, D. , Amodei, D. and Sutskever, I. (2019). Language models are unsupervised multitask learners. arXiv."},{"key":"S1351324921000310_ref38","unstructured":"Krause, B. , Lu, L. , Murray, I. and Renals, S. (2016). Multiplicative LSTM for sequence modelling. In ICLR 2017, pp. 1\u201311."},{"key":"S1351324921000310_ref22","unstructured":"Dubay, W.H. (2004). The Principles of Readability. Technical report."},{"key":"S1351324921000310_ref23","doi-asserted-by":"publisher","DOI":"10.3390\/app10103386"},{"key":"S1351324921000310_ref52","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2019.8682881"},{"key":"S1351324921000310_ref20","unstructured":"Devlin, J. , Chang, M.W. , Lee, K. and Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL HLT 2019-2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, volume 1, pp. 4171\u20134186."},{"key":"S1351324921000310_ref33","unstructured":"Jurafsky, D. , Shriberg, E. and Biasca, D. (1997). Switchboard SWBD-DAMSL Shallow-Discourse-Function Annotation Coders Manual. Technical report."},{"key":"S1351324921000310_ref48","unstructured":"Liu, Y. 
, Ott, M. , Goyal, N. , Du, J. , Joshi, M. , Chen, D. , Levy, O. , Lewis, M. , Zettlemoyer, L. , Stoyanov, V. and Allen, P.G. (2019). RoBERTa: A robustly optimized BERT pretraining approach. arXiv."},{"key":"S1351324921000310_ref43","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N16-1062"},{"key":"S1351324921000310_ref68","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2006-535"},{"key":"S1351324921000310_ref6","first-page":"1","article-title":"Time for a change: A tutorial for comparing multiple classifiers through Bayesian analysis","volume":"18","author":"Benavoli","year":"2017","journal-title":"Journal of Machine Learning Research"},{"key":"S1351324921000310_ref71","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/E17-1041"},{"key":"S1351324921000310_ref77","article-title":"XLNet: Generalized autoregressive pretraining for language understanding","volume":"32","author":"Yang","year":"2019","journal-title":"In Advances in Neural Information Processing Systems"},{"key":"S1351324921000310_ref28","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2008.04.001"},{"key":"S1351324921000310_ref31","doi-asserted-by":"crossref","unstructured":"Ji, Y. , Haffari, G. and Eisenstein, J. (2016). A latent variable recurrent neural network for discourse relation language models. In NAACL-HLT 2016, San Diego, California: Association for Computational Linguistics, pp. 332\u2013342.","DOI":"10.18653\/v1\/N16-1037"},{"key":"S1351324921000310_ref12","doi-asserted-by":"publisher","DOI":"10.1016\/j.csl.2017.07.009"},{"key":"S1351324921000310_ref72","unstructured":"Vaswani, A. , Shazeer, N. , Parmar, N. , Uszkoreit, J. , Jones, L. , Gomez, A.N. , Kaiser, L. and Polosukhin, I. (2017). Attention is all you need. 
In 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA."},{"key":"S1351324921000310_ref17","volume-title":"Oxford Guide to Plain English","author":"Cutts","year":"2013"},{"key":"S1351324921000310_ref63","doi-asserted-by":"publisher","DOI":"10.1023\/A:1009752403260"},{"key":"S1351324921000310_ref1","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331375"},{"key":"S1351324921000310_ref24","doi-asserted-by":"crossref","unstructured":"Firdaus, M. , Golchha, H. , Ekbal, A. , and Bhattacharyya, P. (2020). A deep multi-task model for dialogue act classification, intent detection and slot filling. Cognitive Computation.","DOI":"10.1007\/s12559-020-09718-4"},{"key":"S1351324921000310_ref76","doi-asserted-by":"crossref","unstructured":"Wen, T.-H. , Ga\u0161i\u0107, M. , Mrk\u0161i\u0107, N. , Rojas-Barahona, L.M. , Su, P.-H. , Ultes, S. , Vandyke, D. and Young, S. (2016). A network-based end-to-end trainable task-oriented dialogue system. In Proceedings of EACL 2017.","DOI":"10.18653\/v1\/E17-1042"},{"key":"S1351324921000310_ref65","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2016-1359"},{"key":"S1351324921000310_ref9","unstructured":"Bothe, C. , Weber, C. , Magg, S. and Wermter, S. (2018b). A context-based approach for dialogue act recognition using simple recurrent neural networks. In Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan."},{"key":"S1351324921000310_ref54","doi-asserted-by":"publisher","DOI":"10.21437\/SemDial.2017-9"},{"key":"S1351324921000310_ref5","unstructured":"Bahdanau, D. , Cho, K. and Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In ICLR 2015."},{"key":"S1351324921000310_ref39","unstructured":"Kumar, H. , Agarwal, A. , Dasgupta, R. , Joshi, S. and Kumar, A. (2017). Dialogue act sequence labeling using hierarchical encoder with CRF. In The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18). 
AAAI, pp. 3440\u20133447."},{"key":"S1351324921000310_ref25","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W15-4648"},{"key":"S1351324921000310_ref40","doi-asserted-by":"crossref","unstructured":"Lai, S. , Xu, L. , Liu, K. and Zhao, J. (2015). Recurrent convolutional neural networks for text classification. In Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI\u201915), pp. 2267\u20132273. AAAI.","DOI":"10.1609\/aaai.v29i1.9513"},{"key":"S1351324921000310_ref66","unstructured":"Speer, R. , Chin, J. and Havasi, C. (2016). ConceptNet 5.5: An open multilingual graph of general knowledge. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17) ConceptNet, pp. 4444\u20134451."},{"key":"S1351324921000310_ref67","doi-asserted-by":"publisher","DOI":"10.1162\/089120100561737"},{"key":"S1351324921000310_ref27","unstructured":"Grau, S. , Sanchis, E. , Castro, M. and Vilar, D. (2004). Dialogue act classification using a Bayesian approach. In 9th Conference Speech and Computer, St. 
Petersburg, Russia."},{"key":"S1351324921000310_ref74","doi-asserted-by":"publisher","DOI":"10.1145\/3386252"},{"key":"S1351324921000310_ref47","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1187"},{"key":"S1351324921000310_ref70","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-2083"},{"key":"S1351324921000310_ref60","doi-asserted-by":"publisher","DOI":"10.1613\/jair.1.11594"}],"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324921000310","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,19]],"date-time":"2023-05-19T07:31:33Z","timestamp":1684481493000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324921000310\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,2]]},"references-count":78,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2023,5]]}},"alternative-id":["S1351324921000310"],"URL":"https:\/\/doi.org\/10.1017\/s1351324921000310","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"value":"1351-3249","type":"print"},{"value":"1469-8110","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,11,2]]},"assertion":[{"value":"\u00a9 The Author(s), 2021. 
Published by Cambridge University Press","name":"copyright","label":"Copyright","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}},{"value":"This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https:\/\/creativecommons.org\/licenses\/by\/4.0\/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.","name":"license","label":"License","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}},{"value":"This content has been made available to all.","name":"free","label":"Free to read"}]}}