{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,13]],"date-time":"2026-07-13T12:25:33Z","timestamp":1783945533891,"version":"3.55.0"},"reference-count":44,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,8,14]],"date-time":"2024-08-14T00:00:00Z","timestamp":1723593600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,8,14]],"date-time":"2024-08-14T00:00:00Z","timestamp":1723593600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Urban Info"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Building task-oriented dialogue systems has become a topic of interest in the research community and industry. The task-oriented dialogue system is a closed-domain dialogue system that can perform specific tasks for users. The natural language understanding module of a task-oriented dialogue system is crucial because it is related to a task-oriented dialogue system that provides correctional services for users. The natural language understanding module of a task-oriented dialogue system performs two tasks: intent detection and slot filling. The intent detection task can be regarded as a text classification task; a classification model is trained to predict the intention of the user from the user\u2019s input information. The slot filling task can be regarded as a sequence analysis task; a sequence analysis model is trained to predict the details of the user\u2019s intention. In this paper, we proposed a novel model based on a transformer encoder for intent detection and slot filling. 
It follows the encoder-decoder structure, including a vanilla Transformer encoder, a bidirectional LSTM encoder, a linear classification decoder for intent detection, and a conditional random field decoder for slot filling. The experimental results on two public datasets show that our proposed model outperforms the existing methods based on the Transformer and can be combined with BERT to achieve better intent detection and slot filling results.<\/jats:p>","DOI":"10.1007\/s44212-024-00056-6","type":"journal-article","created":{"date-parts":[[2024,8,14]],"date-time":"2024-08-14T08:02:01Z","timestamp":1723622521000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["A novel model based on a transformer for intent detection and slot filling"],"prefix":"10.1007","volume":"3","author":[{"given":"Dapeng","family":"Li","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5326-7209","authenticated-orcid":false,"given":"Shuliang","family":"Wang","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Boxiang","family":"Zhao","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zhiqiang","family":"Ma","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Leixiao","family":"Li","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2024,8,14]]},"reference":[{"key":"56_CR1","first-page":"06450","volume":"1607","author":"J Ba","year":"2016","unstructured":"Ba, J., Kiros, J., & Hinton, G. (2016). Layer Normalization. Arxiv Preprint arXiv, 1607, 06450.","journal-title":"Layer Normalization. Arxiv Preprint arXiv"},{"key":"56_CR2","unstructured":"Chen, Q., Zhuo, Z., Wang, W. (2019). 
BERT for joint intent classification and slot filling. arXiv preprint arXiv: 1902.10909"},{"key":"56_CR3","unstructured":"Coucke, A., Saade, A., Ball, A., Bluche, T., Caulier, A., Leroy, D., Doumouro, C., Gisselbrecht, T., Caltagirone, F., Lavril, T., Primet, M., Dureau, J. (2018). Snips voice platform: an embedded spoken language understanding system for private-by-design voice interfaces. arXiv preprint arXiv: 1805.10190."},{"key":"56_CR4","first-page":"4171","volume":"1","author":"J Devlin","year":"2019","unstructured":"Devlin, J., Chang, M., Lee, K., & Toutanova, K. (2019). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv, 1, 4171\u20134186.","journal-title":"arXiv preprint arXiv"},{"key":"56_CR5","doi-asserted-by":"crossref","unstructured":"E, H., Niu, P., Chen, Z., Song, M.: A novel bi-directional interrelated model for joint intent detection and slot filling. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 5467\u20135471. ACL, Florence, Italy (2019).","DOI":"10.18653\/v1\/P19-1544"},{"key":"56_CR6","doi-asserted-by":"crossref","unstructured":"Gao, S., Takanobu, R., Peng, W., Liu, Q., Huang, M.: HyKnow: end-to-end task-oriented dialog modeling with hybrid knowledge management. In: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pp. 1591\u20131602, ACL, Online (2021).","DOI":"10.18653\/v1\/2021.findings-acl.139"},{"key":"56_CR7","doi-asserted-by":"crossref","unstructured":"Goo, C., Gao G., Hsu, Y., Huo, C., Chen, T., Hsu, K., Chen, Y.: Slot-gated modeling for joint slot filling and intent prediction. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2, pp. 753\u2013757. 
ACL, New Orleans, Louisiana (2018).","DOI":"10.18653\/v1\/N18-2118"},{"key":"56_CR8","doi-asserted-by":"crossref","unstructured":"Gunaratna, K., Srinivasan, V., Yerukola, A., Jin, H.: Explainable slot type attentions to improve joint intent detection and slot filling. In: Findings of the Association for Computational Linguistics: EMNLP 2022, pp. 3367\u20133378, ACL, Abu Dhabi, United Arab Emirates (2022).","DOI":"10.18653\/v1\/2022.findings-emnlp.245"},{"key":"56_CR9","doi-asserted-by":"crossref","unstructured":"Hakkani-T\u00fcr, D., Tur, G., Celikyilmaz, A., Chen, Y.V., Gao, J., Deng, L., Wang, Y.: Multi-domain joint semantic frame parsing using bi-directional RNN-LSTM. In: Proc. Interspeech 2016, pp. 715\u2013719. ISCA, San Francisco, USA (2016).","DOI":"10.21437\/Interspeech.2016-402"},{"key":"56_CR10","doi-asserted-by":"crossref","unstructured":"Hashimoto, K., Xiong, C., Tsuruoka, Y., Socher, R.: A joint many-task model: growing a neural network for multiple NLP tasks. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 1923\u20131933. ACL, Copenhagen, Denmark (2017).","DOI":"10.18653\/v1\/D17-1206"},{"key":"56_CR11","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognitions, pp. 770\u2013778. IEEE, Las Vegas, NV, USA (2016).","DOI":"10.1109\/CVPR.2016.90"},{"key":"56_CR12","doi-asserted-by":"crossref","unstructured":"Hemphill, T., Godfrey, J., Doddington, G.: The ATIS spoken language systems pilot corpus. In: Proceedings of the DARPA Speech and Natural Language Workshop, pp. 96\u2013101, Morgan Kaufmann, Hidden Valley, PA, USA (1990).","DOI":"10.3115\/116580.116613"},{"key":"56_CR13","unstructured":"Hinton, G., Srivastava, N., Krizhevsky, A., Sutskever, I., Salakhutdinov, R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. 
arXiv preprint arXiv: 1207.0580"},{"key":"56_CR14","unstructured":"Huang, Z., Xu, P., Liang, D., Mishra, A., Xiang, B. (2020). TRANS-BLSTM: transformer with bidirectional LSTM for language understanding. arXiv preprint arXiv: 2003.07000"},{"key":"56_CR15","unstructured":"Kingma, D.P., Ba, J. (2015). Adam: a method for stochastic optimization. In: Third International Conference on Learning Representations, San Diego, CA, USA"},{"key":"56_CR16","doi-asserted-by":"crossref","unstructured":"Li, C., Li, L., Qi, J.: A self-attentive model with gate mechanism for spoken language understanding. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 3824\u20133833. ACL, Brussels, Belgium (2018).","DOI":"10.18653\/v1\/D18-1417"},{"key":"56_CR17","doi-asserted-by":"crossref","unstructured":"Liu, B., Lane, I.: Attention-based recurrent neural network models for joint intent detection and slot filling. In: Proc. Interspeech 2016, pp. 685\u2013689, ISCA, San Francisco, CA, USA (2016).","DOI":"10.21437\/Interspeech.2016-1352"},{"key":"56_CR18","doi-asserted-by":"crossref","unstructured":"Liu, J., Takanobu, R., Wen, J., Wan, D., Li, H., Nie, W., Li, C., Peng, W., Huang, M.: Robustness testing of language understanding in task-oriented dialog. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 2467\u20132480, ACL, Online (2021).","DOI":"10.18653\/v1\/2021.acl-long.192"},{"key":"56_CR19","doi-asserted-by":"crossref","unstructured":"Liu, Y., Meng, F., Zhang, J., Zhou, J., Chen, Y., Xu, J.: CM-Net: a novel collaborative memory network for spoken language understanding. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, pp. 1051\u20131060. 
ACL, Hong Kong, China (2019).","DOI":"10.18653\/v1\/D19-1097"},{"key":"56_CR20","doi-asserted-by":"crossref","unstructured":"Mesnil, G., He, X., Deng, L., Bengio, Y.: Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding. In: Proc. Interspeech 2013, pp. 3771\u20133775, ISCA, Lyon, France (2013).","DOI":"10.21437\/Interspeech.2013-596"},{"key":"56_CR21","doi-asserted-by":"crossref","unstructured":"Mi, F., Chen, L., Zhao, M., Huang, M., Faltings, B.: Continual learning for natural language generation in task-oriented dialog systems. In: Findings of the Association for Computational Linguistics: EMNLP 2020, pp. 3461\u20133474, ACL, Online (2020).","DOI":"10.18653\/v1\/2020.findings-emnlp.310"},{"key":"56_CR22","doi-asserted-by":"crossref","unstructured":"Mi, F., Huang, M., Zhang, J., Faltings, B.: Meta-learning for low-resource natural language generation in task-oriented dialogue systems. In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19), pp. 3151\u20133157, International Joint Conferences on Artificial Intelligence Organization, Macao, China (2019).","DOI":"10.24963\/ijcai.2019\/437"},{"key":"56_CR23","doi-asserted-by":"crossref","unstructured":"Mi, F., Zhou, W., Cai, F., Kong, L., Huang, M., Faltings, B.: Self-training improves pre-training for few-shot learning in task-oriented dialog systems. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 1887\u20131898, ACL, Online and Punta Cana, Dominican Republic (2021).","DOI":"10.18653\/v1\/2021.emnlp-main.142"},{"key":"56_CR24","unstructured":"Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., Chintala, S.: PyTorch: an imperative style, high-performance deep learning library. 
In: Advances in Neural Information Processing Systems 32, pp. 8026\u20138037. Curran Associates, Vancouver, Canada (2019)."},{"key":"56_CR25","doi-asserted-by":"crossref","unstructured":"Pennington, J., Socher, R., Manning, C.: GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1532\u20131543. ACL, Doha, Qatar (2014).","DOI":"10.3115\/v1\/D14-1162"},{"key":"56_CR26","doi-asserted-by":"crossref","unstructured":"Peters, M., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227\u20132237, ACL, New Orleans, Louisiana, USA (2018).","DOI":"10.18653\/v1\/N18-1202"},{"key":"56_CR27","doi-asserted-by":"crossref","unstructured":"Qin, L., Che, W., Li, Y., Wen, H., Liu, T.: A stack-propagation framework with token-level intent detection for spoken language understanding. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, pp. 2078\u20132087. ACL, Hong Kong, China (2019).","DOI":"10.18653\/v1\/D19-1214"},{"key":"56_CR28","doi-asserted-by":"crossref","unstructured":"Qin, L., Liu, T., Che, W., Kang, B., Zhao, S., Liu, T.: A co-interactive transformer for joint slot filling and intent detection. In: 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8193\u20138197. IEEE, Virtual Conference (2021).","DOI":"10.1109\/ICASSP39728.2021.9414110"},{"key":"56_CR29","doi-asserted-by":"crossref","unstructured":"Siddhant, A, Goyal, A., Metallinou, A.: Unsupervised transfer learning for spoken language understanding in intelligent agents. 
In: Proceedings of the 33rd AAAI Conference on Artificial Intelligence, pp. 4959\u20134966. AAAI Press, Honolulu, Hawaii, USA (2019).","DOI":"10.1609\/aaai.v33i01.33014959"},{"key":"56_CR30","doi-asserted-by":"crossref","unstructured":"Takanobu, R., Liang, R., Huang, M.: Multi-agent task-oriented dialog policy learning with role-aware reward decomposition. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 625\u2013638, ACL, Online (2020).","DOI":"10.18653\/v1\/2020.acl-main.59"},{"key":"56_CR31","doi-asserted-by":"crossref","unstructured":"Takanobu, R., Zhu, H., Huang, M.: Guided dialog policy learning: reward estimation for multi-domain task-oriented dialog. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 100\u2013110, ACL, Hong Kong, China (2019).","DOI":"10.18653\/v1\/D19-1010"},{"key":"56_CR32","doi-asserted-by":"crossref","unstructured":"Tur, G., Hakkani-T\u00fcr, D., Heck, L.: What is left to be understood in ATIS? In: 2010 IEEE Spoken Language Technology Workshop, pp. 19\u201324. IEEE, Berkeley, CA, USA (2010).","DOI":"10.1109\/SLT.2010.5700816"},{"key":"56_CR33","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6000\u20136010. Curran Associates, Long Beach, California, USA (2017)."},{"key":"56_CR34","doi-asserted-by":"publisher","DOI":"10.1108\/DTA-07-2022-0281","author":"H Wang","year":"2023","unstructured":"Wang, H., Yang, D., Guo, L., & Zhang, X. (2023). Joint modeling method of question intent detection and slot filling for domain-oriented question answering system. Data Technologies and Applications. 
https:\/\/doi.org\/10.1108\/DTA-07-2022-0281","journal-title":"Data Technologies and Applications"},{"key":"56_CR35","doi-asserted-by":"crossref","unstructured":"Wang, J., Wei, K., Radfar, M., Zhang, W., Chung, C.: Encoding syntactic knowledge in transformer encoder for intent detection and slot filling. In: Proceedings of the 35th AAAI Conference on Artificial Intelligence, pp. 13943\u201313951. AAAI Press, Virtual Conference (2021).","DOI":"10.1609\/aaai.v35i16.17642"},{"issue":"2","key":"56_CR36","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1177\/1729881419841930","volume":"16","author":"S Wang","year":"2019","unstructured":"Wang, S., Li, D., Geng, J., Yang, L., & Dai, T. (2019). Learning bi-utterance for multi-turn response selection in retrieval-based chatbots. International Journal of Advanced Robotic Systems, 16(2), 1\u201310.","journal-title":"International Journal of Advanced Robotic Systems"},{"issue":"4","key":"56_CR37","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1177\/1729881420953006","volume":"17","author":"S Wang","year":"2020","unstructured":"Wang, S., Li, D., Geng, J., Yang, L., & Leng, H. (2020). Learning to balance the coherence and diversity of response generation in generation-based chatbots. International Journal of Advanced Robotic Systems, 17(4), 1\u201311.","journal-title":"International Journal of Advanced Robotic Systems"},{"key":"56_CR38","doi-asserted-by":"crossref","unstructured":"Wang, Y., Shen, Y., Jin, H.: A bi-model based RNN semantic frame parsing model for intent detection and slot filling. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2, pp. 309\u2013314. ACL, New Orleans, Louisiana (2018).","DOI":"10.18653\/v1\/N18-2050"},{"key":"56_CR39","doi-asserted-by":"crossref","unstructured":"Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P. (2020). 
HuggingFace's Transformers: state-of-the-art natural language processing. arXiv preprint arXiv: 1910.03771","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"56_CR40","doi-asserted-by":"crossref","unstructured":"Zhang, C., Li, Y., Du, N., Fan, W., Yu, P.: Joint slot filling and intent detection via capsule neural networks. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 5259\u20135267. ACL, Florence, Italy (2019).","DOI":"10.18653\/v1\/P19-1519"},{"key":"56_CR41","doi-asserted-by":"crossref","unstructured":"Zhang, L., Ma, D., Zhang, X., Yan, X., Wang, H.: Graph LSTM with context-gated mechanism for spoken language understanding. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence, pp. 9539\u20139546. AAAI Press, New York, USA (2020).","DOI":"10.1609\/aaai.v34i05.6499"},{"issue":"3","key":"56_CR42","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3317612","volume":"37","author":"Z Zhang","year":"2019","unstructured":"Zhang, Z., Huang, M., Zhao, Z., Ji, F., Chen, H., & Zhu, X. (2019b). Memory-augmented dialogue management for task-oriented dialogue systems. ACM Trans. Inf. Syst., 37(3), 1\u201330.","journal-title":"ACM Trans. Inf. Syst."},{"key":"56_CR43","doi-asserted-by":"publisher","first-page":"2011","DOI":"10.1007\/s11431-020-1692-3","volume":"63","author":"Z Zhang","year":"2020","unstructured":"Zhang, Z., Takanobu, R., Zhu, Q., Huang, M., & Zhu, X. (2020b). Recent advances and challenges in task-oriented dialog systems. Sci China Tech Sci, 63, 2011\u20132027.","journal-title":"Sci China Tech Sci"},{"key":"56_CR44","doi-asserted-by":"crossref","unstructured":"Zhu, S., Yu, K.: Encoder-decoder with focus-mechanism for sequence labelling based spoken language understanding. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 
5675\u20135679, IEEE, New Orleans, LA, USA (2017).","DOI":"10.1109\/ICASSP.2017.7953243"}],"container-title":["Urban Informatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44212-024-00056-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s44212-024-00056-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44212-024-00056-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,14]],"date-time":"2024-08-14T08:26:16Z","timestamp":1723623976000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s44212-024-00056-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,14]]},"references-count":44,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["56"],"URL":"https:\/\/doi.org\/10.1007\/s44212-024-00056-6","relation":{},"ISSN":["2731-6963"],"issn-type":[{"value":"2731-6963","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,8,14]]},"assertion":[{"value":"11 August 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 July 2024","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 August 2024","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 August 2024","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not 
applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare no conflict of interest.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"24"}}