{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:27:47Z","timestamp":1750220867017,"version":"3.41.0"},"reference-count":38,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2020,9,27]],"date-time":"2020-09-27T00:00:00Z","timestamp":1601164800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["U1936216, U1936208, U1836204, U1705261"],"award-info":[{"award-number":["U1936216, U1936208, U1836204, U1705261"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"National Key Research and Development Program of China","award":["2018YFC1604002"],"award-info":[{"award-number":["2018YFC1604002"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2020,11,30]]},"abstract":"<jats:p>Chatbots such as Xiaoice have gained huge popularity in recent years. Users frequently mention their favorite works such as songs and movies in conversations with chatbots. Detecting these entities can help design better chat strategies and improve user experience. Existing named entity recognition methods are mainly designed for formal texts, and their performance on the informal chatbot conversation texts may not be optimal. In addition, these methods rely on massive manually annotated data for model training. In this article, we propose a neural approach to detect entities of works for Chinese chatbot. 
Our approach is based on a language model (LM) long short-term memory (LSTM) convolutional neural network (CNN) conditional random field (CRF), or LM-LSTM-CNN-CRF, framework, which contains a language model to generate context-aware character embeddings, a Bi-LSTM network to learn contextual character representations from global contexts, a CNN to learn character representations from local contexts, and a CRF layer to jointly decode the character label sequence. In addition, we propose an automatic text annotation method via quote marks to reduce the effort of manual annotation. Besides, we propose an iterative data purification method to improve the quality of the automatically constructed labeled data. Extensive experiments on a real-world dataset validate that our approach can achieve good performance on entity detection for Chinese chatbots.<\/jats:p>","DOI":"10.1145\/3414901","type":"journal-article","created":{"date-parts":[[2020,9,27]],"date-time":"2020-09-27T22:05:45Z","timestamp":1601244345000},"page":"1-13","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Detecting Entities of Works for Chinese Chatbot"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5730-8792","authenticated-orcid":false,"given":"Chuhan","family":"Wu","sequence":"first","affiliation":[{"name":"Department of Electronic Engineering and BNRist, Tsinghua University, Beijing, China"}]},{"given":"Fangzhao","family":"Wu","sequence":"additional","affiliation":[{"name":"Microsoft Research Asia, Beijing, China"}]},{"given":"Tao","family":"Qi","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering and BNRist, Tsinghua University, Beijing, China"}]},{"given":"Junxin","family":"Liu","sequence":"additional","affiliation":[{"name":"ByteDance Ltd., Beijing, China"}]},{"given":"Yongfeng","family":"Huang","sequence":"additional","affiliation":[{"name":"Department of 
Electronic Engineering and BNRist, Tsinghua University, Beijing, China"}]},{"given":"Xing","family":"Xie","sequence":"additional","affiliation":[{"name":"Microsoft Research Asia, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2020,9,27]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"crossref","unstructured":"John Blitzer Ryan McDonald and Fernando Pereira. 2006. Domain adaptation with structural correspondence learning. In EMNLP. ACM 120--128.","DOI":"10.3115\/1610075.1610094"},{"key":"e_1_2_1_2_1","unstructured":"Rich Caruana Steve Lawrence and C. Lee Giles. 2001. Overfitting in neural nets: Backpropagation conjugate gradient and early stopping. In NIPS. 402--408."},{"volume-title":"Proceedings of the 5th SIGHAN Workshop on Chinese Language Processing. 173--176","year":"2006","author":"Chen Aitao","key":"e_1_2_1_3_1"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00104"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3015467"},{"key":"e_1_2_1_6_1","unstructured":"Yann Dauphin Harm de Vries and Yoshua Bengio. 2015. Equilibrated adaptive learning rates for non-convex optimization. In NIPS. 1504--1512."},{"volume-title":"Analysis of named entity recognition and linking for tweets. Information Processing & Management 51, 2","year":"2015","author":"Derczynski Leon","key":"e_1_2_1_7_1"},{"volume-title":"BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL-HLT. 
4171--4186.","year":"2019","author":"Devlin Jacob","key":"e_1_2_1_8_1"},{"volume-title":"Multichannel LSTM-CRF for named entity recognition in Chinese social media","author":"Dong Chuanhai","key":"e_1_2_1_9_1"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-50496-4_20"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W15-3904"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2818717"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1162\/089120105775299177"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2005.06.042"},{"key":"e_1_2_1_15_1","unstructured":"Hangfeng He and Xu Sun. 2017. A unified model for cross-domain and semi-supervised named entity recognition in Chinese social media. In AAAI. 3216--3222."},{"volume-title":"Bidirectional LSTM-CRF models for sequence tagging. arXiv preprint arXiv:1508.01991","year":"2015","author":"Huang Zhiheng","key":"e_1_2_1_16_1"},{"key":"e_1_2_1_17_1","first-page":"8","article-title":"Urdu named entity recognition: Corpus generation and deep learning applications","volume":"19","author":"Kanwal Safia","year":"2019","journal-title":"TALLIP"},{"volume-title":"Pereira","year":"2001","author":"Lafferty John D.","key":"e_1_2_1_18_1"},{"key":"e_1_2_1_19_1","doi-asserted-by":"crossref","unstructured":"Siwei Lai Liheng Xu Kang Liu and Jun Zhao. 2015. Recurrent convolutional neural networks for text classification. In AAAI.","DOI":"10.1609\/aaai.v29i1.9513"},{"key":"e_1_2_1_20_1","doi-asserted-by":"crossref","unstructured":"Guillaume Lample Miguel Ballesteros Sandeep Subramanian Kazuya Kawakami and Chris Dyer. 2016. 
Neural architectures for named entity recognition. In NAACL. 260--270.","DOI":"10.18653\/v1\/N16-1030"},{"key":"e_1_2_1_21_1","unstructured":"Shuying Lin Huosheng Xie Liang-Chih Yu and K. Robert Lai. 2017. SentiNLP at IJCNLP-2017 task 4: Customer feedback analysis using a Bi-LSTM-CNN model. In IJCNLP Shared Tasks. 149--154."},{"key":"e_1_2_1_22_1","unstructured":"Yankai Lin Shiqi Shen Zhiyuan Liu Huanbo Luan and Maosong Sun. 2016. Neural relation extraction with selective attention over instances. In ACL. 2124--2133."},{"key":"e_1_2_1_23_1","doi-asserted-by":"crossref","unstructured":"Zhangxun Liu Conghui Zhu and Tiejun Zhao. 2010. Chinese named entity recognition with a sequence labeling approach: Based on characters or based on words? In Advanced Intelligent Computing Theories and Applications with Aspects of Artificial Intelligence. Springer 634--640.","DOI":"10.1007\/978-3-642-14932-0_78"},{"key":"e_1_2_1_24_1","doi-asserted-by":"crossref","unstructured":"Gang Luo Xiaojiang Huang Chin-Yew Lin and Zaiqing Nie. 2015. Joint entity recognition and disambiguation. 
In EMNLP. 879--888.","DOI":"10.18653\/v1\/D15-1104"},{"key":"e_1_2_1_25_1","doi-asserted-by":"crossref","unstructured":"Wencan Luo and Fan Yang. 2016. An empirical study of automatic Chinese word segmentation for spoken language understanding and named entity recognition. In NAACL. 238--248.","DOI":"10.18653\/v1\/N16-1028"},{"volume-title":"Hovy","year":"2016","author":"Ma Xuezhe","key":"e_1_2_1_26_1"},{"volume-title":"Toward mention detection robustness with recurrent neural networks. arXiv preprint arXiv:1602.07749","year":"2016","author":"Nguyen Thien Huu","key":"e_1_2_1_27_1"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D15-1064"},{"volume-title":"Improving named entity recognition for Chinese social media with word segmentation representation learning. arXiv preprint arXiv:1603.00786","year":"2016","author":"Peng Nanyun","key":"e_1_2_1_29_1"},{"key":"e_1_2_1_30_1","doi-asserted-by":"crossref","unstructured":"Matthew Peters Waleed Ammar Chandra Bhagavatula and Russell Power. 2017. Semi-supervised sequence tagging with bidirectional language models. In ACL. 1756--1765.","DOI":"10.18653\/v1\/P17-1161"},{"key":"e_1_2_1_31_1","first-page":"1498","article-title":"Embedding semantic similarity in tree kernels for domain adaptation of relation extraction","volume":"1","author":"Plank Barbara","year":"2013","journal-title":"ACL"},{"key":"e_1_2_1_32_1","doi-asserted-by":"crossref","unstructured":"
Desh Raj Sunil Sahu and Ashish Anand. 2017. Learning local and global contexts using a convolutional recurrent network model for relation classification in biomedical text. In CoNLL. 311--321.","DOI":"10.18653\/v1\/K17-1032"},{"key":"e_1_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Marc-Antoine Rondeau and Yi Su. 2016. LSTM-Based NeuroCRFs for named entity recognition. In INTERSPEECH. 665--669.","DOI":"10.21437\/Interspeech.2016-288"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.5555\/2627435.2670313"},{"key":"e_1_2_1_35_1","unstructured":"Xiaojun Wan Liang Zong Xiaojiang Huang Tengfei Ma Houping Jia Yuqian Wu and Jianguo Xiao. 2011. Named entity recognition in Chinese news comments on the web. In IJCNLP. 856--864."},{"key":"e_1_2_1_36_1","unstructured":"Fangzhao Wu Junxin Liu Chuhan Wu Yongfeng Huang and Xing Xie. 2019. Neural Chinese named entity recognition via CNN-LSTM-CRF and joint training with word segmentation. In WWW. 3342--3348."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-68636-1_10"},{"key":"e_1_2_1_38_1","unstructured":"Yue Zhang and Jie Yang. 2018. Chinese NER using lattice LSTM. In ACL. 
1554--1564."}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3414901","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3414901","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:23:45Z","timestamp":1750202625000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3414901"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9,27]]},"references-count":38,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2020,11,30]]}},"alternative-id":["10.1145\/3414901"],"URL":"https:\/\/doi.org\/10.1145\/3414901","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"type":"print","value":"2375-4699"},{"type":"electronic","value":"2375-4702"}],"subject":[],"published":{"date-parts":[[2020,9,27]]},"assertion":[{"value":"2019-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-09-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}