{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:19:41Z","timestamp":1750220381662,"version":"3.41.0"},"reference-count":52,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2021,12,30]],"date-time":"2021-12-30T00:00:00Z","timestamp":1640822400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61976211 and 61922085"],"award-info":[{"award-number":["61976211 and 61922085"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Beijing Academy of Artificial Intelligence","award":["BAAI2019QN0301"],"award-info":[{"award-number":["BAAI2019QN0301"]}]},{"name":"Key Research Program of the Chinese Academy of Sciences","award":["ZDBS-SSW-JSC006"],"award-info":[{"award-number":["ZDBS-SSW-JSC006"]}]},{"DOI":"10.13039\/501100011222","name":"National Laboratory of Pattern Recognition","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100011222","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100004739","name":"Youth Innovation Promotion Association CAS","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100004739","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2022,3,31]]},"abstract":"<jats:p>Active learning is an effective method to substantially alleviate the problem of expensive annotation cost for data-driven models. Recently, pre-trained language models have been demonstrated to be powerful for learning language representations. 
In this article, we demonstrate that the pre-trained language model can also utilize its learned textual characteristics to enrich criteria of active learning. Specifically, we provide extra textual criteria with the pre-trained language model to measure instances, including noise, coverage, and diversity. With these extra textual criteria, we can select more efficient instances for annotation and obtain better results. We conduct experiments on both English and Chinese sentence matching datasets. The experimental results show that the proposed active learning approach can be enhanced by the pre-trained language model and obtain better performance.<\/jats:p>","DOI":"10.1145\/3480937","type":"journal-article","created":{"date-parts":[[2021,12,30]],"date-time":"2021-12-30T12:59:43Z","timestamp":1640869183000},"page":"1-19","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Using Pre-trained Language Model to Enhance Active Learning for Sentence Matching"],"prefix":"10.1145","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8013-8605","authenticated-orcid":false,"given":"Guirong","family":"Bai","sequence":"first","affiliation":[{"name":"National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy of Sciences, and School of Artificial Intelligence, University of Chinese Academy of Sciences China, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shizhu","family":"He","sequence":"additional","affiliation":[{"name":"National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy of Sciences, and School of Artificial Intelligence, University of Chinese Academy of Sciences China, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kang","family":"Liu","sequence":"additional","affiliation":[{"name":"National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy of 
Sciences, and School of Artificial Intelligence, University of Chinese Academy of Sciences China, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jun","family":"Zhao","sequence":"additional","affiliation":[{"name":"National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy of Sciences, and School of Artificial Intelligence, University of Chinese Academy of Sciences China, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,12,30]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"Jordan T. Ash Chicheng Zhang Akshay Krishnamurthy John Langford and Alekh Agarwal. 2019. Deep batch active learning by diverse uncertain gradient lower bounds. arXiv:1906.03671. Retrieved from https:\/\/arxiv.org\/abs\/1906.03671."},{"key":"e_1_3_2_3_2","doi-asserted-by":"crossref","unstructured":"Guirong Bai Shizhu He Kang Liu Jun Zhao and Zaiqing Nie. 2020. Pre-trained language model based active learning for sentence matching. In Proceedings of the 28th International Conference on Computational Linguistics . 1495\u20131504. https:\/\/doi.org\/10.18653\/v1\/2020.coling-main.130","DOI":"10.18653\/v1\/2020.coling-main.130"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.5555\/944919.944937"},{"key":"e_1_3_2_5_2","doi-asserted-by":"crossref","unstructured":"Samuel R. Bowman Gabor Angeli Christopher Potts and Christopher D. Manning. 2015. A large annotated corpus for learning natural language inference. arXiv:1508.05326. Retrieved from https:\/\/arxiv.org\/abs\/1508.05326.","DOI":"10.18653\/v1\/D15-1075"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1536"},{"key":"e_1_3_2_7_2","doi-asserted-by":"crossref","unstructured":"Qian Chen Xiaodan Zhu Zhenhua Ling Si Wei Hui Jiang and Diana Inkpen. 2016. Enhanced lstm for natural language inference. arXiv:1609.06038. 
Retrieved from https:\/\/arxiv.org\/abs\/.","DOI":"10.18653\/v1\/P17-1152"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.5555\/3504035.3504659"},{"key":"e_1_3_2_9_2","doi-asserted-by":"crossref","unstructured":"Alexis Conneau Douwe Kiela Holger Schwenk Loic Barrault and Antoine Bordes. 2017. Supervised learning of universal sentence representations from natural language inference data. arXiv:1705.02364. Retrieved from https:\/\/arxiv.org\/abs\/1705.02364.","DOI":"10.18653\/v1\/D17-1070"},{"key":"e_1_3_2_10_2","unstructured":"Jacob Devlin Ming-Wei Chang Kenton Lee and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805. Retrieved from https:\/\/arxiv.org\/abs\/1810.04805."},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3377704"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1231"},{"key":"e_1_3_2_13_2","doi-asserted-by":"crossref","unstructured":"Meng Fang Yuan Li and Trevor Cohn. 2017. Learning how to active learn: A deep reinforcement learning approach. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing . 595\u2013605. https:\/\/doi.org\/10.18653\/v1\/D17-1063","DOI":"10.18653\/v1\/D17-1063"},{"key":"e_1_3_2_14_2","unstructured":"Daniel Gissin and Shai Shalev-Shwartz. 2019. Discriminative active learning. arXiv:1907.06347. Retrieved from https:\/\/arxiv.org\/abs\/1907.06347."},{"key":"e_1_3_2_15_2","unstructured":"Yichen Gong Heng Luo and Jian Zhang. 2017. Natural language inference over interaction space. arXiv:1709.04348. Retrieved from https:\/\/arxiv.org\/abs\/1709.04348."},{"key":"e_1_3_2_16_2","article-title":"First Quora Dataset Release: Question Pairs","author":"Iyer Shankar","year":"2017","unstructured":"Shankar Iyer, Nikhil Dandekar, and Korn\u00e9l Csernai. 2017. First Quora Dataset Release: Question Pairs. 
Retrieved from Data.quora.com.","journal-title":"Data.quora.com"},{"key":"e_1_3_2_17_2","doi-asserted-by":"crossref","unstructured":"Jungo Kasai Kun Qian Sairam Gurajada Yunyao Li and Lucian Popa. 2019. Low-resource deep entity resolution with transfer and active learning. arXiv:1906.08042. Retrieved from https:\/\/arxiv.org\/abs\/1906.08042.","DOI":"10.18653\/v1\/P19-1586"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33016586"},{"key":"e_1_3_2_19_2","article-title":"Adam: A method for stochastic optimization","author":"Kingma Diederik P.","year":"2014","unstructured":"Diederik P. Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. Comput. Sci. (2014).","journal-title":"Comput. Sci."},{"key":"e_1_3_2_20_2","unstructured":"Diederik P. Kingma and Max Welling. 2013. Auto-encoding variational bayes. arXiv:1312.6114. Retrieved from https:\/\/arxiv.org\/abs\/1312.6114."},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.5555\/2969442.2969607"},{"key":"e_1_3_2_22_2","unstructured":"Zhenzhong Lan Mingda Chen Sebastian Goodman Kevin Gimpel Piyush Sharma and Radu Soricut. 2020. ALBERT: A lite BERT for self-supervised learning of language representations. arXiv:1909.11942. Retrieved from https:\/\/arxiv.org\/abs\/1909.11942."},{"key":"e_1_3_2_23_2","first-page":"1952","volume-title":"Proceedings of the 27th International Conference on Computational Linguistics","author":"Liu Xin","year":"2018","unstructured":"Xin Liu, Qingcai Chen, Chong Deng, Huajun Zeng, Jing Chen, Dongfang Li, and Buzhou Tang. 2018. Lcqmc: A large-scale chinese question matching corpus. In Proceedings of the 27th International Conference on Computational Linguistics. 1952\u20131962."},{"key":"e_1_3_2_24_2","unstructured":"Yinhan Liu Myle Ott Naman Goyal Jingfei Du Mandar Joshi Danqi Chen Omer Levy Mike Lewis Luke Zettlemoyer and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. arXiv:1907.11692. 
Retrieved from https:\/\/arxiv.org\/abs\/1907.11692."},{"key":"e_1_3_2_25_2","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"van der Maaten Laurens","year":"2008","unstructured":"Laurens van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 11 (2008), 2579\u20132605.","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.5555\/2999792.2999959"},{"key":"e_1_3_2_27_2","doi-asserted-by":"crossref","unstructured":"Yixin Nie and Mohit Bansal. 2017. Shortcut-stacked sentence encoders for multi-domain inference. arXiv:1708.02312. Retrieved from https:\/\/arxiv.org\/abs\/1708.02312.","DOI":"10.18653\/v1\/W17-5308"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1165"},{"key":"e_1_3_2_29_2","doi-asserted-by":"crossref","unstructured":"Ankur P. Parikh Oscar T\u00e4ckstr\u00f6m Dipanjan Das and Jakob Uszkoreit. 2016. A decomposable attention model for natural language inference. arXiv:1606.01933. Retrieved from https:\/\/arxiv.org\/abs\/1606.01933.","DOI":"10.18653\/v1\/D16-1244"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1162"},{"key":"e_1_3_2_31_2","doi-asserted-by":"crossref","unstructured":"Matthew E. Peters Mark Neumann Mohit Iyyer Matt Gardner Christopher Clark Kenton Lee and Luke Zettlemoyer. 2018. Deep contextualized word representations. arXiv:1802.05365. Retrieved from https:\/\/arxiv.org\/abs\/1802.05365.","DOI":"10.18653\/v1\/N18-1202"},{"key":"e_1_3_2_32_2","volume-title":"Improving Language Understanding with Unsupervised Learning","author":"Radford Alec","year":"2018","unstructured":"Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving Language Understanding with Unsupervised Learning. Technical Report. 
OpenAI."},{"key":"e_1_3_2_33_2","volume-title":"Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL\u201906)","author":"Romano Lorenza","year":"2006","unstructured":"Lorenza Romano, Milen Kouylekov, Idan Szpektor, Ido Dagan, and Alberto Lavelli. 2006. Investigating a generic paraphrase-based approach for relation extraction. In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL\u201906). 409\u2013416."},{"key":"e_1_3_2_34_2","unstructured":"Ozan Sener and Silvio Savarese. 2018. Active learning for convolutional neural networks: A core-set approach. arXiv:1708.00489. Retrieved from https:\/\/arxiv.org\/abs\/1708.00489."},{"key":"e_1_3_2_35_2","volume-title":"Active Learning Literature Survey","author":"Settles Burr","year":"2009","unstructured":"Burr Settles. 2009. Active Learning Literature Survey. Technical Report. University of Wisconsin\u2014Madison Department of Computer Sciences."},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.5555\/1613715.1613855"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.3115\/1218955.1219030"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.5555\/3304222.3304374"},{"key":"e_1_3_2_39_2","doi-asserted-by":"crossref","unstructured":"Yanyao Shen Hyokun Yun Zachary C. Lipton Yakov Kronrod and Animashree Anandkumar. 2017. Deep active learning for named entity recognition. arXiv:1707.05928. 
Retrieved from https:\/\/arxiv.org\/abs\/1707.05928.","DOI":"10.18653\/v1\/W17-2630"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1318"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.5555\/3294996.3295163"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1162\/153244302760185243"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295349"},{"key":"e_1_3_2_44_2","first-page":"22","volume-title":"Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL\u201907)","author":"Wang Mengqiu","year":"2007","unstructured":"Mengqiu Wang, Noah A. Smith, and Teruko Mitamura. 2007. What is the jeopardy model? a quasi-synchronous grammar for QA. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL\u201907). 22\u201332."},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.5555\/3171837.3171865"},{"key":"e_1_3_2_46_2","unstructured":"Adina Williams Nikita Nangia and Samuel R. Bowman. 2017. A broad-coverage challenge corpus for sentence understanding through inference. arXiv:1704.05426. Retrieved from https:\/\/arxiv.org\/abs\/1704.05426."},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1079"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00366"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1145\/2983323.2983818"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.5555\/3454287.3454804"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.637"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.5555\/3298023.3298060"},{"key":"e_1_3_2_53_2","doi-asserted-by":"crossref","unstructured":"Jingbo Zhu Huizhen Wang Tianshun Yao and Benjamin K Tsou. 2008. 
Active learning with sampling by uncertainty and density for word sense disambiguation and text classification. In Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008) . 1137\u20131144.","DOI":"10.3115\/1599081.1599224"}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3480937","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3480937","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:18:23Z","timestamp":1750191503000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3480937"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,30]]},"references-count":52,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,3,31]]}},"alternative-id":["10.1145\/3480937"],"URL":"https:\/\/doi.org\/10.1145\/3480937","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"type":"print","value":"2375-4699"},{"type":"electronic","value":"2375-4702"}],"subject":[],"published":{"date-parts":[[2021,12,30]]},"assertion":[{"value":"2020-12-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-08-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-12-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}