{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,12]],"date-time":"2025-12-12T13:08:26Z","timestamp":1765544906128,"version":"3.41.0"},"reference-count":79,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2023,8,21]],"date-time":"2023-08-21T00:00:00Z","timestamp":1692576000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61972192, 62172208, 61906085, and 41972111"],"award-info":[{"award-number":["61972192, 62172208, 61906085, and 41972111"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Collaborative Innovation Center of Novel Software Technology and Industrialization"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Inf. Syst."],"published-print":{"date-parts":[[2024,1,31]]},"abstract":"<jats:p>\n            Few-shot sequence labeling aims to identify novel classes based on only a few labeled samples. Existing methods solve the data scarcity problem mainly by designing token-level or span-level labeling models based on metric learning. However, these methods are only trained at a single granularity (i.e., either token-level or span-level) and have some weaknesses of the corresponding granularity. In this article, we first unify token- and span-level supervisions and propose a Consistent Dual Adaptive Prototypical (CDAP) network for few-shot sequence labeling. CDAP contains the token- and span-level networks, jointly trained at different granularities. To align the outputs of two networks, we further propose a consistent loss to enable them to learn from each other. During the inference phase, we propose a consistent greedy inference algorithm that first adjusts the predicted probability and then greedily selects non-overlapping spans with maximum probability. Extensive experiments show that our model achieves new state-of-the-art results on three benchmark datasets. All the code and data of this work will be released at\n            <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"url\" xlink:href=\"https:\/\/github.com\/zifengcheng\/CDAP\">https:\/\/github.com\/zifengcheng\/CDAP<\/jats:ext-link>\n            .\n          <\/jats:p>","DOI":"10.1145\/3610403","type":"journal-article","created":{"date-parts":[[2023,7,20]],"date-time":"2023-07-20T12:03:05Z","timestamp":1689854585000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Unifying Token- and Span-level Supervisions for Few-shot Sequence Labeling"],"prefix":"10.1145","volume":"42","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8486-2614","authenticated-orcid":false,"given":"Zifeng","family":"Cheng","sequence":"first","affiliation":[{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4389-1582","authenticated-orcid":false,"given":"Qingyu","family":"Zhou","sequence":"additional","affiliation":[{"name":"Tencent Cloud Xiaowei, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5243-4992","authenticated-orcid":false,"given":"Zhiwei","family":"Jiang","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1525-1569","authenticated-orcid":false,"given":"Xuemin","family":"Zhao","sequence":"additional","affiliation":[{"name":"Tencent Cloud Xiaowei, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-2558-5206","authenticated-orcid":false,"given":"Yunbo","family":"Cao","sequence":"additional","affiliation":[{"name":"Tencent Cloud Xiaowei, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1112-790X","authenticated-orcid":false,"given":"Qing","family":"Gu","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology, Nanjing University, China"}]}],"member":"320","published-online":{"date-parts":[[2023,8,21]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/3417996"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.27"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.392"},{"key":"e_1_3_2_5_2","volume-title":"Proceedings of the 7th International Conference on Learning Representations (ICLR\u201919)","author":"Chen Wei-Yu","year":"2019","unstructured":"Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, and Jia-Bin Huang. 2019. A closer look at few-shot classification. In Proceedings of the 7th International Conference on Learning Representations (ICLR\u201919). OpenReview.net. Retrieved from https:\/\/openreview.net\/forum?id=HkxLXnAcFQ."},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00893"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2021.3102194"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00104"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i05.6259"},{"key":"e_1_3_2_10_2","unstructured":"Alice Coucke Alaa Saade Adrien Ball Th\u00e9odore Bluche Alexandre Caulier David Leroy Cl\u00e9ment Doumouro Thibault Gisselbrecht Francesco Caltagirone Thibaut Lavril Ma\u00ebl Primet and Joseph Dureau. 2018. Snips voice platform: An embedded spoken language understanding system for private-by-design voice interfaces. Retrieved from http:\/\/arxiv.org\/abs\/1805.10190."},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-acl.161"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1422"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.439"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/w17-4418"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/n19-1423"},{"key":"e_1_3_2_16_2","volume-title":"Proceedings of the 8th International Conference on Learning Representations (ICLR\u201920)","author":"Dhillon Guneet Singh","year":"2020","unstructured":"Guneet Singh Dhillon, Pratik Chaudhari, Avinash Ravichandran, and Stefano Soatto. 2020. A baseline for few-shot image classification. In Proceedings of the 8th International Conference on Learning Representations (ICLR\u201920). OpenReview.net. Retrieved from https:\/\/openreview.net\/forum?id=rylXBkrYDS"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.acl-long.248"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2006.79"},{"key":"e_1_3_2_19_2","first-page":"1126","volume-title":"Proceedings of the 34th International Conference on Machine Learning (ICML\u201917)","author":"Finn Chelsea","year":"2017","unstructured":"Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning (ICML\u201917). 1126\u20131135. Retrieved from http:\/\/proceedings.mlr.press\/v70\/finn17a.html"},{"key":"e_1_3_2_20_2","first-page":"9537","volume-title":"Proceedings of the Conference on Advances in Neural Information Processing Systems (NeurIPS\u201918)","author":"Finn Chelsea","year":"2018","unstructured":"Chelsea Finn, Kelvin Xu, and Sergey Levine. 2018. Probabilistic model-agnostic meta-learning. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NeurIPS\u201918). 9537\u20139548. Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2018\/hash\/8e2c381d4dd04f1c55093f22c59c3a08-Abstract.html"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/3297280.3297378"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.acl-long.558"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33016407"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/3486250"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.264"},{"key":"e_1_3_2_26_2","first-page":"4005","volume-title":"Proceedings of the Conference on Advances in Neural Information Processing Systems (NeurIPS\u201919)","author":"Hou Ruibing","year":"2019","unstructured":"Ruibing Hou, Hong Chang, Bingpeng Ma, Shiguang Shan, and Xilin Chen. 2019. Cross attention network for few-shot classification. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NeurIPS\u201919). 4005\u20134016. Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2019\/hash\/01894d6f048493d2cacde3c579c315a3-Abstract.html"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.128"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.findings-acl.53"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.813"},{"key":"e_1_3_2_30_2","unstructured":"Zhiheng Huang Wei Xu and Kai Yu. 2015. Bidirectional LSTM-CRF models for sequence tagging. Retrieved from http:\/\/arxiv.org\/abs\/1508.01991"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.192"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/n16-1030"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.192"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01259"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2020.3038670"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01348"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/3366423.3379994"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.519"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3442381.3449943"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/p15-1172"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/3419972"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3531978"},{"key":"e_1_3_2_43_2","volume-title":"Proceedings of the 7th International Conference on Learning Representations (ICLR\u201919)","author":"Loshchilov Ilya","year":"2019","unstructured":"Ilya Loshchilov and Frank Hutter. 2019. Decoupled weight decay regularization. In Proceedings of the 7th International Conference on Learning Representations (ICLR\u201919). Retrieved from https:\/\/openreview.net\/forum?id=Bkg6RiCqY7"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/p19-1344"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.findings-acl.155"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-acl.88"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1145\/3464377"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.naacl-main.420"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.findings-acl.124"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/p16-1101"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.eacl-main.134"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.575"},{"key":"e_1_3_2_53_2","first-page":"143","volume-title":"Proceedings of the 17th Conference on Computational Natural Language Learning (CoNLL\u201913)","author":"Pradhan Sameer","year":"2013","unstructured":"Sameer Pradhan, Alessandro Moschitti, Nianwen Xue, Hwee Tou Ng, Anders Bj\u00f6rkelund, Olga Uryupina, Yuchen Zhang, and Zhi Zhong. 2013. Towards robust linguistic analysis using ontonotes. In Proceedings of the 17th Conference on Computational Natural Language Learning (CoNLL\u201913). ACL, 143\u2013152. Retrieved from https:\/\/aclanthology.org\/W13-3516\/"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1214"},{"key":"e_1_3_2_55_2","first-page":"142","volume-title":"Proceedings of the 7th Conference on Natural Language Learning (CoNLL\u201903) Held in Cooperation with North American Chapter of the Association for Computational Linguistics: Human Language Technologies (HLT-NAACL\u201903)","author":"Sang Erik F. Tjong Kim","year":"2003","unstructured":"Erik F. Tjong Kim Sang and Fien De Meulder. 2003. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In Proceedings of the 7th Conference on Natural Language Learning (CoNLL\u201903) Held in Cooperation with North American Chapter of the Association for Computational Linguistics: Human Language Technologies (HLT-NAACL\u201903). ACL, 142\u2013147. Retrieved from https:\/\/aclanthology.org\/W03-0419\/"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.67"},{"key":"e_1_3_2_57_2","article-title":"Prototypical networks for few-shot learning","author":"Snell Jake","year":"2017","unstructured":"Jake Snell, Kevin Swersky, and Richard Zemel. 2017. Prototypical networks for few-shot learning. Advances in Neural Information Processing Systems. Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2017\/hash\/cb8da6767461f2812ae4290eac7cbc42-Abstract.html","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1045"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00131"},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2021\/542"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.acl-long.487"},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1145\/3466796"},{"key":"e_1_3_2_63_2","first-page":"5998","volume-title":"Proceedings of the Annual Conference on Neural Information Processing Systems: Advances in Neural Information Processing Systems 30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the Annual Conference on Neural Information Processing Systems: Advances in Neural Information Processing Systems 30. 5998\u20136008. Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2017\/hash\/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html"},{"key":"e_1_3_2_64_2","article-title":"Matching networks for one shot learning","author":"Vinyals Oriol","year":"2016","unstructured":"Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Daan Wierstra, et\u00a0al. 2016. Matching networks for one shot learning. Advances in Neural Information Processing Systems. Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2016\/hash\/90e1357833654983612fb05e3ec9148c-Abstract.html","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.naacl-main.369"},{"key":"e_1_3_2_66_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-emnlp.139"},{"key":"e_1_3_2_67_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00760"},{"key":"e_1_3_2_68_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.acl-long.451"},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.516"},{"key":"e_1_3_2_70_2","first-page":"7115","volume-title":"Proceedings of the 36th International Conference on Machine Learning (ICML\u201919)","author":"Yoon Sung Whan","year":"2019","unstructured":"Sung Whan Yoon, Jun Seo, and Jaekyun Moon. 2019. TapNet: Neural network augmented with task-adaptive projection for few-shot learning. In Proceedings of the 36th International Conference on Machine Learning (ICML\u201919). 7115\u20137123. Retrieved from http:\/\/proceedings.mlr.press\/v97\/yoon19a.html"},{"key":"e_1_3_2_71_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.59"},{"key":"e_1_3_2_72_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.577"},{"key":"e_1_3_2_73_2","doi-asserted-by":"publisher","DOI":"10.1145\/3404835.3462856"},{"key":"e_1_3_2_74_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10579-016-9343-x"},{"key":"e_1_3_2_75_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v31i1.10995"},{"key":"e_1_3_2_76_2","first-page":"2371","volume-title":"Proceedings of the Conference on Advances in Neural Information Processing Systems (NeurIPS\u201918)","author":"Zhang Ruixiang","year":"2018","unstructured":"Ruixiang Zhang, Tong Che, Zoubin Ghahramani, Yoshua Bengio, and Yangqiu Song. 2018. MetaGAN: An adversarial approach to few-shot learning. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NeurIPS\u201918). 2371\u20132380. Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2018\/hash\/4e4e53aa080247bc31d0eb4e7aeb07a0-Abstract.html"},{"key":"e_1_3_2_77_2","article-title":"Reachable distance function for KNN classification","author":"Zhang Shichao","year":"2022","unstructured":"Shichao Zhang, Jiaye Li, and Yangding Li. 2022. Reachable distance function for KNN classification. IEEE Trans. Knowl. Data Eng. 35, 7 (2022), 7382\u20137396.","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"e_1_3_2_78_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.59"},{"key":"e_1_3_2_79_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.592"},{"key":"e_1_3_2_80_2","doi-asserted-by":"crossref","unstructured":"Enwei Zhu Yiyang Liu and Jinpeng Li. 2022. Deep span representations for named entity recognition. Retrieved from https:\/\/arXiv:2210.04182.","DOI":"10.18653\/v1\/2023.findings-acl.672"}],"container-title":["ACM Transactions on Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3610403","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3610403","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:49:02Z","timestamp":1750182542000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3610403"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,21]]},"references-count":79,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,1,31]]}},"alternative-id":["10.1145\/3610403"],"URL":"https:\/\/doi.org\/10.1145\/3610403","relation":{},"ISSN":["1046-8188","1558-2868"],"issn-type":[{"type":"print","value":"1046-8188"},{"type":"electronic","value":"1558-2868"}],"subject":[],"published":{"date-parts":[[2023,8,21]]},"assertion":[{"value":"2022-11-11","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-07-11","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-08-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}