{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T17:51:42Z","timestamp":1775065902497,"version":"3.50.1"},"reference-count":39,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2023,2,16]],"date-time":"2023-02-16T00:00:00Z","timestamp":1676505600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61772282"],"award-info":[{"award-number":["61772282"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100018735","name":"Ant Group","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100018735","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2023,4,30]]},"abstract":"<jats:p>Many practical applications, such as social media and monitoring system, will constantly generate streaming data, which has problems of instability, lack of labels and multiclass imbalance. In order to solve these problems, a cluster-based active learning method is proposed to achieve data stream classification. Firstly, a label query strategy combining marginal threshold matrix is proposed, which selects difficult to classify or potential concept drift samples for marking, to solve the problem of high cost label and unbalanced data. Secondly, dynamic maintenance of a group of micro clusters, by adjusting the weight of micro clusters in the model, correctly reflects the current data distribution, and finally, uses the buffer to store new micro clusters to participate in the update of the model, to adapt to the new data environment. Experimental results on three real data sets and three synthetic data sets show that compared with the classical data stream classification algorithm, it is less affected by concept drift and has higher classification accuracy than the online semi-supervised learning algorithm ADSM. The average accuracy of the six datasets increased by 5.56%, 2.32%, 1.77%, 1.83%, 3.78%, and 2.04%, respectively. The model processes data streams online and improves classification performance with less memory consumption.<\/jats:p>","DOI":"10.1145\/3579830","type":"journal-article","created":{"date-parts":[[2023,1,9]],"date-time":"2023-01-09T13:18:52Z","timestamp":1673270332000},"page":"1-18","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":11,"title":["Clustering-based Active Learning Classification towards Data Stream"],"prefix":"10.1145","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5764-2432","authenticated-orcid":false,"given":"Chunyong","family":"Yin","sequence":"first","affiliation":[{"name":"Nanjing University of Information Science and Technology, Nanjing, Jiangsu, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4970-0705","authenticated-orcid":false,"given":"Shuangshuang","family":"Chen","sequence":"additional","affiliation":[{"name":"Nanjing University of Information Science and Technology, Nanjing, Jiangsu, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3458-7909","authenticated-orcid":false,"given":"Zhichao","family":"Yin","sequence":"additional","affiliation":[{"name":"Southeast University, Nanjing, Jiangsu, China"}]}],"member":"320","published-online":{"date-parts":[[2023,2,16]]},"reference":[{"key":"e_1_3_1_2_2","first-page":"77","volume-title":"Fourth International Workshop on Knowledge Discovery from Data Streams","author":"Baena-Garc\u0131a Manuel","year":"2006","unstructured":"Manuel Baena-Garc\u0131a, Jos\u00e9 del Campo-\u00c1vila, Ra\u00fal Fidalgo, Albert Bifet, R. Gavalda, and Rafael Morales-Bueno. 2006. Early drift detection method. In Fourth International Workshop on Knowledge Discovery from Data Streams, Vol. 6. 77\u201386."},{"key":"e_1_3_1_3_2","unstructured":"Maria-Florina Balcan and Ruth Urner. 2016. Active Learning-Modern Learning Theory."},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1007\/s13173-012-0072-8"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/BRICS-CCI-CBIC.2013.63"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2013.12.011"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2016.7727427"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2019.08.050"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2020.03.052"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2012.136"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/mci.2015.2471196"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/347090.347107"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.2011.2160459"},{"key":"e_1_3_1_14_2","article-title":"Finding and tracking multi-density clusters in online dynamic data streams","author":"Fahy Conor","year":"2019","unstructured":"Conor Fahy and Shengxiang Yang. 2019. Finding and tracking multi-density clusters in online dynamic data streams. IEEE Transactions on Big Data (2019).","journal-title":"IEEE Transactions on Big Data"},{"key":"e_1_3_1_15_2","first-page":"1","volume-title":"2019 International Joint Conference on Neural Networks (IJCNN\u201919)","author":"Ferreira Luis Eduardo Boiko","year":"2019","unstructured":"Luis Eduardo Boiko Ferreira, Heitor Murilo Gomes, Albert Bifet, and Luiz S. Oliveira. 2019. Adaptive random forests with resampling for imbalanced data streams. In 2019 International Joint Conference on Neural Networks (IJCNN\u201919). IEEE, 1\u20136."},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.5555\/1855075"},{"key":"e_1_3_1_17_2","first-page":"286","volume-title":"Brazilian Symposium on Artificial Intelligence","author":"Gama Joao","year":"2004","unstructured":"Joao Gama, Pedro Medas, Gladys Castillo, and Pedro Rodrigues. 2004. Learning with drift detection. In Brazilian Symposium on Artificial Intelligence. Springer, 286\u2013295."},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611972771.1"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3532038"},{"key":"e_1_3_1_20_2","article-title":"Active learning by querying informative and representative examples","volume":"23","author":"Huang Sheng-Jun","year":"2010","unstructured":"Sheng-Jun Huang, Rong Jin, and Zhi-Hua Zhou. 2010. Active learning by querying informative and representative examples. Advances in Neural Information Processing Systems 23 (2010).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_21_2","first-page":"133","volume-title":"Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications","author":"Ienco Dino","year":"2014","unstructured":"Dino Ienco, Indr\u0117 \u017dliobait\u0117, and Bernhard Pfahringer. 2014. High density-focused uncertainty sampling for active learning over evolving stream data. In Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications. PMLR, 133\u2013148."},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.5555\/1314498.1390333"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/3401025.3401763"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-24282-8_10"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2021.106778"},{"key":"e_1_3_1_26_2","first-page":"2393","volume-title":"IJCAI","author":"Lu Yang","year":"2017","unstructured":"Yang Lu, Yiu-ming Cheung, and Yuan Yan Tang. 2017. Dynamic weighted majority for incremental learning of imbalanced data streams with concept drift. In IJCAI. 2393\u20132399."},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2019.2951814"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-011-0447-8"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2016.04.008"},{"key":"e_1_3_1_30_2","unstructured":"Burr Settles. 2009. Active learning literature survey. (2009)."},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2018.2844332"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2018.09.035"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/956750.956778"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/FUZZ-IEEE.2018.8491674"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2019.105140"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2017.05.046"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF00116900"},{"key":"e_1_3_1_38_2","first-page":"552","volume-title":"International Symposium on Methodologies for Intelligent Systems","author":"Woolam Clay","year":"2009","unstructured":"Clay Woolam, Mohammad M. Masud, and Latifur Khan. 2009. Lacking labels in the stream: Classifying evolving stream data with few labels. In International Symposium on Methodologies for Intelligent Systems. Springer, 552\u2013562."},{"key":"e_1_3_1_39_2","first-page":"1","article-title":"A clustering-based active learning method to query informative and representative samples","author":"Yan Xuyang","year":"2022","unstructured":"Xuyang Yan, Shabnam Nazmi, Biniam Gebru, Mohd Anwar, Abdollah Homaifar, Mrinmoy Sarkar, and Kishor Datta Gupta. 2022. A clustering-based active learning method to query informative and representative samples. Applied Intelligence (2022), 1\u201318.","journal-title":"Applied Intelligence"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2012.2236570"}],"container-title":["ACM Transactions on Intelligent Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3579830","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3579830","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:51:27Z","timestamp":1750182687000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3579830"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,16]]},"references-count":39,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,4,30]]}},"alternative-id":["10.1145\/3579830"],"URL":"https:\/\/doi.org\/10.1145\/3579830","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"value":"2157-6904","type":"print"},{"value":"2157-6912","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,16]]},"assertion":[{"value":"2022-06-20","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-12-06","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-02-16","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}