{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,10]],"date-time":"2026-01-10T07:22:39Z","timestamp":1768029759227,"version":"3.49.0"},"reference-count":28,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2020,8,5]],"date-time":"2020-08-05T00:00:00Z","timestamp":1596585600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100014718","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CAREER 1452425, IIS 1408287"],"award-info":[{"award-number":["CAREER 1452425, IIS 1408287"]}],"id":[{"id":"10.13039\/100014718","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2020,10,31]]},"abstract":"<jats:p>\n            Given a labeled dataset that contains a rare (or minority) class containing of-interest instances, as well as a large class of instances that are\n            <jats:italic>not<\/jats:italic>\n            of interest, how can we learn to recognize future of-interest instances over a continuous stream? The setting is different from traditional classification in that instances from\n            <jats:italic>novel<\/jats:italic>\n            minority subclasses might continually emerge over time\u2014and hence is often referred to as continual, life-long, or open-world classification. 
We introduce RaRecognize, which (\n            <jats:italic>i<\/jats:italic>\n            ) estimates a\n            <jats:italic>general<\/jats:italic>\n            decision boundary between the rare class and the majority class, (\n            <jats:italic>ii<\/jats:italic>\n            ) learns to recognize the individual rare subclasses that exist within the training data, as well as (\n            <jats:italic>iii<\/jats:italic>\n            ) flags instances from previously\n            <jats:italic>unseen<\/jats:italic>\n            rare subclasses as newly emerging (i.e., novel). The learner in\n            <jats:italic>(i)<\/jats:italic>\n            is general in the sense that by construction it is dissimilar to the\n            <jats:italic>specialized<\/jats:italic>\n            learners in\n            <jats:italic>(ii)<\/jats:italic>\n            , thus distinguishes minority from the majority without overly tuning to what is only seen in the training data. Thanks to this generality, RaRecognize ignores all future instances that it labels as majority and recognizes the\n            <jats:italic>recurring as well as emerging<\/jats:italic>\n            rare subclasses only. This saves effort at test time as well as ensures that the model size grows moderately over time as it only maintains specialized minority learners. Overall, we build an end-to-end system which consists of (1) a representation learning component that transforms data instances into suitable vector inputs; (2) a continual classifier that labels incoming instances as majority (not of interest), rare recurrent, or rare emerging; and (3) a clustering component that groups the rare emerging instances into novel subclasses for expert vetting and model re-training. 
Through extensive experiments, we show that RaRecognize outperforms state-of-the-art baselines on three real-world datasets that contain documents related to corporate-risk and (natural and man-made) disasters as rare classes.\n          <\/jats:p>","DOI":"10.1145\/3399660","type":"journal-article","created":{"date-parts":[[2020,8,5]],"date-time":"2020-08-05T18:53:00Z","timestamp":1596653580000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["End-to-End Continual Rare-Class Recognition with Emerging Novel Subclasses"],"prefix":"10.1145","volume":"14","author":[{"given":"Hung","family":"Nguyen","sequence":"first","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh, PA"}]},{"given":"Xuejian","family":"Wang","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh, PA"}]},{"given":"Leman","family":"Akoglu","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh, PA"}]}],"member":"320","published-online":{"date-parts":[[2020,8,5]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Data Clustering: Algorithms and Applications","author":"Aggarwal Charu C."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10618-009-0159-9"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1214\/aop\/1176996548"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.2200\/S00737ED1V01Y201610AIM033"},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of the 28th Annual ACM Symposium on Applied Computing. 
ACM, 795--800","author":"Faria Elaine R."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/S1364-6613(99)01294-2"},{"key":"e_1_2_1_7_1","volume-title":"Proceedings of the International Conference on Learning Representations.","author":"Kemker Ronald","year":"2018"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1181"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1611835114"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2011.5767923"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the International Conference on Machine Learning.","volume":"14","author":"Quoc"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1038\/nature14539"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the Conference on Neural Information Processing Systems. 4652--4662","author":"Lee Sang-Woo","year":"2017"},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM","author":"Manzoor Emaad","year":"2018"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2017.2691702"},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence.","author":"Mu Xin","year":"2017"},{"key":"e_1_2_1_17_1","volume-title":"Manning","author":"Pennington Jeffrey","year":"2014"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-015-5521-0"},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2001--2010","author":"Rebuffi Sylvestre-Alvise"},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the Conference on Neural Information Processing Systems. 
2990--2999","author":"Shin Hanul","year":"2017"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D17-1314"},{"key":"e_1_2_1_22_1","volume-title":"Unseen class discovery in open-world classification. arXiv preprint arXiv:1801.05609","author":"Shu Lei","year":"2018"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098144"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1244002.1244107"},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the 22nd International Joint Conference on Artificial Intelligence.","author":"Tan Swee Chuan","year":"2011"},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the 2014 IEEE International Conference on Data Mining. IEEE, 600--609","author":"Wu Ke"},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the World Wide Web Conference.","author":"Xu Hu"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the Conference on Neural Information Processing Systems. 649--657","author":"Zhang Xiang","year":"2015"}],"container-title":["ACM Transactions on Knowledge Discovery from 
Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3399660","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3399660","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:33:32Z","timestamp":1750199612000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3399660"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,8,5]]},"references-count":28,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2020,10,31]]}},"alternative-id":["10.1145\/3399660"],"URL":"https:\/\/doi.org\/10.1145\/3399660","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,8,5]]},"assertion":[{"value":"2019-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-05-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-08-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}