{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T14:39:43Z","timestamp":1777559983948,"version":"3.51.4"},"reference-count":15,"publisher":"SAGE Publications","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["AIC"],"published-print":{"date-parts":[[2020,12,18]]},"abstract":"<jats:p>In this paper we present an approach for novelty detection in text data. The approach can also be considered as semi-supervised anomaly detection because it operates with the training dataset containing labelled instances for the known classes only. During the training phase the classification model is learned. It is assumed that at least two known classes exist in the available training dataset. In the testing phase instances are classified as normal or anomalous based on the classifier confidence. In other words, if the classifier cannot assign any of the known class labels to the given instance with sufficiently high confidence (probability), the instance will be declared as novelty (anomaly). We propose two procedures to objectively measure the classifier confidence. Experimental results show that the proposed approach is comparable to methods known in the literature.<\/jats:p>","DOI":"10.3233\/aic-200649","type":"journal-article","created":{"date-parts":[[2020,12,18]],"date-time":"2020-12-18T10:09:12Z","timestamp":1608286152000},"page":"1-15","source":"Crossref","is-referenced-by-count":1,"title":["An approach for outlier and novelty detection for text data based on classifier confidence"],"prefix":"10.1177","author":[{"given":"Nikola","family":"Pi\u017eurica","sequence":"first","affiliation":[{"name":"Faculty of Mathematics and Natural Sciences, University of Montenegro, Cetinjska 2, 81000 Podgorica, Montenegro. E-mails:\u00a0nikola.pizurica.1998@gmail.com,\u00a0savot@ucg.ac.me"}]},{"given":"Savo","family":"Tomovi\u0107","sequence":"additional","affiliation":[{"name":"Faculty of Mathematics and Natural Sciences, University of Montenegro, Cetinjska 2, 81000 Podgorica, Montenegro. E-mails:\u00a0nikola.pizurica.1998@gmail.com,\u00a0savot@ucg.ac.me"}]}],"member":"179","reference":[{"key":"10.3233\/AIC-200649_ref1","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1007\/978-1-4614-3223-4_4","volume-title":"Mining Text Data, Chapter\u00a04","author":"Aggrawal","year":"2012"},{"issue":"5","key":"10.3233\/AIC-200649_ref2","doi-asserted-by":"publisher","first-page":"35","DOI":"10.5120\/1475-1991","article-title":"Document image processing\u00a0\u2013 a review","volume":"10","author":"Akram","year":"2010","journal-title":"International Journal of Computer Applications"},{"key":"10.3233\/AIC-200649_ref3","unstructured":"S.\u00a0Bird, E.\u00a0Klein and E.\u00a0Loper, Natural Language Processing with Python, O\u2019Reilly Media, 2009."},{"key":"10.3233\/AIC-200649_ref4","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/1541880.1541882","article-title":"Anomaly detection a survey","author":"Chandola","year":"2009","journal-title":"ACM Computing Surveys"},{"key":"10.3233\/AIC-200649_ref5","doi-asserted-by":"crossref","unstructured":"P.\u00a0Chen, C.\u00a0Lin and B.\u00a0Scholkopf, A tutorial on \u03bd-support vector machines, in: Applied Stochastic Models in Business and Industry, 2005, pp.\u00a0111\u2013136.","DOI":"10.1002\/asmb.537"},{"key":"10.3233\/AIC-200649_ref6","doi-asserted-by":"publisher","DOI":"10.1145\/3086512.3086520"},{"key":"10.3233\/AIC-200649_ref7","unstructured":"T.\u00a0Dasgupta and L.\u00a0Dey, Automatic scoring for innovativeness of textual ideas, in: Workshops at the Thirtieth AAAI Conference on Artificial Intelligence, 2016, pp.\u00a0507\u2013511."},{"key":"10.3233\/AIC-200649_ref8","doi-asserted-by":"publisher","DOI":"10.1117\/12.908542"},{"key":"10.3233\/AIC-200649_ref9","unstructured":"T.\u00a0Ghosal, V.\u00a0Edithal, A.\u00a0Ekbal, P.\u00a0Bhattacharyya, G.\u00a0Tsatsaronis, S.S.\u00a0Sameer and K.\u00a0Chivukula, Novelty goes deep. A deep neural solution to document level novelty detection, in: Proceedings of 27th International Conference on Computational Linguistics (COLING 2018), 2018."},{"key":"10.3233\/AIC-200649_ref12","doi-asserted-by":"publisher","first-page":"357","DOI":"10.1038\/s41586-020-2649-2","article-title":"Array programming with NumPy","volume":"585","author":"Harris","year":"2020","journal-title":"Nature"},{"key":"10.3233\/AIC-200649_ref14","first-page":"2825","article-title":"Scikit-learn: Machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"JMLR"},{"key":"10.3233\/AIC-200649_ref15","doi-asserted-by":"crossref","unstructured":"P.J.\u00a0Rousseeuw and A.M.\u00a0Leroy, Robust Regression and Outlier Detection, John Wiley & Sons, Inc., New York, NY, USA, 1987.","DOI":"10.1002\/0471725382"},{"key":"10.3233\/AIC-200649_ref16","doi-asserted-by":"publisher","DOI":"10.1109\/ICDAR.2013.28"},{"key":"10.3233\/AIC-200649_ref17","doi-asserted-by":"publisher","DOI":"10.1109\/RISP.1990.63857"},{"key":"10.3233\/AIC-200649_ref18","doi-asserted-by":"publisher","DOI":"10.1145\/564376.564393"}],"container-title":["AI Communications"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/AIC-200649","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T18:27:56Z","timestamp":1777400876000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/AIC-200649"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12,18]]},"references-count":15,"URL":"https:\/\/doi.org\/10.3233\/aic-200649","relation":{},"ISSN":["1875-8452","0921-7126"],"issn-type":[{"value":"1875-8452","type":"electronic"},{"value":"0921-7126","type":"print"}],"subject":[],"published":{"date-parts":[[2020,12,18]]}}}