{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T16:13:09Z","timestamp":1772554389434,"version":"3.50.1"},"reference-count":44,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2020,6,22]],"date-time":"2020-06-22T00:00:00Z","timestamp":1592784000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Defense Advanced Research Projects Agency Contracts","award":["W911NF-11-C-0088 and FA8650-15-C-7557"],"award-info":[{"award-number":["W911NF-11-C-0088 and FA8650-15-C-7557"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2020,8,31]]},"abstract":"<jats:p>Unsupervised anomaly detection algorithms search for outliers and then predict that these outliers are the anomalies. When deployed, however, these algorithms are often criticized for high false-positive and high false-negative rates. One main cause of poor performance is that not all outliers are anomalies and not all anomalies are outliers. In this article, we describe the Active Anomaly Discovery (AAD) algorithm, which incorporates feedback from an expert user that labels a queried data instance as an anomaly or nominal point. This feedback is intended to adjust the anomaly detector so that the outliers it discovers are more in tune with the expert user\u2019s semantic understanding of the anomalies.<\/jats:p>\n          <jats:p>\n            The AAD algorithm is based on a weighted ensemble of anomaly detectors. When it receives a label from the user, it adjusts the weights on each individual ensemble member such that the anomalies rank higher in terms of their anomaly score than the outliers. 
The AAD approach is designed to operate in an interactive data exploration loop. In each iteration of this loop, our algorithm first selects a data instance to present to the expert as a potential anomaly and then the expert labels the instance as an anomaly or as a nominal data point. When it receives the instance label, the algorithm updates its internal model and the loop continues until a budget of\n            <jats:italic>B<\/jats:italic>\n            queries is spent. The goal of our approach is to maximize the total number of true anomalies in the\n            <jats:italic>B<\/jats:italic>\n            instances presented to the expert. We show that the AAD method performs well and in some cases doubles the number of true anomalies found compared to previous methods. In addition we present approximations that make the AAD algorithm much more computationally efficient while maintaining a desirable level of performance.\n          <\/jats:p>","DOI":"10.1145\/3396608","type":"journal-article","created":{"date-parts":[[2020,6,22]],"date-time":"2020-06-22T18:37:32Z","timestamp":1592851052000},"page":"1-32","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Discovering Anomalies by Incorporating Feedback from an Expert"],"prefix":"10.1145","volume":"14","author":[{"given":"Shubhomoy","family":"Das","sequence":"first","affiliation":[{"name":"Oregon State University, SW Park Terrace, Corvallis, Oregon"}]},{"given":"Weng-Keen","family":"Wong","sequence":"additional","affiliation":[{"name":"Oregon State University, SW Park Terrace, Corvallis, Oregon"}]},{"given":"Thomas","family":"Dietterich","sequence":"additional","affiliation":[{"name":"Oregon State University, SW Park Terrace, Corvallis, Oregon"}]},{"given":"Alan","family":"Fern","sequence":"additional","affiliation":[{"name":"Oregon State University, SW Park Terrace, Corvallis, 
Oregon"}]},{"given":"Andrew","family":"Emmott","sequence":"additional","affiliation":[{"name":"Oregon State University, SW Park Terrace, Corvallis, Oregon"}]}],"member":"320","published-online":{"date-parts":[[2020,6,22]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1150402.1150459"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2481244.2481252"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/1009380.1009668"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the Advances in Neural Information Processing Systems.","author":"Boyd Stephen","year":"2012"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/342009.335388"},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of the Advances in Neural Information Processing Systems. 679--686","author":"Cohn David A.","year":"1994"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2016.0102"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence. 75--82","author":"Dereszynski Ethan","year":"2007"},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the 25th International Conference on Machine Learning. ACM, 248--255","author":"Donmez Pinar"},{"key":"e_1_2_1_10_1","volume-title":"Carbonell","author":"Donmez Pinar","year":"2009"},{"key":"e_1_2_1_11_1","volume-title":"Systematic construction of anomaly detection benchmarks from real data. 
CoRR abs\/1503.01158","author":"Emmott Andrew","year":"2015"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1613\/jair.3623"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.comnet.2016.05.021"},{"key":"e_1_2_1_14_1","volume-title":"Advances in Neural Information Processing Systems 20","author":"He Jingrui"},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the 10th International Symposium on Artificial Intelligence and Mathematics.","author":"He Jingrui","year":"2008"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2008.122"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/775047.775067"},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of the 32nd International Conference on Machine Learning. 189--198","author":"Kar Purushottam","year":"2015"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/800057.808695"},{"key":"e_1_2_1_20_1","volume-title":"Ng","author":"Knorr Edwin M.","year":"1998"},{"key":"e_1_2_1_21_1","volume-title":"Quantile Regression","author":"Koenker Roger"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1081870.1081891"},{"key":"e_1_2_1_23_1","unstructured":"Moshe Lichman. 2013. UCI Machine Learning Repository. 
Retrieved from http:\/\/archive.ics.uci.edu\/ml."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2008.17"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1561\/1500000016"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1835449.1835495"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/JISIC.2014.23"},{"key":"e_1_2_1_28_1","volume-title":"Moore","author":"Pelleg Dan","year":"2004"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-015-5521-0"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390156.1390255"},{"key":"e_1_2_1_31_1","volume-title":"Less is more: Building selective anomaly ensembles. ACM Transactions on Knowledge Discovery from Data 10, 4","author":"Rayana Shebuti","year":"2016"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2487575.2488213"},{"key":"e_1_2_1_33_1","first-page":"55","article-title":"Active learning literature survey. Technical Report. University of Wisconsin","volume":"52","author":"Settles Burr","year":"2010","journal-title":"Madison"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.3115\/1613715.1613855"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/130385.130417"},{"key":"e_1_2_1_36_1","volume-title":"ALADIN: Active learning of anomalies to detect intrusions. Technique Report. Microsoft Network Security Redmond, WA 98052","author":"Stokes Jack W.","year":"2008"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the 17th International Conference on Machine Learning. 
999--1006","author":"Tong Simon","year":"2000"},{"key":"e_1_2_1_38_1","first-page":"2579","article-title":"Visualizing Data Using t-SNE","author":"van der Maaten Laurens","year":"2008","journal-title":"Journal of Machine Learning Research 9"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1557019.1557112"},{"key":"e_1_2_1_40_1","volume-title":"Proceedings of the IEEE International Conference on Big Data Security.","author":"Veeramachaneni Kalyan","year":"2016"},{"key":"e_1_2_1_41_1","first-page":"1961","article-title":"What\u2019s strange about recent events (WSARE): An algorithm for the early detection of disease outbreaks","author":"Wong Weng-Keen","year":"2005","journal-title":"Journal of Machine Learning Research 6"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1142\/S0218001493000698"},{"key":"e_1_2_1_43_1","series-title":"Lecture Notes in Computer Science","volume-title":"Advances in Information Retrieval","author":"Xu Zuobing"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2594473.2594476"}],"container-title":["ACM Transactions on Knowledge Discovery from 
Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3396608","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3396608","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:33:28Z","timestamp":1750199608000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3396608"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,22]]},"references-count":44,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,8,31]]}},"alternative-id":["10.1145\/3396608"],"URL":"https:\/\/doi.org\/10.1145\/3396608","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,6,22]]},"assertion":[{"value":"2017-06-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-04-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-06-22","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}