{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T01:34:37Z","timestamp":1777685677837,"version":"3.51.4"},"reference-count":21,"publisher":"SAGE Publications","issue":"4","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["HIS"],"published-print":{"date-parts":[[2021,2,5]]},"abstract":"<jats:p>Statistical uncertainties are rarely incorporated into machine learning algorithms, especially for anomaly detection. Here we present the Bayesian Anomaly Detection And Classification (BADAC) formalism, which provides a unified statistical approach to classification and anomaly detection within a hierarchical Bayesian framework. BADAC deals with uncertainties by marginalising over the unknown, true, value of the data. Using simulated data with Gaussian noise as an example, BADAC is shown to be superior to standard algorithms in both classification and anomaly detection performance in the presence of uncertainties. Additionally, BADAC provides well-calibrated classification probabilities, valuable for use in scientific pipelines. We show that BADAC can work in online mode and is fairly robust to model errors, which can be diagnosed through model-selection methods. In addition it can perform unsupervised new class detection and can naturally be extended to search for anomalous subsets of data. BADAC is therefore ideal where computational cost is not a limiting factor and statistical rigour is important. 
We discuss approximations to speed up BADAC, such as the use of Gaussian processes, and finally introduce a new metric, the Rank-Weighted Score (RWS), that is particularly suited to evaluating an algorithm\u2019s ability to detect anomalies.<\/jats:p>","DOI":"10.3233\/his-200282","type":"journal-article","created":{"date-parts":[[2020,7,3]],"date-time":"2020-07-03T13:23:48Z","timestamp":1593782628000},"page":"207-222","source":"Crossref","is-referenced-by-count":1,"title":["Bayesian anomaly detection and classification for noisy data"],"prefix":"10.1177","volume":"16","author":[{"given":"Ethan","family":"Roberts","sequence":"first","affiliation":[{"name":"University of Cape Town, Rondebosch, Cape Town, South Africa"},{"name":"African Institute of Mathematical Sciences, Muizenburg, Cape Town, South Africa"}]},{"given":"Bruce A.","family":"Bassett","sequence":"additional","affiliation":[{"name":"University of Cape Town, Rondebosch, Cape Town, South Africa"},{"name":"African Institute of Mathematical Sciences, Muizenburg, Cape Town, South Africa"},{"name":"South African Radio Astronomical Observatory, Observatory, Cape Town, South Africa"},{"name":"South African Astronomical Observatory, Observatory, Cape Town, South Africa"}]},{"given":"Michelle","family":"Lochner","sequence":"additional","affiliation":[{"name":"African Institute of Mathematical Sciences, Muizenburg, Cape Town, South Africa"},{"name":"South African Radio Astronomical Observatory, Observatory, Cape Town, South Africa"}]}],"member":"179","reference":[{"key":"10.3233\/HIS-200282_ref1","doi-asserted-by":"crossref","first-page":"14410","DOI":"10.1109\/ACCESS.2018.2807385","article-title":"Threat of adversarial attacks on deep learning in computer vision: A survey","volume":"6","author":"Akhtar","year":"2018","journal-title":"IEEE Access"},{"key":"10.3233\/HIS-200282_ref2","doi-asserted-by":"crossref","unstructured":"L. Breiman and E. Schapire, Random forests, in: Machine Learning, 2001, pp. 
5\u201332.","DOI":"10.1023\/A:1010933404324"},{"issue":"2","key":"10.3233\/HIS-200282_ref3","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1145\/335191.335388","article-title":"Lof: Identifying density-based local outliers","volume":"29","author":"Breunig","year":"2000","journal-title":"SIGMOD Rec."},{"key":"10.3233\/HIS-200282_ref6","doi-asserted-by":"crossref","unstructured":"J. Davis and M. Goadrich, The relationship between precision-recall and roc curves, in: Proceedings of the 23rd International Conference on Machine Learning, ICML \u201906, New York, NY, USA, ACM, 2006, pp. 233\u2013240.","DOI":"10.1145\/1143844.1143874"},{"issue":"2\u20133","key":"10.3233\/HIS-200282_ref8","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1023\/A:1007413511361","article-title":"On the optimality of the simple bayesian classifier under zero-one loss","volume":"29","author":"Domingos","year":"1997","journal-title":"Machine Learning"},{"issue":"2","key":"10.3233\/HIS-200282_ref10","doi-asserted-by":"crossref","first-page":"1687","DOI":"10.1093\/mnras\/stu1866","article-title":"Generalized fisher matrices","volume":"445","author":"Heavens","year":"2014","journal-title":"Mon. Not. Roy. Astron. Soc."},{"issue":"2","key":"10.3233\/HIS-200282_ref11","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1088\/0004-637X\/752\/2\/79","article-title":"Photometric supernova cosmology with beams and sdss-ii","volume":"752","author":"Hlozek","year":"2012","journal-title":"The Astrophysical Journal"},{"key":"10.3233\/HIS-200282_ref13","doi-asserted-by":"crossref","unstructured":"A. Kim and E. 
Linder, Correlated supernova systematics and ground based surveys, JCAP 6(20) (2011).","DOI":"10.1088\/1475-7516\/2011\/06\/020"},{"key":"10.3233\/HIS-200282_ref14","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1088\/1475-7516\/2013\/01\/039","article-title":"Extending BEAMS to incorporate correlated systematic uncertainties","volume":"1","author":"Knights","year":"2013","journal-title":"Journal of Cosmology and Astroparticle Physics"},{"key":"10.3233\/HIS-200282_ref15","first-page":"103508","article-title":"Bayesian estimation applied to multiple species: Towards cosmology with a million supernovae","volume":"D75","author":"Kunz","year":"2007","journal-title":"Phys. Rev."},{"key":"10.3233\/HIS-200282_ref16","doi-asserted-by":"crossref","unstructured":"F.T. Liu, K.M. Ting and Z.-H. Zhou, Isolation forest, in: Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, ICDM \u201908, Washington, DC, USA, 2008, pp.\u00a0413\u2013422. IEEE Computer Society.","DOI":"10.1109\/ICDM.2008.17"},{"issue":"2","key":"10.3233\/HIS-200282_ref17","doi-asserted-by":"crossref","first-page":"442","DOI":"10.1016\/0005-2795(75)90109-9","article-title":"Comparison of the predicted and observed secondary structure of t4 phage lysozyme","volume":"405","author":"Matthews","year":"1975","journal-title":"Biochimica et Biophysica Acta (BBA) \u2013 Protein Structure"},{"issue":"2","key":"10.3233\/HIS-200282_ref18","doi-asserted-by":"crossref","first-page":"913","DOI":"10.1111\/j.1365-2966.2011.20147.x","article-title":"Parameter estimation with Bayesian estimation applied to multiple species in the presence of biases and correlations","volume":"421","author":"Newling","year":"2012","journal-title":"Monthly Notices of the Royal Astronomical Society"},{"key":"10.3233\/HIS-200282_ref19","doi-asserted-by":"crossref","unstructured":"A. Niculescu-Mizil and R. 
Caruana, Predicting good probabilities with supervised learning, in: Proceedings of the 22nd International Conference on Machine Learning, ICML \u201905, New York, NY, USA, ACM, 2005, pp. 625\u2013632.","DOI":"10.1145\/1102351.1102430"},{"key":"10.3233\/HIS-200282_ref20","doi-asserted-by":"crossref","unstructured":"A. Niculescu-Mizil and R. Caruana, Predicting good probabilities with supervised learning, in: Proceedings of the 22nd International Conference on Machine Learning, ACM, 2005, pp. 625\u2013632.","DOI":"10.1145\/1102351.1102430"},{"key":"10.3233\/HIS-200282_ref21","first-page":"2825","article-title":"Scikit-learn: Machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"Journal of Machine Learning Research"},{"issue":"10","key":"10.3233\/HIS-200282_ref23","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1088\/1475-7516\/2017\/10\/036","article-title":"zBEAMS: A unified solution for supernova cosmology with redshift uncertainties","volume":"1710","author":"Roberts","year":"2017","journal-title":"JCAP"},{"key":"10.3233\/HIS-200282_ref24","doi-asserted-by":"crossref","first-page":"162","DOI":"10.1088\/0004-637X\/738\/2\/162","article-title":"Photometric type ia supernova candidates from the 3-year SDSS-II SN survey data","volume":"738","author":"Sako","year":"2011","journal-title":"The Astrophysical Journal"},{"issue":"1","key":"10.3233\/HIS-200282_ref25","doi-asserted-by":"crossref","first-page":"72","DOI":"10.2307\/1412159","article-title":"The proof and measurement of association between two things","volume":"15","author":"Spearman","year":"1904","journal-title":"The American Journal of Psychology"},{"issue":"12","key":"10.3233\/HIS-200282_ref27","doi-asserted-by":"crossref","first-page":"1342","DOI":"10.1109\/34.735807","article-title":"Bayesian classification with gaussian processes","volume":"20","author":"Williams","year":"1998","journal-title":"IEEE Transactions on Pattern Analysis and Machine 
Intelligence"},{"key":"10.3233\/HIS-200282_ref28","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2019\/2686378","article-title":"Recent progress of anomaly detection","volume":"2019","author":"Xu","year":"2019","journal-title":"Complexity"}],"container-title":["International Journal of Hybrid Intelligent Systems"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/HIS-200282","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T08:52:50Z","timestamp":1777452770000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/HIS-200282"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,5]]},"references-count":21,"journal-issue":{"issue":"4"},"URL":"https:\/\/doi.org\/10.3233\/his-200282","relation":{},"ISSN":["1448-5869","1875-8819"],"issn-type":[{"value":"1448-5869","type":"print"},{"value":"1875-8819","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,2,5]]}}}