{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T01:01:06Z","timestamp":1770512466531,"version":"3.49.0"},"reference-count":50,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,12,19]],"date-time":"2025-12-19T00:00:00Z","timestamp":1766102400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,12,19]],"date-time":"2025-12-19T00:00:00Z","timestamp":1766102400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"EPSRC Centre for Doctoral Training in Modern Statistics and Statistical Machine Learning","award":["EP\/S023151\/1"],"award-info":[{"award-number":["EP\/S023151\/1"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Stat Comput"],"published-print":{"date-parts":[[2026,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Outlier detection is an important data mining tool that becomes particularly challenging when dealing with nominal data. First and foremost, flagging observations as outlying requires a well-defined notion of nominal outlyingness. This paper presents a definition of nominal outlyingness and introduces a general framework for quantifying outlyingness of nominal data. The proposed framework makes use of ideas from the association rule mining literature and can be used for calculating scores that indicate how outlying a nominal observation is. Methods for determining the involved hyperparameter values are presented and the concepts of variable contributions and outlyingness depth are introduced, in an attempt to enhance interpretability of the results. The proposed framework is evaluated on both synthetic and publicly available data sets, demonstrating comparable performance to state-of-the-art frequent pattern mining algorithms and even outperforming them in certain cases. The ideas presented can serve as a tool for assessing the degree to which an observation differs from the rest of the data, under the assumption of sequences of nominal levels having been generated from a Multinomial distribution with varying event probabilities.<\/jats:p>","DOI":"10.1007\/s11222-025-10798-1","type":"journal-article","created":{"date-parts":[[2025,12,19]],"date-time":"2025-12-19T13:48:35Z","timestamp":1766152115000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["A novel framework for quantifying nominal outlyingness"],"prefix":"10.1007","volume":"36","author":[{"given":"Efthymios","family":"Costa","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ioanna","family":"Papatsouma","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2025,12,19]]},"reference":[{"key":"10798_CR1","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-47534-9","volume-title":"Data Streams: Models and Algorithms","author":"CC Aggarwal","year":"2007","unstructured":"Aggarwal, C.C.: Data Streams: Models and Algorithms, vol. 31. Springer, Germany (2007)"},{"issue":"1","key":"10798_CR2","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1145\/2830544.2830549","volume":"17","author":"CC Aggarwal","year":"2015","unstructured":"Aggarwal, C.C., Sathe, S.: Theoretical foundations and algorithms for outlier ensembles. ACM SIGKDD Explor. Newslett. 17(1), 24\u201347 (2015)","journal-title":"ACM SIGKDD Explor. Newslett."},{"key":"10798_CR3","unstructured":"Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proceedings of the International Conference on Very Large Data Bases, VLDB, Citeseer, 487\u2013499 (1994)"},{"key":"10798_CR4","doi-asserted-by":"crossref","unstructured":"Agrawal, R., Imieli\u0144ski, T., Swami, A.: Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, pp 207\u2013216 (1993)","DOI":"10.1145\/170035.170072"},{"issue":"2","key":"10798_CR5","doi-asserted-by":"publisher","first-page":"3240","DOI":"10.1016\/j.eswa.2008.01.009","volume":"36","author":"MF Akay","year":"2009","unstructured":"Akay, M.F.: Support vector machines combined with feature selection for breast cancer diagnosis. Expert Syst. Appl. 36(2), 3240\u20133247 (2009)","journal-title":"Expert Syst. Appl."},{"key":"10798_CR6","doi-asserted-by":"crossref","unstructured":"Bay, S.D., Schwabacher, M.: Mining distance-based outliers in near linear time with randomization and a simple pruning rule. In: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 29\u201338 (2003)","DOI":"10.1145\/956750.956758"},{"issue":"447","key":"10798_CR7","doi-asserted-by":"publisher","first-page":"947","DOI":"10.1080\/01621459.1999.10474199","volume":"94","author":"C Becker","year":"1999","unstructured":"Becker, C., Gather, U.: The masking breakdown point of multivariate outlier identification rules. J. Am. Stat. Assoc. 94(447), 947\u2013955 (1999)","journal-title":"J. Am. Stat. Assoc."},{"issue":"4","key":"10798_CR8","doi-asserted-by":"publisher","first-page":"2095","DOI":"10.1007\/s00405-023-08299-w","volume":"281","author":"S Borzooei","year":"2024","unstructured":"Borzooei, S., Briganti, G., Golparian, M., et al.: Machine learning for risk stratification of thyroid cancer patients: a 15-year cohort study. Eur. Arch. Otorhinolaryngol. 281(4), 2095\u20132104 (2024)","journal-title":"Eur. Arch. Otorhinolaryngol."},{"issue":"2","key":"10798_CR9","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1016\/0749-5978(92)90059-G","volume":"53","author":"GL Bradshaw","year":"1992","unstructured":"Bradshaw, G.L., Shaw, D.: Forecasting solar flares: experts and artificial systems. Organ. Behav. Hum. Decis. Process. 53(2), 135\u2013157 (1992)","journal-title":"Organ. Behav. Hum. Decis. Process."},{"key":"10798_CR10","doi-asserted-by":"crossref","unstructured":"Breunig, M.M., Kriegel, H.P., Ng, R.T., et al.: LOF: identifying density-based local outliers. In: Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, 93\u2013104 (2000)","DOI":"10.1145\/342009.335388"},{"key":"10798_CR11","doi-asserted-by":"publisher","DOI":"10.1016\/j.cam.2020.113214","volume":"386","author":"A Calvino","year":"2021","unstructured":"Calvino, A., Martin, N., Pardo, L.: Robustness of minimum density power divergence estimators and wald-type test statistics in loglinear models with multinomial sampling. J. Comput. Appl. Math. 386, 113214 (2021)","journal-title":"J. Comput. Appl. Math."},{"key":"10798_CR12","doi-asserted-by":"publisher","first-page":"891","DOI":"10.1007\/s10618-015-0444-8","volume":"30","author":"GO Campos","year":"2016","unstructured":"Campos, G.O., Zimek, A., Sander, J., et al.: On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study. Data Min. Knowl. Disc. 30, 891\u2013927 (2016)","journal-title":"Data Min. Knowl. Disc."},{"key":"10798_CR13","unstructured":"Chen, T., Tang, L.A., Sun, Y., et al.: Entity embedding-based anomaly detection for heterogeneous categorical events. In: IJCAI\u201916: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence. AAAI Press, 1396\u20131403 (2016)"},{"key":"10798_CR14","doi-asserted-by":"publisher","first-page":"325","DOI":"10.1016\/j.neucom.2019.07.069","volume":"365","author":"L Cheng","year":"2019","unstructured":"Cheng, L., Wang, Y., Ma, X.: A neural probabilistic outlier detection method for categorical data. Neurocomputing 365, 325\u2013335 (2019)","journal-title":"Neurocomputing"},{"key":"10798_CR15","unstructured":"Clark, P., Niblett, T.: Induction in noisy domains. In: Proceedings of the 2nd European Working Session on Learning (EWSL) (1987)"},{"key":"10798_CR16","doi-asserted-by":"crossref","unstructured":"Costa, E.: SONO: scores of Nominal outlyingness (SONO). https:\/\/cran.r-project.org\/package=SONO, R package version 1.2 (2025)","DOI":"10.32614\/CRAN.package.SONO"},{"issue":"2","key":"10798_CR17","doi-asserted-by":"publisher","first-page":"553","DOI":"10.1214\/aos\/1031833664","volume":"25","author":"JA Cuesta-Albertos","year":"1997","unstructured":"Cuesta-Albertos, J.A., Gordaliza, A., Matr\u00e1n, C.: Trimmed $$k$$-means: an attempt to robustify quantizers. Ann. Stat. 25(2), 553\u2013576 (1997)","journal-title":"Ann. Stat."},{"issue":"423","key":"10798_CR18","doi-asserted-by":"publisher","first-page":"782","DOI":"10.1080\/01621459.1993.10476339","volume":"88","author":"L Davies","year":"1993","unstructured":"Davies, L., Gather, U.: The identification of multiple outliers. J. Am. Stat. Assoc. 88(423), 782\u2013792 (1993)","journal-title":"J. Am. Stat. Assoc."},{"key":"10798_CR19","volume-title":"Intrusion Detection Systems","author":"R Di Pietro","year":"2008","unstructured":"Di Pietro, R., Mancini, L.V.: Intrusion Detection Systems, vol. 38. Springer Science & Business Media, Germany (2008)"},{"key":"10798_CR20","doi-asserted-by":"crossref","unstructured":"Ghoting, A., Otey, M.E., Parthasarathy, S.: LOADED: link-based outlier and anomaly detection in evolving data sets. In: Fourth IEEE International Conference on Data Mining (ICDM\u201904), IEEE, pp 387\u2013390 (2004)","DOI":"10.1109\/ICDM.2004.10011"},{"issue":"1","key":"10798_CR21","doi-asserted-by":"publisher","first-page":"29","DOI":"10.1148\/radiology.143.1.7063747","volume":"143","author":"JA Hanley","year":"1982","unstructured":"Hanley, J.A., McNeil, B.J.: The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143(1), 29\u201336 (1982)","journal-title":"Radiology"},{"key":"10798_CR22","doi-asserted-by":"crossref","unstructured":"He, Z., Xu, X., Huang, J.Z., et al.: A frequent pattern discovery method for outlier detection. In: Advances in Web-Age Information Management: 5th International Conference (WAIM), Springer, pp 726\u2013732 (2004)","DOI":"10.1007\/978-3-540-27772-9_80"},{"key":"10798_CR23","volume-title":"Finding Groups In Data: An Introduction To Cluster Analysis","author":"L Kaufman","year":"2009","unstructured":"Kaufman, L., Rousseeuw, P.J.: Finding Groups In Data: An Introduction To Cluster Analysis. John Wiley & Sons, USA (2009)"},{"key":"10798_CR24","unstructured":"Kelly, M., Longjohn, R., Nottingham, K.: UCI Machine Learning Repository. http:\/\/archive.ics.uci.edu\/ml (2024)"},{"issue":"2","key":"10798_CR25","doi-asserted-by":"publisher","first-page":"259","DOI":"10.1007\/s10618-009-0148-z","volume":"20","author":"A Koufakou","year":"2010","unstructured":"Koufakou, A., Georgiopoulos, M.: A fast outlier detection strategy for distributed high-dimensional data sets with mixed attributes. Data Min. Knowl. Disc. 20(2), 259\u2013289 (2010)","journal-title":"Data Min. Knowl. Disc."},{"key":"10798_CR26","doi-asserted-by":"crossref","unstructured":"Koufakou, A., Ortiz, E.G., Georgiopoulos, M., et al.: A scalable and efficient outlier detection strategy for categorical data. In: 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007), IEEE, pp 210\u2013217 (2007)","DOI":"10.1109\/ICTAI.2007.125"},{"key":"10798_CR27","unstructured":"Kucha\u0159, J., Sv\u00e1tek, V.: Spotlighting anomalies using frequent patterns. In: KDD 2017 Workshop on Anomaly Detection in Finance, PMLR, 33\u201342 (2018)"},{"issue":"3","key":"10798_CR28","doi-asserted-by":"publisher","first-page":"281","DOI":"10.1007\/s00184-008-0230-3","volume":"71","author":"S Kuhnt","year":"2010","unstructured":"Kuhnt, S.: Breakdown concepts for contingency tables. Metrika 71(3), 281\u2013294 (2010)","journal-title":"Metrika"},{"key":"10798_CR29","doi-asserted-by":"publisher","first-page":"481","DOI":"10.1007\/s11222-013-9382-8","volume":"24","author":"S Kuhnt","year":"2014","unstructured":"Kuhnt, S., Rapallo, F., Rehage, A.: Outlier detection in contingency tables based on minimal patterns. Stat. Comput. 24, 481\u2013491 (2014)","journal-title":"Stat. Comput."},{"key":"10798_CR30","doi-asserted-by":"crossref","unstructured":"Lazarevic, A., Kumar, V.: Feature bagging for outlier detection. In: Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, 157\u2013166 (2005)","DOI":"10.1145\/1081870.1081891"},{"key":"10798_CR31","unstructured":"Lee, W., Xiang, D.: Information-theoretic measures for anomaly detection. In: Proceedings 2001 IEEE Symposium on Security and Privacy. S &P 2001, IEEE, 130\u2013143 (2000)"},{"key":"10798_CR32","doi-asserted-by":"crossref","unstructured":"Li, S., Lee, R., Lang, S.D.: Mining distance-based outliers from categorical data. In: Seventh IEEE International Conference on Data Mining Workshops (ICDMW 2007), IEEE, 225\u2013230 (2007)","DOI":"10.1109\/ICDMW.2007.75"},{"key":"10798_CR33","doi-asserted-by":"crossref","unstructured":"Liu, F.T., Ting, K.M., Zhou, Z.H.: Isolation forest. In: 2008 Eighth IEEE International Conference on Data Mining, IEEE, 413\u2013422 (2008)","DOI":"10.1109\/ICDM.2008.17"},{"key":"10798_CR34","unstructured":"Michalski, R.S., Mozeti\u010d, I., Hong, J., et al.: The multi-purpose incremental learning system AQ15 and its testing application to three medical domains. In: AAAI Conference on Artificial Intelligence (1986)"},{"issue":"3","key":"10798_CR35","doi-asserted-by":"publisher","first-page":"559","DOI":"10.1016\/j.dss.2010.08.006","volume":"50","author":"EW Ngai","year":"2011","unstructured":"Ngai, E.W., Hu, Y., Wong, Y.H., et al.: The application of data mining techniques in financial fraud detection: a classification framework and an academic review of literature. Decis. Support Syst. 50(3), 559\u2013569 (2011)","journal-title":"Decis. Support Syst."},{"issue":"2","key":"10798_CR36","doi-asserted-by":"publisher","first-page":"203","DOI":"10.1007\/s10618-005-0014-6","volume":"12","author":"ME Otey","year":"2006","unstructured":"Otey, M.E., Ghoting, A., Parthasarathy, S.: Fast distributed outlier detection in mixed-attribute data sets. Data Min. Knowl. Disc. 12(2), 203\u2013228 (2006)","journal-title":"Data Min. Knowl. Disc."},{"issue":"388","key":"10798_CR37","doi-asserted-by":"publisher","first-page":"871","DOI":"10.1080\/01621459.1984.10477105","volume":"79","author":"PJ Rousseeuw","year":"1984","unstructured":"Rousseeuw, P.J.: Least median of squares regression. J. Am. Stat. Assoc. 79(388), 871\u2013880 (1984)","journal-title":"J. Am. Stat. Assoc."},{"issue":"283\u2013297","key":"10798_CR38","first-page":"37","volume":"8","author":"PJ Rousseeuw","year":"1985","unstructured":"Rousseeuw, P.J.: Multivariate estimation with high breakdown point. Math. Stat. Appl. 8(283\u2013297), 37 (1985)","journal-title":"Math. Stat. Appl."},{"key":"10798_CR39","doi-asserted-by":"publisher","first-page":"62","DOI":"10.1016\/j.socnet.2014.05.002","volume":"39","author":"D Savage","year":"2014","unstructured":"Savage, D., Zhang, X., Yu, X., et al.: Anomaly detection in online social networks. Soc. Netw. 39, 62\u201370 (2014)","journal-title":"Soc. Netw."},{"key":"10798_CR40","doi-asserted-by":"publisher","first-page":"1207","DOI":"10.1016\/j.procs.2019.04.173","volume":"151","author":"J Silva","year":"2019","unstructured":"Silva, J., Varela, N., L\u00f3pez, L.A.B., et al.: Association rules extraction for customer segmentation in the SMEs sector using the apriori algorithm. Procedia Comput. Sci. 151, 1207\u20131212 (2019)","journal-title":"Procedia Comput. Sci."},{"key":"10798_CR41","doi-asserted-by":"crossref","unstructured":"Sison, C.P., Glaz, J.: Simultaneous confidence intervals and sample size determination for multinomial proportions. J. Am. Stat. Assoc. 90(429), 366\u2013369 (1995)","DOI":"10.1080\/01621459.1995.10476521"},{"key":"10798_CR42","doi-asserted-by":"crossref","unstructured":"Sripriya, T., Srinivasan, M., Gallo, M.: Robust distance measure to detect outliers for categorical data. Soft. Comput. 24, 13557\u201313564 (2020)","DOI":"10.1007\/s00500-019-04340-5"},{"key":"10798_CR43","unstructured":"Stahel, W.A.: Robuste sch\u00e4tzungen: infinitesimale optimalit\u00e4t und sch\u00e4tzungen von kovarianzmatrizen. PhD thesis, ETH Z\u00fcrich (1981)"},{"issue":"2","key":"10798_CR44","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3312739","volume":"52","author":"A Taha","year":"2019","unstructured":"Taha, A., Hadi, A.S.: Anomaly detection methods for categorical data: a review. ACM Comput. Surv. (CSUR) 52(2), 1\u201335 (2019)","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"10798_CR45","doi-asserted-by":"crossref","unstructured":"Tschuchnig, M.E., Gadermayr, M.: Anomaly detection in medical imaging - a mini review. In: Data Science \u2013 Analytics and Applications. Springer, 33\u201338 (2022)","DOI":"10.1007\/978-3-658-36295-9_5"},{"issue":"7","key":"10798_CR46","doi-asserted-by":"publisher","first-page":"1615","DOI":"10.1080\/03610926.2020.1716255","volume":"50","author":"YA \u00dcnvan","year":"2021","unstructured":"\u00dcnvan, Y.A.: Market basket analysis with association rules. Commun. Stat.-Theory Methods 50(7), 1615\u20131628 (2021)","journal-title":"Commun. Stat.-Theory Methods"},{"issue":"3","key":"10798_CR47","doi-asserted-by":"publisher","first-page":"589","DOI":"10.1109\/TKDE.2011.261","volume":"25","author":"S Wu","year":"2011","unstructured":"Wu, S., Wang, S.: Information-theoretic outlier detection for large-scale categorical data. IEEE Trans. Knowl. Data Eng. 25(3), 589\u2013602 (2011)","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"10798_CR48","doi-asserted-by":"publisher","first-page":"469","DOI":"10.1007\/s13042-013-0202-4","volume":"5","author":"X Zhao","year":"2014","unstructured":"Zhao, X., Liang, J., Cao, F.: A simple and effective outlier detection algorithm for categorical data. Int. J. Mach. Learn. Cybern. 5, 469\u2013477 (2014)","journal-title":"Int. J. Mach. Learn. Cybern."},{"key":"10798_CR49","doi-asserted-by":"crossref","unstructured":"Zimek, A., Gaudet, M., Campello, R.J., et al.: Subsampling for efficient and effective unsupervised outlier detection ensembles. In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 428\u2013436 (2013)","DOI":"10.1145\/2487575.2487676"},{"issue":"2","key":"10798_CR50","first-page":"461","volume":"28","author":"Y Zuo","year":"2000","unstructured":"Zuo, Y., Serfling, R.: General notions of statistical depth function. Ann. Stat. 28(2), 461\u2013482 (2000)","journal-title":"Ann. Stat."}],"container-title":["Statistics and Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11222-025-10798-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11222-025-10798-1","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11222-025-10798-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T03:53:56Z","timestamp":1770436436000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11222-025-10798-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,19]]},"references-count":50,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2026,2]]}},"alternative-id":["10798"],"URL":"https:\/\/doi.org\/10.1007\/s11222-025-10798-1","relation":{},"ISSN":["0960-3174","1573-1375"],"issn-type":[{"value":"0960-3174","type":"print"},{"value":"1573-1375","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12,19]]},"assertion":[{"value":"4 February 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 December 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 December 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"41"}}