{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T09:31:15Z","timestamp":1775122275442,"version":"3.50.1"},"reference-count":28,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2020,9,29]],"date-time":"2020-09-29T00:00:00Z","timestamp":1601337600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>The article presents both methods of clustering and outlier detection in complex data, such as rule-based knowledge bases. What distinguishes this work from others is, first, the application of clustering algorithms to rules in domain knowledge bases, and secondly, the use of outlier detection algorithms to detect unusual rules in knowledge bases. The aim of the paper is the analysis of using four algorithms for outlier detection in rule-based knowledge bases: Local Outlier Factor (LOF), Connectivity-based Outlier Factor (COF), K-MEANS, and SMALLCLUSTERS. The subject of outlier mining is very important nowadays. Outliers in rules If-Then mean unusual rules, which are rare in comparing to others and should be explored by the domain expert as soon as possible. In the research, the authors use the outlier detection methods to find a given number of outliers in rules (1%, 5%, 10%), while in small groups, the number of outliers covers no more than 5% of the rule cluster. Subsequently, the authors analyze which of seven various quality indices, which they use for all rules and after removing selected outliers, improve the quality of rule clusters. In the experimental stage, the authors use six different knowledge bases. 
The best results (the most often the clusters quality was improved) are achieved for two outlier detection algorithms LOF and COF.<\/jats:p>","DOI":"10.3390\/e22101096","type":"journal-article","created":{"date-parts":[[2020,9,29]],"date-time":"2020-09-29T08:43:27Z","timestamp":1601369007000},"page":"1096","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Exploration of Outliers in If-Then Rule-Based Knowledge Bases"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7238-1170","authenticated-orcid":false,"given":"Agnieszka","family":"Nowak-Brzezi\u0144ska","sequence":"first","affiliation":[{"name":"Institute of Computer Science, Faculty of Science and Technology, University of Silesia, Bankowa 12, 40-007 Katowice, Poland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0592-8011","authenticated-orcid":false,"given":"Czes\u0142aw","family":"Hory\u0144","sequence":"additional","affiliation":[{"name":"Institute of Computer Science, Faculty of Science and Technology, University of Silesia, Bankowa 12, 40-007 Katowice, Poland"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,9,29]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Yahyaoui, A., Jamil, J., Rasheed, J., and Yesiltepe, M. (2019, January 6\u20137). A decision support system for diabetes prediction using machine learning and deep learning techniques. Proceedings of the 1st International Informatics and Software Engineering Conference (UBMYK), Ankara, Turkey.","DOI":"10.1109\/UBMYK48245.2019.8965556"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Alamelumangai, M., and Sathiyabhama, B. (2017, January 15\u201316). Personalized care: A clinical decision support system for breast cancer screening using clustering and classification. 
Proceedings of the International Conference on Intelligent Computing Systems (ICICS 2017), Madurai, India.","DOI":"10.2139\/ssrn.3134277"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Pandit, R.K., Kolios, A., and Infield, D. (2020). Data-driven weather forecasting models performance comparison for improving offshore wind turbine availability and maintenance. IET Renewable Power Generation, Institution of Engineering and Technology.","DOI":"10.1049\/iet-rpg.2019.0941"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Bazan, J.G., and Szczuka, M.S. (2000, January 16\u201319). RSES and RSESlib\u2014A collection of tools for rough set computations. Proceedings of the Second International Conference on Rough Sets and Current Trends in Computing, Banff, AB, Canada.","DOI":"10.1007\/3-540-45554-X_12"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1016\/j.ins.2019.02.019","article-title":"Exploration of rule-based knowledge bases: A knowledge engineers support","volume":"485","year":"2019","journal-title":"Inf. Sci."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"2065491","DOI":"10.1155\/2018\/2065491","article-title":"Enhancing the efficiency of a decision support system through the clustering of complex rule-based knowledge bases and modification of the inference algorithm","volume":"2018","year":"2018","journal-title":"Complexity"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"857","DOI":"10.2307\/2528823","article-title":"A general coefficient of similarity and some of its properties","volume":"27","author":"Gower","year":"1971","journal-title":"Biometrics"},{"key":"ref_8","unstructured":"Breunig, M.M., Kriegel, H.-P., Ng, R.T., and Sander, J. (2000). Identifying Density\u2014Based Local Outliers, ACM Press."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Tang, J., Chen, Z., Fu, A., and Cheung, D. (2002, January 6\u20138). 
Enhancing effectiveness of outlier detections for low density patterns. Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining 2002, Taipei, Taiwan.","DOI":"10.1007\/3-540-47887-6_53"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhang, K., Hutter, M., and Jin, H. (2009, January 27\u201330). A new local distance-based outlier detection approach for scattered real-world data. Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining 2009, Bangkok, Thailand.","DOI":"10.1007\/978-3-642-01307-2_84"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Kannan, R., Woo, H., Aggarwal, C.C., and Park, H. (2017, January 27\u201329). Outlier detection for text data. Proceedings of the 2017 SIAM International Conference on Data Mining, Houston, TX, USA.","DOI":"10.1137\/1.9781611974973.55"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"691","DOI":"10.1016\/S0167-8655(00)00131-8","article-title":"Two-phase clustering process for outlier detection","volume":"22","author":"Jiang","year":"2001","journal-title":"Pattern Recognit. Lett."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1007\/s101150200013","article-title":"Findout: Finding outliers in very large datasets","volume":"4","author":"Yu","year":"2002","journal-title":"Knowl. Inf. Syst."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1641","DOI":"10.1016\/S0167-8655(03)00003-5","article-title":"Discovering cluster-based local outliers","volume":"24","author":"He","year":"2003","journal-title":"Pattern Recognit. Lett."},{"key":"ref_15","first-page":"429","article-title":"Clustering-Based Outlier Detection Method","volume":"2","author":"Jiang","year":"2008","journal-title":"FSKD"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Pamula, R., Deka, J.K., and Nandi, S. (2011, January 19\u201320). An Outlier Detection Method Based on Clustering. 
Proceedings of the Second International Conference on Emerging Applications of Information Technology EAIT, Kolkata, India.","DOI":"10.1109\/EAIT.2011.25"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1241","DOI":"10.3906\/elk-1210-62","article-title":"Comparison of AIS and fuzzy c-means clustering methods on the classification of breast cancer and diabetes datasets","volume":"22","author":"Ceylan","year":"2014","journal-title":"Turk. J. Electr. Eng. Comput. Sci."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1476","DOI":"10.1016\/j.eswa.2013.08.044","article-title":"Breast cancer diagnosis based on feature extraction using a hybrid of K-means and support vector machine algorithms","volume":"41","author":"Zheng","year":"2014","journal-title":"Expert Syst."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Cao, L., Wei, M., Di, Y., and Rundensteiner, E.A. (2015). Online Outlier Exploration Over Large Datasets, ACM SIGKDD.","DOI":"10.1145\/2783258.2783387"},{"key":"ref_20","unstructured":"Ruff, L., Geornitz, N., Deecke, L., Siddiqui, S.A., Vandermeulen, R., Binder, A., M\u00fcller, E., and Kloft, M. (2018, January 10\u201315). Deep one-class classification. Proceedings of the International Conference on Machine Learning (ICML 2018), Stockholm, Sweden."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Schlegl, T., Seeb\u00f6ck, P., Waldstein, S.M., Schmidt-Erfurth, U., and Langs, G. (2017, January 25\u201330). Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. Proceedings of the International Conference on Information Processing in Medical Imaging, Boone, NC, USA.","DOI":"10.1007\/978-3-319-59050-9_12"},{"key":"ref_22","unstructured":"Zenati, H., Foo, C.S., Lecouat, B., Manek, G., and Chandrasekhar, V.R. (2018). Efficient gan-based anomaly detection. arXiv."},{"key":"ref_23","unstructured":"Li, D., Chen, D., Goh, J., and Ng, S.-K. (2018). 
Anomaly detection with generative adversarial networks for multivariate time series. arXiv."},{"key":"ref_24","unstructured":"K\u0142opotek, W., and Wierzcho\u0144, S. (2017). Algorytmy Analizy Skupie\u0144, PWN."},{"key":"ref_25","unstructured":"Jach, T. (2013). Optymalizacja Proces\u00f3w Wnioskowania z Wiedz\u0105 Niepe\u0142n\u0105. [Ph.D. Thesis, Institute of Computer Science University of Silesia in Katowice]."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"27","DOI":"10.3233\/FI-1997-3113","article-title":"A new version of the rule induction system LERS","volume":"31","year":"1997","journal-title":"Fundam. Inform."},{"key":"ref_27","first-page":"341","article-title":"Rough sets","volume":"11","author":"Pawlak","year":"1982","journal-title":"Int. J. Parallel Program."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Grzyma\u0142a-Busse, J.W. (2005). LERS\u2014A data mining system. The Data Mining and Knowledge Discovery Handbook, Springer.","DOI":"10.1007\/0-387-25465-X_65"}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/22\/10\/1096\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:14:49Z","timestamp":1760177689000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/22\/10\/1096"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9,29]]},"references-count":28,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2020,10]]}},"alternative-id":["e22101096"],"URL":"https:\/\/doi.org\/10.3390\/e22101096","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,9,29]]}}}