{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,11,17]],"date-time":"2023-11-17T00:11:03Z","timestamp":1700179863409},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2022,2,21]],"date-time":"2022-02-21T00:00:00Z","timestamp":1645401600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,3,30]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In this paper incomplete data sets, or data sets with missing attribute values, have three interpretations, lost values, attribute-concept values and \u2018do not care\u2019 conditions. Additionally, the process of data mining is based on two types of probabilistic approximations, global and saturated. We present results of experiments on mining incomplete data sets using six approaches, combining three interpretations of missing attribute values with two types of probabilistic approximations. We compare our six approaches, using the error rate computed as a result of ten-fold cross validation as a criterion of quality. We show that for some data sets the error rate is significantly smaller (5% level of significance) for lost values, for some data sets the smaller error rate is associated with attribute-concept values, and sometimes with \u2018do not care\u2019 conditions. Again, for some approaches the error rate is significantly smaller for saturated probabilistic approximations than for global probabilistic approximations, while for some approaches it is the other way around. Thus, for an incomplete data set, the best approach to data mining should be chosen by trying all six approaches.<\/jats:p>","DOI":"10.1093\/jigpal\/jzac015","type":"journal-article","created":{"date-parts":[[2022,1,25]],"date-time":"2022-01-25T12:19:36Z","timestamp":1643113176000},"page":"223-239","source":"Crossref","is-referenced-by-count":0,"title":["Global and saturated probabilistic approximations based on generalized maximal consistent blocks"],"prefix":"10.1093","volume":"31","author":[{"given":"Patrick G","family":"Clark","sequence":"first","affiliation":[{"name":"Department of Electrical Engineering and Computer Science , University of Kansas, Lawrence, KS 66045, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jerzy W","family":"Grzymala-Busse","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering and Computer Science , University of Kansas, Lawrence, KS 66045, USA and Department of Artificial Intelligence, University of Information Technology and Management, 35-225 Rzeszow, Poland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zdzislaw S","family":"Hippe","sequence":"additional","affiliation":[{"name":"Department of Artificial Intelligence , University of Information Technology and Management, 35-225 Rzeszow, Poland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Teresa","family":"Mroczek","sequence":"additional","affiliation":[{"name":"Department of Artificial Intelligence , University of Information Technology and Management, 35-225 Rzeszow, Poland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rafal","family":"Niemiec","sequence":"additional","affiliation":[{"name":"Department of Artificial Intelligence , University of Information Technology and Management, 35-225 Rzeszow, Poland"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2022,2,21]]},"reference":[{"key":"2023033115514436800_","volume-title":"Classification and Regression Trees","author":"Breiman","year":"1984"},{"key":"2023033115514436800_","first-page":"477","article-title":"Characteristic sets and generalized maximal consistent blocks in mining incomplete data","author":"Clark","year":"2017","journal-title":"In Proceedings of the International Joint Conference on Rough Sets"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1016\/j.ins.2018.04.025","article-title":"Characteristic sets and generalized maximal consistent blocks in mining incomplete data","volume":"453","author":"Clark","year":"2018","journal-title":"Information Sciences"},{"key":"2023033115514436800_","first-page":"324","article-title":"A comparison of concept and global probabilistic approximations based on mining incomplete data","volume-title":"Proceedings of ICIST 2018, the International Conference on Information and Software Technologies","author":"Clark","year":"2018"},{"key":"2023033115514436800_","article-title":"Complexity of rule sets in mining incomplete data using characteristic sets and generalized maximal consistent blocks","author":"Clark","year":"2020","journal-title":"Logic Journal of the IGPL"},{"key":"2023033115514436800_","first-page":"387","article-title":"Global and saturated probabilistic approximations based on generalized maximal consistent blocks","author":"Clark","year":"2020","journal-title":"Proceedings of the 15th International Conference on Hybrid Artificial Intelligence Systems (HAIS 2020)"},{"key":"2023033115514436800_","first-page":"12","article-title":"Mining incomplete data with many missing attribute values. a comparison of probabilistic and rough set approaches","volume-title":"Proceedings of the Second International Conference on Intelligent Systems and Applications","author":"Clark","year":"2013"},{"key":"2023033115514436800_","first-page":"10","article-title":"A comparison of global and saturated probabilistic approximations using characteristic sets in mining incomplete data","volume-title":"Proceedings of the Eight International Conference on Intelligent Systems and Applications","author":"Clark","year":"2019"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"451","DOI":"10.1007\/978-3-030-30275-7_35","article-title":"Rule set complexity in mining incomplete data using global and saturated probabilistic approximations","volume-title":"Proceedings of the 25th International Conference on Information and Software Technologies","author":"Clark","year":"2019"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/978-94-015-7975-9_1","article-title":"LERS\u2014a system for learning from examples based on rough sets","volume-title":"Intelligent Decision Support","author":"Grzymala-Busse","year":"1992"},{"key":"2023033115514436800_","first-page":"243","article-title":"MLEM2: a new algorithm for rule induction from imperfect data","volume-title":"Proceedings of the 9th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems","author":"Grzymala-Busse","year":"2002"},{"key":"2023033115514436800_","first-page":"155","article-title":"A comparison of traditional and rough set approaches to missing attribute values in data mining","volume-title":"Proceedings of the 10th International Conference on Data Mining, Detection, Protection and Security, Royal Mare Village, Crete","author":"Grzymala-Busse","year":"2009"},{"key":"2023033115514436800_","first-page":"153","article-title":"Mining data with missing attribute values: a comparison of probabilistic and rough set approaches","volume-title":"Proceedings of the 4th International Conference on Intelligent Systems and Knowledge Engineering","author":"Grzymala-Busse","year":"2009"},{"key":"2023033115514436800_","first-page":"214","article-title":"Rough set and CART approaches to mining incomplete data","volume-title":"Proceedings of the International Conference on Soft Computing and Pattern Recognition","author":"Grzymala-Busse","year":"2010"},{"key":"2023033115514436800_","first-page":"52","article-title":"A comparison of some rough set approaches to mining symbolic data with missing attribute values","volume-title":"Proceedings of the 19th International Symposium on Methodologies for Intelligent Systems","author":"Grzymala-Busse","year":"2011"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"136","DOI":"10.1007\/978-3-642-24425-4_20","article-title":"Generalized parameterized approximations","volume-title":"Proceedings of the 6th International Conference on Rough Sets and Knowledge Technology","author":"Grzymala-Busse","year":"2011"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"180","DOI":"10.1016\/j.ijar.2013.04.007","article-title":"Generalized probabilistic approximations of incomplete data","volume":"132","author":"Grzymala-Busse","year":"2014","journal-title":"International Journal of Approximate Reasoning"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1002\/int.10012","article-title":"A comparison of three closest fit approaches to missing attribute values in preterm birth data","volume":"17","author":"Grzymala-Busse","year":"2002","journal-title":"International Journal of Intelligent Systems"},{"key":"2023033115514436800_","first-page":"340","article-title":"A comparison of several approaches to missing attribute values in data mining","volume-title":"Proceedings of the Second International Conference on Rough Sets and Current Trends in Computing","author":"Grzymala-Busse","year":"2000"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"244","DOI":"10.1007\/11908029_27","article-title":"Local and global approximations for incomplete data","volume-title":"Proceedings of the Fifth International Conference on Rough Sets and Current Trends in Computing","author":"Grzymala-Busse","year":"2006"},{"key":"2023033115514436800_","first-page":"21","article-title":"Local and global approximations for incomplete data","volume":"8","author":"Grzymala-Busse","year":"2008","journal-title":"Transactions on Rough Sets"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"142","DOI":"10.4018\/978-1-59140-051-6.ch006","article-title":"Data mining based on rough sets","volume-title":"Data Mining: Opportunities and Challenges","author":"Grzymala-Busse","year":"2003"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"164","DOI":"10.1016\/j.ejor.2004.03.032","article-title":"Knowledge acquisition in incomplete information systems: a rough set approach","volume":"168","author":"Leung","year":"2006","journal-title":"European Journal of Operational Research"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1016\/j.ins.2006.06.006","article-title":"Rough sets: some extensions","volume":"177","author":"Pawlak","year":"2007","journal-title":"Information Sciences"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1016\/S0020-7373(88)80032-4","article-title":"Rough sets: probabilistic versus deterministic approach","volume":"29","author":"Pawlak","year":"1988","journal-title":"International Journal of Man-Machine Studies"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"164","DOI":"10.1016\/B978-1-55860-036-2.50048-5","article-title":"Unknown attribute values in induction","volume-title":"Proceedings of the 6th Int. Workshop on Machine Learning","author":"Quinlan","year":"1989"},{"key":"2023033115514436800_","volume-title":"C4.5: Programs for Machine Learning","author":"Quinlan","year":"1993"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1016\/j.ijar.2004.11.004","article-title":"The investigation of the bayesian rough set model","volume":"40","author":"\u015ale\u0327zak","year":"2005","journal-title":"International Journal of Approximate Reasoning"},{"key":"2023033115514436800_","first-page":"713","article-title":"INFER\u2014an adaptive decision support system based on the probabilistic approximate classification","volume-title":"Proceedings of the 6th International Workshop on Expert Systems and Their Applications","author":"Wong","year":"1986"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"255","DOI":"10.1016\/j.ijar.2007.05.019","article-title":"Probabilistic rough set approximations","volume":"49","author":"Yao","year":"2008","journal-title":"International Journal of Approximate Reasoning"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"793","DOI":"10.1016\/0020-7373(92)90069-W","article-title":"A decision theoretic framework for approximate concepts","volume":"37","author":"Yao","year":"1992","journal-title":"International Journal of Man-Machine Studies"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1016\/0022-0000(93)90048-2","article-title":"Variable precision rough set model","volume":"46","author":"Ziarko","year":"1993","journal-title":"Journal of Computer and System Sciences"},{"key":"2023033115514436800_","doi-asserted-by":"crossref","first-page":"272","DOI":"10.1016\/j.ijar.2007.06.014","article-title":"Probabilistic approach to rough sets","volume":"49","author":"Ziarko","year":"2008","journal-title":"International Journal of Approximate Reasoning"}],"container-title":["Logic Journal of the IGPL"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jigpal\/article-pdf\/31\/2\/223\/49705956\/jzac015.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jigpal\/article-pdf\/31\/2\/223\/49705956\/jzac015.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,16]],"date-time":"2023-11-16T07:23:46Z","timestamp":1700119426000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jigpal\/article\/31\/2\/223\/6532159"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,21]]},"references-count":33,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2022,2,21]]},"published-print":{"date-parts":[[2023,3,30]]}},"URL":"https:\/\/doi.org\/10.1093\/jigpal\/jzac015","relation":{},"ISSN":["1367-0751","1368-9894"],"issn-type":[{"value":"1367-0751","type":"print"},{"value":"1368-9894","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,4]]},"published":{"date-parts":[[2022,2,21]]}}}