{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,4]],"date-time":"2026-05-04T00:28:49Z","timestamp":1777854529972,"version":"3.51.4"},"reference-count":48,"publisher":"SAGE Publications","issue":"6","license":[{"start":{"date-parts":[[2022,1,11]],"date-time":"2022-01-11T00:00:00Z","timestamp":1641859200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Information Science"],"published-print":{"date-parts":[[2023,12]]},"abstract":"<jats:p>\n                    The original K-nearest neighbour ( KNN) algorithm was meant to classify homogeneous complete data, that is, data with only numerical features whose values exist completely. Thus, it faces problems when used with heterogeneous incomplete (HI) data, which has also categorical features and is plagued with missing values. Many solutions have been proposed over the years but most have pitfalls. For example, some solve heterogeneity by converting categorical features into numerical ones, inflicting structural damage. Others solve incompleteness by imputation or elimination, causing semantic disturbance. Almost all use the same K for all query objects, leading to misclassification. In the present work, we introduce KNN\n                    <jats:sup>HI<\/jats:sup>\n                    , a KNN-based algorithm for HI data classification that avoids all these pitfalls. Leveraging rough set theory, KNN\n                    <jats:sup>HI<\/jats:sup>\n                    preserves both categorical and numerical features, leaves missing values untouched and uses a different K for each query. The end result is an accurate classifier, as demonstrated by extensive experimentation on nine datasets mostly from the University of California Irvine repository, using a 10-fold cross-validation technique. We show that KNN\n                    <jats:sup>HI<\/jats:sup>\n                    outperforms six recently published KNN-based algorithms, in terms of precision, recall, accuracy and F-Score. In addition to its function as a mighty classifier, KNN\n                    <jats:sup>HI<\/jats:sup>\n                    can also serve as a K calculator, helping KNN-based algorithms that use a single K value for all queries that find the best such value. Sure enough, we show how four such algorithms improve their performance using the K obtained by KNN\n                    <jats:sup>HI<\/jats:sup>\n                    . Finally, KNN\n                    <jats:sup>HI<\/jats:sup>\n                    exhibits impressive resilience to the degree of incompleteness, degree of heterogeneity and the metric used to measure distance.\n                  <\/jats:p>","DOI":"10.1177\/01655515211069539","type":"journal-article","created":{"date-parts":[[2022,1,11]],"date-time":"2022-01-11T04:48:47Z","timestamp":1641876527000},"page":"1631-1655","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":7,"title":["<i>K<\/i>\n                    NN\n                    <sup>HI<\/sup>\n                    : Resilient\n                    <i>K<\/i>\n                    NN algorithm for heterogeneous incomplete data classification and\n                    <i>K<\/i>\n                    identification using rough set theory"],"prefix":"10.1177","volume":"49","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0928-548X","authenticated-orcid":false,"given":"Ahmed","family":"Hamed","sequence":"first","affiliation":[{"name":"Suez Canal University, Egypt"}]},{"given":"Mohamed","family":"Tahoun","sequence":"additional","affiliation":[{"name":"Suez Canal University, Egypt"}]},{"given":"Hamed","family":"Nassar","sequence":"additional","affiliation":[{"name":"Suez Canal University, Egypt"}]}],"member":"179","published-online":{"date-parts":[[2022,1,11]]},"reference":[{"key":"bibr1-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1007\/s00500-017-2567-x"},{"key":"bibr2-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1016\/j.artmed.2020.101985"},{"key":"bibr3-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2015.08.112"},{"key":"bibr4-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1967.1053964"},{"key":"bibr5-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1002\/9780470977811.ch8"},{"key":"bibr6-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1214\/07-AOS537"},{"key":"bibr7-01655515211069539","volume-title":"Proceedings of the IOP conference series: materials science and engineering","author":"Jaafar H"},{"key":"bibr8-01655515211069539","first-page":"2279","volume-title":"Proceedings of the 2017 IEEE international conference on robotics and biomimetics (ROBIO)","author":"Yi C"},{"key":"bibr9-01655515211069539","first-page":"4417","volume-title":"Proceedings of the 2018 international conference on power system technology (POWERCON)","author":"Fan H"},{"key":"bibr10-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1061\/JPEODX.0000175"},{"key":"bibr11-01655515211069539","first-page":"1","volume-title":"In: Proceedings of the Brazilian symposium on bioinformatics","author":"Jaskowiak PA"},{"key":"bibr12-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2018.04.023"},{"key":"bibr13-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1029\/95WR02966"},{"key":"bibr14-01655515211069539","first-page":"457","volume":"14","author":"Ghosh AK","year":"2004","journal-title":"Stat Sinica"},{"key":"bibr15-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2018.2812279"},{"key":"bibr16-01655515211069539","first-page":"241","volume-title":"Proceedings of the IEEE 8th international workshop on computational advances in multi-sensor adaptive processing (CAMSAP)","author":"Barrash S"},{"key":"bibr17-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2017.09.036"},{"key":"bibr18-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1007\/s13369-020-05212-z"},{"key":"bibr19-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2018.08.021"},{"key":"bibr20-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1016\/j.biosystems.2018.12.009"},{"key":"bibr21-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2020.105606"},{"key":"bibr22-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1007\/s11277-017-4295-z"},{"key":"bibr23-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2020.08.049"},{"key":"bibr24-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2963053"},{"key":"bibr25-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1016\/j.compbiomed.2019.04.017"},{"key":"bibr26-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-28553-1_9"},{"key":"bibr27-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1007\/s40747-018-0090-z"},{"key":"bibr28-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.11.101"},{"key":"bibr29-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2019.03.003"},{"key":"bibr30-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2017.07.012"},{"key":"bibr31-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1016\/j.aei.2017.05.003"},{"key":"bibr32-01655515211069539","first-page":"373","volume-title":"In: Proceedings of the 22nd IEEE international conference on tools with artificial intelligence, Arras","author":"Pereira CL","year":"2010"},{"key":"bibr33-01655515211069539","unstructured":"Blake C. UCI repository of machine learning databases, https:\/\/archive.ics.uci.edu\/ml\/index.php (2021, accessed 31 December 2021, 22:00 GMT)."},{"key":"bibr34-01655515211069539","unstructured":"Boetticher G. The PROMISE repository of empirical software engineering data, http:\/\/promisedata.org\/repository (2021, accessed 7 November 2021, 22:00 GMT)."},{"key":"bibr35-01655515211069539","unstructured":"Horse colic UCI Machine Learning Repository, https:\/\/archive.ics.uci.edu\/ml\/machine-learning-databases\/horse-colic\/horse-colic.data (2021, accessed 07 November 2021, 09:00 GMT)."},{"key":"bibr36-01655515211069539","unstructured":"Annealing UCI Machine Learning Repository, https:\/\/archive.ics.uci.edu\/ml\/machine-learning-databases\/annealing\/anneal.data (2021, accessed 7 November 2021, 09:00 GMT)."},{"key":"bibr37-01655515211069539","unstructured":"Hepatitis UCI Machine Learning Repository, https:\/\/archive.ics.uci.edu\/ml\/machine-learning-databases\/hepatitis\/hepatitis.data (2021, accessed 7 November 2021, 09:00 GMT)."},{"key":"bibr38-01655515211069539","unstructured":"Secom UCI Machine Learning Repository, https:\/\/archive.ics.uci.edu\/ml\/machine-learning-databases\/secom\/secom.data (2021, accessed 7 November 2021, 09:00 GMT)."},{"key":"bibr39-01655515211069539","unstructured":"Camel Promise Repository, https:\/\/github.com\/feiwww\/PROMISE-backup\/blob\/master\/bug-data\/camel\/camel-1.0.csv (2021, accessed 7 November 2021, 09:00 GMT)."},{"key":"bibr40-01655515211069539","unstructured":"Heart UCI Machine Learning Repository, https:\/\/archive.ics.uci.edu\/ml\/machine-learning-databases\/statlog\/heart\/heart.dat (2021, accessed 7 November 2021, 09:00 GMT)."},{"key":"bibr41-01655515211069539","unstructured":"Liver UCI Machine Learning Repository, https:\/\/archive.ics.uci.edu\/ml\/machine-learning-databases\/liver-disorders\/bupa.data (2021, accessed 7 November 2021, 09:00 GMT)."},{"key":"bibr42-01655515211069539","unstructured":"Australian UCI Machine Learning Repository, https:\/\/archive.ics.uci.edu\/ml\/machine-learning-databases\/statlog\/australian\/australian.dat (2021, accessed 7 November 2021, 09:00 GMT)."},{"key":"bibr43-01655515211069539","unstructured":"Climate UCI Machine Learning Repository, https:\/\/archive.ics.uci.edu\/ml\/machine-learning-databases\/00252\/pop_failures.dat (2021, accessed 7 November 2021, 09:00 GMT)."},{"key":"bibr44-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2020.104034"},{"key":"bibr45-01655515211069539","first-page":"345","volume-title":"In: Proceedings of the European conference on information retrieval, Santiago de Compostela","author":"Goutte C","year":"2005"},{"key":"bibr46-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1016\/j.swevo.2011.02.002"},{"key":"bibr47-01655515211069539","doi-asserted-by":"publisher","DOI":"10.1002\/pst.210"},{"key":"bibr48-01655515211069539","doi-asserted-by":"publisher","DOI":"10.22271\/maths.2021.v6.i1a.636"}],"container-title":["Journal of Information Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/01655515211069539","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/01655515211069539","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/01655515211069539","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T23:09:30Z","timestamp":1777504170000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/01655515211069539"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,11]]},"references-count":48,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2023,12]]}},"alternative-id":["10.1177\/01655515211069539"],"URL":"https:\/\/doi.org\/10.1177\/01655515211069539","relation":{},"ISSN":["0165-5515","1741-6485"],"issn-type":[{"value":"0165-5515","type":"print"},{"value":"1741-6485","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,1,11]]}}}