{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:43:20Z","timestamp":1760237000484,"version":"build-2065373602"},"reference-count":17,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2020,2,11]],"date-time":"2020-02-11T00:00:00Z","timestamp":1581379200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Knowledge base (KB) is an important aspect in artificial intelligence. One significant challenge faced by KB construction is that it contains many noises, which prevent its effective usage. Even though some KB cleansing algorithms have been proposed, they focus on the structure of the knowledge graph and neglect the relation between the concepts, which could be helpful to discover wrong relations in KB. Motivated by this, we measure the relation of two concepts by the distance between their corresponding instances and detect errors within the intersection of the conflicting concept sets. For efficient and effective knowledge base cleansing, we first apply a distance-based model to determine the conflicting concept sets using two different methods. Then, we propose and analyze several algorithms on how to detect and repair the errors based on our model, where we use a hash method for an efficient way to calculate distance. 
Experimental results demonstrate that the proposed approaches could cleanse the knowledge bases efficiently and effectively.<\/jats:p>","DOI":"10.3390\/info11020097","type":"journal-article","created":{"date-parts":[[2020,2,11]],"date-time":"2020-02-11T11:45:30Z","timestamp":1581421530000},"page":"97","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Error Detection in a Large-Scale Lexical Taxonomy"],"prefix":"10.3390","volume":"11","author":[{"given":"Yinan","family":"An","sequence":"first","affiliation":[{"name":"Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150000, China"}]},{"given":"Sifan","family":"Liu","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150000, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7521-2871","authenticated-orcid":false,"given":"Hongzhi","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150000, China"}]}],"member":"1968","published-online":{"date-parts":[[2020,2,11]]},"reference":[{"key":"ref_1","unstructured":"Weikum, G. (2007). Yago: A Core of Semantic Knowledge. International Conference on World Wide Web, Association for Computing Machinery."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Yu, L. (2014). DBpedia. A Developer\u2019s Guide to the Semantic Web, Springer Science & Business Media.","DOI":"10.1007\/978-3-662-43796-4_8"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"897","DOI":"10.1016\/S0888-7543(05)80111-9","article-title":"A knowledge base for predicting protein localization sites in eukaryotic cells","volume":"14","author":"Nakai","year":"1992","journal-title":"Genomics"},{"key":"ref_4","unstructured":"Murray, K.J.B. (1986). 
Knowledge-Based Model Construction: An Automatic Programming Approach to Simulation Modeling. [Ph.D Thesis, Texas A&M University]."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Liang, J., Xiao, Y., Zhang, Y., Hwang, S., and Wang, H. (2017, January 4\u20139). Graph-Based Wrong IsA Relation Detection in a Large-Scale Lexical Taxonomy. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 2017.","DOI":"10.1609\/aaai.v31i1.10676"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Lu, H., Fan, W., Goh, C.H., Madnick, S.E., and Cheung, D.W. (1998). Discovering and reconciling semantic conflicts: A data mining perspective. Data Mining and Reverse Engineering, Springer.","DOI":"10.1007\/978-0-387-35300-5_17"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Hearst, M.A. (1992). Automatic acquisition of hyponyms from large text corpora. Conference on Computational Linguistics, Association for Computational Linguistics.","DOI":"10.3115\/992133.992154"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Espinosa-Anke, L., Ronzano, F., and Saggion, H. (2015). Hypernym Extraction: Combining Machine-Learning and Dependency Grammar, CICLing.","DOI":"10.1007\/978-3-319-18111-0_28"},{"key":"ref_9","unstructured":"Broder, A. (1997). On the Resemblance and Containment of Documents, IEEE."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"630","DOI":"10.1006\/jcss.1999.1690","article-title":"Min-Wise Independent Permutations","volume":"60","author":"Broder","year":"2000","journal-title":"J. Comput. Syst. Sci."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"266","DOI":"10.1007\/978-3-540-30466-1_25","article-title":"OWL-Based Semantic Conflicts Detection and Resolution for Data Interoperability","volume":"3289","author":"Li","year":"2004","journal-title":"Lec. Notes Comput. 
Sci."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Van der Broeck, J., Cunningham, S.A., Eeckels, R., and Herbst, K. (2005). Data cleaning: Detecting, diagnosing, and editing data abnormalities. Plos Med., 2.","DOI":"10.1371\/journal.pmed.0020267"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1038\/nature06830","article-title":"Hierarchical structure and the prediction of missing links in networks","volume":"453","author":"Clauset","year":"2008","journal-title":"Nature"},{"key":"ref_14","unstructured":"Gupte, M., Shankar, P., Li, J., Muthukrishnan, S., and Iftode, L. (April, January 28). Finding hierarchy in directed online social networks. Proceedings of the International Conference on World Wide Web, Hyderabad, India."},{"key":"ref_15","unstructured":"Tong, S. (2014). Document Similarity Detection. (No. 8650199), U.S. Patent."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1002\/j.1538-7305.1950.tb00463.x","article-title":"Error Detecting and Error Correcting Codes","volume":"29","author":"Hamming","year":"2014","journal-title":"Bell Syst. Tech. 
J."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1111\/j.1469-8137.1912.tb05611.x","article-title":"The Distribution of the flora in the alpine zone.1","volume":"11","author":"Jaccard","year":"2010","journal-title":"New Phytolog."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/11\/2\/97\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T08:56:55Z","timestamp":1760173015000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/11\/2\/97"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,2,11]]},"references-count":17,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2020,2]]}},"alternative-id":["info11020097"],"URL":"https:\/\/doi.org\/10.3390\/info11020097","relation":{},"ISSN":["2078-2489"],"issn-type":[{"type":"electronic","value":"2078-2489"}],"subject":[],"published":{"date-parts":[[2020,2,11]]}}}