{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,10]],"date-time":"2026-06-10T15:25:35Z","timestamp":1781105135380,"version":"3.54.1"},"reference-count":18,"publisher":"IGI Global Scientific Publishing","issue":"4","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,10,1]]},"abstract":"<p>In the information era, data is crucial in decision making. Most data sets contain impurities that need to be weeded out before any meaningful decision can be made from the data. Hence, data cleaning is essential and often takes more than 80 percent of time and resources of the data analyst. Adequate tools and techniques must be used for data cleaning. There exist a lot of data cleaning tools but it is unclear how to choose them in various situations. This research aims at helping researchers and organizations choose the right tools for data cleaning. This article conducts a comparative study of four commonly used data cleaning tools on two real data sets and answers the research question of which tool will be useful based on different scenario.<\/p>","DOI":"10.4018\/ijdwm.2019100103","type":"journal-article","created":{"date-parts":[[2019,9,13]],"date-time":"2019-09-13T10:16:59Z","timestamp":1568369819000},"page":"48-65","source":"Crossref","is-referenced-by-count":12,"title":["A Comparative Study of Data Cleaning Tools"],"prefix":"10.4018","volume":"15","author":[{"given":"Samson","family":"Oni","sequence":"first","affiliation":[{"name":"University of Maryland Baltimore County, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zhiyuan","family":"Chen","sequence":"additional","affiliation":[{"name":"University of Maryland Baltimore County, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Susan","family":"Hoban","sequence":"additional","affiliation":[{"name":"University of Maryland, Baltimore County, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Onimi","family":"Jademi","sequence":"additional","affiliation":[{"name":"University of Maryland, Baltimore County, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"2432","reference":[{"key":"IJDWM.2019100103-0","doi-asserted-by":"publisher","DOI":"10.1145\/27633.27634"},{"key":"IJDWM.2019100103-1","article-title":"A review of data fusion techniques.","author":"F.Castanedo","year":"2013","journal-title":"The Scientific World Journal"},{"key":"IJDWM.2019100103-2","doi-asserted-by":"publisher","DOI":"10.1002\/0471448354"},{"key":"IJDWM.2019100103-3","doi-asserted-by":"crossref","unstructured":"Galhardas, H., Florescu, D., Shasha, D., & Simon, E. (2000). AJAX: an extensible data cleaning tool.","DOI":"10.1145\/342009.336568"},{"key":"IJDWM.2019100103-4","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2016.2569061"},{"issue":"2","key":"IJDWM.2019100103-5","article-title":"R and the Journal of Statistical Software.","volume":"73","author":"F.John","year":"2016","journal-title":"Journal of Statistical Software"},{"key":"IJDWM.2019100103-6","doi-asserted-by":"crossref","unstructured":"Kandel, S., Paepcke, A., Hellerstein, J., & Heer, J. (2011). Wrangler: Interactive visual specification of data transformation scripts. Paper presented at theProceedings of the SIGCHI Conference on Human Factors in Computing Systems.","DOI":"10.1145\/1978942.1979444"},{"issue":"7","key":"IJDWM.2019100103-7","first-page":"371","article-title":"Comparative Analysis of Data Cleaning Tools Using SQL Server and Winpure Tool.","volume":"3","author":"A. E.Karrar","year":"2016","journal-title":"International Journal of Computer Applications in Technology"},{"issue":"1","key":"IJDWM.2019100103-8","article-title":"Extraction, Transformation, Loading (ETL) and Data Cleaning Problems.","volume":"6","author":"S.Kumar","year":"2008","journal-title":"Journal of Independent Studies and Research on Computing"},{"key":"IJDWM.2019100103-9","doi-asserted-by":"crossref","unstructured":"Lee, M. L., Lu, H., Ling, T. W., & Ko, Y. T. (1999). Cleansing data for mining and warehousing. Paper presented at the10th International Conference on Database and Expert Systems Applications.","DOI":"10.1007\/3-540-48309-8_70"},{"key":"IJDWM.2019100103-10","doi-asserted-by":"crossref","unstructured":"Martinez-Mosquera, D., Luj\u00e1n-Mora, S., L\u00f3pez, G., & Santos, L. (2017). Data Cleaning Technique for Security Logs Based on Fellegi-Sunter Theory. Paper presented at the SIGSAND-EuroSymposium, Gdansk, Poland.","DOI":"10.1007\/978-3-319-66996-0_1"},{"key":"IJDWM.2019100103-11","author":"H.M\u00fcller","year":"2005","journal-title":"Problems, methods, and challenges in comprehensive data cleansing"},{"key":"IJDWM.2019100103-12","doi-asserted-by":"publisher","DOI":"10.1145\/276404.276408"},{"issue":"3","key":"IJDWM.2019100103-13","article-title":"Requirement to cleanse DATA in ETL process and Why is data cleansing in Business Application?","volume":"2","author":"S.Patel","year":"2012","journal-title":"International Journal of Engineering Research and Applications"},{"issue":"17","key":"IJDWM.2019100103-14","article-title":"A Comparative Analysis of Data Cleaning Approaches to Dirty Data.","volume":"62","author":"S.Porwal","year":"2013","journal-title":"International Journal of Computers and Applications"},{"issue":"4","key":"IJDWM.2019100103-15","first-page":"3","article-title":"Data cleaning: Problems and current approaches.","volume":"23","author":"E.Rahm","year":"2000","journal-title":"IEEE Data Eng. Bull."},{"key":"IJDWM.2019100103-16","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1145\/583890.583893","article-title":"Conceptual modeling for ETL processes.","author":"P.Vassiliadis","year":"2002","journal-title":"Proceedings of the 5th ACM international workshop on Data Warehousing and OLAP"},{"key":"IJDWM.2019100103-17","author":"R.Verborgh","year":"2013","journal-title":"Using OpenRefine"}],"container-title":["International Journal of Data Warehousing and Mining"],"original-title":[],"language":"ng","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=237137","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,6]],"date-time":"2022-05-06T17:49:50Z","timestamp":1651859390000},"score":1,"resource":{"primary":{"URL":"https:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/IJDWM.2019100103"}},"subtitle":[""],"short-title":[],"issued":{"date-parts":[[2019,10,1]]},"references-count":18,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2019,10]]}},"URL":"https:\/\/doi.org\/10.4018\/ijdwm.2019100103","relation":{},"ISSN":["1548-3924","1548-3932"],"issn-type":[{"value":"1548-3924","type":"print"},{"value":"1548-3932","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,10,1]]}}}