{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,25]],"date-time":"2025-03-25T15:40:04Z","timestamp":1742917204762,"version":"3.40.3"},"reference-count":25,"publisher":"IGI Global","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,1,1]]},"abstract":"<p>To assist business intelligence companies dealing with data preparation problems, different approaches have been developed to handle the dirty data. However, these data cleansing approaches do not have real-time monitoring capabilities. Therefore, business intelligence companies and their clients are not able to predict the final outcome before running all business process. This yields an extra cost for the company if the data are highly corrupted. Therefore, to reduce cost for these types of businesses, the authors design a framework that monitors the quality attributes during the data cleansing process. Moreover, the system provides feedback to the user and allows the user to restructure the workflow based on quality attributes. The main concept of the framework is based on client-server architecture that uses multithreading to allow real-time monitoring of the process. A child thread is dedicated to run and another is dedicated to monitor the processes and give feedback to the user. The real-time monitoring system not only displays the cleansing process done on the data set, but also estimates the risk propagation probabilities in the data cleansing process. 
De-duplication elimination, address normalization, spelling correction for personal names, and non-ASCII character removal techniques are employed.<\/p>","DOI":"10.4018\/jbir.2012010106","type":"journal-article","created":{"date-parts":[[2012,4,5]],"date-time":"2012-04-05T13:10:49Z","timestamp":1333631449000},"page":"83-93","source":"Crossref","is-referenced-by-count":0,"title":["Real-Time Data Quality Monitoring System for Data Cleansing"],"prefix":"10.4018","volume":"3","author":[{"given":"Cihan","family":"Varol","sequence":"first","affiliation":[{"name":"Sam Houston State University, USA"}]},{"given":"Henry","family":"Neumann","sequence":"additional","affiliation":[{"name":"Sam Houston State University, USA"}]}],"member":"2432","reference":[{"unstructured":"Barateiro, J., & Galhardas, H. (2005). A survey of data quality tools. Databank-Spectrum, 14.","key":"jbir.2012010106-0"},{"doi-asserted-by":"crossref","unstructured":"Bethem, T., Evans, M., Vafaie, H., & Shaughnessy, M. (2002, October 29-31). Development of a real-time data quality monitoring system using embedded intelligence. In Proceedings of the Oceans MTS\/IEEE Marine Frontiers Conference, Biloxi, MS (pp. 1820-1824).","key":"jbir.2012010106-1","DOI":"10.1109\/OCEANS.2002.1191909"},{"unstructured":"Cardoso, J. (2004). Adaptive algorithm to predict the QoS of web processes and workflows. In Proceedings of the International Conference on Computational Intelligence (pp. 490-493).","key":"jbir.2012010106-2"},{"doi-asserted-by":"publisher","key":"jbir.2012010106-3","DOI":"10.1109\/69.824597"},{"unstructured":"Cohen, W. W., Ravikumar, P., & Stephen, E. F. (2003). A comparison of string distance metrics for name-matching tasks. In Proceedings of the IJCAI Workshop on Information Integration on the Web, Acapulco, Mexico (pp. 
73-78).","key":"jbir.2012010106-4"},{"year":"2009","author":"F.Dravis","journal-title":"Information quality: The quest for justification","key":"jbir.2012010106-5"},{"unstructured":"Eckerson, W. (2002, May 1). Data warehousing special report: Data quality and the bottom line. Retrieved from http:\/\/adtmag.com\/articles\/2002\/05\/01\/data-warehousing-special-report-data-quality-and-the-bottom-line.aspx","key":"jbir.2012010106-6"},{"year":"2008","author":"C.Fisher","journal-title":"Introduction to information quality","key":"jbir.2012010106-7"},{"unstructured":"Goasdou\u00e9, V., Nugier, A., Duquennoy, D., & Laboisse, B. (2007). An evaluation framework for data quality tools. In Proceedings of the International Conference for Information Quality.","key":"jbir.2012010106-8"},{"unstructured":"Higgins, B. R. (2002). US Patent No. 6438546: Method of standardizing address data. Washington, DC: United States Patent and Trademark Office.","key":"jbir.2012010106-9"},{"doi-asserted-by":"publisher","key":"jbir.2012010106-10","DOI":"10.1145\/505248.506007"},{"unstructured":"Loshin, D. (2007). Data profiling, data integration and data quality: The pillars of master data management. Retrieved from http:\/\/www.beyeresearch.com\/study\/4390","key":"jbir.2012010106-11"},{"issue":"1","key":"jbir.2012010106-12","first-page":"2","article-title":"Overview and framework for data and information quality research.","volume":"1","author":"S.Madnick","year":"2009","journal-title":"ACM Journal of Data Quality"},{"unstructured":"Pushkarev, V., Neumann, H., Varol, C., & Talburt, J. (2010, July 12-15). An overview of open source data quality tools. In Proceedings of the International Conference on Information and Knowledge Engineering, Las Vegas, NV.","key":"jbir.2012010106-13"},{"year":"1986","author":"W. P.Rogers","journal-title":"Report of the presidential commission on the space shuttle Challenger accident","key":"jbir.2012010106-14"},{"unstructured":"Shankaranarayanan, G., & Wang, R. Y. 
(2007, November). IPMAP: Current state and perspectives. In Proceedings of the International Conference on Information Quality, Boston, MA.","key":"jbir.2012010106-15"},{"doi-asserted-by":"publisher","key":"jbir.2012010106-16","DOI":"10.1145\/253769.253804"},{"unstructured":"Talburt, J., & Campbell, T. (2006). Designing a balanced data quality scorecard. In Proceedings of the Information Resources Management Association Conference (pp. 506-508).","key":"jbir.2012010106-17"},{"unstructured":"Trillium Software. (2011). Data monitoring. Retrieved from http:\/\/trilliumsoftware.com\/home\/products\/data-profiling\/data-monitoring.aspx","key":"jbir.2012010106-18"},{"unstructured":"Varol, C., & Bayrak, C. (2008, September 4-6). Measuring reliability component for quality of service (QoS) in business process automation. In Proceedings of the 14th International Conference on Distributed Multimedia Systems, Boston, MA.","key":"jbir.2012010106-19"},{"unstructured":"Varol, C., & Bayrak, C. (2009, February 13). Personal name-based pattern and phonetic matching techniques: A survey. In Proceedings of the Conference on Applied Research in Information Technology, Conway, AR.","key":"jbir.2012010106-20"},{"doi-asserted-by":"publisher","key":"jbir.2012010106-21","DOI":"10.1145\/269012.269022"},{"doi-asserted-by":"crossref","unstructured":"Wang, R. Y., Kon, H., & Madnick, S. (1993, April). Data quality requirements analysis and modeling. In Proceedings of the Ninth International Conference on Data Engineering, Vienna, Austria (pp. 670-677).","key":"jbir.2012010106-22","DOI":"10.1109\/ICDE.1993.344012"},{"unstructured":"Winkler, W. E. (2006). Overview of record linkage and current research directions (Tech. Rep. No. RR2006\/02). Washington, DC: US Bureau of the Census.","key":"jbir.2012010106-23"},{"unstructured":"Yancey, W. E. (2005). Evaluating string comparator performance for record linkage (Tech. Rep. No. RR2005\/05). 
Washington, DC: US Bureau of the Census.","key":"jbir.2012010106-24"}],"container-title":["International Journal of Business Intelligence Research"],"original-title":[],"language":"ng","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=62024","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,25]],"date-time":"2025-03-25T14:36:28Z","timestamp":1742913388000},"score":1,"resource":{"primary":{"URL":"https:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/jbir.2012010106"}},"subtitle":[""],"short-title":[],"issued":{"date-parts":[[2012,1,1]]},"references-count":25,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2012,1]]}},"URL":"https:\/\/doi.org\/10.4018\/jbir.2012010106","relation":{},"ISSN":["1947-3591","1947-3605"],"issn-type":[{"type":"print","value":"1947-3591"},{"type":"electronic","value":"1947-3605"}],"subject":[],"published":{"date-parts":[[2012,1,1]]}}}