{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T07:34:02Z","timestamp":1761896042707,"version":"3.41.0"},"reference-count":61,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2015,3,2]],"date-time":"2015-03-02T00:00:00Z","timestamp":1425254400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. Data and Information Quality"],"published-print":{"date-parts":[[2015,3,3]]},"abstract":"<jats:p>The data extracted from electronic archives is a valuable asset; however, the issue of the (poor) data quality should be addressed before performing data analysis and decision-making activities. Poor data quality is frequently cleansed using business rules derived from domain knowledge. Unfortunately, the process of designing and implementing cleansing activities based on business rules requires a relevant effort. In this article, we illustrate a model-based approach useful to perform inconsistency identification and corrective interventions, thus simplifying the process of developing cleansing activities. The article shows how the cleansing activities required to perform a sensitivity analysis can be easily developed using the proposed model-based approach. The sensitivity analysis provides insights on how the cleansing activities can affect the results of indicators computation. The approach has been successfully used on a database describing the working histories of an Italian area population. 
A model formalizing how data should evolve over time (i.e., a data consistency model) in such domain was created (by means of formal methods) and used to perform the cleansing and sensitivity analysis activities.<\/jats:p>","DOI":"10.1145\/2641575","type":"journal-article","created":{"date-parts":[[2015,3,3]],"date-time":"2015-03-03T14:08:19Z","timestamp":1425391699000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["A Model-Based Approach for Developing Data Cleansing Solutions"],"prefix":"10.1145","volume":"5","author":[{"given":"Mario","family":"Mezzanzanica","sequence":"first","affiliation":[{"name":"Department of Statistics and Quantitative Methods, C.R.I.S.P. Research Centre, University of Milano-Bicocca, Italy"}]},{"given":"Roberto","family":"Boselli","sequence":"additional","affiliation":[{"name":"Department of Statistics and Quantitative Methods, C.R.I.S.P. Research Centre, University of Milano-Bicocca, Italy"}]},{"given":"Mirko","family":"Cesarini","sequence":"additional","affiliation":[{"name":"Department of Statistics and Quantitative Methods, C.R.I.S.P. Research Centre, University of Milano-Bicocca, Italy"}]},{"given":"Fabio","family":"Mercorio","sequence":"additional","affiliation":[{"name":"Department of Statistics and Quantitative Methods, C.R.I.S.P. 
Research Centre, University of Milano-Bicocca, Italy"}]}],"member":"320","published-online":{"date-parts":[[2015,3,2]]},"reference":[{"volume-title":"Resende","year":"2002","author":"Abello James","key":"e_1_2_1_1_1"},{"volume-title":"Principles of Model Checking (Representation and Mind Series)","author":"Baier Christel","key":"e_1_2_1_2_1"},{"volume-title":"Latent Markov Models for Longitudinal Data","author":"Bartolucci Francesco","key":"e_1_2_1_3_1","doi-asserted-by":"crossref","DOI":"10.1201\/b13246"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1541880.1541883"},{"volume-title":"Data Quality: Concepts, Methodologies and Techniques","year":"2006","author":"Batini Carlo","key":"e_1_2_1_5_1"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1147376.1147391"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0065-2458(03)58003-2"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2007.367920"},{"volume-title":"Human-Computer Interaction and Knowledge Discovery in Complex, Unstructured, Big Data, Andreas Holzinger and Gabriella Pasi (Eds.)","series-title":"Lecture Notes in Computer Science","author":"Boselli Roberto","key":"e_1_2_1_9_1"},{"volume-title":"Proceedings of the 24th International Conference on Automated Planning and Scheduling (ICAPS\u201914)","year":"2014","author":"Boselli Roberto","key":"e_1_2_1_10_1"},{"volume-title":"Interactive Knowledge Discovery and Data Mining in Biomedical Informatics, Andreas Holzinger and Igor Jurisica (Eds.)","series-title":"Lecture Notes in Computer Science","author":"Boselli Roberto","key":"e_1_2_1_11_1"},{"volume-title":"Towards data cleansing via planning. 
Intelligenza Artificiale 8, 1 (1","year":"2014","author":"Boselli Roberto","key":"e_1_2_1_12_1"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.4018\/jcit.2007100105"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.2197\/ipsjdc.2.826"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/210197.210200"},{"volume-title":"On the computational complexity of minimal-change integrity maintenance in relational databases","author":"Chomicki Jan","key":"e_1_2_1_16_1","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-540-30597-2_5"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.14778\/2536258.2536262"},{"volume-title":"Proceedings of the 15th National Conference on Artificial Intelligence (AAAI\u201998) and the 10th Conference on Innovative Applications of Artificial Intelligence (IAAI\u201998)","year":"1998","author":"Cimatti Alessandro","key":"e_1_2_1_18_1"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/775832.775928"},{"volume-title":"Peled","year":"1999","author":"Clarke Edmund M.","key":"e_1_2_1_20_1"},{"key":"e_1_2_1_21_1","unstructured":"CMurphi Web Page. 2011. Homepage. 
Retrieved from http:\/\/www.dsi.uniroma1.it\/tronci\/cached.murphi.html."},{"volume-title":"Information Theory: Coding Theorems for Discrete Memoryless Systems.","year":"1981","author":"Csisz\u00e1r Imre","key":"e_1_2_1_22_1"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2465327"},{"volume-title":"Proceedings of ICAPS","year":"2009","author":"Penna Giuseppe Della","key":"e_1_2_1_24_1"},{"volume-title":"Proceedings of the 8th International Conference on Informatics in Control, Automation and Robotics (ICINCO\u201911)","year":"2011","author":"Penna Giuseppe Della","key":"e_1_2_1_25_1"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10489-011-0306-z"},{"volume-title":"Informatics in Control Automation and Robotics, JuanAndrade Cetto, Jean-Louis Ferrier, Jos Miguel Costa dias Pereira, and Joaquim Filipe (Eds.)","series-title":"Lecture Notes in Electrical Engineering","author":"Penna Giuseppe Della","key":"e_1_2_1_27_1"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1080\/07421222.2000.11518265"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2007.9"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376916.1376940"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920867"},{"volume-title":"Introduction to Information Quality","author":"Fisher Craig","key":"e_1_2_1_32_1"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0378-7206(01)00083-0"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.14778\/2536360.2536363"},{"volume-title":"Recent Advances in AI Planning, Susanne Biundo and Maria Fox (Eds.)","series-title":"Lecture Notes in Computer Science","author":"Giunchiglia Fausto","key":"e_1_2_1_35_1"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2004.04.016"},{"volume-title":"Gibbons","year":"2006","author":"Hedeker 
Donald","key":"e_1_2_1_37_1"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/s100090050008"},{"volume-title":"On knowledge discovery and interactive intelligent visualization of biomedical data: Challenges in human-computer interaction and biomedical informatics","author":"Holzinger Andreas","key":"e_1_2_1_39_1"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-14-191"},{"volume-title":"Finite Markov Processes and their Applications","author":"Iosifescu Marius","key":"e_1_2_1_41_1"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/1514894.1514901"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.533956"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11135-011-9578-y"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2009.11.018"},{"volume-title":"Information Systems for Regional Labour Market Monitoring\u2014State of the Art and Prospectives","author":"Martini Mattia","key":"e_1_2_1_46_1"},{"volume-title":"Proceedings of the International Conference on Data Technologies and Applications (DATA\u201912)","year":"2012","author":"Mezzanzanica Mario","key":"e_1_2_1_47_1"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00625968"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2013.6544847"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10844-009-0106-7"},{"key":"e_1_2_1_51_1","first-page":"3","article-title":"Data cleaning: Problems and current approaches","volume":"23","author":"Rahm Erhard","year":"2000","journal-title":"IEEE Data Eng. 
Bull."},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.5555\/646045.676458"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/269012.269025"},{"volume-title":"Proceedings of IJCAI","year":"1987","author":"Schoppers Marcel Joachim","key":"e_1_2_1_54_1"},{"volume-title":"Willett","year":"2003","author":"Singer Judith D.","key":"e_1_2_1_55_1"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/253769.253804"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1504\/IJIQ.2007.016713"},{"key":"e_1_2_1_58_1","unstructured":"The Italian Ministry of Labour and Welfare. 2012. Annual Report about the CO System. Retrieved from http:\/\/www.cliclavoro.gov.it\/Barometro-Del-Lavoro\/Documents\/Rapporto_CO\/Executive_summary.pdf. (2012)."},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-25364-5_11"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/2287714.2287715"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2463706"}],"container-title":["Journal of Data and Information 
Quality"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2641575","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2641575","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T06:56:18Z","timestamp":1750229778000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2641575"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,3,2]]},"references-count":61,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2015,3,3]]}},"alternative-id":["10.1145\/2641575"],"URL":"https:\/\/doi.org\/10.1145\/2641575","relation":{},"ISSN":["1936-1955","1936-1963"],"issn-type":[{"type":"print","value":"1936-1955"},{"type":"electronic","value":"1936-1963"}],"subject":[],"published":{"date-parts":[[2015,3,2]]},"assertion":[{"value":"2013-09-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2014-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-03-02","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}