{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,5]],"date-time":"2025-10-05T16:56:13Z","timestamp":1759683373510,"version":"3.41.0"},"reference-count":10,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2019,1,4]],"date-time":"2019-01-04T00:00:00Z","timestamp":1546560000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. Data and Information Quality"],"published-print":{"date-parts":[[2019,3,31]]},"abstract":"<jats:p>High-quality data is critical for effective data science. As the use of data science has grown, so too have concerns that individuals\u2019 rights to privacy will be violated. This has led to the development of data protection regulations around the globe and the use of sophisticated anonymization techniques to protect privacy. Such measures make it more challenging for the data scientist to understand the data, exacerbating issues of data quality. 
Responsible data science aims to develop useful insights from the data while fully embracing these considerations.<\/jats:p>\n          <jats:p>\n            We pose the high-level problem in this article,\n            <jats:italic>\u201cHow can a data scientist develop the needed trust that private data has high quality?\u201d<\/jats:italic>\n            We then identify a series of challenges for various data-centric communities and outline research questions for data quality and privacy researchers, which would need to be addressed to effectively answer the problem posed in this article.\n          <\/jats:p>","DOI":"10.1145\/3287168","type":"journal-article","created":{"date-parts":[[2019,1,7]],"date-time":"2019-01-07T13:42:28Z","timestamp":1546868548000},"page":"1-9","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Ensuring High-Quality Private Data for Responsible Data Science"],"prefix":"10.1145","volume":"11","author":[{"given":"Divesh","family":"Srivastava","sequence":"first","affiliation":[{"name":"AT&T Labs Research, Bedminster, NJ, USA"}]},{"given":"Monica","family":"Scannapieco","sequence":"additional","affiliation":[{"name":"Italian National Institute of Statistics, Roma, Italy"}]},{"given":"Thomas C.","family":"Redman","sequence":"additional","affiliation":[{"name":"Data Quality Solutions, Rumson, NJ, USA"}]}],"member":"320","published-online":{"date-parts":[[2019,1,4]]},"reference":[{"doi-asserted-by":"crossref","unstructured":"C. Batini and M. Scannapieco. 2016. Data and Information Quality\u2014Dimensions, Principles and Techniques. Springer International Publishing.","key":"e_1_2_1_1_1","DOI":"10.1007\/978-3-319-24106-7"},{"volume-title":"Improving Data Warehouse and Business Information Quality","author":"English","unstructured":"L. 
English. 1999. Improving Data Warehouse and Business Information Quality. Wiley.","key":"e_1_2_1_2_1"},{"unstructured":"S. Lohr. 2018. Facial Recognition is Accurate\u2014If You\u2019re a White Guy. Retrieved from https:\/\/www.nytimes.com\/2018\/02\/09\/technology\/facial-recognition-race-artificial-intelligence.html.","key":"e_1_2_1_3_1"},{"unstructured":"D. McGilvray. 2008. Executing Data Quality Projects. Morgan Kaufmann.","key":"e_1_2_1_4_1"},{"unstructured":"T. Nagle, T. Redman, and D. Sammon. 2017. Only 3% of Companies\u2019 Data Meets Basic Quality Standards. Retrieved from https:\/\/hbr.org\/2017\/09\/only-3-of-companies-data-meets-basic-quality-standards.","key":"e_1_2_1_5_1"},{"unstructured":"European Statistical System Project. 2018. ESSnet Big Data Pilots-I. Retrieved from https:\/\/webgate.ec.europa.eu\/fpfis\/mwikis\/essnetbigdata\/index.php\/Main_Page.","key":"e_1_2_1_6_1"},{"unstructured":"T. Redman. 2016. Getting in Front on Data: Who Does What. Technics.","key":"e_1_2_1_7_1"},{"unstructured":"T. Redman. 2018. If Your Data Is Bad, Your Machine Learning Tools Are Useless. 
Retrieved from https:\/\/hbr.org\/2018\/04\/if-your-data-is-bad-your-machine-learning-tools-are-useless.","key":"e_1_2_1_8_1"},{"unstructured":"G. Stateva, O. Bosch, D. Windmeijer, J. Maslankowski, G. Barcaroli, M. Scannapieco, D. Summa, M. Greenaway, I. Jansson, and D. Wu. 2018. Web Scraping Enterprise Characteristics-Final Report. Retrieved from https:\/\/webgate.ec.europa.eu\/fpfis\/mwikis\/essnetbigdata\/images\/e\/ee\/Wp2_Del2_4.pdf.","key":"e_1_2_1_9_1"},{"unstructured":"E. Wilder-James. 2016. Breaking Down Data Silos. 
Retrieved from https:\/\/hbr.org\/2016\/12\/breaking-down-data-silos.","key":"e_1_2_1_10_1"}],"container-title":["Journal of Data and Information Quality"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3287168","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3287168","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:01:53Z","timestamp":1750208513000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3287168"}},"subtitle":["Vision and Challenges"],"short-title":[],"issued":{"date-parts":[[2019,1,4]]},"references-count":10,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2019,3,31]]}},"alternative-id":["10.1145\/3287168"],"URL":"https:\/\/doi.org\/10.1145\/3287168","relation":{},"ISSN":["1936-1955","1936-1963"],"issn-type":[{"type":"print","value":"1936-1955"},{"type":"electronic","value":"1936-1963"}],"subject":[],"published":{"date-parts":[[2019,1,4]]},"assertion":[{"value":"2018-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-01-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}