{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,13]],"date-time":"2026-02-13T15:17:24Z","timestamp":1770995844426,"version":"3.50.1"},"reference-count":43,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2009,9,1]],"date-time":"2009-09-01T00:00:00Z","timestamp":1251763200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. Data and Information Quality"],"published-print":{"date-parts":[[2009,9]]},"abstract":"<jats:p>The range of information now available in queryable repositories opens up a host of possibilities for new and valuable forms of data analysis. Database query languages such as SQL and XQuery offer a concise and high-level means by which such analyses can be implemented, facilitating the extraction of relevant data subsets into either generic or bespoke data analysis environments. Unfortunately, the quality of data in these repositories is often highly variable. The data is still useful, but only if the consumer is aware of the data quality problems and can work around them. Standard query languages offer little support for this aspect of data management. In principle, however, it should be possible to embed constraints describing the consumer\u2019s data quality requirements into the query directly, so that the query evaluator can take over responsibility for enforcing them during query processing.<\/jats:p>\n          <jats:p>\n            Most previous attempts to incorporate information quality constraints into database queries have been based around a small number of highly generic quality measures, which are defined and computed by the information provider. This is a useful approach in some application areas but, in practice, quality criteria are more commonly determined by the user of the information not by the provider. In this article, we explore an approach to incorporating quality constraints into database queries where the definition of quality is set by the user and not the provider of the information. Our approach is based around the concept of a\n            <jats:italic>quality view<\/jats:italic>\n            , a configurable quality assessment component into which domain-specific notions of quality can be embedded. We examine how quality views can be incorporated into XQuery, and draw from this the language features that are required in general to embed quality views into any query language. We also propose some syntactic sugar on top of XQuery to simplify the process of querying with quality constraints.\n          <\/jats:p>","DOI":"10.1145\/1577840.1577846","type":"journal-article","created":{"date-parts":[[2009,11,30]],"date-time":"2009-11-30T14:56:36Z","timestamp":1259592996000},"page":"1-31","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Incorporating Domain-Specific Information Quality Constraints into Database Queries"],"prefix":"10.1145","volume":"1","author":[{"given":"Suzanne M.","family":"Embury","sequence":"first","affiliation":[{"name":"University of Manchester"}]},{"given":"Paolo","family":"Missier","sequence":"additional","affiliation":[{"name":"University of Manchester"}]},{"given":"Sandra","family":"Sampaio","sequence":"additional","affiliation":[{"name":"University of Manchester"}]},{"given":"R. Mark","family":"Greenwood","sequence":"additional","affiliation":[{"name":"University of Manchester"}]},{"given":"Alun D.","family":"Preece","sequence":"additional","affiliation":[{"name":"Cardiff University"}]}],"member":"320","published-online":{"date-parts":[[2009,9]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Data Quality: Concepts, Methodologies, and Techniques","author":"Batini C.","year":"2006","unstructured":"Batini , C. and Scannapieco , M . 2006 . Data Quality: Concepts, Methodologies, and Techniques . Springer , Berlin . Batini, C. and Scannapieco, M. 2006. Data Quality: Concepts, Methodologies, and Techniques. Springer, Berlin."},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 9th International Conference on Information Quality (IQ\u201904)","author":"Berti-Equille L.","year":"2004","unstructured":"Berti-Equille , L. 2004 . Quality-Adaptive query processing over distributed sources . In Proceedings of the 9th International Conference on Information Quality (IQ\u201904) . 285--296. Berti-Equille, L. 2004. Quality-Adaptive query processing over distributed sources. In Proceedings of the 9th International Conference on Information Quality (IQ\u201904). 285--296."},{"key":"e_1_2_1_3_1","unstructured":"Berti-\u00c9quille L. 2007. Quality awareness for managing and mining data. Habilitation L\u2019Universit\u00e9 de Rennes.  Berti-\u00c9quille L. 2007. Quality awareness for managing and mining data. Habilitation L\u2019Universit\u00e9 de Rennes."},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the 8th Working Conference on Reverse Engineering (WCRE\u201901)","author":"Blaha M.","year":"2001","unstructured":"Blaha , M. 2001 . A retrospective on industrial database reverse engineering projects . In Proceedings of the 8th Working Conference on Reverse Engineering (WCRE\u201901) . IEEE Computer Society Press, 136--146. Blaha, M. 2001. A retrospective on industrial database reverse engineering projects. In Proceedings of the 8th Working Conference on Reverse Engineering (WCRE\u201901). IEEE Computer Society Press, 136--146."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1012453.1012464"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gni167"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1021\/pr049946o"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/HICSS.2006.443"},{"key":"e_1_2_1_9_1","doi-asserted-by":"crossref","unstructured":"Dasu T. and Johnson T. 2003. Exploratory Data Mining and Data Cleaning. John Wiley New York.   Dasu T. and Johnson T. 2003. Exploratory Data Mining and Data Cleaning . John Wiley New York.","DOI":"10.1002\/0471448354"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2007.9"},{"key":"e_1_2_1_11_1","unstructured":"Embury S. Sampaio S. Missier P. and Greenwood R. 2007. The Syntax and Semantics of QXQuery. Tech. rep. School of Computer Science University of Manchester. www.qurator.org.  Embury S. Sampaio S. Missier P. and Greenwood R. 2007. The Syntax and Semantics of QXQuery. Tech. rep. School of Computer Science University of Manchester. www.qurator.org."},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the International Conference on Information Quality (IQ\u201907)","author":"F\u00fchring P.","unstructured":"F\u00fchring , P. and Naumann , F . 2007. Emergent data quality annotation and visualization . In Proceedings of the International Conference on Information Quality (IQ\u201907) . F\u00fchring, P. and Naumann, F. 2007. Emergent data quality annotation and visualization. In Proceedings of the International Conference on Information Quality (IQ\u201907)."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1002\/mrm.20169"},{"key":"e_1_2_1_14_1","volume-title":"Taverna: A tool for building and running workflows of services. Nucleic Acids Res. 34, (Web Server issue) W729--W732.","author":"Hull D.","year":"2006","unstructured":"Hull , D. , Wolstencroft , K. , Stevens , R. , Goble , C. , Pocock , M. , Li , P. , and Oinn , T . 2006 . Taverna: A tool for building and running workflows of services. Nucleic Acids Res. 34, (Web Server issue) W729--W732. Hull, D., Wolstencroft, K., Stevens, R., Goble, C., Pocock, M., Li, P., and Oinn, T. 2006. Taverna: A tool for building and running workflows of services. Nucleic Acids Res. 34, (Web Server issue) W729--W732."},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the 29th International Conference on Very Large Databases (VLDB\u201903)","author":"Korn F.","unstructured":"Korn , F. , Muthukrishnan , S. , and Zhu , Y . 2003. Checks and balances: Monitoring data quality problems in network traffic databases . In Proceedings of the 29th International Conference on Very Large Databases (VLDB\u201903) . ACM Press, 536--547. Korn, F., Muthukrishnan, S., and Zhu, Y. 2003. Checks and balances: Monitoring data quality problems in network traffic databases. In Proceedings of the 29th International Conference on Very Large Databases (VLDB\u201903). ACM Press, 536--547."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0378-7206(02)00043-5"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1077501.1077508"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2164-5-1"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-003-0101-5"},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the International Workshop on Web Databases (WebDB\u201900)","author":"Mihaila G.","unstructured":"Mihaila , G. , Raschid , L. , and Vidal , M . -E. 2000. Using quality of data metadata for source selection and ranking . In Proceedings of the International Workshop on Web Databases (WebDB\u201900) . 93--98. Mihaila, G., Raschid, L., and Vidal, M.-E. 2000. Using quality of data metadata for source selection and ranking. In Proceedings of the International Workshop on Web Databases (WebDB\u201900). 93--98."},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the Workshop on Data and Information Quality (CAiSE\u201904)","volume":"2","author":"Milano D.","unstructured":"Milano , D. , Scannapieco , M. , and Catarci , T . 2004. Quality-Driven query processing of XQuery queries . In Proceedings of the Workshop on Data and Information Quality (CAiSE\u201904) . J. Grundspenkis and M. Kirikova, Eds. Vol. 2 . 78--89. Milano, D., Scannapieco, M., and Catarci, T. 2004. Quality-Driven query processing of XQuery queries. In Proceedings of the Workshop on Data and Information Quality (CAiSE\u201904). J. Grundspenkis and M. Kirikova, Eds. Vol. 2. 78--89."},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB\u201906)","author":"Missier P.","unstructured":"Missier , P. , Embury , S. , Greenwood , R. , Preece , A. , and Jin , B . 2006. Quality views: Capturing and exploiting the user perspective on data quality . In Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB\u201906) , ACM Press, 977--988. Missier, P., Embury, S., Greenwood, R., Preece, A., and Jin, B. 2006. Quality views: Capturing and exploiting the user perspective on data quality. In Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB\u201906), ACM Press, 977--988."},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the 4th International Workshop on Data Integration in the Life Sciences (DILS\u201907)","author":"Missier P.","unstructured":"Missier , P. , Embury , S. , Hedeler , C. , Greenwood , R. , Pennock , J. , and Brass , A . 2007. Accelerating disease gene identification through integrated SNP data analysis . In Proceedings of the 4th International Workshop on Data Integration in the Life Sciences (DILS\u201907) , S. Cohen Boulakia and V. Tannen, Eds. Springer, 215--230. Missier, P., Embury, S., Hedeler, C., Greenwood, R., Pennock, J., and Brass, A. 2007. Accelerating disease gene identification through integrated SNP data analysis. In Proceedings of the 4th International Workshop on Data Integration in the Life Sciences (DILS\u201907), S. Cohen Boulakia and V. Tannen, Eds. Springer, 215--230."},{"key":"e_1_2_1_25_1","series-title":"Lecture Notes in Computer Science","volume-title":"Quality-Driven Query Answering for Integrated Information Systems","author":"Naumann F.","unstructured":"Naumann , F. 2002. Quality-Driven Query Answering for Integrated Information Systems . Lecture Notes in Computer Science , vol. 2261 . Springer , Berlin . Naumann, F. 2002. Quality-Driven Query Answering for Integrated Information Systems. Lecture Notes in Computer Science, vol. 2261. Springer, Berlin."},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the 25th International Conference on Very Large Databases (VLDB\u201999)","author":"Naumann F.","unstructured":"Naumann , F. , Leser , U. , and Freytag , J . 1999. Quality-Driven integration of heterogeneous information systems . In Proceedings of the 25th International Conference on Very Large Databases (VLDB\u201999) . Morgan Kaufmann, 447--458. Naumann, F., Leser, U., and Freytag, J. 1999. Quality-Driven integration of heterogeneous information systems. In Proceedings of the 25th International Conference on Very Large Databases (VLDB\u201999). Morgan Kaufmann, 447--458."},{"key":"e_1_2_1_27_1","unstructured":"Peralta V. 2006. Data freshness and data accuracy: A state of the art. Tech. rep. Universidad de la Republica Uruguay.  Peralta V. 2006. Data freshness and data accuracy: A state of the art. Tech. rep. Universidad de la Republica Uruguay."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/505248.506010"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/CBMS.2006.160"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/11762256_35"},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the 27th International Conference on Very Large Databases (VLDB\u201901)","author":"Raman V.","unstructured":"Raman , V. and Hellerstein , J . 2001. Potter\u2019s wheel: An interactive data cleaning system . In Proceedings of the 27th International Conference on Very Large Databases (VLDB\u201901) , P. Apers, et al., Eds. Morgan Kaufmann, 381--390. Raman, V. and Hellerstein, J. 2001. Potter\u2019s wheel: An interactive data cleaning system. In Proceedings of the 27th International Conference on Very Large Databases (VLDB\u201901), P. Apers, et al., Eds. Morgan Kaufmann, 381--390."},{"key":"e_1_2_1_32_1","volume-title":"Data Quality for the Information Age","author":"Redman T.","unstructured":"Redman , T. 1996. Data Quality for the Information Age . Artech House , Boston . Redman, T. 1996. Data Quality for the Information Age. Artech House, Boston."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/269012.269025"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1007\/11581116_6"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2003.12.004"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDEW.2006.150"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1074\/mcp.M500426-MCP200"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1038\/nbt0303-247"},{"key":"e_1_2_1_39_1","doi-asserted-by":"crossref","unstructured":"Topaloglou T. 2006. Informatics solutions for high-throughput proteomics. Drug Discov. Today 11 11\/12 509--516.  Topaloglou T. 2006. Informatics solutions for high-throughput proteomics. Drug Discov. Today 11 11\/12 509--516.","DOI":"10.1016\/j.drudis.2006.04.011"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1080\/07421222.1996.11518099"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/269012.269022"},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the 19th International Conference on Advanced Information Systems Engineering (CAiSE\u201907)","volume":"4495","author":"Weis M.","unstructured":"Weis , M. and Manolescu , I . 2007. Declarative XML data cleaning with XClean . In Proceedings of the 19th International Conference on Advanced Information Systems Engineering (CAiSE\u201907) , J. Krogstie et al., Eds. Lecture Notes in Computer Science , vol. 4495 . Springer, 96--110. Weis, M. and Manolescu, I. 2007. Declarative XML data cleaning with XClean. In Proceedings of the 19th International Conference on Advanced Information Systems Engineering (CAiSE\u201907), J. Krogstie et al., Eds. Lecture Notes in Computer Science, vol. 4495. Springer, 96--110."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2003.12.003"},{"key":"e_1_2_1_44_1","volume-title":"Overview of record linkage and current research directions. Statistical Res. rep. series rr2006\/02","author":"Winkler W.","unstructured":"Winkler , W. 2006. Overview of record linkage and current research directions. Statistical Res. rep. series rr2006\/02 , US Bureau of the Census, Washington D.C. Winkler, W. 2006. Overview of record linkage and current research directions. Statistical Res. rep. series rr2006\/02, US Bureau of the Census, Washington D.C."}],"container-title":["Journal of Data and Information Quality"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1577840.1577846","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1577840.1577846","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T13:30:16Z","timestamp":1750253416000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1577840.1577846"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,9]]},"references-count":43,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2009,9]]}},"alternative-id":["10.1145\/1577840.1577846"],"URL":"https:\/\/doi.org\/10.1145\/1577840.1577846","relation":{},"ISSN":["1936-1955","1936-1963"],"issn-type":[{"value":"1936-1955","type":"print"},{"value":"1936-1963","type":"electronic"}],"subject":[],"published":{"date-parts":[[2009,9]]},"assertion":[{"value":"2007-09-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2009-02-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2009-09-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}