{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,19]],"date-time":"2026-05-19T07:13:13Z","timestamp":1779174793808,"version":"3.51.4"},"reference-count":34,"publisher":"Association for Computing Machinery (ACM)","issue":"10","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,6]]},"abstract":"<jats:p>Denial Constraints (DCs) are a flexible formalism to express many types of data rules, making them a widely adopted tool for many applications. This flexibility led to the development of numerous algorithms to automatically discover DCs directly from data. However, few studies have been conducted on the quality of the discovered DCs. We experimentally quantify the lack of quality in the results obtained by state-of-the-art algorithms, showing how the proportion of discovered DCs that are false is rarely below 95%. We hypothesize that the common source of these erroneous DCs stems from the adoption of the current DC validity definition. We use a statistical approach to explain the mechanism leading to these results, and propose a redefinition of DC validity properties to avoid the acceptance of false DCs. We validate this redefinition experimentally, showing that it exclusively accepts true constraints of the data, and is reliable enough to discover DCs missed by domain experts. Additionally, we provide curated sets of golden DCs for each dataset used in our study, those generated by domain experts and those discovered using our approach.<\/jats:p>","DOI":"10.14778\/3748191.3748209","type":"journal-article","created":{"date-parts":[[2025,9,4]],"date-time":"2025-09-04T13:50:16Z","timestamp":1756993816000},"page":"3477-3489","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["How and Why False Denial Constraints are Discovered"],"prefix":"10.14778","volume":"18","author":[{"given":"Albert","family":"Martin","sequence":"first","affiliation":[{"name":"Universitat Polit\u00e8cnica de Catalunya, Barcelona, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Eduardo C.","family":"de Almeida","sequence":"additional","affiliation":[{"name":"Federal University of Paran\u00e1, Curitiba, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Oscar","family":"Romero","sequence":"additional","affiliation":[{"name":"Universitat Polit\u00e8cnica de Catalunya, Barcelona, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Anna","family":"Queralt","sequence":"additional","affiliation":[{"name":"Universitat Polit\u00e8cnica de Catalunya, Barcelona, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,9,4]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"crossref","unstructured":"Naser Ayat Hamideh Afsarmanesh Reza Akbarinia and Patrick Valduriez. 2012. Pay-As-You-Go Data Integration Using Functional Dependencies. In Multidisciplinary Research and Practice for Information Systems. 375\u2013389.","DOI":"10.1007\/978-3-642-32498-7_28"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.14778\/3204028.3204032"},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the 33rd ACM International Conference on Information and Knowledge Management. 120\u2013129","author":"Bian Lingfeng","year":"2024","unstructured":"Lingfeng Bian, Weidong Yang, Jingyi Xu, and Zijing Tan. 2024. Discovering Denial Constraints Based on Deep Reinforcement Learning. In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management. 120\u2013129."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.14778\/3407790.3407824"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.14778\/3157794.3157800"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.14778\/1453856.1453980"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1031171.1031254"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.14778\/2536258.2536262"},{"key":"e_1_2_1_9_1","volume-title":"Principles of data integration","author":"Doan AnHai","unstructured":"AnHai Doan, Alon Halevy, and Zachary Ives. 2012. Principles of data integration. Elsevier."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2010.154"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3389775"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.14778\/2732240.2732248"},{"key":"e_1_2_1_13_1","volume-title":"TANE: An efficient algorithm for discovering functional and approximate dependencies. The computer journal 42, 2","author":"Huhtala Yka","year":"1999","unstructured":"Yka Huhtala, Juha K\u00e4rkk\u00e4inen, Pasi Porkka, and Hannu Toivonen. 1999. TANE: An efficient algorithm for discovering functional and approximate dependencies. The computer journal 42, 2 (1999), 100\u2013111."},{"key":"e_1_2_1_14_1","volume-title":"Continuous univariate distributions","author":"Johnson Norman L","unstructured":"Norman L Johnson, Samuel Kotz, and Narayanaswamy Balakrishnan. 1995. Continuous univariate distributions, volume 2. Vol. 2. John wiley & sons."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.14778\/3192965.3192968"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1007\/S00778-015-0412-3"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.14778\/3401960.3401966"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2915203"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.5441\/002\/EDBT.2017.31"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2023.3336630"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE60146.2024.00270"},{"key":"e_1_2_1_22_1","volume-title":"Database and Expert Systems Applications: 29th International Conference, DEXA 2018, Regensburg, Germany, September 3\u20136, 2018, Proceedings, Part I 29","author":"Pena Eduardo HM","year":"2018","unstructured":"Eduardo HM Pena and Eduardo Cunha de Almeida. 2018. BFASTDC: A bitwise algorithm for mining denial constraints. In Database and Expert Systems Applications: 29th International Conference, DEXA 2018, Regensburg, Germany, September 3\u20136, 2018, Proceedings, Part I 29. Springer, 53\u201368."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.14778\/3368289.3368293"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.14778\/3574245.3574254"},{"key":"e_1_2_1_25_1","first-page":"3","article-title":"Mind Your Dependencies for Semantic Query Optimization","volume":"9","author":"Pena Eduardo H. M.","year":"2018","unstructured":"Eduardo H. M. Pena, Erik Falk, Jorge Augusto Meira, and Eduardo Cunha de Almeida. 2018. Mind Your Dependencies for Semantic Query Optimization. J. Inf. Data Manag. 9, 1 (2018), 3\u201319. https:\/\/sol.sbc.org.br\/journals\/index.php\/jidm\/article\/view\/1633","journal-title":"J. Inf. Data Manag."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-023-00788-y"},{"key":"e_1_2_1_27_1","volume-title":"Holoclean: Holistic data repairs with probabilistic inference. arXiv preprint arXiv:1702.00820","author":"Rekatsinas Theodoros","year":"2017","unstructured":"Theodoros Rekatsinas, Xu Chu, Ihab F Ilyas, and Christopher R\u00e9. 2017. Holoclean: Holistic data repairs with probabilistic inference. arXiv preprint arXiv:1702.00820 (2017)."},{"key":"e_1_2_1_28_1","unstructured":"Avi Silberschatz Henry F. Korth and S. Sudarshan. 2020. Database System Concepts Seventh Edition. McGraw-Hill Book Company. https:\/\/www.db-book.com\/"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.14778\/2556549.2556568"},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the 38th Brazilian Symposium on Databases, SBBD 2023, Belo Horizonte, MG, Brazil, September 25\u201329","author":"Tamalu Nicolas","year":"2023","unstructured":"Nicolas Tamalu, Leandro Augusto Ensina, Eduardo Cunha de Almeida, Eduardo Henrique Monteiro Pena, and Luiz Eduardo Soares de Oliveira. 2023. Fault Detection in Transmission Lines: a Denial Constraint Approach. In Proceedings of the 38th Brazilian Symposium on Databases, SBBD 2023, Belo Horizonte, MG, Brazil, September 25\u201329, 2023. SBC, 231\u2013243. https:\/\/sol.sbc.org.br\/index.php\/sbbd\/article\/view\/25530"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2019.00137"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.IS.2023.102224"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.14778\/3565816.3565828"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.3390\/math12193102"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3748191.3748209","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,4]],"date-time":"2025-09-04T13:53:34Z","timestamp":1756994014000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3748191.3748209"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6]]},"references-count":34,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2025,6]]}},"alternative-id":["10.14778\/3748191.3748209"],"URL":"https:\/\/doi.org\/10.14778\/3748191.3748209","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2025,6]]},"assertion":[{"value":"2025-09-04","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}