{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,19]],"date-time":"2026-05-19T07:13:25Z","timestamp":1779174805423,"version":"3.51.4"},"reference-count":22,"publisher":"Association for Computing Machinery (ACM)","issue":"10","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2015,6]]},"abstract":"<jats:p>Functional dependencies are important metadata used for schema normalization, data cleansing and many other tasks. The efficient discovery of functional dependencies in tables is a well-known challenge in database research and has seen several approaches. Because no comprehensive comparison between these algorithms exist at the time, it is hard to choose the best algorithm for a given dataset. In this experimental paper, we describe, evaluate, and compare the seven most cited and most important algorithms, all solving this same problem.<\/jats:p>\n          <jats:p>First, we classify the algorithms into three different categories, explaining their commonalities. We then describe all algorithms with their main ideas. The descriptions provide additional details where the original papers were ambiguous or incomplete. Our evaluation of careful re-implementations of all algorithms spans a broad test space including synthetic and real-world data. We show that all functional dependency algorithms optimize for certain data characteristics and provide hints on when to choose which algorithm. In summary, however, all current approaches scale surprisingly poorly, showing potential for future research.<\/jats:p>","DOI":"10.14778\/2794367.2794377","type":"journal-article","created":{"date-parts":[[2015,7,30]],"date-time":"2015-07-30T14:37:34Z","timestamp":1438267054000},"page":"1082-1093","source":"Crossref","is-referenced-by-count":157,"title":["Functional dependency discovery"],"prefix":"10.14778","volume":"8","author":[{"given":"Thorsten","family":"Papenbrock","sequence":"first","affiliation":[{"name":"Hasso-Plattner-Institut, Potsdam, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jens","family":"Ehrlich","sequence":"additional","affiliation":[{"name":"Hasso-Plattner-Institut, Potsdam, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jannik","family":"Marten","sequence":"additional","affiliation":[{"name":"Hasso-Plattner-Institut, Potsdam, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tommy","family":"Neubert","sequence":"additional","affiliation":[{"name":"Hasso-Plattner-Institut, Potsdam, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jan-Peer","family":"Rudolph","sequence":"additional","affiliation":[{"name":"Hasso-Plattner-Institut, Potsdam, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Martin","family":"Sch\u00f6nberg","sequence":"additional","affiliation":[{"name":"Hasso-Plattner-Institut, Potsdam, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jakob","family":"Zwiener","sequence":"additional","affiliation":[{"name":"Hasso-Plattner-Institut, Potsdam, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Felix","family":"Naumann","sequence":"additional","affiliation":[{"name":"Hasso-Plattner-Institut, Potsdam, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2015,6]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661829.2661884"},{"key":"e_1_2_1_2_1","first-page":"487","volume-title":"Proceedings of the International Conference on Very Large Databases (VLDB)","author":"Agrawal R.","year":"1994","unstructured":"R. Agrawal and R. Srikant . Fast algorithms for mining association rules in large databases . In Proceedings of the International Conference on Very Large Databases (VLDB) , pages 487 -- 499 , 1994 . R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. In Proceedings of the International Conference on Very Large Databases (VLDB), pages 487--499, 1994."},{"key":"e_1_2_1_3_1","first-page":"487","volume-title":"Proceedings of the International Conference on Very Large Databases (VLDB)","author":"Agrawal R.","year":"1994","unstructured":"R. Agrawal and R. Srikant . Fast algorithms for mining association rules in large databases . In Proceedings of the International Conference on Very Large Databases (VLDB) , pages 487 -- 499 , 1994 . R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. In Proceedings of the International Conference on Very Large Databases (VLDB), pages 487--499, 1994."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/0022-0000(86)90019-X"},{"key":"e_1_2_1_5_1","unstructured":"FDEP home page. www.cs.bris.ac.uk\/~flach\/fdep. Accessed: 2015-02-26.  FDEP home page. www.cs.bris.ac.uk\/~flach\/fdep. Accessed: 2015-02-26."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5555\/1216155.1216159"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/42.2.100"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1007568.1007641"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2516641.2516643"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/0304-3975(95)00028-U"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2010.197"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/645339.650138"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/0166-218X(92)90031-5"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/645504.656418"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0306-4379(01)00032-1"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/B978-1-55860-307-3.50043-5"},{"key":"e_1_2_1_17_1","unstructured":"Tane home page. www.cs.helsinki.fi\/research\/fdk\/datamining\/tane\/. Accessed: 2015-02-26.  Tane home page. www.cs.helsinki.fi\/research\/fdk\/datamining\/tane\/. Accessed: 2015-02-26."},{"key":"e_1_2_1_18_1","unstructured":"UCI machine learning repository. http:\/\/archive.ics.uci.edu\/ml. Accessed: 2015-02-26.  UCI machine learning repository. http:\/\/archive.ics.uci.edu\/ml. Accessed: 2015-02-26."},{"key":"e_1_2_1_19_1","volume-title":"Principles of Database and Knowledge-Base Systems: Volume II: The New Technologies","author":"Ullman J. D.","year":"1990","unstructured":"J. D. Ullman . Principles of Database and Knowledge-Base Systems: Volume II: The New Technologies . W. H. Freeman & Co. , New York, NY, USA , 1990 . J. D. Ullman. Principles of Database and Knowledge-Base Systems: Volume II: The New Technologies. W. H. Freeman & Co., New York, NY, USA, 1990."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.5555\/646110.679455"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10618-007-0083-9"},{"key":"e_1_2_1_22_1","first-page":"729","volume-title":"Proceedings of the International Conference on Data Mining (ICDM)","author":"Yao H.","year":"2002","unstructured":"H. Yao , H. J. Hamilton , and C. J. Butz . FD_Mine: discovering functional dependencies in a database using equivalences . In Proceedings of the International Conference on Data Mining (ICDM) , pages 729 -- 732 , 2002 . H. Yao, H. J. Hamilton, and C. J. Butz. FD_Mine: discovering functional dependencies in a database using equivalences. In Proceedings of the International Conference on Data Mining (ICDM), pages 729--732, 2002."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/2794367.2794377","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:11:07Z","timestamp":1672225867000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/2794367.2794377"}},"subtitle":["an experimental evaluation of seven algorithms"],"short-title":[],"issued":{"date-parts":[[2015,6]]},"references-count":22,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2015,6]]}},"alternative-id":["10.14778\/2794367.2794377"],"URL":"https:\/\/doi.org\/10.14778\/2794367.2794377","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2015,6]]}}}