{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,13]],"date-time":"2026-05-13T17:34:46Z","timestamp":1778693686152,"version":"3.51.4"},"reference-count":12,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2015,8]]},"abstract":"<jats:p>Data profiling is the discipline of discovering metadata about given datasets. The metadata itself serve a variety of use cases, such as data integration, data cleansing, or query optimization. Due to the importance of data profiling in practice, many tools have emerged that support data scientists and IT professionals in this task. These tools provide good support for profiling statistics that are easy to compute, but they are usually lacking automatic and efficient discovery of complex statistics, such as inclusion dependencies, unique column combinations, or functional dependencies.<\/jats:p>\n          <jats:p>\n            We present Metanome, an extensible profiling platform that incorporates many state-of-the-art profiling algorithms. While Metanome is able to calculate simple profiling statistics in relational data, its focus lies on the automatic\n            <jats:italic>discovery of complex metadata.<\/jats:italic>\n            Metanome's goal is to provide novel profiling algorithms from research, perform comparative evaluations, and to support developers in building and testing new algorithms. In addition, Metanome is able to rank profiling results according to various metrics and to visualize the, at times, large metadata sets.\n          <\/jats:p>","DOI":"10.14778\/2824032.2824086","type":"journal-article","created":{"date-parts":[[2015,9,16]],"date-time":"2015-09-16T12:18:17Z","timestamp":1442405897000},"page":"1860-1863","source":"Crossref","is-referenced-by-count":86,"title":["Data profiling with metanome"],"prefix":"10.14778","volume":"8","author":[{"given":"Thorsten","family":"Papenbrock","sequence":"first","affiliation":[{"name":"Hasso-Plattner-Institut, Potsdam, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tanja","family":"Bergmann","sequence":"additional","affiliation":[{"name":"Hasso-Plattner-Institut, Potsdam, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Moritz","family":"Finke","sequence":"additional","affiliation":[{"name":"Hasso-Plattner-Institut, Potsdam, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jakob","family":"Zwiener","sequence":"additional","affiliation":[{"name":"Hasso-Plattner-Institut, Potsdam, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Felix","family":"Naumann","sequence":"additional","affiliation":[{"name":"Hasso-Plattner-Institut, Potsdam, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2015,8]]},"reference":[{"key":"e_1_2_1_1_1","first-page":"949","volume-title":"Proceedings of the International Conference on Information and Knowledge Management (CIKM)","author":"Abedjan Z.","year":"2014"},{"key":"e_1_2_1_2_1","first-page":"2","volume-title":"ICDE Workshops","author":"Bauckmann J.","year":"2006"},{"issue":"3","key":"e_1_2_1_3_1","first-page":"139","article-title":"Database dependency discovery: a machine learning approach","volume":"12","author":"Flach P. A.","year":"1999","journal-title":"AI Communications"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.14778\/2732240.2732248"},{"issue":"2","key":"e_1_2_1_5_1","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1093\/comjnl\/42.2.100","article-title":"TANE: An efficient algorithm for discovering functional and approximate dependencies","volume":"42","author":"Huhtala Y.","year":"1999","journal-title":"The Computer Journal"},{"key":"e_1_2_1_6_1","first-page":"350","volume-title":"Proceedings of the International Conference on Extending Database Technology (EDBT)","author":"Lopes S.","year":"2000"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10844-007-0048-x"},{"key":"e_1_2_1_8_1","first-page":"189","volume-title":"Proceedings of the International Conference on Database Theory (ICDT)","author":"Novelli N.","year":"2001"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.14778\/2794367.2794377"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.14778\/2752939.2752946"},{"key":"e_1_2_1_11_1","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1007\/3-540-44801-2_11","volume-title":"Proceedings of the International Conference of Data Warehousing and Knowledge Discovery (DaWaK)","author":"Wyss C.","year":"2001"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10618-007-0083-9"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/2824032.2824086","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T10:05:07Z","timestamp":1672221907000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/2824032.2824086"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,8]]},"references-count":12,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2015,8]]}},"alternative-id":["10.14778\/2824032.2824086"],"URL":"https:\/\/doi.org\/10.14778\/2824032.2824086","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2015,8]]}}}