{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,18]],"date-time":"2025-10-18T15:03:39Z","timestamp":1760799819287},"reference-count":23,"publisher":"Association for Computing Machinery (ACM)","issue":"13","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2014,8]]},"abstract":"<jats:p>\n            Microsoft SQL Server PowerPivot for Excel, or PowerPivot for short, is an in-memory business intelligence (BI) engine that enables Excel users to interactively create pivot tables over large data sets imported from sources such as relational databases, text files and web data feeds. Unlike traditional pivot tables in Excel that are defined on a single table, PowerPivot allows analysis over multiple tables connected via foreign-key joins. In many cases, however, these foreign-key relationships are not known a priori, and information workers are often not sophisticated enough to define these relationships. Therefore, the ability to automatically discover foreign-key relationships in PowerPivot is valuable, if not essential. The key challenge is to perform this detection\n            <jats:italic>interactively<\/jats:italic>\n            and with\n            <jats:italic>high precision<\/jats:italic>\n            even when data sets scale to hundreds of millions of rows and the schema contains tens of tables and hundreds of columns. In this paper, we describe techniques for fast foreign-key detection in PowerPivot and experimentally evaluate its accuracy, performance and scale on both synthetic benchmarks and real-world data sets. 
These techniques have been incorporated into PowerPivot for Excel.\n          <\/jats:p>","DOI":"10.14778\/2733004.2733014","type":"journal-article","created":{"date-parts":[[2015,5,12]],"date-time":"2015-05-12T15:37:52Z","timestamp":1431445072000},"page":"1417-1428","source":"Crossref","is-referenced-by-count":24,"title":["Fast foreign-key detection in Microsoft SQL server PowerPivot for Excel"],"prefix":"10.14778","volume":"7","author":[{"given":"Zhimin","family":"Chen","sequence":"first","affiliation":[{"name":"Microsoft Research"}]},{"given":"Vivek","family":"Narasayya","sequence":"additional","affiliation":[{"name":"Microsoft Research"}]},{"given":"Surajit","family":"Chaudhuri","sequence":"additional","affiliation":[{"name":"Microsoft Research"}]}],"member":"320","published-online":{"date-parts":[[2014,8]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1978542.1978562"},{"key":"e_1_2_1_2_1","unstructured":"PowerPivot for Excel. http:\/\/msdn.microsoft.com\/en-us\/library\/ee210644.aspx"},{"key":"e_1_2_1_3_1","unstructured":"Chaudhuri S. Narasayya V. Program for TPC-D Data Generation with skew. ftp:\/\/ftp.research.microsoft.com\/users\/viveknar\/TPCDSkew\/"},{"key":"e_1_2_1_4_1","unstructured":"AdventureWorks database. 
http:\/\/msftdbprodsamples.codeplex.com\/releases\/view\/55926"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/276698.276781"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1989323.1989336"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920944"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1871437.1871440"},{"key":"e_1_2_1_10_1","volume-title":"WebDB","author":"Rostin O.","year":"2009","unstructured":"A. Rostin, O. Albrecht, J. Bauckmann, F. Naumann, and U. Leser. A machine learning approach to foreign key discovery. In WebDB, 2009."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0306-4379(01)00027-8"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/s007780100057"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/872757.872796"},{"key":"e_1_2_1_14_1","volume-title":"Workshop on Information Integration on the Web. IIWeb","author":"Cohen W. W.","year":"2003","unstructured":"Cohen, W. W., Ravikumar, P., Fienberg, S. E. A Comparison of String Distance Metrics for Name-Matching tasks. In: Workshop on Information Integration on the Web. IIWeb, 2003."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2213836.2213961"},{"key":"e_1_2_1_16_1","unstructured":"Microsoft Power Query for Excel. http:\/\/office.microsoft.com\/en-us\/excel\/download-microsoft-power-query-for-excel-FX104018616.aspx"},{"key":"e_1_2_1_17_1","unstructured":"TPC-H benchmark. 
http:\/\/www.tpch.org\/tpch"},{"key":"e_1_2_1_18_1","unstructured":"Relationships in PowerPivot. http:\/\/technet.microsoft.com\/en-us\/library\/gg399148.aspx"},{"key":"e_1_2_1_19_1","unstructured":"TPC-E benchmark. http:\/\/www.tpch.org\/tpce"},{"key":"e_1_2_1_20_1","volume-title":"Efficiently Detecting Inclusion Dependencies. In International Conference on Data Engineering","author":"Bauckmann J.","year":"2007","unstructured":"Bauckmann, J., et al. Efficiently Detecting Inclusion Dependencies. In International Conference on Data Engineering, 2007, Istanbul, Turkey."},{"key":"e_1_2_1_21_1","unstructured":"IBM Infosphere. Data profiling and analysis. http:\/\/www.ibm.com\/software\/data\/infosphere\/"},{"key":"e_1_2_1_22_1","unstructured":"Data Profiling Task in Microsoft SQL Server Integration Services. http:\/\/technet.microsoft.com\/en-us\/library\/bb895263.aspx"},{"key":"e_1_2_1_23_1","unstructured":"Trifacta. http:\/\/www.trifacta.com\/"},{"key":"e_1_2_1_24_1","unstructured":"OpenRefine (formerly Google Refine). 
http:\/\/openrefine.org\/"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/2733004.2733014","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T09:40:12Z","timestamp":1672220412000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/2733004.2733014"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,8]]},"references-count":23,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2014,8]]}},"alternative-id":["10.14778\/2733004.2733014"],"URL":"https:\/\/doi.org\/10.14778\/2733004.2733014","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2014,8]]}}}