{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,18]],"date-time":"2025-11-18T09:46:56Z","timestamp":1763459216216,"version":"3.45.0"},"reference-count":33,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2016,7,18]],"date-time":"2016-07-18T00:00:00Z","timestamp":1468800000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"NSF","award":["IIS-1217869, IIS-1450560 and IIS-1524382"],"award-info":[{"award-number":["IIS-1217869, IIS-1450560 and IIS-1524382"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Database Syst."],"published-print":{"date-parts":[[2016,8,8]]},"abstract":"<jats:p>We introduce and develop a declarative framework for entity linking and, in particular, for entity resolution. As in some earlier approaches, our framework is based on a systematic use of constraints. However, the constraints we adopt are link-to-source constraints, unlike in earlier approaches where source-to-link constraints were used to dictate how to generate links. Our approach makes it possible to focus entirely on the intended properties of the outcome of entity linking, thus separating the constraints from any procedure of how to achieve that outcome. The core language consists of link-to-source constraints that specify the desired properties of a link relation in terms of source relations and built-in predicates such as similarity measures. A key feature of the link-to-source constraints is that they employ disjunction, which enables the declarative listing of all the reasons two entities should be linked. We also consider extensions of the core language that capture collective entity resolution by allowing interdependencies among the link relations.<\/jats:p>\n                  <jats:p>\n                    We identify a class of \u201cgood\u201d solutions for entity-linking specifications, which we call\n                    <jats:italic toggle=\"yes\">maximum-value solutions<\/jats:italic>\n                    and which capture the strength of a link by counting the reasons that justify it. We study natural algorithmic problems associated with these solutions, including the problem of enumerating the \u201cgood\u201d solutions and the problem of finding the certain links, which are the links that appear in every \u201cgood\u201d solution. We show that these problems are tractable for the core language but may become intractable once we allow interdependencies among the link relations. We also make some surprising connections between our declarative framework, which is deterministic, and probabilistic approaches such as ones based on Markov Logic Networks.\n                  <\/jats:p>","DOI":"10.1145\/2894748","type":"journal-article","created":{"date-parts":[[2016,7,19]],"date-time":"2016-07-19T07:59:22Z","timestamp":1468915162000},"page":"1-38","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":13,"title":["A Declarative Framework for Linking Entities"],"prefix":"10.1145","volume":"41","author":[{"given":"Douglas","family":"Burdick","sequence":"first","affiliation":[{"name":"IBM Research -- Almaden, San Jose, CA"}]},{"given":"Ronald","family":"Fagin","sequence":"additional","affiliation":[{"name":"IBM Research -- Almaden, San Jose, CA"}]},{"given":"Phokion G.","family":"Kolaitis","sequence":"additional","affiliation":[{"name":"UC Santa Cruz &amp; IBM Research -- Almaden, Santa Cruz, CA"}]},{"given":"Lucian","family":"Popa","sequence":"additional","affiliation":[{"name":"IBM Research -- Almaden, San Jose, CA"}]},{"given":"Wang-Chiew","family":"Tan","sequence":"additional","affiliation":[{"name":"UC Santa Cruz, Santa Cruz, CA"}]}],"member":"320","published-online":{"date-parts":[[2016,7,18]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"N. Alur A. K. Jha B. Rosen and T. Skov. 2008. IBM WebSphere QualityStage Methodologies Standardization and Matching. Redbooks. http:\/\/www.redbooks.ibm.com\/redbooks\/pdfs\/sg247546.pdf."},{"key":"e_1_2_1_2_1","doi-asserted-by":"crossref","unstructured":"B. Alexe D. Burdick M. A. Hern\u00e1ndez G. Koutrika R. Krishnamurthy L. Popa I. R. Stanoi and R. Wisnesky. 2013. High-level rules for integration and analysis of data: New challenges. In LNCS 8000: In Search of Elegance in the Theory and Practice of Computation. 36--55.","DOI":"10.1007\/978-3-642-41660-6_3"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","unstructured":"A. Arasu C. Re and D. Suciu. 2009. Large-scale deduplication with constraints using dedupalog. In ICDE. 952--963. 10.1109\/ICDE.2009.43","DOI":"10.1109\/ICDE.2009.43"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ic.2013.06.002"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","unstructured":"M. Arenas L. E. Bertossi and J. Chomicki. 1999. Consistent query answers in inconsistent databases. In PODS. 68--79. 10.1145\/303976.303983","DOI":"10.1145\/303976.303983"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00224-012-9402-7"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","unstructured":"I. Bhattacharya and L. Getoor. 2007. Collective entity resolution in relational data. TKDD 1 1 (2007). 10.1145\/1217299.1217304","DOI":"10.1145\/1217299.1217304"},{"key":"e_1_2_1_8_1","volume-title":"18th International Conference on Database Theory (ICDT\u201915)","author":"Burdick D.","year":"2015","unstructured":"D. Burdick, R. Fagin, Ph. G. Kolaitis, L. Popa, and W.-C. Tan. 2015. A declarative framework for linking entities. In 18th International Conference on Database Theory (ICDT\u201915). 25--43."},{"key":"e_1_2_1_9_1","first-page":"60","article-title":"Extracting, linking and integrating data from public sources: A financial case study","volume":"34","author":"Burdick D.","year":"2011","unstructured":"D. Burdick, M. A. Hern\u00e1ndez, H. Ho, G. Koutrika, R. Krishnamurthy, L. Popa, I. R. Stanoi, S. Vaithyanathan, and S. Das. 2011. Extracting, linking and integrating data from public sources: A financial case study. IEEE Data Eng. Bull. 34, 3 (2011), 60--67.","journal-title":"IEEE Data Eng. Bull."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/0166-218X(87)90017-5"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ic.2004.04.007"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2011.127"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","unstructured":"X. Dong A. Y. Halevy and J. Madhavan. 2005. Reference reconciliation in complex information spaces. In SIGMOD. 85--96. 10.1145\/1066157.1066168","DOI":"10.1145\/1066157.1066168"},{"volume-title":"Proceedings of Algorithms and Computation - 25th International Symposium, (ISAAC\u201914)","author":"Droschinsky A.","key":"e_1_2_1_14_1","unstructured":"A. Droschinsky, B. Heinemann, N. Kriege, and P. Mutzel. 2014. Enumeration of maximum common subtree isomorphisms with polynomial-delay. In Proceedings of Algorithms and Computation - 25th International Symposium, (ISAAC\u201914). 81--93."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2007.9"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.tcs.2004.10.033"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","unstructured":"W. Fan. 2008. Dependencies revisited for improving data quality. In PODS. 159--170. 10.1145\/1376916.1376940","DOI":"10.1145\/1376916.1376940"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","unstructured":"W. Fan and F. Geerts. 2012. Foundations of Data Quality Management. Morgan & Claypool Publishers.","DOI":"10.5555\/2371176"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1969.10501049"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","unstructured":"H. Galhardas D. Florescu D. Shasha E. Simon and C.-A. Saita. 2001. Declarative data cleaning: Language model and algorithms. In VLDB. 371--380.","DOI":"10.5555\/645927.672042"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/2566736"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.14778\/2367502.2367564"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","unstructured":"O. Hassanzadeh A. Kementsietsidis L. Lim R. J. Miller and M. Wang. 2009. A framework for semantic link discovery over relational data. In CIKM. 1027--1036. 10.1145\/1645953.1646084","DOI":"10.1145\/1645953.1646084"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2452376.2452440"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","unstructured":"M. A. Hern\u00e1ndez and S. J. Stolfo. 1995. The merge\/purge problem for large databases. In SIGMOD. 127--138. 10.1145\/568271.223807","DOI":"10.1145\/568271.223807"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/322092.322093"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1016\/0020-0190(88)90065-8"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.tcs.2004.08.006"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","unstructured":"N. Koudas S. Sarawagi and D. Srivastava. 2006. Record linkage: Similarity measures and algorithms. In SIGMOD. 802--803. 10.1145\/1142473.1142599","DOI":"10.1145\/1142473.1142599"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1287\/opre.16.3.682"},{"volume-title":"Computational Complexity","author":"Papadimitriou C. H.","key":"e_1_2_1_31_1","unstructured":"C. H. Papadimitriou. 1994. Computational Complexity. Addison-Wesley."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-006-5833-1"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","unstructured":"P. Singla and P. Domingos. 2006. Entity resolution with Markov logic. In ICDM. 572--582. 10.1109\/ICDM.2006.65","DOI":"10.1109\/ICDM.2006.65"}],"container-title":["ACM Transactions on Database Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2894748","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2894748","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2894748","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,18]],"date-time":"2025-11-18T09:40:48Z","timestamp":1763458848000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2894748"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,7,18]]},"references-count":33,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2016,8,8]]}},"alternative-id":["10.1145\/2894748"],"URL":"https:\/\/doi.org\/10.1145\/2894748","relation":{},"ISSN":["0362-5915","1557-4644"],"issn-type":[{"type":"print","value":"0362-5915"},{"type":"electronic","value":"1557-4644"}],"subject":[],"published":{"date-parts":[[2016,7,18]]},"assertion":[{"value":"2015-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-02-01","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-07-18","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}