{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,10]],"date-time":"2025-12-10T08:37:03Z","timestamp":1765355823881},"reference-count":12,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2009,8]]},"abstract":"<jats:p>\n            Modern information management applications often require integrating data from a variety of data sources, some of which may copy or buy data from other sources. When these data sources model a dynamically changing world (\n            <jats:italic>e.g.<\/jats:italic>\n            , people's contact information changes over time, restaurants open and go out of business), sources often provide out-of-date data. Errors can also creep into data when sources are updated often. Given out-of-date and erroneous data provided by different, possibly dependent, sources, it is challenging for data integration systems to provide the true values. Straightforward ways to resolve such inconsistencies (\n            <jats:italic>e.g.<\/jats:italic>\n            , voting) may lead to noisy results, often with detrimental consequences.\n          <\/jats:p>\n          <jats:p>\n            In this paper, we study the problem of finding true values and determining the copying relationship between sources, when the update history of the sources is known. We model the quality of sources over time by their\n            <jats:italic>coverage, exactness<\/jats:italic>\n            and\n            <jats:italic>freshness<\/jats:italic>\n            . Based on these measures, we conduct a probabilistic analysis. First, we develop a Hidden Markov Model that decides whether a source is a copier of another source and identifies the specific moments at which it copies. 
Second, we develop a Bayesian model that aggregates information from the sources to decide the true value for a data item, and the evolution of the true values over time. Experimental results on both real-world and synthetic data show high accuracy and scalability of our techniques.\n          <\/jats:p>","DOI":"10.14778\/1687627.1687691","type":"journal-article","created":{"date-parts":[[2014,6,24]],"date-time":"2014-06-24T12:17:57Z","timestamp":1403612277000},"page":"562-573","source":"Crossref","is-referenced-by-count":136,"title":["Truth discovery and copying detection in a dynamic world"],"prefix":"10.14778","volume":"2","author":[{"given":"Xin Luna","family":"Dong","sequence":"first","affiliation":[{"name":"AT&T Labs--Research, Florham Park, NJ"}]},{"given":"Laure","family":"Berti-Equille","sequence":"additional","affiliation":[{"name":"Universit\u00e9 de Rennes, Rennes cedex, France"}]},{"given":"Divesh","family":"Srivastava","sequence":"additional","affiliation":[{"name":"AT&T Labs--Research, Florham Park, NJ"}]}],"member":"320","published-online":{"date-parts":[[2009,8]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"CIDR","author":"Berti-Equille L.","year":"2009","unstructured":"L. Berti-Equille, A. D. Sarma, X. L. Dong, A. Marian, and D. Srivastava. Sailing the information ocean with awareness of currents: Discovery and application of source dependence. In CIDR, 2009."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1247480.1247646"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/335191.335391"},{"key":"e_1_2_1_4_1","unstructured":"X. L. Dong, L. Berti-Equille, and D. Srivastava. Truth discovery and copying detection in a dynamic world. 
http:\/\/www.research.att.com\/~lunadong\/publication\/lifespan_techReport.pdf."},{"key":"e_1_2_1_5_1","volume-title":"Integrating conflicting data: the role of source dependence. PVLDB, 2(1--2)","author":"Dong X. L.","year":"2009","unstructured":"X. L. Dong, L. Berti-Equille, and D. Srivastava. Integrating conflicting data: the role of source dependence. PVLDB, 2(1--2), 2009."},{"key":"e_1_2_1_6_1","volume-title":"Data fusion--resolving data conflicts for integration. PVLDB, 2(1--2)","author":"Dong X. L.","year":"2009","unstructured":"X. L. Dong and F. Naumann. Data fusion--resolving data conflicts for integration. PVLDB, 2(1--2), 2009."},{"key":"e_1_2_1_7_1","volume-title":"VLDB","author":"Guo H.","year":"2005","unstructured":"H. Guo, P.-\u00c5. Larson, and R. Ramakrishnan. Caching with 'good enough' currency, consistency, and completeness. In VLDB, 2005."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-004-0131-7"},{"issue":"1","key":"e_1_2_1_9_1","first-page":"11","article-title":"Efficient monitoring and querying of distributed, dynamic data via approximate replication","volume":"28","author":"Olston C.","year":"2005","unstructured":"C. Olston and J. Widom. 
Efficient monitoring and querying of distributed, dynamic data via approximate replication. IEEE Data Eng. Bull., 28(1):11--18, 2005.","journal-title":"IEEE Data Eng. Bull."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.18626"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1142\/S0218843001000369"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1281192.1281309"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/1687627.1687691","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:29:44Z","timestamp":1672226984000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/1687627.1687691"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,8]]},"references-count":12,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2009,8]]}},"alternative-id":["10.14778\/1687627.1687691"],"URL":"https:\/\/doi.org\/10.14778\/1687627.1687691","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2009,8]]}}}