{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,28]],"date-time":"2025-06-28T07:23:05Z","timestamp":1751095385606,"version":"3.41.0"},"reference-count":9,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2015,8]]},"abstract":"<jats:p>Semi-structured data are prevalent on the web, with formats such as XML and JSON soaring in popularity due to their generality, flexibility and easy customization. However, these very same features make semi-structured data prone to a range of data quality errors, from errors in content to errors in structure. While the former has been well studied, little attention has been paid to structural errors.<\/jats:p><jats:p>In this demonstration, we present T<jats:sc>ree<\/jats:sc>S<jats:sc>cope<\/jats:sc>, which analyzes semi-structured data sets with the goal of automatically identifying <jats:italic>structural anomalies<\/jats:italic> from the data. Our techniques learn robust structural models that have high support, to identify potential errors in the structure. Identified structural anomalies are then concisely summarized to provide plausible explanations of the potential errors. 
The goal of this demonstration is to enable an interactive exploration of the process of identifying and summarizing structural anomalies in semi-structured data sets.<\/jats:p>","DOI":"10.14778\/2824032.2824097","type":"journal-article","created":{"date-parts":[[2015,9,16]],"date-time":"2015-09-16T12:18:17Z","timestamp":1442405897000},"page":"1904-1907","source":"Crossref","is-referenced-by-count":5,"title":["TreeScope"],"prefix":"10.14778","volume":"8","author":[{"given":"Shanshan","family":"Ying","sequence":"first","affiliation":[{"name":"Advanced Digital Sciences Center"}]},{"given":"Flip","family":"Korn","sequence":"additional","affiliation":[{"name":"Google Research"}]},{"given":"Barna","family":"Saha","sequence":"additional","affiliation":[{"name":"University of Massachusetts Amherst"}]},{"given":"Divesh","family":"Srivastava","sequence":"additional","affiliation":[{"name":"AT&T Labs--Research"}]}],"member":"320","published-online":{"date-parts":[[2015,8]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Data Quality: Concepts, Methodologies and Techniques. Data-Centric Systems and Applications","author":"Batini C.","year":"2006","unstructured":"C. Batini and M. Scannapieco. Data Quality: Concepts, Methodologies and Techniques. Data-Centric Systems and Applications. Springer, 2006."},{"key":"e_1_2_1_2_1","first-page":"115","volume-title":"VLDB","author":"Bex G. J.","year":"2006","unstructured":"G. J. Bex, F. Neven, T. Schwentick, and K. Tuyls. Inference of Concise DTDs from XML Data. In VLDB, pages 115--126, 2006."},{"key":"e_1_2_1_3_1","volume-title":"Foundations of Data Quality Management. Synthesis Lectures on Data Management","author":"Fan W.","year":"2012","unstructured":"W. 
Fan and F. Geerts. Foundations of Data Quality Management. Synthesis Lectures on Data Management. Morgan & Claypool Publishers, 2012."},{"key":"e_1_2_1_4_1","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1145\/342009.335409","volume-title":"SIGMOD","author":"Garofalakis M. N.","year":"2000","unstructured":"M. N. Garofalakis, A. Gionis, R. Rastogi, S. Seshadri, and K. Shim. XTRACT: A System for Extracting Document Type Descriptors from XML Documents. In SIGMOD, pages 165--176, 2000. 10.1145\/342009.335409"},{"key":"e_1_2_1_5_1","first-page":"19","volume-title":"IEEE Data Eng. Bull","author":"Garofalakis M. N.","year":"2003","unstructured":"M. N. Garofalakis, A. Gionis, R. Rastogi, S. Seshadri, and K. Shim. DTD Inference From XML Documents: The XTRACT Approach. IEEE Data Eng. Bull, pages 19--25, 2003."},{"key":"e_1_2_1_6_1","first-page":"1719","volume-title":"CIKM","author":"Grijzenhout S.","year":"2011","unstructured":"S. Grijzenhout and M. Marx. The Quality of the XML Web. In CIKM, pages 1719--1724, 2011. 10.1145\/2063576.2063824"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687553.1687577"},{"key":"e_1_2_1_8_1","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1145\/502585.502613","volume-title":"CIKM","author":"Sankey J.","year":"2001","unstructured":"J. Sankey and R. K. Wong. 
Structural Inference for Semistructured Data. In CIKM, pages 159--166, 2001. 10.1145\/502585.502613"},{"key":"e_1_2_1_9_1","first-page":"37","volume-title":"SIGMOD","author":"Truong B. Q.","year":"2013","unstructured":"B. Q. Truong, S. S. Bhowmick, C. E. Dyreson, and A. Sun. MESSIAH: Missing Element-conscious SLCA Nodes Search in XML Data. In SIGMOD, pages 37--48, 2013. 10.1145\/2463676.2463699"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/2824032.2824097","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,30]],"date-time":"2025-05-30T17:41:15Z","timestamp":1748626875000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/2824032.2824097"}},"subtitle":["finding structural anomalies in semi-structured data"],"short-title":[],"issued":{"date-parts":[[2015,8]]},"references-count":9,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2015,8]]}},"alternative-id":["10.14778\/2824032.2824097"],"URL":"https:\/\/doi.org\/10.14778\/2824032.2824097","relation":{},"ISSN":["2150-8097"],"issn-type":[{"type":"print","value":"2150-8097"}],"subject":[],"published":{"date-parts":[[2015,8]]}}}