{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,5]],"date-time":"2025-10-05T19:46:02Z","timestamp":1759693562100,"version":"3.41.0"},"reference-count":77,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2014,12,4]],"date-time":"2014-12-04T00:00:00Z","timestamp":1417651200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGMOD Rec."],"published-print":{"date-parts":[[2014,12,4]]},"abstract":"<jats:p>Efficient document processing is a must when large volumes of XML data are involved. In such critical scenarios, a well-known solution to this problem is to distribute (map) the data among several processing nodes, and then distribute the processing accordingly, taking advantage of parallelism. This is the approach taken by distributed databases and MapReduce environments. Fragmentation techniques play an important role in these scenarios. They provide a way to \"cut\" the database into pieces and distribute the pieces over a network. This way, queries can also be \"cut\" into sub-queries that run in parallel, thus achieving better performance when compared to the centralized environment. However, there is no consensus in the database community as to what an XML fragment is. In fact, several approaches in literature present definitions of XML fragments. In addition to query processing, using XML fragmentation techniques may also be helpful when managing XML documents distributed along the web or clouds. This paper surveys the existing XML fragmentation approaches in literature, comparing their features and highlighting their drawbacks. Our contribution resides in establishing a map of the area.<\/jats:p>","DOI":"10.1145\/2694428.2694434","type":"journal-article","created":{"date-parts":[[2014,12,8]],"date-time":"2014-12-08T16:17:14Z","timestamp":1418055434000},"page":"24-35","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["A Survey on XML Fragmentation"],"prefix":"10.1145","volume":"43","author":[{"given":"Vanessa","family":"Braganholo","sequence":"first","affiliation":[{"name":"Fluminense Federal University, UFF, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Marta","family":"Mattoso","sequence":"additional","affiliation":[{"name":"Federal University of Rio de Janeiro, COPPE\/UFRJ, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2014,12,4]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5555\/1315451.1315549"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/872757.872821"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687627.1687731"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/11896548_15"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/304181.304590"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/SITIS.2010.47"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.datak.2006.01.011"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-74469-6_53"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1031453.1031464"},{"key":"e_1_2_1_10_1","first-page":"97","volume-title":"WebDB","author":"Bose S.","year":"2005","unstructured":"S. Bose and L. Fegaras . XFrag: a query processing framework for fragmented XML data . In WebDB , pages 97 -- 102 , 2005 . S. Bose and L. Fegaras. XFrag: a query processing framework for fragmented XML data. In WebDB, pages 97--102, 2005."},{"key":"e_1_2_1_11_1","first-page":"195","volume-title":"DBPL","author":"Bose S.","year":"2003","unstructured":"S. Bose , L. Fegaras , D. Levine , and V. Chaluvadi . A query algebra for fragmented XML stream data . In DBPL , pages 195 -- 215 , 2003 . S. Bose, L. Fegaras, D. Levine, and V. Chaluvadi. A query algebra for fragmented XML stream data. In DBPL, pages 195--215, 2003."},{"key":"e_1_2_1_12_1","first-page":"73","volume-title":"WebDB","author":"Bremer J.-M.","year":"2003","unstructured":"J.-M. Bremer and M. Gertz . On distributing XML repositories . In WebDB , pages 73 -- 78 , 2003 . J.-M. Bremer and M. Gertz. On distributing XML repositories. In WebDB, pages 73--78, 2003."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-004-0150-4"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2004.75"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-005-0172-6"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11390-005-0357-x"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2396761.2398745"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564706"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2389241.2389251"},{"key":"e_1_2_1_20_1","first-page":"341","volume-title":"VLDB","author":"Cooper B. F.","year":"2001","unstructured":"B. F. Cooper , N. Sample , M. J. Franklin , G. R. Hjaltason , and M. Shadmon . A fast index for semistructured data . In VLDB , pages 341 -- 350 , 2001 . B. F. Cooper, N. Sample, M. J. Franklin, G. R. Hjaltason, and M. Shadmon. A fast index for semistructured data. In VLDB, pages 341--350, 2001."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.14778\/1978665.1978668"},{"key":"e_1_2_1_22_1","first-page":"137","volume-title":"OSDI","author":"Dean J.","year":"2004","unstructured":"J. Dean and S. Ghemawat . MapReduce: simplified data processing on large clusters . In OSDI , pages 137 -- 150 , 2004 . J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. In OSDI, pages 137--150, 2004."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1327452.1327492"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1629175.1629198"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920908"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.14778\/2350229.2350272"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.Companion.2012.129"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2247596.2247601"},{"key":"e_1_2_1_29_1","first-page":"1","volume-title":"WebDB","author":"Fegaras L.","year":"2011","unstructured":"L. Fegaras , C. Li , U. Gupta , and J. J. Philip . XML query optimization in map-reduce . In WebDB , pages 1 -- 6 , 2011 . L. Fegaras, C. Li, U. Gupta, and J. J. Philip. XML query optimization in map-reduce. In WebDB, pages 1--6, 2011."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.5555\/646838.708506"},{"issue":"3","key":"e_1_2_1_31_1","first-page":"455","article-title":"Processing queries over distributed XML databases","volume":"1","author":"Figueiredo G.","year":"2010","unstructured":"G. Figueiredo , V. Braganholo , and M. Mattoso . Processing queries over distributed XML databases . JIDM , 1 ( 3 ): 455 -- 470 , 2010 . G. Figueiredo, V. Braganholo, and M. Mattoso. Processing queries over distributed XML databases. JIDM, 1(3):455--470, 2010.","journal-title":"JIDM"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/563932.563912"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.5555\/646130.679818"},{"key":"e_1_2_1_34_1","first-page":"436","volume-title":"VLDB","author":"Goldman R.","year":"1997","unstructured":"R. Goldman and J. Widom . DataGuides: enabling query formulation and optimization in semistructured databases . In VLDB , pages 436 -- 445 , 1997 . R. Goldman and J. Widom. DataGuides: enabling query formulation and optimization in semistructured databases. In VLDB, pages 436--445, 1997."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2007.1060"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1007\/11733836_33"},{"key":"e_1_2_1_37_1","first-page":"149","volume-title":"DBPL","author":"Jagadish H. V.","year":"2001","unstructured":"H. V. Jagadish , L. V. S. Lakshmanan , D. Srivastava , and K. Thompson . TAX: a tree algebra for XML . In DBPL , pages 149 -- 164 , 2001 . H. V. Jagadish, L. V. S. Lakshmanan, D. Srivastava, and K. Thompson. TAX: a tree algebra for XML. In DBPL, pages 149--164, 2001."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/PDCAT.2007.80"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDEW.2006.120"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.datak.2004.06.001"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.14778\/1880172.1880173"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10619-011-7085-8"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/AINA.2007.64"},{"key":"e_1_2_1_44_1","first-page":"202","volume-title":"HICSS","author":"Lee K.","year":"2002","unstructured":"K. Lee , J. Min , and K. Park . A design and implementation of XML-Based mediation framework (XMF) for integration of internet information resources . In HICSS , pages 202 -- 202 , 2002 . K. Lee, J. Min, and K. Park. A design and implementation of XML-Based mediation framework (XMF) for integration of internet information resources. In HICSS, pages 202--202, 2002."},{"issue":"5","key":"e_1_2_1_45_1","first-page":"757","article-title":"Memory-efficient query processing over XML fragment stream with fragment labeling","volume":"29","author":"Lee S.","year":"2010","unstructured":"S. Lee , J. Kim , and H. Kang . Memory-efficient query processing over XML fragment stream with fragment labeling . Computing and Informatics , 29 ( 5 ): 757 -- 782 , 2010 . S. Lee, J. Kim, and H. Kang. Memory-efficient query processing over XML fragment stream with fragment labeling. Computing and Informatics, 29(5):757--782, 2010.","journal-title":"Computing and Informatics"},{"issue":"1","key":"e_1_2_1_46_1","first-page":"75","article-title":"Adaptive virtual partitioning for OLAP query processing in a database cluster","volume":"1","author":"Lima A.","year":"2010","unstructured":"A. Lima , M. Mattoso , and P. Valduriez . Adaptive virtual partitioning for OLAP query processing in a database cluster . JIDM , 1 ( 1 ): 75 -- 88 , 2010 . A. Lima, M. Mattoso, and P. Valduriez. Adaptive virtual partitioning for OLAP query processing in a database cluster. JIDM, 1(1):75--88, 2010.","journal-title":"JIDM"},{"key":"e_1_2_1_47_1","first-page":"200","volume-title":"SBBD","author":"Ma H.","year":"2003","unstructured":"H. Ma and K.-D. Schewe . Fragmentation of XML documents . In SBBD , pages 200 -- 214 , 2003 . H. Ma and K.-D. Schewe. Fragmentation of XML documents. In SBBD, pages 200--214, 2003."},{"key":"e_1_2_1_48_1","first-page":"131","volume-title":"CAISE","author":"Ma H.","year":"2005","unstructured":"H. Ma and K.-D. Schewe . Heuristic horizontal XML fragmentation . In CAISE , pages 131 -- 136 , 2005 . H. Ma and K.-D. Schewe. Heuristic horizontal XML fragmentation. In CAISE, pages 131--136, 2005."},{"issue":"1","key":"e_1_2_1_49_1","first-page":"21","article-title":"Fragmentation of XML documents","volume":"1","author":"Ma H.","year":"2010","unstructured":"H. Ma and K.-D. Schewe . Fragmentation of XML documents . JIDM , 1 ( 1 ): 21 -- 34 , 2010 . H. Ma and K.-D. Schewe. Fragmentation of XML documents. JIDM, 1(1):21--34, 2010.","journal-title":"JIDM"},{"issue":"1","key":"e_1_2_1_50_1","first-page":"35","article-title":"Fragmentation of XML documents","volume":"1","author":"Ma H.","year":"2010","unstructured":"H. Ma and K.-D. Schewe . Revisiting \" Fragmentation of XML documents \". JIDM , 1 ( 1 ): 35 -- 36 , 2010 . H. Ma and K.-D. Schewe. Revisiting \"Fragmentation of XML documents\". JIDM, 1(1):35--36, 2010.","journal-title":"JIDM"},{"key":"e_1_2_1_51_1","first-page":"183","volume-title":"ADC","author":"Ma H.","year":"2006","unstructured":"H. Ma , K.-D. Schewe , and Q. Wang . A heuristic approach to cost-efficient fragmentation and allocation of complex value databases . In ADC , pages 183 -- 192 , 2006 . H. Ma, K.-D. Schewe, and Q. Wang. A heuristic approach to cost-efficient fragmentation and allocation of complex value databases. In ADC, pages 183--192, 2006."},{"key":"e_1_2_1_52_1","first-page":"103","volume-title":"ADC","author":"Ma H.","year":"2007","unstructured":"H. Ma , K.-D. Schewe , and Q. Wang . A heuristic approach to cost-efficient derived horizontal fragmentation of complex value databases . In ADC , pages 103 -- 111 , 2007 . H. Ma, K.-D. Schewe, and Q. Wang. A heuristic approach to cost-efficient derived horizontal fragmentation of complex value databases. In ADC, pages 103--111, 2007."},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/1516241.1516322"},{"key":"e_1_2_1_54_1","doi-asserted-by":"crossref","first-page":"3340","DOI":"10.1007\/978-0-387-39940-9_1083","volume-title":"Encyclopedia of Database Systems","author":"Mattoso M.","year":"2009","unstructured":"M. Mattoso . Virtual partitioning. In L. Liu and M. T. Ozsu, editors, Encyclopedia of Database Systems , pages 3340 -- 3341 . 2009 . M. Mattoso. Virtual partitioning. In L. Liu and M. T. Ozsu, editors, Encyclopedia of Database Systems, pages 3340--3341. 2009."},{"key":"e_1_2_1_55_1","first-page":"315","volume-title":"VLDB","author":"McHugh J.","year":"1999","unstructured":"J. McHugh and J. Widom . Query optimization for XML . In VLDB , pages 315 -- 326 , 1999 . J. McHugh and J. Widom. Query optimization for XML. In VLDB, pages 315--326, 1999."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815918.1815924"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.5555\/1783823.1783906"},{"key":"e_1_2_1_58_1","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4419-8834-8","volume-title":"Principles of Distributed Database Systems. 3 edition","author":"Ozsu M. T.","year":"2011","unstructured":"M. T. Ozsu and P. Valduriez . Principles of Distributed Database Systems. 3 edition , 2011 . M. T. Ozsu and P. Valduriez. Principles of Distributed Database Systems. 3 edition, 2011."},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/1007568.1007579"},{"key":"e_1_2_1_60_1","volume-title":"XML fragment interchange. W3C candidate recommendation 12 february","author":"Grosso Paul","year":"2001","unstructured":"Paul Grosso and Daniel Veillard . XML fragment interchange. W3C candidate recommendation 12 february 2001 ., 2001. W3C Candidate Recommendation 12 February 2001. Paul Grosso and Daniel Veillard. XML fragment interchange. W3C candidate recommendation 12 february 2001., 2001. W3C Candidate Recommendation 12 February 2001."},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/1559845.1559865"},{"issue":"3","key":"e_1_2_1_62_1","first-page":"495","article-title":"Virtual partitioning ad-hoc queries over distributed XML databases","volume":"2","author":"Rodrigues C.","year":"2011","unstructured":"C. Rodrigues , V. Braganholo , and M. Mattoso . Virtual partitioning ad-hoc queries over distributed XML databases . JIDM , 2 ( 3 ): 495 -- 510 , 2011 . C. Rodrigues, V. Braganholo, and M. Mattoso. Virtual partitioning ad-hoc queries over distributed XML databases. JIDM, 2(3):495--510, 2011.","journal-title":"JIDM"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.5555\/646291.687075"},{"key":"e_1_2_1_64_1","first-page":"253","volume-title":"BalticDB","author":"Schewe K.-D.","year":"2002","unstructured":"K.-D. Schewe . Fragmentation of object oriented and semistructured data . In BalticDB , pages 253 -- 266 , 2002 . K.-D. Schewe. Fragmentation of object oriented and semistructured data. In BalticDB, pages 253--266, 2002."},{"key":"e_1_2_1_65_1","volume-title":"DocEng","author":"Silva L.","year":"2013","unstructured":"L. Silva , L. Silva , M. Mattoso , and V. Braganholo . On the performance of the position() XPath function . In DocEng , 2013 . L. Silva, L. Silva, M. Mattoso, and V. Braganholo. On the performance of the position() XPath function. In DocEng, 2013."},{"issue":"1","key":"e_1_2_1_66_1","first-page":"27","article-title":"Towards recommendations for horizontal XML fragmentation","volume":"4","author":"Silva T.","year":"2013","unstructured":"T. Silva , F. Bai\u00e4o , J. Sampaio , M. Mattoso , and V. Braganholo . Towards recommendations for horizontal XML fragmentation . JIDM , 4 ( 1 ): 27 -- 36 , 2013 . T. Silva, F. Bai\u00e4o, J. Sampaio, M. Mattoso, and V. Braganholo. Towards recommendations for horizontal XML fragmentation. JIDM, 4(1):27--36, 2013.","journal-title":"JIDM"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/1629175.1629197"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/507234.507235"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.5555\/882511.885349"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564715"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2010.5447738"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.5555\/1018432.1021514"},{"key":"e_1_2_1_73_1","first-page":"55","volume-title":"DBISP2P","author":"Waldvogel M.","year":"2008","unstructured":"M. Waldvogel , M. Kramis , and S. Graf . Distributing XML with focus on parallel evaluation . In DBISP2P , pages 55 -- 67 , 2008 . M. Waldvogel, M. Kramis, and S. Graf. Distributing XML with focus on parallel evaluation. In DBISP2P, pages 55--67, 2008."},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2003.1260812"},{"key":"e_1_2_1_75_1","doi-asserted-by":"crossref","first-page":"162","DOI":"10.1007\/3-540-36556-7_14","volume-title":"Efficiency and Effectiveness of XML Tools and Techniques and Data Integration over the Web-Revised Papers","author":"Yao B. B.","year":"2003","unstructured":"B. B. Yao , M. T. \u00d6zsu , and J. Keenleyside . XBench - a family of benchmarks for XML DBMSs . In Efficiency and Effectiveness of XML Tools and Techniques and Data Integration over the Web-Revised Papers , pages 162 -- 164 , 2003 . B. B. Yao, M. T. \u00d6zsu, and J. Keenleyside. XBench - a family of benchmarks for XML DBMSs. In Efficiency and Effectiveness of XML Tools and Techniques and Data Integration over the Web-Revised Papers, pages 162--164, 2003."},{"key":"e_1_2_1_76_1","doi-asserted-by":"crossref","first-page":"209","DOI":"10.1117\/12.543826","volume-title":"Data Mining and Knowledge Discovery: theory, tools and technology","author":"Zhang M.","year":"2004","unstructured":"M. Zhang and J. T. Yao . XML algebras for data mining . In Data Mining and Knowledge Discovery: theory, tools and technology , pages 209 -- 217 , 2004 . M. Zhang and J. T. Yao. XML algebras for data mining. In Data Mining and Knowledge Discovery: theory, tools and technology, pages 209--217, 2004."},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1145\/584931.584936"}],"container-title":["ACM SIGMOD Record"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2694428.2694434","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2694428.2694434","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T06:12:20Z","timestamp":1750227140000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2694428.2694434"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,12,4]]},"references-count":77,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2014,12,4]]}},"alternative-id":["10.1145\/2694428.2694434"],"URL":"https:\/\/doi.org\/10.1145\/2694428.2694434","relation":{},"ISSN":["0163-5808"],"issn-type":[{"type":"print","value":"0163-5808"}],"subject":[],"published":{"date-parts":[[2014,12,4]]},"assertion":[{"value":"2014-12-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}