{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,1,14]],"date-time":"2023-01-14T16:11:38Z","timestamp":1673712698526},"reference-count":15,"publisher":"Association for Computing Machinery (ACM)","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2008,8]]},"abstract":"<jats:p>In a publish-subscribe system based on filtering of XML documents subscribers specify their interests with profiles expressed in the XPath language. The system processes a stream of XML documents and delivers to subscribers a notification or content of documents that match the profiles. We present a new XML-document-filtering algorithm that is based on the classic Aho-Corasick pattern-matching automaton. The automaton has a size linear in the sum of the sizes of the filters. We assume that the XML documents all conform to a given DTD; our algorithm utilizes the DTD in the preprocessing phase of the automaton to prune out descendant axes (\/\/) and wildcards (*) from the XPath filters. The XPath subset currently supported consists of linear XPath expressions without predicates. 
In the case of a 683 MB protein-sequence database, we obtained a throughput of 18.8 MB\/sec for 50 000 filters and 17.0 MB\/sec for 500 000 filters, using a SAX parser with a throughput of 27 MB\/sec.<\/jats:p>","DOI":"10.14778\/1454159.1454245","type":"journal-article","created":{"date-parts":[[2014,6,24]],"date-time":"2014-06-24T12:17:57Z","timestamp":1403612277000},"page":"1666-1671","source":"Crossref","is-referenced-by-count":3,"title":["XML-document-filtering automaton"],"prefix":"10.14778","volume":"1","author":[{"given":"Panu","family":"Silvasti","sequence":"first","affiliation":[{"name":"Helsinki University of Technology"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Seppo","family":"Sippu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Eljas","family":"Soisalon-Soininen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2008,8]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/360825.360855"},{"key":"e_1_2_1_2_1","first-page":"53","volume-title":"VLDB","author":"Altinel M.","year":"2000"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/645502.656107"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-002-0077-6"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1095890.1095916"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/958942.958947"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/645483.653613"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1042046.1042051"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/872757.872809"},{"key":"e_1_2_1_10_1","first-page":"217","volume-title":"VLDB '05: Proceedings of the 31st international conference on Very large data bases","author":"Kwon 
J.","year":"2005"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/956863.956928"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1187436.1187438"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/502034.502050"},{"key":"e_1_2_1_14_1","first-page":"321","volume-title":"SPIRE","author":"Soisalon-Soininen E.","year":"2004"},{"key":"e_1_2_1_15_1","unstructured":"D. Suciu. XMLData Repository -- The Database Research Group of University of Washington 2006. http:\/\/www.cs.washington.edu\/research\/xmldatasets\/."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/1454159.1454245","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T09:58:38Z","timestamp":1672221518000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/1454159.1454245"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,8]]},"references-count":15,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2008,8]]}},"alternative-id":["10.14778\/1454159.1454245"],"URL":"https:\/\/doi.org\/10.14778\/1454159.1454245","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2008,8]]}}}