{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,22]],"date-time":"2025-10-22T05:13:16Z","timestamp":1761109996347},"reference-count":30,"publisher":"Springer Science and Business Media LLC","issue":"S2","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2011,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>As the Resource Description Framework (RDF) data model is widely used for modeling and sharing a lot of online bioinformatics resources such as Uniprot (dev.isb-sib.ch\/projects\/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL - a W3C recommendation query for RDF databases - has become an important query language for querying the bioinformatics knowledge bases. Moreover, due to the diversity of users\u2019 requests for extracting information from the RDF data as well as the lack of users\u2019 knowledge about the exact value of each fact in the RDF databases, it is desirable to use the SPARQL query with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most of the existing techniques for processing regular expressions are designed for querying a text corpus, or only for supporting the matching over the paths in an RDF graph.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>In this paper, we propose a novel framework for supporting regular expression processing in SPARQL query. Our contributions can be summarized as follows. 1) We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2) We propose a cost model in order to adapt the proposed framework in the existing query optimizers. 3) We build a prototype for the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>Experiments with a full-blown RDF engine show that our framework outperforms the existing ones by up to two orders of magnitude in processing SPARQL queries with regular expression patterns.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-12-s2-s6","type":"journal-article","created":{"date-parts":[[2011,4,9]],"date-time":"2011-04-09T06:17:03Z","timestamp":1302329823000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Processing SPARQL queries with regular expressions in RDF databases"],"prefix":"10.1186","volume":"12","author":[{"given":"Jinsoo","family":"Lee","sequence":"first","affiliation":[]},{"given":"Minh-Duc","family":"Pham","sequence":"additional","affiliation":[]},{"given":"Jihwan","family":"Lee","sequence":"additional","affiliation":[]},{"given":"Wook-Shin","family":"Han","sequence":"additional","affiliation":[]},{"given":"Hune","family":"Cho","sequence":"additional","affiliation":[]},{"given":"Hwanjo","family":"Yu","sequence":"additional","affiliation":[]},{"given":"Jeong-Hoon","family":"Lee","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2011,3,29]]},"reference":[{"key":"4454_CR1","unstructured":"Dbpedia[ ] http:\/\/wiki.dbpedia.org"},{"key":"4454_CR2","unstructured":"freebase[ ] http:\/\/www.freebase.com"},{"key":"4454_CR3","unstructured":"Semantic web ontology examples http:\/\/openwetware.org\/wiki\/Synthetic_Biology[http:\/\/openwetware.org\/wiki\/Synthetic_Biology:Semantic_web_ontology\/Examples]"},{"key":"4454_CR4","unstructured":"Uniprot[ ] http:\/\/www.uniprot.org\/"},{"key":"4454_CR5","unstructured":"Bio2RDF[ ] http:\/\/bio2rdf.org"},{"key":"4454_CR6","unstructured":"SPARQL Query Language for RDF,[ ] http:\/\/www.w3.org\/TR\/rdf-sparql-query\/"},{"key":"4454_CR7","doi-asserted-by":"publisher","first-page":"39","DOI":"10.1109\/TEC.1960.5221603","volume":"9","author":"R McNaughton","year":"1960","unstructured":"McNaughton R, Yamada H: Regular expressions and state graphs for automata. IEEE Transactions on Electronic Computers 1960, 9: 39\u201347. 10.1109\/TEC.1960.5221603","journal-title":"IEEE Transactions on Electronic Computers"},{"key":"4454_CR8","volume-title":"Programming Techniques: Regular expression search algorithm","author":"K Thompson","year":"1968","unstructured":"Thompson K: Programming Techniques: Regular expression search algorithm. 1968."},{"issue":"4","key":"4454_CR9","doi-asserted-by":"publisher","first-page":"481","DOI":"10.1145\/321239.321249","volume":"11","author":"J Brzozowski","year":"1964","unstructured":"Brzozowski J: Derivatives of regular expressions. Journal of the ACM (JACM) 1964, 11(4):481\u2013494. 10.1145\/321239.321249","journal-title":"Journal of the ACM (JACM)"},{"issue":"3","key":"4454_CR10","doi-asserted-by":"publisher","first-page":"117","DOI":"10.1016\/0304-3975(86)90088-5","volume":"48","author":"G Berry","year":"1986","unstructured":"Berry G, Sethi R: From regular expressions to deterministic automata. Theoretical Computer Science 1986, 48(3):117\u2013126. 10.1016\/0304-3975(86)90088-5","journal-title":"Theoretical Computer Science"},{"issue":"6","key":"4454_CR11","doi-asserted-by":"publisher","first-page":"915","DOI":"10.1145\/235809.235810","volume":"43","author":"R Baeza-Yates","year":"1996","unstructured":"Baeza-Yates R, Gonnet G: Fast text searching for regular expressions or automaton searching on tries. Journal of the ACM (JACM) 1996, 43(6):915\u2013936. 10.1145\/235809.235810","journal-title":"Journal of the ACM (JACM)"},{"key":"4454_CR12","first-page":"419","volume-title":"International conference on data engineering","author":"J Cho","year":"2002","unstructured":"Cho J, Rajagopalan S: A fast regular expression indexing engine. In International conference on data engineering. Published by the IEEE Computer Society; 2002:419\u2013430."},{"key":"4454_CR13","volume-title":"Poster Proceedings of the 4th International Semantic Web Conference (ISWC'05)","author":"F Alkhateeb","year":"2005","unstructured":"Alkhateeb F, Baget J, Euzenat J: Complex path queries for RDF graphs. Poster Proceedings of the 4th International Semantic Web Conference (ISWC'05) 2005."},{"issue":"2","key":"4454_CR14","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1016\/j.websem.2009.02.002","volume":"7","author":"F Alkhateeb","year":"2009","unstructured":"Alkhateeb F, Baget J, Euzenat J: Extending SPARQL with regular expression patterns (for querying RDF). Web Semantics: Science Services and Agents on the World Wide Web 2009, 7(2):57\u201373. 10.1016\/j.websem.2009.02.002","journal-title":"Web Semantics: Science Services and Agents on the World Wide Web"},{"key":"4454_CR15","doi-asserted-by":"publisher","first-page":"145","DOI":"10.1007\/978-3-540-72667-8_12","volume-title":"The Semantic Web: Research and Applications","author":"K Kochut","year":"2007","unstructured":"Kochut K, Janik M: SPARQLeR: Extended SPARQL for semantic association discovery. The Semantic Web: Research and Applications 2007, 145\u2013159. full_text"},{"key":"4454_CR16","doi-asserted-by":"publisher","first-page":"953","DOI":"10.1109\/ICDE.2008.4497504","volume-title":"Proceedings of the 2008 IEEE 24th International Conference on Data Engineering","author":"G Kasneci","year":"2008","unstructured":"Kasneci G, Suchanek F, Ifrim G, Ramanath M, Weikum G: NAGA: Searching and Ranking Knowledge. In Proceedings of the 2008 IEEE 24th International Conference on Data Engineering. IEEE Computer Society; 2008:953\u2013962."},{"key":"4454_CR17","unstructured":"The Gene Ontology,[ ] http:\/\/www.geneontology.org"},{"key":"4454_CR18","first-page":"411","volume-title":"Proceedings of the 33rd international conference on Very large data bases","author":"D Abadi","year":"2007","unstructured":"Abadi D, Marcus A, Madden S, Hollenbach K: Scalable semantic web data management using vertical partitioning. In Proceedings of the 33rd international conference on Very large data bases. VLDB Endowment; 2007:411\u2013422."},{"key":"4454_CR19","doi-asserted-by":"publisher","first-page":"647","DOI":"10.14778\/1453856.1453927","volume":"1","author":"T Neumann","year":"2008","unstructured":"Neumann T, Weikum G: RDF-3X: a RISC-style engine for RDF. Proceedings of the VLDB Endowment 2008, 1: 647\u2013659.","journal-title":"Proceedings of the VLDB Endowment"},{"key":"4454_CR20","unstructured":"ICS-FORTH RDF suite,[ ] http:\/\/athena.ics.forth.gr:9090\/RDF\/"},{"key":"4454_CR21","volume-title":"2nd International Workshop on the Semantic Web","author":"S Alexaki","year":"2001","unstructured":"Alexaki S, Christophides V, Karvounarakis G, Plexousakis D, Tolle K: The ICS-FORTH RDFSuite: Managing voluminous RDF description bases. In 2nd International Workshop on the Semantic Web. Citeseer; 2001."},{"key":"4454_CR22","doi-asserted-by":"publisher","first-page":"54","DOI":"10.1007\/3-540-48005-6_7","volume-title":"The Semantic Web-ISWC 2002","author":"J Broekstra","year":"2002","unstructured":"Broekstra J, Kampman A, Van Harmelen F: Sesame: A generic architecture for storing and querying rdf and rdf schema. The Semantic Web-ISWC 2002 2002, 54\u201368."},{"key":"4454_CR23","unstructured":"Openrdf,[ ] http:\/\/www.openrdf.org\/index.jsp"},{"key":"4454_CR24","unstructured":"Jena: a Semantic Web Framework for Java,[ ] http:\/\/jena.sourceforge.net"},{"key":"4454_CR25","first-page":"7","volume-title":"Proceedings of SWDB","author":"K Wilkinson","year":"2003","unstructured":"Wilkinson K, Sayers C, Kuno H, Reynolds D, et al.: Efficient RDF storage and retrieval in Jena2. In Proceedings of SWDB. Citeseer; 2003:7\u20138."},{"key":"4454_CR26","first-page":"1227","volume-title":"Proceedings of the 31st international conference on Very large data bases","author":"E Chong","year":"2005","unstructured":"Chong E, Das S, Eadon G, Srinivasan J: An efficient SQL-based RDF querying scheme. In Proceedings of the 31st international conference on Very large data bases. VLDB Endowment; 2005:1227."},{"key":"4454_CR27","unstructured":"Oracle technical network, semantic technologies center, http:\/\/www.oracle.com\/technology\/tech\/semantictechnologies\/index.html[http:\/\/www.oracle.com\/technology\/tech\/semantic\\_technologies\/index.html]"},{"key":"4454_CR28","volume-title":"Proceedings of the 31st international conference on Very large data bases","author":"S Chaudhuri","year":"2000","unstructured":"Chaudhuri S, Weikum G: Rethinking database system architecture: Towards a self-tuning RISC-style database system. In Proceedings of the 31st international conference on Very large data bases. Citeseer; 2000."},{"key":"4454_CR29","volume-title":"Introduction to automata theory, languages and computation","author":"JE Hopcroft","year":"1979","unstructured":"Hopcroft JE, Ullman JD: Introduction to automata theory, languages and computation. Addison-Wesley; 1979."},{"issue":"2","key":"4454_CR30","doi-asserted-by":"publisher","first-page":"73","DOI":"10.1145\/152610.152611","volume":"25","author":"G Graefe","year":"1993","unstructured":"Graefe G: Query Evaluation Techniques for Large Databases. ACM Comput. Surv 1993, 25(2):73\u2013170. 10.1145\/152610.152611","journal-title":"ACM Comput. Surv"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-12-S2-S6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T13:32:34Z","timestamp":1630503154000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-12-S2-S6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,3,29]]},"references-count":30,"journal-issue":{"issue":"S2","published-print":{"date-parts":[[2011,12]]}},"alternative-id":["4454"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-12-s2-s6","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,3,29]]},"assertion":[{"value":"29 March 2011","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S6"}}