{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T19:00:28Z","timestamp":1757617228984,"version":"3.44.0"},"reference-count":44,"publisher":"SAGE Publications","issue":"5","license":[{"start":{"date-parts":[[2025,5,1]],"date-time":"2025-05-01T00:00:00Z","timestamp":1746057600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2025,5,1]],"date-time":"2025-05-01T00:00:00Z","timestamp":1746057600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Semantic Web: \u2013 Interoperability, Usability, Applicability"],"published-print":{"date-parts":[[2025,5]]},"abstract":"<jats:p>Recent advancements in declarative knowledge graph generation have introduced multiple mapping languages and engines, causing a shift in studies towards optimizing the knowledge graph generation process.  Although these engines commonly generate the knowledge graphs from heterogeneous data sources, sharing the optimization techniques and features remains challenging due to the lack of formal operational semantics. To address this, we propose a set of algebraic mapping operators that define operational semantics for general mapping processes. This algebra, based on the SPARQL algebra, enables reuse of established definitions and strengthens the link between knowledge graph generation and query engines. To evaluate language independence we translated mapping languages ShExML and the RDF Mapping Language (RML) into our algebraic mapping plan.  Our completeness evaluation shows that our algebraic operators cover the operational semantics of RML and partially support ShExML. Additional analysis is required to cover additional features of ShExML such as joining data from two input sources. For performance evaluation, our proof-of-concept algebraic mapping engine exhibits consistent and low memory usage across workloads, getting second place in the Knowledge Graph Construction Workshop's performance challenge. Algebraic mapping operators decouple mapping engines from specific languages, enabling multilingual mapping engines and allowing optimization techniques to be applied independently of the mapping process. This work lays the foundation for theoretical analysis of complexity and expressiveness of mapping languages and enforces consistency in execution semantics of mapping engines. Furthermore, aligning our algebra with SPARQL opens the door to advanced methods such as virtualization for querying heterogeneous data sources.<\/jats:p>","DOI":"10.1177\/22104968251361350","type":"journal-article","created":{"date-parts":[[2025,9,3]],"date-time":"2025-09-03T14:30:01Z","timestamp":1756909801000},"update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":0,"title":["Algebraic Mapping Operators for Knowledge Graph Generation"],"prefix":"10.1177","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9157-7507","authenticated-orcid":false,"given":"Sitt","family":"Min Oo","sequence":"first","affiliation":[{"name":"Department of Engineering and Architecture, University of Ghent \u2013 imec, Ghent, Belgium"}]},{"given":"Ben","family":"De Meester","sequence":"additional","affiliation":[{"name":"Department of Engineering and Architecture, University of Ghent \u2013 imec, Ghent, Belgium"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5118-256X","authenticated-orcid":false,"given":"Ruben","family":"Taelman","sequence":"additional","affiliation":[{"name":"Department of Engineering and Architecture, University of Ghent \u2013 imec, Ghent, Belgium"}]},{"given":"Pieter","family":"Colpaert","sequence":"additional","affiliation":[{"name":"Department of Engineering and Architecture, University of Ghent \u2013 imec, Ghent, Belgium"}]}],"member":"179","published-online":{"date-parts":[[2025,9,3]]},"reference":[{"key":"e_1_3_4_2_1","first-page":"1","article-title":"Morph-KGC: Scalable knowledge graph materialization with mapping partitions","volume":"15","author":"Arenas-Guerrero J.","year":"2022","unstructured":"Arenas-Guerrero J., Chaves-Fraga D., Toledo J., P\u00e9rez M. S., Corcho O. (2022). Morph-KGC: Scalable knowledge graph materialization with mapping partitions. Semantic Web, 15, 1\u201320. https:\/\/doi.org\/10.3233\/sw-223135","journal-title":"Semantic Web"},{"doi-asserted-by":"publisher","key":"e_1_3_4_3_1","DOI":"10.1145\/3555312"},{"unstructured":"Bagaria J. (2023). Set theory. In E. N. Zalta & U. Nodelman (Eds.) The stanford encyclopedia of philosophy Spring 2023 edn. Metaphysics Research Lab Stanford University.","key":"e_1_3_4_4_1"},{"doi-asserted-by":"publisher","key":"e_1_3_4_5_1","DOI":"10.1007\/s13740-012-0008-7"},{"unstructured":"Bizer C. Seaborne A. (2004). D2RQ-treating non-RDF databases as virtual RDF graphs. In Proceedings of the 3rd international semantic web conference (ISWC2004) Vol. 2004. Springer Hiroshima.","key":"e_1_3_4_6_1"},{"doi-asserted-by":"publisher","key":"e_1_3_4_7_1","DOI":"10.1305\/ndjfl\/1093634995"},{"unstructured":"Champin P.-A. (2020). Sophia: A linked data and semantic web toolkit for rust Taipei TW. https:\/\/www2020devtrack.github.io\/site\/schedule","key":"e_1_3_4_8_1"},{"doi-asserted-by":"publisher","key":"e_1_3_4_9_1","DOI":"10.1016\/j.datak.2009.04.001"},{"unstructured":"Chortaras A. Stamou G. (2018). D2RML: Integrating heterogeneous data and web services into custom RDF graphs. In LDOW@WWW. https:\/\/api.semanticscholar.org\/CorpusID:51950275","key":"e_1_3_4_10_1"},{"unstructured":"Cyganiak R. (2012). Tarql: SPARQL for Tables GitHub. https:\/\/github.com\/tarql\/tarql","key":"e_1_3_4_11_1"},{"doi-asserted-by":"crossref","unstructured":"Daga E. Asprino L. Mulholland P. Gangemi A. (2021). Facade-X: An opinionated approach to SPARQL anything. In Further with knowledge graphs \u2013 proceedings of the 17 th international conference on semantic systems 6\u20139 September 2021 Amsterdam The Netherlands Studies on the Semantic Web Vol. 53 (pp.\u00a058\u201373). IOS Press. ISSN 18681158 22150870. https:\/\/doi.org\/10.3233\/SSW210035","key":"e_1_3_4_12_1","DOI":"10.3233\/SSW210035"},{"unstructured":"Delva T. Assche D. V. Heyvaert P. Meester B. Dimou A. (2021). Integrating nested data into knowledge graphs with RML fields. https:\/\/www.semanticscholar.org\/paper\/Integrating-Nested-Data-into-Knowledge-Graphs-with-Delva-Assche\/cfd3929eb7eb98209acea307838be4c9ddc4d33c","key":"e_1_3_4_13_1"},{"unstructured":"Dimou A. Van der Sande M. Colpaert P. Verborgh R. Mannens E. Van de Walle R. (2014). RML: A generic language for integrated RDF mappings of heterogeneous data. In C. Bizer T. Heath S. Auer & T. Berners-Lee (Eds.) Proceedings of the 7th workshop on linked data on the web. CEUR workshop proceedings Vol. 1184. CEUR. ISSN 16130073. http:\/\/ceur-ws.org\/Vol-1184\/ldow2014_paper_01.pdf","key":"e_1_3_4_14_1"},{"doi-asserted-by":"crossref","unstructured":"Freund M. Schmid S. Dorsch R. Harth A. (2024a). FlexRML: A flexible and memory efficient knowledge graph materializer. In A. Mero\u00f1o Pe\u00f1uela A. Dimou R. Troncy O. Hartig M. Acosta M. Alam H. Paulheim & P. Lisena (Eds.) The semantic web Springer Nature Switzerland Cham (pp.\u00a040\u201356). ISBN 978-3-031-60635-9.","key":"e_1_3_4_15_1","DOI":"10.1007\/978-3-031-60635-9_3"},{"unstructured":"Freund M. Schmid S. Dorsch R. Harth A. (2024b). Performance results of FlexRML in the KGCW challenge 2024. In D. Chaves-Fraga A. Dimou A. Iglesias-Molina U. Serles & D. V. Assche (Eds.) Proceedings of the 5th international workshop on knowledge graph construction co-located with 21th extended semantic web conference (ESWC 2024) Hersonissos Greece May 27 2024 CEUR Workshop Proceedings Vol. 3718. CEUR-WS.org. https:\/\/ceur-ws.org\/Vol-3718\/paper9.pdf","key":"e_1_3_4_16_1"},{"doi-asserted-by":"publisher","key":"e_1_3_4_17_1","DOI":"10.7717\/peerj-cs.318"},{"doi-asserted-by":"crossref","unstructured":"G\u00f6ssner S. Normington G. Bormann C. (2024). JSONPath: Query expressions for JSON Request for Comments RFC Editor. https:\/\/doi.org\/10.17487\/RFC9535. https:\/\/www.rfc-editor.org\/info\/rfc9535","key":"e_1_3_4_18_1","DOI":"10.17487\/RFC9535"},{"unstructured":"Haesendonck G. Maroy W. Heyvaert P. Verborgh R. Dimou A. (2019). Parallel RDF generation from heterogeneous big data. In S. Groppe & L. Gruenwald (Eds.) Proceedings of the international workshop on semantic big data - SBD \u201919 SBD \u201919 ACM Press Amsterdam Netherlands. ISBN 978-1-4503-6766-0. https:\/\/doi.org\/10.1145\/3323878.3325802. https:\/\/biblio.ugent.be\/publication\/8619808\/file\/8659668.pdf","key":"e_1_3_4_19_1"},{"unstructured":"Halmos P. R. (1998). Naive set theory undergraduate texts in mathematics. Springer New York. ISBN 9780387900926. https:\/\/books.google.be\/books?id=x6cZBQ9qtgoC","key":"e_1_3_4_20_1"},{"doi-asserted-by":"publisher","key":"e_1_3_4_21_1","DOI":"10.3233\/SW-223224"},{"doi-asserted-by":"crossref","unstructured":"Iglesias-Molina A. Van Assche D. Arenas-Guerrero J. De Meester B. Debruyne C. Jozashoori S. Maria P. Michel F. Chaves-Fraga D. Dimou A. (2023). The RML ontology: A community-driven modular redesign after a decade of experience in mapping heterogeneous data to RDF. In Proceedings of the international semantic web conference (ISWC) Lecture Notes in Computer Science. Springer Cham (pp.\u00a0152\u2013175). ISSN 1611-3349. ISBN 9783031472435. https:\/\/doi.org\/10.1007\/978-3-031-47243-5_9","key":"e_1_3_4_22_1","DOI":"10.1007\/978-3-031-47243-5_9"},{"doi-asserted-by":"crossref","unstructured":"Iglesias E. Jozashoori S. Chaves-Fraga D. Collarana D. Vidal M.-E. (2020). SDM-RDFizer: An RML interpreter for the efficient creation of rdf knowledge graphs. In Proceedings of the 29th ACM international conference on information & knowledge management. ACM. https:\/\/doi.org\/10.1145\/3340531.3412881","key":"e_1_3_4_23_1","DOI":"10.1145\/3340531.3412881"},{"doi-asserted-by":"publisher","key":"e_1_3_4_24_1","DOI":"10.1016\/j.websem.2022.100755"},{"doi-asserted-by":"crossref","unstructured":"Lefran\u00e7ois M. Zimmermann A. Bakerally N. (2017). A SPARQL extension for generating RDF from heterogeneous formats. In E. Blomqvist D. Maynard A. Gangemi R. Hoekstra P. Hitzler & O. Hartig (Eds.) The semantic web 14th international conference ESWC 2017 Portoro\u017e Slovenia May 28 \u2013 June 1 2017 Proceedings. Springer International Publishing Portoroz Slovenia (pp.\u00a035\u201350). ISBN 978-3-319-58068-5. https:\/\/doi.org\/10.1007\/978-3-319-58068-5_3. http:\/\/www.maxime-lefrancois.info\/docs\/LefrancoisZimmermannBakerally-ESWC2017-Generate.pdf","key":"e_1_3_4_25_1","DOI":"10.1007\/978-3-319-58068-5_3"},{"unstructured":"Lopes N. Bischof S. Decker S. Polleres A. (2011). On the semantics of heterogeneous querying of relational XML and RDF data with XSPARQL. In Proceedings of the 15th Portuguese conference on artificial intelligence (EPIA 2011) Lisbon Portugal (pp.\u00a010\u201313). Citeseer.","key":"e_1_3_4_26_1"},{"doi-asserted-by":"crossref","unstructured":"Michel F. Djimenou L. Faron-Zucker C. Montagnat J. (2015). Translation of heterogeneous databases into RDF and application to the construction of a SKOS taxonomical reference. In International conference on web information systems and technologies (pp.\u00a0275\u2013296). Springer. https:\/\/doi.org\/10.1007\/978-3-319-30996-5_14","key":"e_1_3_4_27_1","DOI":"10.1007\/978-3-319-30996-5_14"},{"unstructured":"Min Oo S. De Meester B. Taelman R. Colpaert P. (2023). Towards algebraic mapping operators for knowledge graph construction (p. 5). ISBN 978-3-031-47239-8.","key":"e_1_3_4_28_1"},{"doi-asserted-by":"crossref","unstructured":"Min Oo S. Haesendonck G. De Meester B. Dimou A. (2022). RMLStreamer-SISO: An RDF stream generator from streaming heterogeneous data. In U. Sattler A. Hogan M. Keet V. Presutti J. P. A. Almeida H. Takeda P. Monnin G. Pirr\u00f2 & C. d\u2019Amato (Eds.) The semantic web \u2013 ISWC 2022 Springer International Publishing Cham (pp.\u00a0697\u2013713). Springer. ISBN 978-3-031-19433-7. https:\/\/doi.org\/10.1007\/978-3-031-19433-7_40","key":"e_1_3_4_29_1","DOI":"10.1007\/978-3-031-19433-7_40"},{"doi-asserted-by":"crossref","unstructured":"Min Oo S. Hartig O. (2025). An algebraic foundation for knowledge graph construction. In Proceedings of the 22nd extended semantic web conference (ESWC) Springer Nature Switzerland extend version available at https:\/\/arxiv.org\/abs\/2503.10385","key":"e_1_3_4_30_1","DOI":"10.1007\/978-3-031-94575-5_1"},{"unstructured":"Min Oo S. Verbeken T. De Meester B. (2024). RMLWeaver-JS: An algebraic mapping engine in the KGCW challenge 2024. In D. Chaves-Fraga A. Dimou A. Iglesias-Molina U. Serles & D. V. Assche (Eds.) Proceedings of the 5th international workshop on knowledge graph construction co-located with 21th extended semantic web conference (ESWC 2024) Hersonissos Greece May 27 2024 CEUR workshop proceedings Vol. 3718. CEUR-WS.org. https:\/\/ceur-ws.org\/Vol-3718\/paper8.pdf","key":"e_1_3_4_31_1"},{"doi-asserted-by":"publisher","key":"e_1_3_4_32_1","DOI":"10.1145\/1567274.1567278"},{"doi-asserted-by":"crossref","unstructured":"Priyatna F. Corcho O. Sequeda J. (2014). Formalisation and experiences of R2RML-based SPARQL to SQL query translation using morph. In Proceedings of the 23rd international conference on world wide web WWW \u201914 Association for Computing Machinery New York NY USA (pp.\u00a0479\u2013490). ISBN 9781450327442. https:\/\/doi.org\/10.1145\/2566486.2567981","key":"e_1_3_4_33_1","DOI":"10.1145\/2566486.2567981"},{"unstructured":"Prud\u2019hommeaux E. Boneva I. Labra Gayo J. E. Kellogg G. (2018). Shape expressions language 2.1 draft community group report world wide web consortium (W3C). http:\/\/shex.io\/shex-semantics\/","key":"e_1_3_4_34_1"},{"unstructured":"Scrocca M. Carenini A. Grassi M. Comerio M. Celino I. (2024). Not everybody speaks RDF: Knowledge conversion between different data representations. In D. Chaves-Fraga A. Dimou A. Iglesias-Molina U. Serles & D. V. Assche (Eds.) Proceedings of the 5th international workshop on knowledge graph construction co-located with 21th extended semantic web conference (ESWC 2024) Hersonissos Greece May 27 2024 CEUR workshop proceedings Vol. 3718. CEUR-WS.org. https:\/\/ceur-ws.org\/Vol-3718\/paper3.pdf","key":"e_1_3_4_35_1"},{"unstructured":"Seaborne A. Harris S. (2013). SPARQL 1.1 query language W3C recommendation W3C. https:\/\/www.w3.org\/TR\/2013\/REC-sparql11-query-20130321\/","key":"e_1_3_4_36_1"},{"unstructured":"Simsek U. K\u00e4rle E. Fensel D. A. (2019). RocketRML - A NodeJS implementation of a use case specific RML mapper. ArXiv abs\/1903.04969. https:\/\/doi.org\/10.48550\/ARXIV.1903.04969","key":"e_1_3_4_37_1"},{"unstructured":"Stadler C. Bin S. (2024). KGCW2024 challenge report: RDFProcessingToolkit. In D. Chaves-Fraga A. Dimou A. Iglesias-Molina U. Serles & D. V. Assche (Eds.) Proceedings of the 5th international workshop on knowledge graph construction co-located with 21th extended semantic web conference (ESWC 2024) Hersonissos Greece May 27 2024 CEUR Workshop Proceedings Vol. 3718. CEUR-WS.org. https:\/\/ceur-ws.org\/Vol-3718\/paper13.pdf","key":"e_1_3_4_38_1"},{"unstructured":"Stadler C. Unbehauen J. Westphal P. Sherif M. A. Lehmann J. (2015). Simplified RDB2RDF mapping. In LDOW@WWW. https:\/\/api.semanticscholar.org\/CorpusID:18692672","key":"e_1_3_4_39_1"},{"unstructured":"Sundara S. Das S. Cyganiak R. (2012). R2RML: RDB to RDF mapping language W3C recommendation W3C. https:\/\/www.w3.org\/TR\/2012\/REC-r2rml-20120927\/","key":"e_1_3_4_40_1"},{"doi-asserted-by":"crossref","unstructured":"Unbehauen J. Stadler C. Auer S. (2013). Optimizing SPARQL-to-SQL rewriting. In Proceedings of international conference on information integration and web-based applications & services IIWAS \u201913 Association for Computing Machinery New York NY USA (pp.\u00a0324\u2013330). ISBN 9781450321136. https:\/\/doi.org\/10.1145\/2539150.2539247","key":"e_1_3_4_41_1","DOI":"10.1145\/2539150.2539247"},{"unstructured":"Van Assche D. Chaves-Fraga D. Dimou A. Serles U. Iglesias A. (2024). KGCW 2024 Challenge @ ESWC 2024 Zenodo. https:\/\/doi.org\/10.5281\/zenodo.11577087","key":"e_1_3_4_42_1"},{"doi-asserted-by":"publisher","key":"e_1_3_4_43_1","DOI":"10.1016\/j.websem.2022.100753"},{"doi-asserted-by":"crossref","unstructured":"Van Assche D. Haesendonck G. De Mulder G. Delva T. Heyvaert P. De Meester B. Dimou A. (2021). Leveraging web of things W3C recommendations for knowledge graphs generation. In Web engineering 21st international conference ICWE 2021 Proceedings (pp.\u00a0337\u2013352). https:\/\/doi.org\/10.1007\/978-3-030-74296-6_26. https:\/\/dylanvanassche.be\/assets\/pdf\/icwe2021-wot-logical-target.pdf","key":"e_1_3_4_44_1","DOI":"10.1007\/978-3-030-74296-6_26"},{"doi-asserted-by":"crossref","unstructured":"Vu B. Pujara J. Knoblock C. A. (2019). D-REPR: A language for describing and mapping diversely-structured data sources to RDF. In Proceedings of the 10th international conference on knowledge capture K-CAP \u201919. Association for Computing Machinery New York NY USA (pp.\u00a0189\u2013196). ISBN 9781450370080. https:\/\/doi.org\/10.1145\/3360901.3364449","key":"e_1_3_4_45_1","DOI":"10.1145\/3360901.3364449"}],"container-title":["Semantic Web: \u2013 Interoperability, Usability, Applicability"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/22104968251361350","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/22104968251361350","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/22104968251361350","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,6]],"date-time":"2025-09-06T01:19:57Z","timestamp":1757121597000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/22104968251361350"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5]]},"references-count":44,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2025,5]]}},"alternative-id":["10.1177\/22104968251361350"],"URL":"https:\/\/doi.org\/10.1177\/22104968251361350","relation":{},"ISSN":["1570-0844","2210-4968"],"issn-type":[{"type":"print","value":"1570-0844"},{"type":"electronic","value":"2210-4968"}],"subject":[],"published":{"date-parts":[[2025,5]]},"article-number":"22104968251361350"}}