{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,6]],"date-time":"2026-03-06T00:56:17Z","timestamp":1772758577151,"version":"3.50.1"},"reference-count":60,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2024,3,12]],"date-time":"2024-03-12T00:00:00Z","timestamp":1710201600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2024,3,12]]},"abstract":"<jats:p>Cardinality Estimation over Knowledge Graphs (KG) is crucial for query optimization, yet remains a challenging task due to the semi-structured nature and complex correlations of data in typical KGs. In this work, we propose GNCE, a novel approach that leverages knowledge graph embeddings and Graph Neural Networks (GNN) to accurately predict the cardinality of conjunctive queries over KGs. GNCE first creates semantically meaningful embeddings for all entities in the KG, which are then used to learn a representation of a query using a GNN to estimate the cardinality of the query. We evaluate GNCE on several KGs in terms of q-Error and demonstrate that it outperforms state-of-the-art approaches based on sampling, summaries, and (machine) learning in terms of estimation accuracy while also having a low execution time and few parameters. Additionally, we show that GNCE performs similarly well on real-world queries and can inductively generalize to unseen entities, making it suitable for use in dynamic query processing scenarios. Our proposed approach has the potential to significantly improve query optimization and related applications that rely on accurate cardinality estimates of conjunctive queries.<\/jats:p>","DOI":"10.1145\/3639299","type":"journal-article","created":{"date-parts":[[2024,3,26]],"date-time":"2024-03-26T18:51:32Z","timestamp":1711479092000},"page":"1-26","source":"Crossref","is-referenced-by-count":12,"title":["Cardinality Estimation over Knowledge Graphs with Embeddings and Graph Neural Networks"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0009-0009-7957-603X","authenticated-orcid":false,"given":"Tim","family":"Schwabe","sequence":"first","affiliation":[{"name":"TUM School of Computation, Information and Technology, Technical University of Munich &amp; Ruhr University Bochum, Munich, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1209-2868","authenticated-orcid":false,"given":"Maribel","family":"Acosta","sequence":"additional","affiliation":[{"name":"TUM School of Computation, Information and Technology, Technical University of Munich, Munich, Germany"}]}],"member":"320","published-online":{"date-parts":[[2024,3,26]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/978--3--319--25007--6_7"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-019-00558-9"},{"key":"e_1_2_1_3_1","volume-title":"Groups, Graphs, Geodesics, and Gauges. (4","author":"Bronstein Michael M.","year":"2021","unstructured":"Michael M. Bronstein, Joan Bruna, Taco Cohen, and 2021. Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges. (4 2021). http:\/\/arxiv.org\/abs\/2104.13478"},{"key":"e_1_2_1_4_1","unstructured":"Daniel Casals Carlos Buil-Aranda and Carlos Valle. [n. d.]. SPARQL query execution time prediction using Deep Learning. ( [n. d.])."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i04.5747"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2016.99"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1561\/1900000004"},{"key":"e_1_2_1_8_1","unstructured":"Richard Cyganiak David Wood and Markus Lanthaler. 2014. RDF 1.1 Concepts and Abstract Syntax. W3C Recommendation. W3C. https:\/\/www.w3.org\/TR\/2014\/REC-rdf11-concepts-20140225\/."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.48786\/edbt.2022.07"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.ICDT.2023.8"},{"key":"e_1_2_1_11_1","volume-title":"The Python Standard Library - time. https:\/\/docs.python.org\/3\/library\/time.html. [Online","author":"Foundation Python Software","year":"2023","unstructured":"Python Software Foundation. 2021. The Python Standard Library - time. https:\/\/docs.python.org\/3\/library\/time.html. [Online; accessed 27-February-2023]."},{"key":"e_1_2_1_12_1","doi-asserted-by":"crossref","unstructured":"Lise Getoor Ben Taskar and Daphne Koller. 2001. Selectivity Estimation using Probabilistic Models.","DOI":"10.1145\/375663.375727"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.websem.2005.06.005"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/978--3-030--18579--4_1"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/978--3--319--11955--7_23"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.14778\/3384345.3384349"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447772"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/978--3-030--30793--6_15"},{"key":"e_1_2_1_19_1","volume-title":"Strategies for Pre-training Graph Neural Networks. In 8th International Conference on Learning Representations, ICLR 2020","author":"Hu Weihua","year":"2020","unstructured":"Weihua Hu, Bowen Liu, Joseph Gomes, Marinka Zitnik, Percy Liang, Vijay S. Pande, and Jure Leskovec. 2020. Strategies for Pre-training Graph Neural Networks. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26--30, 2020. OpenReview.net. https:\/\/openreview.net\/forum?id=HJlWWJSFDH"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2063576.2063784"},{"key":"e_1_2_1_21_1","volume-title":"Kingma and Jimmy Ba","author":"Diederik","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7--9, 2015, Conference Track Proceedings, Yoshua Bengio and Yann LeCun (Eds.). http:\/\/arxiv.org\/abs\/1412.6980"},{"key":"e_1_2_1_22_1","volume-title":"Learned Cardinalities: Estimating Correlated Joins with Deep Learning. In 9th Biennial Conference on Innovative Data Systems Research, CIDR","author":"Kipf Andreas","year":"2019","unstructured":"Andreas Kipf, Thomas Kipf, Bernhard Radke, Viktor Leis, Peter A. Boncz, and Alfons Kemper. 2019. Learned Cardinalities: Estimating Correlated Joins with Deep Learning. In 9th Biennial Conference on Innovative Data Systems Research, CIDR 2019, Asilomar, CA, USA, January 13--16, 2019, Online Proceedings. www.cidrdb.org. http:\/\/cidrdb.org\/cidr2019\/papers\/p101-kipf-cidr19.pdf"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/978--3-030--30793--6_20"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.3233\/SW-140134"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2915235"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0925-2312(98)00111-8"},{"key":"e_1_2_1_27_1","volume-title":"1st International Conference on Learning Representations, ICLR","author":"Mikolov Tom\u00e1s","year":"2013","unstructured":"Tom\u00e1s Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient Estimation of Word Representations in Vector Space. In 1st International Conference on Learning Representations, ICLR 2013, Scottsdale, Arizona, USA, May 2--4, 2013, Workshop Track Proceedings, Yoshua Bengio and Yann LeCun (Eds.). http:\/\/arxiv.org\/abs\/1301.3781"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-76298-0_58"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2011.5767868"},{"key":"e_1_2_1_30_1","doi-asserted-by":"crossref","unstructured":"Thomas Neumann and Gerhard Weikum. 2009. The RDF-3X Engine for Scalable Management of RDF Data.","DOI":"10.1007\/s00778-009-0165-y"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3331166"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3389702"},{"key":"e_1_2_1_33_1","volume-title":"RDF2Vec Light - A Lightweight Approach for Knowledge Graph Embeddings. CoRR","author":"Portisch Jan","year":"2020","unstructured":"Jan Portisch, Michael Hladik, and Heiko Paulheim. 2020. RDF2Vec Light - A Lightweight Approach for Knowledge Graph Embeddings. CoRR, Vol. abs\/2009.07659 (2020). showeprint[arXiv]2009.07659 https:\/\/arxiv.org\/abs\/2009.07659"},{"key":"e_1_2_1_34_1","unstructured":"Patryk Preisner and Heiko Paulheim. 2023. Universal Preprocessing Operators for Embedding Knowledge Graphs with Literals. arxiv: 2309.03023 [cs.AI]"},{"key":"e_1_2_1_35_1","unstructured":"Eric Prud'hommeaux and Andy Seaborne. 2008. SPARQL Query Language for RDF. W3C Recommendation. W3C. https:\/\/www.w3.org\/TR\/2008\/REC-rdf-sparql-query-20080115\/."},{"key":"e_1_2_1_36_1","unstructured":"Petar Ristoski and Heiko Paulheim. [n. d.]. RDF2Vec: RDF Graph Embeddings for Data Mining. http:\/\/www.rapidminer.com\/"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46523-4_30"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3424672"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1328854.1328856"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178876.3186003"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/1367497.1367578"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.websem.2008.06.001"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.14778\/3485450.3485459"},{"key":"e_1_2_1_44_1","unstructured":"PyG Team. 2022. CREATING MESSAGE PASSING NETWORKS. https:\/\/pytorch-geometric.readthedocs.io\/en\/latest\/notes\/create_gnn.html. Accessed: 2022--11--29."},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.14778\/3402707.3402724"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","unstructured":"Gilles Vandewiele Bram Steenwinckel Terencio Agozzino and Femke Ongenae. 2022. pyRDF2Vec: A Python Implementation and Extension of RDF2Vec. (2022). https:\/\/doi.org\/10.48550\/ARXIV.2205.02283","DOI":"10.48550\/ARXIV.2205.02283"},{"key":"e_1_2_1_47_1","volume-title":"Attention is all you need. Advances in neural information processing systems","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in neural information processing systems, Vol. 30 (2017)."},{"key":"e_1_2_1_48_1","unstructured":"David Vengerov Andre Cavalheiro Menck Mohamed Zait and Sunil P Chakkappen. 2150. Join Size Estimation Subject to Filter Conditions."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3526163"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00360"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3459637.3482377"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/3329859.3329875"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2020.2978386"},{"key":"e_1_2_1_54_1","volume-title":"7th International Conference on Learning Representations, ICLR 2019","author":"Xu Keyulu","year":"2019","unstructured":"Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2019. How Powerful are Graph Neural Networks?. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6--9, 2019. OpenReview.net. https:\/\/openreview.net\/forum?id=ryGs6iA5Km"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2019\/594"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11280-017-0498--1"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3526156"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3457289"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3183739"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-21348-0_34"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3639299","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3639299","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,21]],"date-time":"2025-08-21T15:18:07Z","timestamp":1755789487000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3639299"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,12]]},"references-count":60,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,3,12]]}},"alternative-id":["10.1145\/3639299"],"URL":"https:\/\/doi.org\/10.1145\/3639299","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,3,12]]}}}