{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,19]],"date-time":"2026-05-19T07:22:21Z","timestamp":1779175341169,"version":"3.51.4"},"reference-count":116,"publisher":"Association for Computing Machinery (ACM)","issue":"6","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,2]]},"abstract":"<jats:p>Causal inference aids researchers in discovering cause-and-effect relationships, leading to scientific insights. Accurate causal estimation requires identifying confounding variables to avoid false discoveries. Pearl's causal model uses causal DAGs to identify confounding variables, but incorrect DAGs can lead to unreliable causal conclusions. However, for high dimensional data, the causal DAGs are often complex beyond human verifiability. Graph summarization is a logical next step, but current methods for general-purpose graph summarization are inadequate for causal DAG summarization. This paper addresses these challenges by proposing a causal graph summarization objective that balances graph simplification for better understanding while retaining essential causal information for reliable inference. We develop an efficient greedy algorithm and show that summary causal DAGs can be directly used for inference and are more robust to misspecification of assumptions, enhancing robustness for causal inference. Experimenting with six real-life datasets, we compared our algorithm to three existing solutions, showing its effectiveness in handling high-dimensional data and its ability to generate summary DAGs that ensure both reliable causal inference and robustness against misspecifications.<\/jats:p>","DOI":"10.14778\/3725688.3725717","type":"journal-article","created":{"date-parts":[[2025,8,29]],"date-time":"2025-08-29T14:19:21Z","timestamp":1756477161000},"page":"1933-1947","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Causal DAG Summarization"],"prefix":"10.14778","volume":"18","author":[{"given":"Anna","family":"Zeng","sequence":"first","affiliation":[{"name":"CSAIL, MIT, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael","family":"Cafarella","sequence":"additional","affiliation":[{"name":"CSAIL, MIT, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Batya","family":"Kenig","sequence":"additional","affiliation":[{"name":"Technion, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Markos","family":"Markakis","sequence":"additional","affiliation":[{"name":"CSAIL, MIT, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Brit","family":"Youngmann","sequence":"additional","affiliation":[{"name":"Technion, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Babak","family":"Salimi","sequence":"additional","affiliation":[{"name":"University of California, San Diego, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,8,29]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2016. Adult Census Income Dataset. https:\/\/www.kaggle.com\/datasets\/uciml\/adult-census-income. Accessed: 2024-04-04."},{"key":"e_1_2_1_2_1","unstructured":"2020. Kaggle Datasets: Flights Delay. https:\/\/www.kaggle.com\/usdot\/flight-delays."},{"key":"e_1_2_1_3_1","unstructured":"2024. Code Repository and Technical Report. https:\/\/github.com\/TechnionTDK\/causalens. Accessed: 2024-07-30."},{"key":"e_1_2_1_4_1","unstructured":"2024. Kaggle Datasets: malicious url detection. https:\/\/www.kaggle.com\/datasets\/pilarpieiro\/tabular-dataset-ready-for-malicious-url-detection. Accessed: 2024-04-04."},{"key":"e_1_2_1_5_1","unstructured":"2024. OpenAI ChatGPT (3.5) [Large language model]. https:\/\/openai.com\/blog\/chatgpt. Accessed: 2024-04-04."},{"key":"e_1_2_1_6_1","unstructured":"2024. SYS_QUERY_HISTORY - Amazon Redshift. https:\/\/docs.aws.amazon.com\/redshift\/latest\/dg\/SYS_QUERY_HISTORY.html. Accessed: 2024-04-04."},{"key":"e_1_2_1_7_1","volume-title":"CausalSim: A Causal Framework for Unbiased Trace-Driven Simulation. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23)","author":"Alomar Abdullah","year":"2023","unstructured":"Abdullah Alomar, Pouya Hamadanian, Arash Nasr-Esfahany, Anish Agarwal, Mohammad Alizadeh, and Devavrat Shah. 2023. CausalSim: A Causal Framework for Unbiased Trace-Driven Simulation. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23). 1115\u20131147."},{"key":"e_1_2_1_8_1","unstructured":"Amazon Web Services. 2024. Amazon Redshift Serverless. https:\/\/aws.amazon.com\/redshift\/redshift-serverless\/."},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence.","author":"Anand Tara V","year":"2023","unstructured":"Tara V Anand, Adele H Ribeiro, Jin Tian, and Elias Bareinboim. 2023. Causal Effect Identification in Cluster DAGs. In Proceedings of the AAAI Conference on Artificial Intelligence."},{"key":"e_1_2_1_10_1","unstructured":"Arthur Asuncion and David Newman. 2007. UCI machine learning repository."},{"key":"e_1_2_1_11_1","volume-title":"Conference on causal learning and reasoning. PMLR, 90\u2013109","author":"Beckers Sander","year":"2022","unstructured":"Sander Beckers. 2022. Causal explanations and XAI. In Conference on causal learning and reasoning. PMLR, 90\u2013109."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33012678"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00224-016-9718-9"},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 2022 International Conference on Management of Data. 2441\u20132447","author":"Bhowmick Sourav S","year":"2022","unstructured":"Sourav S Bhowmick and Byron Choi. 2022. Data-driven visual query interfaces for graphs: Past, present, and (near) future. In Proceedings of the 2022 International Conference on Management of Data. 2441\u20132447."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/988672.988752"},{"key":"e_1_2_1_16_1","unstructured":"Alessandro Castelnovo Riccardo Crupi Fabio Mercorio Mario Mezzanzanica Daniele Potert\u00ec and Daniele Regoli. 2024. Marrying LLMs with Domain Expert Validation for Causal Graph Generation. (2024)."},{"key":"e_1_2_1_17_1","unstructured":"Krzysztof Chalupka Frederick Eberhardt and Pietro Perona. 2016. Multi-level cause-effect systems. In Artificial intelligence and statistics. PMLR 361\u2013369."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.5555\/3020847.3020867"},{"key":"e_1_2_1_19_1","doi-asserted-by":"crossref","first-page":"280","DOI":"10.1137\/20M1362796","article-title":"Causal Structural Learning via Local Graphs","volume":"5","author":"Chen Wenyu","year":"2023","unstructured":"Wenyu Chen, Mathias Drton, and Ali Shojaie. 2023. Causal Structural Learning via Local Graphs. SIAM Journal on Mathematics of Data Science 5, 2 (2023), 280\u2013305.","journal-title":"SIAM Journal on Mathematics of Data Science"},{"key":"e_1_2_1_20_1","first-page":"507","article-title":"Optimal structure identification with greedy search","author":"Chickering D.M","year":"2002","unstructured":"D.M Chickering. 2002. Optimal structure identification with greedy search. JMLR 3, Nov (2002), 507\u2013554.","journal-title":"JMLR 3"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijar.2021.01.001"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.5555\/1618595.1618605"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-006-0004-3"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2470654.2466444"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2213836.2213855"},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the 2022 International Conference on Management of Data. 1598\u20131611","author":"Galhotra Sainyam","year":"2022","unstructured":"Sainyam Galhotra, Amir Gilad, Sudeepa Roy, and Babak Salimi. 2022. Hyper: Hypothetical reasoning with what-if and how-to queries using a probabilistic causal approach. In Proceedings of the 2022 International Conference on Management of Data. 1598\u20131611."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE55515.2023.00213"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3458455"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3445814.3446700"},{"key":"e_1_2_1_30_1","volume-title":"Causal inference in sociological research. Annual review of sociology 36","author":"Gangl Markus","year":"2010","unstructured":"Markus Gangl. 2010. Causal inference in sociological research. Annual review of sociology 36 (2010), 21\u201347."},{"key":"e_1_2_1_31_1","volume-title":"UAI '88: Proceedings of the Fourth Annual Conference on Uncertainty in Artificial Intelligence","author":"Geiger Dan","year":"1988","unstructured":"Dan Geiger and Judea Pearl. 1988. On the logic of causal models. In UAI '88: Proceedings of the Fourth Annual Conference on Uncertainty in Artificial Intelligence, Minneapolis, MN, USA, July 10\u201312, 1988. 3\u201314."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1002\/net.3230200504"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1287\/trsc.1110.0401"},{"key":"e_1_2_1_34_1","volume-title":"Review of causal discovery methods based on graphical models. Frontiers in genetics 10","author":"Glymour Clark","year":"2019","unstructured":"Clark Glymour, Kun Zhang, and Peter Spirtes. 2019. Review of causal discovery methods based on graphical models. Frontiers in genetics 10 (2019), 524."},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 2017 ACM International Conference on Management of Data. 1707\u20131710","author":"Gudmundsdottir Helga","year":"2017","unstructured":"Helga Gudmundsdottir, Babak Salimi, Magdalena Balazinska, Dan RK Ports, and Dan Suciu. 2017. A demonstration of interactive analysis of performance measurements with viska. In Proceedings of the 2017 ACM International Conference on Management of Data. 1707\u20131710."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.5555\/2834389"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1137\/1024022"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1057\/ivs.2009.10"},{"key":"e_1_2_1_39_1","volume-title":"MANM-CS: Data generation for benchmarking causal structure learning from mixed discrete-continuous and nonlinear data. WHY-21 at NeurIPS 2021","author":"Huegle Johannes","year":"2021","unstructured":"Johannes Huegle, Christopher Hagedorn, Lukas Boehme, Mats Poerschke, Jonas Umland, and Rainer Schlosser. 2021. MANM-CS: Data generation for benchmarking causal structure learning from mixed discrete-continuous and nonlinear data. WHY-21 at NeurIPS 2021 (2021)."},{"key":"e_1_2_1_40_1","volume-title":"ACM SIGKDD 2017 Workshop on Interactive Data Exploration and Analytics.","author":"Jin Lisa","year":"2017","unstructured":"Lisa Jin and Danai Koutra. 2017. Ecoviz: Comparative vizualization of time-evolving network summaries. In ACM SIGKDD 2017 Workshop on Interactive Data Exploration and Analytics."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.14778\/3137765.3137825"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2011.07.001"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611973440.11"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.5555\/2517765"},{"key":"e_1_2_1_45_1","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1111\/1467-9868.00340","article-title":"Chain graph models and their causal interpretations","volume":"64","author":"Lauritzen Steffen L","year":"2002","unstructured":"Steffen L Lauritzen and Thomas S Richardson. 2002. Chain graph models and their causal interpretations. Journal of the Royal Statistical Society Series B: Statistical Methodology 64, 3 (2002), 321\u2013348.","journal-title":"Journal of the Royal Statistical Society Series B: Statistical Methodology"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403057"},{"key":"e_1_2_1_47_1","doi-asserted-by":"crossref","unstructured":"Chenhui Li George Baciu and Yunzhe Wang. 2015. Modulgraph: modularity-based visualization of massive graphs. In SIGGRAPH Asia 2015 Visualization in High Performance Computing. 1\u20134.","DOI":"10.1145\/2818517.2818542"},{"key":"e_1_2_1_48_1","volume-title":"The global k-means clustering algorithm. Pattern recognition 36, 2","author":"Likas Aristidis","year":"2003","unstructured":"Aristidis Likas, Nikos Vlassis, and Jakob J Verbeek. 2003. The global k-means clustering algorithm. Pattern recognition 36, 2 (2003), 451\u2013461."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661829.2661862"},{"key":"e_1_2_1_50_1","volume-title":"Graph summarization methods and applications: A survey. ACM computing surveys (CSUR) 51, 3","author":"Liu Yike","year":"2018","unstructured":"Yike Liu, Tara Safavi, Abhilash Dighe, and Danai Koutra. 2018. Graph summarization methods and applications: A survey. ACM computing surveys (CSUR) 51, 3 (2018), 1\u201334."},{"key":"e_1_2_1_51_1","volume-title":"Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1755\u20131764","author":"Maccioni Antonio","year":"2016","unstructured":"Antonio Maccioni and Daniel J Abadi. 2016. Scalable pattern matching over compressed graphs via dedensification. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1755\u20131764."},{"key":"e_1_2_1_52_1","volume-title":"Tom Claassen, Stephan Bongers, Philip Versteeg, and Joris M Mooij.","author":"Magliacane Sara","year":"2018","unstructured":"Sara Magliacane, Thijs Van Ommen, Tom Claassen, Stephan Bongers, Philip Versteeg, and Joris M Mooij. 2018. Domain adaptation by using causal inference to predict invariant conditional distributions. Advances in neural information processing systems 31 (2018)."},{"key":"e_1_2_1_53_1","volume-title":"2016 IEEE 32nd International Conference on Data Engineering (ICDE). IEEE, 109\u2013120","author":"Maneth Sebastian","year":"2016","unstructured":"Sebastian Maneth and Fabian Peternek. 2016. Compressing graphs by grammars. In 2016 IEEE 32nd International Conference on Data Engineering (ICDE). IEEE, 109\u2013120."},{"key":"e_1_2_1_54_1","volume-title":"Brit Youngmann, Trinity Gao, Ziyu Zhang, Rana Shahout, Peter Baile Chen, Chunwei Liu, Ibrahim Sabek, and Michael Cafarella.","author":"Markakis Markos","year":"2024","unstructured":"Markos Markakis, An Bo Chen, Brit Youngmann, Trinity Gao, Ziyu Zhang, Rana Shahout, Peter Baile Chen, Chunwei Liu, Ibrahim Sabek, and Michael Cafarella. 2024. Sawmill: From Logs to Causal Diagnosis of Large Systems. In SIGMOD. 444\u2013447."},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.14778\/3705829.3705836"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/3665601.3669842"},{"key":"e_1_2_1_57_1","volume-title":"Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2013, Prague, Czech Republic, September 23\u201327, 2013, Proceedings, Part II 13","author":"Mehmood Yasir","year":"2013","unstructured":"Yasir Mehmood, Nicola Barbieri, Francesco Bonchi, and Antti Ukkonen. 2013. Csi: Community-level social influence analysis. In Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2013, Prague, Czech Republic, September 23\u201327, 2013, Proceedings, Part II 13. Springer, 48\u201363."},{"key":"e_1_2_1_58_1","first-page":"59","article-title":"Causality in databases","volume":"33","author":"Meliou Alexandra","year":"2010","unstructured":"Alexandra Meliou, Wolfgang Gatterbauer, Joseph Y Halpern, Christoph Koch, Katherine F Moore, and Dan Suciu. 2010. Causality in databases. IEEE Data Engineering Bulletin 33, 3 (2010), 59\u201367.","journal-title":"IEEE Data Engineering Bulletin"},{"key":"e_1_2_1_59_1","volume-title":"Why so? or why no? functional causality for explaining query answers. arXiv preprint arXiv:0912.5340","author":"Meliou Alexandra","year":"2009","unstructured":"Alexandra Meliou, Wolfgang Gatterbauer, Katherine F Moore, and Dan Suciu. 2009. Why so? or why no? functional causality for explaining query answers. arXiv preprint arXiv:0912.5340 (2009)."},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.14778\/2733004.2733070"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/3539597.3570441"},{"key":"e_1_2_1_62_1","volume-title":"Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781","author":"Mikolov Tomas","year":"2013","unstructured":"Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)."},{"key":"e_1_2_1_63_1","volume-title":"Explanation in artificial intelligence: Insights from the social sciences. Artificial intelligence 267","author":"Miller Tim","year":"2019","unstructured":"Tim Miller. 2019. Explanation in artificial intelligence: Insights from the social sciences. Artificial intelligence 267 (2019), 1\u201338."},{"key":"e_1_2_1_64_1","volume-title":"Graphical models for inference with missing data. Advances in neural information processing systems 26","author":"Mohan Karthika","year":"2013","unstructured":"Karthika Mohan, Judea Pearl, and Jin Tian. 2013. Graphical models for inference with missing data. Advances in neural information processing systems 26 (2013)."},{"key":"e_1_2_1_65_1","volume-title":"Proceedings of the 27th ACM SIGSPATIAL international conference on advances in geographic information systems. 33\u201342","author":"Moosavi Sobhan","year":"2019","unstructured":"Sobhan Moosavi, Mohammad Hossein Samavatian, Srinivasan Parthasarathy, Radu Teodorescu, and Rajiv Ramnath. 2019. Accident risk prediction based on heterogeneous sparse data: New dataset and insights. In Proceedings of the 27th ACM SIGSPATIAL international conference on advances in geographic information systems. 33\u201342."},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376661"},{"key":"e_1_2_1_67_1","unstructured":"Xueyan Niu Xiaoyun Li and Ping Li. 2022. Learning Cluster Causal Diagrams: An Information-Theoretic Approach. (2022)."},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1097\/EDE.0000000000000659"},{"key":"e_1_2_1_69_1","unstructured":"OpenAI. 2023. GPT-4 Technical Report. arXiv:2303.08774 [cs.CL]"},{"key":"e_1_2_1_70_1","volume-title":"Proceedings of the 19th Australian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence. 1\u201316","author":"O'donnell RT","year":"2006","unstructured":"RT O'donnell, Ann E Nicholson, B Han, Kevin B Korb, MJ Alam, and LR Hope. 2006. Incorporating expert elicited structural information in the CaMML causal discovery program. In Proceedings of the 19th Australian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence. 1\u201316."},{"key":"e_1_2_1_71_1","volume-title":"Conference on Probabilistic Graphical Models. PMLR, 380\u2013391","author":"Parviainen Pekka","year":"2016","unstructured":"Pekka Parviainen and Samuel Kaski. 2016. Bayesian networks for variable groups. In Conference on Probabilistic Graphical Models. PMLR, 380\u2013391."},{"key":"e_1_2_1_72_1","volume-title":"FAIM'18 Workshop on CausalML","author":"Pashami Sepideh","year":"2018","unstructured":"Sepideh Pashami, Anders Holst, Juhee Bae, and S\u0142awomir Nowaczyk. 2018. Causal discovery using clusters from observational data. In FAIM'18 Workshop on CausalML, Stockholm, Sweden, July 15, 2018."},{"key":"e_1_2_1_73_1","volume-title":"models, reasoning, and inference","author":"Pearl Judea","unstructured":"Judea Pearl. 2000. Causality : models, reasoning, and inference. Cambridge University Press."},{"key":"e_1_2_1_74_1","unstructured":"J. Pearl and D. Mackenzie. 2018. The book of why: the new science of cause and effect. Basic books."},{"key":"e_1_2_1_75_1","doi-asserted-by":"crossref","unstructured":"Sriram Pemmaraju Steven Skiena et al. 2003. Computational discrete mathematics: Combinatorics and graph theory with mathematica\u00ae. Cambridge university press.","DOI":"10.1017\/CBO9781139164849"},{"key":"e_1_2_1_76_1","volume-title":"Conference on Probabilistic Graphical Models. PMLR, 392\u2013402","author":"Pe\u00f1a Jose M","year":"2016","unstructured":"Jose M Pe\u00f1a. 2016. Learning acyclic directed mixed graphs from observations and interventions. In Conference on Probabilistic Graphical Models. PMLR, 392\u2013402."},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1145\/3654963"},{"key":"e_1_2_1_78_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564759"},{"key":"e_1_2_1_79_1","volume-title":"2013 Joint IFSA World Congress and NAFIPS Annual Meeting (IFSA\/NAFIPS). IEEE, 122\u2013127","author":"Puente Cristina","year":"2013","unstructured":"Cristina Puente, Jos\u00e9 Angel Olivas, E Garrido, and R Seisdedos. 2013. Compressing the representation of a causal graph. In 2013 Joint IFSA World Congress and NAFIPS Annual Meeting (IFSA\/NAFIPS). IEEE, 122\u2013127."},{"key":"e_1_2_1_80_1","volume-title":"Proceedings 19th International Conference on Data Engineering (Cat. No. 03CH37405)","author":"Raghavan Sriram","year":"2003","unstructured":"Sriram Raghavan and Hector Garcia-Molina. 2003. Representing web graphs. In Proceedings 19th International Conference on Data Engineering (Cat. No. 03CH37405). IEEE, 405\u2013416."},{"key":"e_1_2_1_81_1","doi-asserted-by":"publisher","DOI":"10.1186\/s40537-018-0121-z"},{"key":"e_1_2_1_82_1","doi-asserted-by":"publisher","DOI":"10.14778\/3554821.3554902"},{"key":"e_1_2_1_83_1","volume-title":"Proceedings of the 2014 ACM SIGMOD international conference on Management of data. 1579\u20131590","author":"Roy Sudeepa","year":"2014","unstructured":"Sudeepa Roy and Dan Suciu. 2014. A formal approach to finding explanations for database queries. In Proceedings of the 2014 ACM SIGMOD international conference on Management of data. 1579\u20131590."},{"key":"e_1_2_1_84_1","volume-title":"Causal consistency of structural equation models. arXiv preprint arXiv:1707.00819","author":"Rubenstein Paul K","year":"2017","unstructured":"Paul K Rubenstein, Sebastian Weichwald, Stephan Bongers, Joris M Mooij, Dominik Janzing, Moritz Grosse-Wentrup, and Bernhard Sch\u00f6lkopf. 2017. Causal consistency of structural equation models. arXiv preprint arXiv:1707.00819 (2017)."},{"key":"e_1_2_1_85_1","doi-asserted-by":"publisher","DOI":"10.1198\/016214504000001880"},{"key":"e_1_2_1_86_1","doi-asserted-by":"crossref","unstructured":"Babak Salimi Johannes Gehrke and Dan Suciu. 2018. Bias in olap queries: Detection explanation and removal. In SIGMOD. 1021\u20131035.","DOI":"10.1145\/3183713.3196914"},{"key":"e_1_2_1_87_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3389759"},{"key":"e_1_2_1_88_1","doi-asserted-by":"publisher","DOI":"10.1145\/3299869.3319901"},{"key":"e_1_2_1_89_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2021.3058954"},{"key":"e_1_2_1_90_1","first-page":"75","article-title":"On Summarizing Large-Scale Dynamic Graphs","volume":"40","author":"Shah Neil","year":"2017","unstructured":"Neil Shah, Danai Koutra, Lisa Jin, Tianmin Zou, Brian Gallagher, and Christos Faloutsos. 2017. On Summarizing Large-Scale Dynamic Graphs. IEEE Data Eng. Bull. 40, 3 (2017), 75\u201388.","journal-title":"IEEE Data Eng. Bull."},{"key":"e_1_2_1_91_1","volume-title":"DoWhy: An End-to-End Library for Causal Inference. arXiv preprint arXiv:2011.04216","author":"Sharma Amit","year":"2020","unstructured":"Amit Sharma and Emre Kiciman. 2020. DoWhy: An End-to-End Library for Causal Inference. arXiv preprint arXiv:2011.04216 (2020)."},{"key":"e_1_2_1_92_1","volume-title":"Visual analysis of large heterogeneous social networks by semantic and structural abstraction","author":"Shen Zeqian","year":"2006","unstructured":"Zeqian Shen, Kwan-Liu Ma, and Tina Eliassi-Rad. 2006. Visual analysis of large heterogeneous social networks by semantic and structural abstraction. IEEE transactions on visualization and computer graphics 12, 6 (2006), 1427\u20131439."},{"key":"e_1_2_1_93_1","article-title":"A linear non-Gaussian acyclic model for causal discovery","volume":"7","author":"Shimizu Shohei","year":"2006","unstructured":"Shohei Shimizu, Patrik O Hoyer, Aapo Hyv\u00e4rinen, Antti Kerminen, and Michael Jordan. 2006. A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research 7, 10 (2006).","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_2_1_94_1","volume-title":"The World Wide Web Conference. 1679\u20131690","author":"Shin Kijung","year":"2019","unstructured":"Kijung Shin, Amol Ghoting, Myunghwan Kim, and Hema Raghavan. 2019. Sweg: Lossless and lossy summarization of web-scale graphs. In The World Wide Web Conference. 1679\u20131690."},{"key":"e_1_2_1_95_1","volume-title":"2013 IEEE International Conference on Big Data. IEEE, 597\u2013605","author":"Shoaran Maryam","year":"2013","unstructured":"Maryam Shoaran, Alex Thomo, and Jens H Weber-Jahnke. 2013. Zero-knowledge private graph summarization. In 2013 IEEE International Conference on Big Data. IEEE, 597\u2013605."},{"key":"e_1_2_1_96_1","volume-title":"Proceedings of the ACM India Joint International Conference on Data Science and Management of Data. Association for Computing Machinery, 46\u201356","author":"Singh Karamjit","year":"2018","unstructured":"Karamjit Singh, Garima Gupta, Vartika Tewari, and Gautam Shroff. 2018. Comparative benchmarking of causal discovery algorithms. In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data. Association for Computing Machinery, 46\u201356."},{"key":"e_1_2_1_97_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2018.2807442"},{"key":"e_1_2_1_98_1","doi-asserted-by":"publisher","DOI":"10.1145\/1374376.1374456"},{"key":"e_1_2_1_99_1","doi-asserted-by":"crossref","unstructured":"P. Spirtes et al. 2000. Causation prediction and search. MIT press.","DOI":"10.7551\/mitpress\/1754.001.0001"},{"key":"e_1_2_1_100_1","first-page":"16846","article-title":"Recovering latent causal factor for generalization to distributional shifts","volume":"34","author":"Sun Xinwei","year":"2021","unstructured":"Xinwei Sun, Botong Wu, Xiangyu Zheng, Chang Liu, Wei Chen, Tao Qin, and Tie-Yan Liu. 2021. Recovering latent causal factor for generalization to distributional shifts. Advances in Neural Information Processing Systems 34 (2021), 16846\u201316859.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_1_101_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376675"},{"key":"e_1_2_1_102_1","volume-title":"Clustering and Structural Robustness in Causal Diagrams. arXiv preprint arXiv:2111.04513","author":"Tikka Santtu","year":"2021","unstructured":"Santtu Tikka, Jouni Helske, and Juha Karvanen. 2021. Clustering and Structural Robustness in Causal Diagrams. arXiv preprint arXiv:2111.04513 (2021)."},{"key":"e_1_2_1_103_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1510479113"},{"key":"e_1_2_1_104_1","volume-title":"Abhinav Kumar, Saketh Bachu, Vineeth N Balasubramanian, and Amit Sharma.","author":"Vashishtha Aniket","year":"2023","unstructured":"Aniket Vashishtha, Abbavaram Gowtham Reddy, Abhinav Kumar, Saketh Bachu, Vineeth N Balasubramanian, and Amit Sharma. 2023. Causal inference using llm-guided discovery. arXiv preprint arXiv:2310.15117 (2023)."},{"key":"e_1_2_1_105_1","volume-title":"UAI '88: Proceedings of the Fourth Annual Conference on Uncertainty in Artificial Intelligence","author":"Verma Thomas","year":"1988","unstructured":"Thomas Verma and Judea Pearl. 1988. Causal networks: semantics and expressiveness. In UAI '88: Proceedings of the Fourth Annual Conference on Uncertainty in Artificial Intelligence, Minneapolis, MN, USA, July 10\u201312, 1988, Ross D. Shachter, Tod S. Levitt, Laveen N. Kanal, and John F. Lemmer (Eds.). North-Holland, 69\u201378."},{"key":"e_1_2_1_106_1","volume-title":"Benelearn'02: Proceedings of the Twelfth Belgian-Dutch Conference on Machine Learning. 103\u2013108","author":"Marco","unstructured":"Marco A Wiering et al. 2002. Evolving causal neural networks. In Benelearn'02: Proceedings of the Twelfth Belgian-Dutch Conference on Machine Learning. 103\u2013108."},{"key":"e_1_2_1_107_1","volume-title":"Proceedings of the 2021 International Conference on Management of Data. 2357\u20132365","author":"Yong Quinton","year":"2021","unstructured":"Quinton Yong, Mahdi Hajiabadi, Venkatesh Srinivasan, and Alex Thomo. 2021. Efficient graph summarization using weighted lsh at billion-scale. In Proceedings of the 2021 International Conference on Management of Data. 2357\u20132365."},{"key":"e_1_2_1_108_1","doi-asserted-by":"publisher","DOI":"10.1145\/3639328"},{"key":"e_1_2_1_109_1","volume-title":"On Explaining Confounding Bias. In 2023 IEEE 39th International Conference on Data Engineering (ICDE). IEEE","author":"Youngmann Brit","year":"2023","unstructured":"Brit Youngmann, Michael Cafarella, Yuval Moskovitch, and Babak Salimi. 2023. On Explaining Confounding Bias. In 2023 IEEE 39th International Conference on Data Engineering (ICDE). IEEE, 1846\u20131859."},{"key":"e_1_2_1_110_1","first-page":"1","article-title":"Causal Data Integration","volume":"16","author":"Youngmann Brit","year":"2023","unstructured":"Brit Youngmann, Michael Cafarella, Babak Salimi, and Zeng Anna. 2023. Causal Data Integration. Proceedings of the VLDB Endowment 16, 1- (2023), 2665\u20132659.","journal-title":"Proceedings of the VLDB Endowment"},{"key":"e_1_2_1_111_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2008.08.001"},{"key":"e_1_2_1_112_1","volume-title":"Causal Discovery via Causal Star Graphs. ACM Transactions on Knowledge Discovery from Data 17, 7","author":"Zhao Boxiang","year":"2023","unstructured":"Boxiang Zhao, Shuliang Wang, Lianhua Chi, Qi Li, Xiaojia Liu, and Jing Geng. 2023. Causal Discovery via Causal Star Graphs. ACM Transactions on Knowledge Discovery from Data 17, 7 (2023), 1\u201324."},{"key":"e_1_2_1_113_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687627.1687709"},{"key":"e_1_2_1_114_1","doi-asserted-by":"publisher","DOI":"10.14778\/3611479.3611498"},{"key":"e_1_2_1_115_1","first-page":"18","article-title":"Overcoming Data Biases: Towards Enhanced Accuracy and Reliability in Machine Learning","volume":"47","author":"Zhu Jiongli","year":"2024","unstructured":"Jiongli Zhu and Babak Salimi. 2024. Overcoming Data Biases: Towards Enhanced Accuracy and Reliability in Machine Learning. IEEE Data Eng. Bull. 47, 1 (2024), 18\u201335.","journal-title":"IEEE Data Eng. Bull."},{"key":"e_1_2_1_116_1","volume-title":"Causal Discovery with Reinforcement Learning. In 8th International Conference on Learning Representations, ICLR 2020","author":"Zhu Shengyu","year":"2020","unstructured":"Shengyu Zhu, Ignavier Ng, and Zhitang Chen. 2020. Causal Discovery with Reinforcement Learning. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26\u201330, 2020. OpenReview.net."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3725688.3725717","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,29]],"date-time":"2025-08-29T14:23:40Z","timestamp":1756477420000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3725688.3725717"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2]]},"references-count":116,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,2]]}},"alternative-id":["10.14778\/3725688.3725717"],"URL":"https:\/\/doi.org\/10.14778\/3725688.3725717","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2025,2]]},"assertion":[{"value":"2025-08-29","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}