{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,23]],"date-time":"2026-03-23T17:33:38Z","timestamp":1774287218494,"version":"3.50.1"},"reference-count":32,"publisher":"Association for Computing Machinery (ACM)","issue":"8","license":[{"start":{"date-parts":[[2023,6,28]],"date-time":"2023-06-28T00:00:00Z","timestamp":1687910400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key R&D Program of China","doi-asserted-by":"crossref","award":["2021YFC3340302"],"award-info":[{"award-number":["2021YFC3340302"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62006051"],"award-info":[{"award-number":["62006051"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100002858","name":"China Postdoctoral Science Foundation","doi-asserted-by":"crossref","award":["2022M720033"],"award-info":[{"award-number":["2022M720033"]}],"id":[{"id":"10.13039\/501100002858","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100021171","name":"Guangdong Basic and Applied Basic Research Foundation","doi-asserted-by":"crossref","award":["2021A1515011995"],"award-info":[{"award-number":["2021A1515011995"]}],"id":[{"id":"10.13039\/501100021171","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"NSFC","doi-asserted-by":"crossref","award":["U1936205, 62172300"],"award-info":[{"award-number":["U1936205, 62172300"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2023,9,30]]},"abstract":"<jats:p>\n            Recently, many regression-based\n            <jats:bold>conditional independence (CI)<\/jats:bold>\n            test methods have been proposed to solve the problem of causal discovery. These methods provide alternatives to test CI of\n            <jats:italic>x,y<\/jats:italic>\n            given\n            <jats:italic>Z<\/jats:italic>\n            by first removing the information of the controlling set\n            <jats:italic>Z<\/jats:italic>\n            from\n            <jats:italic>x<\/jats:italic>\n            and\n            <jats:italic>y<\/jats:italic>\n            , and then testing the independence between the two residuals\n            <jats:italic>\n              R\n              <jats:sub>x,Z<\/jats:sub>\n            <\/jats:italic>\n            and\n            <jats:italic>\n              R\n              <jats:sub>y,Z<\/jats:sub>\n            <\/jats:italic>\n            . When the residuals are linearly uncorrelated, the independence test between them is nontrivial. With the ability to calculate inner product in high-dimensional space, kernel-based methods are usually used to achieve this goal, but they are considerably time-consuming. In this paper, we test the independence between two linear combinations under linear structural equation model. We show that the dependence between the two residuals can be captured by the difference between the similarity of\n            <jats:italic>\n              R\n              <jats:sub>x,Z<\/jats:sub>\n            <\/jats:italic>\n            and\n            <jats:italic>\n              R\n              <jats:sub>y,Z<\/jats:sub>\n            <\/jats:italic>\n            and that of\n            <jats:italic>\n              R\n              <jats:sub>x,Z<\/jats:sub>\n            <\/jats:italic>\n            and\n            <jats:italic>\n              R\n              <jats:sub>r<\/jats:sub>\n            <\/jats:italic>\n            (\n            <jats:italic>\n              R\n              <jats:sub>r<\/jats:sub>\n            <\/jats:italic>\n            is an independent copy of\n            <jats:italic>\n              R\n              <jats:sub>y,Z<\/jats:sub>\n            <\/jats:italic>\n            ) in high-dimensional space. With this result, we provide a new way to test CI based on the similarity between residuals, which is called SCIT \u2014 the abbreviation of Similarity-based CI Testing. Furthermore, we develop two versions of the proposal, called Kernel-SCIT and Neural-SCIT, respectively. Kernel-SCIT calculates the similarity by using kernel functions, while Neural-SCIT approximates the upper bound of the similarity by using deep neural networks. In both algorithms, random permutation tests are performed to control Type I error rate. The proposed tests are evaluated on (conditional) independence test and causal discovery with both synthetic and real datasets. Experimental results show that Kernel-SCIT is simpler yet more efficient and effective than the typical existing kernel-based methods HSIC and KCIT in the cases of small sample size, and Neural-SCIT can significantly boost the performance of CI testing when sufficient samples are available. The source code is available at\n            <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"url\" xlink:href=\"https:\/\/github.com\/xyw5vplus1\/SCIT\">https:\/\/github.com\/xyw5vplus1\/SCIT<\/jats:ext-link>\n            .\n          <\/jats:p>","DOI":"10.1145\/3593810","type":"journal-article","created":{"date-parts":[[2023,4,25]],"date-time":"2023-04-25T12:20:33Z","timestamp":1682425233000},"page":"1-18","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Conditional Independence Test Based on Residual Similarity"],"prefix":"10.1145","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5544-5347","authenticated-orcid":false,"given":"Hao","family":"Zhang","sequence":"first","affiliation":[{"name":"School of Computer Science, Fudan University, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5515-5913","authenticated-orcid":false,"given":"Yewei","family":"Xia","sequence":"additional","affiliation":[{"name":"School of Computer Science, Fudan University, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0738-9958","authenticated-orcid":false,"given":"Kun","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Philosophy, Carnegie Mellon University, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1949-2768","authenticated-orcid":false,"given":"Shuigeng","family":"Zhou","sequence":"additional","affiliation":[{"name":"Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2313-7635","authenticated-orcid":false,"given":"Jihong","family":"Guan","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Technology, Tongji University, China"}]}],"member":"320","published-online":{"date-parts":[[2023,6,28]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"Wicher Pieter Bergsma. 2004. Testing conditional independence for continuous random variables. (2004)."},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1111\/rssb.12340"},{"key":"e_1_3_1_4_2","first-page":"208","volume-title":"International Conference on Machine Learning","author":"Cai Ruichu","year":"2013","unstructured":"Ruichu Cai, Zhenjie Zhang, and Zhifeng Hao. 2013. SADA: A general framework to support robust causation discovery. In International Conference on Machine Learning. 208\u2013216."},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1093\/biomet\/67.3.581"},{"key":"e_1_3_1_6_2","doi-asserted-by":"crossref","first-page":"685","DOI":"10.1109\/FOCS.2016.78","volume-title":"2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS\u201916)","author":"Diakonikolas Ilias","year":"2016","unstructured":"Ilias Diakonikolas and Daniel M. Kane. 2016. A new approach for testing properties of discrete distributions. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS\u201916). IEEE, 685\u2013694."},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.5555\/3020751.3020766"},{"issue":"2","key":"e_1_3_1_8_2","article-title":"Gaussian processes for independence tests with Non-iid data in causal inference.","volume":"7","author":"Flaxman Seth R.","year":"2016","unstructured":"Seth R. Flaxman, Daniel B. Neill, and Alexander J. Smola. 2016. Gaussian processes for independence tests with Non-iid data in causal inference. ACM TIST 7, 2 (2016), 22\u20131.","journal-title":"ACM TIST"},{"issue":"1","key":"e_1_3_1_9_2","first-page":"167","article-title":"Kernel measures of conditional dependence.","volume":"20","author":"Fukumizu Kenji","year":"2007","unstructured":"Kenji Fukumizu, Arthur Gretton, Xiaohai Sun, and Bernhard Sch\u00f6lkopf. 2007. Kernel measures of conditional dependence. Advances in Neural Information Processing Systems 20, 1 (2007), 167\u2013204.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_10_2","first-page":"513","volume-title":"Advances in Neural Information Processing Systems","author":"Gretton Arthur","year":"2006","unstructured":"Arthur Gretton, Karsten M. Borgwardt, Malte Rasch, Bernhard Sch\u00f6lkopf, and Alex J. Smola. 2006. A kernel method for the two-sample-problem. In Advances in Neural Information Processing Systems. 513\u2013520."},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1007\/11564089_7"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neuroimage.2015.10.062"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.123"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/3447548.3467439"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1214\/aoms\/1177730150"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1016\/0893-6080(89)90020-8"},{"key":"e_1_3_1_17_2","volume-title":"3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7\u20139, 2015, Conference Track Proceedings","author":"Kingma Diederik P.","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7\u20139, 2015, Conference Track Proceedings."},{"key":"e_1_3_1_18_2","volume-title":"Uncertainty in Artificial Intelligence","author":"Lee Sanghack","year":"2017","unstructured":"Sanghack Lee and Vasant G. Honavar. 2017. Self-discrepancy conditional independence test. In Uncertainty in Artificial Intelligence, Vol. 33."},{"key":"e_1_3_1_19_2","volume-title":"6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30\u2013May 3, 2018, Conference Track Proceedings","author":"Miyato Takeru","year":"2018","unstructured":"Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida. 2018. Spectral normalization for generative adversarial networks. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30\u2013May 3, 2018, Conference Track Proceedings. OpenReview.net. https:\/\/openreview.net\/forum?id=B1QRgziT-."},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.5555\/1642718"},{"key":"e_1_3_1_21_2","volume-title":"The Book of Why: The New Science of Cause and Effect","author":"Pearl Judea","year":"2018","unstructured":"Judea Pearl and Dana MacKenzie. 2018. The Book of Why: The New Science of Cause and Effect. Basic Books."},{"key":"e_1_3_1_22_2","article-title":"A scalable conditional independence test for nonlinear, non-Gaussian data","author":"Ramsey Joseph D.","year":"2014","unstructured":"Joseph D. Ramsey. 2014. A scalable conditional independence test for nonlinear, non-Gaussian data. arXiv preprint arXiv:1401.5031 (2014).","journal-title":"arXiv preprint arXiv:1401.5031"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4612-0103-8_2"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.1105809"},{"key":"e_1_3_1_25_2","article-title":"Approximate kernel-based conditional independence tests for fast non-parametric causal discovery","author":"Strobl Eric V.","year":"2017","unstructured":"Eric V. Strobl, Kun Zhang, and Shyam Visweswaran. 2017. Approximate kernel-based conditional independence tests for fast non-parametric causal discovery. arXiv preprint arXiv:1702.03877 (2017).","journal-title":"arXiv preprint arXiv:1702.03877"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1017\/S0266466608080341"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/4235.585893"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11555"},{"key":"e_1_3_1_29_2","first-page":"1250","volume-title":"AAAI","author":"Zhang Hao","year":"2017","unstructured":"Hao Zhang, Shuigeng Zhou, Kun Zhang, and Jihong Guan. 2017. Causal discovery using regression-based conditional independence tests. In AAAI. 1250\u20131256."},{"key":"e_1_3_1_30_2","volume-title":"AAAI","author":"Zhang Hao","year":"2021","unstructured":"Hao Zhang, Shuigeng Zhou, Kun Zhang, Jihong Guan, and Ji Zhang. 2021. Testing independence between linear combinations for causal discovery. In AAAI."},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.5555\/1795114.1795190"},{"key":"e_1_3_1_32_2","unstructured":"K. Zhang J. Peters D. Janzing and B. Sch\u00f6lkopf. 2011. Kernel-based conditional independence test and application in causal discovery. AUAI Press Corvallis OR USA 804\u2013813."},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11222-016-9721-7"}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3593810","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3593810","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:47:50Z","timestamp":1750178870000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3593810"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,28]]},"references-count":32,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2023,9,30]]}},"alternative-id":["10.1145\/3593810"],"URL":"https:\/\/doi.org\/10.1145\/3593810","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,28]]},"assertion":[{"value":"2022-09-14","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-04-10","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-06-28","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}