{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,23]],"date-time":"2025-12-23T16:29:25Z","timestamp":1766507365171,"version":"3.48.0"},"reference-count":81,"publisher":"Association for Computing Machinery (ACM)","issue":"2","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Inf. Syst."],"published-print":{"date-parts":[[2026,2,28]]},"abstract":"<jats:p>\n                    Click-Through Rate (CTR) prediction, which estimates the probability of a user clicking on a given item, is a critical task for online information services. Existing approaches often make strong assumptions that training and test data come from the same distribution. However, the data distribution varies since user interests are constantly evolving, resulting in the Out-of-Distribution (OOD) issue. In addition, users tend to have multiple interests, some of which evolve faster than others. Toward this end, we propose Disentangled Click-Through Rate Prediction (DiseCTR), which introduces a causal perspective of recommendation and disentangles multiple aspects of user interests to alleviate the OOD issue in recommendation. We conduct a causal factorization of CTR prediction involving user interest, exposure model, and click model, based on which we develop a deep learning implementation for these three causal mechanisms. Specifically, we first design an interest encoder with sparse attention which maps raw features to user interests and then introduce a weakly supervised interest disentangler to learn independent interest embeddings, which are further integrated by an attentive interest aggregator for prediction. Experimental results on three real-world datasets show that DiseCTR achieves the best accuracy and robustness in OOD recommendation against state-of-the-art approaches, significantly improving AUC and GAUC by over 0.02 and reducing logloss by over 13.7%. 
Further analyses demonstrate that DiseCTR successfully disentangles user interests, which is the key to OOD generalization for CTR prediction. We have released the code and data at\n                    <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/DavyMorgan\/DiseCTR\/\">https:\/\/github.com\/DavyMorgan\/DiseCTR\/<\/jats:ext-link>\n                    .\n                  <\/jats:p>","DOI":"10.1145\/3777368","type":"journal-article","created":{"date-parts":[[2025,11,14]],"date-time":"2025-11-14T11:18:18Z","timestamp":1763119098000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Disentangled Interest Network for Out-of-Distribution CTR\u00a0Prediction"],"prefix":"10.1145","volume":"44","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1837-6730","authenticated-orcid":false,"given":"Yu","family":"Zheng","sequence":"first","affiliation":[{"name":"Tsinghua University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7561-5646","authenticated-orcid":false,"given":"Chen","family":"Gao","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7886-9238","authenticated-orcid":false,"given":"Jianxin","family":"Chang","sequence":"additional","affiliation":[{"name":"Beijing Kuaishou Technology Co. Ltd., Haidian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2083-518X","authenticated-orcid":false,"given":"Yanan","family":"Niu","sequence":"additional","affiliation":[{"name":"Beijing Kuaishou Technology Co. 
Ltd., Haidian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1714-5527","authenticated-orcid":false,"given":"Yang","family":"Song","sequence":"additional","affiliation":[{"name":"Beijing Kuaishou Technology Co. Ltd., Haidian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0419-5514","authenticated-orcid":false,"given":"Depeng","family":"Jin","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3094-7735","authenticated-orcid":false,"given":"Meng","family":"Wang","sequence":"additional","affiliation":[{"name":"Hefei University of Technology, Hefei, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5617-1659","authenticated-orcid":false,"given":"Yong","family":"Li","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,12,23]]},"reference":[{"key":"e_1_3_2_2_2","first-page":"265","volume-title":"OSDI \u201916","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. Tensorflow: A system for large-scale machine learning. In OSDI \u201916, 265\u2013283."},{"key":"e_1_3_2_3_2","unstructured":"Martin Arjovsky L\u00e9on Bottou Ishaan Gulrajani and David Lopez-Paz. 2019. Invariant risk minimization. arXiv:1907.02893. Retrieved from https:\/\/arxiv.org\/abs\/1907.02893"},{"key":"e_1_3_2_4_2","unstructured":"Yoshua Bengio Tristan Deleu Nasim Rahaman Rosemary Ke S\u00e9bastien Lachapelle Olexa Bilaniuk Anirudh Goyal and Christopher Pal. 2019. 
A meta-transfer objective for learning to disentangle causal mechanisms. arXiv:1901.10912. Retrieved from https:\/\/arxiv.org\/abs\/1901.10912"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/3404835.3462968"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/3373807"},{"key":"e_1_3_2_7_2","doi-asserted-by":"crossref","unstructured":"Jiawei Chen Hande Dong Yang Qiu Xiangnan He Xin Xin Liang Chen Guli Lin and Keping Yang. 2021. AutoDebias: Learning to debias for recommendation. arXiv:2105.04170. Retrieved from https:\/\/arxiv.org\/abs\/2105.04170","DOI":"10.1145\/3404835.3462919"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/2988450.2988454"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i04.5768"},{"key":"e_1_3_2_10_2","first-page":"1180","volume-title":"ICML","author":"Ganin Yaroslav","year":"2015","unstructured":"Yaroslav Ganin and Victor Lempitsky. 2015. Unsupervised domain adaptation by backpropagation. In ICML, 1180\u20131189."},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3594871"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/3639048"},{"key":"e_1_3_2_13_2","unstructured":"Huifeng Guo Ruiming Tang Yunming Ye Zhenguo Li and Xiuqiang He. 2017. DeepFM: A factorization-machine based neural network for CTR prediction. arXiv:1703.04247. Retrieved from https:\/\/arxiv.org\/abs\/1703.04247"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330670"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/3077136.3080777"},{"key":"e_1_3_2_16_2","first-page":"410","volume-title":"WWW","author":"He Yue","year":"2022","unstructured":"Yue He, Zimu Wang, Peng Cui, Hao Zou, Yafeng Zhang, Qiang Cui, and Yong Jiang. 2022. CausPref: Causal preference learning for out-of-distribution recommendation. 
In WWW, 410\u2013421."},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00823"},{"key":"e_1_3_2_18_2","unstructured":"Irina Higgins Loic Matthey Arka Pal Christopher Burgess Xavier Glorot Matthew Botvinick Shakir Mohamed and Alexander Lerchner. 2016. beta-vae: Learning basic visual concepts with a constrained variational framework. Retrieved from https:\/\/openreview.net\/forum?id=Sy2fzU9gl"},{"key":"e_1_3_2_19_2","unstructured":"Diederik P. Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv:1412.6980. Retrieved from https:\/\/arxiv.org\/abs\/1412.6980"},{"key":"e_1_3_2_20_2","unstructured":"Diederik P. Kingma and Max Welling. 2013. Auto-encoding variational bayes. arXiv:1312.6114. Retrieved from https:\/\/arxiv.org\/abs\/1312.6114"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/3340531.3412745"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3446427"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3220023"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403314"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401087"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/3329188"},{"key":"e_1_3_2_27_2","first-page":"4114","volume-title":"ICML","author":"Locatello Francesco","year":"2019","unstructured":"Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Raetsch, Sylvain Gelly, Bernhard Sch\u00f6lkopf, and Olivier Bachem. 2019. Challenging common assumptions in the unsupervised learning of disentangled representations. In ICML, 4114\u20134124."},{"key":"e_1_3_2_28_2","first-page":"6348","volume-title":"ICML","author":"Locatello Francesco","year":"2020","unstructured":"Francesco Locatello, Ben Poole, Gunnar R\u00e4tsch, Bernhard Sch\u00f6lkopf, Olivier Bachem, and Michael Tschannen. 2020. Weakly-supervised disentanglement without compromises. 
In ICML, 6348\u20136359."},{"key":"e_1_3_2_29_2","unstructured":"Francesco Locatello Michael Tschannen Stefan Bauer Gunnar R\u00e4tsch Bernhard Sch\u00f6lkopf and Olivier Bachem. 2019. Disentangling factors of variation using few labels. arXiv:1905.01258. Retrieved from https:\/\/arxiv.org\/abs\/1905.01258"},{"key":"e_1_3_2_30_2","unstructured":"Chaochao Lu Yuhuai Wu Jos\u00e9 Miguel Hern\u00e1ndez-Lobato and Bernhard Sch\u00f6lkopf. 2021. Nonlinear invariant risk minimization: A causal approach. arXiv:2102.12353. Retrieved from https:\/\/arxiv.org\/abs\/2102.12353"},{"key":"e_1_3_2_31_2","volume-title":"ICLR","author":"Lu Chaochao","year":"2022","unstructured":"Chaochao Lu, Yuhuai Wu, Jos\u00e9 Miguel Hern\u00e1ndez-Lobato, and Bernhard Sch\u00f6lkopf. 2022. Invariant causal representation learning for out-of-distribution generalization. In ICLR."},{"key":"e_1_3_2_32_2","unstructured":"Minh-Thang Luong Hieu Pham and Christopher D. Manning. 2015. Effective approaches to attention-based neural machine translation. arXiv:1508.04025. Retrieved from https:\/\/arxiv.org\/abs\/1508.04025"},{"key":"e_1_3_2_33_2","unstructured":"Jianxin Ma Chang Zhou Peng Cui Hongxia Yang and Wenwu Zhu. 2019. Learning disentangled representations for recommendation. arXiv:1910.14238. Retrieved from https:\/\/arxiv.org\/abs\/1910.14238"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403091"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/3606369"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.5555\/3360093"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1018"},{"key":"e_1_3_2_38_2","volume-title":"The Book of Why: The New Science of Cause and Effect","author":"Pearl Judea","year":"2018","unstructured":"Judea Pearl and Dana Mackenzie. 2018. The Book of Why: The New Science of Cause and Effect. 
Basic Books."},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.5555\/3202377"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330666"},{"key":"e_1_3_2_41_2","first-page":"669","volume-title":"WWW","author":"Punjabi Surabhi","year":"2018","unstructured":"Surabhi Punjabi and Priyanka Bhatt. 2018. Robust factorization machines for user response prediction. In WWW, 669\u2013678."},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3579354"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401440"},{"issue":"1","key":"e_1_3_2_44_2","first-page":"1","article-title":"Learning from hierarchical structure of knowledge graph for recommendation","volume":"42","author":"Qin Yingrong","year":"2023","unstructured":"Yingrong Qin, Chen Gao, Shuangqing Wei, Yue Wang, Depeng Jin, Jian Yuan, Lin Zhang, Dong Li, Jianye Hao, and Yong Li. 2023. Learning from hierarchical structure of knowledge graph for recommendation. ACM Transactions on Information Systems 42, 1 (2023), 1\u201324.","journal-title":"ACM Transactions on Information Systems"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3539618.3591665"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2010.127"},{"key":"e_1_3_2_47_2","unstructured":"Steffen Rendle Christoph Freudenthaler Zeno Gantner and Lars Schmidt-Thieme. 2012. BPR: Bayesian personalized ranking from implicit feedback. arXiv:1205.2618. Retrieved from https:\/\/arxiv.org\/abs\/1205.2618"},{"key":"e_1_3_2_48_2","unstructured":"Bernhard Sch\u00f6lkopf. 2019. Causality for machine learning. arXiv:1911.10500. 
Retrieved from https:\/\/arxiv.org\/abs\/1911.10500"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2021.3058954"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939704"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403269"},{"key":"e_1_3_2_52_2","unstructured":"Zheyan Shen Jiashuo Liu Yue He Xingxuan Zhang Renzhe Xu Han Yu and Peng Cui. 2021. Towards out-of-distribution generalization: A survey. arXiv:2108.13624. Retrieved from https:\/\/arxiv.org\/abs\/2108.13624"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3357925"},{"key":"e_1_3_2_54_2","first-page":"5998","volume-title":"NeurIPS","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In NeurIPS, 5998\u20136008."},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1145\/3522673"},{"key":"e_1_3_2_56_2","article-title":"Collaborative recurrent autoencoder: Recommend while learning to fill in the blanks","volume":"29","author":"Wang Hao","year":"2016","unstructured":"Hao Wang, Xingjian Shi, and Dit-Yan Yeung. 2016. Collaborative recurrent autoencoder: Recommend while learning to fill in the blanks. Advances in Neural Information Processing Systems 29 (2016).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1145\/2783258.2783273"},{"key":"e_1_3_2_58_2","first-page":"3562","volume-title":"WWW","author":"Wang Wenjie","year":"2022","unstructured":"Wenjie Wang, Xinyu Lin, Fuli Feng, Xiangnan He, Min Lin, and Tat-Seng Chua. 2022. Causal representation learning for out-of-distribution recommendation. 
In WWW, 3562\u20133571."},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.1145\/3442381.3450133"},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401137"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2023\/260"},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1145\/3404835.3462914"},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.5555\/3172077.3172324"},{"key":"e_1_3_2_64_2","unstructured":"Sang Michael Xie Ananya Kumar Robbie Jones Fereshte Khani Tengyu Ma and Percy Liang. 2020. In-n-out: Pre-training and self-training using auxiliary information for out-of-distribution robustness. arXiv:2012.04550. Retrieved from https:\/\/arxiv.org\/abs\/2012.04550"},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i04.6119"},{"key":"e_1_3_2_66_2","doi-asserted-by":"publisher","DOI":"10.1145\/3587693"},{"key":"e_1_3_2_67_2","first-page":"23519","article-title":"Towards a theoretical framework of out-of-distribution generalization","volume":"34","author":"Ye Haotian","year":"2021","unstructured":"Haotian Ye, Chuanlong Xie, Tianle Cai, Ruichen Li, Zhenguo Li, and Liwei Wang. 2021. Towards a theoretical framework of out-of-distribution generalization. 
NeurIPS 34 (2021), 23519\u201323531.","journal-title":"NeurIPS"},{"key":"e_1_3_2_68_2","doi-asserted-by":"publisher","DOI":"10.1145\/3340531.3412077"},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.1145\/3580305.3599277"},{"key":"e_1_3_2_70_2","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939673"},{"key":"e_1_3_2_71_2","doi-asserted-by":"publisher","DOI":"10.1145\/3626772.3657777"},{"key":"e_1_3_2_72_2","doi-asserted-by":"publisher","DOI":"10.1145\/3511469"},{"key":"e_1_3_2_73_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2024.3361482"},{"key":"e_1_3_2_74_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00533"},{"key":"e_1_3_2_75_2","doi-asserted-by":"crossref","unstructured":"Yang Zhang Fuli Feng Xiangnan He Tianxin Wei Chonggang Song Guohui Ling and Yongdong Zhang. 2021. Causal intervention for leveraging popularity bias in recommendation. arXiv:2105.06067. Retrieved from https:\/\/arxiv.org\/abs\/2105.06067","DOI":"10.1145\/3404835.3462875"},{"key":"e_1_3_2_76_2","first-page":"2256","volume-title":"WWW","author":"Zheng Yu","year":"2022","unstructured":"Yu Zheng, Chen Gao, Jianxin Chang, Yanan Niu, Yang Song, Depeng Jin, and Yong Li. 2022. Disentangling long and short-term interests for recommendation. In WWW, 2256\u20132267."},{"key":"e_1_3_2_77_2","first-page":"401","volume-title":"WWW","author":"Zheng Yu","year":"2021","unstructured":"Yu Zheng, Chen Gao, Liang Chen, Depeng Jin, and Yong Li. 2021. DGCN: Diversified recommendation with graph convolutional networks. 
In WWW, 401\u2013412."},{"key":"e_1_3_2_78_2","doi-asserted-by":"publisher","DOI":"10.1145\/3442381.3449788"},{"key":"e_1_3_2_79_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33015941"},{"key":"e_1_3_2_80_2","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219823"},{"key":"e_1_3_2_81_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i12.17325"},{"key":"e_1_3_2_82_2","unstructured":"Yanqiao Zhu Yichen Xu Feng Yu Qiang Liu Shu Wu and Liang Wang. 2021. Disentangled self-attentive neural networks for click-through rate prediction. arXiv:2101.03654. Retrieved from https:\/\/arxiv.org\/abs\/2101.03654"}],"container-title":["ACM Transactions on Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3777368","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,23]],"date-time":"2025-12-23T14:07:41Z","timestamp":1766498861000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3777368"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,23]]},"references-count":81,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2026,2,28]]}},"alternative-id":["10.1145\/3777368"],"URL":"https:\/\/doi.org\/10.1145\/3777368","relation":{},"ISSN":["1046-8188","1558-2868"],"issn-type":[{"type":"print","value":"1046-8188"},{"type":"electronic","value":"1558-2868"}],"subject":[],"published":{"date-parts":[[2025,12,23]]},"assertion":[{"value":"2024-07-14","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-09-26","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-12-23","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}