{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T12:38:14Z","timestamp":1775911094167,"version":"3.50.1"},"reference-count":38,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,2,15]],"date-time":"2022-02-15T00:00:00Z","timestamp":1644883200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,2,15]],"date-time":"2022-02-15T00:00:00Z","timestamp":1644883200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62072447"],"award-info":[{"award-number":["62072447"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Data Sci. Eng."],"published-print":{"date-parts":[[2022,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>BERT-based ranking models are emerging for its superior natural language understanding ability. All word relations and representations in the concatenation of query and document are modeled in the self-attention matrix as latent knowledge. However, some latent knowledge has none or negative effect on the relevance prediction between query and document. We model the observable and unobservable confounding factors in a causal graph and perform do-query to predict the relevance label given an intervention over this graph. For the observed factors, we block the back door path by an adaptive masking method through the transformer layer and refine word representations over this disentangled word graph through the refinement layer. For the unobserved factors, we resolve the do-operation query from the front door path by decomposing word representations into query related and unrelated parts through the decomposition layer. Pairwise ranking loss is mainly used for the ad hoc document ranking task, triangle distance loss is introduced to both the transformer and refinement layers for more discriminative representations, and mutual information constraints are put on the decomposition layer. Experimental results on public benchmark datasets TREC Robust04 and WebTrack2009-12 show that DGRe outperforms state-of-the-art baselines more than 2% especially for short queries.<\/jats:p>","DOI":"10.1007\/s41019-022-00179-3","type":"journal-article","created":{"date-parts":[[2022,2,15]],"date-time":"2022-02-15T09:03:00Z","timestamp":1644915780000},"page":"30-43","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Disentangled Graph Recurrent Network for Document Ranking"],"prefix":"10.1007","volume":"7","author":[{"given":"Qian","family":"Dong","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shuzi","family":"Niu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tao","family":"Yuan","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yucheng","family":"Li","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2022,2,15]]},"reference":[{"key":"179_CR1","doi-asserted-by":"crossref","unstructured":"Abbasnejad E, Teney D, Parvaneh A, Shi J, Hengel AVD (2020) Counterfactual vision and language learning. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 10044\u201310054","DOI":"10.1109\/CVPR42600.2020.01006"},{"key":"179_CR2","unstructured":"Belghazi MI, Baratin A, Rajeshwar S, Ozair S, Bengio Y, Courville A, Hjelm D (2018) Mutual information neural estimation. In: International conference on machine learning. PMLR, pp 531\u2013540"},{"key":"179_CR3","unstructured":"Bolukbasi T, Chang KW, Zou J, Saligrama V, Kalai A (2016) Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. arXiv preprint arXiv:1607.06520"},{"key":"179_CR4","unstructured":"Chan C, Al-Bashabsheh A, Huang HP, Lim M, Tam DSH, Zhao C (2019) Neural entropic estimation: a faster path to mutual information estimation. arXiv preprint arXiv:1905.12957"},{"key":"179_CR5","unstructured":"Choi K, Lee S (2020) Regularized mutual information neural estimation. arXiv preprint arXiv:2011.07932"},{"key":"179_CR6","doi-asserted-by":"crossref","unstructured":"Dai Z, Callan J (2019) Deeper text understanding for IR with contextual neural language modeling. In: Proceedings of the 42nd international ACM SIGIR, pp 985\u2013988","DOI":"10.1145\/3331184.3331303"},{"key":"179_CR7","doi-asserted-by":"crossref","unstructured":"Dai Z, Callan J (2020) Context-aware term weighting for first stage passage retrieval. In: Proceedings of the 43rd international ACM SIGIR, pp 1533\u20131536","DOI":"10.1145\/3397271.3401204"},{"key":"179_CR8","doi-asserted-by":"crossref","unstructured":"Dai Z, Xiong C, Callan J, Liu Z (2018) Convolutional neural networks for soft-matching n-grams in ad-hoc search. In: Proceedings of the eleventh ACM international conference on web search and data mining, pp 126\u2013134","DOI":"10.1145\/3159652.3159659"},{"key":"179_CR9","unstructured":"Devlin J, Chang MW, Lee K, Toutanova K (2018) Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805"},{"key":"179_CR10","doi-asserted-by":"crossref","unstructured":"Guo J, Fan Y, Ai Q, Croft WB (2016) A deep relevance matching model for ad-hoc retrieval. In: Proceedings of the 25th ACM international on conference on information and knowledge management, pp 55\u201364","DOI":"10.1145\/2983323.2983769"},{"key":"179_CR11","doi-asserted-by":"crossref","unstructured":"Guo J, Fan Y, Pang L, Yang L, Ai Q, Zamani H, Wu C, Croft WB, Cheng X (2019) A deep look into neural ranking models for information retrieval. Inf Process Manag 102067","DOI":"10.1016\/j.ipm.2019.102067"},{"key":"179_CR12","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1016\/j.neunet.2020.11.017","volume":"135","author":"Z Hao","year":"2021","unstructured":"Hao Z, Lv D, Li Z, Cai R, Wen W, Xu B (2021) Semi-supervised disentangled framework for transferable named entity recognition. Neural Netw 135:127\u2013138","journal-title":"Neural Netw"},{"key":"179_CR13","doi-asserted-by":"crossref","unstructured":"Hendricks LA, Burns K, Saenko K, Darrell T, Rohrbach A (2018) Women also snowboard: overcoming bias in captioning models. In: Proceedings of the European conference on computer vision (ECCV), pp 771\u2013787","DOI":"10.1007\/978-3-030-01219-9_47"},{"key":"179_CR14","unstructured":"Hjelm RD, Fedorov A, Lavoie-Marchildon S, Grewal K, Bachman P, Trischler A, Bengio Y (2018) Learning deep representations by mutual information estimation and maximization. arXiv preprint arXiv:1808.06670"},{"key":"179_CR15","unstructured":"Hu B, Lu Z, Li H, Chen Q (2014) Convolutional neural network architectures for matching natural language sentences. In: Advances in neural information processing systems, pp 2042\u20132050"},{"key":"179_CR16","unstructured":"Kocaoglu M, Snyder C, Dimakis AG, Vishwanath S (2017) Causalgan: learning causal implicit generative models with adversarial training. arXiv preprint arXiv:1709.02023"},{"key":"179_CR17","doi-asserted-by":"crossref","unstructured":"Li B, Han L (2013) Distance weighted cosine similarity measure for text classification. In: International conference on intelligent data engineering and automated learning. Springer, Berlin, pp 611\u2013618 (2013)","DOI":"10.1007\/978-3-642-41278-3_74"},{"key":"179_CR18","unstructured":"Li Y, Tarlow D, Brockschmidt M, Zemel R (2015) Gated graph sequence neural networks. arXiv preprint arXiv:1511.05493"},{"key":"179_CR19","unstructured":"Li C, Yates A, MacAvaney S, He B, Sun Y (2020) Parade: passage representation aggregation for document reranking. arXiv preprint arXiv:2008.09093"},{"key":"179_CR20","unstructured":"Lin X, Sur I, Nastase SA, Divakaran A, Hasson U, Amer MR (2019) Data-efficient mutual information neural estimator. arXiv preprint arXiv:1905.03319"},{"key":"179_CR21","first-page":"2579","volume":"9","author":"LVD Maaten","year":"2008","unstructured":"Maaten LVD, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9:2579\u20132605","journal-title":"J Mach Learn Res"},{"key":"179_CR22","doi-asserted-by":"crossref","unstructured":"MacAvaney S, Yates A, Cohan A, Goharian N (2019) CEDR: contextualized embeddings for document ranking. In: Proceedings of the 42nd international ACM SIGIR, pp 1101\u20131104","DOI":"10.1145\/3331184.3331317"},{"key":"179_CR23","doi-asserted-by":"crossref","unstructured":"MacAvaney S, Nardini FM, Perego R, Tonellotto N, Goharian N, Frieder O (2020) Efficient document re-ranking for transformers by precomputing term representations. arXiv preprint arXiv:2004.14255","DOI":"10.1145\/3397271.3401093"},{"key":"179_CR24","unstructured":"Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781"},{"key":"179_CR25","doi-asserted-by":"crossref","unstructured":"Pang L, Lan Y, Guo J, Xu J, Wan S, Cheng X (2016) Text matching as image recognition. arXiv preprint arXiv:1602.06359","DOI":"10.1609\/aaai.v30i1.10341"},{"issue":"4","key":"179_CR26","doi-asserted-by":"publisher","first-page":"669","DOI":"10.1093\/biomet\/82.4.669","volume":"82","author":"J Pearl","year":"1995","unstructured":"Pearl J (1995) Causal diagrams for empirical research. Biometrika 82(4):669\u2013688","journal-title":"Biometrika"},{"key":"179_CR27","volume-title":"Probabilistic reasoning in intelligent systems: networks of plausible inference","author":"J Pearl","year":"2014","unstructured":"Pearl J (2014) Probabilistic reasoning in intelligent systems: networks of plausible inference. Elsevier, Amsterdam"},{"key":"179_CR28","doi-asserted-by":"crossref","unstructured":"Peng Z, Huang W, Luo M, Zheng Q, Rong Y, Xu T, Huang J (2020) Graph representation learning via graphical mutual information maximization. In: Proceedings of the web conference 2020, pp 259\u2013270","DOI":"10.1145\/3366423.3380112"},{"issue":"1","key":"179_CR29","first-page":"1929","volume":"15","author":"N Srivastava","year":"2014","unstructured":"Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929\u20131958","journal-title":"J Mach Learn Res"},{"key":"179_CR30","doi-asserted-by":"crossref","unstructured":"Tang K, Niu Y, Huang J, Shi J, Zhang H (2020) Unbiased scene graph generation from biased training. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 3716\u20133725","DOI":"10.1109\/CVPR42600.2020.00377"},{"key":"179_CR31","unstructured":"Veitch V, Sridhar D, Blei D (2020) Adapting text embeddings for causal inference. In: Conference on uncertainty in artificial intelligence. PMLR, pp 919\u2013928 (2020)"},{"key":"179_CR32","unstructured":"Veli\u010dkovi\u0107 P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y (2017) Graph attention networks. arXiv preprint arXiv:1710.10903"},{"key":"179_CR33","unstructured":"Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhudinov R, Zemel R, Bengio Y (2015) Show, attend and tell: neural image caption generation with visual attention. In: International conference on machine learning. PMLR, pp 2048\u20132057 (2015)"},{"key":"179_CR34","doi-asserted-by":"crossref","unstructured":"Yang P, Fang H, Lin J (2017) Anserini: enabling the use of Lucene for information retrieval research. In: Proceedings of the 40th international ACM SIGIR, pp 1253\u20131256","DOI":"10.1145\/3077136.3080721"},{"key":"179_CR35","doi-asserted-by":"crossref","unstructured":"Yang X, Zhang H, Qi G, Cai J (2021) Causal attention for vision-language tasks. arXiv preprint arXiv:2103.03493","DOI":"10.1109\/CVPR46437.2021.00972"},{"key":"179_CR36","unstructured":"Yue Z, Zhang H, Sun Q, Hua XS (2020) Interventional few-shot learning. arXiv preprint arXiv:2009.13000"},{"key":"179_CR37","unstructured":"Zhang D, Zhang H, Tang J, Hua X, Sun Q (2020) Causal intervention for weakly-supervised semantic segmentation. arXiv preprint arXiv:2009.12547"},{"key":"179_CR38","unstructured":"Zhou J, Cui G, Zhang Z, Yang C, Liu Z, Wang L, Li C, Sun M (2018) Graph neural networks: a review of methods and applications. arXiv preprint arXiv:1812.08434"}],"container-title":["Data Science and Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41019-022-00179-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s41019-022-00179-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41019-022-00179-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,26]],"date-time":"2023-01-26T23:20:11Z","timestamp":1674775211000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s41019-022-00179-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,15]]},"references-count":38,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,3]]}},"alternative-id":["179"],"URL":"https:\/\/doi.org\/10.1007\/s41019-022-00179-3","relation":{},"ISSN":["2364-1185","2364-1541"],"issn-type":[{"value":"2364-1185","type":"print"},{"value":"2364-1541","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,2,15]]},"assertion":[{"value":"2 June 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 September 2021","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 January 2022","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 February 2022","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Avoid reviewers from Chinese Academy of Sciences.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"All authors consent to participate in this work.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent to participate"}},{"value":"All authors consent to publish the paper.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}}]}}