{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T01:28:06Z","timestamp":1760059686137,"version":"build-2065373602"},"reference-count":49,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2025,6,29]],"date-time":"2025-06-29T00:00:00Z","timestamp":1751155200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>Complex question decomposition is an important research topic in the field of natural language processing (NLP). It refers to the decomposition of a compound question containing multiple ontologies and classes into simple questions, each containing only a single attribute or entity. Most previous studies focus on how to generate simple questions using a single attribute or entity but pay little attention to the generation order of simple questions, which may lead to an inaccurate decomposition or longer execution time. In this study, we propose a new method based on causal reinforcement learning, which combines the advantages of the current best-performing reinforcement learning method and the causal inference method. Compared with previous methods, causal reinforcement learning can find the generation order of sub-questions more accurately, so as to better decompose complex questions. In particular, the prior knowledge is extracted using the counterfactual method in causal reasoning and is integrated into the policy network of the reinforcement learning model, and the reward rules of reinforcement learning are designed from the perspective of symmetry (positive reward and negative punishment), so that the agent is guided to choose the sub-question whose decomposition offers greater benefit and lower risk. We compare the proposed method with the baseline method on three datasets. 
The experimental results show that the performance of our method is improved by 5\u201310% compared with the baseline method on Hits@n (n = 1, 3, 10), which proves the effectiveness of our proposed method.<\/jats:p>","DOI":"10.3390\/sym17071022","type":"journal-article","created":{"date-parts":[[2025,6,30]],"date-time":"2025-06-30T03:54:28Z","timestamp":1751255668000},"page":"1022","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Complex Question Decomposition Based on Causal Reinforcement Learning"],"prefix":"10.3390","volume":"17","author":[{"given":"Dezhi","family":"Li","sequence":"first","affiliation":[{"name":"School of Information and Communication, National University of Defense Technology, Wuhan 430019, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yunjun","family":"Lu","sequence":"additional","affiliation":[{"name":"School of Information and Communication, National University of Defense Technology, Wuhan 430019, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jianping","family":"Wu","sequence":"additional","affiliation":[{"name":"School of Information and Communication, National University of Defense Technology, Wuhan 430019, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wenlu","family":"Zhou","sequence":"additional","affiliation":[{"name":"School of Information and Communication, National University of Defense Technology, Wuhan 430019, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guangjun","family":"Zeng","sequence":"additional","affiliation":[{"name":"School of Information and Communication, National University of Defense Technology, Wuhan 430019, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,6,29]]},"reference":[{"key":"ref_1","first-page":"22","article-title":"A survey of complex question decomposition methods 
in question answering system","volume":"17","author":"Jun","year":"2022","journal-title":"Comput. Eng. Appl."},{"key":"ref_2","first-page":"2673","article-title":"Complex question answering Method of interpretable knowledge map based on graph matching network","volume":"12","author":"Wei","year":"2021","journal-title":"J. Comput. Res. Dev."},{"key":"ref_3","first-page":"112","article-title":"Intelligent understanding of intention of complex questions for medical consultation","volume":"37","author":"Bin","year":"2023","journal-title":"J. Chin. Inf. Process."},{"key":"ref_4","unstructured":"Zhang, Y.N., Cheng, X., and Zhang, Y.F. (2019). Learning to order sub-questions for complex question answering. arXiv."},{"key":"ref_5","unstructured":"Fazili, B., Goswami, K., and Modani, N. (2024). GenSco: Can Question Decomposition based Passage Alignment improve Question Answering?. arXiv."},{"key":"ref_6","unstructured":"Rosset, C., Qin, G., and Feng, Z. (2024). Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for LLM Web Agents. arXiv."},{"key":"ref_7","unstructured":"Yi, F., Fu, W., and Liang, H. (2018, January 2\u20136). Model-based reinforcement learning: A survey. Proceedings of the 18th ICEB, Guilin, China."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1726","DOI":"10.1631\/FITEE.1900533","article-title":"Deep reinforcement learning: A survey","volume":"12","author":"Wang","year":"2020","journal-title":"Front. Inf. Technol. Electron. Eng."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1561\/2200000086","article-title":"Model-based reinforcement learning: A survey","volume":"16","author":"Moerland","year":"2023","journal-title":"Found. Trends Mach. Learn."},{"key":"ref_10","first-page":"1101","article-title":"A review of research on multi-agent reinforcement learning algorithms","volume":"4","author":"Yang","year":"2024","journal-title":"J. Front. Comput. Sci. 
Technol."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1007\/BF00992698","article-title":"Q-learning","volume":"8","author":"Watkins","year":"1992","journal-title":"Mach. Learn."},{"key":"ref_12","unstructured":"Mnih, V., Silver, D., and Graves, A. (2013). Playing Atari with deep reinforcement learning. arXiv."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1023\/A:1007678930559","article-title":"Convergence results for single-step on-policy reinforcement-learning algorithms","volume":"38","author":"Singh","year":"2000","journal-title":"Mach. Learn."},{"key":"ref_14","unstructured":"Fortunato, M., Azar, M., and Piot, B. (2018, January 1\u20134). Noisy networks for exploration. Proceedings of the 6th ICLR, Vancouver, BC, Canada."},{"key":"ref_15","unstructured":"Gal, Y., McAllister, R., and Rasmussen, C.E. (2016, January 19\u201324). Improving PILCO with Bayesian neural network dynamics models. Proceedings of the 33rd ICML, New York, NY, USA."},{"key":"ref_16","unstructured":"Mnih, V., Badia, A.P., and Mirza, M. (2016, January 19\u201324). Asynchronous methods for deep reinforcement learning. Proceedings of the 33rd ICML, New York, NY, USA."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"2975","DOI":"10.1038\/s41598-020-59669-x","article-title":"Challenges and opportunities with causal discovery algorithms: Application to Alzheimer\u2019s pathophysiology","volume":"1","author":"Shen","year":"2020","journal-title":"Sci. Rep."},{"key":"ref_18","first-page":"3397269","article-title":"A survey of learning causality with data: Problems and methods","volume":"53","author":"Guo","year":"2020","journal-title":"ACM Comput. Surv."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"203","DOI":"10.3934\/jdg.2021008","article-title":"Causal discovery in machine learning: Theories and applications","volume":"3","author":"Nogueira","year":"2021","journal-title":"J. Dyn. 
Games"},{"key":"ref_20","unstructured":"Ogarrio, J.M., Spirtes, P., and Ramsey, J. (2016, January 6\u20139). A hybrid causal search algorithm for latent variable models. Proceedings of the 8th PGM, Lugano, Switzerland."},{"key":"ref_21","first-page":"507","article-title":"Optimal structure identification with greedy search","volume":"3","author":"Chickering","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref_22","unstructured":"Spirtes, P.L., Meek, C., and Richardson, T.S. (1995, January 18\u201320). Causal inference in the presence of latent variables and selection bias. Proceedings of the 11th UAI, Montreal, QC, Canada."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1126\/sciadv.aau4996","article-title":"Detecting and quantifying causal associations in large nonlinear time series datasets","volume":"5","author":"Runge","year":"2019","journal-title":"Sci. Adv."},{"key":"ref_24","unstructured":"Affeldt, S., and Isambert, H. (2015, January 12\u201316). Robust reconstruction of causal graphical models based on conditional 2-point and 3-point information. Proceedings of the 31st UAI, Amsterdam, The Netherlands."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"4053","DOI":"10.1002\/sim.6207","article-title":"On regression adjustment for the propensity score","volume":"23","author":"Vansteelandt","year":"2014","journal-title":"Stat. Med."},{"key":"ref_26","unstructured":"Danilo, J., Danihelka, I., and George, P. (2020). Causally correct partial models for reinforcement learning. arXiv."},{"key":"ref_27","unstructured":"Zhi, H.D., Jing, J., and Guo, D.L. (2023). Causal reinforcement learning: A survey. arXiv."},{"key":"ref_28","unstructured":"Zeng, Y., Rui, C., and Fu, S. (2023). A survey on causal reinforcement learning. 
arXiv."},{"key":"ref_29","first-page":"661","article-title":"Causality in reinforcement learning control: The state of the art and prospects","volume":"49","author":"Yue","year":"2023","journal-title":"Acta Autom. Sin."},{"key":"ref_30","unstructured":"Liao, Z., Fu, Z., and Yang, Y. (2021). Instrumental variable value iteration for causal offline reinforcement learning. arXiv."},{"key":"ref_31","unstructured":"Subramanian, C., and Ravindran, B. (2022, January 25\u201329). Causal contextual bandits with targeted interventions. Proceedings of the 10th ICLR, Online."},{"key":"ref_32","unstructured":"Huang, B., Lu, C., and Le, J. (2022, January 17\u201323). Action-sufficient state representation learning for control with structural constraints. Proceedings of the 39th ICML, Baltimore, MD, USA."},{"key":"ref_33","unstructured":"Bica, I., and Jarrett, D. (2021, January 3\u20137). Learning what if explanations for sequential decision-making. Proceedings of the 9th ICLR, Online."},{"key":"ref_34","first-page":"9001","article-title":"A span-based target-aware relation model for frame-semantic parsing","volume":"22","author":"Feng","year":"2023","journal-title":"ACM Trans. Asian Low. Resour. Lang. Inf. Process."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Kalyanpur, A., Patwardhan, S., and Boguraev, B. (2011, January 24\u201328). Fact-based question decomposition for candidate answer re-ranking. Proceedings of the 20th ACM CIKM, New York, NY, USA.","DOI":"10.1145\/2063576.2063886"},{"key":"ref_36","first-page":"133","article-title":"Fact-based question decomposition in DeepQA","volume":"3","author":"Kalyanpur","year":"2012","journal-title":"IBM J. Res. Dev."},{"key":"ref_37","unstructured":"Kalyanpur, A., Patwardhan, S., and Boguraev, B. (2012, January 23\u201327). Parallel and nested decomposition for factoid questions. 
Proceedings of the 13th EACL, Philadelphia, PA, USA."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"1373","DOI":"10.14778\/3236187.3236192","article-title":"Question answering over knowledge graphs: Question understanding via template decomposition","volume":"11","author":"Zheng","year":"2018","journal-title":"Proc. VLDB Endow."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Min, S., Zhong, V., and Zettlemoyer, L. (2019, January 3\u20135). Multi-hop reading comprehension through question decomposition and rescoring. Proceedings of the 20th EACL, Florence, Italy.","DOI":"10.18653\/v1\/P19-1613"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1162\/tacl_a_00301","article-title":"A graph-based model for joint Chinese word segmentation and dependency parsing","volume":"8","author":"Yan","year":"2020","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"ref_41","first-page":"1329","article-title":"Deep graph-based character-level Chinese dependency parsing","volume":"29","author":"Wu","year":"2021","journal-title":"Inst. Electr. Electron. Eng."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Khot, T., Khashabi, D., and Richardson, K. (2021, January 6\u201311). Text modular networks: Learning to decompose tasks in the language of existing models. Proceedings of the 2021 NAACL, Online.","DOI":"10.18653\/v1\/2021.naacl-main.99"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Fu, R.L., Wang, H., and Zhang, X.J. (2021, January 7\u201311). Decomposing complex questions makes multi-hop QA easier and more interpretable. 
Proceedings of the 2021 EMNLP, Punta Cana, Dominican Republic.","DOI":"10.18653\/v1\/2021.findings-emnlp.17"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"234","DOI":"10.1016\/j.ins.2020.02.065","article-title":"Processing knowledge graph-based complex questions through question decomposition and recomposition","volume":"523","author":"Shin","year":"2020","journal-title":"Inf. Sci."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Lin, X.V., Socher, R., and Xiong, C. (2018, January 2\u20134). Multi-hop knowledge graph reasoning with reward shaping. Proceedings of the 2018 EMNLP, Brussels, Belgium.","DOI":"10.18653\/v1\/D18-1362"},{"key":"ref_46","unstructured":"Das, R. (2018, January 1\u20133). Go for a walk and arrive at the answer-reasoning over paths in knowledge bases using reinforcement learning. Proceedings of the 6th ICLR, Vancouver, BC, Canada."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Talmor, A., and Berant, J. (2018, January 1\u20136). The web as a knowledge-base for answering complex questions. Proceedings of the 2018 NAACL-HLT, New Orleans, LA, USA.","DOI":"10.18653\/v1\/N18-1059"},{"key":"ref_48","unstructured":"Yih, M., Richardson, C., and Meek, M. (2018, January 15\u201320). The value of semantic parse labeling for knowledge base question answering. Proceedings of the 2018 ACL, Berlin, Germany."},{"key":"ref_49","unstructured":"Zhang, L., Winn, J.M., and Tomioka, R. (2016). Gaussian attention model and its application to knowledge base embedding and question answering. 
arXiv."}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/7\/1022\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:01:11Z","timestamp":1760032871000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/7\/1022"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,29]]},"references-count":49,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2025,7]]}},"alternative-id":["sym17071022"],"URL":"https:\/\/doi.org\/10.3390\/sym17071022","relation":{},"ISSN":["2073-8994"],"issn-type":[{"type":"electronic","value":"2073-8994"}],"subject":[],"published":{"date-parts":[[2025,6,29]]}}}