{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T12:37:30Z","timestamp":1776083850616,"version":"3.50.1"},"reference-count":244,"publisher":"Emerald","issue":"5","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,6,13]]},"abstract":"<jats:p>The task of Question Answering (QA) has attracted significant research interest for a long time. Its relevance to language understanding and knowledge retrieval tasks, along with the simple setting, makes the task of QA crucial for strong AI systems. Recent success on simple QA tasks has shifted the focus to more complex settings. Among these, Multi-Hop QA (MHQA) is one of the most researched tasks over recent years. In broad terms, MHQA is the task of answering natural language questions that involve extracting and combining multiple pieces of information and doing multiple steps of reasoning. An example of a multi-hop question would be \u201cThe Argentine PGA Championship record holder has won how many tournaments worldwide?\u201d. Answering the question would need two pieces of information: \u201cWho is the record holder for Argentine PGA Championship tournaments?\u201d and \u201cHow many tournaments did [Answer of Sub Q1] win?\u201d. The ability to answer multi-hop questions and perform multi-step reasoning can significantly improve the utility of NLP systems. Consequently, the field has seen a surge of high-quality datasets, models and evaluation strategies. The notion of \u2018multiple hops\u2019 is somewhat abstract, which results in a large variety of tasks that require multi-hop reasoning. This leads to different datasets and models that differ significantly from each other and make the field challenging to generalize and survey. We aim to provide a general and formal definition of the MHQA task, and organize and summarize existing MHQA frameworks. 
We also outline some best practices for building MHQA datasets. This monograph provides a systematic and thorough introduction to this highly interesting yet quite challenging task, and structures the existing attempts to address it.<\/jats:p>","DOI":"10.1561\/1500000102","type":"journal-article","created":{"date-parts":[[2024,6,13]],"date-time":"2024-06-13T05:45:43Z","timestamp":1718257543000},"page":"457-586","source":"Crossref","is-referenced-by-count":24,"title":["Multi-hop Question Answering"],"prefix":"10.1561","volume":"17","author":[{"given":"Vaibhav","family":"Mavi","sequence":"first","affiliation":[{"name":"New York University","place":["USA"]}]},{"given":"Anubhav","family":"Jangra","sequence":"additional","affiliation":[{"name":"Indian Institute of Technology Patna","place":["India"]}]},{"given":"Adam","family":"Jatowt","sequence":"additional","affiliation":[{"name":"University of Innsbruck","place":["Austria"]}]}],"member":"140","published-online":{"date-parts":[[2024,6,13]]},"reference":[{"key":"2026040314424049100_ref001","doi-asserted-by":"publisher","first-page":"268","DOI":"10.18653\/v1\/2020.emnlp-main.19","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Ainslie","year":"2020"},{"issue":"3","key":"2026040314424049100_ref002","article-title":"The question answering systems: A survey","volume":"2","author":"Allam","year":"2012","journal-title":"International Journal of Research and Reviews in Information Sciences (IJRRIS)"},{"key":"2026040314424049100_ref003","doi-asserted-by":"publisher","first-page":"412","DOI":"10.18653\/v1\/D17-1042","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing","author":"Alvarez-Melis","year":"2017"},{"key":"2026040314424049100_ref004","article-title":"What is Relevant in a Text Document?: An Interpretable Machine Learning Approach","volume-title":"arXiv preprint 
arXiv:1612.07843","author":"Arras","year":"2016"},{"key":"2026040314424049100_ref005","article-title":"Mastering the ABCDs of Complex Questions: Answer-Based Claim Decomposition for Fine-grained Self-Evaluation","volume-title":"arXiv:2305.14750 [cs.CL]","author":"Balepur","year":"2023"},{"key":"2026040314424049100_ref006","first-page":"178","article-title":"Abstract Meaning Representation for Sembanking","volume-title":"Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse","author":"Banarescu","year":"2013"},{"key":"2026040314424049100_ref007","first-page":"65","volume-title":"Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and\/or Summarization","author":"Banerjee","year":"2005"},{"key":"2026040314424049100_ref008","first-page":"550","article-title":"Information fusion in the context of multi-document summarization","volume-title":"Proceedings of the 37th annual meeting of the Association for Computational Linguistics","author":"Barzilay","year":"1999"},{"key":"2026040314424049100_ref009","doi-asserted-by":"crossref","first-page":"4220","DOI":"10.18653\/v1\/D18-1454","article-title":"Commonsense for Generative Multi-Hop Question Answering Tasks","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Bauer","year":"2018"},{"key":"2026040314424049100_ref010","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1609\/aaai.v34i01.5327","article-title":"Balancing Spreads of Influence in a Social Network","volume":"34","author":"Becker","year":"2020","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"key":"2026040314424049100_ref011","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1145\/2501511.2501516","article-title":"Methods for exploring and mining tables on wikipedia","volume-title":"Proceedings of the ACM SIGKDD workshop on interactive data exploration and 
analytics","author":"Bhagavatula","year":"2013"},{"key":"2026040314424049100_ref012","first-page":"8","article-title":"Explanation and justification in machine learning: A survey","volume-title":"IJCAI-17 workshop on explainable AI (XAI)","author":"Biran","year":"2017"},{"key":"2026040314424049100_ref013","doi-asserted-by":"crossref","DOI":"10.1145\/3594536.3595163","article-title":"Can GPT-3 Perform Statutory Reasoning?","volume-title":"arXiv:2302.06100 [cs.CL]","author":"Blair-Stanek","year":"2023"},{"key":"2026040314424049100_ref014","article-title":"Event Detection as Question Answering with Entity Information","volume-title":"CoRR","author":"Boros","year":"2021"},{"key":"2026040314424049100_ref015","doi-asserted-by":"crossref","first-page":"366","DOI":"10.1016\/j.procs.2015.12.005","article-title":"Question answering systems: survey and trends","volume":"73","author":"Bouziane","year":"2015","journal-title":"Procedia Computer Science"},{"key":"2026040314424049100_ref016","doi-asserted-by":"crossref","first-page":"632","DOI":"10.18653\/v1\/D15-1075","volume-title":"Conference on Empirical Methods in Natural Language Processing, EMNLP 2015","author":"Bowman","year":"2015"},{"key":"2026040314424049100_ref017","first-page":"1877","article-title":"Language Models are Few-Shot Learners","volume":"33","author":"Brown","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2026040314424049100_ref018","first-page":"1","article-title":"Coarse-grained decomposition and fine-grained interaction for multi-hop question answering","volume-title":"Journal of Intelligent Information Systems","author":"Cao","year":"2021"},{"key":"2026040314424049100_ref019","first-page":"357","article-title":"BAG: Bi-directional Attention Entity Graph Convolutional Network for Multi-hop Reasoning Question Answering","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language 
Technologies, Volume 1 (Long and Short Papers)","author":"Cao","year":"2019"},{"key":"2026040314424049100_ref020","doi-asserted-by":"publisher","first-page":"1870","DOI":"10.18653\/v1\/P17-1171","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Chen","year":"2017"},{"key":"2026040314424049100_ref021","doi-asserted-by":"publisher","first-page":"4026","DOI":"10.18653\/v1\/N19-1405","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Chen","year":"2019"},{"key":"2026040314424049100_ref022","article-title":"Multi-hop question answering via reasoning chains","volume-title":"arXiv preprint arXiv:1910.02610","author":"Chen","year":"2019"},{"key":"2026040314424049100_ref023","doi-asserted-by":"publisher","first-page":"1657","DOI":"10.18653\/v1\/P17-1152","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Chen","year":"2017"},{"key":"2026040314424049100_ref024","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2020.acl-main.708","article-title":"Logical Natural Language Generation from Open-Domain Tables","volume-title":"CoRR","author":"Chen","year":"2020"},{"key":"2026040314424049100_ref025","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.18653\/v1\/2020.findings-emnlp.91","article-title":"HybridQA: A Dataset of Multi-Hop Question Answering over Tabular and Textual Data","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2020","author":"Chen","year":"2020"},{"key":"2026040314424049100_ref026","doi-asserted-by":"publisher","first-page":"3697","DOI":"10.18653\/v1\/2021.emnlp-main.300","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language 
Processing","author":"Chen","year":"2021"},{"key":"2026040314424049100_ref027","doi-asserted-by":"crossref","DOI":"10.3115\/v1\/D14-1179","article-title":"Learning phrase representations using RNN encoder-decoder for statistical machine translation","volume-title":"Conference on Empirical Methods in Natural Language Processing (EMNLP 2014)","author":"Cho","year":"2014"},{"key":"2026040314424049100_ref028","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/D18-1241","article-title":"QuAC: Question Answering in Context","volume-title":"CoRR","author":"Choi","year":"2018"},{"key":"2026040314424049100_ref029","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2210.11416","article-title":"Scaling Instruction-Finetuned Language Models","author":"Chung","year":"2022"},{"issue":"1","key":"2026040314424049100_ref030","first-page":"22","article-title":"Word Association Norms, Mutual Information, and Lexicography","volume":"16","author":"Church","year":"1990","journal-title":"Computational Linguistics"},{"key":"2026040314424049100_ref031","doi-asserted-by":"crossref","first-page":"2748","DOI":"10.18653\/v1\/P19-1264","article-title":"Sentence mover\u2019s similarity: Automatic evaluation for multi-sentence texts","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Clark","year":"2019"},{"key":"2026040314424049100_ref032","article-title":"Think you have solved question answering? 
try arc, the ai2 reasoning challenge","volume-title":"arXiv preprint arXiv:1803.05457","author":"Clark","year":"2018"},{"key":"2026040314424049100_ref033","article-title":"Training Verifiers to Solve Math Word Problems","volume-title":"arXiv:2110.14168 [cs.LG]","author":"Cobbe","year":"2021"},{"key":"2026040314424049100_ref034","article-title":"Recurrent Hidden Semi-Markov Model","volume-title":"ICLR","author":"Dai","year":"2017"},{"key":"2026040314424049100_ref035","doi-asserted-by":"crossref","first-page":"113","DOI":"10.18653\/v1\/D19-5816","article-title":"Multi-step entity-centric information retrieval for multi-hop question answering","volume-title":"Proceedings of the 2nd Workshop on Machine Reading for Question Answering","author":"Das","year":"2019"},{"key":"2026040314424049100_ref036","doi-asserted-by":"publisher","first-page":"4599","DOI":"10.18653\/v1\/2021.naacl-main.365","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Dasigi","year":"2021"},{"key":"2026040314424049100_ref037","first-page":"2306","article-title":"Question Answering by Reasoning Across Documents with Graph Convolutional Networks","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"De Cao","year":"2019"},{"key":"2026040314424049100_ref038","article-title":"Transforming Question Answering Datasets Into Natural Language Inference Datasets","volume-title":"CoRR","author":"Demszky","year":"2018"},{"key":"2026040314424049100_ref039","doi-asserted-by":"publisher","first-page":"4093","DOI":"10.24963\/ijcai.2022\/568","article-title":"Interpretable AMR-Based Question Decomposition for Multi-hop Question Answering","volume-title":"Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 
IJCAI-22","author":"Deng","year":"2022"},{"issue":"3","key":"2026040314424049100_ref040","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1007\/s10115-017-1100-y","article-title":"Core techniques of question answering systems over knowledge bases: a survey","volume":"55","author":"Diefenbach","year":"2018","journal-title":"Knowledge and Information systems"},{"issue":"2","key":"2026040314424049100_ref041","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1007\/s10844-019-00584-7","article-title":"A survey on question answering systems over linked data and documents","volume":"55","author":"Dimitrakis","year":"2020","journal-title":"Journal of intelligent information systems"},{"key":"2026040314424049100_ref042","doi-asserted-by":"publisher","first-page":"2694","DOI":"10.18653\/v1\/P19-1259","article-title":"Cognitive Graph for Multi-Hop Reading Comprehension at Scale","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Ding","year":"2019"},{"key":"2026040314424049100_ref043","doi-asserted-by":"publisher","first-page":"1342","DOI":"10.18653\/v1\/P17-1123","article-title":"Learning to Ask: Neural Question Generation for Reading Comprehension","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Du","year":"2017"},{"key":"2026040314424049100_ref044","doi-asserted-by":"crossref","first-page":"7009","DOI":"10.18653\/v1\/2021.emnlp-main.561","article-title":"Generative Context Pair Selection for Multihop Question Answering","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Dua","year":"2021"},{"key":"2026040314424049100_ref045","doi-asserted-by":"publisher","first-page":"5627","DOI":"10.18653\/v1\/2020.acl-main.497","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational 
Linguistics","author":"Dua","year":"2020"},{"key":"2026040314424049100_ref046","doi-asserted-by":"publisher","first-page":"2368","DOI":"10.18653\/v1\/N19-1246","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Dua","year":"2019"},{"issue":"4","key":"2026040314424049100_ref047","doi-asserted-by":"crossref","first-page":"4124","DOI":"10.1007\/s10489-022-03732-9","article-title":"The state of the art in open domain complex question answering: a survey","volume":"53","author":"Etezadi","year":"2023","journal-title":"Applied Intelligence"},{"key":"2026040314424049100_ref048","doi-asserted-by":"publisher","first-page":"4186","DOI":"10.18653\/v1\/D19-1428","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Fan","year":"2019"},{"key":"2026040314424049100_ref049","doi-asserted-by":"crossref","first-page":"8823","DOI":"10.18653\/v1\/2020.emnlp-main.710","article-title":"Hierarchical Graph Network for Multi-hop Question Answering","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Fang","year":"2020"},{"key":"2026040314424049100_ref050","doi-asserted-by":"publisher","first-page":"594","DOI":"10.1109\/TPAMI.2006.79","article-title":"One-Shot Learning of Object Categories","volume":"28","author":"Fei-Fei","year":"2006","journal-title":"IEEE transactions on pattern analysis and machine intelligence"},{"key":"2026040314424049100_ref051","doi-asserted-by":"publisher","first-page":"2296","DOI":"10.18653\/v1\/P19-1222","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational 
Linguistics","author":"Feldman","year":"2019"},{"key":"2026040314424049100_ref052","article-title":"Knowledge Card: Filling LLMs\u2019 Knowledge Gaps with Plug-in Specialized Language Models","volume-title":"arXiv:2305.09955 [cs.CL]","author":"Feng","year":"2024"},{"key":"2026040314424049100_ref053","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2024.acl-long.786","article-title":"Don\u2019t Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration","volume-title":"arXiv:2402.00367 [cs.CL]","author":"Feng","year":"2024"},{"key":"2026040314424049100_ref054","article-title":"Learning to recover reasoning chains for multi-hop question answering via cooperative games","volume-title":"arXiv preprint arXiv:2004.02393","author":"Feng","year":"2020"},{"key":"2026040314424049100_ref055","article-title":"A survey on complex question answering over knowledge base: Recent advances and challenges","volume-title":"arXiv preprint arXiv:2007.13069","author":"Fu","year":"2020"},{"key":"2026040314424049100_ref056","article-title":"PAL: Program-aided Language Models","volume-title":"arXiv:2211.10435 [cs.CL]","author":"Gao","year":"2023"},{"key":"2026040314424049100_ref057","article-title":"Difficulty Controllable Question Generation for Reading Comprehension","volume-title":"CoRR","author":"Gao","year":"2018"},{"key":"2026040314424049100_ref058","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1613\/jair.5477","article-title":"Survey of the state of the art in natural language generation: Core tasks, applications and evaluation","volume":"61","author":"Gatt","year":"2018","journal-title":"Journal of Artificial Intelligence Research"},{"key":"2026040314424049100_ref059","doi-asserted-by":"publisher","first-page":"1161","DOI":"10.18653\/v1\/D19-1107","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing 
(EMNLP-IJCNLP)","author":"Geva","year":"2019"},{"key":"2026040314424049100_ref060","first-page":"3443","article-title":"DiscoFuse: A Large-Scale Dataset for Discourse-Based Sentence Fusion","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Geva","year":"2019"},{"key":"2026040314424049100_ref061","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2020.acl-main.122","article-title":"Examining the state-of-the-art in news timeline summarization","volume-title":"arXiv preprint arXiv:2005.10107","author":"Ghalandari","year":"2020"},{"key":"2026040314424049100_ref062","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1109\/DSAA.2018.00018","volume-title":"2018 IEEE 5th International Conference on data science and advanced analytics (DSAA)","author":"Gilpin","year":"2018"},{"key":"2026040314424049100_ref063","doi-asserted-by":"crossref","DOI":"10.3115\/1117575.1117580","article-title":"Multi-document summarization by sentence extraction","volume-title":"NAACL-ANLP 2000 workshop: automatic summarization","author":"Goldstein","year":"2000"},{"key":"2026040314424049100_ref064","article-title":"Neural turing machines","volume-title":"arXiv preprint arXiv:1410.5401","author":"Graves","year":"2014"},{"key":"2026040314424049100_ref065","doi-asserted-by":"publisher","first-page":"1631","DOI":"10.18653\/v1\/P16-1154","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Gu","year":"2016"},{"key":"2026040314424049100_ref066","doi-asserted-by":"publisher","first-page":"140","DOI":"10.18653\/v1\/P16-1014","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Gulcehre","year":"2016"},{"key":"2026040314424049100_ref067","article-title":"ArthModel: Enhance 
Arithmetic Skills to Large Language Model","volume-title":"arXiv:2311.18609 [cs.CL]","author":"Guo","year":"2023"},{"key":"2026040314424049100_ref068","doi-asserted-by":"crossref","first-page":"2760","DOI":"10.18653\/v1\/2020.coling-main.249","article-title":"Reinforced Multi-task Approach for Multi-hop Question Generation","volume-title":"Proceedings of the 28th International Conference on Computational Linguistics","author":"Gupta","year":"2020"},{"key":"2026040314424049100_ref069","doi-asserted-by":"publisher","first-page":"107","DOI":"10.18653\/v1\/N18-2017","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)","author":"Gururangan","year":"2018"},{"key":"2026040314424049100_ref070","first-page":"362","article-title":"Exploring content models for multi-document summarization","volume-title":"Proceedings of human language technologies: The 2009 annual conference of the North American Chapter of the Association for Computational Linguistics","author":"Haghighi","year":"2009"},{"key":"2026040314424049100_ref071","doi-asserted-by":"publisher","DOI":"10.1109\/ICSC56153.2023.00036","article-title":"Exploratory Inference Chain: Exploratorily Chaining Multi-hop Inferences with Large Language Models for Question-Answering","volume-title":"2023 IEEE 17th International Conference on Semantic Computing (ICSC)","author":"Haji","year":"2023"},{"key":"2026040314424049100_ref072","article-title":"FOLIO: Natural Language Reasoning with First-Order Logic","volume-title":"arXiv: 2209.00840 [cs.CL]","author":"Han","year":"2022"},{"key":"2026040314424049100_ref073","article-title":"DeBERTa: Decoding-enhanced BERT with Disentangled Attention","volume-title":"arXiv: 2006.03654 [cs.CL]","author":"He","year":"2021"},{"key":"2026040314424049100_ref074","article-title":"Measuring mathematical problem solving with the math dataset","volume-title":"arXiv 
preprint arXiv:2103.03874","author":"Hendrycks","year":"2021"},{"key":"2026040314424049100_ref075","article-title":"How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation","volume-title":"arXiv: 2302.09210 [cs.CL]","author":"Hendy","year":"2023"},{"key":"2026040314424049100_ref076","article-title":"Teaching machines to read and comprehend","volume":"28","author":"Hermann","year":"2015","journal-title":"Advances in neural information processing systems"},{"issue":"8","key":"2026040314424049100_ref077","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural computation"},{"issue":"8","key":"2026040314424049100_ref078","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural computation"},{"issue":"6","key":"2026040314424049100_ref079","first-page":"895","article-title":"Survey on challenges of question answering in the semantic web","volume":"8","author":"H\u00f6ffner","year":"2017","journal-title":"Semantic Web"},{"key":"2026040314424049100_ref080","doi-asserted-by":"publisher","first-page":"5810","DOI":"10.18653\/v1\/2021.naacl-main.464","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Huang","year":"2021"},{"key":"2026040314424049100_ref081","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2023.acl-industry.4","article-title":"MathPrompter: Mathematical Reasoning using Large Language Models","volume-title":"arXiv: 2303.05398 [cs.CL]","author":"Imani","year":"2023"},{"key":"2026040314424049100_ref082","doi-asserted-by":"publisher","first-page":"6740","DOI":"10.18653\/v1\/2020.acl-main.602","volume-title":"Proceedings of the 58th Annual Meeting of 
the Association for Computational Linguistics","author":"Inoue","year":"2020"},{"key":"2026040314424049100_ref083","doi-asserted-by":"crossref","first-page":"12","DOI":"10.18653\/v1\/W18-1703","article-title":"Multi-hop Inference for Sentence-level TextGraphs: How Challenging is Meaningfully Combining Information for Science Question Answering?","volume-title":"Proceedings of the Twelfth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-12)","author":"Jansen","year":"2018"},{"key":"2026040314424049100_ref084","volume-title":"Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)","author":"Jansen","year":"2018"},{"issue":"4","key":"2026040314424049100_ref085","doi-asserted-by":"publisher","first-page":"422","DOI":"10.1145\/582415.582418","article-title":"Cumulated Gain-Based Evaluation of IR Techniques","volume":"20","author":"J\u00e4rvelin","year":"2002","journal-title":"ACM Trans. Inf. Syst."},{"key":"2026040314424049100_ref086","doi-asserted-by":"publisher","DOI":"10.1145\/2505515.2505655","article-title":"Estimating document focus time","volume-title":"22nd ACM International Conference on Information and Knowledge Management, CIKM\u201913, San Francisco, CA, USA, October 27 - November 1, 2013","author":"Jatowt","year":"2013"},{"key":"2026040314424049100_ref087","doi-asserted-by":"publisher","first-page":"137","DOI":"10.18653\/v1\/2020.emnlp-main.10","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Jhamtani","year":"2020"},{"key":"2026040314424049100_ref088","first-page":"2021","article-title":"Adversarial Examples for Evaluating Reading Comprehension Systems","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language 
Processing","author":"Jia","year":"2017"},{"issue":"2","key":"2026040314424049100_ref089","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3490238","article-title":"Biomedical Question Answering: A Survey of Approaches and Challenges","volume":"55","author":"Jin","year":"2022","journal-title":"ACM Computing Surveys (CSUR)"},{"key":"2026040314424049100_ref090","article-title":"Improving Multi-Hop Reasoning in LLMs by Learning from Rich Human Feedback","volume-title":"Neuro-Symbolic Learning and Reasoning in the era of Large Language Models","author":"Joshi","year":"2023"},{"key":"2026040314424049100_ref091","article-title":"Improving Multi-Hop Reasoning in LLMs by Learning from Rich Human Feedback","volume-title":"Neuro-Symbolic Learning and Reasoning in the era of Large Language Models","author":"Joshi","year":"2023"},{"key":"2026040314424049100_ref092","article-title":"Language Models (Mostly) Know What They Know","volume-title":"arXiv: 2207.05221 [cs.CL]","author":"Kadavath","year":"2022"},{"key":"2026040314424049100_ref093","doi-asserted-by":"publisher","first-page":"69","DOI":"10.18653\/v1\/W17-2609","volume-title":"Proceedings of the 2nd Workshop on Representation Learning for NLP","author":"Kadlec","year":"2017"},{"key":"2026040314424049100_ref094","first-page":"4171","article-title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding","volume-title":"Proceedings of NAACL-HLT","author":"Kenton","year":"2019"},{"key":"2026040314424049100_ref095","first-page":"252","article-title":"Looking beyond the surface: A challenge set for reading comprehension over multiple sentences","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long 
Papers)","author":"Khashabi","year":"2018"},{"issue":"05","key":"2026040314424049100_ref096","doi-asserted-by":"crossref","first-page":"8082","DOI":"10.1609\/aaai.v34i05.6319","article-title":"Qasc: A dataset for question answering via sentence composition","volume":"34","author":"Khot","year":"2020","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"key":"2026040314424049100_ref097","doi-asserted-by":"crossref","first-page":"2814","DOI":"10.18653\/v1\/D19-1281","article-title":"What\u2019s Missing: A Knowledge Gap Guided Approach for Multi-hop Question Answering","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Khot","year":"2019"},{"key":"2026040314424049100_ref098","article-title":"Decomposed Prompting: A Modular Approach for Solving Complex Tasks","volume-title":"arXiv: 2210.02406 [cs.CL]","author":"Khot","year":"2023"},{"key":"2026040314424049100_ref099","article-title":"Bilinear attention networks","volume":"31","author":"Kim","year":"2018","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2026040314424049100_ref100","article-title":"Semi-Supervised Classification with Graph Convolutional Networks","volume-title":"CoRR","author":"Kipf","year":"2016"},{"key":"2026040314424049100_ref101","article-title":"Semi-Supervised Classification with Graph Convolutional Networks","volume-title":"CoRR","author":"Kipf","year":"2016"},{"key":"2026040314424049100_ref102","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1162\/tacl_a_00023","article-title":"The narrativeqa reading comprehension challenge","volume":"6","author":"Kocisky","year":"2018","journal-title":"Transactions of the Association for Computational 
Linguistics"},{"key":"2026040314424049100_ref103","doi-asserted-by":"publisher","first-page":"4365","DOI":"10.18653\/v1\/D19-1445","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Kovaleva","year":"2019"},{"key":"2026040314424049100_ref104","first-page":"382","volume-title":"International Semantic Web Conference","author":"Kumar","year":"2019"},{"key":"2026040314424049100_ref105","doi-asserted-by":"publisher","DOI":"10.1007\/s40593-019-00186-y","article-title":"A Systematic Review of Automatic Question Generation for Educational Purposes","volume":"30","author":"Kurdi","year":"2019","journal-title":"International Journal of Artificial Intelligence in Education"},{"key":"2026040314424049100_ref106","first-page":"957","article-title":"From word embeddings to document distances","volume-title":"International conference on machine learning","author":"Kusner","year":"2015"},{"key":"2026040314424049100_ref107","doi-asserted-by":"crossref","DOI":"10.24963\/ijcai.2021\/611","article-title":"A survey on complex knowledge base question answering: Methods, challenges and solutions","volume-title":"IJCAI","author":"Lan","year":"2021"},{"key":"2026040314424049100_ref108","article-title":"ALBERT: A Lite BERT for Self-supervised Learning of Language Representations","volume-title":"International Conference on Learning Representations","author":"Lan","year":"2019"},{"key":"2026040314424049100_ref109","doi-asserted-by":"crossref","first-page":"104","DOI":"10.18653\/v1\/D19-5413","article-title":"Analyzing Sentence Fusion in Abstractive Summarization","volume-title":"Proceedings of the 2nd Workshop on New Frontiers in Summarization","author":"Lebanoff","year":"2019"},{"key":"2026040314424049100_ref110","article-title":"S3 HQA: A Three-Stage Approach for Multi-hop Text-Table Hybrid Question Answering","volume-title":"arXiv: 
2305.11725 [cs.CL]","author":"Lei","year":"2023"},{"key":"2026040314424049100_ref111","article-title":"BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension","volume-title":"CoRR","author":"Lewis","year":"2019"},{"key":"2026040314424049100_ref112","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1007\/978-981-99-6207-5_2","volume-title":"Chinese Computational Linguistics","author":"Li","year":"2023"},{"key":"2026040314424049100_ref113","doi-asserted-by":"publisher","first-page":"2157","DOI":"10.18653\/v1\/D17-1230","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing","author":"Li","year":"2017"},{"key":"2026040314424049100_ref114","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2023.findings-emnlp.452","article-title":"Leveraging Structured Information for Explainable Multi-hop Question Answering and Reasoning","volume-title":"arXiv: 2311.03734 [cs.CL]","author":"Li","year":"2023"},{"key":"2026040314424049100_ref115","article-title":"Holistic Evaluation of Language Models","volume-title":"arXiv: 2211.09110 [cs.CL]","author":"Liang","year":"2023"},{"key":"2026040314424049100_ref116","first-page":"74","volume-title":"Text Summarization Branches Out","author":"Lin","year":"2004"},{"key":"2026040314424049100_ref117","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1016\/j.aiopen.2022.10.001","article-title":"A survey of transformers","volume":"3","author":"Lin","year":"2022","journal-title":"AI Open"},{"key":"2026040314424049100_ref118","article-title":"Medical Visual Question Answering: A Survey","volume-title":"arXiv preprint arXiv:2111.10056","author":"Lin","year":"2021"},{"key":"2026040314424049100_ref119","first-page":"105","volume-title":"Proceedings of the 14th European Workshop on Natural Language Generation","author":"Lindberg","year":"2013"},{"key":"2026040314424049100_ref120","article-title":"A critical review of recurrent neural 
networks for sequence learning","volume-title":"arXiv preprint arXiv:1506.00019","author":"Lipton","year":"2015"},{"key":"2026040314424049100_ref121","first-page":"1703","article-title":"Mean average precision","volume-title":"Encyclopedia of Database Systems","author":"Liu","year":"2009"},{"key":"2026040314424049100_ref122","article-title":"RoBERTa: A robustly optimized BERT pretraining approach","volume-title":"arXiv preprint arXiv:1907.11692","author":"Liu","year":"2019"},{"key":"2026040314424049100_ref123","doi-asserted-by":"publisher","first-page":"412","DOI":"10.18653\/v1\/D15-1166","volume-title":"Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing","author":"Luong","year":"2015"},{"key":"2026040314424049100_ref124","article-title":"Multi-document summarization via deep learning techniques: A survey","volume-title":"arXiv preprint arXiv:2011.04843","author":"Ma","year":"2020"},{"key":"2026040314424049100_ref125","article-title":"Self-Refine: Iterative Refinement with Self-Feedback","volume-title":"arXiv: 2303.17651 [cs.CL]","author":"Madaan","year":"2023"},{"key":"2026040314424049100_ref126","article-title":"Generating followup questions for interpretable multi-hop question answering","volume-title":"arXiv preprint arXiv:2002.12344","author":"Malon","year":"2020"},{"key":"2026040314424049100_ref127","doi-asserted-by":"crossref","DOI":"10.1016\/j.neunet.2024.106550","article-title":"Arithmetic with Language Models: from Memorization to Computation","volume-title":"arXiv: 2308.01154 [cs.AI]","author":"Maltoni","year":"2024"},{"key":"2026040314424049100_ref128","doi-asserted-by":"publisher","first-page":"178","DOI":"10.18653\/v1\/2023.nllp-1.18","volume-title":"Proceedings of the Natural Legal Language Processing Workshop 2023","author":"Mavi","year":"2023"},{"key":"2026040314424049100_ref129","doi-asserted-by":"publisher","first-page":"38","DOI":"10.1007\/978-1-4419-9863-7_209","volume-title":"Encyclopedia of Systems 
Biology","author":"Melo","year":"2013"},{"key":"2026040314424049100_ref130","doi-asserted-by":"crossref","first-page":"2381","DOI":"10.18653\/v1\/D18-1260","article-title":"Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Mihaylov","year":"2018"},{"issue":"11","key":"2026040314424049100_ref131","doi-asserted-by":"publisher","first-page":"39","DOI":"10.1145\/219717.219748","article-title":"WordNet: A Lexical Database for English","volume":"38","author":"Miller","year":"1995","journal-title":"Commun. ACM"},{"key":"2026040314424049100_ref132","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2020.emnlp-main.466","article-title":"AmbigQA: Answering ambiguous open-domain questions","volume-title":"arXiv preprint arXiv:2004.10645","author":"Min","year":"2020"},{"key":"2026040314424049100_ref133","doi-asserted-by":"publisher","first-page":"4249","DOI":"10.18653\/v1\/P19-1416","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Min","year":"2019"},{"key":"2026040314424049100_ref134","doi-asserted-by":"publisher","first-page":"1725","DOI":"10.18653\/v1\/P18-1160","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Min","year":"2018"},{"key":"2026040314424049100_ref135","doi-asserted-by":"publisher","first-page":"6097","DOI":"10.18653\/v1\/P19-1613","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Min","year":"2019"},{"issue":"3","key":"2026040314424049100_ref136","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1016\/j.jksuci.2014.10.007","article-title":"A survey on question answering systems with classification","volume":"28","author":"Mishra","year":"2016","journal-title":"Journal of King Saud 
University-Computer and Information Sciences"},{"key":"2026040314424049100_ref137","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2023.findings-emnlp.972","article-title":"Drilling Down into the Discourse Structure with LLMs for Long Document Question Answering","volume-title":"arXiv: 2311.13565 [cs.CL]","author":"Nair","year":"2023"},{"key":"2026040314424049100_ref138","doi-asserted-by":"publisher","first-page":"280","DOI":"10.18653\/v1\/K16-1028","volume-title":"Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning","author":"Nallapati","year":"2016"},{"key":"2026040314424049100_ref139","first-page":"1191","article-title":"Abstractive unsupervised multi-document summarization using paraphrastic sentence fusion","volume-title":"Proceedings of the 27th International Conference on Computational Linguistics","author":"Nayeem","year":"2018"},{"key":"2026040314424049100_ref140","doi-asserted-by":"publisher","first-page":"3950","DOI":"10.18653\/v1\/D18-1429","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Nema","year":"2018"},{"key":"2026040314424049100_ref141","doi-asserted-by":"publisher","first-page":"2241","DOI":"10.18653\/v1\/D17-1238","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing","author":"Novikova","year":"2017"},{"key":"2026040314424049100_ref142","article-title":"Training language models to follow instructions with human feedback","volume-title":"arXiv: 2203.02155 [cs.CL]","author":"Ouyang","year":"2022"},{"key":"2026040314424049100_ref143","first-page":"5866","article-title":"Unsupervised Multi-hop Question Answering by Question Generation","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Pan","year":"2021"},{"key":"2026040314424049100_ref144","first-page":"311","article-title":"Bleu: a 
method for automatic evaluation of machine translation","volume-title":"Proceedings of the 40th annual meeting of the Association for Computational Linguistics","author":"Papineni","year":"2002"},{"key":"2026040314424049100_ref145","first-page":"2080","article-title":"Are NLP Models really able to Solve Simple Math Word Problems?","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Patel","year":"2021"},{"key":"2026040314424049100_ref146","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2022.emnlp-main.302","article-title":"Is a Question Decomposition Unit All We Need?","volume-title":"arXiv: 2205.12538 [cs.CL]","author":"Patel","year":"2022"},{"key":"2026040314424049100_ref147","doi-asserted-by":"crossref","first-page":"1532","DOI":"10.3115\/v1\/D14-1162","article-title":"Glove: Global vectors for word representation","volume-title":"Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP)","author":"Pennington","year":"2014"},{"key":"2026040314424049100_ref148","doi-asserted-by":"publisher","first-page":"2227","DOI":"10.18653\/v1\/N18-1202","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)","author":"Peters","year":"2018"},{"key":"2026040314424049100_ref149","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1261","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Qi","year":"2019"},{"key":"2026040314424049100_ref150","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1617","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational 
Linguistics","author":"Qiu","year":"2019"},{"issue":"8","key":"2026040314424049100_ref151","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford","year":"2019","journal-title":"OpenAI blog"},{"key":"2026040314424049100_ref152","article-title":"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer","volume-title":"CoRR","author":"Raffel","year":"2019"},{"key":"2026040314424049100_ref153","article-title":"Navigating the Fermi Multiverse: Assessing LLMs for Complex Multi-hop Queries","author":"Rahgouy","year":"2023"},{"key":"2026040314424049100_ref154","doi-asserted-by":"publisher","first-page":"2383","DOI":"10.18653\/v1\/D16-1264","volume-title":"Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing","author":"Rajpurkar","year":"2016"},{"key":"2026040314424049100_ref155","doi-asserted-by":"crossref","first-page":"2383","DOI":"10.18653\/v1\/D16-1264","article-title":"SQuAD: 100,000+ Questions for Machine Comprehension of Text","volume-title":"Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing","author":"Rajpurkar","year":"2016"},{"key":"2026040314424049100_ref156","doi-asserted-by":"crossref","DOI":"10.1109\/CVPR.2017.131","article-title":"Self-Critical Sequence Training for Image Captioning","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Rennie","year":"2017"},{"key":"2026040314424049100_ref157","doi-asserted-by":"publisher","DOI":"10.2200\/S0113ED1V01Y202109ICR076","volume-title":"Question Answering for the Curated Web: Tasks and Methods in QA over Knowledge Bases and Text Collections. Synthesis Lectures on Information Concepts, Retrieval, and Services","author":"Roy","year":"2021"},{"key":"2026040314424049100_ref158","volume-title":"Tech. 
rep","author":"Rumelhart","year":"1985"},{"key":"2026040314424049100_ref159","article-title":"Stronger Transformers for Neural Multi-Hop Question Generation","volume-title":"arXiv preprint arXiv:2010.11374","author":"Sachan","year":"2020"},{"key":"2026040314424049100_ref160","doi-asserted-by":"publisher","first-page":"453","DOI":"10.18653\/v1\/P16-1043","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Sachan","year":"2016"},{"key":"2026040314424049100_ref161","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/D18-1233","article-title":"Interpretation of Natural Language Rules in Conversational Machine Reading","volume-title":"EMNLP","author":"Saeidi","year":"2018"},{"key":"2026040314424049100_ref162","article-title":"Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models","volume-title":"arXiv preprint arXiv:1708.08296","author":"Samek","year":"2017"},{"key":"2026040314424049100_ref163","article-title":"Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought","volume-title":"The Eleventh International Conference on Learning Representations","author":"Saparov","year":"2023"},{"key":"2026040314424049100_ref164","article-title":"Modeling Relational Data with Graph Convolutional Networks (2017)","volume-title":"Preprint","author":"Schlichtkrull","year":"2017"},{"key":"2026040314424049100_ref165","doi-asserted-by":"crossref","DOI":"10.1108\/IDD-06-2018-0022","article-title":"Introducing mathqa: a math-aware question answering system","volume-title":"Information Discovery and Delivery","author":"Schubotz","year":"2018"},{"key":"2026040314424049100_ref166","doi-asserted-by":"publisher","first-page":"6027","DOI":"10.18653\/v1\/P19-1604","volume-title":"Proceedings of the 57th Annual Meeting of Association for Computational 
Linguistics","author":"Scialom","year":"2019"},{"key":"2026040314424049100_ref167","doi-asserted-by":"crossref","first-page":"1073","DOI":"10.18653\/v1\/P17-1099","article-title":"Get To The Point: Summarization with Pointer-Generator Networks","volume-title":"Proceedings of the 55th Annual Meeting of Association for Computational Linguistics (Volume 1: Long Papers)","author":"See","year":"2017"},{"key":"2026040314424049100_ref168","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/P17-1099","article-title":"Get to the point: Summarization with pointer-generator networks","volume-title":"arXiv preprint arXiv:1704.04368","author":"See","year":"2017"},{"key":"2026040314424049100_ref169","article-title":"Bidirectional Attention Flow for Machine Comprehension","volume-title":"CoRR","author":"Seo","year":"2016"},{"key":"2026040314424049100_ref170","doi-asserted-by":"crossref","first-page":"7187","DOI":"10.18653\/v1\/2020.emnlp-main.583","article-title":"Is Graph Structure Necessary for Multi-hop Question Answering?","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Shao","year":"2020"},{"key":"2026040314424049100_ref171","article-title":"Memory augmented sequential paragraph retrieval for multi-hop question answering","volume-title":"arXiv preprint arXiv:2102.03741","author":"Shao","year":"2021"},{"key":"2026040314424049100_ref172","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2024.acl-long.594","article-title":"Exploring Hybrid Question Answering via Program-based Prompting","volume-title":"arXiv: 2402.10812 [cs.CL]","author":"Shi","year":"2024"},{"key":"2026040314424049100_ref173","doi-asserted-by":"crossref","first-page":"58","DOI":"10.18653\/v1\/2021.sustainlp-1.7","article-title":"Combining Lexical and Dense Retrieval for Computationally Efficient Multi-hop Question Answering","volume-title":"Proceedings of Second Workshop on Simple and Efficient Natural Language 
Processing","author":"Sidiropoulos","year":"2021"},{"key":"2026040314424049100_ref174","doi-asserted-by":"publisher","first-page":"58","DOI":"10.18653\/v1\/2021.sustainlp-1.7","volume-title":"Proceedings of Second Workshop on Simple and Efficient Natural Language Processing","author":"Sidiropoulos","year":"2021"},{"key":"2026040314424049100_ref175","doi-asserted-by":"publisher","first-page":"3607","DOI":"10.18653\/v1\/2023.emnlp-main.220","volume-title":"Proceedings of 2023 Conference on Empirical Methods in Natural Language Processing","author":"Slobodkin","year":"2023"},{"issue":"6","key":"2026040314424049100_ref176","doi-asserted-by":"crossref","first-page":"635","DOI":"10.1016\/j.jksuci.2018.08.005","article-title":"A literature review on question answering techniques, paradigms and systems","volume":"32","author":"Soares","year":"2020","journal-title":"Journal of King Saud University-Computer and Information Sciences"},{"key":"2026040314424049100_ref177","article-title":"Exploring graph-structured passage representation for multi-hop reading comprehension with graph neural networks","volume-title":"arXiv preprint arXiv:1809.02040","author":"Song","year":"2018"},{"key":"2026040314424049100_ref178","article-title":"ConceptNet 5.5: An Open Multilingual Graph of General Knowledge. 
Singh 2002 (2016)","volume-title":"arXiv preprint arxiv:1612.03975","author":"Speer","year":"2016"},{"key":"2026040314424049100_ref179","first-page":"75","volume-title":"International Conference on Computer Vision and Image Processing","author":"Srivastava","year":"2020"},{"key":"2026040314424049100_ref180","doi-asserted-by":"crossref","first-page":"21","DOI":"10.18653\/v1\/D19-5403","article-title":"Abstractive timeline summarization","volume-title":"Proceedings of 2nd Workshop on New Frontiers in Summarization","author":"Steen","year":"2019"},{"key":"2026040314424049100_ref181","doi-asserted-by":"crossref","first-page":"4636","DOI":"10.18653\/v1\/2020.findings-emnlp.416","article-title":"Multihop Question Generation with Graph Convolutional Network","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2020","author":"Su","year":"2020"},{"key":"2026040314424049100_ref182","article-title":"Iterative Hierarchical Attention for Answering Complex Questions over Long Documents","volume-title":"arXiv preprint arXiv:2106.00200","author":"Sun","year":"2021"},{"key":"2026040314424049100_ref183","doi-asserted-by":"publisher","first-page":"641","DOI":"10.18653\/v1\/N18-1059","volume-title":"Proceedings of 2018 Conference of North American Chapter of Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)","author":"Talmor","year":"2018"},{"key":"2026040314424049100_ref184","article-title":"Evaluation of ChatGPT as a Question Answering System for Answering Complex Questions","volume-title":"arXiv: 2303.07992 [cs.CL]","author":"Tan","year":"2023"},{"key":"2026040314424049100_ref185","article-title":"Question Answering and Question Generation as Dual Tasks","volume-title":"CoRR","author":"Tang","year":"2017"},{"key":"2026040314424049100_ref186","doi-asserted-by":"publisher","first-page":"3244","DOI":"10.18653\/v1\/2021.eacl-main.283","volume-title":"Proceedings of the 16th Conference of the European Chapter of 
Association for Computational Linguistics: Main Volume","author":"Tang","year":"2021"},{"key":"2026040314424049100_ref187","doi-asserted-by":"crossref","first-page":"42","DOI":"10.18653\/v1\/D19-5306","article-title":"Identifying Supporting Facts for Multi-hop Question Answering with Document Graph Networks","volume-title":"Proceedings of Thirteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-13)","author":"Thayaparan","year":"2019"},{"key":"2026040314424049100_ref188","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2023.emnlp-main.330","article-title":"Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback","volume-title":"arXiv: 2305.14975 [cs.CL]","author":"Tian","year":"2023"},{"key":"2026040314424049100_ref189","article-title":"NewsQA: A Machine Comprehension Dataset","volume-title":"CoRR","author":"Trischler","year":"2016"},{"key":"2026040314424049100_ref190","doi-asserted-by":"publisher","first-page":"8846","DOI":"10.18653\/v1\/2020.emnlp-main.712","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Trivedi","year":"2020"},{"key":"2026040314424049100_ref191","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2023.acl-long.557","article-title":"Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions","volume-title":"arXiv: 2212.10509 [cs.CL]","author":"Trivedi","year":"2023"},{"key":"2026040314424049100_ref192","first-page":"2948","article-title":"Repurposing Entailment for Multi-Hop Question Answering Tasks","volume-title":"Proceedings of 2019 Conference of North American Chapter of Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short 
Papers)","author":"Trivedi","year":"2019"},{"issue":"05","key":"2026040314424049100_ref193","doi-asserted-by":"crossref","first-page":"9073","DOI":"10.1609\/aaai.v34i05.6441","article-title":"Select, answer and explain: Interpretable multi-hop reading comprehension over multiple documents","volume":"34","author":"Tu","year":"2020","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"key":"2026040314424049100_ref194","article-title":"Representation learning with contrastive predictive coding","volume-title":"arXiv e-prints","author":"Van den Oord","year":"2018"},{"key":"2026040314424049100_ref195","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Advances in neural information processing systems"},{"key":"2026040314424049100_ref196","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Advances in neural information processing systems"},{"key":"2026040314424049100_ref197","article-title":"Cider: consensusbased image description evaluation. 
CoRR","volume-title":"arXiv preprint arXiv: 1411.5726","author":"Vedantam","year":"2014"},{"key":"2026040314424049100_ref198","article-title":"Graph Attention Networks","volume-title":"International Conference on Learning Representations","author":"Veli\u0107kovi\u0107","year":"2018"},{"key":"2026040314424049100_ref199","article-title":"Mesh-Transformer-JAX: Model-Parallel Implementation of Transformer Language Model with JAX","author":"Wang","year":"2021"},{"key":"2026040314424049100_ref200","doi-asserted-by":"crossref","first-page":"91","DOI":"10.18653\/v1\/D19-5813","article-title":"Do Multi-hop Readers Dream of Reasoning Chains?","volume-title":"Proceedings of 2nd Workshop on Machine Reading for Question Answering","author":"Wang","year":"2019"},{"key":"2026040314424049100_ref201","first-page":"774","volume-title":"Advances in Information Retrieval - 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14-17, 2020, Proceedings, Part I","author":"Wang","year":"2020"},{"issue":"1","key":"2026040314424049100_ref202","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1007\/s10791-020-09387-9","article-title":"Improving question answering for event-focused questions in temporal collections of news articles","volume":"24","author":"Wang","year":"2021","journal-title":"Inf. Retr. 
J."},{"key":"2026040314424049100_ref203","article-title":"ArchivalQA: A Large-scale Benchmark Dataset for Open Domain Question Answering over Archival News Collections","volume-title":"CoRR","author":"Wang","year":"2021"},{"key":"2026040314424049100_ref204","first-page":"398","article-title":"Event Occurrence Date Estimation based on Multivariate Time Series Analysis over Temporal Document Collections","volume-title":"SIGIR \u201921: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, Canada, July 11-15, 2021","author":"Wang","year":"2021"},{"key":"2026040314424049100_ref205","article-title":"Machine comprehension using match-LSTM and answer pointer","volume-title":"arXiv preprint arXiv:1608.07905","author":"Wang","year":"2016"},{"key":"2026040314424049100_ref206","article-title":"Entailment as Few-Shot Learner","volume-title":"arXiv: 2104.14690 [cs.CL]","author":"Wang","year":"2021"},{"key":"2026040314424049100_ref207","article-title":"Self-Consistency Improves Chain of Thought Reasoning in Language Models","volume-title":"arXiv: 2203.11171 [cs.CL]","author":"Wang","year":"2023"},{"key":"2026040314424049100_ref208","article-title":"Chain-of-Thought Prompting Elicits Reasoning in Large Language Models","volume-title":"arXiv: 2201.11903 [cs.CL]","author":"Wei","year":"2023"},{"key":"2026040314424049100_ref209","article-title":"Extending Multi-Text Sentence Fusion Resources via Pyramid Annotations","volume-title":"arXiv preprint arXiv:2110.04517","author":"Weiss","year":"2021"},{"key":"2026040314424049100_ref210","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1162\/tacl_a_00021","article-title":"Constructing datasets for multi-hop reading comprehension across documents","volume":"6","author":"Welbl","year":"2018","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2026040314424049100_ref211","article-title":"Neural Text Generation With Unlikelihood 
Training","volume-title":"International Conference on Learning Representations","author":"Welleck","year":"2019"},{"key":"2026040314424049100_ref212","doi-asserted-by":"publisher","first-page":"1112","DOI":"10.18653\/v1\/N18-1101","volume-title":"Proceedings of 2018 Conference of North American Chapter of Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)","author":"Williams","year":"2018"},{"issue":"3-4","key":"2026040314424049100_ref213","doi-asserted-by":"publisher","first-page":"229","DOI":"10.1007\/BF00992696","article-title":"Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning","volume":"8","author":"Williams","year":"1992","journal-title":"Mach. Learn."},{"issue":"suppl_1","key":"2026040314424049100_ref214","doi-asserted-by":"crossref","first-page":"D901","DOI":"10.1093\/nar\/gkm958","article-title":"DrugBank: a knowledgebase for drugs, drug actions and drug targets","volume":"36","author":"Wishart","year":"2008","journal-title":"Nucleic acids research"},{"key":"2026040314424049100_ref215","article-title":"GenDec: A robust generative Question-decomposition method for Multi-hop reasoning","volume-title":"arXiv: 2402.11166 [cs.CL]","author":"Wu","year":"2024"},{"key":"2026040314424049100_ref216","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1016\/j.cviu.2017.05.001","article-title":"Visual question answering: A survey of methods and datasets","volume":"163","author":"Wu","year":"2017","journal-title":"Computer Vision and Image Understanding"},{"key":"2026040314424049100_ref217","article-title":"Google\u2019s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation","volume-title":"CoRR","author":"Wu","year":"2016"},{"issue":"1","key":"2026040314424049100_ref218","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1109\/TNNLS.2020.2978386","article-title":"A comprehensive survey on graph neural 
networks","volume":"32","author":"Wu","year":"2020","journal-title":"IEEE transactions on neural networks and learning systems"},{"key":"2026040314424049100_ref219","article-title":"Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval","volume-title":"CoRR","author":"Xiong","year":"2020"},{"key":"2026040314424049100_ref220","doi-asserted-by":"crossref","first-page":"48","DOI":"10.18653\/v1\/D19-5806","article-title":"Simple yet Effective Bridge Reasoning for Open-Domain Multi-Hop Question Answering","volume-title":"Proceedings of 2nd Workshop on Machine Reading for Question Answering","author":"Xiong","year":"2019"},{"key":"2026040314424049100_ref221","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2021.findings-emnlp.99","article-title":"Exploiting Reasoning Chains for Multi-hop Science Question Answering","volume-title":"arXiv: 2109.02905 [cs.CL]","author":"Xu","year":"2021"},{"key":"2026040314424049100_ref222","doi-asserted-by":"publisher","first-page":"2681","DOI":"10.18653\/v1\/N19-1274","volume-title":"Proceedings of 2019 Conference of North American Chapter of Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Yadav","year":"2019"},{"key":"2026040314424049100_ref223","doi-asserted-by":"publisher","first-page":"2578","DOI":"10.18653\/v1\/D19-1260","volume-title":"Proceedings of 2019 Conference of Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Yadav","year":"2019"},{"key":"2026040314424049100_ref224","doi-asserted-by":"crossref","first-page":"4514","DOI":"10.18653\/v1\/2020.acl-main.414","article-title":"Unsupervised Alignment-based Iterative Evidence Retrieval for Multi-hop Question Answering","volume-title":"Proceedings of 58th Annual Meeting of Association for Computational 
Linguistics","author":"Yadav","year":"2020"},{"key":"2026040314424049100_ref225","first-page":"4571","article-title":"If You Want to Go Far Go Together: Unsupervised Joint Candidate Evidence Retrieval for Multi-hop Question Answering","volume-title":"Proceedings of 2021 Conference of North American Chapter of Association for Computational Linguistics: Human Language Technologies","author":"Yadav","year":"2021"},{"key":"2026040314424049100_ref226","first-page":"745","article-title":"Evolutionary timeline summarization: a balanced optimization framework via iterative substitution","volume-title":"Proceedings of 34th international ACM SIGIR conference on Research and development in Information Retrieval","author":"Yan","year":"2011"},{"key":"2026040314424049100_ref227","doi-asserted-by":"crossref","first-page":"2369","DOI":"10.18653\/v1\/D18-1259","article-title":"HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering","volume-title":"Proceedings of 2018 Conference on Empirical Methods in Natural Language Processing","author":"Yang","year":"2018"},{"key":"2026040314424049100_ref228","doi-asserted-by":"publisher","first-page":"4546","DOI":"10.24963\/ijcai.2018\/632","volume-title":"Proceedings of Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18","author":"Yao","year":"2018"},{"key":"2026040314424049100_ref229","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2022.acl-long.69","article-title":"Modeling Multi-hop Question Answering as Single Sequence Prediction","volume-title":"arXiv: 2205.09226 [cs.CL]","author":"Yavuz","year":"2022"},{"key":"2026040314424049100_ref230","article-title":"Multi-paragraph reasoning with knowledge-enhanced graph neural network","volume-title":"arXiv preprint arXiv:1911.02170","author":"Ye","year":"2019"},{"key":"2026040314424049100_ref231","doi-asserted-by":"publisher","first-page":"6729","DOI":"10.18653\/v1\/2020.acl-main.601","volume-title":"Proceedings of 58th Annual Meeting of 
Association for Computational Linguistics","author":"Yu","year":"2020"},{"key":"2026040314424049100_ref232","first-page":"377","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Yu","year":"2021"},{"key":"2026040314424049100_ref233","article-title":"Conversational question answering: A survey","volume-title":"arXiv preprint arXiv:2106.00874","author":"Zaib","year":"2021"},{"key":"2026040314424049100_ref234","article-title":"STaR: Bootstrapping Reasoning With Reasoning","volume-title":"arXiv: 2203.14465 [cs.LG]","author":"Zelikman","year":"2022"},{"key":"2026040314424049100_ref235","doi-asserted-by":"publisher","first-page":"56755","DOI":"10.1109\/ACCESS.2020.2981134","article-title":"Coarse and Fine Granularity Graph Reasoning for Interpretable Multi-Hop Question Answering","volume":"8","author":"Zhang","year":"2020","journal-title":"IEEE Access"},{"key":"2026040314424049100_ref236","doi-asserted-by":"publisher","first-page":"584","DOI":"10.18653\/v1\/D17-1062","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing","author":"Zhang","year":"2017"},{"key":"2026040314424049100_ref237","first-page":"481","article-title":"Answering Any-hop Open-domain Questions with Iterative Document Reranking","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Zhang","year":"2021"},{"key":"2026040314424049100_ref238","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2023.acl-long.320","article-title":"Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework","volume-title":"arXiv: 2305.03268 [cs.CL]","author":"Zhao","year":"2023"},{"key":"2026040314424049100_ref239","article-title":"A Survey of Large Language Models","volume-title":"arXiv: 2303.18223 
[cs.CL]","author":"Zhao","year":"2023"},{"key":"2026040314424049100_ref240","doi-asserted-by":"publisher","first-page":"3901","DOI":"10.18653\/v1\/D18-1424","volume-title":"Proceedings of 2018 Conference on Empirical Methods in Natural Language Processing","author":"Zhao","year":"2018"},{"key":"2026040314424049100_ref241","doi-asserted-by":"publisher","first-page":"3901","DOI":"10.18653\/v1\/D18-1424","volume-title":"Proceedings of 2018 Conference on Empirical Methods in Natural Language Processing","author":"Zhao","year":"2018"},{"key":"2026040314424049100_ref242","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2022.emnlp-main.142","article-title":"Learning to Decompose: Hypothetical Question Decomposition Based on Comparable Texts","volume-title":"arXiv: 2210.16865 [cs.CL]","author":"Zhou","year":"2022"},{"key":"2026040314424049100_ref243","article-title":"Least-to-Most Prompting Enables Complex Reasoning in Large Language Models","volume-title":"The Eleventh International Conference on Learning Representations","author":"Zhou","year":"2023"},{"key":"2026040314424049100_ref244","article-title":"Retrieving and reading: A comprehensive survey on open-domain question answering","volume-title":"arXiv preprint arXiv:2101.00774","author":"Zhu","year":"2021"}],"container-title":["Foundations and Trends\u00ae in Information 
Retrieval"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/ftinr\/article-pdf\/17\/5\/457\/11155542\/1500000102en.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/www.emerald.com\/ftinr\/article-pdf\/17\/5\/457\/11155542\/1500000102en.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T18:45:56Z","timestamp":1775241956000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.emerald.com\/ftinr\/article\/17\/5\/457\/1332410\/Multi-hop-Question-Answering"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,13]]},"references-count":244,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2024,6,13]]}},"URL":"https:\/\/doi.org\/10.1561\/1500000102","relation":{},"ISSN":["1554-0669","1554-0677"],"issn-type":[{"value":"1554-0669","type":"print"},{"value":"1554-0677","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6,13]]}}}