{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T04:23:18Z","timestamp":1776399798535,"version":"3.51.2"},"reference-count":216,"publisher":"MIT Press","issue":"3","license":[{"start":{"date-parts":[[2024,6,10]],"date-time":"2024-06-10T00:00:00Z","timestamp":1717977600000},"content-version":"vor","delay-in-days":161,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Task semantics can be expressed by a set of input-output examples or a piece of textual instruction. Conventional machine learning approaches for natural language processing (NLP) mainly rely on the availability of large-scale sets of task-specific examples. Two issues arise: First, collecting task-specific labeled examples does not apply to scenarios where tasks may be too complicated or costly to annotate, or the system is required to handle a new task immediately; second, this is not user-friendly since end-users are probably more willing to provide task description rather than a set of examples before using the system. Therefore, the community is paying increasing interest in a new supervision-seeking paradigm for NLP: learning to follow task instructions, that is, instruction following. Despite its impressive progress, there are some unsolved research questions that the community struggles with. This survey tries to summarize and provide insights into the current research on instruction following, particularly, by answering the following questions: (i) What is task instruction, and what instruction types exist? (ii) How should we model instructions? (iii) What are popular instruction following datasets and evaluation metrics? 
(iv) What factors influence and explain the instructions\u2019 performance? (v) What challenges remain in instruction following? To our knowledge, this is the first comprehensive survey about instruction following.<\/jats:p>","DOI":"10.1162\/coli_a_00523","type":"journal-article","created":{"date-parts":[[2024,6,10]],"date-time":"2024-06-10T14:19:39Z","timestamp":1718029179000},"page":"1053-1095","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":22,"title":["Large Language Model Instruction Following: A Survey of Progresses and Challenges"],"prefix":"10.1162","volume":"50","author":[{"given":"Renze","family":"Lou","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, The Pennsylvania State University, USA. renze.lou@psu.edu"}]},{"given":"Kai","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, The Ohio State University, USA. zhang.13253@osu.edu"}]},{"given":"Wenpeng","family":"Yin","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, The Pennsylvania State University, USA. 
wenpeng@psu.edu"}]}],"member":"281","published-online":{"date-parts":[[2024,9,1]]},"reference":[{"key":"2024092014244379600_bib1","first-page":"3731","article-title":"Communicating natural programs to humans and machines","volume-title":"Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022","author":"Acquaviva","year":"2022"},{"key":"2024092014244379600_bib2","article-title":"ExT5: Towards extreme multi-task scaling for transfer learning","volume-title":"The Tenth International Conference on Learning Representations, ICLR 2022","author":"Aribandi","year":"2022"},{"key":"2024092014244379600_bib3","article-title":"Massively multilingual neural machine translation in the wild: Findings and challenges","author":"Arivazhagan","year":"2019","journal-title":"CoRR"},{"key":"2024092014244379600_bib4","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1162\/tacl_a_00209","article-title":"Weakly supervised learning of semantic parsers for mapping instructions to actions","volume":"1","author":"Artzi","year":"2013","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2024092014244379600_bib5","first-page":"1","article-title":"Learning to interpret natural language instructions","volume-title":"Proceedings of the Second Workshop on Semantic Interpretation in an Actionable Context","author":"Babe\u015f-Vroman","year":"2012"},{"key":"2024092014244379600_bib6","doi-asserted-by":"publisher","first-page":"93","DOI":"10.18653\/v1\/2022.acl-demo.9","article-title":"PromptSource: An integrated development environment and repository for natural language prompts","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations","author":"Bach","year":"2022"},{"key":"2024092014244379600_bib7","article-title":"Training a helpful and harmless assistant with reinforcement learning from human 
feedback","author":"Bai","year":"2022","journal-title":"CoRR"},{"key":"2024092014244379600_bib8","article-title":"Constitutional AI: Harmlessness from AI feedback","author":"Bai","year":"2022","journal-title":"CoRR"},{"key":"2024092014244379600_bib9","article-title":"The poison of alignment","author":"Bekbayev","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib10","doi-asserted-by":"publisher","first-page":"679","DOI":"10.1512\/iumj.1957.6.56038","article-title":"A Markovian decision process","author":"Bellman","year":"1957","journal-title":"Journal of Mathematics and Mechanics"},{"key":"2024092014244379600_bib11","doi-asserted-by":"publisher","first-page":"17682","DOI":"10.1609\/aaai.v38i16.29720","article-title":"Graph of thoughts: Solving elaborate problems with large language models","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Besta","year":"2024"},{"key":"2024092014244379600_bib12","doi-asserted-by":"publisher","first-page":"751","DOI":"10.18653\/v1\/N16-1089","article-title":"Natural language communication with robots","volume-title":"Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Bisk","year":"2016"},{"key":"2024092014244379600_bib13","doi-asserted-by":"publisher","first-page":"632","DOI":"10.18653\/v1\/D15-1075","article-title":"A large annotated corpus for learning natural language inference","volume-title":"Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing","author":"Bowman","year":"2015"},{"key":"2024092014244379600_bib14","first-page":"268","article-title":"Learning to win by reading manuals in a Monte-Carlo framework","volume-title":"Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language 
Technologies","author":"Branavan","year":"2011"},{"key":"2024092014244379600_bib15","first-page":"1268","article-title":"Reading between the lines: Learning to map high-level instructions to commands","volume-title":"Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics","author":"Branavan","year":"2010"},{"key":"2024092014244379600_bib16","article-title":"SMASH: one-shot model architecture search through hypernetworks","volume-title":"6th International Conference on Learning Representations, ICLR 2018","author":"Brock","year":"2018"},{"key":"2024092014244379600_bib17","first-page":"1877","article-title":"Language models are few-shot learners","volume-title":"Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020","author":"Brown","year":"2020"},{"key":"2024092014244379600_bib18","article-title":"Discovering latent knowledge in language models without supervision","volume-title":"The Eleventh International Conference on Learning Representations","author":"Burns","year":"2023"},{"key":"2024092014244379600_bib19","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1086\/461846","article-title":"Cognitively guided instruction: A knowledge base for reform in primary mathematics instruction","volume":"97","author":"Carpenter","year":"1996","journal-title":"The Elementary School Journal"},{"key":"2024092014244379600_bib20","doi-asserted-by":"publisher","first-page":"6848","DOI":"10.18653\/v1\/2022.emnlp-main.460","article-title":"Help me write a poem: Instruction tuning as a vehicle for collaborative poetry writing","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing","author":"Chakrabarty","year":"2022"},{"key":"2024092014244379600_bib21","first-page":"430","article-title":"Fast online lexicon learning for grounded language acquisition","volume-title":"Proceedings of the 50th Annual Meeting of the Association for 
Computational Linguistics (Volume 1: Long Papers)","author":"Chen","year":"2012"},{"key":"2024092014244379600_bib22","doi-asserted-by":"publisher","first-page":"128","DOI":"10.1145\/1390156.1390173","article-title":"Learning to sportscast: A test of grounded language acquisition","volume-title":"Machine Learning, Proceedings of the Twenty-Fifth International Conference (ICML 2008)","author":"Chen","year":"2008"},{"key":"2024092014244379600_bib23","doi-asserted-by":"publisher","first-page":"859","DOI":"10.1609\/aaai.v25i1.7974","article-title":"Learning to interpret natural language navigation instructions from observations","volume-title":"Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2011","author":"Chen","year":"2011"},{"key":"2024092014244379600_bib24","article-title":"Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks","author":"Chen","year":"2022","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib25","doi-asserted-by":"publisher","first-page":"2778","DOI":"10.1145\/3485447.3511998","article-title":"KnowPrompt: Knowledge-aware prompt-tuning with synergistic optimization for relation extraction","volume-title":"Proceedings of the ACM Web Conference 2022","author":"Chen","year":"2022"},{"key":"2024092014244379600_bib26","article-title":"INSTRUCTEVAL: Towards holistic evaluation of instruction-tuned large language models","author":"Chia","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib27","first-page":"240:1\u2013240:113","article-title":"PaLM: Scaling language modeling with pathways","volume":"24","author":"Chowdhery","year":"2023","journal-title":"Journal of Machine Learning Research"},{"key":"2024092014244379600_bib28","article-title":"Scaling instruction-finetuned language models","author":"Chung","year":"2022","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib29","first-page":"18","article-title":"Driving semantic 
parsing from the world\u2019s response","volume-title":"Proceedings of the Fourteenth Conference on Computational Natural Language Learning","author":"Clarke","year":"2010"},{"key":"2024092014244379600_bib30","doi-asserted-by":"publisher","first-page":"1835","DOI":"10.18653\/v1\/2021.findings-acl.161","article-title":"Template-based named entity recognition using BART","volume-title":"Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021","author":"Cui","year":"2021"},{"key":"2024092014244379600_bib31","doi-asserted-by":"publisher","first-page":"6792","DOI":"10.18653\/v1\/2022.emnlp-main.456","article-title":"Boosting natural language generation from instructions with meta-learning","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing","author":"Deb","year":"2022"},{"key":"2024092014244379600_bib32","doi-asserted-by":"publisher","first-page":"3369","DOI":"10.18653\/v1\/2022.emnlp-main.222","article-title":"RLPrompt: Optimizing discrete text prompts with reinforcement learning","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing","author":"Deng","year":"2022"},{"key":"2024092014244379600_bib33","first-page":"28091","article-title":"Mind2Web: Towards a generalist agent for the web","volume-title":"Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023","author":"Deng","year":"2023"},{"key":"2024092014244379600_bib34","first-page":"4171","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short 
Papers)","author":"Devlin","year":"2019"},{"key":"2024092014244379600_bib35","doi-asserted-by":"publisher","first-page":"3029","DOI":"10.18653\/v1\/2023.emnlp-main.183","article-title":"Enhancing chat language models by scaling high-quality instructional conversations","volume-title":"Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, December 6\u201310, 2023","author":"Ding","year":"2023"},{"key":"2024092014244379600_bib36","article-title":"A survey on in-context learning","volume":"abs\/2301.00234","author":"Dong","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib37","first-page":"30039","article-title":"AlpacaFarm: A simulation framework for methods that learn from human feedback","volume-title":"Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023","author":"Dubois","year":"2023"},{"issue":"5","key":"2024092014244379600_bib38","doi-asserted-by":"publisher","first-page":"716","DOI":"10.1037\/xhp0000481","article-title":"How does \u201cnot left\u201d become \u201cright\u201d? 
Electrophysiological evidence for a dynamic conflict-bound negation processing account","volume":"44","author":"Dudschig","year":"2018","journal-title":"Journal of Experimental Psychology: Human Perception and Performance"},{"key":"2024092014244379600_bib39","article-title":"EditEval: An instruction-based benchmark for text improvements","author":"Dwivedi-Yu","year":"2022","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib40","article-title":"The Turking Test: Can language models understand instructions?","author":"Efrat","year":"2020","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib41","doi-asserted-by":"publisher","first-page":"958","DOI":"10.3115\/1699571.1699637","article-title":"Reading to learn: Constructing features from semantic abstracts","volume-title":"Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing","author":"Eisenstein","year":"2009"},{"key":"2024092014244379600_bib42","doi-asserted-by":"publisher","DOI":"10.2307\/749875","article-title":"A longitudinal study of learning to use children\u2019s thinking in mathematics instruction","author":"Fennema","year":"1996","journal-title":"Journal for Research in Mathematics Education"},{"key":"2024092014244379600_bib43","doi-asserted-by":"publisher","first-page":"1066","DOI":"10.18653\/v1\/2023.wmt-1.100","article-title":"The devil is in the errors: Leveraging large language models for fine-grained machine translation evaluation","volume-title":"Proceedings of the Eighth Conference on Machine Translation","author":"Fernandes","year":"2023"},{"key":"2024092014244379600_bib44","doi-asserted-by":"publisher","first-page":"1946","DOI":"10.18653\/v1\/P19-1188","article-title":"Pre-learning environment representations for data-efficient neural instruction following","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational 
Linguistics","author":"Gaddy","year":"2019"},{"key":"2024092014244379600_bib45","doi-asserted-by":"publisher","first-page":"3816","DOI":"10.18653\/v1\/2021.acl-long.295","article-title":"Making pre-trained language models better few-shot learners","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Gao","year":"2021"},{"issue":"1","key":"2024092014244379600_bib46","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/978-3-319-09274-4","article-title":"Artificial general intelligence: Concept, state of the art, and future prospects","volume":"5","author":"Goertzel","year":"2014","journal-title":"Journal of Artificial General Intelligence"},{"key":"2024092014244379600_bib47","first-page":"1794","article-title":"Learning from natural instructions","volume-title":"IJCAI 2011, Proceedings of the 22nd International Joint Conference on Artificial Intelligence","author":"Goldwasser","year":"2011"},{"key":"2024092014244379600_bib48","doi-asserted-by":"publisher","first-page":"10136","DOI":"10.18653\/v1\/2023.findings-emnlp.679","article-title":"Demystifying prompts in language models via perplexity estimation","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2023","author":"Gonen","year":"2023"},{"key":"2024092014244379600_bib49","doi-asserted-by":"publisher","first-page":"13935","DOI":"10.1016\/j.learninstruc.2022.101692","article-title":"Robustness of learning from task instructions","volume-title":"Findings of the Association for Computational Linguistics: ACL 2023","author":"Gu","year":"2023"},{"key":"2024092014244379600_bib50","article-title":"Instruction tuned models are quick learners","author":"Gupta","year":"2023","journal-title":"ArXiv 
preprint"},{"key":"2024092014244379600_bib51","doi-asserted-by":"publisher","first-page":"505","DOI":"10.18653\/v1\/2022.emnlp-main.33","article-title":"InstructDial: Improving zero and few-shot generalization in dialogue through instruction tuning","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing","author":"Gupta","year":"2022"},{"key":"2024092014244379600_bib52","article-title":"HyperNetworks","volume-title":"5th International Conference on Learning Representations, ICLR 2017","author":"Ha","year":"2017"},{"key":"2024092014244379600_bib53","doi-asserted-by":"publisher","first-page":"1884","DOI":"10.18653\/v1\/P18-1175","article-title":"Training classifiers with natural language explanations","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Hancock","year":"2018"},{"key":"2024092014244379600_bib54","article-title":"AnnoLLM: Making large language models to be better crowdsourced annotators","author":"He","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib55","article-title":"Measuring massive multitask language understanding","volume-title":"9th International Conference on Learning Representations, ICLR 2021","author":"Hendrycks","year":"2021"},{"issue":"8","key":"2024092014244379600_bib56","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Computation"},{"key":"2024092014244379600_bib57","doi-asserted-by":"publisher","first-page":"14409","DOI":"10.18653\/v1\/2023.acl-long.806","article-title":"Unnatural instructions: Tuning language models with (almost) no human labor","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 
2023","author":"Honovich","year":"2023"},{"key":"2024092014244379600_bib58","doi-asserted-by":"publisher","first-page":"1935","DOI":"10.18653\/v1\/2023.acl-long.108","article-title":"Instruction induction: From few examples to natural language task descriptions","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023","author":"Honovich","year":"2023"},{"key":"2024092014244379600_bib59","doi-asserted-by":"publisher","first-page":"1301","DOI":"10.18653\/v1\/2021.naacl-main.102","article-title":"Understanding by understanding not: Modeling negation in language models","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Hosseini","year":"2021"},{"key":"2024092014244379600_bib60","first-page":"2790","article-title":"Parameter-efficient transfer learning for NLP","volume-title":"Proceedings of the 36th International Conference on Machine Learning, ICML 2019","author":"Houlsby","year":"2019"},{"key":"2024092014244379600_bib61","article-title":"LoRA: Low-rank adaptation of large language models","volume-title":"The Tenth International Conference on Learning Representations, ICLR 2022","author":"Hu","year":"2022"},{"key":"2024092014244379600_bib62","doi-asserted-by":"publisher","first-page":"2627","DOI":"10.18653\/v1\/2022.findings-emnlp.193","article-title":"In-context learning for few-shot dialogue state tracking","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2022","author":"Hu","year":"2022"},{"key":"2024092014244379600_bib63","doi-asserted-by":"publisher","first-page":"1049","DOI":"10.18653\/v1\/2023.findings-acl.67","article-title":"Towards reasoning in large language models: A survey","volume-title":"Findings of the Association for Computational Linguistics: ACL 
2023","author":"Huang","year":"2023"},{"key":"2024092014244379600_bib64","article-title":"A survey of NLP-related crowdsourcing hits: What works and what does not","author":"Huynh","year":"2021","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib65","doi-asserted-by":"publisher","first-page":"11272","DOI":"10.18653\/v1\/2023.acl-long.631","article-title":"HINT: hypernetwork instruction tuning for efficient zero- and few-shot generalisation","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023","author":"Ivison","year":"2023"},{"key":"2024092014244379600_bib66","article-title":"Camels in a changing climate: Enhancing LM adaptation with Tulu 2","volume":"abs\/2311.10702","author":"Ivison","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib67","article-title":"OPT-IML: Scaling language model instruction meta learning through the lens of generalization","volume":"abs\/2212.12017","author":"Iyer","year":"2022","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib68","doi-asserted-by":"publisher","first-page":"133653","DOI":"10.1109\/ACCESS.2019.2941229","article-title":"Q-learning algorithms: A comprehensive classification and applications","volume":"7","author":"Jang","year":"2019","journal-title":"IEEE Access"},{"key":"2024092014244379600_bib69","first-page":"14702","article-title":"Exploring the benefits of training expert language models over instruction tuning","volume-title":"International Conference on Machine Learning, ICML 2023","author":"Jang","year":"2023"},{"key":"2024092014244379600_bib70","first-page":"52","article-title":"Can large language models truly understand prompts? 
A case study with negated prompts","volume-title":"Transfer Learning for Natural Language Processing Workshop","author":"Jang","year":"2022"},{"key":"2024092014244379600_bib71","doi-asserted-by":"publisher","first-page":"6994","DOI":"10.18653\/v1\/2020.acl-main.625","article-title":"Language to network: Conditional parameter adaptation with natural language descriptions","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Jin","year":"2020"},{"key":"2024092014244379600_bib72","article-title":"Exploiting programmatic behavior of LLMs: Dual-use through standard security attacks","author":"Kang","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib73","doi-asserted-by":"publisher","first-page":"7811","DOI":"10.18653\/v1\/2020.acl-main.698","article-title":"Negated and misprimed probes for pretrained language models: Birds can talk, but cannot fly","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Kassner","year":"2020"},{"key":"2024092014244379600_bib74","article-title":"Turning English-centric LLMs into polyglots: How much multilinguality is needed?","author":"Kew","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib75","doi-asserted-by":"publisher","first-page":"3631","DOI":"10.18653\/v1\/2022.naacl-main.266","article-title":"Prompt waywardness: The curious case of discretized interpretation of continuous prompts","volume-title":"Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Khashabi","year":"2022"},{"key":"2024092014244379600_bib76","doi-asserted-by":"publisher","first-page":"1896","DOI":"10.18653\/v1\/2020.findings-emnlp.171","article-title":"UNIFIEDQA: Crossing format boundaries with a single QA system","volume-title":"Findings of the Association for Computational Linguistics: 
EMNLP 2020","author":"Khashabi","year":"2020"},{"key":"2024092014244379600_bib77","first-page":"433","article-title":"Unsupervised PCFG induction for grounded language learning with highly ambiguous supervision","volume-title":"Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning","author":"Kim","year":"2012"},{"key":"2024092014244379600_bib78","doi-asserted-by":"publisher","first-page":"12685","DOI":"10.18653\/v1\/2023.emnlp-main.782","article-title":"The CoT collection: Improving zero-shot and few-shot learning of language models via chain-of-thought fine-tuning","volume-title":"Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023","author":"Kim","year":"2023"},{"key":"2024092014244379600_bib79","doi-asserted-by":"publisher","first-page":"2676","DOI":"10.18653\/v1\/P18-1249","article-title":"Constituency parsing with a self-attentive encoder","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Kitaev","year":"2018"},{"key":"2024092014244379600_bib80","article-title":"LongForm: Optimizing instruction tuning for long text generation with corpus extraction","author":"K\u00f6ksal","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib81","first-page":"47669","article-title":"OpenAssistant conversations\u2014democratizing large language model alignment","volume-title":"Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023","author":"K\u00f6pf","year":"2023"},{"key":"2024092014244379600_bib82","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1162\/tacl_a_00220","article-title":"Jointly learning to parse and perceive: Connecting natural language to the physical 
world","volume":"1","author":"Krishnamurthy","year":"2013","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2024092014244379600_bib83","first-page":"2468","article-title":"Guiding a reinforcement learner with natural language advice: Initial results in RoboCup soccer","volume-title":"The AAAI-2004 Workshop on Supervisory Control of Learning and Adaptive Systems","author":"Kuhlmann","year":"2004"},{"issue":"11","key":"2024092014244379600_bib84","doi-asserted-by":"publisher","first-page":"2278","DOI":"10.1109\/5.726791","article-title":"Gradient-based learning applied to document recognition","volume":"86","author":"LeCun","year":"1998","journal-title":"Proceedings of the IEEE"},{"key":"2024092014244379600_bib85","doi-asserted-by":"publisher","first-page":"3045","DOI":"10.18653\/v1\/2021.emnlp-main.243","article-title":"The power of scale for parameter-efficient prompt tuning","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Lester","year":"2021"},{"key":"2024092014244379600_bib86","doi-asserted-by":"publisher","first-page":"607","DOI":"10.1162\/tacl_a_00479","article-title":"Ultra-fine entity typing with indirect supervision from natural language inference","volume":"10","author":"Li","year":"2022","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2024092014244379600_bib87","article-title":"MIMIC-IT: Multi-modal in-context instruction tuning","author":"Li","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib88","first-page":"135","article-title":"MAQA: A multimodal QA benchmark for negation","volume-title":"NeurIPS 2022 Workshop on Synthetic Data for Empowering ML Research","author":"Li","year":"2022"},{"key":"2024092014244379600_bib89","doi-asserted-by":"publisher","first-page":"215","DOI":"10.18653\/v1\/2020.acl-demos.25","article-title":"Interactive task learning from GUI-grounded natural 
language instructions and demonstrations","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations","author":"Li","year":"2020"},{"key":"2024092014244379600_bib90","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.findings-emnlp.411","article-title":"Finding supporting examples for in-context learning","author":"Li","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib91","doi-asserted-by":"publisher","first-page":"4582","DOI":"10.18653\/v1\/2021.acl-long.353","article-title":"Prefix-tuning: Optimizing continuous prompts for generation","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Li","year":"2021"},{"key":"2024092014244379600_bib92","doi-asserted-by":"publisher","first-page":"2579","DOI":"10.18653\/v1\/2022.findings-acl.203","article-title":"Prompt-driven neural machine translation","volume-title":"Findings of the Association for Computational Linguistics: ACL 2022","author":"Li","year":"2022"},{"key":"2024092014244379600_bib93","article-title":"Do you really follow me? 
Adversarial instructions for evaluating the robustness of large language models","author":"Li","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib94","article-title":"Scaling down to scale up: A guide to parameter-efficient fine-tuning","author":"Lialin","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib95","doi-asserted-by":"publisher","first-page":"91","DOI":"10.3115\/1687878.1687893","article-title":"Learning semantic correspondences with less supervision","volume-title":"Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP","author":"Liang","year":"2009"},{"key":"2024092014244379600_bib96","first-page":"74","article-title":"ROUGE: A package for automatic evaluation of summaries","volume-title":"Text Summarization Branches Out","author":"Lin","year":"2004"},{"key":"2024092014244379600_bib97","article-title":"RA-DIT: Retrieval-augmented dual instruction tuning","author":"Lin","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib98","doi-asserted-by":"publisher","first-page":"9019","DOI":"10.18653\/v1\/2022.emnlp-main.616","article-title":"Few-shot learning with multilingual generative language models","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022","author":"Lin","year":"2022"},{"key":"2024092014244379600_bib99","first-page":"1950","article-title":"Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning","volume-title":"Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022","author":"Liu","year":"2022"},{"key":"2024092014244379600_bib100","doi-asserted-by":"publisher","first-page":"100","DOI":"10.18653\/v1\/2022.deelio-1.10","article-title":"What makes good in-context examples for 
GPT-3?","volume-title":"Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures","author":"Liu","year":"2022"},{"issue":"9","key":"2024092014244379600_bib101","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3560815","article-title":"Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing","volume":"55","author":"Liu","year":"2023","journal-title":"ACM Computing Surveys"},{"key":"2024092014244379600_bib102","article-title":"From zero to hero: Examining the power of symbolic tasks in instruction tuning","author":"Liu","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib103","article-title":"What makes good data for alignment? A comprehensive study of automatic data selection in instruction tuning","author":"Liu","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib104","article-title":"GPT understands, too","author":"Liu","year":"2021","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib105","article-title":"Automatic instruction optimization for open-source LLM instruction tuning","author":"Liu","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib106","article-title":"Benchmarking generation and evaluation capabilities of large language models for instruction controllable summarization","author":"Liu","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib107","first-page":"22631","article-title":"The Flan Collection: Designing data and methods for effective instruction tuning","volume-title":"International Conference on Machine Learning, ICML 2023","author":"Longpre","year":"2023"},{"key":"2024092014244379600_bib108","article-title":"Forget demonstrations, focus on learning from textual instructions","author":"Lou","year":"2023","journal-title":"ArXiv 
preprint"},{"key":"2024092014244379600_bib109","article-title":"MUFFIN: Curating multi-faceted instructions for improving instruction following","volume-title":"The Twelfth International Conference on Learning Representations","author":"Lou","year":"2024"},{"key":"2024092014244379600_bib110","doi-asserted-by":"publisher","first-page":"8086","DOI":"10.18653\/v1\/2022.acl-long.556","article-title":"Fantastically ordered prompts and where to find them: Overcoming few-shot prompt order sensitivity","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Lu","year":"2022"},{"key":"2024092014244379600_bib111","first-page":"1435","article-title":"A joint model of language and perception for grounded attribute learning","volume-title":"Proceedings of the 29th International Conference on Machine Learning, ICML 2012","author":"Matuszek","year":"2012"},{"key":"2024092014244379600_bib112","doi-asserted-by":"publisher","first-page":"2791","DOI":"10.18653\/v1\/2022.naacl-main.201","article-title":"MetaICL: Learning to learn in context","volume-title":"Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Min","year":"2022"},{"key":"2024092014244379600_bib113","doi-asserted-by":"publisher","first-page":"11048","DOI":"10.18653\/v1\/2022.emnlp-main.759","article-title":"Rethinking the role of demonstrations: What makes in-context learning work?","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing","author":"Min","year":"2022"},{"key":"2024092014244379600_bib114","doi-asserted-by":"publisher","first-page":"589","DOI":"10.18653\/v1\/2022.findings-acl.50","article-title":"Reframing instructional prompts to GPTk\u2019s language","volume-title":"Findings of the Association for Computational Linguistics: ACL 
2022","author":"Mishra","year":"2022"},{"key":"2024092014244379600_bib115","doi-asserted-by":"publisher","first-page":"3470","DOI":"10.18653\/v1\/2022.acl-long.244","article-title":"Cross-task generalization via natural language crowdsourcing instructions","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Mishra","year":"2022"},{"key":"2024092014244379600_bib116","doi-asserted-by":"publisher","first-page":"11834","DOI":"10.18653\/v1\/2023.findings-acl.751","article-title":"HELP ME THINK: A simple prompting strategy for non-experts to create customized content with models","volume-title":"Findings of the Association for Computational Linguistics: ACL 2023","author":"Mishra","year":"2023"},{"key":"2024092014244379600_bib117","doi-asserted-by":"publisher","first-page":"15991","DOI":"10.18653\/v1\/2023.acl-long.891","article-title":"Crosslingual generalization through multitask finetuning","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023","author":"Muennighoff","year":"2023"},{"key":"2024092014244379600_bib118","doi-asserted-by":"publisher","first-page":"2106","DOI":"10.18653\/v1\/2020.acl-main.190","article-title":"ExpBERT: Representation engineering with natural language explanations","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Murty","year":"2020"},{"key":"2024092014244379600_bib119","first-page":"2340","article-title":"Stress test evaluation for natural language inference","volume-title":"Proceedings of the 27th International Conference on Computational Linguistics","author":"Naik","year":"2018"},{"key":"2024092014244379600_bib120","article-title":"In-context example selection with influences","author":"Nguyen","year":"2023","journal-title":"ArXiv 
preprint"},{"key":"2024092014244379600_bib121","article-title":"ChatGPT","author":"OpenAI","year":"2022"},{"key":"2024092014244379600_bib122","unstructured":"OpenAI. 2023. GPT-4 technical report. ArXiv preprint, abs\/2303.08774."},{"key":"2024092014244379600_bib123","article-title":"Non-proportional parametrizations for stable hypernetwork learning","author":"Ortiz","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib124","first-page":"27730","article-title":"Training language models to follow instructions with human feedback","volume-title":"Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022","author":"Ouyang","year":"2022"},{"key":"2024092014244379600_bib125","first-page":"26837","article-title":"Do the rewards justify the means? Measuring trade-offs between rewards and ethical behavior in the MACHIAVELLI benchmark","volume-title":"International Conference on Machine Learning, ICML 2023","author":"Pan","year":"2023"},{"key":"2024092014244379600_bib126","doi-asserted-by":"publisher","first-page":"1779","DOI":"10.18653\/v1\/2023.eacl-main.130","article-title":"Don\u2019t blame the annotator: Bias already starts in the annotation instructions","volume-title":"Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics","author":"Parmar","year":"2023"},{"key":"2024092014244379600_bib127","article-title":"Instruction tuning with GPT-4","author":"Peng","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib128","doi-asserted-by":"publisher","first-page":"3845","DOI":"10.18653\/v1\/2023.eacl-main.277","article-title":"GrIPS: Gradient-free, edit-based instruction search for prompting large language models","volume-title":"Proceedings of the 17th Conference of the European Chapter of the Association for Computational 
Linguistics","author":"Prasad","year":"2023"},{"key":"2024092014244379600_bib129","doi-asserted-by":"publisher","first-page":"4751","DOI":"10.21437\/Interspeech.2020-2831","article-title":"Massively multilingual ASR: 50 languages, 1 model, 1 billion parameters","volume-title":"Interspeech 2020, 21st Annual Conference of the International Speech Communication Association","author":"Pratap","year":"2020"},{"key":"2024092014244379600_bib130","doi-asserted-by":"publisher","first-page":"5687","DOI":"10.18653\/v1\/2023.findings-emnlp.378","article-title":"Measuring and narrowing the compositionality gap in language models","volume-title":"The 2023 Conference on Empirical Methods in Natural Language Processing","author":"Press","year":"2023"},{"key":"2024092014244379600_bib131","doi-asserted-by":"publisher","first-page":"8494","DOI":"10.1109\/CVPR.2018.00886","article-title":"VirtualHome: Simulating household activities via programs","volume-title":"2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018","author":"Puig","year":"2018"},{"key":"2024092014244379600_bib132","doi-asserted-by":"publisher","first-page":"5368","DOI":"10.18653\/v1\/2023.acl-long.294","article-title":"Reasoning with language model prompting: A survey","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023","author":"Qiao","year":"2023"},{"key":"2024092014244379600_bib133","doi-asserted-by":"publisher","first-page":"5203","DOI":"10.18653\/v1\/2021.naacl-main.410","article-title":"Learning how to ask: Querying LMs with mixtures of soft prompts","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Qin","year":"2021"},{"key":"2024092014244379600_bib134","article-title":"Language models are unsupervised multitask learners","author":"Radford","year":"2019","journal-title":"OpenAI 
blog"},{"key":"2024092014244379600_bib135","first-page":"140:1\u2013140:67","article-title":"Exploring the limits of transfer learning with a unified text-to-text transformer","volume":"21","author":"Raffel","year":"2020","journal-title":"Journal of Machine Learning Research"},{"key":"2024092014244379600_bib136","doi-asserted-by":"publisher","first-page":"5274","DOI":"10.18653\/v1\/2023.findings-emnlp.350","article-title":"CoEDIT: Text editing by task-specific instruction tuning","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2023","author":"Raheja","year":"2023"},{"key":"2024092014244379600_bib137","doi-asserted-by":"publisher","first-page":"2383","DOI":"10.18653\/v1\/D16-1264","article-title":"SQuAD: 100,000+ questions for machine comprehension of text","volume-title":"Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing","author":"Rajpurkar","year":"2016"},{"key":"2024092014244379600_bib138","article-title":"Branch-solve-merge improves large language model evaluation and generation","author":"Saha","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib139","doi-asserted-by":"publisher","first-page":"2439","DOI":"10.18653\/v1\/2022.findings-naacl.187","article-title":"Textual entailment for event argument extraction: Zero- and few-shot with multi-source learning","volume-title":"Findings of the Association for Computational Linguistics: NAACL 2022","author":"Sainz","year":"2022"},{"key":"2024092014244379600_bib140","doi-asserted-by":"publisher","first-page":"1199","DOI":"10.18653\/v1\/2021.emnlp-main.92","article-title":"Label verbalization and entailment for effective zero and few-shot relation extraction","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Sainz","year":"2021"},{"key":"2024092014244379600_bib141","article-title":"Multitask prompted training enables zero-shot task 
generalization","volume-title":"The Tenth International Conference on Learning Representations, ICLR 2022","author":"Sanh","year":"2022"},{"key":"2024092014244379600_bib142","doi-asserted-by":"publisher","first-page":"255","DOI":"10.18653\/v1\/2021.eacl-main.20","article-title":"Exploiting cloze-questions for few-shot text classification and natural language inference","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume","author":"Schick","year":"2021"},{"key":"2024092014244379600_bib143","doi-asserted-by":"publisher","first-page":"390","DOI":"10.18653\/v1\/2021.emnlp-main.32","article-title":"Few-shot text generation with natural language instructions","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Schick","year":"2021"},{"key":"2024092014244379600_bib144","doi-asserted-by":"publisher","first-page":"2339","DOI":"10.18653\/v1\/2021.naacl-main.185","article-title":"It\u2019s not just size that matters: Small language models are also few-shot learners","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Schick","year":"2021"},{"key":"2024092014244379600_bib145","article-title":"Dynamics of instruction tuning: Each ability of large language models has its own growth pace","author":"Song","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib146","doi-asserted-by":"publisher","first-page":"819","DOI":"10.18653\/v1\/2022.acl-long.60","article-title":"An information-theoretic approach to prompt engineering without ground truth labels","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long 
Papers)","author":"Sorensen","year":"2022"},{"key":"2024092014244379600_bib147","doi-asserted-by":"publisher","first-page":"1527","DOI":"10.18653\/v1\/D17-1161","article-title":"Joint concept learning and semantic parsing from natural language explanations","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing","author":"Srivastava","year":"2017"},{"key":"2024092014244379600_bib148","doi-asserted-by":"publisher","first-page":"306","DOI":"10.18653\/v1\/P18-1029","article-title":"Zero-shot learning of classifiers from natural language quantification","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Srivastava","year":"2018"},{"key":"2024092014244379600_bib149","first-page":"3008","article-title":"Learning to summarize with human feedback","volume-title":"Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020","author":"Stiennon","year":"2020"},{"key":"2024092014244379600_bib150","doi-asserted-by":"publisher","first-page":"3645","DOI":"10.18653\/v1\/P19-1355","article-title":"Energy and policy considerations for deep learning in NLP","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Strubell","year":"2019"},{"key":"2024092014244379600_bib151","doi-asserted-by":"publisher","first-page":"1102","DOI":"10.18653\/v1\/2023.findings-acl.71","article-title":"One embedder, any task: Instruction-finetuned text embeddings","volume-title":"Findings of the Association for Computational Linguistics: ACL 2023","author":"Su","year":"2023"},{"key":"2024092014244379600_bib152","doi-asserted-by":"publisher","first-page":"19062","DOI":"10.1609\/aaai.v38i17.29873","article-title":"UMIE: Unified multimodal information extraction with instruction tuning","volume-title":"Thirty-Eighth AAAI Conference on Artificial 
Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence, IAAI 2024, Fourteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2024","author":"Sun","year":"2024"},{"key":"2024092014244379600_bib153","doi-asserted-by":"publisher","first-page":"1624","DOI":"10.18653\/v1\/2022.naacl-main.117","article-title":"Implicit n-grams induced by recurrence","volume-title":"Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Sun","year":"2022"},{"key":"2024092014244379600_bib154","doi-asserted-by":"publisher","first-page":"13003","DOI":"10.18653\/v1\/2023.findings-acl.824","article-title":"Challenging big-bench tasks and whether chain-of-thought can solve them","volume-title":"Findings of the Association for Computational Linguistics: ACL 2023","author":"Suzgun","year":"2023"},{"key":"2024092014244379600_bib155","article-title":"Stanford Alpaca: An instruction-following LLaMA model","author":"Taori","year":"2023"},{"key":"2024092014244379600_bib156","article-title":"UL2: Unifying language learning paradigms","volume-title":"The Eleventh International Conference on Learning Representations, ICLR 2023","author":"Tay","year":"2023"},{"issue":"4","key":"2024092014244379600_bib157","doi-asserted-by":"publisher","first-page":"64","DOI":"10.1609\/aimag.v32i4.2384","article-title":"Approaching the symbol grounding problem with probabilistic graphical models","volume":"32","author":"Tellex","year":"2011","journal-title":"AI Magazine"},{"key":"2024092014244379600_bib158","article-title":"LLaMA: Open and efficient foundation language models","author":"Touvron","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib159","first-page":"5998","article-title":"Attention is all you need","volume-title":"Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing 
Systems 2017","author":"Vaswani","year":"2017"},{"key":"2024092014244379600_bib160","first-page":"806","article-title":"Learning to follow navigational directions","volume-title":"Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics","author":"Vogel","year":"2010"},{"key":"2024092014244379600_bib161","first-page":"35413","article-title":"Poisoning language models during instruction tuning","volume-title":"International Conference on Machine Learning, ICML 2023","author":"Wan","year":"2023"},{"key":"2024092014244379600_bib162","article-title":"InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-Shot NER","author":"Wang","year":"2022","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib163","first-page":"1","article-title":"Introduction: Aspects of artificial general intelligence","volume-title":"Proceedings of the 2007 Conference on Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms: Proceedings of the AGI Workshop 2006","author":"Wang","year":"2007"},{"key":"2024092014244379600_bib164","article-title":"Large language models are implicitly topic models: Explaining and finding good demonstrations for in-context learning","author":"Wang","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib165","article-title":"Self-consistency improves chain of thought reasoning in language models","volume-title":"The Eleventh International Conference on Learning Representations, ICLR 2023","author":"Wang","year":"2023"},{"key":"2024092014244379600_bib166","first-page":"74764","article-title":"How far can camels go? 
Exploring the state of instruction tuning on open resources","volume-title":"Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023","author":"Wang","year":"2023"},{"key":"2024092014244379600_bib167","doi-asserted-by":"publisher","first-page":"13484","DOI":"10.18653\/v1\/2023.acl-long.754","article-title":"Self-Instruct: Aligning language models with self-generated instructions","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023","author":"Wang","year":"2023"},{"key":"2024092014244379600_bib168","doi-asserted-by":"crossref","first-page":"5085","DOI":"10.18653\/v1\/2022.emnlp-main.340","article-title":"Benchmarking generalization via in-context instructions on 1,600+ language tasks","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing","author":"Wang","year":"2022"},{"key":"2024092014244379600_bib169","article-title":"Learning from explanations with neural execution tree","volume-title":"8th International Conference on Learning Representations, ICLR 2020","author":"Wang","year":"2020"},{"key":"2024092014244379600_bib170","doi-asserted-by":"publisher","first-page":"2300","DOI":"10.18653\/v1\/2022.naacl-main.167","article-title":"Do prompt-based models really understand the meaning of their prompts?","volume-title":"Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Webson","year":"2022"},{"key":"2024092014244379600_bib171","article-title":"Finetuned language models are zero-shot learners","volume-title":"The Tenth International Conference on Learning Representations, ICLR 2022","author":"Wei","year":"2022"},{"key":"2024092014244379600_bib172","first-page":"24824","article-title":"Chain-of-thought prompting elicits reasoning in large language 
models","volume-title":"Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022","author":"Wei","year":"2022"},{"key":"2024092014244379600_bib173","doi-asserted-by":"publisher","first-page":"968","DOI":"10.18653\/v1\/2023.emnlp-main.61","article-title":"Symbol tuning improves in-context learning in language models","volume-title":"Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023","author":"Wei","year":"2023"},{"key":"2024092014244379600_bib174","doi-asserted-by":"publisher","first-page":"1361","DOI":"10.18653\/v1\/2020.emnlp-main.105","article-title":"Learning from task descriptions","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Weller","year":"2020"},{"key":"2024092014244379600_bib175","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-demos.6","article-title":"HuggingFace\u2019s transformers: State-of-the-art natural language processing","author":"Wolf","year":"2019","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib176","doi-asserted-by":"publisher","first-page":"2438","DOI":"10.18653\/v1\/2022.acl-long.174","article-title":"Adversarial soft prompt tuning for cross-domain sentiment analysis","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Wu","year":"2022"},{"key":"2024092014244379600_bib177","first-page":"944\u2013964","article-title":"LaMini-LM: A diverse herd of distilled models from large-scale instructions","volume-title":"Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024 - Volume 1: Long Papers","author":"Wu","year":"2024"},{"key":"2024092014244379600_bib178","doi-asserted-by":"publisher","first-page":"646","DOI":"10.1145\/3159652.3159709","article-title":"Indirect 
supervision for relation extraction using question-answer pairs","volume-title":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, WSDM 2018","author":"Wu","year":"2018"},{"key":"2024092014244379600_bib179","doi-asserted-by":"publisher","first-page":"1423","DOI":"10.18653\/v1\/2023.acl-long.79","article-title":"Self-adaptive in-context learning: An information compression perspective for in-context example selection and ordering","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023","author":"Wu","year":"2023"},{"key":"2024092014244379600_bib180","doi-asserted-by":"publisher","first-page":"1351","DOI":"10.18653\/v1\/2021.naacl-main.106","article-title":"Incremental few-shot text classification with multi-round new classes: Formulation, dataset and system","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Xia","year":"2021"},{"key":"2024092014244379600_bib181","article-title":"Adaptive chameleon or stubborn sloth: Revealing the behavior of large language models in knowledge conflicts","volume-title":"The Twelfth International Conference on Learning Representations","author":"Xie","year":"2024"},{"key":"2024092014244379600_bib182","article-title":"TravelPlanner: A benchmark for real-world planning with language agents","author":"Xie","year":"2024","journal-title":"arXiv preprint arXiv:2402.01622"},{"key":"2024092014244379600_bib183","doi-asserted-by":"publisher","first-page":"6268","DOI":"10.18653\/v1\/2023.emnlp-main.385","article-title":"Baize: An open-source chat model with parameter-efficient tuning on self-chat data","volume-title":"Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023","author":"Xu","year":"2023"},{"key":"2024092014244379600_bib184","article-title":"WizardLM: 
Empowering large language models to follow complex instructions","author":"Xu","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib185","article-title":"Small models are valuable plug-ins for large language models","author":"Xu","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib186","doi-asserted-by":"publisher","first-page":"10559","DOI":"10.18653\/v1\/2023.acl-long.589","article-title":"A universal discriminator for zero-shot generalization","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023","author":"Xu","year":"2023"},{"key":"2024092014244379600_bib187","doi-asserted-by":"publisher","first-page":"4235","DOI":"10.18653\/v1\/2022.findings-emnlp.312","article-title":"ZeroPrompt: Scaling prompt-based pretraining to 1,000 tasks improves zero-shot generalization","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2022","author":"Xu","year":"2022"},{"key":"2024092014244379600_bib188","doi-asserted-by":"publisher","first-page":"314","DOI":"10.18653\/v1\/2022.conll-1.21","article-title":"OpenStance: Real-world zero-shot stance detection","volume-title":"Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL)","author":"Xu","year":"2022"},{"key":"2024092014244379600_bib189","doi-asserted-by":"publisher","first-page":"5967","DOI":"10.18653\/v1\/2023.emnlp-main.365","article-title":"INSTRUCTSCORE: Towards explainable text generation evaluation with automatic feedback","volume-title":"Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing","author":"Xu","year":"2023"},{"key":"2024092014244379600_bib190","doi-asserted-by":"publisher","first-page":"11445","DOI":"10.18653\/v1\/2023.acl-long.641","article-title":"MultiInstruct: Improving multi-modal zero-shot learning via instruction tuning","volume-title":"Proceedings of the 61st Annual Meeting 
of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023","author":"Xu","year":"2023"},{"key":"2024092014244379600_bib191","first-page":"11809","article-title":"Tree of thoughts: Deliberate problem solving with large language models","volume-title":"Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023","author":"Yao","year":"2023"},{"key":"2024092014244379600_bib192","doi-asserted-by":"publisher","first-page":"1599","DOI":"10.18653\/v1\/2020.findings-emnlp.145","article-title":"Teaching machine comprehension with compositional explanations","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2020","author":"Ye","year":"2020"},{"key":"2024092014244379600_bib193","doi-asserted-by":"publisher","first-page":"7163","DOI":"10.18653\/v1\/2021.emnlp-main.572","article-title":"CrossFit: A few-shot learning challenge for cross-task generalization in NLP","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Ye","year":"2021"},{"key":"2024092014244379600_bib194","doi-asserted-by":"publisher","first-page":"646","DOI":"10.18653\/v1\/2021.acl-short.82","article-title":"Learning to generate task-specific adapters from task description","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)","author":"Ye","year":"2021"},{"key":"2024092014244379600_bib195","article-title":"In-context instruction learning","author":"Ye","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib196","article-title":"Retrieval of soft prompt enhances zero-shot task generalization","author":"Ye","year":"2022","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib197","article-title":"Guess the instruction! 
Making language models stronger zero-shot learners","author":"Ye","year":"2022","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib198","doi-asserted-by":"publisher","first-page":"4031","DOI":"10.18653\/v1\/2023.emnlp-main.245","article-title":"Dynosaur: A dynamic growth paradigm for instruction-tuning data curation","volume-title":"Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023","author":"Yin","year":"2023"},{"key":"2024092014244379600_bib199","doi-asserted-by":"publisher","first-page":"32","DOI":"10.18653\/v1\/2023.acl-tutorials.5","article-title":"Indirectly supervised natural language processing","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 6: Tutorial Abstracts)","author":"Yin","year":"2023"},{"key":"2024092014244379600_bib200","doi-asserted-by":"publisher","first-page":"3914","DOI":"10.18653\/v1\/D19-1404","article-title":"Benchmarking zero-shot text classification: Datasets, evaluation and entailment approach","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Yin","year":"2019"},{"key":"2024092014244379600_bib201","doi-asserted-by":"publisher","first-page":"3062","DOI":"10.18653\/v1\/2022.acl-long.218","article-title":"ConTinTin: Continual learning from task instructions","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Yin","year":"2022"},{"key":"2024092014244379600_bib202","article-title":"Natural language reasoning, a survey","author":"Yu","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib203","article-title":"WaveCoder: Widespread and versatile enhanced instruction tuning with refined data generation","author":"Yu","year":"2023","journal-title":"ArXiv 
preprint"},{"key":"2024092014244379600_bib204","article-title":"GLM-130B: An open bilingual pre-trained model","volume-title":"The Eleventh International Conference on Learning Representations","author":"Zeng","year":"2022"},{"key":"2024092014244379600_bib205","doi-asserted-by":"publisher","first-page":"1541","DOI":"10.18653\/v1\/2020.emnlp-main.119","article-title":"Analogous process structure induction for sub-event sequence prediction","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Zhang","year":"2020"},{"key":"2024092014244379600_bib206","doi-asserted-by":"publisher","first-page":"794","DOI":"10.18653\/v1\/2023.findings-acl.50","article-title":"Aligning instruction tasks unlocks large language models as zero-shot relation extractors","volume-title":"Findings of the Association for Computational Linguistics: ACL 2023","author":"Zhang","year":"2023"},{"key":"2024092014244379600_bib207","article-title":"Instruction tuning for large language models: A survey","author":"Zhang","year":"2023","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib208","article-title":"OPT: Open pre-trained transformer language models","author":"Zhang","year":"2022","journal-title":"ArXiv preprint"},{"key":"2024092014244379600_bib209","doi-asserted-by":"publisher","first-page":"2726","DOI":"10.18653\/v1\/2021.naacl-main.217","article-title":"Learning to decompose and organize complex tasks","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Zhang","year":"2021"},{"key":"2024092014244379600_bib210","doi-asserted-by":"publisher","first-page":"9134","DOI":"10.18653\/v1\/2022.emnlp-main.622","article-title":"Active example selection for in-context learning","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language 
Processing","author":"Zhang","year":"2022"},{"key":"2024092014244379600_bib211","article-title":"Automatic chain of thought prompting in large language models","volume-title":"The Eleventh International Conference on Learning Representations","author":"Zhang","year":"2022"},{"key":"2024092014244379600_bib212","first-page":"12697","article-title":"Calibrate before use: Improving few-shot performance of language models","volume-title":"Proceedings of the 38th International Conference on Machine Learning, ICML 2021","author":"Zhao","year":"2021"},{"key":"2024092014244379600_bib213","doi-asserted-by":"publisher","first-page":"2856","DOI":"10.18653\/v1\/2021.findings-emnlp.244","article-title":"Adapting language models for zero-shot learning by meta-tuning on dataset and prompt collections","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2021","author":"Zhong","year":"2021"},{"key":"2024092014244379600_bib214","first-page":"55006","article-title":"LIMA: Less is more for alignment","volume-title":"Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023","author":"Zhou","year":"2023"},{"key":"2024092014244379600_bib215","article-title":"Least-to-most prompting enables complex reasoning in large language models","volume-title":"The Eleventh International Conference on Learning Representations","author":"Zhou","year":"2022"},{"key":"2024092014244379600_bib216","article-title":"Instruction-following evaluation for large language models","author":"Zhou","year":"2023","journal-title":"ArXiv preprint"}],"container-title":["Computational 
Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/coli\/article-pdf\/50\/3\/1053\/2470911\/coli_a_00523.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/coli\/article-pdf\/50\/3\/1053\/2470911\/coli_a_00523.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,20]],"date-time":"2024-09-20T14:25:27Z","timestamp":1726842327000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/coli\/article\/50\/3\/1053\/121669\/Large-Language-Model-Instruction-Following-A"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024]]},"references-count":216,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2024,9,1]]},"published-print":{"date-parts":[[2024,9,1]]}},"URL":"https:\/\/doi.org\/10.1162\/coli_a_00523","relation":{},"ISSN":["0891-2017","1530-9312"],"issn-type":[{"value":"0891-2017","type":"print"},{"value":"1530-9312","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024]]},"published":{"date-parts":[[2024]]}}}