{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,3]],"date-time":"2026-07-03T00:06:07Z","timestamp":1783037167712,"version":"3.54.6"},"reference-count":48,"publisher":"MIT Press","license":[{"start":{"date-parts":[[2023,11,17]],"date-time":"2023-11-17T00:00:00Z","timestamp":1700179200000},"content-version":"vor","delay-in-days":320,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,11,13]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Retrieval-Augmented Language Modeling (RALM) methods, which condition a language model (LM) on relevant documents from a grounding corpus during generation, were shown to significantly improve language modeling performance. In addition, they can mitigate the problem of factually inaccurate text generation and provide a natural source attribution mechanism. Existing RALM approaches focus on modifying the LM architecture in order to facilitate the incorporation of external information, significantly complicating deployment. This paper considers a simple alternative, which we dub In-Context RALM: leaving the LM architecture unchanged and prepending grounding documents to the input, without any further training of the LM. We show that In-Context RALM that builds on off-the-shelf general-purpose retrievers provides surprisingly large LM gains across model sizes and diverse corpora. We also demonstrate that the document retrieval and ranking mechanism can be specialized to the RALM setting to further boost performance. 
We conclude that In-Context RALM has considerable potential to increase the prevalence of LM grounding, particularly in settings where a pretrained LM must be used without modification or even via API access.<\/jats:p>","DOI":"10.1162\/tacl_a_00605","type":"journal-article","created":{"date-parts":[[2023,11,17]],"date-time":"2023-11-17T16:37:21Z","timestamp":1700239041000},"page":"1316-1331","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":333,"title":["In-Context Retrieval-Augmented Language Models"],"prefix":"10.1162","volume":"11","author":[{"given":"Ori","family":"Ram","sequence":"first","affiliation":[{"name":"AI21 Labs, Israel. orir@ai21.com"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yoav","family":"Levine","sequence":"additional","affiliation":[{"name":"AI21 Labs, Israel. yoavl@ai21.com"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Itay","family":"Dalmedigos","sequence":"additional","affiliation":[{"name":"AI21 Labs, Israel. itayd@ai21.com"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Dor","family":"Muhlgay","sequence":"additional","affiliation":[{"name":"AI21 Labs, Israel. dorm@ai21.com"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Amnon","family":"Shashua","sequence":"additional","affiliation":[{"name":"AI21 Labs, Israel. amnons@ai21.com"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kevin","family":"Leyton-Brown","sequence":"additional","affiliation":[{"name":"AI21 Labs, Israel. kevinlb@ai21.com"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yoav","family":"Shoham","sequence":"additional","affiliation":[{"name":"AI21 Labs, Israel. 
yoavs@ai21.com"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"281","published-online":{"date-parts":[[2023,11,13]]},"reference":[{"key":"2023111716365961000_bib1","article-title":"Neuro-symbolic language modeling with automaton-augmented retrieval","volume-title":"ICML","author":"Alon","year":"2022"},{"key":"2023111716365961000_bib2","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.5297715","article-title":"GPT-Neo: Large scale autoregressive language modeling with mesh-tensorflow","author":"Black","year":"2021"},{"key":"2023111716365961000_bib3","article-title":"Improving language models by retrieving from trillions of tokens","volume-title":"ICML","author":"Borgeaud","year":"2022"},{"key":"2023111716365961000_bib4","article-title":"Language models are few-shot learners","volume-title":"Advances in Neural Information Processing Systems","author":"Brown","year":"2020"},{"key":"2023111716365961000_bib5","doi-asserted-by":"publisher","first-page":"1870","DOI":"10.18653\/v1\/P17-1171","article-title":"Reading Wikipedia to answer open-domain questions","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Chen","year":"2017"},{"key":"2023111716365961000_bib6","doi-asserted-by":"publisher","first-page":"4171","DOI":"10.18653\/v1\/N19-1423","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Devlin","year":"2019"},{"key":"2023111716365961000_bib7","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2301.00234","article-title":"A survey on in-context 
learning","author":"Dong","year":"2023"},{"key":"2023111716365961000_bib8","doi-asserted-by":"publisher","first-page":"854","DOI":"10.18653\/v1\/2021.findings-emnlp.73","article-title":"R2-D2: A modular baseline for open-domain question answering","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2021","author":"Fajcik","year":"2021"},{"key":"2023111716365961000_bib9","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2101.00027","article-title":"The pile: An 800gb dataset of diverse text for language modeling","author":"Gao","year":"2021"},{"key":"2023111716365961000_bib10","article-title":"REALM: Retrieval-augmented language model pre-training","volume-title":"ICML","author":"Guu","year":"2020"},{"key":"2023111716365961000_bib11","doi-asserted-by":"publisher","first-page":"5703","DOI":"10.18653\/v1\/2021.emnlp-main.461","article-title":"Efficient nearest neighbor language models","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"He","year":"2021"},{"issue":"3","key":"2023111716365961000_bib12","doi-asserted-by":"publisher","DOI":"10.1145\/3383123","article-title":"Challenges in building intelligent open-domain dialog systems","volume":"38","author":"Huang","year":"2020","journal-title":"ACM Transactions on Information Systems"},{"key":"2023111716365961000_bib13","article-title":"Unsupervised dense information retrieval with contrastive learning","author":"Izacard","year":"2022","journal-title":"Transactions on Machine Learning Research"},{"key":"2023111716365961000_bib14","doi-asserted-by":"publisher","first-page":"874","DOI":"10.18653\/v1\/2021.eacl-main.74","article-title":"Leveraging passage retrieval with generative models for open domain question answering","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main 
Volume","author":"Izacard","year":"2021"},{"key":"2023111716365961000_bib15","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2208.03299","article-title":"Atlas: Few-shot learning with retrieval augmented language models","author":"Izacard","year":"2022"},{"issue":"3","key":"2023111716365961000_bib16","doi-asserted-by":"publisher","first-page":"535","DOI":"10.1109\/TBDATA.2019.2921572","article-title":"Billion-scale similarity search with GPUs","volume":"7","author":"Johnson","year":"2021","journal-title":"IEEE Transactions on Big Data"},{"key":"2023111716365961000_bib17","doi-asserted-by":"publisher","first-page":"1601","DOI":"10.18653\/v1\/P17-1147","article-title":"TriviaQA: A large scale distantly supervised challenge dataset for reading comprehension","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Joshi","year":"2017"},{"key":"2023111716365961000_bib18","doi-asserted-by":"publisher","first-page":"6769","DOI":"10.18653\/v1\/2020.emnlp-main.550","article-title":"Dense passage retrieval for open-domain question answering","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Karpukhin","year":"2020"},{"key":"2023111716365961000_bib19","article-title":"Generalization through memorization: Nearest neighbor language models","volume-title":"International Conference on Learning Representations","author":"Khandelwal","year":"2020"},{"key":"2023111716365961000_bib20","doi-asserted-by":"publisher","first-page":"452","DOI":"10.1162\/tacl_a_00276","article-title":"Natural questions: A benchmark for question answering research","volume":"7","author":"Kwiatkowski","year":"2019","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2023111716365961000_bib21","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2204.10019","article-title":"Standing on the shoulders of giant frozen 
language models","author":"Levine","year":"2022"},{"key":"2023111716365961000_bib22","article-title":"Huge frozen language models as readers for open-domain question answering","volume-title":"ICML 2022 Workshop on Knowledge Retrieval and Language Models","author":"Levine","year":"2022"},{"key":"2023111716365961000_bib23","article-title":"The inductive bias of in-context learning: Rethinking pretraining example design","volume-title":"International Conference on Learning Representations","author":"Levine","year":"2022"},{"key":"2023111716365961000_bib24","first-page":"9459","article-title":"Retrieval-augmented generation for knowledge-intensive NLP tasks","volume-title":"Advances in Neural Information Processing Systems","author":"Lewis","year":"2020"},{"key":"2023111716365961000_bib25","article-title":"Decoupled context processing for context augmented language modeling","volume-title":"Advances in Neural Information Processing Systems","author":"Li","year":"2022"},{"key":"2023111716365961000_bib26","article-title":"Jurassic-1: Technical details and evaluation","author":"Lieber","year":"2021"},{"key":"2023111716365961000_bib27","doi-asserted-by":"publisher","first-page":"2356","DOI":"10.1145\/3404835.3463238","article-title":"Pyserini: A Python toolkit for reproducible information retrieval research with sparse and dense representations","volume-title":"Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Lin","year":"2021"},{"key":"2023111716365961000_bib28","doi-asserted-by":"publisher","first-page":"3214","DOI":"10.18653\/v1\/2022.acl-long.229","article-title":"TruthfulQA: Measuring how models mimic human falsehoods","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long 
Papers)","author":"Lin","year":"2022"},{"key":"2023111716365961000_bib29","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1907.11692","article-title":"RoBERTa: A robustly optimized bert pretraining approach","author":"Liu","year":"2019"},{"key":"2023111716365961000_bib30","doi-asserted-by":"publisher","first-page":"1906","DOI":"10.18653\/v1\/2020.acl-main.173","article-title":"On faithfulness and factuality in abstractive summarization","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Maynez","year":"2020"},{"key":"2023111716365961000_bib31","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1609.07843","article-title":"Pointer sentinel mixture models","author":"Merity","year":"2016"},{"key":"2023111716365961000_bib32","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2307.06908","article-title":"Generating benchmarks for factuality evaluation of language models","author":"Muhlgay","year":"2023"},{"key":"2023111716365961000_bib33","doi-asserted-by":"publisher","first-page":"2523","DOI":"10.18653\/v1\/2021.naacl-main.200","article-title":"KILT: A benchmark for knowledge intensive language tasks","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Petroni","year":"2021"},{"key":"2023111716365961000_bib34","article-title":"Improving language understanding by generative pre-training","author":"Radford","year":"2018"},{"key":"2023111716365961000_bib35","article-title":"Language models are unsupervised multitask learners","author":"Radford","year":"2019"},{"key":"2023111716365961000_bib36","doi-asserted-by":"publisher","first-page":"2687","DOI":"10.18653\/v1\/2022.naacl-main.193","article-title":"Learning to retrieve passages without supervision","volume-title":"Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human 
Language Technologies","author":"Ram","year":"2022"},{"key":"2023111716365961000_bib37","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.acl-long.352","article-title":"Parallel context windows improve in-context learning of large language models","author":"Ratner","year":"2022"},{"issue":"4","key":"2023111716365961000_bib38","doi-asserted-by":"publisher","first-page":"333","DOI":"10.1561\/1500000019","article-title":"The probabilistic relevance framework: BM25 and beyond","volume":"3","author":"Robertson","year":"2009","journal-title":"Foundations and Trends in Information Retrieval"},{"key":"2023111716365961000_bib39","doi-asserted-by":"publisher","first-page":"3781","DOI":"10.18653\/v1\/2022.emnlp-main.249","article-title":"Improving passage retrieval with zero-shot question generation","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing","author":"Sachan","year":"2022"},{"key":"2023111716365961000_bib40","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2301.12652","article-title":"REPLUG: Retrieval-augmented black-box language models","author":"Shi","year":"2023"},{"key":"2023111716365961000_bib41","article-title":"BEIR: A heterogeneous benchmark for zero-shot evaluation of information retrieval models","volume-title":"Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks","author":"Thakur","year":"2021"},{"key":"2023111716365961000_bib42","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2302.13971","article-title":"LLaMA: Open and efficient foundation language models","author":"Touvron","year":"2023"},{"key":"2023111716365961000_bib43","first-page":"5998","article-title":"Attention is all you need","volume-title":"Advances in Neural Information Processing Systems 30","author":"Vaswani","year":"2017"},{"key":"2023111716365961000_bib44","article-title":"GPT-J-6B: A 6 billion parameter autoregressive language 
model","author":"Wang","year":"2021"},{"key":"2023111716365961000_bib45","doi-asserted-by":"publisher","first-page":"38","DOI":"10.18653\/v1\/2020.emnlp-demos.6","article-title":"Transformers: State-of-the-art natural language processing","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations","author":"Wolf","year":"2020"},{"key":"2023111716365961000_bib46","article-title":"Defending against neural fake news","volume-title":"Advances in Neural Information Processing Systems","author":"Zellers","year":"2019"},{"key":"2023111716365961000_bib47","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2205.01068","article-title":"OPT: Open pre-trained transformer language models","author":"Zhang","year":"2022"},{"key":"2023111716365961000_bib48","doi-asserted-by":"publisher","first-page":"5657","DOI":"10.18653\/v1\/2022.emnlp-main.382","article-title":"Training language models with memory augmentation","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing","author":"Zhong","year":"2022"}],"container-title":["Transactions of the Association for Computational 
Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00605\/2178834\/tacl_a_00605.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00605\/2178834\/tacl_a_00605.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,17]],"date-time":"2023-11-17T16:37:42Z","timestamp":1700239062000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/doi\/10.1162\/tacl_a_00605\/118118\/In-Context-Retrieval-Augmented-Language-Models"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023]]},"references-count":48,"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00605","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023]]},"published":{"date-parts":[[2023]]}}}