{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T13:03:25Z","timestamp":1772715805691,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":21,"publisher":"ACM","license":[{"start":{"date-parts":[[2024,8,24]],"date-time":"2024-08-24T00:00:00Z","timestamp":1724457600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,8,25]]},"DOI":"10.1145\/3637528.3671905","type":"proceedings-article","created":{"date-parts":[[2024,8,25]],"date-time":"2024-08-25T04:54:55Z","timestamp":1724561695000},"page":"2711-2720","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["MAML-en-LLM: Model Agnostic Meta-Training of LLMs for Improved In-Context Learning"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2650-0612","authenticated-orcid":false,"given":"Sanchit","family":"Sinha","sequence":"first","affiliation":[{"name":"University of Virginia, Charlottesville, VA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9771-3557","authenticated-orcid":false,"given":"Yuguang","family":"Yue","sequence":"additional","affiliation":[{"name":"Amazon AGI, New York, NY, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-2023-196X","authenticated-orcid":false,"given":"Victor","family":"Soto","sequence":"additional","affiliation":[{"name":"Amazon AGI, New York, NY, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-1805-9388","authenticated-orcid":false,"given":"Mayank","family":"Kulkarni","sequence":"additional","affiliation":[{"name":"Amazon AGI, Cambridge, MA, 
USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-3121-5174","authenticated-orcid":false,"given":"Jianhua","family":"Lu","sequence":"additional","affiliation":[{"name":"Amazon AGI, Cambridge, MA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9723-3246","authenticated-orcid":false,"given":"Aidong","family":"Zhang","sequence":"additional","affiliation":[{"name":"University of Virginia, Charlottesville, VA, USA"}]}],"member":"320","published-online":{"date-parts":[[2024,8,24]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"The Eleventh International Conference on Learning Representations, ICLR 2023","author":"Aky\u00fcrek Ekin","year":"2023","unstructured":"Ekin Aky\u00fcrek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, and Denny Zhou. 2023. What learning algorithm is in-context learning? Investigations with linear models. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1--5, 2023. OpenReview.net. https:\/\/openreview.net\/pdf?id=0g0X4H8yN4I"},{"key":"e_1_3_2_1_2_1","volume-title":"7th International Conference on Learning Representations, ICLR 2019","author":"Antoniou Antreas","year":"2019","unstructured":"Antreas Antoniou, Harrison Edwards, and Amos J. Storkey. 2019. How to train your MAML. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6--9, 2019. OpenReview.net. https:\/\/openreview.net\/forum?id=HJGven05Y7"},{"key":"e_1_3_2_1_3_1","unstructured":"Tom Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared D Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell et al. 2020. Language models are few-shot learners. Advances in neural information processing systems Vol. 
33 (2020) 1877--1901."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.53"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.emnlp-main.456"},{"key":"e_1_3_2_1_6_1","volume-title":"International conference on machine learning. PMLR, 1126--1135","author":"Finn Chelsea","year":"2017","unstructured":"Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In International conference on machine learning. PMLR, 1126--1135."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.findings-emnlp.171"},{"key":"e_1_3_2_1_8_1","volume-title":"General-purpose in-context learning by meta-learning transformers. arXiv preprint arXiv:2212.04458","author":"Kirsch Louis","year":"2022","unstructured":"Louis Kirsch, James Harrison, Jascha Sohl-Dickstein, and Luke Metz. 2022. General-purpose in-context learning by meta-learning transformers. arXiv preprint arXiv:2212.04458 (2022)."},{"key":"e_1_3_2_1_9_1","volume-title":"Decoupled Weight Decay Regularization. In 7th International Conference on Learning Representations, ICLR 2019","author":"Loshchilov Ilya","year":"2019","unstructured":"Ilya Loshchilov and Frank Hutter. 2019. Decoupled Weight Decay Regularization. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6--9, 2019. OpenReview.net."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.556"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.472"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.365"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.naacl-main.201"},{"key":"e_1_3_2_1_14_1","volume-title":"Learning to Initialize: Can Meta Learning Improve Cross-task Generalization in Prompt Tuning? 
arXiv preprint arXiv:2302.08143","author":"Qin Chengwei","year":"2023","unstructured":"Chengwei Qin, Shafiq Joty, Qian Li, and Ruochen Zhao. 2023. Learning to Initialize: Can Meta Learning Improve Cross-task Generalization in Prompt Tuning? arXiv preprint arXiv:2302.08143 (2023)."},{"key":"e_1_3_2_1_15_1","unstructured":"Alec Radford Jeffrey Wu Rewon Child David Luan Dario Amodei Ilya Sutskever et al. 2019. Language models are unsupervised multitask learners. OpenAI blog Vol. 1 8 (2019) 9."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.206"},{"key":"e_1_3_2_1_17_1","first-page":"24824","article-title":"Chain-of-thought prompting elicits reasoning in large language models","volume":"35","author":"Wei Jason","year":"2022","unstructured":"Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, et al. 2022. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, Vol. 35 (2022), 24824--24837.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_18_1","unstructured":"Jerry Wei Jason Wei Yi Tay Dustin Tran Albert Webson Yifeng Lu Xinyun Chen Hanxiao Liu Da Huang Denny Zhou et al. 2023. Larger language models do in-context learning differently. arXiv preprint arXiv:2303.03846 (2023)."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.572"},{"key":"e_1_3_2_1_20_1","volume-title":"8th International Conference on Learning Representations, ICLR 2020","author":"Yin Mingzhang","year":"2020","unstructured":"Mingzhang Yin, George Tucker, Mingyuan Zhou, Sergey Levine, and Chelsea Finn. 2020. Meta-Learning without Memorization. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26--30, 2020. OpenReview.net. 
https:\/\/openreview.net\/forum?id=BklEFpEYwS"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.emnlp-main.155"}],"event":{"name":"KDD '24: The 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining","location":"Barcelona Spain","acronym":"KDD '24","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGKDD ACM Special Interest Group on Knowledge Discovery in Data"]},"container-title":["Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3637528.3671905","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3637528.3671905","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:04:15Z","timestamp":1750291455000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3637528.3671905"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,24]]},"references-count":21,"alternative-id":["10.1145\/3637528.3671905","10.1145\/3637528"],"URL":"https:\/\/doi.org\/10.1145\/3637528.3671905","relation":{},"subject":[],"published":{"date-parts":[[2024,8,24]]},"assertion":[{"value":"2024-08-24","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}