{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,25]],"date-time":"2025-06-25T14:13:43Z","timestamp":1750860823370},"reference-count":68,"publisher":"MIT Press","license":[{"start":{"date-parts":[[2022,10,20]],"date-time":"2022-10-20T00:00:00Z","timestamp":1666224000000},"content-version":"vor","delay-in-days":292,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,18]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Idiomatic expressions (IEs), characterized by their non-compositionality, are an important part of natural language. They have been a classical challenge to NLP, including pre-trained language models that drive today\u2019s state-of-the-art. Prior work has identified deficiencies in their contextualized representation stemming from the underlying compositional paradigm of representation. In this work, we take a first-principles approach to build idiomaticity into BART using an adapter as a lightweight non-compositional language expert trained on idiomatic sentences. The improved capability over baselines (e.g., BART) is seen via intrinsic and extrinsic methods, where idiom embeddings score 0.19 points higher in homogeneity score for embedding clustering, and up to 25% higher sequence accuracy on the idiom processing tasks of IE sense disambiguation and span detection.<\/jats:p>","DOI":"10.1162\/tacl_a_00510","type":"journal-article","created":{"date-parts":[[2022,10,20]],"date-time":"2022-10-20T14:52:25Z","timestamp":1666277545000},"page":"1120-1137","update-policy":"http:\/\/dx.doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":2,"title":["Getting BART to Ride the Idiomatic Train: Learning to Represent Idiomatic Expressions"],"prefix":"10.1162","volume":"10","author":[{"given":"Ziheng","family":"Zeng","sequence":"first","affiliation":[{"name":"Department of Electrical and Computer Engineering University of Illinois at Urbana-Champaign Champaign, IL USA zzeng13@illinois.edu"}]},{"given":"Suma","family":"Bhat","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering University of Illinois at Urbana-Champaign Champaign, IL USA spbhat2@illinois.edu"}]}],"member":"281","published-online":{"date-parts":[[2022,10,18]]},"reference":[{"key":"2022102014520000400_bib1","article-title":"Potential idiomatic expression (PIE)-english: Corpus for classes of idioms","author":"Adewumi","year":"2021","journal-title":"ArXiv"},{"key":"2022102014520000400_bib2","article-title":"Fine-grained analysis of sentence embeddings using auxiliary prediction tasks","author":"Adi","year":"2016","journal-title":"CoRR"},{"key":"2022102014520000400_bib3","doi-asserted-by":"crossref","first-page":"1534","DOI":"10.18653\/v1\/2020.acl-main.140","article-title":"Probing linguistic features of sentence-level representations in neural relation extraction","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Alt","year":"2020"},{"key":"2022102014520000400_bib4","doi-asserted-by":"publisher","first-page":"4762","DOI":"10.18653\/v1\/2021.findings-emnlp.410","article-title":"MAD-G: Multilingual adapter generation for efficient cross-lingual transfer","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2021","author":"Ansell","year":"2021"},{"key":"2022102014520000400_bib5","volume-title":"Oxford Dictionary of English Idioms","author":"Ayto","year":"2009","edition":"3rd"},{"key":"2022102014520000400_bib6","first-page":"267","article-title":"Multiword expressions","volume-title":"Handbook of Natural Language Processing, Second Edition","author":"Baldwin","year":"2010"},{"key":"2022102014520000400_bib7","doi-asserted-by":"publisher","first-page":"1538","DOI":"10.18653\/v1\/D19-1165","article-title":"Simple, scalable adaptation for neural machine translation","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Bapna","year":"2019"},{"key":"2022102014520000400_bib8","doi-asserted-by":"publisher","first-page":"1217","DOI":"10.1145\/3366423.3380198","article-title":"Leveraging sentiment distributions to distinguish figurative from literal health reports on Twitter","volume-title":"Proceedings of The Web Conference 2020","author":"Biddle","year":"2020"},{"key":"2022102014520000400_bib9","first-page":"1877","article-title":"Language models are few-shot learners","volume-title":"Advances in Neural Information Processing Systems","author":"Brown","year":"2020"},{"key":"2022102014520000400_bib10","doi-asserted-by":"publisher","first-page":"3354","DOI":"10.18653\/v1\/2021.findings-acl.297","article-title":"Figurative language in recognizing textual entailment","volume-title":"Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021","author":"Chakrabarty","year":"2021"},{"key":"2022102014520000400_bib11","first-page":"19","article-title":"The VNC-tokens dataset","volume-title":"Proceedings of the LREC Workshop Towards a Shared Task for Multiword Expressions (MWE 2008)","author":"Cook","year":"2008"},{"key":"2022102014520000400_bib12","doi-asserted-by":"publisher","first-page":"1986","DOI":"10.18653\/v1\/P16-1187","article-title":"Predicting the compositionality of nominal compounds: Giving word embeddings a hard time","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Cordeiro","year":"2016"},{"key":"2022102014520000400_bib13","article-title":"Examining the tip of the iceberg: A data set for idiom translation","volume-title":"Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)","author":"Fadaee","year":"2018"},{"issue":"1","key":"2022102014520000400_bib14","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1162\/coli.08-010-R1-07-048","article-title":"Unsupervised type and token identification of idiomatic expressions","volume":"35","author":"Fazly","year":"2009","journal-title":"Computational Linguistics"},{"key":"2022102014520000400_bib15","article-title":"Automatically constructing a lexicon of verb phrase idiomatic combinations","volume-title":"11th Conference of the European Chapter of the Association for Computational Linguistics","author":"Fazly","year":"2006"},{"key":"2022102014520000400_bib16","doi-asserted-by":"publisher","first-page":"435","DOI":"10.1007\/978-3-642-37247-6_35","article-title":"Automatic detection of idiomatic clauses","volume-title":"International Conference on Intelligent Text Processing and Computational Linguistics","author":"Feldman","year":"2013"},{"key":"2022102014520000400_bib17","first-page":"758","article-title":"PPDB: The paraphrase database","volume-title":"Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Ganitkevitch","year":"2013"},{"key":"2022102014520000400_bib18","first-page":"300","article-title":"Word embedding evaluation and combination","volume-title":"Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC\u201916)","author":"Ghannay","year":"2016"},{"key":"2022102014520000400_bib19","first-page":"279","article-title":"MAGPIE: A large corpus of potentially idiomatic expressions","volume-title":"Proceedings of the 12th Language Resources and Evaluation Conference","author":"Haagsma","year":"2020"},{"key":"2022102014520000400_bib20","first-page":"472","article-title":"Leveraging contextual embeddings and idiom principle for detecting idiomaticity in potentially idiomatic expressions","volume-title":"Proceedings of the Workshop on the Cognitive Aspects of the Lexicon","author":"Hashempour","year":"2020"},{"key":"2022102014520000400_bib21","doi-asserted-by":"publisher","first-page":"205","DOI":"10.18653\/v1\/P16-1020","article-title":"Adaptive joint learning of compositional and non-compositional phrase embeddings","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Hashimoto","year":"2016"},{"key":"2022102014520000400_bib22","first-page":"2790","article-title":"Parameter-efficient transfer learning for NLP","volume-title":"Proceedings of the 36th International Conference on Machine Learning","author":"Houlsby","year":"2019"},{"key":"2022102014520000400_bib23","doi-asserted-by":"publisher","first-page":"5617","DOI":"10.24963\/ijcai.2018\/796","article-title":"Visualisation and \u2018diagnostic classifiers\u2019 reveal how recurrent and recursive neural networks process hierarchical structure (extended abstract)","volume-title":"Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18","author":"Hupkes","year":"2018"},{"key":"2022102014520000400_bib24","doi-asserted-by":"publisher","first-page":"3263","DOI":"10.18653\/v1\/P19-1316","article-title":"On the compositionality prediction of noun phrases using poincar\u00e9 embeddings","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Jana","year":"2019"},{"key":"2022102014520000400_bib25","doi-asserted-by":"publisher","first-page":"7476","DOI":"10.18653\/v1\/2021.emnlp-main.592","article-title":"Investigating robustness of dialog models to popular figurative language constructs","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Jhamtani","year":"2021"},{"key":"2022102014520000400_bib26","doi-asserted-by":"publisher","first-page":"7871","DOI":"10.18653\/v1\/2020.acl-main.703","article-title":"BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Lewis","year":"2020"},{"key":"2022102014520000400_bib27","first-page":"4144","article-title":"An adaptive hierarchical compositional model for phrase embedding","volume-title":"Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18","author":"Li","year":"2018"},{"key":"2022102014520000400_bib28","doi-asserted-by":"publisher","first-page":"107","DOI":"10.1016\/j.knosys.2018.04.009","article-title":"Phrase embedding learning based on external and internal context with compositionality constraint","volume":"152","author":"Li","year":"2018","journal-title":"Knowledge-Based Systems"},{"key":"2022102014520000400_bib29","doi-asserted-by":"crossref","first-page":"3888","DOI":"10.18653\/v1\/2021.emnlp-main.316","article-title":"CATE: A contrastive pre-trained model for metaphor detection with semi-supervised learning","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Lin","year":"2021"},{"key":"2022102014520000400_bib30","unstructured":"Changsheng Liu . 2019. Toward Robust and Efficient Interpretations of Idiomatic Expressions in Context. Ph.D. thesis, University of Pittsburgh."},{"key":"2022102014520000400_bib31","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v31i1.10998","article-title":"Representations of context in recognizing the figurative and literal usages of idioms","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Liu","year":"2017"},{"key":"2022102014520000400_bib32","doi-asserted-by":"publisher","first-page":"6738","DOI":"10.1609\/aaai.v33i01.33016738","article-title":"A generalized idiom usage recognition model based on semantic compatibility","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Liu","year":"2019"},{"key":"2022102014520000400_bib33","first-page":"1204","article-title":"Idiom-aware compositional distributed semantics","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing","author":"Liu","year":"2017"},{"key":"2022102014520000400_bib34","article-title":"Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing","author":"Liu","year":"2021","journal-title":"ArXiv"},{"key":"2022102014520000400_bib35","doi-asserted-by":"crossref","DOI":"10.1093\/oso\/9780198236146.001.0001","volume-title":"Fixed Expressions and Idioms in English: A Corpus-Based Approach","author":"Moon","year":"1998"},{"key":"2022102014520000400_bib36","doi-asserted-by":"publisher","DOI":"10.3389\/frai.2022.813967","article-title":"Shapley idioms: Analysing BERT sentence embeddings for general idiom token identification","volume":"5","author":"Nedumpozhimana","year":"2022","journal-title":"Frontiers in Artificial Intelligence"},{"key":"2022102014520000400_bib37","first-page":"2825","article-title":"Scikit-learn: Machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"Journal of Machine Learning Research"},{"key":"2022102014520000400_bib38","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1007\/978-3-319-55209-5_2","article-title":"Automatic idiom recognition with word embeddings","volume-title":"Information Management and Big Data - Second Annual International Symposium, SIMBig 2015, Cusco, Peru, September 2-4, 2015, and Third Annual International Symposium, SIMBig 2016, Cusco, Peru, September 1-3, 2016, Revised Selected Papers","author":"Peng","year":"2016"},{"key":"2022102014520000400_bib39","doi-asserted-by":"publisher","first-page":"2019","DOI":"10.3115\/v1\/D14-1216","article-title":"Classifying idiomatic and literal expressions using topic models and intensity of emotions","volume-title":"Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Peng","year":"2014"},{"key":"2022102014520000400_bib40","doi-asserted-by":"publisher","first-page":"46","DOI":"10.18653\/v1\/2020.emnlp-demos.7","article-title":"Adapterhub: A framework for adapting transformers","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020): Systems Demonstrations","author":"Pfeiffer","year":"2020"},{"key":"2022102014520000400_bib41","doi-asserted-by":"publisher","first-page":"7654","DOI":"10.18653\/v1\/2020.emnlp-main.617","article-title":"MAD-X: An adapter-based framework for multi-task cross-lingual transfer","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Pfeiffer","year":"2020"},{"key":"2022102014520000400_bib42","article-title":"Learning transferable visual models from natural language supervision","volume-title":"ICML","author":"Radford","year":"2021"},{"key":"2022102014520000400_bib43","doi-asserted-by":"publisher","first-page":"3363","DOI":"10.18653\/v1\/2021.eacl-main.295","article-title":"Probing the probing paradigm: Does probing accuracy entail task relevance?","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume","author":"Ravichander","year":"2021"},{"key":"2022102014520000400_bib44","article-title":"Learning multiple visual domains with residual adapters","volume-title":"Advances in Neural Information Processing Systems","author":"Rebuffi","year":"2017"},{"key":"2022102014520000400_bib45","doi-asserted-by":"crossref","first-page":"8119","DOI":"10.1109\/CVPR.2018.00847","article-title":"Efficient parametrization of multi-domain deep neural networks","volume-title":"2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Rebuffi","year":"2018"},{"key":"2022102014520000400_bib46","first-page":"210","article-title":"An empirical study on compositionality in compound nouns","volume-title":"Proceedings of 5th International Joint Conference on Natural Language Processing","author":"Reddy","year":"2011"},{"key":"2022102014520000400_bib47","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1410","article-title":"Sentence-BERT: Sentence embeddings using Siamese BERT-networks","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing","author":"Reimers","year":"2019"},{"key":"2022102014520000400_bib48","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/3-540-45715-1_1","article-title":"Multiword expressions: A pain in the neck for NLP","volume-title":"International Conference on Intelligent Text Processing and Computational Linguistics","author":"Sag","year":"2002"},{"key":"2022102014520000400_bib49","doi-asserted-by":"publisher","first-page":"36","DOI":"10.3115\/v1\/W14-1007","article-title":"An empirical study of the impact of idioms on phrase based statistical machine translation of English to Brazilian-Portuguese","volume-title":"Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra)","author":"Salton","year":"2014"},{"key":"2022102014520000400_bib50","doi-asserted-by":"publisher","first-page":"194","DOI":"10.18653\/v1\/P16-1019","article-title":"Idiom token classification using sentential distributed semantics","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Salton","year":"2016"},{"key":"2022102014520000400_bib51","doi-asserted-by":"publisher","first-page":"298","DOI":"10.18653\/v1\/D15-1036","article-title":"Evaluation methods for unsupervised word embeddings","volume-title":"Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing","author":"Schnabel","year":"2015"},{"key":"2022102014520000400_bib52","doi-asserted-by":"publisher","first-page":"1537","DOI":"10.3115\/v1\/N15-1177","article-title":"A corpus and model integrating multiword expressions and supersenses","volume-title":"Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Schneider","year":"2015"},{"key":"2022102014520000400_bib53","doi-asserted-by":"publisher","first-page":"1715","DOI":"10.18653\/v1\/P16-1162","article-title":"Neural machine translation of rare words with subword units","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Sennrich","year":"2016"},{"key":"2022102014520000400_bib54","first-page":"1002","article-title":"Metaphor identification using verb and noun clustering","volume-title":"Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)","author":"Shutova","year":"2010"},{"key":"2022102014520000400_bib55","article-title":"Mpnet: Masked and permuted pre-training for language understanding","volume-title":"Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual","author":"Song","year":"2020"},{"issue":"2","key":"2022102014520000400_bib56","doi-asserted-by":"publisher","first-page":"313","DOI":"10.1037\/0278-7393.34.2.313","article-title":"Processing idiomatic expressions: Effects of semantic compositionality.","volume":"34","author":"Tabossi","year":"2008","journal-title":"Journal of Experimental Psychology: Learning, Memory, and Cognition"},{"issue":"4","key":"2022102014520000400_bib57","doi-asserted-by":"publisher","first-page":"529","DOI":"10.3758\/MC.37.4.529","article-title":"Why are idioms recognized fast?","volume":"37","author":"Tabossi","year":"2009","journal-title":"Memory & Cognition"},{"key":"2022102014520000400_bib58","first-page":"299","article-title":"Identification of multiword expressions: A fresh look at modelling and evaluation","volume-title":"Multiword Expressions at Length and in Depth: Extended Papers from the MWE 2017 Workshop","author":"Taslimipoor","year":"2018"},{"key":"2022102014520000400_bib59","doi-asserted-by":"publisher","first-page":"3464","DOI":"10.18653\/v1\/2021.findings-emnlp.294","article-title":"AStitchInLanguageModels: Dataset and methods for the exploration of idiomaticity in pre-trained language models","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2021","author":"Madabushi","year":"2021"},{"key":"2022102014520000400_bib60","doi-asserted-by":"publisher","first-page":"e19","DOI":"10.1017\/ATSIP.2019.12","article-title":"Evaluating word embedding models: methods and experimental results","volume":"8","author":"Wang","year":"2019","journal-title":"APSIPA Transactions on Signal and Information Processing"},{"key":"2022102014520000400_bib61","doi-asserted-by":"publisher","first-page":"2877","DOI":"10.18653\/v1\/D19-1286","article-title":"Investigating BERT\u2019s knowledge of language: Five analysis methods with NPIs","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Warstadt","year":"2019"},{"key":"2022102014520000400_bib62","article-title":"On the compositionality of idioms","author":"Westerst\u00e5hl","year":"2002","journal-title":"Proceedings of LLC8. CSLI Publications"},{"key":"2022102014520000400_bib63","doi-asserted-by":"publisher","first-page":"38","DOI":"10.18653\/v1\/2020.emnlp-demos.6","article-title":"Transformers: State-of-the-art natural language processing","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations","author":"Wolf","year":"2020"},{"key":"2022102014520000400_bib64","article-title":"Google\u2019s neural machine translation system: Bridging the gap between human and machine translation","author":"Yonghui","year":"2016","journal-title":"CoRR"},{"key":"2022102014520000400_bib65","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.findings-acl.150","article-title":"Dict-BERT: Enhancing language model pre-training with dictionary","author":"Yu","year":"2021","journal-title":"ArXiv"},{"key":"2022102014520000400_bib66","doi-asserted-by":"publisher","first-page":"1546","DOI":"10.1162\/tacl_a_00442","article-title":"Idiomatic expression identification using semantic compatibility","volume":"9","author":"Zeng","year":"2021","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2022102014520000400_bib67","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v36i10.21433","article-title":"Idiomatic expression paraphrasing without strong supervision","author":"Zhou","year":"2021"},{"key":"2022102014520000400_bib68","doi-asserted-by":"publisher","first-page":"107606","DOI":"10.1016\/j.knosys.2021.107606","article-title":"Mice: Mining idioms with contextual embeddings","volume":"235","author":"\u0160kvorc","year":"2022","journal-title":"Knowledge-Based Systems"}],"container-title":["Transactions of the Association for Computational Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00510\/2054693\/tacl_a_00510.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00510\/2054693\/tacl_a_00510.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,29]],"date-time":"2023-11-29T08:26:55Z","timestamp":1701246415000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/doi\/10.1162\/tacl_a_00510\/113491\/Getting-BART-to-Ride-the-Idiomatic-Train-Learning"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022]]},"references-count":68,"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00510","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022]]},"published":{"date-parts":[[2022]]}}}