{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,18]],"date-time":"2026-06-18T12:03:18Z","timestamp":1781784198101,"version":"3.54.5"},"reference-count":150,"publisher":"MIT Press","license":[{"start":{"date-parts":[[2023,3,16]],"date-time":"2023-03-16T00:00:00Z","timestamp":1678924800000},"content-version":"vor","delay-in-days":74,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,3,14]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>NLP has achieved great progress in the past decade through the use of neural models and large labeled datasets. The dependence on abundant data prevents NLP models from being applied to low-resource settings or novel tasks where significant time, money, or expertise is required to label massive amounts of textual data. Recently, data augmentation methods have been explored as a means of improving data efficiency in NLP. To date, there has been no systematic empirical overview of data augmentation for NLP in the limited labeled data setting, making it difficult to understand which methods work in which settings. In this paper, we provide an empirical survey of recent progress on data augmentation for NLP in the limited labeled data setting, summarizing the landscape of methods (including token-level augmentations, sentence-level augmentations, adversarial augmentations, and hidden-space augmentations) and carrying out experiments on 11 datasets covering topics\/news classification, inference tasks, paraphrasing tasks, and single-sentence tasks. 
Based on the results, we draw several conclusions to help practitioners choose appropriate augmentations in different settings and discuss the current challenges and future directions for limited data learning in NLP.<\/jats:p>","DOI":"10.1162\/tacl_a_00542","type":"journal-article","created":{"date-parts":[[2023,3,16]],"date-time":"2023-03-16T14:37:57Z","timestamp":1678977477000},"page":"191-211","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":116,"title":["An Empirical Survey of Data Augmentation for Limited Data Learning in NLP"],"prefix":"10.1162","volume":"11","author":[{"given":"Jiaao","family":"Chen","sequence":"first","affiliation":[{"name":"Georgia Institute of Technology, USA. jchen896@gatech.edu"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Derek","family":"Tam","sequence":"additional","affiliation":[{"name":"UNC Chapel Hill, USA. dtredsox@cs.unc.edu"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Colin","family":"Raffel","sequence":"additional","affiliation":[{"name":"UNC Chapel Hill, USA. craffel@cs.unc.edu"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Mohit","family":"Bansal","sequence":"additional","affiliation":[{"name":"UNC Chapel Hill, USA. mbansal@cs.unc.edu"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Diyi","family":"Yang","sequence":"additional","affiliation":[{"name":"Georgia Institute of Technology, USA. 
dyang888@gatech.edu"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"281","published-online":{"date-parts":[[2023,3,14]]},"reference":[{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP40776.2020.9054468","article-title":"Cross lingual transfer learning for zero-resource domain adaptation","author":"Abad","year":"2020","journal-title":"ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"7383","DOI":"10.1609\/aaai.v34i05.6233","article-title":"Do not have enough data? Deep learning to the rescue!","volume-title":"The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7-12, 2020","author":"Anaby-Tavor","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"7556","DOI":"10.18653\/v1\/2020.acl-main.676","article-title":"Good-enough compositional data augmentation","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Andreas","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1399","article-title":"Unsupervised neural machine translation","volume-title":"International Conference on Learning Representations","author":"Artetxe","year":"2018"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1172","article-title":"Multi-task learning of pairwise sequence classification tasks over disparate label spaces","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long 
Papers)","author":"Augenstein","year":"2018"},{"key":"2023031614352475800_","article-title":"Learning with pseudo-ensembles","volume-title":"Advances in Neural Information Processing Systems","author":"Bachman","year":"2014"},{"key":"2023031614352475800_","article-title":"Synthetic and natural noise both break neural machine translation","volume-title":"International Conference on Learning Representations","author":"Belinkov","year":"2017"},{"key":"2023031614352475800_","first-page":"5050","article-title":"Mixmatch: A holistic approach to semi-supervised learning","volume-title":"Advances in Neural Information Processing Systems","author":"Berthelot","year":"2019"},{"key":"2023031614352475800_","doi-asserted-by":"crossref","first-page":"275","DOI":"10.18653\/v1\/D19-6131","article-title":"Zero-shot cross-lingual name retrieval for low-resource languages","volume-title":"Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)","author":"Blissett","year":"2019"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"92","DOI":"10.1145\/279943.279962","article-title":"Combining labeled and unlabeled data with co-training","volume-title":"Proceedings of the Eleventh Annual Conference on Computational Learning Theory","author":"Blum","year":"1998"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"10","DOI":"10.18653\/v1\/K16-1002","article-title":"Generating sentences from a continuous space","volume-title":"Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning","author":"Bowman","year":"2016"},{"key":"2023031614352475800_","first-page":"1877","article-title":"Language models are few-shot learners","volume-title":"Advances in Neural Information Processing Systems","author":"Brown","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"crossref","first-page":"6334","DOI":"10.18653\/v1\/2020.acl-main.564","article-title":"Data manipulation: Towards 
effective instance learning for neural dialogue generation via learning to augment and reweight","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Cai","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"1018","DOI":"10.18653\/v1\/D19-1094","article-title":"Semi-supervised semantic role labeling with cross-view training","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Cai","year":"2019"},{"key":"2023031614352475800_","first-page":"830","article-title":"Importance of semantic representation: Dataless classification","volume-title":"Proceedings of the 23rd National Conference on Artificial Intelligence - Volume 2","author":"Chang","year":"2008"},{"issue":"3","key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"542","DOI":"10.1109\/TNN.2009.2015974","article-title":"Semi-supervised learning","volume":"20","author":"Chapelle","year":"2009","journal-title":"IEEE Transactions on Neural Networks"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.acl-long.338","article-title":"Hiddencut: Simple data augmentation for natural language understanding with better generalization","volume-title":"ACL","author":"Chen","year":"2021"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"1241","DOI":"10.18653\/v1\/2020.emnlp-main.95","article-title":"Local additivity based data augmentation for semi-supervised NER","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Chen","year":"2020"},{"key":"2023031614352475800_","article-title":"Semi-supervised models via data augmentation for classifying interactive affective responses","volume-title":"AffCon@ 
AAAI","author":"Chen","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"2147","DOI":"10.18653\/v1\/2020.acl-main.194","article-title":"MixText: Linguistically-informed interpolation of hidden space for semi-supervised text classification","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Chen","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"8801","DOI":"10.18653\/v1\/2020.acl-main.777","article-title":"SeqVAT: Virtual adversarial training for semi-supervised sequence labeling","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Chen","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"crossref","first-page":"5972","DOI":"10.18653\/v1\/P19-1599","article-title":"Controllable paraphrase generation with a syntactic exemplar","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Chen","year":"2019"},{"key":"2023031614352475800_","article-title":"Compositional generalization via neural-symbolic stack machines","volume":"33","author":"Chen","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"04","key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"3601","DOI":"10.1609\/aaai.v34i04.5767","article-title":"Seq2sick: Evaluating the robustness of sequence-to-sequence models with adversarial examples","volume":"34","author":"Cheng","year":"2020","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1425","article-title":"Robust neural machine translation with doubly adversarial inputs","author":"Cheng","year":"2019","journal-title":"Proceedings of the 57th Annual Meeting of the Association for Computational 
Linguistics"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"5961","DOI":"10.18653\/v1\/2020.acl-main.529","article-title":"AdvAug: Robust adversarial augmentation for neural machine translation","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Cheng","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1185","article-title":"Semi-supervised learning for neural machine translation","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Cheng","year":"2016"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"1914","DOI":"10.18653\/v1\/D18-1217","article-title":"Semi-supervised sequence modeling with cross-view training","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Clark","year":"2018"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"748","DOI":"10.18653\/v1\/D17-1078","article-title":"Cross-lingual character-level neural morphological tagging","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing","author":"Cotterell","year":"2017"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"113","DOI":"10.1109\/CVPR.2019.00020","article-title":"Autoaugment: Learning augmentation strategies from data","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Cubuk","year":"2019"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2013.6639345","article-title":"Recent advances in deep learning for speech research at microsoft","volume-title":"IEEE International Conference on Acoustics, Speech, and Signal Processing 
(ICASSP)","author":"Li","year":"2013"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i10.7158","article-title":"When low resource nlp meets unsupervised language model: Meta-pretraining then meta-learning for few-shot text classification","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Deng","year":"2019"},{"key":"2023031614352475800_","first-page":"4171","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Devlin","year":"2019"},{"key":"2023031614352475800_","article-title":"Improved regularization of convolutional neural networks with cutout","author":"DeVries","year":"2017"},{"key":"2023031614352475800_","doi-asserted-by":"crossref","first-page":"1455","DOI":"10.18653\/v1\/D19-1153","article-title":"Cross-lingual transfer learning with data selection for large-scale spoken language understanding","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Do","year":"2019"},{"key":"2023031614352475800_","article-title":"On adversarial examples for character-level neural machine translation","author":"Ebrahimi","year":"2018"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"31","DOI":"10.18653\/v1\/P18-2006","article-title":"Hotflip: White-box adversarial examples for text classification","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)","author":"Ebrahimi","year":"2018"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1045","article-title":"Understanding 
back-translation at scale","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Edunov","year":"2018"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"567","DOI":"10.18653\/v1\/P17-2090","article-title":"Data augmentation for low-resource neural machine translation","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)","author":"Fadaee","year":"2017"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W18-2706","article-title":"Robust neural abstractive summarization systems and evaluation against adversarial information","author":"Fan","year":"2018","journal-title":"Interpretability and Robustness for Audio, Speech and Language Workshop at Neurips 2018"},{"key":"2023031614352475800_","doi-asserted-by":"crossref","first-page":"29","DOI":"10.18653\/v1\/2020.deelio-1.4","article-title":"Genaug: Data augmentation for finetuning text generators","volume-title":"Proceedings of Deep Learning Inside Out (DeeLIO): The First Workshop on Knowledge Extraction and Integration for Deep Learning Architectures","author":"Feng","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2021.findings-acl.84","article-title":"A survey of data augmentation approaches for NLP","volume-title":"Association for Computational Linguistics Findings","author":"Feng","year":"2021"},{"key":"2023031614352475800_","article-title":"Compositional generalization in semantic parsing: Pre-training vs. 
specialized architectures","author":"Furrer","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"5539","DOI":"10.18653\/v1\/P19-1555","article-title":"Soft contextual data augmentation for neural machine translation","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Gao","year":"2019"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"6174","DOI":"10.18653\/v1\/2020.emnlp-main.498","article-title":"BAE: BERT-based adversarial examples for text classification","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Garg","year":"2020"},{"key":"2023031614352475800_","first-page":"138","article-title":"Learning a part-of-speech tagger from two hours of annotation","volume-title":"Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Garrette","year":"2013"},{"key":"2023031614352475800_","article-title":"Domain adaptation for large-scale sentiment classification: A deep learning approach","volume-title":"International Conference of Machine Learning","author":"Glorot","year":"2011"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"42","DOI":"10.18653\/v1\/2021.naacl-demos.6","article-title":"Robustness gym: Unifying the NLP evaluation landscape","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations","author":"Goel","year":"2021"},{"key":"2023031614352475800_","first-page":"2672","article-title":"Generative adversarial nets","volume-title":"Advances in Neural Information Processing Systems 27","author":"Goodfellow","year":"2014"},{"key":"2023031614352475800_","first-page":"20","article-title":"Explaining and harnessing adversarial 
examples","volume":"1050","author":"Goodfellow","year":"2015","journal-title":"stat"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"5547","DOI":"10.18653\/v1\/2020.emnlp-main.447","article-title":"Sequence-level mixed sample data augmentation","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Guo","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11956","article-title":"A deep generative framework for paraphrase generation","volume-title":"Association for the Advancement of Artificial Intelligence","author":"Gupta","year":"2017"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00030","article-title":"Generating sentences by editing prototypes","volume-title":"Transactions of the Association for Computational Linguistics","author":"Guu","year":"2017"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"4803","DOI":"10.18653\/v1\/D18-1514","article-title":"FewRel: A large-scale supervised few-shot relation classification dataset with state-of-the-art evaluation","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Han","year":"2018"},{"key":"2023031614352475800_","article-title":"Revisiting self-training for neural sequence generation","volume-title":"International Conference on Learning Representations","author":"He","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.201","article-title":"A survey on recent approaches for natural language processing in low-resource scenarios","author":"Hedderich","year":"2020","journal-title":"CoRR"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"304","DOI":"10.18653\/v1\/D17-1030","article-title":"High-risk learning: Acquiring new word vectors from tiny 
data","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing","author":"Herbelot","year":"2017"},{"key":"2023031614352475800_","doi-asserted-by":"crossref","DOI":"10.21437\/Interspeech.2018-1097","article-title":"Unsupervised adaptation with interpretable disentangled representations for distant conversational speech recognition","author":"Hsu","year":"2018","journal-title":"Interspeech 2018"},{"key":"2023031614352475800_","doi-asserted-by":"crossref","DOI":"10.1109\/ASRU.2017.8268911","article-title":"Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation","author":"Hsu","year":"2017","journal-title":"2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)"},{"key":"2023031614352475800_","first-page":"15764","article-title":"Learning data manipulation for augmentation and weighting","volume":"32","author":"Zhiting","year":"2019","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2023031614352475800_","first-page":"1587","article-title":"Toward controlled generation of text","volume-title":"Proceedings of the 34th International Conference on Machine Learning","author":"Zhiting","year":"2017"},{"key":"2023031614352475800_","first-page":"487","article-title":"Few-shot charge prediction with discriminative legal attributes","volume-title":"Proceedings of the 27th International Conference on Computational Linguistics","author":"Zikun","year":"2018"},{"key":"2023031614352475800_","article-title":"Adversarial attacks and defense on texts: A survey","author":"Huq","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"1681","DOI":"10.3115\/v1\/P15-1162","article-title":"Deep unordered composition rivals syntactic methods for text classification","volume-title":"Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on 
Natural Language Processing (Volume 1: Long Papers)","author":"Iyyer","year":"2015"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"1875","DOI":"10.18653\/v1\/N18-1170","article-title":"Adversarial example generation with syntactically controlled paraphrase networks","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)","author":"Iyyer","year":"2018"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"12","DOI":"10.18653\/v1\/P16-1002","article-title":"Data recombination for neural semantic parsing","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Jia","year":"2016"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"2021","DOI":"10.18653\/v1\/D17-1215","article-title":"Adversarial examples for evaluating reading comprehension systems","volume-title":"Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing","author":"Jia","year":"2017"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.197","article-title":"Smart: Robust and efficient fine-tuning for pre-trained natural language models through principled regularized optimization","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Jiang","year":"2019"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"2726","DOI":"10.18653\/v1\/P19-1262","article-title":"Avoiding reasoning shortcuts: Adversarial evaluation, training, and model development for multi-hop QA","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Jiang","year":"2019"},{"key":"2023031614352475800_","first-page":"143","article-title":"A 
probabilistic analysis of the rocchio algorithm with tfidf for text categorization","volume-title":"Proceedings of the Fourteenth International Conference on Machine Learning","author":"Joachims","year":"1997"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"339","DOI":"10.1162\/tacl_a_00065","article-title":"Google\u2019s multilingual neural machine translation system: Enabling zero-shot translation","volume":"5","author":"Johnson","year":"2017","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2023031614352475800_","article-title":"Semi-supervised learning with deep generative models","volume-title":"Advances in Neural Information Processing Systems","author":"Kingma","year":"2014"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"452","DOI":"10.18653\/v1\/N18-2072","article-title":"Contextual augmentation: Data augmentation by words with paradigmatic relations","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)","author":"Kobayashi","year":"2018"},{"key":"2023031614352475800_","first-page":"271","article-title":"Model-portability experiments for textual temporal analysis","volume-title":"Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies","author":"Kolomiyets","year":"2011"},{"issue":"2","key":"2023031614352475800_","article-title":"Imagenet classification with deep convolutional neural networks","volume":"25","author":"Krizhevsky","year":"2012","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"3609","DOI":"10.18653\/v1\/N19-1363","article-title":"Submodular optimization-based diverse paraphrasing and its effectiveness in data augmentation","volume-title":"Proceedings of the 2019 
Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Kumar","year":"2019"},{"key":"2023031614352475800_","first-page":"18","article-title":"Data augmentation using pre-trained transformer models","volume-title":"Proceedings of the 2nd Workshop on Life-long Learning for Spoken Language Systems","author":"Kumar","year":"2020"},{"key":"2023031614352475800_","article-title":"Temporal ensembling for semi-supervised learning","volume-title":"5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings","author":"Laine","year":"2017"},{"key":"2023031614352475800_","article-title":"Cross-lingual language model pretraining","volume-title":"Advances in Neural Information Processing Systems 32 (NeurIPS 2019)","author":"Lample","year":"2019"},{"key":"2023031614352475800_","article-title":"Unsupervised machine translation using monolingual corpora only","volume-title":"International Conference on Learning Representations","author":"Lample","year":"2018"},{"key":"2023031614352475800_","article-title":"Cross-lingual transfer learning for question answering","author":"Lee","year":"2019","journal-title":"arXiv"},{"key":"2023031614352475800_","doi-asserted-by":"crossref","first-page":"8503","DOI":"10.18653\/v1\/2020.acl-main.752","article-title":"TriggerNER: Learning with entity triggers as explanations for named entity recognition","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Lin","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18653\/v1\/P17-1001","article-title":"Adversarial multi-task learning for text classification","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long 
Papers)","author":"Liu","year":"2017"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.findings-emnlp.333","article-title":"Adversarial augmentation policy search for domain and cross-lingual generalization in reading comprehension","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Maharana","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"90","DOI":"10.18653\/v1\/D19-5609","article-title":"Controlled text generation for data augmentation in intelligent artificial agents","volume-title":"Proceedings of the 3rd Workshop on Neural Generation and Translation","author":"Malandrakis","year":"2019"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1334","article-title":"Right for the wrong reasons: Diagnosing syntactic heuristics in natural language inference","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Thomas McCoy","year":"2019"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"617","DOI":"10.1145\/3366423.3380144","article-title":"Snippext: Semi-supervised opinion mining with augmented data","volume-title":"Proceedings of The Web Conference 2020","author":"Miao","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2020.acl-main.212","article-title":"Syntactic data augmentation increases robustness to inference heuristics","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Min","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.1145\/3439726","article-title":"Deep learning based text classification: A comprehensive review","volume-title":"ACM Computing 
Survey","author":"Minaee","year":"2021"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i05.6371","article-title":"Enhancing natural language inference using new and expanded training data sets and new learning models","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Mitra","year":"2020"},{"key":"2023031614352475800_","article-title":"Adversarial training methods for semi-supervised text classification","author":"Miyato","year":"2017","journal-title":"International Conference on Learning Representations (ICLR)"},{"issue":"8","key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"1979","DOI":"10.1109\/TPAMI.2018.2858821","article-title":"Virtual adversarial training: A regularization method for supervised and semi-supervised learning","volume":"41","author":"Miyato","year":"2018","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"119","DOI":"10.18653\/v1\/2020.emnlp-demos.16","article-title":"TextAttack: A framework for adversarial attacks, data augmentation, and adversarial training in NLP","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations","author":"Morris","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"304","DOI":"10.18653\/v1\/K19-1029","article-title":"Low-resource parsing with crosslingual contextualized representations","volume-title":"Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)","author":"Mulcaire","year":"2019"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/K18-1047","article-title":"Adversarial over-sensitivity and over-stability strategies for dialogue models","volume-title":"The SIGNLL Conference on Computational Natural Language Learning 
(CoNLL)","author":"Niu","year":"2018"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"1317","DOI":"10.18653\/v1\/D19-1132","article-title":"Automatically learning data augmentation policies for dialogue tasks","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Niu","year":"2019"},{"key":"2023031614352475800_","article-title":"Learning compositional rules via neural program synthesis","volume":"33","author":"Nye","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"2227","DOI":"10.18653\/v1\/N18-1202","article-title":"Deep contextualized word representations","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)","author":"Peters","year":"2018"},{"key":"2023031614352475800_","doi-asserted-by":"crossref","first-page":"13","DOI":"10.18653\/v1\/W19-5202","article-title":"Improving zero-shot translation with language-independent constraints","volume-title":"Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers)","author":"Pham","year":"2019"},{"key":"2023031614352475800_","article-title":"Neural paraphrase generation with stacked residual lstm networks","volume-title":"Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers","author":"Prakash","year":"2016"},{"issue":"8","key":"2023031614352475800_","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford","year":"2019","journal-title":"OpenAI Blog"},{"issue":"140","key":"2023031614352475800_","first-page":"1","article-title":"Exploring the limits of transfer learning 
with a unified text-to-text transformer","volume":"21","author":"Raffel","year":"2020","journal-title":"Journal of Machine Learning Research"},{"key":"2023031614352475800_","doi-asserted-by":"crossref","first-page":"1085","DOI":"10.18653\/v1\/P19-1103","article-title":"Generating natural language adversarial examples through probability weighted word saliency","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Ren","year":"2019"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"856","DOI":"10.18653\/v1\/P18-1079","article-title":"Semantically equivalent adversarial rules for debugging NLP models","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Ribeiro","year":"2018"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"3132","DOI":"10.18653\/v1\/D18-1352","article-title":"Few-shot and zero-shot multi-label learning for structured label spaces","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Rios","year":"2018"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"255","DOI":"10.18653\/v1\/2021.eacl-main.20","article-title":"Exploiting cloze-questions for few-shot text classification and natural language inference","volume-title":"Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume","author":"Schick","year":"2021"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"2339","DOI":"10.18653\/v1\/2021.naacl-main.185","article-title":"It\u2019s not just size that matters: Small language models are also few-shot learners","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language 
Technologies","author":"Schick","year":"2021"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"3795","DOI":"10.18653\/v1\/N19-1380","article-title":"Cross-lingual transfer learning for multilingual task oriented dialog","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Schuster","year":"2019"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1009","article-title":"Improving neural machine translation models with monolingual data","author":"Sennrich","year":"2015","journal-title":"Computer Science"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"922","DOI":"10.18653\/v1\/2021.acl-long.75","article-title":"Compositional generalization and natural language variation: Can a semantic parsing approach handle both?","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Shaw","year":"2021"},{"key":"2023031614352475800_","article-title":"A simple but tough-to-beat data augmentation approach for natural language understanding and generation","author":"Shen","year":"2020","journal-title":"arXiv preprint arXiv:2009.13818"},{"key":"2023031614352475800_","article-title":"Low resource text classification with ulmfit and backtranslation","author":"Shleifer","year":"2019"},{"key":"2023031614352475800_","doi-asserted-by":"crossref","first-page":"958","DOI":"10.1109\/ICDAR.2003.1227801","article-title":"Best practices for convolutional neural networks applied to visual document analysis","author":"Simard","year":"2003","journal-title":"Seventh International Conference on Document Analysis and Recognition, 2003. 
Proceedings"},{"key":"2023031614352475800_","article-title":"Fixmatch: Simplifying semi-supervised learning with consistency and confidence","volume":"33","author":"Sohn","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2023031614352475800_","first-page":"3104","article-title":"Sequence to sequence learning with neural networks","volume-title":"Proceedings of the 27th International Conference on Neural Information Processing Systems-Volume 2","author":"Sutskever","year":"2014"},{"key":"2023031614352475800_","doi-asserted-by":"crossref","first-page":"2920","DOI":"10.18653\/v1\/2020.acl-main.263","article-title":"It\u2019s morphin\u2019 time! Combating linguistic discrimination with inflectional perturbations","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Tan","year":"2020"},{"key":"2023031614352475800_","first-page":"1195","article-title":"Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results","volume-title":"Advances in Neural Information Processing Systems","author":"Tarvainen","year":"2017"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"1542","DOI":"10.1109\/SSCI.2018.8628742","article-title":"Improving deep learning with generic data augmentation","volume-title":"2018 IEEE Symposium Series on Computational Intelligence (SSCI)","author":"Taylor","year":"2018"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"8846","DOI":"10.18653\/v1\/2020.emnlp-main.712","article-title":"Is multihop QA in DiRe condition? 
Measuring and reducing disconnected reasoning","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Trivedi","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"1357","DOI":"10.18653\/v1\/N16-1161","article-title":"Polyglot neural language models: A case study in cross-lingual phonetic representation learning","volume-title":"Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Tsvetkov","year":"2016"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"353","DOI":"10.18653\/v1\/W18-5446","article-title":"GLUE: A multi-task benchmark and analysis platform for natural language understanding","volume-title":"Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP","author":"Wang","year":"2018"},{"issue":"3","key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3386252","article-title":"Generalizing from a few examples","volume":"53","author":"Wang","year":"2020","journal-title":"ACM Computing Surveys"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"575","DOI":"10.18653\/v1\/N18-2091","article-title":"Robust machine comprehension models via adversarial training","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)","author":"Wang","year":"2018"},{"key":"2023031614352475800_","article-title":"Towards zero-label language learning","author":"Wang","year":"2021","journal-title":"arXiv preprint arXiv:2109.09193"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"6382","DOI":"10.18653\/v1\/D19-1670","article-title":"EDA: Easy data augmentation techniques for boosting performance on text 
classification tasks","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Wei","year":"2019"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"84","DOI":"10.1007\/978-3-030-22747-0_7","article-title":"Conditional BERT contextual augmentation","volume-title":"International Conference on Computational Science","author":"Xing","year":"2019"},{"key":"2023031614352475800_","first-page":"1163","article-title":"Data augmentation using variational autoencoder for embedding based speaker verification","volume-title":"Proceedings of Interspeech 2019","author":"Zhanghao","year":"2019"},{"key":"2023031614352475800_","article-title":"Unsupervised data augmentation for consistency training","volume":"33","author":"Xie","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2023031614352475800_","article-title":"Data noising as smoothing in neural network language models","author":"Xie","year":"2017","journal-title":"CoRR"},{"key":"2023031614352475800_","article-title":"Dp-gan: Diversity-promoting generative adversarial network for generating informative and diversified text","author":"Jingjing","year":"2018","journal-title":"CoRR"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v31i1.10966","article-title":"Variational autoencoder for semi-supervised text classification","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Weidi","year":"2017"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D15-1306","article-title":"That\u2019s so annoying!!!: A lexical and frame-semantic embedding based data augmentation approach to automatic categorization of annoying behaviors using #petpeeve tweets","volume-title":"Proceedings of the 2015 Conference on Empirical Methods in Natural 
Language Processing","author":"Yang","year":"2015"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"1008","DOI":"10.18653\/v1\/2020.findings-emnlp.90","article-title":"Generative data augmentation for commonsense reasoning","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2020","author":"Yang","year":"2020"},{"key":"2023031614352475800_","first-page":"3881","article-title":"Improved variational autoencoders for text modeling using dilated convolutions","volume-title":"Proceedings of the 34th International Conference on Machine Learning - Volume 70","author":"Yang","year":"2017"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"189","DOI":"10.3115\/981658.981684","article-title":"Unsupervised word sense disambiguation rivaling supervised methods","volume-title":"33rd Annual Meeting of the Association for Computational Linguistics","author":"Yarowsky","year":"1995"},{"key":"2023031614352475800_","article-title":"Zerogen: Efficient zero-shot learning via dataset generation","author":"Ye","year":"2022","journal-title":"arXiv preprint arXiv:2202.07922"},{"key":"2023031614352475800_","article-title":"Qanet: Combining local convolution with global self-attention for reading comprehension","author":"Adams Wei","year":"2018","journal-title":"CoRR"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.acl-demo.43","article-title":"Openattack: An open-source textual adversarial attack toolkit","author":"Zeng","year":"2020"},{"key":"2023031614352475800_","article-title":"mixup: Beyond empirical risk minimization","volume-title":"International Conference on Learning Representations","author":"Zhang","year":"2018"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"1535","DOI":"10.18653\/v1\/D16-1160","article-title":"Exploiting source-side monolingual data in neural machine translation","volume-title":"Proceedings of the 2016 Conference on 
Empirical Methods in Natural Language Processing","author":"Zhang","year":"2016"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"8566","DOI":"10.18653\/v1\/2020.emnlp-main.691","article-title":"SeqMix: Augmenting active sequence labeling via sequence mixup","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Zhang","year":"2020"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"2495","DOI":"10.18653\/v1\/D19-1253","article-title":"Addressing semantic drift in question generation for semi-supervised question answering","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Zhang","year":"2019"},{"key":"2023031614352475800_","first-page":"649","article-title":"Character-level convolutional networks for text classification","volume-title":"Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1","author":"Zhang","year":"2015"},{"issue":"3","key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3374217","article-title":"Adversarial attacks on deep-learning models in natural language processing: A survey","volume":"11","author":"Zhang","year":"2020","journal-title":"ACM Transactions on Intelligent Systems and Technology (TIST)"},{"key":"2023031614352475800_","first-page":"649","article-title":"Character-level convolutional networks for text classification","volume":"28","author":"Zhang","year":"2015","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2023031614352475800_","article-title":"Generating natural adversarial examples","volume-title":"International Conference on Learning 
Representations","author":"Zhao","year":"2017"},{"key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"13001","DOI":"10.1609\/aaai.v34i07.7000","article-title":"Random erasing data augmentation","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Zhong","year":"2020"},{"key":"2023031614352475800_","article-title":"Freelb: Enhanced adversarial training for natural language understanding","volume-title":"ICLR","author":"Zhu","year":"2020"},{"key":"2023031614352475800_","unstructured":"Xiaojin Jerry Zhu. 2005. Semi-supervised learning literature survey, University of Wisconsin-Madison Department of Computer Sciences."},{"issue":"1","key":"2023031614352475800_","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1162\/coli_a_00425","article-title":"To augment or not to augment? A comparative study on text augmentation techniques for low-resource NLP","volume":"48","author":"\u015eahin","year":"2022","journal-title":"Computational Linguistics"}],"container-title":["Transactions of the Association for Computational 
Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00542\/2074871\/tacl_a_00542.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00542\/2074871\/tacl_a_00542.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,12,9]],"date-time":"2023-12-09T02:16:55Z","timestamp":1702088215000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/doi\/10.1162\/tacl_a_00542\/115238\/An-Empirical-Survey-of-Data-Augmentation-for"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023]]},"references-count":150,"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00542","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023]]},"published":{"date-parts":[[2023]]}}}