{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,2]],"date-time":"2026-06-02T07:04:57Z","timestamp":1780383897705,"version":"3.54.1"},"reference-count":66,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,4,12]],"date-time":"2022-04-12T00:00:00Z","timestamp":1649721600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,4,12]],"date-time":"2022-04-12T00:00:00Z","timestamp":1649721600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"publisher","award":["13N15407"],"award-info":[{"award-number":["13N15407"]}],"id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100005714","name":"Technische Universit\u00e4t Darmstadt","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100005714","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int. J. Mach. Learn. & Cyber."],"published-print":{"date-parts":[[2023,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In many cases of machine learning, research suggests that the development of training data might have a higher relevance than the choice and modelling of classifiers themselves. Thus, data augmentation methods have been developed to improve classifiers by artificially created training data. In NLP, there is the challenge of establishing universal rules for text transformations which provide new linguistic patterns. In this paper, we present and evaluate a text generation method suitable to increase the performance of classifiers for long and short texts. 
We achieved promising improvements when evaluating short as well as long text tasks with the enhancement by our text generation method. Especially with regard to small data analytics, additive accuracy gains of up to 15.53% and 3.56% are achieved within a constructed low data regime, compared to the no augmentation baseline and another data augmentation technique. As the current track of these constructed regimes is not universally applicable, we also show major improvements in several real world low data tasks (up to +4.84 F1-score). Since we are evaluating the method from many perspectives (in total 11 datasets), we also observe situations where the method might not be suitable. We discuss implications and patterns for the successful application of our approach on different types of datasets.<\/jats:p>","DOI":"10.1007\/s13042-022-01553-3","type":"journal-article","created":{"date-parts":[[2022,4,12]],"date-time":"2022-04-12T05:03:11Z","timestamp":1649739791000},"page":"135-150","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":121,"title":["Data augmentation in natural language processing: a novel text generation approach for long and short text 
classifiers"],"prefix":"10.1007","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2040-5609","authenticated-orcid":false,"given":"Markus","family":"Bayer","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Marc-Andr\u00e9","family":"Kaufhold","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Bj\u00f6rn","family":"Buchhold","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Marcel","family":"Keller","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"J\u00f6rg","family":"Dallmeyer","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Christian","family":"Reuter","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2022,4,12]]},"reference":[{"issue":"3","key":"1553_CR1","doi-asserted-by":"publisher","first-page":"288","DOI":"10.1080\/0144929X.2019.1610908","volume":"39","author":"F Alam","year":"2020","unstructured":"Alam F, Ofli F, Imran M (2020) Descriptive and visual summaries of disaster events using artificial intelligence techniques: case studies of hurricanes harvey, irma, and maria. Behav Inf Technol 39(3):288\u2013318. https:\/\/doi.org\/10.1080\/0144929X.2019.1610908","journal-title":"Behav Inf Technol"},{"key":"1553_CR2","doi-asserted-by":"publisher","unstructured":"Alzantot M, Sharma Y, Elgohary A, Ho BJ, Srivastava MB, Chang KW (2018) Generating natural language adversarial examples. In: Proceedings of EMNLP. https:\/\/doi.org\/10.18653\/v1\/d18-1316","DOI":"10.18653\/v1\/d18-1316"},{"key":"1553_CR3","doi-asserted-by":"crossref","unstructured":"Anaby-Tavor A, Carmeli B, Goldbraich E, Kantor A, Kour G, Shlomov S, Tepper N, Zwerdling N (2020) Do not have enough data? Deep learning to the rescue! 
Proceedings of the AAAI. http:\/\/arxiv.org\/abs\/1911.03118","DOI":"10.1609\/aaai.v34i05.6233"},{"key":"1553_CR4","doi-asserted-by":"publisher","unstructured":"Banko M, Brill E (2001) Scaling to very very large corpora for natural language disambiguation. In: Proceedings of the 39th annual meeting of the Association for Computational Linguistics. https:\/\/doi.org\/10.3115\/1073012.1073017","DOI":"10.3115\/1073012.1073017"},{"key":"1553_CR5","unstructured":"Bayer M, Kaufhold MA, Reuter C (2021) A survey on data augmentation for text classification. https:\/\/arxiv.org\/abs\/2107.03158"},{"key":"1553_CR6","unstructured":"Belinkov Y, Bisk Y (2018) Synthetic and natural noise both break neural machine translation. In: Proceedings of ICLR"},{"key":"1553_CR7","unstructured":"Brown TB, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A, Agarwal S, Herbert-Voss A, Krueger G, Henighan T, Child R, Ramesh A, Ziegler DM, Wu J, Winter C, Hesse C, Chen M, Sigler E, Litwin M, Gray S, Chess B, Clark J, Berner C, McCandlish S, Radford A, Sutskever I, Amodei D (2020) Language models are few-shot learners. In: NeurIPS, http:\/\/arxiv.org\/abs\/2005.14165"},{"key":"1553_CR8","doi-asserted-by":"publisher","unstructured":"Carreira R, Crato JM, Gon\u00e7alves D, Jorge JA (2004) Evaluating adaptive user profiles for news classification. In: Proceedings IUI. https:\/\/doi.org\/10.1145\/964442.964481","DOI":"10.1145\/964442.964481"},{"key":"1553_CR9","doi-asserted-by":"publisher","unstructured":"Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) SMOTE: synthetic minority over-sampling technique. JAIR. https:\/\/doi.org\/10.1613\/jair.953","DOI":"10.1613\/jair.953"},{"key":"1553_CR10","unstructured":"Coulombe C (2018) Text data augmentation made simple by leveraging NLP cloud APIs. arXiv preprint arXiv:1812.04718, pp 1\u201333. 
http:\/\/arxiv.org\/abs\/1812.04718"},{"key":"1553_CR11","doi-asserted-by":"publisher","unstructured":"Fadaee M, Bisazza A, Monz C (2017) Data augmentation for low-resource neural machine translation. In: ACL. https:\/\/doi.org\/10.18653\/v1\/P17-2090","DOI":"10.18653\/v1\/P17-2090"},{"key":"1553_CR12","doi-asserted-by":"publisher","unstructured":"Howard J, Gugger S (2020) Fastai: a layered api for deep learning. Information (Switzerland). https:\/\/doi.org\/10.3390\/info11020108","DOI":"10.3390\/info11020108"},{"key":"1553_CR13","doi-asserted-by":"publisher","unstructured":"Howard J, Ruder S (2018) Universal language model fine-tuning for text classification. In: Proceedings of ACL. https:\/\/doi.org\/10.18653\/v1\/p18-1031","DOI":"10.18653\/v1\/p18-1031"},{"issue":"4","key":"1553_CR14","doi-asserted-by":"publisher","first-page":"795","DOI":"10.1007\/s13042-020-01062-1","volume":"11","author":"YQ Hu","year":"2020","unstructured":"Hu YQ, Yu Y (2020) A technical view on neural architecture search. Int J Mach Learn Cybern 11(4):795\u2013811. https:\/\/doi.org\/10.1007\/s13042-020-01062-1","journal-title":"Int J Mach Learn Cybern"},{"key":"1553_CR15","unstructured":"Hu Z, Tan B, Salakhutdinov R, Mitchell T, Xing EP (2019) Learning data manipulation for augmentation and weighting"},{"key":"1553_CR16","doi-asserted-by":"publisher","unstructured":"Huong TH, Hoang VT (2020) A data augmentation technique based on text for Vietnamese sentiment analysis. In: Proceedings of IAIT pp 1\u20135. https:\/\/doi.org\/10.1145\/3406601.3406618","DOI":"10.1145\/3406601.3406618"},{"key":"1553_CR17","doi-asserted-by":"publisher","unstructured":"Imran M, Castillo C, Diaz F, Vieweg S (2018) Processing social media messages in mass emergency: Survey summary. In: Companion proceedings of the web conference 2018, international world wide web conferences steering committee, Republic and Canton of Geneva, CHE, WWW \u201918, pp 507\u2013511. 
https:\/\/doi.org\/10.1145\/3184558.3186242","DOI":"10.1145\/3184558.3186242"},{"key":"1553_CR18","doi-asserted-by":"crossref","unstructured":"Jiao X, Yin Y, Shang L, Jiang X, Chen X, Li L, Wang F, Liu Q (2019) TinyBERT: distilling BERT for natural language understanding. In: EMNLP 2020, pp 1\u201314. http:\/\/arxiv.org\/abs\/1909.10351","DOI":"10.18653\/v1\/2020.findings-emnlp.372"},{"key":"1553_CR19","doi-asserted-by":"publisher","unstructured":"Kafle K, Yousefhussien M, Kanan C (2018) Data augmentation for visual question answering. In: Proceedings of the 10th international conference on natural language generation. https:\/\/doi.org\/10.18653\/v1\/w17-3529","DOI":"10.18653\/v1\/w17-3529"},{"key":"1553_CR20","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-658-33341-6","volume-title":"Information refinement technologies for crisis informatics: user expectations and design principles for social media and mobile apps","author":"MA Kaufhold","year":"2021","unstructured":"Kaufhold MA (2021) Information refinement technologies for crisis informatics: user expectations and design principles for social media and mobile apps. Springer Verlag, Wiesbaden, Germany"},{"key":"1553_CR21","doi-asserted-by":"publisher","unstructured":"Kaufhold MA, Bayer M, Reuter C (2020) Rapid relevance classification of social media posts in disasters and emergencies: a system and evaluation featuring active, incremental and online learning. Inf Process Manage. https:\/\/doi.org\/10.1016\/j.ipm.2019.102132","DOI":"10.1016\/j.ipm.2019.102132"},{"key":"1553_CR22","unstructured":"Khan B (2019) Generate your own text with OpenAI\u2019s GPT-2. https:\/\/www.kaggle.com\/bkkaggle\/generate-your-own-text-with-openai-s-gpt-2-117m"},{"key":"1553_CR23","unstructured":"Kingma DP, Ba JL (2015) Adam: a method for stochastic optimization. 
In: ICLR 2015\u2014conference track proceedings"},{"key":"1553_CR24","doi-asserted-by":"publisher","unstructured":"Kobayashi S (2018) Contextual augmentation: data augmentation by words with paradigmatic relations. arXiv preprint arXiv:1805.06201. https:\/\/doi.org\/10.18653\/v1\/n18-2072","DOI":"10.18653\/v1\/n18-2072"},{"key":"1553_CR25","unstructured":"Kolomiyets O, Bethard S, Moens MF (2011) Model-portability experiments for textual temporal analysis. In: Proceedings of ACL-HLT"},{"key":"1553_CR26","doi-asserted-by":"publisher","unstructured":"Krishnalal G, Rengarajan SB, Srinivasagan KG (2010) A new text mining approach based on HMM-SVM for web news classification. Int J Comput Appl. https:\/\/doi.org\/10.5120\/395-589","DOI":"10.5120\/395-589"},{"key":"1553_CR27","doi-asserted-by":"crossref","unstructured":"Kruspe A, Kersten J, Wiegmann M, Stein B, Klan F (2018) Classification of incident-related tweets : tackling imbalanced training data using hybrid CNNs and translation-based data augmentation. In: Notebook papers of TREC","DOI":"10.6028\/NIST.SP.1250.incident-DLR_DW"},{"key":"1553_CR28","doi-asserted-by":"publisher","unstructured":"Kumar A, Bhattamishra S, Bhandari M, Talukdar P (2019) Submodular optimization-based diverse paraphrasing and its effectiveness in data augmentation. In: Proceedings of NAACL-HLT, pp 3609\u20133619. https:\/\/doi.org\/10.18653\/v1\/n19-1363","DOI":"10.18653\/v1\/n19-1363"},{"key":"1553_CR29","unstructured":"Kumar V, Choudhary A, Cho E (2020) Data augmentation using pre-trained transformer models"},{"issue":"1109\/5","key":"1553_CR30","first-page":"726791","volume":"10","author":"Y LeCun","year":"1998","unstructured":"LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. 
Proc IEEE 10(1109\/5):726791","journal-title":"Proc IEEE"},{"key":"1553_CR31","doi-asserted-by":"publisher","first-page":"415","DOI":"10.1007\/978-3-031-02145-9","volume-title":"A survey of opinion mining and sentiment analysis","author":"B Liu","year":"2012","unstructured":"Liu B, Zhang L (2012) A survey of opinion mining and sentiment analysis. Springer, Boston, MA, US, pp 415\u2013463"},{"key":"1553_CR32","doi-asserted-by":"crossref","unstructured":"Longpre S, Wang Y, DuBois C (2020) How effective is task-agnostic data augmentation for pretrained transformers? In: Findings of EMNLP","DOI":"10.18653\/v1\/2020.findings-emnlp.394"},{"issue":"4","key":"1553_CR33","doi-asserted-by":"publisher","first-page":"1093","DOI":"10.1016\/j.asej.2014.04.011","volume":"5","author":"W Medhat","year":"2014","unstructured":"Medhat W, Hassan A, Korashy H (2014) Sentiment analysis algorithms and applications: a survey. Ain Shams Eng J 5(4):1093\u20131113. https:\/\/doi.org\/10.1016\/j.asej.2014.04.011","journal-title":"Ain Shams Eng J"},{"key":"1553_CR34","unstructured":"Merity S, Keskar NS, Socher R (2018) Regularizing and optimizing LSTM language models. In: ICLR 2018\u2014conference track proceedings"},{"key":"1553_CR35","unstructured":"Miyato T, Dai AM, Goodfellow I (2017) Adversarial training methods for semi-supervised text classification. In: Conference Track - ICLR"},{"key":"1553_CR36","doi-asserted-by":"crossref","unstructured":"Nguyen D, Ali Al\u00a0Mannai K, Joty S, Sajjad H, Imran M, Mitra P (2017) Robust classification of crisis-related data on social networks using convolutional neural networks. In: Proceedings of the international AAAI conference on web and social media 11(1). https:\/\/ojs.aaai.org\/index.php\/ICWSM\/article\/view\/14950","DOI":"10.1609\/icwsm.v11i1.14950"},{"key":"1553_CR37","doi-asserted-by":"publisher","unstructured":"Olteanu A, Vieweg S, Castillo C (2015) What to expect when the unexpected happens: social media communications across crises. 
In: Proceedings of CSCW. https:\/\/doi.org\/10.1145\/2675133.2675242","DOI":"10.1145\/2675133.2675242"},{"key":"1553_CR38","doi-asserted-by":"publisher","unstructured":"Qiu S, Xu B, Zhang J, Wang Y, Shen X, de\u00a0Melo G, Long C, Li X (2020) EasyAug: an automatic textual data augmentation platform for classification tasks. In: Companion proceedings of the web conference 2020. https:\/\/doi.org\/10.1145\/3366424.3383552","DOI":"10.1145\/3366424.3383552"},{"key":"1553_CR39","unstructured":"Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I (2018) Language models are unsupervised multitask learners. In: OpenAI blog"},{"issue":"5","key":"1553_CR40","doi-asserted-by":"publisher","first-page":"1255","DOI":"10.1007\/s13042-020-01232-1","volume":"12","author":"BS Raghuwanshi","year":"2021","unstructured":"Raghuwanshi BS, Shukla S (2021) Classifying imbalanced data using SMOTE based class-specific kernelized ELM. Int J Mach Learn Cybern 12(5):1255\u20131280. https:\/\/doi.org\/10.1007\/s13042-020-01232-1","journal-title":"Int J Mach Learn Cybern"},{"key":"1553_CR41","doi-asserted-by":"publisher","unstructured":"Reimers N, Gurevych I (2019) Sentence-BERT: sentence embeddings using Siamese BERT-networks. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP). https:\/\/doi.org\/10.18653\/v1\/d19-1410","DOI":"10.18653\/v1\/d19-1410"},{"issue":"1","key":"1553_CR42","doi-asserted-by":"publisher","first-page":"1","DOI":"10.4018\/jiscrm.2012010101","volume":"4","author":"C Reuter","year":"2012","unstructured":"Reuter C, Marx A, Pipek V (2012) Crisis management 2.0: towards a systematization of social software use in crisis situations. Int J Inf Syst Crisis Response Manage (IJISCRAM) 4(1):1\u201316. 
https:\/\/doi.org\/10.4018\/jiscrm.2012010101","journal-title":"Int J Inf Syst Crisis Response Manage (IJISCRAM)"},{"key":"1553_CR43","doi-asserted-by":"publisher","first-page":"96","DOI":"10.1016\/j.ijhcs.2016.03.005","volume":"95","author":"C Reuter","year":"2016","unstructured":"Reuter C, Ludwig T, Kaufhold MA, Spielhofer T (2016) Emergency services attitudes towards social media: a quantitative and qualitative survey across europe. Int J Hum Comput Stud (IJHCS) 95:96\u2013111. https:\/\/doi.org\/10.1016\/j.ijhcs.2016.03.005","journal-title":"Int J Hum Comput Stud (IJHCS)"},{"key":"1553_CR44","doi-asserted-by":"publisher","unstructured":"Rizos G, Hemker K, Schuller B (2019) Augment to prevent: short-text data augmentation in deep learning for hate-speech classification. In: Proceedings of CIKM. https:\/\/doi.org\/10.1145\/3357384.3358040","DOI":"10.1145\/3357384.3358040"},{"key":"1553_CR45","doi-asserted-by":"publisher","unstructured":"\u015eahin GG, Steedman M (2018) Data augmentation via dependency tree morphing for low-resource languages. In: Proceedings of the 2018 conference on empirical methods in natural language processing. https:\/\/doi.org\/10.18653\/v1\/d18-1545","DOI":"10.18653\/v1\/d18-1545"},{"key":"1553_CR46","doi-asserted-by":"publisher","unstructured":"Schulz A, Guckelsberger C, Janssen F (2017) Semantic abstraction for generalization of tweet classification: an evaluation of incident-related tweets. Semantic Web. https:\/\/doi.org\/10.3233\/SW-150188","DOI":"10.3233\/SW-150188"},{"key":"1553_CR47","doi-asserted-by":"publisher","unstructured":"Sennrich R, Haddow B, Birch A (2016) Improving neural machine translation models with monolingual data. 
In: ACL, https:\/\/doi.org\/10.18653\/v1\/p16-1009","DOI":"10.18653\/v1\/p16-1009"},{"key":"1553_CR48","doi-asserted-by":"publisher","DOI":"10.1186\/s40537-019-0197-0","author":"C Shorten","year":"2019","unstructured":"Shorten C, Khoshgoftaar TM (2019) A survey on image data augmentation for deep learning. J Big Data. https:\/\/doi.org\/10.1186\/s40537-019-0197-0","journal-title":"J Big Data"},{"key":"1553_CR49","unstructured":"Smith LN (2018) A disciplined approach to neural network hyper-parameters: Part 1- learning rate, batch size, momentum, and weight decay"},{"key":"1553_CR50","unstructured":"Socher R, Perelygin A, Wu JY, Chuang J, Manning CD, Ng AY, Potts C (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of EMNLP"},{"key":"1553_CR51","doi-asserted-by":"publisher","unstructured":"Soden R, Palen L (2018) Informating crisis: Expanding critical perspectives in crisis informatics. In: Proc ACM Hum-Comput Interact 2 (CSCW). https:\/\/doi.org\/10.1145\/3274431","DOI":"10.1145\/3274431"},{"key":"1553_CR52","unstructured":"Solaiman I, Brundage M, Clark J, Askell A, Herbert-Voss A, Wu J, Radford A, Wang J (2019) Release strategies and the social impacts of language models"},{"key":"1553_CR53","doi-asserted-by":"publisher","first-page":"156","DOI":"10.1016\/j.ijinfomgt.2017.12.002","volume":"39","author":"S Stieglitz","year":"2018","unstructured":"Stieglitz S, Mirbabaie M, Ross B, Neuberger C (2018) Social media analytics\u2014challenges in topic discovery, data collection, and data preparation. Int J Inf Manage 39:156\u2013168","journal-title":"Int J Inf Manage"},{"key":"1553_CR54","doi-asserted-by":"publisher","unstructured":"Sun X, He J (2020) A novel approach to generate a large scale of supervised data for short text sentiment analysis. multimedia tools and applications. 
https:\/\/doi.org\/10.1007\/s11042-018-5748-4","DOI":"10.1007\/s11042-018-5748-4"},{"key":"1553_CR55","doi-asserted-by":"publisher","unstructured":"Sun C, Shrivastava A, Singh S, Gupta A (2017) Revisiting unreasonable effectiveness of data in deep learning era. In: Proceedings of the ICCV. https:\/\/doi.org\/10.1109\/ICCV.2017.97","DOI":"10.1109\/ICCV.2017.97"},{"key":"1553_CR56","doi-asserted-by":"publisher","unstructured":"Taylor L, Nitschke G (2019) Improving deep learning with generic data augmentation. In: Proceedings of SSCI. https:\/\/doi.org\/10.1109\/SSCI.2018.8628742","DOI":"10.1109\/SSCI.2018.8628742"},{"key":"1553_CR57","doi-asserted-by":"crossref","unstructured":"Wang C, Lillis D (2020) Classification for crisis-related tweets leveraging word embeddings and data augmentation. In: TREC 2019. https:\/\/trec.nist.gov\/","DOI":"10.6028\/NIST.SP.1250.incident-CS-UCD"},{"key":"1553_CR58","doi-asserted-by":"publisher","unstructured":"Wang WY, Yang D (2015) That\u2019s so annoying!!!: A lexical and frame-semantic embedding based data augmentation approach to automatic categorization of annoying behaviors using #petpeeve tweets. In: Proceedings of EMNLP. https:\/\/doi.org\/10.18653\/v1\/d15-1306","DOI":"10.18653\/v1\/d15-1306"},{"key":"1553_CR59","doi-asserted-by":"publisher","unstructured":"Wei J, Zou K (2019) EDA: easy data augmentation techniques for boosting performance on text classification tasks. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP). https:\/\/doi.org\/10.18653\/v1\/d19-1670","DOI":"10.18653\/v1\/d19-1670"},{"key":"1553_CR60","unstructured":"Woolf M (2019) GitHub\u2014gpt-2-simple: Python package to easily retrain OpenAI\u2019s GPT-2 text-generating model on new texts. 
https:\/\/github.com\/minimaxir\/gpt-2-simple"},{"issue":"11","key":"1553_CR61","doi-asserted-by":"publisher","first-page":"1432","DOI":"10.1002\/asi.24493","volume":"72","author":"R Xiang","year":"2021","unstructured":"Xiang R, Chersoni E, Lu Q, Huang CR, Li W, Long Y (2021) Lexical data augmentation for sentiment analysis. J Assoc Inf Sci Technol 72(11):1432\u20131447. https:\/\/doi.org\/10.1002\/asi.24493","journal-title":"J Assoc Inf Sci Technol"},{"key":"1553_CR62","unstructured":"Xu Y, Jia R, Mou L, Li G, Chen Y, Lu Y, Jin Z (2016) Improved relation classification by deep recurrent neural networks with data augmentation. In: Proceedings of COLING 2016: technical papers"},{"key":"1553_CR63","unstructured":"Zeiler MD, Fergus R (2013) Stochastic pooling for regularization of deep convolutional neural networks. In: Proceedings of ICLR"},{"key":"1553_CR64","doi-asserted-by":"publisher","DOI":"10.1007\/s13042-021-01321-9","author":"J Zhai","year":"2021","unstructured":"Zhai J, Qi J, Zhang S (2021) Imbalanced data classification based on diverse sample generation and classifier fusion. Int J Mach Learn Cybern. https:\/\/doi.org\/10.1007\/s13042-021-01321-9","journal-title":"Int J Mach Learn Cybern"},{"key":"1553_CR65","unstructured":"Zhang H, Cisse M, Dauphin YN, Lopez-Paz D (2018) MixUp: beyond empirical risk minimization. In: Conference track of ICLR"},{"key":"1553_CR66","unstructured":"Zhang X, Zhao J, Lecun Y (2015) Character-level convolutional networks for text classification. 
In: NIPS"}],"container-title":["International Journal of Machine Learning and Cybernetics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13042-022-01553-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s13042-022-01553-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13042-022-01553-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,22]],"date-time":"2024-09-22T00:13:17Z","timestamp":1726963997000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s13042-022-01553-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,12]]},"references-count":66,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,1]]}},"alternative-id":["1553"],"URL":"https:\/\/doi.org\/10.1007\/s13042-022-01553-3","relation":{},"ISSN":["1868-8071","1868-808X"],"issn-type":[{"value":"1868-8071","type":"print"},{"value":"1868-808X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,4,12]]},"assertion":[{"value":"28 June 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 March 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 April 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflicts of interest\/Competing interests"}},{"value":"The code specifics 
are outlined in Appendix\u00a0C.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Code availability"}},{"value":"Ethics are discussed in Appendix\u00a0D.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics"}}]}}