{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T01:39:26Z","timestamp":1777426766035,"version":"3.51.4"},"reference-count":169,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2021,6,30]],"date-time":"2021-06-30T00:00:00Z","timestamp":1625011200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2021,9,30]]},"abstract":"<jats:p>Word representation has always been an important research area in the history of natural language processing (NLP). Understanding such complex text data is imperative, given that it is rich in information and can be used widely across various applications. In this survey, we explore different word representation models and its power of expression, from the classical to modern-day state-of-the-art word representation language models (LMS). We describe a variety of text representation methods, and model designs have blossomed in the context of NLP, including SOTA LMs. These models can transform large volumes of text into effective vector representations capturing the same semantic information. Further, such representations can be utilized by various machine learning (ML) algorithms for a variety of NLP-related tasks. 
In the end, this survey briefly discusses the commonly used ML- and DL-based classifiers, evaluation metrics, and the applications of these word embeddings in different NLP tasks.<\/jats:p>","DOI":"10.1145\/3434237","type":"journal-article","created":{"date-parts":[[2021,6,30]],"date-time":"2021-06-30T20:06:29Z","timestamp":1625083589000},"page":"1-35","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":130,"title":["A Comprehensive Survey on Word Representation Models: From Classical to State-of-the-Art Word Representation Language Models"],"prefix":"10.1145","volume":"20","author":[{"given":"Usman","family":"Naseem","sequence":"first","affiliation":[{"name":"School of Computer Science, The University of Sydney, Australia"}]},{"given":"Imran","family":"Razzak","sequence":"additional","affiliation":[{"name":"School of Information Technology, Deakin University, Australia"}]},{"given":"Shah Khalid","family":"Khan","sequence":"additional","affiliation":[{"name":"School of Engineering, RMIT University, Australia"}]},{"given":"Mukesh","family":"Prasad","sequence":"additional","affiliation":[{"name":"School of Computer Science, University of Technology Sydney, Australia"}]}],"member":"320","published-online":{"date-parts":[[2021,6,30]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Passonneau","author":"Agarwal Apoorv","year":"2011","unstructured":"Apoorv Agarwal, Boyi Xie, Ilia Vovsha, Owen Rambow, and Rebecca J. Passonneau. 2011. Sentiment analysis of Twitter data. https:\/\/www.aclweb.org\/anthology\/W11-0705.pdf."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.5555\/2535015"},{"key":"e_1_2_1_3_1","volume-title":"Aggarwal and ChengXiang Zhai","author":"Charu","year":"2012","unstructured":"Charu C. 
Aggarwal and ChengXiang Zhai. 2012. A survey of text classification algorithms. In Mining Text Data. Springer, Boston, MA, 163\u2013222."},{"key":"e_1_2_1_4_1","unstructured":"Edgar Altszyler Mariano Sigman and Diego Fern\u00e1ndez Slezak. 2016. Comparative study of LSA vs Word2vec embeddings in small corpora: A case study in dreams database. (2016). arxiv:abs\/1610.01520."},{"key":"e_1_2_1_5_1","unstructured":"Alexandra Balahur. 2013. Sentiment analysis in social media texts. In WASSA@NAACL-HLT."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2015.06.002"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/3235571"},{"key":"e_1_2_1_8_1","volume-title":"Intelligent Computing Methodologies, De-Shuang Huang, Kang-Hyun Jo","author":"Bao Yanwei","unstructured":"Yanwei Bao, Changqin Quan, Lijuan Wang, and Fuji Ren. 2014. The role of pre-processing in Twitter sentiment analysis. In Intelligent Computing Methodologies, De-Shuang Huang, Kang-Hyun Jo, and Ling Wang (Eds.). Springer International Publishing, Cham, 615\u2013624."},{"key":"e_1_2_1_9_1","doi-asserted-by":"crossref","unstructured":"
Iz Beltagy Kyle Lo and Arman Cohan. 2019. SciBERT: A pretrained language model for scientific text. arxiv:cs.CL\/1903.10676.","DOI":"10.18653\/v1\/D19-1371"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.50"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/944919.944966"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/72.279181"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the Workshop on Sentiment Analysis Where AI Meets Psychology (\u201911)","author":"Bermingham Adam","year":"2011","unstructured":"Adam Bermingham and Alan Smeaton. 2011. On using Twitter to monitor political sentiment and predict election results. In Proceedings of the Workshop on Sentiment Analysis Where AI Meets Psychology (\u201911). Asian Federation of Natural Language Processing, 2\u201310. Retrieved from https:\/\/www.aclweb.org\/anthology\/W11-3702."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/SocialCom.2013.54"},{"key":"e_1_2_1_15_1","volume-title":"Enriching word vectors with subword information. arXiv preprint arXiv:1607.04606","author":"Bojanowski Piotr","year":"2016","unstructured":"Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2016. Enriching word vectors with subword information. arXiv preprint arXiv:1607.04606 (2016)."},{"key":"e_1_2_1_16_1","volume-title":"Enriching word vectors with subword information. 
CoRR abs\/1607.04606","author":"Bojanowski Piotr","year":"2016","unstructured":"Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2016. Enriching word vectors with subword information. CoRR abs\/1607.04606 (2016)."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.5555\/3157382.3157584"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2016.07.005"},{"key":"e_1_2_1_19_1","volume-title":"Association for the Advancement of Artificial Intelligence Conference.","author":"Cambria Erik","year":"2018","unstructured":"Erik Cambria, Soujanya Poria, Devamanyu Hazarika, and Kenneth Kwok. 2018. SenticNet 5: Discovering conceptual primitives for sentiment analysis by means of context embeddings. In Association for the Advancement of Artificial Intelligence Conference."},{"key":"e_1_2_1_20_1","volume-title":"Boosting trees for anti-spam email filtering. CoRR cs.CL\/0109015","author":"Carreras Xavier","year":"2001","unstructured":"Xavier Carreras and Llu\u00eds M\u00e0rquez. 2001. Boosting trees for anti-spam email filtering. CoRR cs.CL\/0109015 (2001)."},{"key":"e_1_2_1_21_1","volume-title":"Acquiring a large scale polarity lexicon through unsupervised distributional methods","author":"Castellucci Giuseppe","unstructured":"Giuseppe Castellucci, Danilo Croce, and Roberto Basili. 2015. Acquiring a large scale polarity lexicon through unsupervised distributional methods. 
In Natural Language Processing and Information Systems, Chris Biemann, Siegfried Handschuh, Andr\u00e9 Freitas, Farid Meziane, and Elisabeth M\u00e9tais (Eds.). Springer International Publishing, Cham, 73\u201386."},{"key":"e_1_2_1_22_1","unstructured":"Arda Celebi and Arzucan Ozgur. 2016. Segmenting hashtags using automatically created training data. https:\/\/www.aclweb.org\/anthology\/L16-1476.pdf."},{"key":"e_1_2_1_23_1","volume-title":"Zhao Duan, and Jianquan Ma.","author":"Chen Wei James","year":"2017","unstructured":"Wei James Chen, Xiaoshen Xie, Jiale Wang, Biswajeet Pradhan, Haoyuan Hong, Dieu Tien Bui, Zhao Duan, and Jianquan Ma. 2017. A comparative study of logistic model tree, random forest, and classification and regression tree models for spatial prediction of landslide susceptibility. https:\/\/www.sciencedirect.com\/science\/article\/pii\/S0341816216305136."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1179"},{"key":"e_1_2_1_25_1","volume-title":"Empirical evaluation of gated recurrent neural networks on sequence modeling. 
CoRR abs\/1412.3555","author":"Chung Junyoung","year":"2014","unstructured":"Junyoung Chung, \u00c7aglar G\u00fcl\u00e7ehre, KyungHyun Cho, and Yoshua Bengio. 2014. Empirical evaluation of gated recurrent neural networks on sequence modeling. CoRR abs\/1412.3555 (2014)."},{"key":"e_1_2_1_26_1","volume-title":"Manning","author":"Clark Kevin","year":"2020","unstructured":"Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. 2020. Electra: Pre-training text encoders as discriminators rather than generators. arXiv preprint arXiv:2003.10555 (2020)."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390156.1390177"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2078186"},{"key":"e_1_2_1_29_1","volume-title":"Automated hate speech detection and the problem of offensive language. CoRR abs\/1703.04009","author":"Davidson Thomas","year":"2017","unstructured":"Thomas Davidson, Dana Warmsley, Michael W. Macy, and Ingmar Weber. 2017. Automated hate speech detection and the problem of offensive language. CoRR abs\/1703.04009 (2017)."},{"key":"e_1_2_1_30_1","volume-title":"BERT: Pre-training of deep bidirectional transformers for language understanding. CoRR abs\/1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. 
BERT: Pre-training of deep bidirectional transformers for language understanding. CoRR abs\/1810.04805 (2018)."},{"key":"e_1_2_1_31_1","volume-title":"Cohen","author":"Dhingra Bhuwan","year":"2017","unstructured":"Bhuwan Dhingra, Hanxiao Liu, Ruslan Salakhutdinov, and William W. Cohen. 2017. A comparative study of word embeddings for reading comprehension. CoRR abs\/1703.00993 (2017)."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/3454287.3455457"},{"key":"e_1_2_1_33_1","volume-title":"International Conference on Computational Linguistics.","author":"dos Santos C\u00edcero Nogueira","unstructured":"C\u00edcero Nogueira dos Santos and Ma\u00edra A. de C. Gatti. 2014. Deep convolutional neural networks for sentiment analysis of short texts. In International Conference on Computational Linguistics."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00799-018-0237-y"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.5555\/1390681.1442794"},{"key":"e_1_2_1_36_1","volume-title":"Chris Dyer, Eduard H. Hovy, and Noah A. Smith.","author":"Faruqui Manaal","year":"2014","unstructured":"Manaal Faruqui, Jesse Dodge, Sujay Kumar Jauhar, Chris Dyer, Eduard H. Hovy, and Noah A. Smith. 2014. 
Retrofitting word vectors to semantic lexicons. CoRR abs\/1411.4166 (2014)."},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of 5th International Joint Conference on Natural Language Processing. Asian Federation of Natural Language Processing, 893\u2013901","author":"Foster Jennifer","year":"2011","unstructured":"Jennifer Foster, \u00d6zlem \u00c7etino\u011flu, Joachim Wagner, Joseph Le Roux, Joakim Nivre, Deirdre Hogan, and Josef van Genabith. 2011. From news to comment: Resources and benchmarks for parsing the language of Web 2.0. In Proceedings of 5th International Joint Conference on Natural Language Processing. Asian Federation of Natural Language Processing, 893\u2013901. Retrieved from https:\/\/www.aclweb.org\/anthology\/I11-1100."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2017.01.079"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1198\/004017007000000245"},{"key":"e_1_2_1_40_1","doi-asserted-by":"crossref","unstructured":"Anastasia Giachanou Julio Gonzalo Ida Mele and Fabio Crestani. 2017. Sentiment propagation for predicting reputation polarity. 
DOI:https:\/\/doi.org\/10.1007\/978-3-319-56608-5_18","DOI":"10.1007\/978-3-319-56608-5_18"},{"key":"e_1_2_1_41_1","unstructured":"Kevin Gimpel Nathan Schneider Dipanjan Das Daniel Mills Jacob Eisenstein Michael Heilman Dani Yogatama Jeffrey Flanigan and Noah A. Smith. [n.d.]. Part-of-Speech Tagging for Twitter: Annotation Features and Experiments. https:\/\/www.aclweb.org\/anthology\/P11-2008.pdf."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/IECON.2017.8217316"},{"key":"e_1_2_1_43_1","unstructured":"Edel Greevy. 2004. Automatic text categorisation of racist webpages. http:\/\/doras.dcu.ie\/17275\/1\/edel_greevy_20120702122736.pdf."},{"key":"e_1_2_1_44_1","volume-title":"A survey of text mining techniques and applications. J. Emerg. Technol. Web Intell. 1 (08","author":"Gupta Vishal","year":"2009","unstructured":"Vishal Gupta and Gurpreet Lehal. 2009. A survey of text mining techniques and applications. J. Emerg. Technol. Web Intell. 1 (08 2009). DOI:https:\/\/doi.org\/10.4304\/jetwi.1.1.60-76"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2013.05.005"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2004.58"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.5555\/2002472.2002489"},{"key":"e_1_2_1_48_1","volume-title":"High-risk learning: Acquiring new word vectors from tiny data. 
CoRR abs\/1707.06556","author":"Herbelot Aur\u00e9lie","year":"2017","unstructured":"Aur\u00e9lie Herbelot and Marco Baroni. 2017. High-risk learning: Acquiring new word vectors from tiny data. CoRR abs\/1707.06556 (2017)."},{"key":"e_1_2_1_49_1","doi-asserted-by":"crossref","first-page":"677","DOI":"10.1080\/01621459.1968.11009286","article-title":"Posterior distribution of percentiles: Bayes\u2019 theorem for sampling from a population","volume":"63","author":"Hill Bruce M.","year":"1968","unstructured":"Bruce M. Hill. 1968. Posterior distribution of percentiles: Bayes\u2019 theorem for sampling from a population. J. Amer. Statist. Assoc. 63, 322 (1968), 677\u2013691. Retrieved from http:\/\/www.jstor.org\/stable\/2284038.","journal-title":"J. Amer. Statist. Assoc."},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.709601"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1031"},{"key":"e_1_2_1_53_1","volume-title":"Text Analytics in Social Media","author":"Hu Xia","unstructured":"Xia Hu and Huan Liu. 2012. Text Analytics in Social Media. Springer US, Boston, MA, 385\u2013414. DOI:https:\/\/doi.org\/10.1007\/978-1-4614-3223-4_12"},{"key":"e_1_2_1_54_1","volume-title":"Workshop on Knowledge Discovery from Advanced Databases. 65\u201370","author":"Tan Ah","year":"1999","unstructured":"Ah hwee Tan. 1999. 
Text mining: The state of the art and the challenges. In Workshop on Knowledge Discovery from Advanced Databases. 65\u201370."},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P15-1010"},{"key":"e_1_2_1_56_1","volume-title":"Deep contextualized word representations for detecting sarcasm and irony. CoRR abs\/1809.09795","author":"Ilic Suzana","year":"2018","unstructured":"Suzana Ilic, Edison Marrese-Taylor, Jorge A. Balazs, and Yutaka Matsuo. 2018. Deep contextualized word representations for detecting sarcasm and irony. CoRR abs\/1809.09795 (2018)."},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0823-z"},{"key":"e_1_2_1_58_1","doi-asserted-by":"crossref","unstructured":"Zhao Jianqiang. 2015. Pre-processing boosting Twitter sentiment analysis? DOI:https:\/\/doi.org\/10.1109\/SmartCity.2015.158","DOI":"10.1109\/SmartCity.2015.158"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2017.2672677"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2017.2776930"},{"key":"e_1_2_1_61_1","volume-title":"Effective use of word order for text categorization with convolutional neural networks. CoRR abs\/1412.1058","author":"Johnson Rie","year":"2014","unstructured":"Rie Johnson and Tong Zhang. 2014. 
Effective use of word order for text categorization with convolutional neural networks. CoRR abs\/1412.1058 (2014)."},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00300"},{"key":"e_1_2_1_63_1","volume-title":"Bag of tricks for efficient text classification. CoRR abs\/1607.01759","author":"Joulin Armand","year":"2016","unstructured":"Armand Joulin, Edouard Grave, Piotr Bojanowski, and Tomas Mikolov. 2016. Bag of tricks for efficient text classification. CoRR abs\/1607.01759 (2016)."},{"key":"e_1_2_1_64_1","volume-title":"CTRL: A conditional transformer language model for controllable generation. arxiv:cs.CL\/1909.05858.","author":"Keskar Nitish Shirish","year":"2019","unstructured":"Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, Caiming Xiong, and Richard Socher. 2019. CTRL: A conditional transformer language model for controllable generation. arxiv:cs.CL\/1909.05858."},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.dss.2013.09.004"},{"key":"e_1_2_1_66_1","volume-title":"Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882","author":"Kim Yoon","year":"2014","unstructured":"Yoon Kim. 2014. Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 (2014)."},{"key":"e_1_2_1_67_1","unstructured":"Vandana Korde and C. Namrata Mahender. 2012. Text classification and classifiers: A survey. 
http:\/\/www.airccse.org\/journal\/ijaia\/papers\/3212ijaia08.pdf https:\/\/www.researchgate.net\/publication\/276196340_Text_Classification_and_ClassifiersA_Survey."},{"key":"e_1_2_1_68_1","volume-title":"Moore","author":"Kouloumpis Efthymios","year":"2011","unstructured":"Efthymios Kouloumpis, Theresa Wilson, and Johanna D. Moore. 2011. Twitter sentiment analysis: The good the bad and the OMG! In International AAAI Conference on Web and Social Media."},{"key":"e_1_2_1_69_1","volume-title":"Mojtaba Heidarysafa, Sanjana Mendu, Laura E. Barnes, and Donald E. Brown.","author":"Kowsari Kamran","year":"2019","unstructured":"Kamran Kowsari, Kiana Jafari Meimandi, Mojtaba Heidarysafa, Sanjana Mendu, Laura E. Barnes, and Donald E. Brown. 2019. Text classification algorithms: A survey. CoRR abs\/1904.08067 (2019)."},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.5555\/2891460.2891697"},{"key":"e_1_2_1_71_1","volume-title":"Cross-lingual language model pretraining. CoRR abs\/1901.07291","author":"Lample Guillaume","year":"2019","unstructured":"Guillaume Lample and Alexis Conneau. 2019. Cross-lingual language model pretraining. 
CoRR abs\/1901.07291 (2019)."},{"key":"e_1_2_1_72_1","volume-title":"ALBERT: A lite BERT for self-supervised learning of language representations. arxiv:cs.CL\/1909.11942.","author":"Lan Zhenzhong","year":"2019","unstructured":"Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut. 2019. ALBERT: A lite BERT for self-supervised learning of language representations. arxiv:cs.CL\/1909.11942."},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.5555\/1753126.1753129"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2017.01.117"},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.5555\/3044805.3045025"},{"key":"e_1_2_1_76_1","volume-title":"Deep learning. Nature 521 (05","author":"LeCun Yann","year":"2015","unstructured":"Yann LeCun, Y. Bengio, and Geoffrey Hinton. 2015. Deep learning. Nature 521 (05 2015), 436\u201344. DOI:https:\/\/doi.org\/10.1038\/nature14539"},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_2_1_78_1","unstructured":"Ledell Adam Fisch Sumit Chopra Keith Adams Antoine Bordes and Jason Weston. 2017. StarSpace: Embed all the things! arxiv:cs.CL\/1709.03856."},{"key":"e_1_2_1_79_1","volume-title":"Chan Ho So, and Jaewoo Kang","author":"Lee Jinhyuk","year":"2019","unstructured":"Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, and Jaewoo Kang. 2019. 
BioBERT: A pre-trained biomedical language representation model for biomedical text mining. arxiv:cs.CL\/1901.08746."},{"key":"e_1_2_1_80_1","volume-title":"Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461","author":"Lewis Mike","year":"2019","unstructured":"Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov, and Luke Zettlemoyer. 2019. Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461 (2019)."},{"key":"e_1_2_1_81_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2017.2788182"},{"key":"e_1_2_1_82_1","doi-asserted-by":"publisher","DOI":"10.1145\/1645953.1646003"},{"key":"e_1_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.5555\/2832415.2832428"},{"key":"e_1_2_1_84_1","doi-asserted-by":"publisher","DOI":"10.5220\/0005170305300537"},{"key":"e_1_2_1_85_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2017.11.005"},{"key":"e_1_2_1_86_1","volume-title":"RoBERTa: A robustly optimized BERT pretraining approach. CoRR abs\/1907.11692","author":"Liu Yinhan","year":"2019","unstructured":"Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. 
CoRR abs\/1907.11692 (2019)."},{"key":"e_1_2_1_87_1","doi-asserted-by":"publisher","DOI":"10.3115\/981658.981695"},{"key":"e_1_2_1_88_1","doi-asserted-by":"publisher","DOI":"10.5555\/559226"},{"key":"e_1_2_1_89_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-5010"},{"key":"e_1_2_1_90_1","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295377"},{"key":"e_1_2_1_91_1","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295377"},{"key":"e_1_2_1_92_1","unstructured":"Yelena Mejova and Padmini Srinivasan. 2011. Exploring feature definition and selection for sentiment classifiers."},{"key":"e_1_2_1_93_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/K16-1006"},{"key":"e_1_2_1_94_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/K16-1006"},{"key":"e_1_2_1_95_1","doi-asserted-by":"publisher","DOI":"10.1145\/1557019.1557156"},{"key":"e_1_2_1_96_1","doi-asserted-by":"publisher","DOI":"10.5555\/2999792.2999959"},{"key":"e_1_2_1_97_1","volume-title":"Second Joint Conference on Lexical and Computational Semantics (*SEM)","volume":"327","author":"Mohammad Saif","year":"2013","unstructured":"Saif Mohammad, Svetlana Kiritchenko, and Xiaodan Zhu. 2013. 
NRC-Canada: Building the state-of-the-art in sentiment analysis of tweets. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013). Association for Computational Linguistics, 321\u2013327. Retrieved from http:\/\/aclweb.org\/anthology\/S13-2053."},{"key":"e_1_2_1_98_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1963.10500855"},{"key":"e_1_2_1_99_1","volume-title":"Young","author":"Mrksic Nikola","year":"2017","unstructured":"Nikola Mrksic , Ivan Vulic , Diarmuid \u00d3 S\u00e9aghdha , Ira Leviant , Roi Reichart , Milica Gasic , Anna Korhonen , and Steve J . Young . 2017 . Semantic specialisation of distributional word vector spaces using monolingual and cross-lingual constraints. CoRR abs\/1706.00374 (2017). Nikola Mrksic, Ivan Vulic, Diarmuid \u00d3 S\u00e9aghdha, Ira Leviant, Roi Reichart, Milica Gasic, Anna Korhonen, and Steve J. Young. 2017. Semantic specialisation of distributional word vector spaces using monolingual and cross-lingual constraints. CoRR abs\/1706.00374 (2017)."},{"key":"e_1_2_1_100_1","volume-title":"AAAI Spring Symposium - Technical Report SS-06-03","author":"Mullen T.","year":"2006","unstructured":"T. Mullen and R. Malouf . 2006. A preliminary investigation into sentiment analysis of informal political discourse . AAAI Spring Symposium - Technical Report SS-06-03 ( 2006 ), 159\u2013162. Retrieved from https:\/\/www.scopus.com\/inward\/record.uri?eid=2-s2.0-33747172751&partnerID=40&md5=6b12793b70eae006102989ed6d398fcb. T. Mullen and R. Malouf. 2006. A preliminary investigation into sentiment analysis of informal political discourse. AAAI Spring Symposium - Technical Report SS-06-03 (2006), 159\u2013162. 
Retrieved from https:\/\/www.scopus.com\/inward\/record.uri?eid=2-s2.0-33747172751&partnerID=40&md5=6b12793b70eae006102989ed6d398fcb."},{"key":"e_1_2_1_101_1","volume-title":"Kummervold","author":"M\u00fcller Martin","year":"2020","unstructured":"Martin M\u00fcller , Marcel Salath\u00e9 , and Per E . Kummervold . 2020 . COVID-Twitter-BERT: A natural language processing model to analyse COVID-19 content on Twitter . arXiv preprint arXiv:2005.07503 (2020). Martin M\u00fcller, Marcel Salath\u00e9, and Per E. Kummervold. 2020. COVID-Twitter-BERT: A natural language processing model to analyse COVID-19 content on Twitter. arXiv preprint arXiv:2005.07503 (2020)."},{"key":"e_1_2_1_102_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2017.08.009"},{"key":"e_1_2_1_103_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-41278-3_24"},{"key":"e_1_2_1_105_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.17485\/ijst\/2019\/v12i45\/146538","article-title":"Abusive language detection: A comprehensive review","volume":"12","author":"Naseem U.","year":"2019","unstructured":"U. Naseem , S. K. Khan , M. Farasat , and F. Ali . 2019 . Abusive language detection: A comprehensive review . Indian J. Sci. Technol. 12 , 45 (2019), 1 \u2013 13 . U. Naseem, S. K. Khan, M. Farasat, and F. Ali. 2019. Abusive language detection: A comprehensive review. Indian J. Sci. Technol. 12, 45 (2019), 1\u201313.","journal-title":"Indian J. Sci. Technol."},{"key":"e_1_2_1_106_1","doi-asserted-by":"crossref","unstructured":"U. Naseem I. Razzak and P. W. Eklund. 2020. A survey of pre-processing techniques to improve short-text quality: a case study on hate speech detection on twitter. Multimedia Tools and Applications 1\u201328. U. Naseem I. Razzak and P. W. Eklund. 2020. A survey of pre-processing techniques to improve short-text quality: a case study on hate speech detection on twitter. 
Multimedia Tools and Applications 1\u201328.","DOI":"10.1007\/s11042-020-10082-6"},{"key":"e_1_2_1_107_1","volume-title":"Imran Razzak","author":"Naseem Usman","year":"2019","unstructured":"Usman Naseem , Shah Khalid Khan , Imran Razzak , and Ibrahim A. Hameed. 2019 . Hybrid words representation for airlines sentiment analysis. In AI 2019: Advances in Artificial Intelligence, Jixue Liu and James Bailey (Eds.). Springer International Publishing , Cham, 381\u2013392. Usman Naseem, Shah Khalid Khan, Imran Razzak, and Ibrahim A. Hameed. 2019. Hybrid words representation for airlines sentiment analysis. In AI 2019: Advances in Artificial Intelligence, Jixue Liu and James Bailey (Eds.). Springer International Publishing, Cham, 381\u2013392."},{"key":"e_1_2_1_108_1","volume-title":"Nazar Waheed, Adnan Mir, Atika Qazi, Bandar Alshammari, and Simon K. Poon.","author":"Naseem Usman","year":"2020","unstructured":"Usman Naseem , Matloob Khushi , Shah Khalid Khan , Nazar Waheed, Adnan Mir, Atika Qazi, Bandar Alshammari, and Simon K. Poon. 2020 . Diabetic retinopathy detection using multi-layer neural networks and split attention with focal loss. In International Conference on Neural Information Processing. Springer , 1\u201312. Usman Naseem, Matloob Khushi, Shah Khalid Khan, Nazar Waheed, Adnan Mir, Atika Qazi, Bandar Alshammari, and Simon K. Poon. 2020. Diabetic retinopathy detection using multi-layer neural networks and split attention with focal loss. In International Conference on Neural Information Processing. Springer, 1\u201312."},{"key":"e_1_2_1_109_1","volume-title":"BioALBERT: A simple and effective pre-trained language model for biomedical named entity recognition. arXiv preprint arXiv:2009.09223","author":"Naseem Usman","year":"2020","unstructured":"Usman Naseem , Matloob Khushi , Vinay Reddy , Sakthivel Rajendran , Imran Razzak , and Jinman Kim . 2020. BioALBERT: A simple and effective pre-trained language model for biomedical named entity recognition. 
arXiv preprint arXiv:2009.09223 ( 2020 ). Usman Naseem, Matloob Khushi, Vinay Reddy, Sakthivel Rajendran, Imran Razzak, and Jinman Kim. 2020. BioALBERT: A simple and effective pre-trained language model for biomedical named entity recognition. arXiv preprint arXiv:2009.09223 (2020)."},{"key":"e_1_2_1_110_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDAR.2019.00157"},{"key":"e_1_2_1_111_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSS.2021.3051189"},{"key":"e_1_2_1_112_1","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN48605.2020.9206808"},{"key":"e_1_2_1_113_1","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN48605.2020.9207237"},{"key":"e_1_2_1_114_1","first-page":"69","article-title":"Deep context-aware embedding for abusive and hate speech detection on Twitter","volume":"15","author":"Naseem Usman","year":"2019","unstructured":"Usman Naseem , Imran Razzak , and Ibrahim A. Hameed . 2019 . Deep context-aware embedding for abusive and hate speech detection on Twitter. Aust. J. Intell. Inf. Process. Syst. 15 , 3 (2019), 69 \u2013 76 . Usman Naseem, Imran Razzak, and Ibrahim A. Hameed. 2019. Deep context-aware embedding for abusive and hate speech detection on Twitter. Aust. J. Intell. Inf. Process. Syst. 15, 3 (2019), 69\u201376.","journal-title":"Aust. J. Intell. Inf. Process. Syst."},{"key":"e_1_2_1_115_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2020.06.050"},{"key":"e_1_2_1_116_1","volume-title":"Efficient non-parametric estimation of multiple embeddings per word in vector space. CoRR abs\/1504.06654","author":"Neelakantan Arvind","year":"2015","unstructured":"Arvind Neelakantan , Jeevan Shankar , Alexandre Passos , and Andrew McCallum . 2015. Efficient non-parametric estimation of multiple embeddings per word in vector space. CoRR abs\/1504.06654 ( 2015 ). Arvind Neelakantan, Jeevan Shankar, Alexandre Passos, and Andrew McCallum. 2015. Efficient non-parametric estimation of multiple embeddings per word in vector space. 
CoRR abs\/1504.06654 (2015)."},{"key":"e_1_2_1_117_1","volume-title":"BERTweet: A pre-trained language model for English Tweets. arXiv preprint arXiv:2005.10200","author":"Nguyen Dat Quoc","year":"2020","unstructured":"Dat Quoc Nguyen , Thanh Vu , and Anh Tuan Nguyen . 2020. BERTweet: A pre-trained language model for English Tweets. arXiv preprint arXiv:2005.10200 ( 2020 ). Dat Quoc Nguyen, Thanh Vu, and Anh Tuan Nguyen. 2020. BERTweet: A pre-trained language model for English Tweets. arXiv preprint arXiv:2005.10200 (2020)."},{"key":"e_1_2_1_118_1","volume-title":"Learning semantic relatedness from human feedback using metric learning. CoRR abs\/1705.07425","author":"Niebler Thomas","year":"2017","unstructured":"Thomas Niebler , Martin Becker , Christian P\u00f6litz , and Andreas Hotho . 2017. Learning semantic relatedness from human feedback using metric learning. CoRR abs\/1705.07425 ( 2017 ). Thomas Niebler, Martin Becker, Christian P\u00f6litz, and Andreas Hotho. 2017. Learning semantic relatedness from human feedback using metric learning. CoRR abs\/1705.07425 (2017)."},{"key":"e_1_2_1_119_1","volume-title":"International Conference on Language Resources and Evaluation.","author":"Pak Alexander","year":"2010","unstructured":"Alexander Pak and Patrick Paroubek . 2010 . Twitter as a corpus for sentiment analysis and opinion mining . In International Conference on Language Resources and Evaluation. Alexander Pak and Patrick Paroubek. 2010. Twitter as a corpus for sentiment analysis and opinion mining. 
In International Conference on Language Resources and Evaluation."},{"key":"e_1_2_1_120_1","doi-asserted-by":"publisher","DOI":"10.3115\/1118693.1118704"},{"key":"e_1_2_1_121_1","doi-asserted-by":"publisher","DOI":"10.3390\/asi4010023"},{"key":"e_1_2_1_122_1","doi-asserted-by":"publisher","DOI":"10.5555\/3042817.3043083"},{"key":"e_1_2_1_123_1","doi-asserted-by":"publisher","DOI":"10.1016\/S1002-0160(15)60047-9"},{"key":"e_1_2_1_124_1","doi-asserted-by":"publisher","DOI":"10.1016\/S1002-0160(15)60047-9"},{"key":"e_1_2_1_125_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1202"},{"key":"e_1_2_1_126_1","volume-title":"Deep contextualized word representations. CoRR abs\/1802.05365","author":"Peters Matthew E.","year":"2018","unstructured":"Matthew E. Peters , Mark Neumann , Mohit Iyyer , Matt Gardner , Christopher Clark , Kenton Lee , and Luke Zettlemoyer . 2018. Deep contextualized word representations. CoRR abs\/1802.05365 ( 2018 ). Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. CoRR abs\/1802.05365 (2018)."},{"key":"e_1_2_1_127_1","volume-title":"Mimicking word embeddings using subword RNNs. CoRR abs\/1707.06961","author":"Pinter Yuval","year":"2017","unstructured":"Yuval Pinter , Robert Guthrie , and Jacob Eisenstein . 2017. Mimicking word embeddings using subword RNNs. CoRR abs\/1707.06961 ( 2017 ). Yuval Pinter, Robert Guthrie, and Jacob Eisenstein. 2017. Mimicking word embeddings using subword RNNs. 
CoRR abs\/1707.06961 (2017)."},{"key":"e_1_2_1_128_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2015.12.091"},{"key":"e_1_2_1_129_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigComp.2018.00124"},{"key":"e_1_2_1_130_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0020-7373(87)80053-6"},{"key":"e_1_2_1_131_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1022643204877"},{"key":"e_1_2_1_132_1","unstructured":"Alec Radford Jeffrey Wu Rewon Child David Luan Dario Amodei and Ilya Sutskever. 2018. Language models are unsupervised multitask learners. Retrieved from https:\/\/d4mucfpksywv.cloudfront.net\/better-language-models\/language-models.pdf. Alec Radford Jeffrey Wu Rewon Child David Luan Dario Amodei and Ilya Sutskever. 2018. Language models are unsupervised multitask learners. Retrieved from https:\/\/d4mucfpksywv.cloudfront.net\/better-language-models\/language-models.pdf."},{"key":"e_1_2_1_133_1","volume-title":"Liu","author":"Raffel Colin","year":"2019","unstructured":"Colin Raffel , Noam Shazeer , Adam Roberts , Katherine Lee , Sharan Narang , Michael Matena , Yanqi Zhou , Wei Li , and Peter J . Liu . 2019 . Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv preprint arXiv:1910.10683 (2019). Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2019. Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv preprint arXiv:1910.10683 (2019)."},{"key":"e_1_2_1_134_1","first-page":"53","article-title":"Deep AutoEncoder-Decoder framework for semantic segmentation of brain tumor","volume":"15","author":"Rehman Arshia","year":"2019","unstructured":"Arshia Rehman , Saeeda Naz , Usman Naseem , Imran Razzak , and Ibrahim A. Hameed . 2019 . Deep AutoEncoder-Decoder framework for semantic segmentation of brain tumor . Aust. J. Intell. Inf. Process. Syst. 15 , 3 (2019), 53 \u2013 60 . 
Arshia Rehman, Saeeda Naz, Usman Naseem, Imran Razzak, and Ibrahim A. Hameed. 2019. Deep AutoEncoder-Decoder framework for semantic segmentation of brain tumor. Aust. J. Intell. Inf. Process. Syst. 15, 3 (2019), 53\u201360.","journal-title":"Aust. J. Intell. Inf. Process. Syst."},{"key":"e_1_2_1_135_1","doi-asserted-by":"publisher","DOI":"10.5555\/3015812.3015844"},{"key":"e_1_2_1_136_1","doi-asserted-by":"publisher","DOI":"10.5121\/ijnlc.2016.5402"},{"key":"e_1_2_1_137_1","doi-asserted-by":"publisher","DOI":"10.3390\/asi4010013"},{"key":"e_1_2_1_138_1","volume-title":"Improving the accuracy of pre-trained word embeddings for sentiment analysis. CoRR abs\/1711.08609","author":"Rezaeinia Seyed Mahdi","year":"2017","unstructured":"Seyed Mahdi Rezaeinia , Ali Ghodsi , and Rouhollah Rahmani . 2017. Improving the accuracy of pre-trained word embeddings for sentiment analysis. CoRR abs\/1711.08609 ( 2017 ). Seyed Mahdi Rezaeinia, Ali Ghodsi, and Rouhollah Rahmani. 2017. Improving the accuracy of pre-trained word embeddings for sentiment analysis. CoRR abs\/1711.08609 (2017)."},{"key":"e_1_2_1_139_1","volume-title":"International Workshop on Emotion and Sentiment in Social and Expressive Media: Approaches and Perspectives from AI.","author":"Saif Hassan","year":"2013","unstructured":"Hassan Saif , Marta Fernandez Andres , Yulan He , and Harith Alani . 2013 . Evaluation datasets for Twitter sentiment analysis: A survey and a new dataset, the STS-Gold . In International Workshop on Emotion and Sentiment in Social and Expressive Media: Approaches and Perspectives from AI. Hassan Saif, Marta Fernandez Andres, Yulan He, and Harith Alani. 2013. Evaluation datasets for Twitter sentiment analysis: A survey and a new dataset, the STS-Gold. 
In International Workshop on Emotion and Sentiment in Social and Expressive Media: Approaches and Perspectives from AI."},{"key":"e_1_2_1_140_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W15-4303"},{"key":"e_1_2_1_141_1","volume-title":"a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108","author":"Sanh Victor","year":"2019","unstructured":"Victor Sanh , Lysandre Debut , Julien Chaumond , and Thomas Wolf . 2019. DistilBERT , a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 ( 2019 ). Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. 2019. DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)."},{"key":"e_1_2_1_142_1","doi-asserted-by":"publisher","DOI":"10.5555\/1886436.1886447"},{"key":"e_1_2_1_143_1","unstructured":"Seungil David Ding Kevin Canini Jan Pfeifer and Maya Gupta. 2017. Deep lattice networks and partial monotonic functions. arxiv:stat.ML\/1709.06680. Seungil David Ding Kevin Canini Jan Pfeifer and Maya Gupta. 2017. Deep lattice networks and partial monotonic functions. arxiv:stat.ML\/1709.06680."},{"key":"e_1_2_1_144_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766462.2767830"},{"key":"e_1_2_1_145_1","unstructured":"Mohammad Shoeybi Mostofa Patwary Raul Puri Patrick LeGresley Jared Casper and Bryan Catanzaro. 2019. Megatron-LM: Training multi-billion parameter language models using model parallelism. arxiv:cs.CL\/1909.08053. Mohammad Shoeybi Mostofa Patwary Raul Puri Patrick LeGresley Jared Casper and Bryan Catanzaro. 2019. Megatron-LM: Training multi-billion parameter language models using model parallelism. arxiv:cs.CL\/1909.08053."},{"key":"e_1_2_1_146_1","doi-asserted-by":"crossref","unstructured":"Tajinder Singh and Madhu Kumari. 2016. Role of text pre-processing in Twitter sentiment analysis. 
https:\/\/www.sciencedirect.com\/science\/article\/pii\/S1877050916311607. Tajinder Singh and Madhu Kumari. 2016. Role of text pre-processing in Twitter sentiment analysis. https:\/\/www.sciencedirect.com\/science\/article\/pii\/S1877050916311607.","DOI":"10.1016\/j.procs.2016.06.095"},{"key":"e_1_2_1_147_1","volume-title":"Conference on Empirical Methods in Natural Language Processing. 1631\u20131642","author":"Socher R.","unstructured":"R. Socher , A. Perelygin , J. Y. Wu , J. Chuang , C. D. Manning , A. Y. Ng , and C. Potts . 2013. Recursive deep models for semantic compositionality over a sentiment treebank . In Conference on Empirical Methods in Natural Language Processing. 1631\u20131642 . R. Socher, A. Perelygin, J. Y. Wu, J. Chuang, C. D. Manning, A. Y. Ng, and C. Potts. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. In Conference on Empirical Methods in Natural Language Processing. 1631\u20131642."},{"key":"e_1_2_1_148_1","doi-asserted-by":"crossref","unstructured":"Saeid Soheily-Khah Pierre-Fran\u00e7ois Marteau and Nicolas B\u00e9chet. 2017. Intrusion detection in network systems through hybrid supervised and unsupervised mining process- a detailed case study on the ISCX benchmark dataset -. Saeid Soheily-Khah Pierre-Fran\u00e7ois Marteau and Nicolas B\u00e9chet. 2017. Intrusion detection in network systems through hybrid supervised and unsupervised mining process- a detailed case study on the ISCX benchmark dataset -.","DOI":"10.1109\/ICDIS.2018.00043"},{"key":"e_1_2_1_149_1","volume-title":"Mass: Masked sequence to sequence pre-training for language generation. arXiv preprint arXiv:1905.02450","author":"Song Kaitao","year":"2019","unstructured":"Kaitao Song , Xu Tan , Tao Qin , Jianfeng Lu , and Tie-Yan Liu . 2019 . Mass: Masked sequence to sequence pre-training for language generation. arXiv preprint arXiv:1905.02450 (2019). Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, and Tie-Yan Liu. 2019. 
Mass: Masked sequence to sequence pre-training for language generation. arXiv preprint arXiv:1905.02450 (2019)."},{"key":"e_1_2_1_150_1","doi-asserted-by":"publisher","DOI":"10.5555\/106765.106782"},{"key":"e_1_2_1_151_1","volume-title":"An open multilingual graph of general knowledge. CoRR abs\/1612.03975","author":"Speer Robyn","year":"2016","unstructured":"Robyn Speer , Joshua Chin , and Catherine Havasi . 2016. ConceptNet 5.5 : An open multilingual graph of general knowledge. CoRR abs\/1612.03975 ( 2016 ). Robyn Speer, Joshua Chin, and Catherine Havasi. 2016. ConceptNet 5.5: An open multilingual graph of general knowledge. CoRR abs\/1612.03975 (2016)."},{"key":"e_1_2_1_152_1","volume-title":"Ernie: Enhanced representation through knowledge integration. arXiv preprint arXiv:1904.09223","author":"Sun Yu","year":"2019","unstructured":"Yu Sun , Shuohuan Wang , Yukun Li , Shikun Feng , Xuyi Chen , Han Zhang , Xin Tian , Danxiang Zhu , Hao Tian , and Hua Wu . 2019 . Ernie: Enhanced representation through knowledge integration. arXiv preprint arXiv:1904.09223 (2019). Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, and Hua Wu. 2019. Ernie: Enhanced representation through knowledge integration. 
arXiv preprint arXiv:1904.09223 (2019)."},{"key":"e_1_2_1_153_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i05.6428"},{"key":"e_1_2_1_154_1","doi-asserted-by":"publisher","DOI":"10.5555\/3104482.3104610"},{"key":"e_1_2_1_155_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-37256-8_11"},{"key":"e_1_2_1_156_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2018.06.022"},{"key":"e_1_2_1_157_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2015.2449071"},{"key":"e_1_2_1_158_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2015.2489653"},{"key":"e_1_2_1_159_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2015.2489653"},{"key":"e_1_2_1_160_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-1146"},{"key":"e_1_2_1_161_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2013.08.006"},{"key":"e_1_2_1_162_1","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295349"},{"key":"e_1_2_1_163_1","volume-title":"Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Asian Federation of Natural Language Processing, 253\u2013263","author":"Wallace Byron","year":"2017","unstructured":"Byron Wallace . 2017 . A sensitivity analysis of (and practitioners\u2019 guide to) convolutional neural networks for sentence classification . In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Asian Federation of Natural Language Processing, 253\u2013263 . Byron Wallace. 2017. A sensitivity analysis of (and practitioners\u2019 guide to) convolutional neural networks for sentence classification. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Asian Federation of Natural Language Processing, 253\u2013263."},{"key":"e_1_2_1_164_1","unstructured":"Wei Wang Bin Bi Ming Yan Chen Wu Zuyi Bao Jiangnan Xia Liwei Peng and Luo Si. 2019. 
StructBERT: Incorporating language structures into pre-training for deep language understanding. arxiv:cs.CL\/1908.04577. Wei Wang Bin Bi Ming Yan Chen Wu Zuyi Bao Jiangnan Xia Liwei Peng and Luo Si. 2019. StructBERT: Incorporating language structures into pre-training for deep language understanding. arxiv:cs.CL\/1908.04577."},{"key":"e_1_2_1_165_1","doi-asserted-by":"publisher","DOI":"10.1109\/SocialCom-PASSAT.2012.119"},{"key":"e_1_2_1_166_1","doi-asserted-by":"publisher","DOI":"10.5555\/3297863.3297977"},{"key":"e_1_2_1_167_1","volume-title":"Nonparametric Bayesian estimation of periodic light curves. Astrophy. J. 756, 1 (Aug","author":"Wang Yuyang","year":"2012","unstructured":"Yuyang Wang , Roni Khardon , and Pavlos Protopapas . 2012. Nonparametric Bayesian estimation of periodic light curves. Astrophy. J. 756, 1 (Aug . 2012 ), 67. DOI:DOI:https:\/\/doi.org\/10.1088\/0004-637x\/756\/1\/67 Yuyang Wang, Roni Khardon, and Pavlos Protopapas. 2012. Nonparametric Bayesian estimation of periodic light curves. Astrophy. J. 756, 1 (Aug. 2012), 67. 
DOI:DOI:https:\/\/doi.org\/10.1088\/0004-637x\/756\/1\/67"},{"key":"e_1_2_1_168_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W15-4320"},{"key":"e_1_2_1_169_1","doi-asserted-by":"publisher","DOI":"10.5555\/3454287.3454804"},{"key":"e_1_2_1_170_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.11"}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3434237","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3434237","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:24:35Z","timestamp":1750195475000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3434237"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,6,30]]},"references-count":169,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2021,9,30]]}},"alternative-id":["10.1145\/3434237"],"URL":"https:\/\/doi.org\/10.1145\/3434237","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"value":"2375-4699","type":"print"},{"value":"2375-4702","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,6,30]]},"assertion":[{"value":"2020-05-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-06-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}