{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,20]],"date-time":"2025-06-20T05:24:31Z","timestamp":1750397071237,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":57,"publisher":"ACM","license":[{"start":{"date-parts":[[2023,8,4]],"date-time":"2023-08-04T00:00:00Z","timestamp":1691107200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,8,6]]},"DOI":"10.1145\/3580305.3599907","type":"proceedings-article","created":{"date-parts":[[2023,8,4]],"date-time":"2023-08-04T18:13:58Z","timestamp":1691172838000},"page":"3737-3749","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["SMILE: Evaluation and Domain Adaptation for Social Media Language Understanding"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0005-0235-304X","authenticated-orcid":false,"given":"Vasilisa","family":"Bashlovkina","sequence":"first","affiliation":[{"name":"Google Research, New York, NY, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-7634-7076","authenticated-orcid":false,"given":"Riley","family":"Matthews","sequence":"additional","affiliation":[{"name":"Google Research, New York, NY, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9295-4669","authenticated-orcid":false,"given":"Zhaobin","family":"Kuang","sequence":"additional","affiliation":[{"name":"Google Research, New York, NY, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-8746-5787","authenticated-orcid":false,"given":"Simon","family":"Baumgartner","sequence":"additional","affiliation":[{"name":"Google Research, New York, NY, 
USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2941-6240","authenticated-orcid":false,"given":"Michael","family":"Bendersky","sequence":"additional","affiliation":[{"name":"Google Research, Mountain View, CA, USA"}]}],"member":"320","published-online":{"date-parts":[[2023,8,4]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-67670-4_26"},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447548.3467162"},{"key":"e_1_3_2_2_3_1","first-page":"1","article-title":"Social media use in 2021","volume":"1","author":"Auxier Brooke","year":"2021","unstructured":"Brooke Auxier and Monica Anderson. 2021. Social media use in 2021. Pew Research Center 1 (2021), 1--4.","journal-title":"Pew Research Center"},{"key":"e_1_3_2_2_4_1","volume-title":"Proceedings of the Thirteenth Language Resources and Evaluation Conference. 258--266","author":"Barbieri Francesco","year":"2022","unstructured":"Francesco Barbieri, Luis Espinosa Anke, and Jose Camacho-Collados. 2022. XLM-T: Multilingual language models in Twitter for sentiment analysis and beyond. In Proceedings of the Thirteenth Language Resources and Evaluation Conference. 258--266."},{"key":"e_1_3_2_2_5_1","volume-title":"Tweeteval: Unified benchmark and comparative evaluation for tweet classification. arXiv preprint arXiv:2010.12421","author":"Barbieri Francesco","year":"2020","unstructured":"Francesco Barbieri, Jose Camacho-Collados, Leonardo Neves, and Luis Espinosa-Anke. 2020. Tweeteval: Unified benchmark and comparative evaluation for tweet classification. arXiv preprint arXiv:2010.12421 (2020)."},{"key":"e_1_3_2_2_6_1","volume-title":"SciBERT: A pretrained language model for scientific text. arXiv preprint arXiv:1903.10676","author":"Beltagy Iz","year":"2019","unstructured":"Iz Beltagy, Kyle Lo, and Arman Cohan. 2019. SciBERT: A pretrained language model for scientific text. 
arXiv preprint arXiv:1903.10676 (2019)."},{"key":"e_1_3_2_2_7_1","volume-title":"Findings of the 2014 workshop on statistical machine translation. In Proceedings of the ninth workshop on statistical machine translation. 12--58","author":"Bojar Ond\u0159ej","year":"2014","unstructured":"Ond\u0159ej Bojar, Christian Buck, Christian Federmann, Barry Haddow, Philipp Koehn, Johannes Leveling, Christof Monz, Pavel Pecina, Matt Post, Herve Saint-Amand, et al. 2014. Findings of the 2014 workshop on statistical machine translation. In Proceedings of the ninth workshop on statistical machine translation. 12--58."},{"key":"e_1_3_2_2_8_1","unstructured":"Rishi Bommasani Drew A Hudson Ehsan Adeli Russ Altman Simran Arora Sydney von Arx Michael S Bernstein Jeannette Bohg Antoine Bosselut Emma Brunskill et al. 2021. On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258 (2021)."},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3308560.3317593"},{"key":"e_1_3_2_2_10_1","unstructured":"Tom Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared D Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell et al. 2020. Language models are few-shot learners. Advances in neural information processing systems 33 (2020) 1877--1901."},{"key":"e_1_3_2_2_11_1","unstructured":"Krzysztof Choromanski Valerii Likhosherstov David Dohan Xingyou Song Andreea Gane Tamas Sarlos Peter Hawkins Jared Davis Afroz Mohiuddin Lukasz Kaiser et al. 2020. Rethinking attention with performers. arXiv preprint arXiv:2009.14794 (2020)."},{"key":"e_1_3_2_2_12_1","volume-title":"Charles Sutton, Sebastian Gehrmann, et al.","author":"Chowdhery Aakanksha","year":"2022","unstructured":"Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, et al. 2022. Palm: Scaling language modeling with pathways. 
arXiv preprint arXiv:2204.02311 (2022)."},{"key":"e_1_3_2_2_13_1","first-page":"16344","article-title":"Flashattention: Fast and memory-efficient exact attention with io-awareness","volume":"35","author":"Dao Tri","year":"2022","unstructured":"Tri Dao, Dan Fu, Stefano Ermon, Atri Rudra, and Christopher R\u00e9. 2022. Flashattention: Fast and memory-efficient exact attention with io-awareness. Advances in Neural Information Processing Systems 35 (2022), 16344--16359.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_2_14_1","volume-title":"Bernice: A Multilingual Pre-trained Encoder for Twitter.","author":"DeLucia Alexandra","year":"2022","unstructured":"Alexandra DeLucia, Shijie Wu, Aaron Mueller, Carlos Aguirre, Mark Dredze, and Philip Resnik. 2022. Bernice: A Multilingual Pre-trained Encoder for Twitter. (2022)."},{"key":"e_1_3_2_2_15_1","volume-title":"GoEmotions: A dataset of fine-grained emotions. arXiv preprint arXiv:2005.00547","author":"Demszky Dorottya","year":"2020","unstructured":"Dorottya Demszky, Dana Movshovitz-Attias, Jeongwoo Ko, Alan Cowen, Gaurav Nemade, and Sujith Ravi. 2020. GoEmotions: A dataset of fine-grained emotions. arXiv preprint arXiv:2005.00547 (2020)."},{"key":"e_1_3_2_2_16_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)."},{"volume-title":"Natural language processing for corpus linguistics","author":"Dunn Jonathan","key":"e_1_3_2_2_17_1","unstructured":"Jonathan Dunn. 2022. Natural language processing for corpus linguistics. 
Cambridge University Press."},{"key":"e_1_3_2_2_18_1","volume-title":"Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)","author":"Fothergill Richard","year":"2016","unstructured":"Richard Fothergill, Paul Cook, and Timothy Baldwin. 2016. Evaluating a topic modelling approach to measuring corpus similarity. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16). 273--279."},{"key":"e_1_3_2_2_19_1","unstructured":"Sinong Geng Mladen Kolar and Oluwasanmi Koyejo. 2020. Joint nonparametric precision matrix estimation with confounding. In Uncertainty in Artificial Intelligence. PMLR 378--388."},{"key":"e_1_3_2_2_20_1","volume-title":"conference. Conference on Uncertainty in Artificial Intelligence","volume":"2018","author":"Geng Sinong","year":"2018","unstructured":"Sinong Geng, Zhaobin Kuang, Jie Liu, Stephen Wright, and David Page. 2018. Stochastic learning for sparse discrete Markov random fields with controlled gradient approximation error. In Uncertainty in artificial intelligence: proceedings of the... conference. Conference on Uncertainty in Artificial Intelligence, Vol. 2018. NIH Public Access, 156."},{"key":"e_1_3_2_2_21_1","volume-title":"An efficient pseudolikelihood method for sparse binary pairwise Markov network estimation. arXiv preprint arXiv:1702.08320","author":"Geng Sinong","year":"2017","unstructured":"Sinong Geng, Zhaobin Kuang, and David Page. 2017. An efficient pseudolikelihood method for sparse binary pairwise Markov network estimation. arXiv preprint arXiv:1702.08320 (2017)."},{"key":"e_1_3_2_2_22_1","volume-title":"International Conference on Machine Learning. PMLR, 1714--1723","author":"Geng Sinong","year":"2018","unstructured":"Sinong Geng, Zhaobin Kuang, Peggy Peissig, and David Page. 2018. Temporal poisson square root graphical models. In International Conference on Machine Learning. 
PMLR, 1714--1723."},{"key":"e_1_3_2_2_23_1","volume-title":"International Conference on Machine Learning. PMLR, 2180--2190","author":"Geng Sinong","year":"2019","unstructured":"Sinong Geng, Minhao Yan, Mladen Kolar, and Sanmi Koyejo. 2019. Partially linear additive Gaussian graphical models. In International Conference on Machine Learning. PMLR, 2180--2190."},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3458754"},{"key":"e_1_3_2_2_25_1","unstructured":"Andrew Jaegle Sebastian Borgeaud Jean-Baptiste Alayrac Carl Doersch Catalin Ionescu David Ding Skanda Koppula Daniel Zoran Andrew Brock Evan Shelhamer et al. 2021. Perceiver io: A general architecture for structured inputs & outputs. arXiv preprint arXiv:2107.14795 (2021)."},{"key":"e_1_3_2_2_26_1","volume-title":"Mentalbert: Publicly available pretrained language models for mental healthcare. arXiv preprint arXiv:2110.15621","author":"Ji Shaoxiong","year":"2021","unstructured":"Shaoxiong Ji, Tianlin Zhang, Luna Ansari, Jie Fu, Prayag Tiwari, and Erik Cambria. 2021. Mentalbert: Publicly available pretrained language models for mental healthcare. arXiv preprint arXiv:2110.15621 (2021)."},{"key":"e_1_3_2_2_27_1","volume-title":"Scaling laws for neural language models. arXiv preprint arXiv:2001.08361","author":"Kaplan Jared","year":"2020","unstructured":"Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. 2020. Scaling laws for neural language models. arXiv preprint arXiv:2001.08361 (2020)."},{"key":"e_1_3_2_2_28_1","volume-title":"Fifth Workshop on Very Large Corpora.","author":"Kilgarriff Adam","year":"1997","unstructured":"Adam Kilgarriff. 1997. Using word frequency lists to measure corpus homogeneity and similarity between corpora. 
In Fifth Workshop on Very Large Corpora."},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1075\/ijcl.6.1.05kil"},{"key":"e_1_3_2_2_30_1","volume-title":"Abstractive summarization of reddit posts with multi-level memory networks. arXiv preprint arXiv:1811.00783","author":"Kim Byeongchang","year":"2018","unstructured":"Byeongchang Kim, Hyunwoo Kim, and Gunhee Kim. 2018. Abstractive summarization of reddit posts with multi-level memory networks. arXiv preprint arXiv:1811.00783 (2018)."},{"key":"e_1_3_2_2_31_1","volume-title":"A screening rule for l1-regularized ising model estimation. Advances in neural information processing systems 30","author":"Kuang Zhaobin","year":"2017","unstructured":"Zhaobin Kuang, Sinong Geng, and David Page. 2017. A screening rule for l1-regularized ising model estimation. Advances in neural information processing systems 30 (2017)."},{"key":"e_1_3_2_2_32_1","volume-title":"Sentencepiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. arXiv preprint arXiv:1808.06226","author":"Kudo Taku","year":"2018","unstructured":"Taku Kudo and John Richardson. 2018. Sentencepiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. arXiv preprint arXiv:1808.06226 (2018)."},{"key":"e_1_3_2_2_33_1","unstructured":"Percy Liang Rishi Bommasani Tony Lee Dimitris Tsipras Dilara Soylu Michihiro Yasunaga Yian Zhang Deepak Narayanan Yuhuai Wu Ananya Kumar et al. 2022. Holistic evaluation of language models. arXiv preprint arXiv:2211.09110 (2022)."},{"key":"e_1_3_2_2_34_1","volume-title":"Rouge: A package for automatic evaluation of summaries. In Text summarization branches out. 74--81.","author":"Lin Chin-Yew","year":"2004","unstructured":"Chin-Yew Lin. 2004. Rouge: A package for automatic evaluation of summaries. In Text summarization branches out. 74--81."},{"key":"e_1_3_2_2_35_1","volume-title":"Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692","author":"Liu Yinhan","year":"2019","unstructured":"Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)."},{"key":"e_1_3_2_2_36_1","unstructured":"Jinghui Lu Maeve Henchion and Brian Mac Namee. 2021. Diverging divergences: Examining variants of Jensen Shannon divergence for corpus comparison tasks. (2021)."},{"key":"e_1_3_2_2_37_1","volume-title":"BioGPT: generative pre-trained transformer for biomedical text generation and mining. Briefings in Bioinformatics 23, 6","author":"Luo Renqian","year":"2022","unstructured":"Renqian Luo, Liai Sun, Yingce Xia, Tao Qin, Sheng Zhang, Hoifung Poon, and Tie-Yan Liu. 2022. BioGPT: generative pre-trained transformer for biomedical text generation and mining. Briefings in Bioinformatics 23, 6 (2022)."},{"key":"e_1_3_2_2_38_1","volume-title":"BERTweet: A pretrained language model for English Tweets. arXiv preprint arXiv:2005.10200","author":"Nguyen Dat Quoc","year":"2020","unstructured":"Dat Quoc Nguyen, Thanh Vu, and Anh Tuan Nguyen. 2020. BERTweet: A pretrained language model for English Tweets. arXiv preprint arXiv:2005.10200 (2020)."},{"key":"e_1_3_2_2_39_1","volume-title":"Ning Xu, Pradeep Ravikumar, and Barnab\u00e1s P\u00f3czos.","author":"Paria Biswajit","year":"2020","unstructured":"Biswajit Paria, Chih-Kuan Yeh, Ian EH Yen, Ning Xu, Pradeep Ravikumar, and Barnab\u00e1s P\u00f3czos. 2020. Minimizing flops to learn efficient sparse representations. arXiv preprint arXiv:2004.05665 (2020)."},{"key":"e_1_3_2_2_40_1","volume-title":"Hyena hierarchy: Towards larger convolutional language models. 
arXiv preprint arXiv:2302.10866","author":"Poli Michael","year":"2023","unstructured":"Michael Poli, Stefano Massaroli, Eric Nguyen, Daniel Y Fu, Tri Dao, Stephen Baccus, Yoshua Bengio, Stefano Ermon, and Christopher R\u00e9. 2023. Hyena hierarchy: Towards larger convolutional language models. arXiv preprint arXiv:2302.10866 (2023)."},{"key":"e_1_3_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.5555\/3455716.3455856"},{"key":"e_1_3_2_2_42_1","volume-title":"100,000 questions for machine comprehension of text. arXiv preprint arXiv:1606.05250","author":"Rajpurkar Pranav","year":"2016","unstructured":"Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. 2016. Squad: 100,000 questions for machine comprehension of text. arXiv preprint arXiv:1606.05250 (2016)."},{"key":"e_1_3_2_2_43_1","doi-asserted-by":"publisher","unstructured":"Adam Roberts Hyung Won Chung Anselm Levskaya Gaurav Mishra James Bradbury Daniel Andor Sharan Narang Brian Lester Colin Gaffney Afroz Mohiuddin Curtis Hawthorne Aitor Lewkowycz Alex Salcianu Marc van Zee Jacob Austin Sebastian Goodman Livio Baldini Soares Haitang Hu Sasha Tsvyashchenko Aakanksha Chowdhery Jasmijn Bastings Jannis Bulian Xavier Garcia Jianmo Ni Andrew Chen Kathleen Kenealy Jonathan H. Clark Stephan Lee Dan Garrette James Lee-Thorp Colin Raffel Noam Shazeer Marvin Ritter Maarten Bosma Alexandre Passos Jeremy Maitin-Shepard Noah Fiedel Mark Omernick Brennan Saeta Ryan Sepassi Alexander Spiridonov Joshua Newlan and Andrea Gesmundo. 2022. Scaling Up Models and Data with t5x and seqio. 
https:\/\/doi.org\/10.48550\/ARXIV.2203.17189","DOI":"10.48550\/ARXIV.2203.17189"},{"key":"e_1_3_2_2_44_1","volume-title":"Abubakar Abid, Adam Fisch, Adam R Brown, Adam Santoro, Aditya Gupta, Adri\u00e0 Garriga-Alonso, et al.","author":"Srivastava Aarohi","year":"2022","unstructured":"Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R Brown, Adam Santoro, Aditya Gupta, Adri\u00e0 Garriga-Alonso, et al. 2022. Beyond the imitation game: Quantifying and extrapolating the capabilities of language models. arXiv preprint arXiv:2206.04615 (2022)."},{"key":"e_1_3_2_2_45_1","volume-title":"2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT). IEEE, 1--5.","author":"Islam Talukder Md Ashraful","year":"2019","unstructured":"Md Ashraful Islam Talukder, Sheikh Abujar, Abu Kaisar Mohammad Masum, Fahad Faisal, and Syed Akhter Hossain. 2019. Bengali abstractive text summarization using sequence to sequence RNNs. In 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT). IEEE, 1--5."},{"key":"e_1_3_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3530811"},{"key":"e_1_3_2_2_47_1","volume-title":"Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, and Donald Metzler.","author":"Tay Yi","year":"2021","unstructured":"Yi Tay, Vinh Q Tran, Sebastian Ruder, Jai Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, and Donald Metzler. 2021. Charformer: Fast character transformers via gradient-based subword tokenization. arXiv preprint arXiv:2106.12672 (2021)."},{"key":"e_1_3_2_2_48_1","volume-title":"Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, et al.","author":"Thoppilan Romal","year":"2022","unstructured":"Romal Thoppilan, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, et al. 
2022. Lamda: Language models for dialog applications. arXiv preprint arXiv:2201.08239 (2022)."},{"key":"e_1_3_2_2_49_1","volume-title":"Attention is all you need. Advances in neural information processing systems 30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in neural information processing systems 30 (2017)."},{"key":"e_1_3_2_2_50_1","volume-title":"Superglue: A stickier benchmark for general-purpose language understanding systems. Advances in neural information processing systems 32","author":"Wang Alex","year":"2019","unstructured":"Alex Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel Bowman. 2019. Superglue: A stickier benchmark for general-purpose language understanding systems. Advances in neural information processing systems 32 (2019)."},{"key":"e_1_3_2_2_51_1","volume-title":"GLUE: A multi-task benchmark and analysis platform for natural language understanding. arXiv preprint arXiv:1804.07461","author":"Wang Alex","year":"2018","unstructured":"Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R Bowman. 2018. GLUE: A multi-task benchmark and analysis platform for natural language understanding. arXiv preprint arXiv:1804.07461 (2018)."},{"key":"e_1_3_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00461"},{"key":"e_1_3_2_2_53_1","volume-title":"mT5: A massively multilingual pre-trained text-to-text transformer. arXiv preprint arXiv:2010.11934","author":"Xue Linting","year":"2020","unstructured":"Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, and Colin Raffel. 2020. mT5: A massively multilingual pre-trained text-to-text transformer. 
arXiv preprint arXiv:2010.11934 (2020)."},{"key":"e_1_3_2_2_54_1","volume-title":"Kaleb E Smith, Christopher Parisien, Colin Compas, Cheryl Martin, Anthony B Costa, Mona G Flores, et al.","author":"Yang Xi","year":"2022","unstructured":"Xi Yang, Aokun Chen, Nima PourNejatian, Hoo Chang Shin, Kaleb E Smith, Christopher Parisien, Colin Compas, Cheryl Martin, Anthony B Costa, Mona G Flores, et al. 2022. A large language model for electronic health records. npj Digital Medicine 5, 1 (2022), 194."},{"key":"e_1_3_2_2_55_1","article-title":"Statistical properties of the population stability index","volume":"14","author":"Yurdakul Bilal","year":"2020","unstructured":"Bilal Yurdakul and Joshua Naranjo. 2020. Statistical properties of the population stability index. Journal of Risk Model Validation 14, 4 (2020).","journal-title":"Journal of Risk Model Validation"},{"key":"e_1_3_2_2_56_1","volume-title":"TwHIN-BERT: A Socially-Enriched Pretrained Language Model for Multilingual Tweet Representations. arXiv preprint arXiv:2209.07562","author":"Zhang Xinyang","year":"2022","unstructured":"Xinyang Zhang, Yury Malkov, Omar Florez, Serim Park, Brian McWilliams, Jiawei Han, and Ahmed El-Kishky. 2022. TwHIN-BERT: A Socially-Enriched Pretrained Language Model for Multilingual Tweet Representations. arXiv preprint arXiv:2209.07562 (2022)."},{"key":"e_1_3_2_2_57_1","volume-title":"Character-level convolutional networks for text classification. Advances in neural information processing systems 28","author":"Zhang Xiang","year":"2015","unstructured":"Xiang Zhang, Junbo Zhao, and Yann LeCun. 2015. Character-level convolutional networks for text classification. 
Advances in neural information processing systems 28 (2015)."}],"event":{"name":"KDD '23: The 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGKDD ACM Special Interest Group on Knowledge Discovery in Data"],"location":"Long Beach CA USA","acronym":"KDD '23"},"container-title":["Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3580305.3599907","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3580305.3599907","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:37:41Z","timestamp":1750178261000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3580305.3599907"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,4]]},"references-count":57,"alternative-id":["10.1145\/3580305.3599907","10.1145\/3580305"],"URL":"https:\/\/doi.org\/10.1145\/3580305.3599907","relation":{},"subject":[],"published":{"date-parts":[[2023,8,4]]},"assertion":[{"value":"2023-08-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}