{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T19:57:08Z","timestamp":1775851028442,"version":"3.50.1"},"reference-count":56,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2013,1,1]],"date-time":"2013-01-01T00:00:00Z","timestamp":1356998400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100000923","name":"Australian Research Council","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Communication and Digital Economy"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2013,1]]},"abstract":"<jats:p>Twitter provides access to large volumes of data in real time, but is notoriously noisy, hampering its utility for NLP. In this article, we target out-of-vocabulary words in short text messages and propose a method for identifying and normalizing lexical variants. Our method uses a classifier to detect lexical variants, and generates correction candidates based on morphophonemic similarity. Both word similarity and context are then exploited to select the most probable correction candidate for the word. 
The proposed method doesn't require any annotations, and achieves state-of-the-art performance over an SMS corpus and a novel dataset based on Twitter.<\/jats:p>","DOI":"10.1145\/2414425.2414430","type":"journal-article","created":{"date-parts":[[2013,2,5]],"date-time":"2013-02-05T13:19:41Z","timestamp":1360070381000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":79,"title":["Lexical normalization for social media text"],"prefix":"10.1145","volume":"4","author":[{"given":"Bo","family":"Han","sequence":"first","affiliation":[{"name":"NICTA Victoria Research Laboratory and The University of Melbourne, Australia"}]},{"given":"Paul","family":"Cook","sequence":"additional","affiliation":[{"name":"The University of Melbourne, Australia"}]},{"given":"Timothy","family":"Baldwin","sequence":"additional","affiliation":[{"name":"NICTA Victoria Research Laboratory and The University of Melbourne, Australia"}]}],"member":"320","published-online":{"date-parts":[[2013,2]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the COLING\/ACL Main Conference Poster Sessions. 33--40","author":"Aw A.","unstructured":"Aw , A. , Zhang , M. , Xiao , J. , and Su , J . 2006. A phrase-based statistical model for SMS text normalization . In Proceedings of the COLING\/ACL Main Conference Poster Sessions. 33--40 . Aw, A., Zhang, M., Xiao, J., and Su, J. 2006. A phrase-based statistical model for SMS text normalization. In Proceedings of the COLING\/ACL Main Conference Poster Sessions. 33--40."},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the Human Language Technologies Annual Conference of the North American Chapter of the Association for Computational Linguistics (HLT'10)","author":"Baldwin T.","unstructured":"Baldwin , T. and Lui , M . 2010. Language identification: The long and the short of the matter . 
In Proceedings of the Human Language Technologies Annual Conference of the North American Chapter of the Association for Computational Linguistics (HLT'10) . 229--237. Baldwin, T. and Lui, M. 2010. Language identification: The long and the short of the matter. In Proceedings of the Human Language Technologies Annual Conference of the North American Chapter of the Association for Computational Linguistics (HLT'10). 229--237."},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 389--398","author":"Benson E.","unstructured":"Benson , E. , Haghighi , A. , and Barzilay , R . 2011. Event discovery in social media feeds . In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 389--398 . Benson, E., Haghighi, A., and Barzilay, R. 2011. Event discovery in social media feeds. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 389--398."},{"key":"e_1_2_1_4_1","unstructured":"Brants T. and Franz A. 2006. Web 1T 5-gram Version 1.  Brants T. and Franz A. 2006. Web 1T 5-gram Version 1."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.3115\/1075218.1075255"},{"key":"e_1_2_1_6_1","volume-title":"The ICWSM 2009 spinn3r dataset. In Proceedings of the 3rd Annual Conference on Weblogs and Social Media.","author":"Burton K.","unstructured":"Burton , K. , Java , A. , and Soboroff , I . 2009 . The ICWSM 2009 spinn3r dataset. In Proceedings of the 3rd Annual Conference on Weblogs and Social Media. Burton, K., Java, A., and Soboroff, I. 2009. The ICWSM 2009 spinn3r dataset. 
In Proceedings of the 3rd Annual Conference on Weblogs and Social Media."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10032-007-0054-0"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the 23rd International Conference on Computational Linguistics: Posters. 189--196","author":"Contractor D.","unstructured":"Contractor , D. , Faruquie , T. A. , and Subramaniam , L. V . 2010. Unsupervised cleansing of noisy text . In Proceedings of the 23rd International Conference on Computational Linguistics: Posters. 189--196 . Contractor, D., Faruquie, T. A., and Subramaniam, L. V. 2010. Unsupervised cleansing of noisy text. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters. 189--196."},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the Workshop on Computational Approaches to Linguistic Creativity (CALC'09)","author":"Cook P.","unstructured":"Cook , P. and Stevenson , S . 2009. An unsupervised model for text message normalization . In Proceedings of the Workshop on Computational Approaches to Linguistic Creativity (CALC'09) 71--78. Cook, P. and Stevenson, S. 2009. An unsupervised model for text message normalization. In Proceedings of the Workshop on Computational Approaches to Linguistic Creativity (CALC'09) 71--78."},{"key":"e_1_2_1_10_1","unstructured":"David Graff C. C. 2003. English Gigaword. Linguistic Data Consortium. http:\/\/www.ldc.upenn.edu\/Catalog\/CatalogEntry.jsp&quest;catalogId=LDC2003T05.  David Graff C. C. 2003. English Gigaword. Linguistic Data Consortium. http:\/\/www.ldc.upenn.edu\/Catalog\/CatalogEntry.jsp&quest;catalogId=LDC2003T05."},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC'06)","author":"de Marneffe M.","unstructured":"de Marneffe , M. , MacCartney , B. , and Manning , C. D . 2006. Generating typed dependency parses from phrase structure parses . 
In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC'06) . de Marneffe, M., MacCartney, B., and Manning, C. D. 2006. Generating typed dependency parses from phrase structure parses. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC'06)."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/1390681.1442794"},{"key":"e_1_2_1_13_1","unstructured":"Foster J. \u00c7etinoglu O. Wagner J. Roux J. L. Hogan S. Nivre J. Hogan D. and van Genabith J. 2011. &num;hardtoparse: POS tagging and parsing the twitterverse. In Analyzing Microtext: Papers from the 2011 AAAI Workshop 20--25.  Foster J. \u00c7etinoglu O. Wagner J. Roux J. L. Hogan S. Nivre J. Hogan D. and van Genabith J. 2011. &num;hardtoparse: POS tagging and parsing the twitterverse. In Analyzing Microtext: Papers from the 2011 AAAI Workshop 20--25."},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 42--47","author":"Gimpel K.","unstructured":"Gimpel , K. , Schneider , N. , O'Connor , B. , Das , D. , Mills , D. , Eisenstein , J. , Heilman , M. , Yogatama , D. , Flanigan , J. , and Smith , N. A . 2011. Part-of-Speech tagging for twitter: Annotation, features, and experiments . In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 42--47 . Gimpel, K., Schneider, N., O'Connor, B., Das, D., Mills, D., Eisenstein, J., Heilman, M., Yogatama, D., Flanigan, J., and Smith, N. A. 2011. Part-of-Speech tagging for twitter: Annotation, features, and experiments. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 42--47."},{"key":"e_1_2_1_15_1","first-page":"581","article-title":"Identifying sarcasm in twitter: A closer look. 
In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics","volume":"2","author":"Gonz\u00e1lez-Ib\u00e1\u00f1ez R.","year":"2011","unstructured":"Gonz\u00e1lez-Ib\u00e1\u00f1ez , R. , Muresan , S. , and Wacholder , N. 2011 . Identifying sarcasm in twitter: A closer look. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics : Human Language Technologies: Short Papers. Vol. 2 , 581 -- 586 . Gonz\u00e1lez-Ib\u00e1\u00f1ez, R., Muresan, S., and Wacholder, N. 2011. Identifying sarcasm in twitter: A closer look. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Short Papers. Vol. 2, 581--586.","journal-title":"Human Language Technologies: Short Papers."},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the 1st Workshop on Unsupervised Learning in NLP. 82--90","author":"Gouws S.","unstructured":"Gouws , S. , Hovy , D. , and Metzler , D . 2011. Unsupervised mining of lexical variants from noisy text . In Proceedings of the 1st Workshop on Unsupervised Learning in NLP. 82--90 . Gouws, S., Hovy, D., and Metzler, D. 2011. Unsupervised mining of lexical variants from noisy text. In Proceedings of the 1st Workshop on Unsupervised Learning in NLP. 82--90."},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 368--378","author":"Han B.","unstructured":"Han , B. and Baldwin , T . 2011. Lexical normalisation of short text messages: Makn sens a #twitter . In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 368--378 . Han, B. and Baldwin, T. 2011. Lexical normalisation of short text messages: Makn sens a #twitter. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 
368--378."},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing and Natural Language Learning (EMNLP-CoNLL'12)","author":"Han B.","unstructured":"Han , B. , Cook , P. , and Baldwin , T . 2012. Automatically constructing a normalisation dictionary for microblogs . In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Natural Language Learning (EMNLP-CoNLL'12) . To appear. Han, B., Cook, P., and Baldwin, T. 2012. Automatically constructing a normalisation dictionary for microblogs. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Natural Language Learning (EMNLP-CoNLL'12). To appear."},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the International Conference on Human Computer Interfaces International (HCII 05)","author":"How Y.","unstructured":"How , Y. and Kan , M . -Y. 2005. Optimizing predictive text entry for short message service on mobile phones . In Proceedings of the International Conference on Human Computer Interfaces International (HCII 05) . How, Y. and Kan, M.-Y. 2005. Optimizing predictive text entry for short message service on mobile phones. In Proceedings of the International Conference on Human Computer Interfaces International (HCII 05)."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.3115\/1075178.1075202"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/582415.582418"},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics. 151--160","author":"Jiang L.","unstructured":"Jiang , L. , Yu , M. , Zhou , M. , Liu , X. , and Zhao , T . 2011. Target-Dependent twitter sentiment classification . In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics. 151--160 . Jiang, L., Yu, M., Zhou, M., Liu, X., and Zhao, T. 2011. Target-Dependent twitter sentiment classification. 
In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics. 151--160."},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the International Conference on Natural Language Processing.","author":"Kaufmann J.","unstructured":"Kaufmann , J. and Kalita , J . 2010. Syntactic normalization of twitter messages . In Proceedings of the International Conference on Natural Language Processing. Kaufmann, J. and Kalita, J. 2010. Syntactic normalization of twitter messages. In Proceedings of the International Conference on Natural Language Processing."},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the Conference on Advances in Neural Information Processing Systems 15 (NIPS'02)","author":"Klein D.","unstructured":"Klein , D. and Manning , C. D . 2003. Fast exact inference with a factored model for natural language parsing . In Proceedings of the Conference on Advances in Neural Information Processing Systems 15 (NIPS'02) . 3--10. Klein, D. and Manning, C. D. 2003. Fast exact inference with a factored model for natural language parsing. In Proceedings of the Conference on Advances in Neural Information Processing Systems 15 (NIPS'02). 3--10."},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions. 177--180","author":"Koehn P.","unstructured":"Koehn , P. , Hoang , H. , Birch , A. , Callison-Burch , C. , Federico , M. , Bertoldi , N. , Cowan , B. , Shen , W. , Moran , C. , Zens , R. , Dyer , C. , Bojar , O. , Constantin , A. , and Herbst , E . 2007. Moses: Open source toolkit for statistical machine translation . In Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions. 177--180 . Koehn, P., Hoang, H., Birch, A., Callison-Burch, C., Federico, M., Bertoldi, N., Cowan, B., Shen, W., Moran, C., Zens, R., Dyer, C., Bojar, O., Constantin, A., and Herbst, E. 2007. 
Moses: Open source toolkit for statistical machine translation. In Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions. 177--180."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1214\/aoms\/1177729694"},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 18th International Conference on Machine Learning. 282--289","author":"Lafferty J. D.","unstructured":"Lafferty , J. D. , McCallum , A. , and Pereira , F. C. N. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data . In Proceedings of the 18th International Conference on Machine Learning. 282--289 . Lafferty, J. D., McCallum, A., and Pereira, F. C. N. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the 18th International Conference on Machine Learning. 282--289."},{"key":"e_1_2_1_28_1","first-page":"707","article-title":"Binary codes capable of correcting deletions, insertions, and reversals","volume":"10","author":"Levenshtein V. I.","year":"1966","unstructured":"Levenshtein , V. I. 1966 . Binary codes capable of correcting deletions, insertions, and reversals . Soviet Physics Doklady 10 , 707 -- 710 . Levenshtein, V. I. 1966. Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady 10, 707--710.","journal-title":"Soviet Physics Doklady"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.3115\/1220175.1220304"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.3115\/980691.980696"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/18.61115"},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL'12)","author":"Liu F.","unstructured":"Liu , F. , Weng , F. , and Jiang , X . 2012. A broad-coverage normalization system for social media language . 
In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL'12) . Liu, F., Weng, F., and Jiang, X. 2012. A broad-coverage normalization system for social media language. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL'12)."},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 71--76","author":"Liu F.","unstructured":"Liu , F. , Weng , F. , Wang , B. , and Liu , Y . 2011a. Insertion, deletion, or substitution&quest; Normalizing text messages without pre-categorization nor supervision . In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 71--76 . Liu, F., Weng, F., Wang, B., and Liu, Y. 2011a. Insertion, deletion, or substitution&quest; Normalizing text messages without pre-categorization nor supervision. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 71--76."},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 359--367","author":"Liu X.","unstructured":"Liu , X. , Zhang , S. , Wei , F. , and Zhou , M . 2011b. Recognizing named entities in tweets . In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 359--367 . Liu, X., Zhang, S., Wei, F., and Zhou, M. 2011b. Recognizing named entities in tweets. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 359--367."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1162\/153244302760200687"},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of 5th International Joint Conference on Natural Language Processing. 
553--561","author":"Lui M.","unstructured":"Lui , M. and Baldwin , T . 2011. Cross-Domain feature selection for language identification . In Proceedings of 5th International Joint Conference on Natural Language Processing. 553--561 . Lui, M. and Baldwin, T. 2011. Cross-Domain feature selection for language identification. In Proceedings of 5th International Joint Conference on Natural Language Processing. 553--561."},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the 4th International Conference on Weblogs and Social Media (ICWSM'10)","author":"O'Connor B.","unstructured":"O'Connor , B. , Krieger , M. , and Ahn , D . 2010. TweetMotif: Exploratory search and topic summarization for twitter . In Proceedings of the 4th International Conference on Weblogs and Social Media (ICWSM'10) . 384--385. O'Connor, B., Krieger, M., and Ahn, D. 2010. TweetMotif: Exploratory search and topic summarization for twitter. In Proceedings of the 4th International Conference on Weblogs and Social Media (ICWSM'10). 384--385."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073135"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/359038.359041"},{"key":"e_1_2_1_40_1","volume-title":"The double metaphone search algorithm. C\/C++","author":"Philips L.","unstructured":"Philips , L. 2000. The double metaphone search algorithm. C\/C++ Users J. 18, 38--43. Philips, L. 2000. The double metaphone search algorithm. C\/C++ Users J. 18, 38--43."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.18626"},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the Human Language Technologies Annual Conference of the North American Chapter of the Association for Computational Linguistics (HLT'10)","author":"Ritter A.","unstructured":"Ritter , A. , Cherry , C. , and Dolan , B . 2010. Unsupervised modeling of twitter conversations . 
In Proceedings of the Human Language Technologies Annual Conference of the North American Chapter of the Association for Computational Linguistics (HLT'10) . 172--180. Ritter, A., Cherry, C., and Dolan, B. 2010. Unsupervised modeling of twitter conversations. In Proceedings of the Human Language Technologies Annual Conference of the North American Chapter of the Association for Computational Linguistics (HLT'10). 172--180."},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. 1524--1534","author":"Ritter A.","unstructured":"Ritter , A. , Clark , S. , Mausam , and Etzioni, O . 2011. Named entity recognition in tweets: An experimental study . In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 1524--1534 . Ritter, A., Clark, S., Mausam, and Etzioni, O. 2011. Named entity recognition in tweets: An experimental study. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 1524--1534."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772777"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1002\/j.1538-7305.1948.tb01338.x"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1006\/csla.2001.0169"},{"key":"e_1_2_1_47_1","volume-title":"Proceedings of the International Conference on Spoken Language Processing. 901--904","author":"Stolcke A.","year":"2002","unstructured":"Stolcke , A. 2002 . Srilm - an extensible language modeling toolkit . In Proceedings of the International Conference on Spoken Language Processing. 901--904 . Stolcke, A. 2002. Srilm - an extensible language modeling toolkit. In Proceedings of the International Conference on Spoken Language Processing. 901--904."},{"key":"e_1_2_1_48_1","volume-title":"Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. 81--88","author":"Sun G.","unstructured":"Sun , G. , Cong , G. , Liu , X. 
, Lin , C.-Y. , and Zhou , M . 2007. Mining sequential patterns and tree patterns to detect erroneous sentences . In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. 81--88 . Sun, G., Cong, G., Liu, X., Lin, C.-Y., and Zhou, M. 2007. Mining sequential patterns and tree patterns to detect erroneous sentences. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. 81--88."},{"key":"e_1_2_1_49_1","first-page":"1","article-title":"Generation txt&quest; The sociolinguistics of young people's text-messaging","volume":"1","author":"Thurlow C.","year":"2003","unstructured":"Thurlow , C. 2003 . Generation txt&quest; The sociolinguistics of young people's text-messaging . Discourse Anal. Online 1 , 1 . Thurlow, C. 2003. Generation txt&quest; The sociolinguistics of young people's text-messaging. Discourse Anal. Online 1, 1.","journal-title":"Discourse Anal. Online"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073445.1073478"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073109"},{"key":"e_1_2_1_52_1","unstructured":"Twitter. 2011. 200 million tweets per day. http:\/\/techcrunch.com\/2012\/12\/18\/twitter-passes-200m-monthly-active-users-a-42-increase-over-9-months\/  Twitter. 2011. 200 million tweets per day. http:\/\/techcrunch.com\/2012\/12\/18\/twitter-passes-200m-monthly-active-users-a-42-increase-over-9-months\/"},{"key":"e_1_2_1_53_1","volume-title":"Proceedings of the 5th International AAAI Conference on Weblogs and Social Media.","author":"Weng J.","unstructured":"Weng , J. and Lee , B . -S. 2011. Event detection in Twitter . In Proceedings of the 5th International AAAI Conference on Weblogs and Social Media. Weng, J. and Lee, B.-S. 2011. Event detection in Twitter. 
In Proceedings of the 5th International AAAI Conference on Weblogs and Social Media."},{"key":"e_1_2_1_54_1","volume-title":"Proceedings of the 5th Australasian Conference on Data Mining and Analytics. 83--89","author":"Wong W.","unstructured":"Wong , W. , Liu , W. , and Bennamoun , M . 2006. Integrated scoring for spelling error correction, abbreviation expansion and case restoration in dirty text . In Proceedings of the 5th Australasian Conference on Data Mining and Analytics. 83--89 . Wong, W., Liu, W., and Bennamoun, M. 2006. Integrated scoring for spelling error correction, abbreviation expansion and case restoration in dirty text. In Proceedings of the 5th Australasian Conference on Data Mining and Analytics. 83--89."},{"key":"e_1_2_1_55_1","volume-title":"Proceedings of the AAAI-11 Workshop on Analyzing Microtext. 74--79","author":"Xue Z.","unstructured":"Xue , Z. , Yin , D. , and Davison , B. D . 2011. Normalizing microtext . In Proceedings of the AAAI-11 Workshop on Analyzing Microtext. 74--79 . Xue, Z., Yin, D., and Davison, B. D. 2011. Normalizing microtext. In Proceedings of the AAAI-11 Workshop on Analyzing Microtext. 
74--79."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.3115\/992730.992783"}],"container-title":["ACM Transactions on Intelligent Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2414425.2414430","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2414425.2414430","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T09:21:10Z","timestamp":1750238470000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2414425.2414430"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,1]]},"references-count":56,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2013,1]]}},"alternative-id":["10.1145\/2414425.2414430"],"URL":"https:\/\/doi.org\/10.1145\/2414425.2414430","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"value":"2157-6904","type":"print"},{"value":"2157-6912","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,1]]},"assertion":[{"value":"2011-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-08-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-02-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}