{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,28]],"date-time":"2026-01-28T21:17:17Z","timestamp":1769635037232,"version":"3.49.0"},"publisher-location":"New York, NY, USA","reference-count":23,"publisher":"ACM","license":[{"start":{"date-parts":[[2014,7,11]],"date-time":"2014-07-11T00:00:00Z","timestamp":1405036800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2014,7,11]]},"DOI":"10.1145\/2632188.2632207","type":"proceedings-article","created":{"date-parts":[[2014,9,17]],"date-time":"2014-09-17T14:22:41Z","timestamp":1410963761000},"page":"35-40","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":26,"title":["Automatic identification of arabic dialects in social media"],"prefix":"10.1145","author":[{"given":"Fatiha","family":"Sadat","sequence":"first","affiliation":[{"name":"UQAM, Montreal, PQ, Canada"}]},{"given":"Farnazeh","family":"Kazemi","sequence":"additional","affiliation":[{"name":"NLP Technologies Inc., Montreal, PQ, Canada"}]},{"given":"Atefeh","family":"Farzindar","sequence":"additional","affiliation":[{"name":"NLP Technologies Inc., Montreal, PQ, Canada"}]}],"member":"320","published-online":{"date-parts":[[2014,7,11]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC'12)","author":"Al-Sabbagh R.","year":"2012","unstructured":"R. Al-Sabbagh and R. Girju . Yadac, Yet another dialectal arabic corpus. In N. C. C. Chair, K. Choukri, T. Declerck, M. U. Doan, B. Maegaard, J. Mariani, J. Odijk, and S. Piperidis, editors , Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC'12) , Istanbul, Turkey , May 2012 . R. Al-Sabbagh and R. Girju. Yadac, Yet another dialectal arabic corpus. In N. C. C. Chair, K. Choukri, T. Declerck, M. U. Doan, B. Maegaard, J. Mariani, J. Odijk, and S. Piperidis, editors, Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC'12), Istanbul, Turkey, May 2012."},{"key":"e_1_3_2_1_2_1","volume-title":"Communications, Signal Processing, and their Applications (ICCSPA)","author":"Almeman K.","year":"2013","unstructured":"K. Almeman and M. Lee . Automatic building of arabic multi dialect text corpora by bootstrapping dialect words . In Communications, Signal Processing, and their Applications (ICCSPA) , 2013 . K. Almeman and M. Lee. Automatic building of arabic multi dialect text corpora by bootstrapping dialect words. In Communications, Signal Processing, and their Applications (ICCSPA), 2013."},{"key":"e_1_3_2_1_3_1","first-page":"229","volume-title":"Human Language Technologies: The 2010 Annual Conference of theNorth American Chapter of the Association for Computational Linguistics, HLT'10","author":"Baldwin T.","year":"2010","unstructured":"T. Baldwin and M. Lui . Language identification: The long and the short of the matter . In Human Language Technologies: The 2010 Annual Conference of theNorth American Chapter of the Association for Computational Linguistics, HLT'10 , pages 229 -- 237 , Stroudsburg, PA, USA , 2010 . Association for Computational Linguistics. T. Baldwin and M. Lui. Language identification: The long and the short of the matter. In Human Language Technologies: The 2010 Annual Conference of theNorth American Chapter of the Association for Computational Linguistics, HLT'10, pages 229--237, Stroudsburg, PA, USA, 2010. Association for Computational Linguistics."},{"key":"e_1_3_2_1_4_1","first-page":"65","volume-title":"Proceedings of the 2012 Workshop on Language in Social Media (LSM 2012), ACL 2012","author":"Bergsma S.","year":"2012","unstructured":"S. Bergsma , P. McNamee , M. Bagdouri , Clayton Fin , T. Wilson . Language Identification for Creating Language-Specific Twitter Collections . In Proceedings of the 2012 Workshop on Language in Social Media (LSM 2012), ACL 2012 , pages 65 -- 74 , 2012 . S. Bergsma, P. McNamee, M. Bagdouri, Clayton Fin, T. Wilson. Language Identification for Creating Language-Specific Twitter Collections. In Proceedings of the 2012 Workshop on Language in Social Media (LSM 2012), ACL 2012, pages 65--74, 2012."},{"issue":"2","key":"e_1_3_2_1_5_1","first-page":"161","volume":"48113","author":"Cavnar W. B.","year":"1994","unstructured":"W. B. Cavnar , J. M. Trenkle , and al. N-gram-based text categorization. Ann Arbor MI , 48113 ( 2 ): 161 -- 175 , 1994 . W. B. Cavnar, J. M. Trenkle, and al. N-gram-based text categorization. Ann Arbor MI, 48113(2):161--175, 1994.","journal-title":"Ann Arbor MI"},{"key":"e_1_3_2_1_6_1","volume-title":"Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013","author":"Elfardy H.","year":"2013","unstructured":"H. Elfardy and M. Diab , Sentence-Level Dialect Identification in Arabic , In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013 , Sofia, Bulgaria. 2013 . H. Elfardy and M. Diab, Sentence-Level Dialect Identification in Arabic, In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013, Sofia, Bulgaria. 2013."},{"key":"e_1_3_2_1_7_1","volume-title":"Citeseer","author":"Dunning T.","year":"1994","unstructured":"T. Dunning . Statistical identification of languages . Citeseer , 1994 . T. Dunning. Statistical identification of languages. Citeseer, 1994."},{"key":"e_1_3_2_1_8_1","volume-title":"Proceedings of the Workshop on Language Analysis in Social Media","author":"Gotti F.","year":"2013","unstructured":"F. Gotti , P. Langlais , and A. Farzindar . Translating government agencies' tweet feeds: Specificities, problems and (a few) solutions . In Proceedings of the Workshop on Language Analysis in Social Media , Atlanta, Georgia , June 2013 . Association for Computational Linguistics, Association for Computational Linguistics. F. Gotti, P. Langlais, and A. Farzindar. Translating government agencies' tweet feeds: Specificities, problems and (a few) solutions. In Proceedings of the Workshop on Language Analysis in Social Media, Atlanta, Georgia, June 2013. Association for Computational Linguistics, Association for Computational Linguistics."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.5555\/645326.649721"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.3115\/993268.993282"},{"key":"e_1_3_2_1_11_1","first-page":"765","volume-title":"Proceedings.(ICASSP'04)","volume":"1","author":"Kirchhoff K.","unstructured":"K. Kirchhoff and D. Vergyri . Cross-dialectal acoustic data sharing for arabic speech recognition. In Acoustics, Speech, and Signal Processing, 2004 . Proceedings.(ICASSP'04) . IEEE International Conference on , volume 1 , pages I- 765 . IEEE, 2004. K. Kirchhoff and D. Vergyri. Cross-dialectal acoustic data sharing for arabic speech recognition. In Acoustics, Speech, and Signal Processing, 2004. Proceedings.(ICASSP'04). IEEE International Conference on, volume 1, pages I-765. IEEE, 2004."},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/1757788.1757820"},{"key":"e_1_3_2_1_14_1","volume-title":"Arabic dialect identi_cation","author":"Zaidan O. F.","year":"2012","unstructured":"O. F. Zaidan and C. Callison-Burch . Arabic dialect identi_cation . volume 1 , Microsoft Research , 2012 . O. F. Zaidan and C. Callison-Burch. Arabic dialect identi_cation. volume 1, Microsoft Research, 2012."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.3115\/1220175.1220261"},{"key":"e_1_3_2_1_16_1","volume-title":"Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC)","author":"Elfardy H.","unstructured":"H. Elfardy and M. Diab . 2012a. Simplified guidelines for the creation of large scale dialectal arabic annotations . In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC) , Istanbul, Turkey. H. Elfardy and M. Diab. 2012a. Simplified guidelines for the creation of large scale dialectal arabic annotations. In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC), Istanbul, Turkey."},{"key":"e_1_3_2_1_17_1","volume-title":"Proceedings of the 24th International Conference on Computational Linguistics (COLING)","author":"Elfardy H.","unstructured":"H. Elfardy and M. Diab . 2012b. Token level identification of linguistic code switching . In Proceedings of the 24th International Conference on Computational Linguistics (COLING) , Mumbai, India. H. Elfardy and M. Diab. 2012b. Token level identification of linguistic code switching. In Proceedings of the 24th International Conference on Computational Linguistics (COLING),Mumbai, India."},{"key":"e_1_3_2_1_18_1","first-page":"456","volume-title":"Sentence Level Dialect Identification in Arabic. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics","author":"Elfardy H.","year":"2013","unstructured":"H. Elfardy , M. Al-Badrashiny , M. Elfardy and M. Diab . 2013 . Sentence Level Dialect Identification in Arabic. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics , pages 456 -- 461 , Sofia, Bulgaria, August 4--9 2013 . H. Elfardy, M. Al-Badrashiny, M. Elfardy and M. Diab. 2013. Sentence Level Dialect Identification in Arabic. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pages 456--461, Sofia, Bulgaria, August 4--9 2013."},{"key":"e_1_3_2_1_19_1","volume-title":"Proceedings of the Workshop on Computational Approaches to Semitic Languages at the meeting of the European Association for Computational Linguistics (EACL)","author":"Biadsy F.","unstructured":"F. Biadsy , J. Hirschberg , and N. Habash . 2009. Spoken arabic dialect identification using phonotactic modeling . In Proceedings of the Workshop on Computational Approaches to Semitic Languages at the meeting of the European Association for Computational Linguistics (EACL) , Athens, Greece. F. Biadsy, J. Hirschberg, and N. Habash. 2009. Spoken arabic dialect identification using phonotactic modeling. In Proceedings of the Workshop on Computational Approaches to Semitic Languages at the meeting of the European Association for Computational Linguistics (EACL), Athens, Greece."},{"key":"e_1_3_2_1_20_1","first-page":"10","volume-title":"Proceedings of the First Workshop on Algorithms and Resources for Modelling of Dialects and Language Varieties","author":"Salloum W.","unstructured":"W. Salloum and N. Habash . 2011. Dialectal to standard arabic paraphrasing to improve arabic-english statistical machine translation . In Proceedings of the First Workshop on Algorithms and Resources for Modelling of Dialects and Language Varieties , pages 10 -- 21 . Association for Computational Linguistics. W. Salloum and N. Habash. 2011. Dialectal to standard arabic paraphrasing to improve arabic-english statistical machine translation. In Proceedings of the First Workshop on Algorithms and Resources for Modelling of Dialects and Language Varieties, pages 10--21. Association for Computational Linguistics."},{"key":"e_1_3_2_1_21_1","volume-title":"Conventional orthography for dialectal arabic. In Proceedings of the Language Resources and Evaluation Conference (LREC)","author":"Habash N.","unstructured":"N. Habash , M. Diab , and O. Rabmow . 2012 . Conventional orthography for dialectal arabic. In Proceedings of the Language Resources and Evaluation Conference (LREC) , Istanbul. N. Habash, M. Diab, and O. Rabmow. 2012. Conventional orthography for dialectal arabic. In Proceedings of the Language Resources and Evaluation Conference (LREC), Istanbul."},{"key":"e_1_3_2_1_22_1","volume-title":"Proceedings of the 5th International Joint Conference on Natural Language Processing (ICJNLP), Chiangmai, Thailand .","author":"Dasigi P.","unstructured":"P. Dasigi and M. Diab . 2011. Codact: Towards identifying orthographic variants in dialectal arabic . In Proceedings of the 5th International Joint Conference on Natural Language Processing (ICJNLP), Chiangmai, Thailand . P. Dasigi and M. Diab. 2011. Codact: Towards identifying orthographic variants in dialectal arabic. In Proceedings of the 5th International Joint Conference on Natural Language Processing (ICJNLP), Chiangmai, Thailand ."},{"key":"e_1_3_2_1_23_1","first-page":"37","volume-title":"Proceedings of ACL","author":"Zaidan O. F.","unstructured":"O. F. Zaidan and C. Callison-Burch . 2011. The arabic online commentary dataset: an annotated dataset of informal arabic with high dialectal content . In Proceedings of ACL , pages 37 -- 41 . O. F. Zaidan and C. Callison-Burch. 2011. The arabic online commentary dataset: an annotated dataset of informal arabic with high dialectal content. In Proceedings of ACL, pages 37--41."},{"key":"e_1_3_2_1_24_1","volume-title":"Dobrovnik","author":"Sadat F.","year":"2014","unstructured":"F. Sadat . The ASMAT project - Arabic Social Media Analysis Tools. In proeedings of the Seventeenth Annual Conference of the European Association for Machine Translation (EAMT 2014) , Dobrovnik , Croatia , 16-18 June 2014 . F. Sadat. The ASMAT project - Arabic Social Media Analysis Tools. In proeedings of the Seventeenth Annual Conference of the European Association for Machine Translation (EAMT 2014), Dobrovnik, Croatia, 16-18 June 2014."}],"event":{"name":"SIGIR '14: The 37th International ACM SIGIR Conference on Research and Development in Information Retrieval","location":"Gold Coast Queensland Australia","acronym":"SIGIR '14","sponsor":["SIGIR ACM Special Interest Group on Information Retrieval"]},"container-title":["Proceedings of the first international workshop on Social media retrieval and analysis"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2632188.2632207","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2632188.2632207","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T06:56:13Z","timestamp":1750229773000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2632188.2632207"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,7,11]]},"references-count":23,"alternative-id":["10.1145\/2632188.2632207","10.1145\/2632188"],"URL":"https:\/\/doi.org\/10.1145\/2632188.2632207","relation":{},"subject":[],"published":{"date-parts":[[2014,7,11]]},"assertion":[{"value":"2014-07-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}