{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,20]],"date-time":"2026-03-20T18:01:32Z","timestamp":1774029692073,"version":"3.50.1"},"reference-count":36,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2023,8,1]],"date-time":"2023-08-01T00:00:00Z","timestamp":1690848000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Sichuan Science and Technology Program","award":["2021YFQ0003"],"award-info":[{"award-number":["2021YFQ0003"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Systems"],"abstract":"<jats:p>Facing fast-increasing electronic documents in the Digital Media Age, the need to extract textual features of online texts for better communication is growing. Sentiment classification might be the key method to catch emotions of online communication, and developing corpora with annotation of emotions is the first step to achieving sentiment classification. However, the labour-intensive and costly manual annotation has resulted in the lack of corpora for emotional words. Furthermore, single-label semantic corpora could hardly meet the requirement of modern analysis of complicated user\u2019s emotions, but tagging emotional words with multiple labels is even more difficult than usual. Improvement of the methods of automatic emotion tagging with multiple emotion labels to construct new semantic corpora is urgently needed. Taking Twitter short texts as the case, this study proposes a new semi-automatic method to annotate Internet short texts with multiple labels and form a multi-labelled corpus for further algorithm training. Each sentence is tagged with both the emotional tendency and polarity, and each tweet, which generally contains several sentences, is tagged with the first two major emotional tendencies. 
The semi-automatic multi-labelled annotation is achieved through the process of selecting the base corpus and emotional tags, data preprocessing, automatic annotation through word matching and weight calculation, and manual correction in case of multiple emotional tendencies are found. The experiments on the Sentiment140 published Twitter corpus demonstrate the effectiveness of the proposed approach and show consistency between the results of semi-automatic annotation and manual annotation. By applying this method, this study summarises the annotation specification and constructs a multi-labelled emotion corpus with 6500 tweets for further algorithm training.<\/jats:p>","DOI":"10.3390\/systems11080390","type":"journal-article","created":{"date-parts":[[2023,8,1]],"date-time":"2023-08-01T09:24:24Z","timestamp":1690881864000},"page":"390","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":144,"title":["Developing Multi-Labelled Corpus of Twitter Short Texts: A Semi-Automatic Method"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5599-2607","authenticated-orcid":false,"given":"Xuan","family":"Liu","sequence":"first","affiliation":[{"name":"School of Public Affairs and Administration, University of Electronic Science and Technology of China, Chengdu 611731, China"}]},{"given":"Guohui","family":"Zhou","sequence":"additional","affiliation":[{"name":"School of Public Affairs and Administration, University of Electronic Science and Technology of China, Chengdu 611731, China"}]},{"given":"Minghui","family":"Kong","sequence":"additional","affiliation":[{"name":"School of Public Affairs and Administration, University of Electronic Science and Technology of China, Chengdu 611731, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9818-9205","authenticated-orcid":false,"given":"Zhengtong","family":"Yin","sequence":"additional","affiliation":[{"name":"College of Resource and 
Environment Engineering, Guizhou University, Guiyang 550025, China"}]},{"given":"Xiaolu","family":"Li","sequence":"additional","affiliation":[{"name":"School of Geographical Sciences, Southwest University, Chongqing 400715, China"}]},{"given":"Lirong","family":"Yin","sequence":"additional","affiliation":[{"name":"Department of Geography and Anthropology, Louisiana State University, Baton Rouge, LA 70803, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8486-1654","authenticated-orcid":false,"given":"Wenfeng","family":"Zheng","sequence":"additional","affiliation":[{"name":"School of Automation, University of Electronic Science and Technology of China, Chengdu 611731, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,8,1]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"103547","DOI":"10.1016\/j.im.2021.103547","article-title":"Understanding how the semantic features of contents influence the diffusion of government microblogs: Moderating role of content topics","volume":"58","author":"Feng","year":"2021","journal-title":"Inf. Manag."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Hu, A., and Flaxman, S. (2018, January 19\u201323). Multimodal sentiment analysis to explore the structure of emotions. Proceedings of the 24th ACM SIGKDD international conference on Knowledge Discovery & Data Mining, London, UK.","DOI":"10.1145\/3219819.3219853"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Ai, Y., Chen, Z., Wang, S., and Pang, Y. (2018, January 26\u201328). Recognizing emotions in chinese text using dictionary and ensemble of classifier. Proceedings of the Third International Workshop on Pattern Recognition, Jinan, China.","DOI":"10.1117\/12.2501916"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Yang, J., Jiang, L., Wang, C., and Xie, J. (2014, January 10\u201312). Multi-label emotion classification for tweets in weibo: Method and application. 
Proceedings of the 2014 IEEE 26th International Conference on Tools with Artificial Intelligence (ICTAI), Limassol, Cyprus.","DOI":"10.1109\/ICTAI.2014.71"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1083","DOI":"10.1016\/j.eswa.2014.08.036","article-title":"A multi-label classification based approach for sentiment classification","volume":"42","author":"Liu","year":"2015","journal-title":"Expert Syst. Appl."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Shah, F.M., Reyadh, A.S., Shaafi, A.I., Ahmed, S., and Sithil, F.T. (2019, January 26\u201328). Emotion detection from tweets using AIT-2018 dataset. Proceedings of the 2019 5th International Conference on Advances in Electrical Engineering (ICAEE), Dhaka, Bangladesh.","DOI":"10.1109\/ICAEE48663.2019.8975433"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1016\/j.csl.2013.04.010","article-title":"Automatically annotating a five-billion-word corpus of Japanese blogs for sentiment and affect analysis","volume":"28","author":"Ptaszynski","year":"2014","journal-title":"Comput. Speech Lang."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"595","DOI":"10.1049\/joe.2019.1212","article-title":"Using normal dictionaries to extract multiple semantic relationships","volume":"2020","author":"Liang","year":"2020","journal-title":"J. Eng."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1080\/02699939208411068","article-title":"An argument for basic emotions","volume":"6","author":"Ekman","year":"1992","journal-title":"Cogn. Emot."},{"key":"ref_10","first-page":"116","article-title":"Construction and analysis of affective corpus","volume":"22","author":"Xu","year":"2008","journal-title":"J. Chin. Inf."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Ji, Q., and Raney, A.A. (2020). Developing and validating the self-transcendent emotion dictionary for text analysis. 
PLoS ONE, 15.","DOI":"10.1371\/journal.pone.0239050"},{"key":"ref_12","unstructured":"Pak, A., and Paroubek, P. (2010, January 17\u201323). Twitter as a corpus for sentiment analysis and opinion mining. Proceedings of the Seventh International Conference on Language Resources and Evaluation LREC, Valletta, Malta."},{"key":"ref_13","unstructured":"Uryupina, O., Plank, B., Severyn, A., Rotondi, A., and Moschitti, A. (2014, January 26\u201331). SenTube: A Corpus for Sentiment Analysis on YouTube Social Media. Proceedings of the 9th International Conference on Language Resources and Evaluation LREC, Reykjavik, Iceland."},{"key":"ref_14","unstructured":"Refaee, E., and Rieser, V. (2014, January 26\u201331). An arabic twitter corpus for subjectivity and sentiment analysis. Proceedings of the 9th International Conference on Language Resources and Evaluation LREC, Reykjavik, Iceland."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s13278-019-0602-x","article-title":"Arabic sentiment analysis: Studies, resources, and tools","volume":"9","author":"Guellil","year":"2019","journal-title":"Soc. Netw. Anal. Min."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1515\/cllt-2019-0060","article-title":"A corpus-based analysis of meaning variations in German tag questions: Evidence from spoken and written conversational corpora","volume":"18","author":"Clausen","year":"2022","journal-title":"Corpus Linguist. Linguist. Theory"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Svetlov, K., and Platonov, K. (2019, January 5\u20138). Sentiment analysis of posts and comments in the accounts of russian politicians on the social network. Proceedings of the 2019 25th Conference of Open Innovations Association (FRUCT), Helsinki, Finland.","DOI":"10.23919\/FRUCT48121.2019.8981501"},{"key":"ref_18","unstructured":"Mohammad, S., and Turney, P. (2010, January 5\u20136). 
Emotions evoked by common words and phrases: Using mechanical turk to create an emotion lexicon. Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, Los Angeles, CA, USA."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Matsumoto, K., Sasayama, M., Yoshida, M., and Kita, K. (2019, January 19\u201321). Emotional state estimation by dialogue history and sentence distributed representation. Proceedings of the 2019 IEEE 6th International Conference on Cloud Computing and Intelligence Systems (CCIS), Singapore.","DOI":"10.1109\/CCIS48116.2019.9073750"},{"key":"ref_20","unstructured":"Aman, S., and Szpakowicz, S. (2008, January 7\u201312). Using roget\u2019s thesaurus for fine-grained emotion recognition. Proceedings of the Third International Joint Conference on Natural Language Processing, Hyderabad, India."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Yang, L., Zhou, F., Lin, H., Wang, J., and Zhang, S. (2018, January 26\u201328). Chinese emotion commonsense knowledge base construction and its application. Proceedings of the Workshop on Chinese Lexical Semantics, Chiayi, Taiwan.","DOI":"10.1007\/978-3-030-04015-4_11"},{"key":"ref_22","first-page":"1197","article-title":"Multilabel Emotion Tagging for Domain-Specific Texts","volume":"9","year":"2021","journal-title":"IEEE Trans. Comput. Soc. Syst."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1016\/j.neucom.2016.03.088","article-title":"Multi-label maximum entropy model for social emotion classification over short text","volume":"210","author":"Li","year":"2016","journal-title":"Neurocomputing"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Rajabi, Z., Shehu, A., and Uzuner, O. (2020, January 3\u20135). A multi-channel bilstm-cnn model for multilabel emotion classification of informal text. 
Proceedings of the 2020 IEEE 14th International Conference on Semantic Computing (ICSC), San Diego, CA, USA.","DOI":"10.1109\/ICSC.2020.00060"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1839","DOI":"10.1109\/TASLP.2020.3001390","article-title":"Topic-enhanced capsule network for multi-label emotion classification","volume":"28","author":"Fei","year":"2020","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"ref_26","first-page":"2323","article-title":"Deep Learning and Machine Learning-Based Model for Conversational Sentiment Classification","volume":"72","author":"Ullah","year":"2022","journal-title":"Comput. Mater. Contin."},{"key":"ref_27","unstructured":"Aman, S., and Szpakowicz, S. (2007, January 3\u20137). Identifying expressions of emotion in text. Proceedings of the Text, Speech and Dialogue: 10th International Conference, Pilsen, Czech Republic."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1162\/COLI_a_00049","article-title":"Lexicon-based methods for sentiment analysis","volume":"37","author":"Taboada","year":"2011","journal-title":"Comput. Linguist."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Yan, D., Hu, B., and Qin, J. (2018, January 15\u201317). Sentiment analysis for microblog related to Finance based on rules and classification. Proceedings of the 2018 IEEE International Conference on Big Data and Smart computing (BigComp), Shanghai, China.","DOI":"10.1109\/BigComp.2018.00026"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Liu, H., Guo, H., and Hu, W. (2021, January 22\u201328). Eeg-based emotion classification using joint adaptation networks. Proceedings of the 2021 IEEE International Symposium on Circuits and Systems (ISCAS), Daegu, Republic of Korea.","DOI":"10.1109\/ISCAS51556.2021.9401737"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Shu, L., Xie, J., Yang, M., Li, Z., Li, Z., Liao, D., Xu, X., and Yang, X. (2018). 
A review of emotion recognition using physiological signals. Sensors, 18.","DOI":"10.3390\/s18072074"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"174","DOI":"10.1007\/s11036-020-01697-y","article-title":"Research on sentiment analysis of network forum based on BP neural network","volume":"26","author":"Tang","year":"2021","journal-title":"Mob. Netw. Appl."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"101076","DOI":"10.1016\/j.joi.2020.101076","article-title":"A novel term weighting scheme for text classification: Tf-mono","volume":"14","author":"Dogan","year":"2020","journal-title":"J. Informetr."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Sintsova, V., Musat, C., and Pu, P. (2014, January 14). Semi-supervised method for multi-tendency emotion recognition in tweets. Proceedings of the 2014 IEEE International Conference on Data Mining Workshop, Shenzhen, China.","DOI":"10.1109\/ICDMW.2014.146"},{"key":"ref_35","unstructured":"Mishne, G. (2005, January 15\u201319). Experiments with mood classification in blog posts. Proceedings of the ACM SIGIR 2005 Workshop on Stylistic Analysis of Text for Information Access, Salvador, Brazil."},{"key":"ref_36","first-page":"2009","article-title":"Twitter sentiment classification using distant supervision","volume":"1","author":"Go","year":"2009","journal-title":"CS224N Proj. Rep. 
Stanf."}],"container-title":["Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-8954\/11\/8\/390\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:23:37Z","timestamp":1760127817000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-8954\/11\/8\/390"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,1]]},"references-count":36,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2023,8]]}},"alternative-id":["systems11080390"],"URL":"https:\/\/doi.org\/10.3390\/systems11080390","relation":{},"ISSN":["2079-8954"],"issn-type":[{"value":"2079-8954","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,8,1]]}}}