{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,10]],"date-time":"2026-06-10T16:19:00Z","timestamp":1781108340018,"version":"3.54.1"},"reference-count":34,"publisher":"IGI Global Scientific Publishing","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,4,1]]},"abstract":"<p>Learning text representation is forming a core for numerous natural language processing applications. Word embedding is a type of text representation that allows words with similar meaning to have similar representation. Word embedding techniques categorize semantic similarities between linguistic items based on their distributional properties in large samples of text data. Although these techniques are very efficient, handling semantic and pragmatics ambiguity with high accuracy is still a challenging research task. In this article, we propose a new feature as a semantic score which handles ambiguities between words. We use external knowledge bases and the Huffman Coding algorithm to compute this score that depicts the semantic relatedness between all fragments composing a given text. We combine this feature with word embedding methods to improve text representation. We evaluate our method on a hashtag recommendation system in Twitter where text is noisy and short. The experimental results demonstrate that, compared with state-of-the-art algorithms, our method achieves good results.<\/p>","DOI":"10.4018\/ijswis.2020040107","type":"journal-article","created":{"date-parts":[[2020,2,21]],"date-time":"2020-02-21T10:19:05Z","timestamp":1582280345000},"page":"126-142","source":"Crossref","is-referenced-by-count":2,"title":["Efficient Weighted Semantic Score Based on the Huffman Coding Algorithm and Knowledge Bases for Word Sequences Embedding"],"prefix":"10.4018","volume":"16","author":[{"given":"Nada","family":"Ben-Lhachemi","sequence":"first","affiliation":[{"name":"Sidi Mohamed Ben Abdellah University, Morocco"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5816-0897","authenticated-orcid":true,"given":"El Habib","family":"Nfaoui","sequence":"additional","affiliation":[{"name":"LIIAN Laboratory, Sidi Mohamed Ben Abdellah University, Morocco"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"2432","reference":[{"key":"IJSWIS.2020040107-0","article-title":"A simple but tough-to-beat baseline for sentence embeddings.","author":"S.Arora","year":"2017","journal-title":"Proceedings of the 5th International Conference on Learning Representations"},{"key":"IJSWIS.2020040107-1","doi-asserted-by":"publisher","DOI":"10.1145\/3102254.3102283"},{"key":"IJSWIS.2020040107-2","article-title":"A comparison of vector-based representations for semantic composition.","author":"W.Blacoe","year":"2012","journal-title":"Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning"},{"key":"IJSWIS.2020040107-3","first-page":"226","article-title":"A density-based algorithm for discovering clusters in large spatial databases with noise.","author":"M.Ester","year":"1996","journal-title":"Proceedings of the Second International Conference on Knowledge Discovery and Data Mining"},{"key":"IJSWIS.2020040107-4","author":"Z. S.Harris","year":"1981","journal-title":"Distributional structure"},{"key":"IJSWIS.2020040107-5","doi-asserted-by":"publisher","DOI":"10.1109\/JRPROC.1952.273898"},{"key":"IJSWIS.2020040107-6","unstructured":"Ignacio, A.-F., Carlos-Francisco, M.-C., Gerardo, S., Juan-Manuel, T.-M., & Grigori, S. (2017). Sentence Representations as Word Information Series: Revisiting TF\u2014IDF."},{"key":"IJSWIS.2020040107-7","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P15-1162"},{"key":"IJSWIS.2020040107-8","doi-asserted-by":"publisher","DOI":"10.4018\/IJSWIS.2017010105"},{"key":"IJSWIS.2020040107-9","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-45924-9_14"},{"key":"IJSWIS.2020040107-10","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1089"},{"key":"IJSWIS.2020040107-11","doi-asserted-by":"publisher","DOI":"10.1145\/2806416.2806475"},{"key":"IJSWIS.2020040107-12","unstructured":"Kiros, R., Zhu, Y., Salakhutdinov, R. R., Zemel, R., Urtasun, R., Torralba, A., & Fidler, S. (2015). Skip-thought vectors. In Advances in neural information processing systems (pp. 3294-3302). Academic Press."},{"key":"IJSWIS.2020040107-13","unstructured":"Le, Q., & Mikolov, T. (2014). Distributed Representations of Sentences and Documents. Proceedings of the 31st International Conference on Machine Learning, Beijing, China. JMLR: W&CP."},{"key":"IJSWIS.2020040107-14","article-title":"Hashtag Recommendation with Topical Attention-Based LSTM.","author":"Y.Li","year":"2016","journal-title":"Proceedings of the 26th International Conference on Computational Linguistics"},{"key":"IJSWIS.2020040107-15","doi-asserted-by":"publisher","DOI":"10.1109\/ICCI-CC.2015.7259377"},{"key":"IJSWIS.2020040107-16","unstructured":"Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space."},{"key":"IJSWIS.2020040107-17","unstructured":"Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems (pp. 3111-3119). Academic Press."},{"issue":"8","key":"IJSWIS.2020040107-18","doi-asserted-by":"crossref","first-page":"1388","DOI":"10.1111\/j.1551-6709.2010.01106.x","article-title":"Composition in distributional models of semantics.","volume":"34","author":"J.Mitchell","year":"2010","journal-title":"Cognitive Science"},{"key":"IJSWIS.2020040107-19","author":"A.Moro","year":"2014","journal-title":"Entity Linking meets Word Sense Disambiguation: A Unified Approach"},{"key":"IJSWIS.2020040107-20","article-title":"Glove: Global vectors for word representation.","author":"J.Pennington","year":"2014","journal-title":"Proceedings of the Empirical Methods in Natural Language Processing"},{"key":"IJSWIS.2020040107-21","unstructured":"Piotr, B., Edouard, G., Armand, J., & Mikolov, T. (2016). Enriching Word Vectors with Subword Information."},{"key":"IJSWIS.2020040107-22","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.04.045"},{"key":"IJSWIS.2020040107-23","article-title":"A unified architecture for natural language processing: Deep neural networks with multitask learning.","author":"C.Ronan","year":"2008","journal-title":"Proceedings of the 25th International Conference on Machine Learning"},{"key":"IJSWIS.2020040107-24","doi-asserted-by":"crossref","unstructured":"Tai, K.S., Socher, R., & Manning, C.D. (2015). Improved semantic representations from tree-structured long short-term memory networks.","DOI":"10.3115\/v1\/P15-1150"},{"key":"IJSWIS.2020040107-25","unstructured":"The word2vec pre-trained Google News corpus. (n.d.). Retrieved from https:\/\/drive.google.com\/file\/d\/0B7XkCwpI5KDYNlNUTTlSS21pQmM\/edit"},{"key":"IJSWIS.2020040107-26","unstructured":"UDIT Twitter Crawl [Dataset] (2012). Retrieved from https:\/\/wiki.cites.illinois.edu\/wiki\/display\/forward\/Dataset-UDITwitterCrawl-Aug2012"},{"key":"IJSWIS.2020040107-27","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1048"},{"key":"IJSWIS.2020040107-28","doi-asserted-by":"crossref","first-page":"1822","DOI":"10.3115\/v1\/D14-1194","article-title":"tagspace: Semantic embeddings from hashtags.","author":"J.Weston","year":"2014","journal-title":"Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP)"},{"key":"IJSWIS.2020040107-29","article-title":"Towards universal paraphrastic sentence embeddings.","author":"J.Wieting","year":"2016","journal-title":"Proceedings of the International Conference on Learning Representations"},{"key":"IJSWIS.2020040107-30","unstructured":"Word2vec open source. (n.d.). Retrieved from https:\/\/code.google.com\/archive\/p\/word2vec\/"},{"key":"IJSWIS.2020040107-31","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1016\/bs.host.2018.05.001","article-title":"Deep learning for natural language processing","volume":"Vol. 38","author":"Y.Xie","year":"2018","journal-title":"Handbook of statistics"},{"key":"IJSWIS.2020040107-32","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/N15-1091"},{"key":"IJSWIS.2020040107-33","doi-asserted-by":"crossref","first-page":"196","DOI":"10.1016\/j.future.2015.10.012","article-title":"A personalized hashtag recommendation approach using LDA-based topic model in microblog environment.","volume":"65","author":"F.Zhao","year":"2016","journal-title":"Future Generation Computer Systems"}],"container-title":["International Journal on Semantic Web and Information Systems"],"original-title":[],"language":"ng","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=249682","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,6]],"date-time":"2022-05-06T14:20:57Z","timestamp":1651846857000},"score":1,"resource":{"primary":{"URL":"https:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/IJSWIS.2020040107"}},"subtitle":[""],"short-title":[],"issued":{"date-parts":[[2020,4,1]]},"references-count":34,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2020,4]]}},"URL":"https:\/\/doi.org\/10.4018\/ijswis.2020040107","relation":{},"ISSN":["1552-6283","1552-6291"],"issn-type":[{"value":"1552-6283","type":"print"},{"value":"1552-6291","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,4,1]]}}}