{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,6]],"date-time":"2026-03-06T08:30:16Z","timestamp":1772785816063,"version":"3.50.1"},"reference-count":38,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,9,28]],"date-time":"2023-09-28T00:00:00Z","timestamp":1695859200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,9,28]],"date-time":"2023-09-28T00:00:00Z","timestamp":1695859200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62072429"],"award-info":[{"award-number":["62072429"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100012542","name":"Sichuan Province Science and Technology Support Program","doi-asserted-by":"crossref","award":["2021YFG0305"],"award-info":[{"award-number":["2021YFG0305"]}],"id":[{"id":"10.13039\/100012542","id-type":"DOI","asserted-by":"crossref"}]},{"name":"The Intelligent terminal Key Laboratory of Sichuan Province","award":["SCITLAB-0003"],"award-info":[{"award-number":["SCITLAB-0003"]}]},{"name":"The Intelligent terminal Key Laboratory of Sichuan Province","award":["SCITLAB-1003"],"award-info":[{"award-number":["SCITLAB-1003"]}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"crossref","award":["ZYGX2020ZB021"],"award-info":[{"award-number":["ZYGX2020ZB021"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Big Data"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Hashtags are the 
keywords that describe the theme of social media content and have become very popular in influence marketing and trending topics. In recent years, hashtag prediction has become a hot topic in AI research, helping users with automatic hashtag recommendations by capturing the theme of a post. Most previous work focused only on textual information, but many microblog posts contain not only text but also corresponding images. This work explores both the image and text features of microblog posts. Inspired by the self-attention mechanism of the transformer in natural language processing, visual-linguistic pre-trained models with transfer learning also perform strongly on many downstream tasks that require image and text inputs. However, most existing models for multimodal hashtag recommendation are based on the traditional co-attention mechanism. This paper investigates the cross-modality transformer LXMERT for multimodal hashtag prediction and develops LXMERT4Hashtag, a cross-modality representation learning transformer model for hashtag prediction. It is a large-scale transformer model that consists of three encoders: a language encoder, an object encoder, and a cross-modality encoder. We evaluate the presented approach on the InstaNY100K dataset. 
Experimental results show that our model is competitive and achieves impressive results, including precision of 50.5% vs 46.12%, recall of 44.02% vs 38.93%, and F1-score of 47.04% vs 42.22% compared to the existing state-of-the-art baseline model.<\/jats:p>","DOI":"10.1186\/s40537-023-00824-2","type":"journal-article","created":{"date-parts":[[2023,9,28]],"date-time":"2023-09-28T11:03:02Z","timestamp":1695898982000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Cross-modality representation learning from transformer for hashtag prediction"],"prefix":"10.1186","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2999-8519","authenticated-orcid":false,"given":"Mian Muhammad Yasir","family":"Khalil","sequence":"first","affiliation":[]},{"given":"Qingxian","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Bo","family":"Chen","sequence":"additional","affiliation":[]},{"given":"Weidong","family":"Wang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,9,28]]},"reference":[{"key":"824_CR1","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A.N, Kaiser \u0141, Polosukhin I. Attention is all you need. Adv Neural Inf Process Syst. 2017-Decem(Nips), 2017; 5999\u20136009. arXiv:1706.03762"},{"key":"824_CR2","unstructured":"Devlin J, Chang M.W, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. NAACL HLT 2019 - 2019 Conf North Am Chapter Assoc Comput Linguist Hum Lang Technol - Proc Conf. 1(Mlm), 2019; 4171\u20134186. arXiv:1810.04805"},{"key":"824_CR3","doi-asserted-by":"publisher","unstructured":"Tan H, Bansal M. 2020 LXMert: Learning cross-modality encoder representations from transformers. EMNLP-IJCNLP 2019 - 2019 Conf Empir Methods Nat Lang Process. 9th Int Jt Conf Nat Lang Process Proc Conf. 2020; 5100\u20135111. 
https:\/\/doi.org\/10.18653\/v1\/d19-1514. arXiv:1908.07490","DOI":"10.18653\/v1\/d19-1514"},{"key":"824_CR4","first-page":"67","volume":"730","author":"E Zangerle","year":"2011","unstructured":"Zangerle E, Gassler W, Specht G. Recommending #-tags in Twitter. CEUR Workshop Proc. 2011;730:67.","journal-title":"CEUR Workshop Proc"},{"key":"824_CR5","unstructured":"Ding Qi Zhang uanJ ing Huang Z.X. Automatic hashtag recommendation for microblogs using topic-specific translation model TITLE AND ABSTRACT IN CHINESE, 2012; 265\u2013274."},{"key":"824_CR6","doi-asserted-by":"publisher","unstructured":"Sedhai S, Sun A. Hashtag recommendation for hyperlinked tweets. SIGIR 2014 - Proc 37th Int ACM SIGIR Conf Res Dev Inf Retr., 2014; 831\u2013834: https:\/\/doi.org\/10.1145\/2600428.2609452","DOI":"10.1145\/2600428.2609452"},{"key":"824_CR7","doi-asserted-by":"publisher","first-page":"196","DOI":"10.1016\/J.FUTURE.2015.10.012","volume":"65","author":"F Zhao","year":"2016","unstructured":"Zhao F, Zhu Y, Jin H, Yang LT. A personalized hashtag recommendation approach using LDA-based topic model in microblog environment. Futur Gener Comput Syst. 2016;65:196\u2013206. https:\/\/doi.org\/10.1016\/J.FUTURE.2015.10.012.","journal-title":"Futur Gener Comput Syst"},{"key":"824_CR8","first-page":"2782","volume":"16","author":"G Yuyun","year":"2016","unstructured":"Yuyun G, Qi Z. Hashtag recommendation using attention-based convolutional neural network. IJCAI Int Jt Conf Artif Intell. 2016;16:2782\u20138.","journal-title":"IJCAI Int Jt Conf Artif Intell"},{"key":"824_CR9","unstructured":"Li Y, Liu T, Jiang J, Zhang L. Hashtag recommendation with topical attention-based LSTM. COLING 2016 - 26th Int Conf Comput Linguist Proc. COLING 2016 Tech Pap. 2016; 3019\u20133029"},{"key":"824_CR10","doi-asserted-by":"publisher","unstructured":"Li J, Xu H, He X, Deng J, Sun X. Tweet modeling with LSTM recurrent neural networks for hashtag recommendation. Proc Int Jt Conf. 
Neural Networks 2016-October, 2016; 1570\u20131577: https:\/\/doi.org\/10.1109\/IJCNN.2016.7727385","DOI":"10.1109\/IJCNN.2016.7727385"},{"issue":"4","key":"824_CR11","doi-asserted-by":"publisher","first-page":"711","DOI":"10.1007\/s11390-018-1851-2","volume":"33","author":"FF Kou","year":"2018","unstructured":"Kou FF, Du JP, Yang CX. Hashtag recommendation based on multi-features of microblogs. J COM-PUTER Sci Technol. 2018;33(4):711\u201326. https:\/\/doi.org\/10.1007\/s11390-018-1851-2.","journal-title":"J COM-PUTER Sci Technol"},{"key":"824_CR12","doi-asserted-by":"crossref","unstructured":"Liu J, He Z, Huang Y. Hashtag2Vec: Learning hashtag representation with relational hierarchical embedding model. 2018.","DOI":"10.24963\/ijcai.2018\/480"},{"key":"824_CR13","doi-asserted-by":"publisher","first-page":"125","DOI":"10.1007\/978-3-030-15719-7-16","volume-title":"Lecture notes computer science","author":"SK Maity","year":"2019","unstructured":"Maity SK, Panigrahi A, Ghosh S, Banerjee A, Goyal P, Mukherjee A. DeepTagRec: a content-cum-user based tag recommendation framework for stack overflow. In: Azzopardi L, Stein B, Fuhr N, Mayr P, Hauff C, Hiemstra D, editors. Lecture notes computer science. Cham: Springer; 2019. p. 125\u201331. https:\/\/doi.org\/10.1007\/978-3-030-15719-7-16."},{"key":"824_CR14","doi-asserted-by":"publisher","first-page":"356","DOI":"10.1016\/J.NEUCOM.2018.11.057","volume":"331","author":"Y Li","year":"2019","unstructured":"Li Y, Liu T, Hu J, Jiang J. Topical Co-attention networks for hashtag recommendation on microblogs. Neurocomputing. 2019;331:356\u201365. https:\/\/doi.org\/10.1016\/J.NEUCOM.2018.11.057.","journal-title":"Neurocomputing"},{"key":"824_CR15","doi-asserted-by":"publisher","unstructured":"Peng M, Bian Q, Zhang Q, Gui T, Fu J, Zeng L, Huang X. Model the Long-Term Post History for Hashtag Recommendation. Lect Notes Comput Sci. (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics) 11838 LNAI, 2019; 596\u2013608. 
https:\/\/doi.org\/10.1007\/978-3-030-32233-5_46","DOI":"10.1007\/978-3-030-32233-5_46"},{"key":"824_CR16","doi-asserted-by":"publisher","unstructured":"Zhang J, Sun H, Tian Y, Liu X. Poster: Semantically enhanced tag recommendation for software CQAs via deep learning. https:\/\/doi.org\/10.1145\/3183440.3194977","DOI":"10.1145\/3183440.3194977"},{"key":"824_CR17","doi-asserted-by":"publisher","unstructured":"Sigurbj\u00f6rnsson B, Van Zwol R. Flickr tag recommendation based on collective knowledge. Proceeding 17th Int Conf World Wide Web 2008, WWW\u201908, 2008; 327\u2013336. https:\/\/doi.org\/10.1145\/1367497.1367542","DOI":"10.1145\/1367497.1367542"},{"key":"824_CR18","doi-asserted-by":"publisher","unstructured":"Garg N, Weber I. Personalized, interactive tag recommendation for flickr. RecSys\u201908 Proc. 2008 ACM Conf Recomm Syst. 2008; 67\u201374. https:\/\/doi.org\/10.1145\/1454008.1454020","DOI":"10.1145\/1454008.1454020"},{"key":"824_CR19","doi-asserted-by":"publisher","unstructured":"Liu D, Hua X.S, Yang L, Wang M, Zhang H.J. Tag ranking. WWW\u201909 - Proc. 18th Int. World Wide Web Conf. 2009; 351\u2013360 . https:\/\/doi.org\/10.1145\/1526709.1526757","DOI":"10.1145\/1526709.1526757"},{"key":"824_CR20","doi-asserted-by":"publisher","unstructured":"Li X, Snoek C.G.M. Classifying tag relevance with relevant positive and negative examples. MM 2013 - Proc. 2013 ACM Multimed Conf. 2013; 485\u2013488. https:\/\/doi.org\/10.1145\/2502081.2502129","DOI":"10.1145\/2502081.2502129"},{"key":"824_CR21","unstructured":"Park M, Li H, Kim J. HARRISON: a Benchmark on HAshtag Recommendation for Real-world Images in Social Networks 2016; arXiv:1605.05054"},{"key":"824_CR22","doi-asserted-by":"publisher","unstructured":"Nguyen H.T.H, Wistuba M, Grabocka J, Drumond L.R, Schmidt-Thieme L. Personalized deep learning for tag recommendation. Lect Notes Comput Sci. (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics) 10234 LNAI, 2017; 186\u2013197 . 
https:\/\/doi.org\/10.1007\/978-3-319-57454-7_15","DOI":"10.1007\/978-3-319-57454-7_15"},{"key":"824_CR23","doi-asserted-by":"publisher","unstructured":"Wu G, Li Y, Yan W, Li R, Gu X, Yang Q. Hashtag Recommendation with Attention-Based Neural Image Hashtagging Network. Lect Notes Comput Sci. (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics) 11302 LNCS, 2018; 52\u201363. https:\/\/doi.org\/10.1007\/978-3-030-04179-3_5","DOI":"10.1007\/978-3-030-04179-3_5"},{"key":"824_CR24","doi-asserted-by":"publisher","unstructured":"Kao D, Lai K.T, Chen M.S. An efficient and resource-aware hashtag recommendation using deep neural networks. Lect Notes Comput Sci. (including Subser. Lect Notes Artif Intell Lect Notes Bioinformatics) 11440 LNAI, 2019; 150\u2013162: https:\/\/doi.org\/10.1007\/978-3-030-16145-3_12","DOI":"10.1007\/978-3-030-16145-3_12"},{"issue":"12","key":"824_CR25","doi-asserted-by":"publisher","first-page":"1351","DOI":"10.3390\/E22121351","volume":"22","author":"T Hachaj","year":"2020","unstructured":"Hachaj T, Miazga J. Image hashtag recommendations using a voting deep neural network and associative rules mining approach. Entropy. 2020;22(12):1351. https:\/\/doi.org\/10.3390\/E22121351.","journal-title":"Entropy"},{"key":"824_CR26","doi-asserted-by":"publisher","unstructured":"Durand T. Learning user representations for open vocabulary image hashtag prediction. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit. 2020; 9766\u20139775. https:\/\/doi.org\/10.1109\/CVPR42600.2020.00979","DOI":"10.1109\/CVPR42600.2020.00979"},{"key":"824_CR27","doi-asserted-by":"publisher","unstructured":"; Zhang Q, Wang J, Huang H, Huang X, Gong Y. Hashtag recommendation for multimodal microblog using co-attention network. IJCAI Int Jt Conf Artif Intell. 0, 2017; 3420\u20133426: https:\/\/doi.org\/10.24963\/ijcai.2017\/478","DOI":"10.24963\/ijcai.2017\/478"},{"key":"824_CR28","unstructured":"Simonyan K, Zisserman A. 
Very deep convolutional networks for large-scale image recognition. 2015; arXiv:1409.1556v6"},{"key":"824_CR29","doi-asserted-by":"publisher","unstructured":"Zhang S, Yao Y, Xu F, Tong H, Yan X, Lu J. Hashtag recommendation for photo sharing services. 33rd AAAI Conf. Artif. Intell. AAAI 2019, 31st Innov Appl Artif Intell Conf. IAAI 2019 9th AAAI Symp Educ Adv Artif Intell. EAAI 2019. 2019; 5805\u20135812 . https:\/\/doi.org\/10.1609\/aaai.v33i01.33015805","DOI":"10.1609\/aaai.v33i01.33015805"},{"issue":"3","key":"824_CR30","doi-asserted-by":"publisher","first-page":"768","DOI":"10.1109\/TCSS.2020.2986778","volume":"7","author":"Q Yang","year":"2020","unstructured":"Yang Q, Wu G, Li Y, Li R, Gu X, Deng H, Wu J. AMNN: attention-based multimodal neural network model for hashtag recommendation. IEEE Trans Comput Soc Syst. 2020;7(3):768\u201379. https:\/\/doi.org\/10.1109\/TCSS.2020.2986778.","journal-title":"IEEE Trans Comput Soc Syst"},{"key":"824_CR31","doi-asserted-by":"publisher","unstructured":"Cho K, Van Merri\u00ebnboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y. Learning phrase representations using rnn encoder-decoder for statistical machine translation. EMNLP 2014 - 2014 Conf Empir Methods Nat Lang Process Proc Conf. 2014; 1724\u20131734 . https:\/\/doi.org\/10.3115\/V1\/D14-1179. arXiv:1406.1078","DOI":"10.3115\/V1\/D14-1179"},{"issue":"2","key":"824_CR32","doi-asserted-by":"publisher","first-page":"388","DOI":"10.1109\/TKDE.2019.2932406","volume":"33","author":"R Ma","year":"2021","unstructured":"Ma R, Qiu X, Zhang Q, Hu X, Jiang YG, Huang X. Co-attention memory network for multimodal microblog\u2019s hashtag recommendation. IEEE Trans Knowl Data Eng. 2021;33(2):388\u2013400. https:\/\/doi.org\/10.1109\/TKDE.2019.2932406.","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"824_CR33","doi-asserted-by":"publisher","unstructured":"Im J.H, Cho W, Kim D.S. Cross-active connection for image-text multimodal feature fusion. vol. 12801 LNCS, pp. 
343\u2013354. Springer. 2021; https:\/\/doi.org\/10.1007\/978-3-030-80599-9_30","DOI":"10.1007\/978-3-030-80599-9_30"},{"key":"824_CR34","doi-asserted-by":"publisher","DOI":"10.1186\/s40537-022-00570-x","author":"R Rivas","year":"2022","unstructured":"Rivas R, Paul S, Hristidis V, Papalexakis EE, Roy-Chowdhury AK. Task-agnostic representation learning of multimodal twitter data for downstream applications. J Big Data. 2022. https:\/\/doi.org\/10.1186\/s40537-022-00570-x.","journal-title":"J Big Data"},{"key":"824_CR35","unstructured":"Wu Y, Schuster M, Chen Z, Le Q.V, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K, Klingner J, Shah A, Johnson M, Liu X, Kaiser \u0141, Gouws S, Kato Y, Kudo T, Kazawa H, Stevens K, Kurian G, Patil N, Wang W, Young C, Smith J, Riesa J, Rudnick A, Vinyals O, Corrado G, Hughes M, Dean J.: Google\u2019s neural machine translation system: bridging the gap between human and machine translation. arXiv:1609.08144v2"},{"issue":"6","key":"824_CR36","doi-asserted-by":"publisher","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","volume":"39","author":"S Ren","year":"2015","unstructured":"Ren S, He K, Girshick R, Sun J. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell. 2015;39(6):1137\u201349. https:\/\/doi.org\/10.1109\/TPAMI.2016.2577031.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"824_CR37","doi-asserted-by":"crossref","unstructured":"Anderson P, He X, Buehler C, Teney D, Johnson M, Gould S, Zhang L. Bottom-up and top-down attention for image captioning and visual question answering. InProceedings of the IEEE conference on computer vision and pattern recognition 2018 (pp. 6077\u20136086).","DOI":"10.1109\/CVPR.2018.00636"},{"key":"824_CR38","unstructured":"Gomez R. Learning to learn from web data through deep semantic embeddings. 
arXiv:1808.06368v1"}],"container-title":["Journal of Big Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-023-00824-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s40537-023-00824-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-023-00824-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,18]],"date-time":"2023-11-18T20:07:12Z","timestamp":1700338032000},"score":1,"resource":{"primary":{"URL":"https:\/\/journalofbigdata.springeropen.com\/articles\/10.1186\/s40537-023-00824-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,28]]},"references-count":38,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["824"],"URL":"https:\/\/doi.org\/10.1186\/s40537-023-00824-2","relation":{},"ISSN":["2196-1115"],"issn-type":[{"value":"2196-1115","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,28]]},"assertion":[{"value":"31 May 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 September 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 September 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not 
applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"148"}}