{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T06:48:13Z","timestamp":1777704493969,"version":"3.51.4"},"reference-count":36,"publisher":"SAGE Publications","issue":"5","license":[{"start":{"date-parts":[[2020,9,28]],"date-time":"2020-09-28T00:00:00Z","timestamp":1601251200000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"},{"start":{"date-parts":[[2020,9,28]],"date-time":"2020-09-28T00:00:00Z","timestamp":1601251200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Intelligent &amp; Fuzzy Systems: Applications in Engineering and Technology"],"published-print":{"date-parts":[[2020,11,19]]},"abstract":"<jats:p>The increasing tendency of people expressing opinions via images online has motivated the development of automatic assessment of sentiment from visual contents. Based on the observation that visual sentiment is conveyed through many visual elements in images, we put forward to tackle visual sentiment analysis under multiple instance learning (MIL) formulation. We propose a deep multiple clustered instance learning formulation, under which a deep multiple clustered instance learning network (DMCILN) is constructed for visual sentiment analysis. Specifically, the input image is converted into a bag of instances through visual instance generation module, which is composed of a pre-trained convolutional neural network (CNN) and two adaptation layers. Then, a fuzzy c-means routing algorithm is introduced for generating clustered instances as semantic mid-level representation to bridge the instance-to-bag gap. To explore the relationships between clustered instances and bags, we construct an attention based MIL pooling layer for representing bag features. A multi-head mechanism is integrated to form MIL ensembles, which enables to weigh the contribution of each clustered instance in different subspaces for generating more robust bag representation. Finally, we conduct extensive experiments on several datasets, and the experimental results verify the feasibility of our proposed approach for visual sentiment analysis.<\/jats:p>","DOI":"10.3233\/jifs-200675","type":"journal-article","created":{"date-parts":[[2020,9,29]],"date-time":"2020-09-29T15:54:08Z","timestamp":1601394848000},"page":"7217-7231","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":1,"title":["Visual sentiment analysis via deep multiple clustered instance learning"],"prefix":"10.1177","volume":"39","author":[{"given":"Wenjing","family":"Gao","sequence":"first","affiliation":[{"name":"Shanghai University","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wenjun","family":"Zhang","sequence":"additional","affiliation":[{"name":"Shanghai University","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haiyan","family":"Gao","sequence":"additional","affiliation":[{"name":"Shanghai University","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yonghua","family":"Zhu","sequence":"additional","affiliation":[{"name":"Shanghai University","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2020,9,28]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2015.2482228"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","unstructured":"TruongQ.-T. and LauwH.W. Visual Sentiment Analysis for Review Images with Item-Oriented and User-Oriented CNN. In Proceedings of the 25th ACM international conference on Multimedia (MM 2017) 1274\u20131282. https:\/\/doi.org\/10.1145\/3123266.3123374.","DOI":"10.1145\/3123266.3123374"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/TAFFC.2014.2388370"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","unstructured":"ZhaoS. GaoY. JiangX. YaoH. ChuaT.-S. and SunX. Exploring Principles-of-Art Features For Image Emotion Recognition in: Proceedings of the ACM International Conference on Multimedia - MM \u201914 ACM Press Orlando Florida USA 2014 pp. 47\u201356. https:\/\/doi.org\/10.1145\/2647868.2654930.","DOI":"10.1145\/2647868.2654930"},{"key":"e_1_3_2_6_2","doi-asserted-by":"crossref","unstructured":"YouQ. LuoJ. JinH. and YangJ. Robust image sentiment analysis using progressively trained and domain transferred deep networks. In Twenty-ninth AAAI conference on artificial intelligence. (2015).","DOI":"10.1609\/aaai.v29i1.9179"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2017.01.011"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","unstructured":"YouQ. LuoJ. JinH. and YangJ. Cross-modality Consistent Regression for JointVisual-Textual Sentiment Analysis of Social Multimedia in: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining - WSDM \u201916 ACM Press San Francisco California USA 2016 pp. 13\u201322. https:\/\/doi.org\/10.1145\/2835776.2835779.","DOI":"10.1145\/2835776.2835779"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2017.2757769"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2019.01.019"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","unstructured":"BorthD. JiR. ChenT. BreuelT. and ChangS.-F. Large-scale visual sentiment ontology and detectors using adjective noun pairs in: Proceedings of the 21st ACM International Conference on Multimedia - MM \u201913 ACM Press Barcelona Spain 2013 pp. 223\u2013232. https:\/\/doi.org\/10.1145\/2502081.2502282.","DOI":"10.1145\/2502081.2502282"},{"key":"e_1_3_2_12_2","unstructured":"ChenT. BorthD. DarrellT. and ChangeS.F. Deepsentibank: Visual sentiment concept classification with deep convolutional neural networks. arXiv preprint arXiv:1410.8586 (2014)."},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/2502069.2502079"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2016.7552961"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2018.2803520"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","unstructured":"E. Ko C. Yoon and E.-Y. Kim Discovering visual features for recognizing user\u2019s sentiments in social images in: 2016 International Conference on Big Data and Smart Computing (BigComp) IEEE Hong Kong China 2016: pp. 378-381. https:\/\/doi.org\/10.1109\/BIGCOMP.2016.7425952.","DOI":"10.1109\/BIGCOMP.2016.7425952"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICIP.2016.7532434"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.12.053"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","unstructured":"SheD. YangJ. ChengM.-M. LaiY.-K. RosinP.L. and WangL. WSCNet: Weakly Supervised Coupled Networks for Visual Sentiment Classification and Detection IEEE Trans Multimedia (2019) 1\u20131. https:\/\/doi.org\/10.1109\/TMM.2019.2939744.","DOI":"10.1109\/TMM.2019.2939744"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2019.8852317"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11063-020-10201-2"},{"key":"e_1_3_2_22_2","unstructured":"RamonJ. and De RaedtL. Multi instance neural networks. In: Proceedings of the ICML-2000 workshop on attribute-value and relational learning (2000) p. 53\u201360."},{"key":"e_1_3_2_23_2","unstructured":"ZhouZ.H. and ZhangM.L. Neural networks for multiinstance learning. In: Proceedings of the International Conference on Intelligent Information Technology Beijing China. (2002) p. 455\u2013459."},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2017.08.026"},{"key":"e_1_3_2_25_2","unstructured":"IlseM. TomczakJ.M. and WellingM. Attention-based deep multiple instance learning arXiv preprint arXiv:1802.04712 (2018)."},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/3366423.3380096"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-006-0029-3"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2014.01.010"},{"key":"e_1_3_2_29_2","unstructured":"SimonyanK. and ZissermanA. Very deep convolutional networks for large-scale image recognition arXiv preprint arXiv:1409.1556 (2014)."},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","unstructured":"SongL. LiuJ. QianB. SunM. YangK. SunM. and AbbasS. A Deep Multi-Modal CNN for Multi-Instance Multi-Label Image Classification IEEE Trans on Image Process 27 (2018) 6025\u20136038. https:\/\/doi.org\/10.1109\/TIP.2018.2864920.","DOI":"10.1109\/TIP.2018.2864920"},{"key":"e_1_3_2_31_2","unstructured":"GlorotX. BordesA. and BengioY. Deep sparse rectifier neural networks. In: Proceedings of the Fourteenth International Conference on Artificial Intelligence And Statistics (2011) p. 315\u2013323."},{"key":"e_1_3_2_32_2","doi-asserted-by":"crossref","unstructured":"FengJ. and ZhouZ.H. Deep MIML network. In: Thirty-First AAAI Conference on Artificial Intelligence. (2017).","DOI":"10.1609\/aaai.v31i1.10890"},{"key":"e_1_3_2_33_2","unstructured":"SabourS. FrosstN. and HintonG.E. Dynamic routing between capsules. In: Advances in Neural Information Processing Systems (2017) 3856\u20133866."},{"key":"e_1_3_2_34_2","unstructured":"RenH. and LuH. Compositional coding capsule network with k-means routing for text classification arXiv preprint arXiv:1810.09177 (2018)."},{"key":"e_1_3_2_35_2","doi-asserted-by":"crossref","unstructured":"YouQ. LuoJ. JinH. and YangJ. Building a large scale dataset for image emotion recognition: The fine print and the benchmark. In: Thirtieth AAAI Conference on Artificial Intelligence. (2016).","DOI":"10.1609\/aaai.v30i1.9987"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","unstructured":"KatsuraiM. and SatohS. Image sentiment analysis using latent correlations among visual textual and sentiment views in: 2016 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP) IEEE Shanghai 2016 2837\u20132841. https:\/\/doi.org\/10.1109\/ICASSP.2016.7472195.","DOI":"10.1109\/ICASSP.2016.7472195"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","unstructured":"PengK.-C. SadovnikA. GallagherA. and ChenT. Where do emotions come from? Predicting the Emotion Stimuli Map in: 2016 IEEE International Conference on Image Processing (ICIP) IEEE Phoenix AZ USA 2016 pp. 614\u2013618. https:\/\/doi.org\/10.1109\/ICIP.2016.7532430.","DOI":"10.1109\/ICIP.2016.7532430"}],"container-title":["Journal of Intelligent &amp; Fuzzy Systems: Applications in Engineering and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-200675","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/JIFS-200675","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-200675","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:41:21Z","timestamp":1777455681000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/JIFS-200675"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9,28]]},"references-count":36,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2020,11,19]]}},"alternative-id":["10.3233\/JIFS-200675"],"URL":"https:\/\/doi.org\/10.3233\/jifs-200675","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,9,28]]}}}