{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:20:45Z","timestamp":1750306845942,"version":"3.41.0"},"reference-count":49,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2013,9,1]],"date-time":"2013-09-01T00:00:00Z","timestamp":1377993600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Web"],"published-print":{"date-parts":[[2013,9]]},"abstract":"<jats:p>Image annotation is a process of finding appropriate semantic labels for images in order to obtain a more convenient way for indexing and searching images on the Web. This article proposes a novel method for image annotation based on combining feature-word distributions, which map from visual space to word space, and word-topic distributions, which form a structure to capture label relationships for annotation. We refer to this type of model as Feature-Word-Topic models. The introduction of topics allows us to efficiently take word associations, such as {ocean, fish, coral} or {desert, sand, cactus}, into account for image annotation. Unlike previous topic-based methods, we do not consider topics as joint distributions of words and visual features, but as distributions of words only. Feature-word distributions are utilized to define weights in computation of topic distributions for annotation. By doing so, topic models in text mining can be applied directly in our method. 
Our Feature-word-topic model, which exploits Gaussian Mixtures for feature-word distributions, and probabilistic Latent Semantic Analysis (pLSA) for word-topic distributions, shows that our method is able to obtain promising results in image annotation and retrieval.<\/jats:p>","DOI":"10.1145\/2516633.2516634","type":"journal-article","created":{"date-parts":[[2013,10,1]],"date-time":"2013-10-01T18:14:28Z","timestamp":1380651268000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["A feature-word-topic model for image annotation and retrieval"],"prefix":"10.1145","volume":"7","author":[{"given":"Cam-Tu","family":"Nguyen","sequence":"first","affiliation":[{"name":"National Key Laboratory for Novel Software Technology, Nanjing University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Natsuda","family":"Kaothanthong","sequence":"additional","affiliation":[{"name":"Tohoku University, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Takeshi","family":"Tokuyama","sequence":"additional","affiliation":[{"name":"Tohoku University, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xuan-Hieu","family":"Phan","sequence":"additional","affiliation":[{"name":"University of Engineering and Technology, VNUH, Vietnam"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2013,9,30]]},"reference":[{"volume-title":"Proceedings of Advances in Neural Information Processing Systems (NIPS'03)","author":"Andrews 
S.","key":"e_1_2_1_1_1"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/860435.860460"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1214\/07-AOAS114"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.5555\/944919.944937"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1273496.1273510"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2007.61"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1348246.1348248"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-007-9039-3"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(96)00034-3"},{"volume-title":"Proceedings of the 7th European Conference on Computer Vision (ECCV'02)","author":"Duygulu P.","key":"e_1_2_1_10_1"},{"volume-title":"Proceedings of the International Conference on Computer Vision and Pattern Recognition (CVPR'04)","author":"Feng S. L.","key":"e_1_2_1_11_1"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1099554.1099591"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.5555\/2283516.2283613"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1386352.1386399"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007617005950"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1282280.1282283"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1386352.1386395"},{"volume-title":"Proceedings of the 12th Annual ACM International Conference on Multimedia.","author":"Jeon J.","key":"e_1_2_1_18_1"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1027527.1027732"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1101149.1101305"},{"key":"e_1_2_1_21_1","unstructured":"Lavrenko V. Manmatha R. and Jeon J. 2003. A model for learning the semantics of pictures. In Advances in Neural Information Processing Systems. 
MIT Press."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1646396.1646408"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-007-5018-6"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2007.10.018"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2006.68"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-88690-7_24"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1027527.1027608"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2007.1097"},{"key":"e_1_2_1_29_1","doi-asserted-by":"crossref","unstructured":"M\u00fcller H. Clough P. Deselaers T. and Caputo B. 2010. ImageCLEF. Experimental Evaluation of Visual Information Retrieval. 
Springer.","DOI":"10.1007\/978-3-642-15181-1"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1871437.1871652"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1568292.1568295"},{"volume-title":"Proceedings of the CLEF Conference on Multilingual and Multimodal Information Access Evaluation.","author":"Nowak S.","key":"e_1_2_1_32_1"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2010.27"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1367497.1367510"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1291233.1291245"},{"volume-title":"Kernel Methods: Support Vector Learning","year":"1999","author":"Sch\u00f6lkopf B.","key":"e_1_2_1_36_1"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1178677.1178722"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.895972"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1561\/1500000014"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-00958-7_15"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/1666420.1666446"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2001.990449"},{"volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1903--1910","author":"Wang C.","key":"e_1_2_1_43_1"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1282280.1282343"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.5555\/1005332.1016791"},{"volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08)","author":"Zha Z.-J.","key":"e_1_2_1_46_1"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/1835804.1835930"},{"key":"e_1_2_1_48_1","unstructured":"Zhang Z. and Zhang R. 2009. Multimedia Data Mining. 
Chapman & Hall\/CRC Press."},{"key":"e_1_2_1_49_1","first-page":"1609","article-title":"Multi-instance multi-label learning with application to scene classification","volume":"19","author":"Zhou Z.-H.","year":"2006","journal-title":"Advances in Neural Information Processing Systems"}],"container-title":["ACM Transactions on the Web"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2516633.2516634","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2516633.2516634","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T08:10:12Z","timestamp":1750234212000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2516633.2516634"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,9]]},"references-count":49,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2013,9]]}},"alternative-id":["10.1145\/2516633.2516634"],"URL":"https:\/\/doi.org\/10.1145\/2516633.2516634","relation":{},"ISSN":["1559-1131","1559-114X"],"issn-type":[{"type":"print","value":"1559-1131"},{"type":"electronic","value":"1559-114X"}],"subject":[],"published":{"date-parts":[[2013,9]]},"assertion":[{"value":"2011-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-03-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-09-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}