{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T20:45:10Z","timestamp":1776113110003,"version":"3.50.1"},"reference-count":86,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2011,2,1]],"date-time":"2011-02-01T00:00:00Z","timestamp":1296518400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2011,2]]},"abstract":"<jats:p>\n            Active learning is a machine learning technique that selects the most informative samples for labeling and uses them as training data. It has been widely explored in multimedia research community for its capability of reducing human annotation effort. In this article, we provide a survey on the efforts of leveraging active learning in multimedia annotation and retrieval. We mainly focus on two application domains: image\/video annotation and content-based image retrieval. We first briefly introduce the principle of active learning and then we analyze the sample selection criteria. We categorize the existing sample selection strategies used in multimedia annotation and retrieval into five criteria:\n            <jats:italic>risk reduction<\/jats:italic>\n            ,\n            <jats:italic>uncertainty<\/jats:italic>\n            ,\n            <jats:italic>diversity<\/jats:italic>\n            ,\n            <jats:italic>density<\/jats:italic>\n            and\n            <jats:italic>relevance<\/jats:italic>\n            . We then introduce several classification models used in active learning-based multimedia annotation and retrieval, including semi-supervised learning, multilabel learning and multiple instance learning. 
We also provide a discussion on several future trends in this research direction. In particular, we discuss cost analysis of human annotation and large-scale interactive multimedia annotation.\n          <\/jats:p>","DOI":"10.1145\/1899412.1899414","type":"journal-article","created":{"date-parts":[[2012,10,12]],"date-time":"2012-10-12T20:56:02Z","timestamp":1350075362000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":183,"title":["Active learning in multimedia annotation and retrieval"],"prefix":"10.1145","volume":"2","author":[{"given":"Meng","family":"Wang","sequence":"first","affiliation":[{"name":"Microsoft Research Asia, Beijing, China"}]},{"given":"Xian-Sheng","family":"Hua","sequence":"additional","affiliation":[{"name":"Microsoft Research Asia, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2011,2,24]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/985692.985733"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1124772.1124782"},{"key":"e_1_2_1_3_1","volume-title":"Recaptcha: Human-based character recognition via web security measures. Science.","author":"Ahn L.","year":"2008"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the Neural Information Processing Systems.","author":"Andrews S."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1022821128753"},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of International Workshop on Content-Based Multimedia Indexing.","author":"Ayache S."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1631272.1631355"},{"key":"e_1_2_1_8_1","first-page":"1","article-title":"A maximum entropy approach to natural language processing","volume":"22","author":"Berger A.","year":"1996","journal-title":"Computat. 
Linguistics"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/279943.279962"},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the International Conference on Machine Learning.","author":"Brinker K.","year":"2003"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of Neural Information Processing Systems.","author":"Cauwenberghs G."},{"key":"e_1_2_1_12_1","doi-asserted-by":"crossref","unstructured":"Chapelle O. Zien A. and Sch\u00f6lkopf B. 2006. Semi-Supervised Learning. MIT Press.","DOI":"10.7551\/mitpress\/9780262033589.001.0001"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1101149.1101342"},{"key":"e_1_2_1_14_1","doi-asserted-by":"crossref","unstructured":"Cohen D. A. Ghahramani Z. and Jordan M. I. 1996. Active learning with statistical models. J. Artif. Intell. 
Res.","DOI":"10.21236\/ADA295617"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1022673506211"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-88682-2_8"},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the International Conference on Machine Learning.","author":"Dagan I."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/11788034_13"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(96)00034-3"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1117\/12.290336"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007330508534"},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the International Conference on Multimedia & Expo.","author":"Geng B."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1027527.1027664"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1039470.1039483"},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the International Conference on Acoustics, Speech and Signal Processing.","author":"Hakkani"},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the International Conference on Artificial Intelligence and Statistics.","author":"Hanneke S."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1026711.1026715"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1143844.1143897"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2005.44"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390156.1390208"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1459359.1459379"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2008.916364"},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the International Conference on Computer Vision and Pattern Recognition.","author":"Jain P."},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 
International Conference on Computer Vision and Pattern Recognition.","author":"Joshi A. J."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1038\/nature02236"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2007.70847"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the International Conference on Multimedia & Expo.","author":"Lin C."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00530-006-0032-2"},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the International Conference on Machine Learning.","author":"Maronand O."},{"key":"e_1_2_1_40_1","doi-asserted-by":"crossref","unstructured":"Mitchell T. 1982. Generalization as search. Artif. Intell.","DOI":"10.1016\/0004-3702(82)90040-6"},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the International Conference on Machine Learning.","author":"Muslea I."},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the International Conference on Image Processing.","author":"Naphade M."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/1027527.1027680"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1015330.1015349"},{"key":"e_1_2_1_45_1","unstructured":"Olsson F. 2009. A literature survey of active machine learning in the context of natural language processing. SICS Tech. rep. Swedish Institute of Computer Science."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/1143844.1143930"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-006-0043-1"},{"key":"e_1_2_1_48_1","doi-asserted-by":"crossref","unstructured":"Parzen E. 1962. 
On the estimation of a probability density function and the mode. Ann. Math. Stat. 33.","DOI":"10.1214\/aoms\/1177704472"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/1291233.1291245"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2008.218"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2006.211"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1137\/1026034"},{"key":"e_1_2_1_53_1","volume-title":"Proceedings of the International Conference on Machine Learning.","author":"Roy N.","year":"2001"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1006\/jvci.1999.0413"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/76.718510"},{"key":"e_1_2_1_56_1","volume-title":"Proceedings of the International Conference on Computer Vision and Pattern Recognition.","author":"Sahbi H."},{"key":"e_1_2_1_57_1","unstructured":"Settles B. 2009. Active learning literature survey. Computer Sciences Tech. rep. 
University of Wisconsin-Madison."},{"key":"e_1_2_1_58_1","volume-title":"Proceedings of the NIPS Workshop on Cost-Sensitive Learning.","author":"Settles B."},{"key":"e_1_2_1_59_1","volume-title":"Proceedings of Neural Information Processing Systems.","author":"Settles B."},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/130385.130417"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.895972"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/1101826.1101844"},{"key":"e_1_2_1_63_1","volume-title":"Proceedings of the CVPR Workshop.","author":"Sorokin A."},{"key":"e_1_2_1_64_1","volume-title":"Proceedings of the International Conference on Image Processing.","author":"Sychay G."},{"key":"e_1_2_1_65_1","volume-title":"Proceedings of the International Conference on Multimedia & Expo.","author":"Tang J."},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/500141.500159"},{"key":"e_1_2_1_67_1","volume-title":"Proceedings of the International Conference on Machine Learning.","author":"Tong S."},{"key":"e_1_2_1_68_1","volume-title":"Proceedings of the TRECVID Workshop.","author":"Vendrig J."},{"key":"e_1_2_1_69_1","volume-title":"Proceedings of the Neural Information Processing Systems.","author":"Vijayanarasimhan S."},{"key":"e_1_2_1_70_1","volume-title":"Proceedings of the Symposium on Computer Vision and Pattern Recognition.","author":"Vijayanarasimhan S."},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1145\/1101149.1101341"},{"key":"e_1_2_1_72_1","first-page":"4","article-title":"Interactive video annotation by multi-concept multi-modality active learning","volume":"1","author":"Wang M.","year":"2007","journal-title":"Int. J. Seman. Comput."},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2009.2012919"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2009.2017400"},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390156.1390301"},{"key":"e_1_2_1_76_1","volume-title":"Proceedings of the International Conference on Multimedia & Expo.","author":"Wu Y."},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1109\/MMUL.2009.28"},{"key":"e_1_2_1_78_1","volume-title":"Proceedings of the International Conference on Computer Vision.","author":"Yan R."},{"key":"e_1_2_1_79_1","volume-title":"Proceedings of the International Conference on Multimedia & Expo.","author":"Yang J."},{"key":"e_1_2_1_80_1","volume-title":"Proceedings of the International Conference on Multimedia & Expo.","author":"Yuan J."},{"key":"e_1_2_1_81_1","volume-title":"Proceedings of the International Conference on Image Processing.","author":"Zhang C."},{"key":"e_1_2_1_82_1","volume-title":"Proceedings of the Neural Information Processing Systems.","author":"Zhang Q."},{"key":"e_1_2_1_83_1","volume-title":"Proceedings of the International Conference on Multimedia & Expo.","author":"Zhang X."},{"key":"e_1_2_1_84_1","doi-asserted-by":"crossref","unstructured":"Zhu X. 2009. Semi-supervised learning literature survey. Tech. rep. 
(1530) Wisconsin-Madison.","DOI":"10.1007\/978-3-031-01548-9_7"},{"key":"e_1_2_1_85_1","volume-title":"Proceedings of the International Conference on Machine Learning.","author":"Zhu X."},{"key":"e_1_2_1_86_1","volume-title":"Proceedings of the ICML 2003 Workshop on the Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining.","author":"Zhu X."}],"container-title":["ACM Transactions on Intelligent Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1899412.1899414","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1899412.1899414","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T10:59:46Z","timestamp":1750244386000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1899412.1899414"}},"subtitle":["A survey"],"short-title":[],"issued":{"date-parts":[[2011,2]]},"references-count":86,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2011,2]]}},"alternative-id":["10.1145\/1899412.1899414"],"URL":"https:\/\/doi.org\/10.1145\/1899412.1899414","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"value":"2157-6904","type":"print"},{"value":"2157-6912","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,2]]},"assertion":[{"value":"2010-02-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-08-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-02-24","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}