{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:24:34Z","timestamp":1750307074583,"version":"3.41.0"},"reference-count":49,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2012,11,1]],"date-time":"2012-11-01T00:00:00Z","timestamp":1351728000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100002855","name":"Ministry of Science and Technology of the People's Republic of China","doi-asserted-by":"publisher","award":["2006AA01A115"],"award-info":[{"award-number":["2006AA01A115"]}],"id":[{"id":"10.13039\/501100002855","id-type":"DOI","asserted-by":"publisher"}]},{"name":"CSIDM Project","award":["CSIDM-200803"],"award-info":[{"award-number":["CSIDM-200803"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2012,11]]},"abstract":"<jats:p>In this work, we investigate how to reassign the fully annotated labels at image level to those contextually derived semantic regions, namely Label-to-Region (L2R), in a collective manner. Given a set of input images with label annotations, the basic idea of our approach to L2R is to first discover the patch correspondence across images, and then propagate the common labels shared in image pairs to these correlated patches. Specially, our approach consists of following aspects. First, each of the input images is encoded as a Bag-of-Hierarchical-Patch (BOP) for capturing the rich cues at variant scales, and the individual patches are expressed by patch-level feature descriptors. 
Second, we present a sparse representation formulation for discovering how well an image or a semantic region can be robustly reconstructed by all the other image patches from the input image set. The underlying philosophy of our formulation is that an image region can be sparsely reconstructed with the image patches belonging to the other images with common labels, while the robustness in label propagation across images requires that these selected patches come from very few images. This preference of being sparse at both patch and image level is named <jats:italic>bi-layer sparsity prior<\/jats:italic>. Meanwhile, we enforce the preference of choosing larger-size patches in reconstruction, referred to as <jats:italic>continuity-biased prior<\/jats:italic> in this work, which may further enhance the reliability of L2R assignment. Finally, we harness the reconstruction coefficients to propagate the image labels to the matched patches, and fuse the propagation results over all patches to finalize the L2R task. As a by-product, the proposed continuity-biased bi-layer sparse representation formulation can be naturally applied to perform image annotation on new testing images. 
Extensive experiments on three public image datasets clearly demonstrate the effectiveness of our proposed framework in both L2R assignment and image annotation.<\/jats:p>","DOI":"10.1145\/2379790.2379792","type":"journal-article","created":{"date-parts":[[2012,12,11]],"date-time":"2012-12-11T13:13:42Z","timestamp":1355231622000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Label-to-region with continuity-biased bi-layer sparsity priors"],"prefix":"10.1145","volume":"8","author":[{"given":"Xiaobai","family":"Liu","sequence":"first","affiliation":[{"name":"Huazhong University of Science and Technology, China and National University of Singapore, Singapore"}]},{"given":"Shuicheng","family":"Yan","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore"}]},{"given":"Bin","family":"Cheng","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore"}]},{"given":"Jinhui","family":"Tang","sequence":"additional","affiliation":[{"name":"Nanjing University of Science and Technology, China"}]},{"given":"Tat-Sheng","family":"Chua","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore"}]},{"given":"Hai","family":"Jin","sequence":"additional","affiliation":[{"name":"Huazhong University of Science and Technology, China"}]}],"member":"320","published-online":{"date-parts":[[2012,11,30]]},"reference":[{"volume-title":"Nonlinear Programming","author":"Bertsekas D.","key":"e_1_2_1_1_1","unstructured":"Bertsekas , D. 1999. Nonlinear Programming . Athena Scientific . Bertsekas, D. 1999. Nonlinear Programming. Athena Scientific."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpa.20124"},{"volume-title":"Proceedings of the IEEE International Conference on Computer Vision. 1--8.","author":"Cao L.","key":"e_1_2_1_3_1","unstructured":"Cao , L. and Li , F . 2007. 
Spatially coherent latent topic model for concurrent object segmentation and classification . In Proceedings of the IEEE International Conference on Computer Vision. 1--8. Cao, L. and Li, F. 2007. Spatially coherent latent topic model for concurrent object segmentation and classification. In Proceedings of the IEEE International Conference on Computer Vision. 1--8."},{"volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1--8.","author":"Chen Y.","key":"e_1_2_1_4_1","unstructured":"Chen , Y. , Zhu , L. , Yuille , A. , and Zhang , H . 2008. Unsupervised learning of probabilistic object models (poms) for object classification, segmentation and recognition . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1--8. Chen, Y., Zhu, L., Yuille, A., and Zhang, H. 2008. Unsupervised learning of probabilistic object models (poms) for object classification, segmentation and recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1--8."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1646396.1646452"},{"key":"e_1_2_1_6_1","unstructured":"Comite F. Gilleron R. and Tommasi M . 2003 . Learning multi-label altenating decision tree from texts and data. In Proceedings of the Machine Learning and Data Mining in Pattern Recognition Lecture Notes in Computer Science vol. 2734 . 251--274. Comite F. Gilleron R. and Tommasi M. 2003. Learning multi-label altenating decision tree from texts and data. In Proceedings of the Machine Learning and Data Mining in Pattern Recognition Lecture Notes in Computer Science vol. 2734. 251--274."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1348246.1348248"},{"volume-title":"Proceedings of the Advances in Neural Information Processing Systems. 681--687","author":"Elisseef A.","key":"e_1_2_1_8_1","unstructured":"Elisseef , A. and Weston , J . 2001. A kernel method for multi-labelled classification . 
In Proceedings of the Advances in Neural Information Processing Systems. 681--687 . Elisseef, A. and Weston, J. 2001. A kernel method for multi-labelled classification. In Proceedings of the Advances in Neural Information Processing Systems. 681--687."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.5555\/1046920.1194907"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:VISI.0000022288.19776.77"},{"volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1002--1009","author":"Feng S.","key":"e_1_2_1_11_1","unstructured":"Feng , S. , Manmatha , R. , and Lavrenko , V . 2004. Multiple bernoulli relevance models for image and video annotation . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1002--1009 . Feng, S., Manmatha, R., and Lavrenko, V. 2004. Multiple bernoulli relevance models for image and video annotation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1002--1009."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2005.142"},{"volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 678--683","author":"Forsyth D.","key":"e_1_2_1_13_1","unstructured":"Forsyth , D. and Fleck , M . 1997. Body plans . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 678--683 . Forsyth, D. and Fleck, M. 1997. Body plans. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 678--683."},{"key":"e_1_2_1_14_1","doi-asserted-by":"crossref","first-page":"397","DOI":"10.1080\/10618600.1998.10474784","article-title":"Penalized regression: The bridge versus the lasso","volume":"7","author":"Fu W.","year":"1998","unstructured":"Fu , W. 1998 . Penalized regression: The bridge versus the lasso . J. Comput. Graph. Statist. 7 , 397 -- 416 . Fu, W. 1998. Penalized regression: The bridge versus the lasso. J. Comput. Graph. Statist. 
7, 397--416.","journal-title":"J. Comput. Graph. Statist."},{"volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1--8.","author":"Galleguillos C.","key":"e_1_2_1_15_1","unstructured":"Galleguillos , C. , Rabinovich , A. , and Belongie , S . 2008. Object categorization using co-occurrence, location and appearance . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1--8. Galleguillos, C., Rabinovich, A., and Belongie, S. 2008. Object categorization using co-occurrence, location and appearance. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1--8."},{"volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1030--1037","author":"Gu C.","key":"e_1_2_1_16_1","unstructured":"Gu , C. , Lim , J. , Arbelaez , P. , and Malik , J . 2009. Recognition using regions . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1030--1037 . Gu, C., Lim, J., Arbelaez, P., and Malik, J. 2009. Recognition using regions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1030--1037."},{"volume-title":"Proceedings of the IEEE Workshop on Contentbased Access of Image and Video Libraries. 18--25","author":"Haering N.","key":"e_1_2_1_17_1","unstructured":"Haering , N. , Myles , Z. , and Lobo , N . 1997. Locating dedicuous trees . In Proceedings of the IEEE Workshop on Contentbased Access of Image and Video Libraries. 18--25 . Haering, N., Myles, Z., and Lobo, N. 1997. Locating dedicuous trees. In Proceedings of the IEEE Workshop on Contentbased Access of Image and Video Libraries. 
18--25."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553431"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/860435.860459"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1027527.1027732"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2006.90"},{"volume-title":"Proceedings of the Advances in Neural Information Processing Systems. 553--560","author":"Lavrenko V.","key":"e_1_2_1_22_1","unstructured":"Lavrenko , V. , Manmatha , R. , and Jeon , J . 2004. A model for learning the semantics of pictures . In Proceedings of the Advances in Neural Information Processing Systems. 553--560 . Lavrenko, V., Manmatha, R., and Jeon, J. 2004. A model for learning the semantics of pictures. In Proceedings of the Advances in Neural Information Processing Systems. 553--560."},{"volume-title":"Proceedings of the ECCV Workshop on Statistical Learning in Computer Vision. 17--32","author":"Leibe B.","key":"e_1_2_1_23_1","unstructured":"Leibe , B. , Leonardis , A. , and Schiele , B . 2004. Combined object categorization and segmentation with an implicit shape model . In Proceedings of the ECCV Workshop on Statistical Learning in Computer Vision. 17--32 . Leibe, B., Leonardis, A., and Schiele, B. 2004. Combined object categorization and segmentation with an implicit shape model. In Proceedings of the ECCV Workshop on Statistical Learning in Computer Vision. 17--32."},{"volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2036--2043","author":"Li L.","key":"e_1_2_1_24_1","unstructured":"Li , L. , Socher , R. , and Li , F . 2009. Towards total scene understanding: Classification, annotation and segmentation in an automatic framework . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2036--2043 . Li, L., Socher, R., and Li, F. 2009. 
Towards total scene understanding: Classification, annotation and segmentation in an automatic framework. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2036--2043."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2010.147"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1291233.1291380"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1631272.1631291"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1873951.1873968"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:VISI.0000029664.99615.94"},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.","author":"Nesterov Y.","year":"2007","unstructured":"Nesterov , Y. 2007 . Gradient methods for minimizing composite objective function . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Nesterov, Y. 2007. Gradient methods for minimizing composite objective function. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0042-6989(97)00169-7"},{"volume-title":"Proceedings of the IEEE International Conference on Computer Vision. 1--8.","author":"Rabinovich A.","key":"e_1_2_1_32_1","unstructured":"Rabinovich , A. , Vedaldi , A. , Galleguillos , C. , Wiewiora , E. , and Belongie , S . 2007. Objects in context . In Proceedings of the IEEE International Conference on Computer Vision. 1--8. Rabinovich, A., Vedaldi, A., Galleguillos, C., Wiewiora, E., and Belongie, S. 2007. Objects in context. In Proceedings of the IEEE International Conference on Computer Vision. 
1--8."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2006.326"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2005.254"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1873951.1873956"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1007\/11744023_1"},{"volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 18--20","author":"Singhal A.","key":"e_1_2_1_37_1","unstructured":"Singhal , A. , Luo , J. , and Zhu , W . 2003. Probabilistic spatial context models for scene content understanding . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 18--20 . Singhal, A., Luo, J., and Zhu, W. 2003. Probabilistic spatial context models for scene content understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 18--20."},{"volume-title":"Proceedings of the IEEE International Workshop on Content-Based Access to Image and Video Databases. 42--51","author":"Szummer M.","key":"e_1_2_1_38_1","unstructured":"Szummer , M. and Picard , R . 1998. Indoor-outdoor image classification . In Proceedings of the IEEE International Workshop on Content-Based Access to Image and Video Databases. 42--51 . Szummer, M. and Picard, R. 1998. Indoor-outdoor image classification. In Proceedings of the IEEE International Workshop on Content-Based Access to Image and Video Databases. 42--51."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.2517-6161.1996.tb02080.x"},{"key":"e_1_2_1_40_1","unstructured":"Tseng P. 2008. On accelerated proximal gradient methods for convex-concave optimization. submitted to SIAM J. Optimiz. Tseng P. 2008. On accelerated proximal gradient methods for convex-concave optimization. submitted to SIAM J. Optimiz."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2005.148"},{"key":"e_1_2_1_42_1","unstructured":"Wright J. Ganesh A. Rao S. 
Peng Y. and Ma Y. 2009a. Robust principal component analysis: exact recovery of corrupted low-rank matrices via convex optimization. J. ACM. Wright J. Ganesh A. Rao S. Peng Y. and Ma Y. 2009a. Robust principal component analysis: exact recovery of corrupted low-rank matrices via convex optimization. J. ACM."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2008.79"},{"volume-title":"Proceedings of the SIAM International Conference on Data Mining. 792--801","author":"Yan S.","key":"e_1_2_1_44_1","unstructured":"Yan , S. and Wang , H . 2009. Semi-supervised learning by sparse representation . In Proceedings of the SIAM International Conference on Data Mining. 792--801 . Yan, S. and Wang, H. 2009. Semi-supervised learning by sparse representation. In Proceedings of the SIAM International Conference on Data Mining. 792--801."},{"volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.","author":"Yang J.","key":"e_1_2_1_45_1","unstructured":"Yang , J. , Yu , K. , Gong , Y. , and Huang , T . 2000. Linear spatial pyramid matching using sparse coding for image classification . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Yang, J., Yu, K., Gong, Y., and Huang, T. 2000. Linear spatial pyramid matching using sparse coding for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/1291233.1291379"},{"volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.","author":"Yuan X.","key":"e_1_2_1_47_1","unstructured":"Yuan , X. and Yan , S . 2010. Visual classification with multi-task joint sparse representation . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Yuan, X. and Yan, S. 2010. Visual classification with multi-task joint sparse representation. 
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition."},{"key":"e_1_2_1_48_1","unstructured":"Zhang J. 2006. A probabilistic framework for multi-task learning. Tech. rep. CMU-LTI-06-006. Zhang J. 2006. A probabilistic framework for multi-task learning. Tech. rep. CMU-LTI-06-006."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2006.12.019"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2379790.2379792","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2379790.2379792","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T09:33:57Z","timestamp":1750239237000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2379790.2379792"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,11]]},"references-count":49,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2012,11]]}},"alternative-id":["10.1145\/2379790.2379792"],"URL":"https:\/\/doi.org\/10.1145\/2379790.2379792","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"type":"print","value":"1551-6857"},{"type":"electronic","value":"1551-6865"}],"subject":[],"published":{"date-parts":[[2012,11]]},"assertion":[{"value":"2011-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-06-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-11-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}