{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T17:57:40Z","timestamp":1769709460707,"version":"3.49.0"},"reference-count":53,"publisher":"Association for Computing Machinery (ACM)","issue":"1s","license":[{"start":{"date-parts":[[2019,1,24]],"date-time":"2019-01-24T00:00:00Z","timestamp":1548288000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100018619","name":"National Program for Support of Top-notch Young Professionals","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100018619","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Lenovo Outstanding Young Scientists Program"},{"DOI":"10.13039\/501100012152","name":"National Postdoctoral Program for Innovative Talents","doi-asserted-by":"crossref","award":["BX201700255"],"award-info":[{"award-number":["BX201700255"]}],"id":[{"id":"10.13039\/501100012152","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61532018"],"award-info":[{"award-number":["61532018"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"National Program for Special Support of Eminent Professionals"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2019,1,31]]},"abstract":"<jats:p>Scene classification is a challenging problem. Compared with object images, scene images are more abstract, as they are composed of objects. Object and scene images have different characteristics with different scales and composition structures. 
How to effectively integrate local mid-level semantic representations that include both object and scene concepts is an important open question for scene classification. In this article, the idea of a shared codebook is introduced by integrating deep learning, concept feature, and local feature encoding techniques. More specifically, the shared local feature codebook is generated from the combined ImageNet1K and Places365 concepts (Mixed1365) using convolutional neural networks. As the Mixed1365 features cover the semantic information of both object and scene concepts, we can extract from them a shared codebook that contains only a subset of the whole 1,365 concepts, with the same codebook size. The shared codebook not only provides complementary representations without additional codebook training but can also be adaptively extracted for different scene classification tasks. A method of fusing the features encoded with both the original codebook and the shared codebook is proposed for scene classification. In this way, more comprehensive and representative image features can be generated for classification. Extensive experiments conducted on two public datasets validate the effectiveness of the proposed method. 
In addition, some useful observations are reported that show the advantage of the shared codebook.<\/jats:p>","DOI":"10.1145\/3231738","type":"journal-article","created":{"date-parts":[[2019,1,28]],"date-time":"2019-01-28T14:01:39Z","timestamp":1548684099000},"page":"1-17","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":22,"title":["Deep Patch Representations with Shared Codebook for Scene Classification"],"prefix":"10.1145","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1596-4326","authenticated-orcid":false,"given":"Shuqiang","family":"Jiang","sequence":"first","affiliation":[{"name":"Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0634-6075","authenticated-orcid":false,"given":"Gongwei","family":"Chen","sequence":"additional","affiliation":[{"name":"Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xinhang","family":"Song","sequence":"additional","affiliation":[{"name":"Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Linhu","family":"Liu","sequence":"additional","affiliation":[{"name":"Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2019,1,24]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2016.2555080"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2014.2313111"},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS\u201910)","author":"Bo L."},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS\u201910)","author":"Bo Liefeng","year":"2009"},{"key":"e_1_2_1_5_1","first-page":"7","article-title":"An object-level high-order contextual descriptor based on semantic, spatial, and scale cues","volume":"45","author":"Cao X.","year":"2015","journal-title":"IEEE Trans. Cybernet."},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201912)","author":"Cinbis R. 
G."},{"key":"e_1_2_1_7_1","volume-title":"Proceedings of the European Conference on Computer Vision Workshop on Statistical Learning in Computer Vision (ECCV\u201904)","author":"Csurka Gabriella","year":"2004"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298916"},{"key":"e_1_2_1_9_1","volume-title":"Dixit and Nuno Vasconcelos","author":"Mandar","year":"2016"},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS\u201913)","author":"Doersch Carl"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML\u201914)","author":"Donahue Jeff","year":"2014"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2005.16"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the Annual European Conference on Computer Vision (ECCV\u201914)","author":"Gong Y."},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916)","author":"Herranz L."},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201910)","author":"Jegou H."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2013.124"},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS\u201912)","author":"Krizhevsky Alex"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33765-9_26"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2006.68"},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS\u201910)","author":"Li L. 
J."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2012.2194993"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-013-0660-x"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2018.2839916"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:VISI.0000029664.99615.94"},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201912)","author":"Niu Z."},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201907)","author":"Perronnin Florent"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.5555\/1888089.1888101"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201909)","author":"Quattoni A."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2007.900138"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206826"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2011.175"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.69"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the International Conference on Learning Representations (ICLR\u201915)","author":"Simonyan K."},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201915)","author":"Song 
Xinhang","year":"2015"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2017.2686017"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.5555\/850924.851590"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-88690-7_52"},{"key":"e_1_2_1_39_1","doi-asserted-by":"crossref","volume-title":"A Semantic Typicality Measure for Natural Scene Categorization","author":"Vogel Julia","DOI":"10.1007\/978-3-540-28649-3_24"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-006-8614-1"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2010.5540018"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2700292"},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS\u201907)","author":"Wang X."},{"key":"e_1_2_1_44_1","volume-title":"Proceedings of the International Conference on Machine Learning (ICML\u201913)","author":"Wang Xinggang","year":"2013"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.152"},{"key":"e_1_2_1_46_1","volume-title":"Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201910)","author":"Xiao J."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2015.2511543"},{"key":"e_1_2_1_48_1","volume-title":"Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201909)","author":"Yang Jianchao"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2015.7301274"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2014.2330794"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2016.2590321"},{"key":"e_1_2_1_52_1","volume-title":"Places: An image database for deep scene understanding. 
arXiv preprint arXiv:1610.02055","author":"Zhou Bolei","year":"2016"},{"key":"e_1_2_1_53_1","volume-title":"Annual Conference on Neural Information Processing Systems (NIPS\u201914)","author":"Zhou Bolei","year":"2014"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3231738","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3231738","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:08:16Z","timestamp":1750208896000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3231738"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,1,24]]},"references-count":53,"journal-issue":{"issue":"1s","published-print":{"date-parts":[[2019,1,31]]}},"alternative-id":["10.1145\/3231738"],"URL":"https:\/\/doi.org\/10.1145\/3231738","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,1,24]]},"assertion":[{"value":"2017-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-06-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-01-24","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}