{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T16:12:41Z","timestamp":1761581561194,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":37,"publisher":"ACM","license":[{"start":{"date-parts":[[2017,6,6]],"date-time":"2017-06-06T00:00:00Z","timestamp":1496707200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2017,6,6]]},"DOI":"10.1145\/3078971.3078993","type":"proceedings-article","created":{"date-parts":[[2017,5,25]],"date-time":"2017-05-25T16:27:32Z","timestamp":1495729652000},"page":"347-355","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["AMECON"],"prefix":"10.1145","author":[{"given":"Ines","family":"Chami","sequence":"first","affiliation":[{"name":"Stanford University, Stanford, CA, USA"}]},{"given":"Youssef","family":"Tamaazousti","sequence":"additional","affiliation":[{"name":"CEA LIST, GIF SUR YVETTE, France"}]},{"given":"Herv\u00e9","family":"Le Borgne","sequence":"additional","affiliation":[{"name":"CEA LIST, GIF SUR YVETTE, France"}]}],"member":"320","published-online":{"date-parts":[[2017,6,6]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"crossref","unstructured":"Alessandro Bergamo and Lorenzo Torresani. 2012. Meta-class features for large-scale object categorization on a budget. In CVPR.","DOI":"10.1109\/CVPR.2012.6248040"},{"volume-title":"Natural language processing with Python. -- O'Reilly Media","author":"Bird Steven","key":"e_1_3_2_1_2_1","unstructured":"Steven Bird, Ewan Klein, and Edward Loper. 2009. 
Natural language processing with Python. -- O'Reilly Media, Inc."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"crossref","unstructured":"Ken Chatfield, Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. 2015. Return of the devil in the details: Delving deep into convolutional nets. In BMVC.","DOI":"10.5244\/C.28.6"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.142"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"crossref","unstructured":"J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. 2009. ImageNet: A Large-Scale Hierarchical Image Database. In CVPR.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"e_1_3_2_1_6_1","volume-title":"Snoek","author":"Dong Jianfeng","year":"2016","unstructured":"Jianfeng Dong, Xirong Li, and Cees G. M. Snoek. 2016. Word2VisualVec: Image and Video to Sentence Matching by Visual Feature Prediction. In ArXiv 1604.06838."},{"key":"e_1_3_2_1_7_1","volume-title":"Devise: A deep visual-semantic embedding model. In NIPS.","author":"Frome Andrea","year":"2013","unstructured":"Andrea Frome, Greg S Corrado, Jon Shlens, Samy Bengio, Jeff Dean, Tomas Mikolov, and others. 2013. Devise: A deep visual-semantic embedding model. 
In NIPS."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.81"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-013-0658-4"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2671188.2749403"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1162\/0899766042321814"},{"key":"e_1_3_2_1_12_1","unstructured":"Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR."},{"key":"e_1_3_2_1_13_1","volume-title":"Framing image description as a ranking task: Data, models and evaluation metrics. Journal of Artificial Intelligence Research","author":"Hodosh Micah","year":"2013","unstructured":"Micah Hodosh, Peter Young, and Julia Hockenmaier. 2013. Framing image description as a ranking task: Data, models and evaluation metrics. Journal of Artificial Intelligence Research (2013)."},{"key":"e_1_3_2_1_14_1","unstructured":"Gao Huang, Zhuang Liu, Kilian Q Weinberger, and Laurens van der Maaten. 2016. Densely connected convolutional networks. In ArXiv 1608.06993."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2647868.2654889"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"crossref","unstructured":"Andrej Karpathy and Li Fei-Fei. 2015. Deep visual-semantic alignments for generating image descriptions. 
In CVPR.","DOI":"10.1109\/CVPR.2015.7298932"},{"key":"e_1_3_2_1_17_1","unstructured":"Andrej Karpathy, Armand Joulin, and Li Fei Fei. 2014. Deep fragment embeddings for bidirectional image sentence mapping. In NIPS."},{"key":"e_1_3_2_1_18_1","volume-title":"Zemel","author":"Kiros Ryan","year":"2014","unstructured":"Ryan Kiros, Ruslan Salakhutdinov, and Richard S. Zemel. 2014. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models. In ArXiv 1411.2539."},{"key":"e_1_3_2_1_19_1","unstructured":"B. Klein, G. Lev, G. Sadeh, and L. Wolf. 2015. Fisher vectors derived from hybrid gaussian-laplacian mixture models for image annotation. In CVPR."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2011.6126534"},{"key":"e_1_3_2_1_21_1","volume-title":"Yuille","author":"Mao Junhua","year":"2014","unstructured":"Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, and Alan L. Yuille. 2014. Explain Images with Multimodal Recurrent Neural Networks. In ArXiv 1410.1090."},{"key":"e_1_3_2_1_22_1","unstructured":"Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. 
In NIPS."},{"key":"e_1_3_2_1_23_1","unstructured":"Jiquan Ngiam, Aditya Khosla, Mingyu Kim, Juhan Nam, Honglak Lee, and Andrew Y Ng. 2011. Multimodal deep learning. In ICML."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/78.650093"},{"key":"e_1_3_2_1_25_1","unstructured":"Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. In ICLR."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.895972"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2911996.2912013"},{"key":"e_1_3_2_1_28_1","volume-title":"Herv\u00e9 Le Borgne, and C\u00e9line Hudelot","author":"Tamaazousti Youssef","year":"2017","unstructured":"Youssef Tamaazousti, Herv\u00e9 Le Borgne, and C\u00e9line Hudelot. 2017. MuCaLe-Net: Multi Categorical-Level Networks to Generate More Discriminating Features. In CVPR."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2911996.2912009"},{"key":"e_1_3_2_1_30_1","volume-title":"Adrian Popescu, Etienne Gadeski, Alexandru Ginsca, and C\u00e9line Hudelot.","author":"Tamaazousti Youssef","year":"2017","unstructured":"Youssef Tamaazousti, Herv\u00e9 Le Borgne, Adrian Popescu, Etienne Gadeski, Alexandru Ginsca, and C\u00e9line Hudelot. 2017. Vision-Language Integration using Constrained Local Semantic Features. 
CVIU (2017)."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"crossref","unstructured":"Lorenzo Torresani, Martin Szummer, and Andrew Fitzgibbon. 2010. Efficient Object Category Recognition Using Classemes. In ECCV.","DOI":"10.1007\/978-3-642-15549-9_56"},{"key":"e_1_3_2_1_32_1","volume-title":"Herv\u00e9 Le Borgne, and Michel Crucianu","author":"Nhi Tran Thi Quynh","year":"2016","unstructured":"Thi Quynh Nhi Tran, Herv\u00e9 Le Borgne, and Michel Crucianu. 2016. Aggregating Image and Text Quantized Correlated Components. In CVPR."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2983563.2983570"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"crossref","unstructured":"Jinjun Wang, Jianchao Yang, Kai Yu, Fengjun Lv, Thomas Huang, and Yihong Gong. 2010. Locality-constrained linear coding for image classification. In CVPR.","DOI":"10.1109\/CVPR.2010.5540018"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"crossref","unstructured":"Liwei Wang, Yin Li, and Svetlana Lazebnik. 2016. Learning Deep Structure-Preserving Image-Text Embeddings. In CVPR.","DOI":"10.1109\/CVPR.2016.541"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"crossref","unstructured":"Fei Yan and Krystian Mikolajczyk. 2015. Deep correlation for matching images and text. 
In CVPR.","DOI":"10.1109\/CVPR.2015.7298966"},{"key":"e_1_3_2_1_37_1","volume-title":"From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. ACL","author":"Young Peter","year":"2014","unstructured":"Peter Young, Alice Lai, Micah Hodosh, and Julia Hockenmaier. 2014. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. ACL (2014)."}],"event":{"name":"ICMR '17: International Conference on Multimedia Retrieval","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Bucharest Romania","acronym":"ICMR '17"},"container-title":["Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3078971.3078993","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3078971.3078993","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:03:09Z","timestamp":1750215789000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3078971.3078993"}},"subtitle":["Abstract Meta-Concept Features for Text-Illustration"],"short-title":[],"issued":{"date-parts":[[2017,6,6]]},"references-count":37,"alternative-id":["10.1145\/3078971.3078993","10.1145\/3078971"],"URL":"https:\/\/doi.org\/10.1145\/3078971.3078993","relation":{},"subject":[],"published":{"date-parts":[[2017,6,6]]},"assertion":[{"value":"2017-06-06","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}