{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:12:19Z","timestamp":1750306339931,"version":"3.41.0"},"reference-count":63,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2016,7,11]],"date-time":"2016-07-11T00:00:00Z","timestamp":1468195200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2016,7,11]]},"abstract":"<jats:p>As humans, we regularly interpret scenes based on how objects are <jats:italic>related<\/jats:italic>, rather than based on the objects themselves. For example, we see a person <jats:italic>riding<\/jats:italic> an object X or a plank <jats:italic>bridging<\/jats:italic> two objects. Current methods provide limited support to search for content based on such relations. We present <jats:sc>raid<\/jats:sc>, a relation-augmented image descriptor that supports queries based on inter-region relations. The key idea of our descriptor is to encode region-to-region relations as the spatial distribution of point-to-region relationships between two image regions. <jats:sc>raid<\/jats:sc> allows sketch-based retrieval and requires minimal training data, thus making it suited even for querying uncommon relations. We evaluate the proposed descriptor by querying into large image databases and successfully extract non-trivial images demonstrating complex inter-region relations, which are easily missed or erroneously classified by existing methods. 
We assess the robustness of <jats:sc>raid<\/jats:sc> on multiple datasets even when the region segmentation is computed automatically or very noisy.<\/jats:p>","DOI":"10.1145\/2897824.2925939","type":"journal-article","created":{"date-parts":[[2016,7,11]],"date-time":"2016-07-11T16:04:33Z","timestamp":1468253073000},"page":"1-12","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["RAID"],"prefix":"10.1145","volume":"35","author":[{"given":"Paul","family":"Guerrero","sequence":"first","affiliation":[{"name":"University College London"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Niloy J.","family":"Mitra","sequence":"additional","affiliation":[{"name":"University College London"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Peter","family":"Wonka","sequence":"additional","affiliation":[{"name":"KAUST"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2016,7,11]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.895972"},{"key":"e_1_2_2_2_1","first-page":"154","article-title":"Content-Based Image Retrieval by Combining Structural and Content Based Features","volume":"2","author":"Badadapure P. R.","year":"2013","unstructured":"Badadapure , P. R. 2013 . Content-Based Image Retrieval by Combining Structural and Content Based Features . International Journal of Engineering and Advanced Technology 2 , 4, 154 -- 156 . Badadapure, P. R. 2013. Content-Based Image Retrieval by Combining Structural and Content Based Features. 
International Journal of Engineering and Advanced Technology 2, 4, 154--156.","journal-title":"International Journal of Engineering and Advanced Technology"},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.993558"},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2019627.2019639"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2004.06.013"},{"key":"e_1_2_2_6_1","unstructured":"2015. Boost polygon version 1.58. www.boost.org. 2015. Boost polygon version 1.58. www.boost.org."},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2011.5995460"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ITCC.2005.3"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/952532.952682"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/2919332.2919813"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1618452.1618470"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661229.2661239"},{"key":"e_1_2_2_13_1","unstructured":"Chen L. Papandreou G. Kokkinos I. Murphy K. and Yuille A. L. 2015. Semantic image segmentation with deep convolutional nets and fully connected crfs. ICLR (Nov.). Chen L. Papandreou G. Kokkinos I. Murphy K. and Yuille A. L. 2015. Semantic image segmentation with deep convolutional nets and fully connected crfs. ICLR (Nov.)."},{"volume-title":"ICCV Workshops, 1282--1289","author":"Choi W.","key":"e_1_2_2_14_1","unstructured":"Choi , W. , Shahid , K. , and Savarese , S . 2009. What are they doing?: Collective activity classification using spatio-temporal relationship among people . In ICCV Workshops, 1282--1289 . Choi, W., Shahid, K., and Savarese, S. 2009. What are they doing?: Collective activity classification using spatio-temporal relationship among people. 
In ICCV Workshops, 1282--1289."},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/MMCS.1997.609637"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1572741.1572747"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1572741.1572747"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCG.2011.67"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2185520.2185527"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2010324.1964929"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366145.2366154"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2816795.2818057"},{"key":"e_1_2_2_23_1","volume-title":"International Conference on, 139--142","author":"Flusser J.","year":"1992","unstructured":"Flusser , J. 1992 . Invariant shape description and measure of object similarity. In Image Processing and its Applications, 1992 ., International Conference on, 139--142 . Flusser, J. 1992. Invariant shape description and measure of object similarity. In Image Processing and its Applications, 1992., International Conference on, 139--142."},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.1985.4767734"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1276377.1276382"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2007.09.008"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2508363.2508381"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766914"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2508363.2508373"},{"key":"e_1_2_2_30_1","doi-asserted-by":"crossref","unstructured":"Huang S. Wang W. and Zhang H. 2014. Retrieving images using saliency detection and graph matching. In IEEE ICIP 3087--3091. Huang S. Wang W. and Zhang H. 2014. Retrieving images using saliency detection and graph matching. 
In IEEE ICIP 3087--3091.","DOI":"10.1109\/ICIP.2014.7025624"},{"key":"e_1_2_2_31_1","doi-asserted-by":"crossref","unstructured":"Jansen S. Shantia A. and Wiering M. A. 2015. The neural-sift feature descriptor for visual vocabulary object recognition. In IJCNN 1--8. Jansen S. Shantia A. and Wiering M. A. 2015. The neural-sift feature descriptor for visual vocabulary object recognition. In IJCNN 1--8.","DOI":"10.1109\/IJCNN.2015.7280660"},{"key":"e_1_2_2_32_1","volume-title":"-F","author":"Karpathy A.","year":"2015","unstructured":"Karpathy , A. , and Li , F . -F . 2015 . Deep Visual-Semantic Alignments for Generating Image Descriptions. In IEEE CVPR. Karpathy, A., and Li, F.-F. 2015. Deep Visual-Semantic Alignments for Generating Image Descriptions. In IEEE CVPR."},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/CGIV.2013.11"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601097.2601117"},{"key":"e_1_2_2_35_1","doi-asserted-by":"crossref","unstructured":"Ko B. and Byun H. 2002. Multiple Regions and Their Spatial Relationship-Based Image Retrieval. In LNCS 2383. 81--90. Ko B. and Byun H. 2002. Multiple Regions and Their Spatial Relationship-Based Image Retrieval. In LNCS 2383. 81--90.","DOI":"10.1007\/3-540-45479-9_9"},{"key":"e_1_2_2_36_1","doi-asserted-by":"crossref","unstructured":"Krishna R. Zhu Y. Groth O. Johnson J. Hata K. Kravitz J. Chen S. Kalantidis Y. Li L.-J. Shamma D. A. Bernstein M. and Fei-Fei L. 2016. Visual genome: Connecting language and vision using crowd-sourced dense image annotations. Krishna R. Zhu Y. Groth O. Johnson J. Hata K. Kravitz J. Chen S. Kalantidis Y. Li L.-J. Shamma D. A. Bernstein M. and Fei-Fei L. 2016. 
Visual genome: Connecting language and vision using crowd-sourced dense image annotations.","DOI":"10.1007\/s11263-016-0981-7"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2012.162"},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33783-3_10"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2013.53"},{"volume-title":"Proceedings of Fourth Int. Symposium on Multimedia Software Engineering.","author":"Lee S. L. S.","key":"e_1_2_2_40_1","unstructured":"Lee , S. L. S. , and Hwang , E. H. E. 2002. Spatial similarity and annotation-based image retrieval system . Proceedings of Fourth Int. Symposium on Multimedia Software Engineering. Lee, S. L. S., and Hwang, E. H. E. 2002. Spatial similarity and annotation-based image retrieval system. Proceedings of Fourth Int. Symposium on Multimedia Software Engineering."},{"key":"e_1_2_2_41_1","doi-asserted-by":"crossref","unstructured":"Lin T. Maire M. Belongie S. Hays J. Perona P. Ramanan D. Doll\u00e1r P. and Zitnick C. L. 2014. Microsoft COCO: common objects in context. CoRR abs\/1405.0312. Lin T. Maire M. Belongie S. Hays J. Perona P. Ramanan D. Doll\u00e1r P. and Zitnick C. L. 2014. Microsoft COCO: common objects in context. CoRR abs\/1405.0312.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661229.2661243"},{"key":"e_1_2_2_43_1","doi-asserted-by":"crossref","unstructured":"Long J. Shelhamer E. and Darrell T. 2015. Fully convolutional networks for semantic segmentation. IEEE CVPR. Long J. Shelhamer E. and Darrell T. 2015. Fully convolutional networks for semantic segmentation. IEEE CVPR.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:VISI.0000029664.99615.94"},{"key":"e_1_2_2_45_1","volume-title":"Beyond Categories: The Visual Memex Model for Reasoning About Object Relationships. 
In NIPS, 1--9.","author":"Malisiewicz T.","year":"2009","unstructured":"Malisiewicz , T. , and A., E. A. 2009 . Beyond Categories: The Visual Memex Model for Reasoning About Object Relationships. In NIPS, 1--9. Malisiewicz, T., and A., E. A. 2009. Beyond Categories: The Visual Memex Model for Reasoning About Object Relationships. In NIPS, 1--9."},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1007\/s007780050057"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00123143"},{"key":"e_1_2_2_48_1","doi-asserted-by":"crossref","unstructured":"Rubner Y. Tomasi C. and Guibas L. J. 1998. A metric for distributions with applications to image databases. IEEE Computer Society Washington DC USA IEEE ICCV 59--66. Rubner Y. Tomasi C. and Guibas L. J. 1998. A metric for distributions with applications to image databases. IEEE Computer Society Washington DC USA IEEE ICCV 59--66.","DOI":"10.1109\/ICCV.1998.710701"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2011.5995711"},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661229.2661288"},{"key":"e_1_2_2_51_1","doi-asserted-by":"crossref","unstructured":"Shechtman E. and Irani M. 2007. Matching local self-similarities across images and videos. In IEEE CVPR 1--8. Shechtman E. and Irani M. 2007. Matching local self-similarities across images and videos. In IEEE CVPR 1--8.","DOI":"10.1109\/CVPR.2007.383198"},{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/244130.244151"},{"key":"e_1_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1364\/JOSA.70.000920"},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/2036264.2036276"},{"key":"e_1_2_2_55_1","doi-asserted-by":"crossref","unstructured":"Wang Y.-H. 2003. Image indexing and similarity retrieval based on spatial relationship model. Wang Y.-H. 2003. 
Image indexing and similarity retrieval based on spatial relationship model.","DOI":"10.1016\/S0020-0255(03)00005-7"},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/1835449.1835497"},{"key":"e_1_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/2461912.2461968"},{"key":"e_1_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366145.2366195"},{"key":"e_1_2_2_59_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2003.07.008"},{"key":"e_1_2_2_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/2574860"},{"key":"e_1_2_2_61_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.12309"},{"key":"e_1_2_2_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.179"},{"key":"e_1_2_2_63_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-8655(00)00123-9"}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2897824.2925939","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2897824.2925939","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:55:04Z","timestamp":1750222504000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2897824.2925939"}},"subtitle":["a relation-augmented image 
descriptor"],"short-title":[],"issued":{"date-parts":[[2016,7,11]]},"references-count":63,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2016,7,11]]}},"alternative-id":["10.1145\/2897824.2925939"],"URL":"https:\/\/doi.org\/10.1145\/2897824.2925939","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"type":"print","value":"0730-0301"},{"type":"electronic","value":"1557-7368"}],"subject":[],"published":{"date-parts":[[2016,7,11]]},"assertion":[{"value":"2016-07-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}