{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,21]],"date-time":"2025-10-21T15:17:20Z","timestamp":1761059840146},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2014,11,28]],"date-time":"2014-11-28T00:00:00Z","timestamp":1417132800000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Comput Vis"],"published-print":{"date-parts":[[2015,4]]},"DOI":"10.1007\/s11263-014-0778-5","type":"journal-article","created":{"date-parts":[[2014,12,2]],"date-time":"2014-12-02T14:27:45Z","timestamp":1417530465000},"page":"150-171","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Scene Parsing with Object Instance Inference Using Regions and Per-exemplar Detectors"],"prefix":"10.1007","volume":"112","author":[{"given":"Joseph","family":"Tighe","sequence":"first","affiliation":[]},{"given":"Marc","family":"Niethammer","sequence":"additional","affiliation":[]},{"given":"Svetlana","family":"Lazebnik","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2014,11,28]]},"reference":[{"key":"778_CR1","unstructured":"Adelson, E. H. (2001). On seeing stuff: The perception of materials by humans and machines. In Human Vision and Electronic Imaging, pp. 1\u201312."},{"key":"778_CR2","doi-asserted-by":"crossref","unstructured":"Boykov, Y., & Kolmogorov, V. (2003). Computing geodesics and minimal surfaces via graph cuts. In International Conference on Computer Vision (ICCV), Nice, France.","DOI":"10.1109\/ICCV.2003.1238310"},{"issue":"9","key":"778_CR3","doi-asserted-by":"crossref","first-page":"1124","DOI":"10.1109\/TPAMI.2004.60","volume":"26","author":"Y Boykov","year":"2004","unstructured":"Boykov, Y., & Kolmogorov, V. (2004). An experimental comparison of min-cut\/max-flow algorithms for energy minimization in vision. PAMI, 26(9), 1124\u201337.","journal-title":"PAMI"},{"key":"778_CR4","doi-asserted-by":"crossref","unstructured":"Brostow, G. J., Shotton, J., Fauqueur, J., & Cipolla, R. (2008). Segmentation and recognition using structure from motion point clouds. In European Conference on Computer Vision (ECCV), Marseille, France.","DOI":"10.1007\/978-3-540-88682-2_5"},{"key":"778_CR5","doi-asserted-by":"crossref","unstructured":"Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), San Diego, CA.","DOI":"10.1109\/CVPR.2005.177"},{"key":"778_CR6","doi-asserted-by":"crossref","unstructured":"Dean, T., Ruzon, M. A., Segal, M., Shlens, J., Vijayanarasimhan, S., & Yagnik, J. (2013). Fast, accurate detection of 100,000 object classes on a single machine. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR.","DOI":"10.1109\/CVPR.2013.237"},{"key":"778_CR7","doi-asserted-by":"crossref","unstructured":"Eigen, D., & Fergus, R. (2012). Nonparametric image parsing using adaptive neighbor sets. In IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI (CVPR).","DOI":"10.1109\/CVPR.2012.6248004"},{"key":"778_CR8","unstructured":"Everingham, M., Van Gool, L., Williams, C. K. I., Winn, J., & Zisserman, A. (2008). The PASCAL visual object classes challenge 2008 (VOC2008) results. http:\/\/www.pascal-network.org\/challenges\/VOC\/voc2008\/workshop\/index.html"},{"key":"778_CR9","unstructured":"Farabet, C., Couprie, C., Najman, L., & LeCun, Y. (2012). Scene parsing with multiscale feature learning, purity trees, and optimal covers. In International Conference on Machine Learning (ICML), Edinburgh, Scotland."},{"issue":"9","key":"778_CR10","doi-asserted-by":"crossref","first-page":"1627","DOI":"10.1109\/TPAMI.2009.167","volume":"32","author":"PF Felzenszwalb","year":"2010","unstructured":"Felzenszwalb, P. F., Girshick, R. B., McAllester, D., & Ramanan, D. (2010). Object detection with discriminatively trained part-based models. PAMI, 32(9), 1627\u20131645.","journal-title":"PAMI"},{"key":"778_CR11","doi-asserted-by":"crossref","unstructured":"Floros, G., Rematas, K., & Leibe, B. (2011). Multi-class image labeling with top-down segmentation and generalized robust $${P}^{N}$$ P N potentials. In Proceedings of the British Machine Vision Conference (BMVC), Dundee, UK.","DOI":"10.5244\/C.25.79"},{"key":"778_CR12","doi-asserted-by":"crossref","unstructured":"Gould, S., Fulton, R., & Koller, D. (2009). Decomposing a scene into geometric and semantically consistent regions. In International Conference on Computer Vision (ICCV), Kyoto, Japan.","DOI":"10.1109\/ICCV.2009.5459211"},{"key":"778_CR13","doi-asserted-by":"crossref","unstructured":"Guo, R., & Hoiem, D. (2012). Beyond the line of sight: labeling the underlying surfaces. In European Conference on Computer Vision (ECCV), Amsterdam.","DOI":"10.1007\/978-3-642-33715-4_55"},{"key":"778_CR14","doi-asserted-by":"crossref","unstructured":"Hariharan, B., Malik, J., & Ramanan, D. (2012). Discriminative decorrelation for clustering and classification. In The European Conference on Computer Vision (ECCV), Amsterdam.","DOI":"10.1007\/978-3-642-33765-9_33"},{"key":"778_CR15","doi-asserted-by":"crossref","unstructured":"Heitz, G., & Koller, D. (2008). Learning spatial context: Using stuff to find things. In: European Conference on Computer Vision, Marseille, France, (ECCV), pp. 30\u201343.","DOI":"10.1007\/978-3-540-88682-2_4"},{"key":"778_CR16","unstructured":"IBM. (2013). Cplex optimizer. http:\/\/www.ibm.com\/software\/commerce\/optimization\/cplex-optimizer\/ ."},{"key":"778_CR17","doi-asserted-by":"crossref","unstructured":"Isola, P., & Liu, C. (2013). Scene collaging: Analysis and synthesis of natural images with semantic layers. In IEEE International Conference on Computer Vision (ICCV), Sydney, Australia.","DOI":"10.1109\/ICCV.2013.457"},{"key":"778_CR18","doi-asserted-by":"crossref","unstructured":"Kim, B., Sun, M., Kohli, P., & Savarese, S. (2012). Relating things and stuff by high-order potential modeling. In ECCV Workshop on Higher-Order Models and Global Constraints in Computer Vision.","DOI":"10.1007\/978-3-642-33885-4_30"},{"key":"778_CR19","doi-asserted-by":"crossref","unstructured":"Kim, J., & Grauman, K. (2012). Shape sharing for object segmentation. In European Conference on Computer Vision (ECCV), Amsterdam.","DOI":"10.1007\/978-3-642-33786-4_33"},{"issue":"2","key":"778_CR20","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1109\/TPAMI.2004.1262177","volume":"26","author":"V Kolmogorov","year":"2004","unstructured":"Kolmogorov, V., & Zabih, R. (2004). What energy functions can be minimized via graph cuts? PAMI, 26(2), 147\u201359.","journal-title":"PAMI"},{"key":"778_CR21","unstructured":"Krahenbuhl, P., & Koltun, V. (2011). Efficient inference in fully connected CRFs with Gaussian edge potentials. In Annual Conference on Neural Information Processing Systems (NIPS)."},{"key":"778_CR22","doi-asserted-by":"crossref","unstructured":"Ladick\u00fd, L., Sturgess, P., Alahari, K., Russell, C., & Torr, P. H. (2010). What, where & how many? combining object detectors and CRFs. In The 11th European Conference on Computer Vision (ECCV), Heraklion, Greece.","DOI":"10.1007\/978-3-642-15561-1_31"},{"issue":"12","key":"778_CR23","doi-asserted-by":"crossref","first-page":"2368","DOI":"10.1109\/TPAMI.2011.131","volume":"33","author":"C Liu","year":"2011","unstructured":"Liu, C., Yuen, J., & Torralba, A. (2011). Nonparametric scene parsing via label transfer. PAMI, 33(12), 2368\u20132382.","journal-title":"PAMI"},{"key":"778_CR24","doi-asserted-by":"crossref","unstructured":"Malisiewicz, T., Gupta, A., & Efros, A. A. (2011). Ensemble of exemplar-SVMs for object detection and beyond. In 13th International Conference on Computer Vision (ICCV), Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126229"},{"key":"778_CR25","unstructured":"Myeong, H. J., Chang, Y., & Lee, K. M. (2012). Learning object relationships via graph-based context model. In Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI."},{"key":"778_CR26","unstructured":"Rahimi, A., & Recht, B. (2007). Random features for large-scale kernel machines. In Proceedings of the Twenty-First Annual Conference on Neural Information Processing Systems, Vancouver (NIPS)."},{"key":"778_CR27","doi-asserted-by":"crossref","unstructured":"Rother, C., Kolmogorov, V., & Blake, A. (2004). \u201cgrabCut\u201d\u2014interactive foreground extraction using iterated graph cuts. In Special Interest Group on Computer Graphics and Interactive Techniques, Los Angeles, CA (SIGGRAPH).","DOI":"10.1145\/1186562.1015720"},{"key":"778_CR28","doi-asserted-by":"crossref","unstructured":"Russell, B. C., & Torralba, A. (2009). Building a database of 3d scenes from user annotations. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL.","DOI":"10.1109\/CVPR.2009.5206643"},{"issue":"1\u20133","key":"778_CR29","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1007\/s11263-007-0090-8","volume":"77","author":"BC Russell","year":"2008","unstructured":"Russell, B. C., Torralba, A., Murphy, K. P., & Freeman, W. T. (2008). Labelme: A database and web-based tool for image annotation. IJCV, 77(1\u20133), 157\u2013173.","journal-title":"IJCV"},{"key":"778_CR30","doi-asserted-by":"crossref","unstructured":"Shotton, J., Winn, J. M., Rother, C., & Criminisi, A. (2009). Textonboost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context. IJCV, 81(1), 2\u201323.","DOI":"10.1007\/s11263-007-0109-1"},{"key":"778_CR31","doi-asserted-by":"crossref","unstructured":"Sturgess, P., Alahari, K., Ladick\u00fd, L., & Torr, P. H. S. (2009). Combining appearance and structure from motion features for road scene understanding. In British Machine Vision Conference (BMVC), London, UK.","DOI":"10.5244\/C.23.62"},{"key":"778_CR32","doi-asserted-by":"crossref","unstructured":"Tighe, J., & Lazebnik, S. (2013). Finding things: Image parsing with regions and per-exemplar detectors. In IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR (CVPR).","DOI":"10.1109\/CVPR.2013.386"},{"issue":"2","key":"778_CR33","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1007\/s11263-012-0574-z","volume":"101","author":"J Tighe","year":"2013","unstructured":"Tighe, J., & Lazebnik, S. (2013). SuperParsing: Scalable nonparametric image parsing with superpixels. IJCV, 101(2), 329\u2013349.","journal-title":"IJCV"},{"key":"778_CR34","doi-asserted-by":"crossref","unstructured":"Tighe, J., Niethammer, M., & Lazebnik, S. (2014). Scene parsing with object instances and occlusion ordering. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH.","DOI":"10.1109\/CVPR.2014.479"},{"issue":"2","key":"778_CR35","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1007\/s11263-005-6642-x","volume":"63","author":"Z Tu","year":"2005","unstructured":"Tu, Z., Chen, X., Yuille, A. L., & Zhu, S. C. (2005). Image parsing: Unifying segmentation, detection, and recognition. IJCV, 63(2), 113\u2013140.","journal-title":"IJCV"},{"key":"778_CR36","doi-asserted-by":"crossref","unstructured":"Xiao, J., Hays, J., Ehinger, K. A., Oliva, A., & Torralba, A. (2010). Sun database: Large-scale scene recognition from abbey to zoo. In The Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA.","DOI":"10.1109\/CVPR.2010.5539970"},{"issue":"9","key":"778_CR37","doi-asserted-by":"crossref","first-page":"1731","DOI":"10.1109\/TPAMI.2011.208","volume":"34","author":"Y Yang","year":"2012","unstructured":"Yang, Y., Hallman, S., Ramanan, D., & Fowlkes, C. (2012). Layered object models for image segmentation. PAMI, 34(9), 1731\u20131743.","journal-title":"PAMI"},{"key":"778_CR38","unstructured":"Yao, J., Fidler, S., & Urtasun, R. (2012). Describing the scene as a whole: Joint object detection, scene classification and semantic segmentation. In Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI."},{"key":"778_CR39","doi-asserted-by":"crossref","unstructured":"Zhang, C., Wang, L., & Yang, R. (2010). Semantic segmentation of urban scenes using dense depth maps. In The 11th European Conference on Computer Vision (ECCV), Heraklion, Greece.","DOI":"10.1007\/978-3-642-15561-1_51"}],"container-title":["International Journal of Computer Vision"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-014-0778-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s11263-014-0778-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-014-0778-5","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,7,30]],"date-time":"2023-07-30T11:14:35Z","timestamp":1690715675000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s11263-014-0778-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,11,28]]},"references-count":39,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2015,4]]}},"alternative-id":["778"],"URL":"https:\/\/doi.org\/10.1007\/s11263-014-0778-5","relation":{},"ISSN":["0920-5691","1573-1405"],"issn-type":[{"value":"0920-5691","type":"print"},{"value":"1573-1405","type":"electronic"}],"subject":[],"published":{"date-parts":[[2014,11,28]]}}}