{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T14:17:04Z","timestamp":1760710624916,"version":"3.37.3"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"9","license":[{"start":{"date-parts":[[2021,6,20]],"date-time":"2021-06-20T00:00:00Z","timestamp":1624147200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,6,20]],"date-time":"2021-06-20T00:00:00Z","timestamp":1624147200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100010663","name":"H2020 European Research Council","doi-asserted-by":"crossref","award":["640156"],"award-info":[{"award-number":["640156"]}],"id":[{"id":"10.13039\/100010663","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Comput Vis"],"published-print":{"date-parts":[[2021,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In this work, we propose a learning-based method to denoise and refine disparity maps. The proposed variational network arises naturally from unrolling the iterates of a proximal gradient method applied to a variational energy defined in a joint disparity, color, and confidence image space. Our method allows to learn a robust collaborative regularizer leveraging the joint statistics of the color image, the confidence map and the disparity map. Due to the variational structure of our method, the individual steps can be easily visualized, thus enabling interpretability of the method. We can therefore provide interesting insights into how our method refines and denoises disparity maps. To this end, we can visualize and interpret the learned filters and activation functions and prove the increased reliability of the predicted pixel-wise confidence maps. Furthermore, the optimization based structure of our refinement module allows us to compute <jats:italic>eigen disparity maps<\/jats:italic>, which reveal structural properties of our refinement module. The efficiency of our method is demonstrated on the publicly available stereo benchmarks Middlebury 2014 and Kitti 2015.\n<\/jats:p>","DOI":"10.1007\/s11263-021-01485-5","type":"journal-article","created":{"date-parts":[[2021,6,20]],"date-time":"2021-06-20T15:02:28Z","timestamp":1624201348000},"page":"2565-2582","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Learned Collaborative Stereo Refinement"],"prefix":"10.1007","volume":"129","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2371-014X","authenticated-orcid":false,"given":"Patrick","family":"Kn\u00f6belreiter","sequence":"first","affiliation":[]},{"given":"Thomas","family":"Pock","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,6,20]]},"reference":[{"key":"1485_CR1","doi-asserted-by":"crossref","unstructured":"Barron, J. T., & Poole, B.(2016). The fast bilateral solver. In European conference on computer vision (ECCV) (pp. 617\u2013632).","DOI":"10.1007\/978-3-319-46487-9_38"},{"key":"1485_CR2","doi-asserted-by":"crossref","unstructured":"Beck, A., & Teboulle, M. (2009). A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal of Imaging and Sciences pp. 183\u2013202.","DOI":"10.1137\/080716542"},{"key":"1485_CR3","doi-asserted-by":"crossref","unstructured":"Brox, T., Bruhn, A., Papenberg, N., & Weickert, J. (2004). High accuracy optical flow estimation based on a theory for warping. In European conference on computer vision (ECCV) (pp. 25\u201336).","DOI":"10.1007\/978-3-540-24673-2_3"},{"key":"1485_CR4","doi-asserted-by":"crossref","unstructured":"Chambolle, A., & Pock, T. (2011). A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision 120\u2013145.","DOI":"10.1007\/s10851-010-0251-1"},{"key":"1485_CR5","doi-asserted-by":"crossref","unstructured":"Chang, J. R., & Chen, Y. S. (2018). Pyramid stereo matching network. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 5410\u20135418).","DOI":"10.1109\/CVPR.2018.00567"},{"key":"1485_CR6","doi-asserted-by":"crossref","unstructured":"Chen, Y., Yu, W., & Pock, T.(2015). On learning optimized reaction diffusion processes for effective image restoration. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 5261\u20135269).","DOI":"10.1109\/CVPR.2015.7299163"},{"key":"1485_CR7","doi-asserted-by":"crossref","unstructured":"Effland, A., Kobler, E., Kunisch, K., & Pock, T. (2020). An optimal control approach to early stopping variational methods for image restoration. Journal of Mathematical Imaging and Vision 396\u2013416.","DOI":"10.1007\/s10851-019-00926-8"},{"key":"1485_CR8","doi-asserted-by":"crossref","unstructured":"Gidaris, S., & Komodakis, N. (2017). Detect, replace, refine: Deep structured prediction for pixel wise labeling. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 5248\u20135257).","DOI":"10.1109\/CVPR.2017.760"},{"key":"1485_CR9","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 770\u2013778).","DOI":"10.1109\/CVPR.2016.90"},{"key":"1485_CR10","doi-asserted-by":"crossref","unstructured":"Hu, X., & Mordohai, P. (2012). A quantitative evaluation of confidence measures for stereo vision. IEEE Transactions on Pattern Analysis and Machine Intelligence 2121\u20132133.","DOI":"10.1109\/TPAMI.2012.46"},{"key":"1485_CR11","doi-asserted-by":"crossref","unstructured":"Khamis, S., Fanello, S., Rhemann, C., Kowdle, A., Valentin, J., & Izadi, S. (2018). Stereonet: Guided hierarchical refinement for real-time edge-aware depth prediction. In European conference on computer vision (ECCV) (pp. 8\u201314).","DOI":"10.1007\/978-3-030-01267-0_35"},{"key":"1485_CR12","unstructured":"Kingma, D. P., & Ba, J.(2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980."},{"key":"1485_CR13","doi-asserted-by":"crossref","unstructured":"Kn\u00f6belreiter, P., & Pock, T. (2019). Learned collaborative stereo refinement. In German conference on pattern recognition (GCPR) (pp. 3\u201317).","DOI":"10.1007\/978-3-030-33676-9_1"},{"key":"1485_CR14","doi-asserted-by":"crossref","unstructured":"Kn\u00f6belreiter, P., Reinbacher, C., Shekhovtsov, A., & Pock, T. (2017). End-to-end training of hybrid CNN-CRF models for stereo. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 2339\u20132348).","DOI":"10.1109\/CVPR.2017.159"},{"key":"1485_CR15","doi-asserted-by":"crossref","unstructured":"Kobler, E., Klatzer, T., Hammernik, K., & Pock, T.(2017). Variational networks: Connecting variational methods and deep learning. In German conference on pattern recognition (GCPR) (pp. 281\u2013293).","DOI":"10.1007\/978-3-319-66709-6_23"},{"key":"1485_CR16","doi-asserted-by":"crossref","unstructured":"Kuschk, G., & Cremers, D. (2013). Fast and accurate large-scale stereo reconstruction using variational methods. In IEEE international conference on computer vision workshop (pp. 700\u2013707).","DOI":"10.1109\/ICCVW.2013.96"},{"key":"1485_CR17","doi-asserted-by":"crossref","unstructured":"Liang, Z., Feng, Y., Guo, Y., Liu, H., Chen, W., Qiao, L., Zhou, L., & Zhang, J. (2018). Learning for disparity estimation through feature constancy. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 2811\u20132820).","DOI":"10.1109\/CVPR.2018.00297"},{"key":"1485_CR18","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., & Darrell, T.(2015). Fully convolutional networks for semantic segmentation. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 3431\u20133440).","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"1485_CR19","doi-asserted-by":"crossref","unstructured":"Maurer, D., Stoll, M., & Bruhn, A.(2017). Order-adaptive and illumination-aware variational optical flow refinement. In British machine vision conference.","DOI":"10.5244\/C.31.150"},{"key":"1485_CR20","doi-asserted-by":"crossref","unstructured":"Meinhardt, T., Moeller, M., Hazirbas, C., & Cremers, D.(2017). Learning proximal operators: Using denoising networks for regularizing inverse imaging problems. In IEEE International conference on computer vision (ICCV).","DOI":"10.1109\/ICCV.2017.198"},{"key":"1485_CR21","doi-asserted-by":"crossref","unstructured":"Menze, M., & Geiger, A. (2015). Object scene flow for autonomous vehicles. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 3061\u20133070).","DOI":"10.1109\/CVPR.2015.7298925"},{"issue":"3","key":"1485_CR22","first-page":"509","volume":"24","author":"Y Nesterov","year":"1988","unstructured":"Nesterov, Y. (1988). On an approach to the construction of optimal methods of minimization of smooth convex functions. Ekonomika i Mateaticheskie Metody, 24(3), 509\u2013517.","journal-title":"Ekonomika i Mateaticheskie Metody"},{"key":"1485_CR23","doi-asserted-by":"crossref","unstructured":"Pang, J., Sun, W., Ren, J. S., Yang, C., & Yan, Q. (2017). Cascade residual learning: A two-stage convolutional neural network for stereo matching. In IEEE international conference on computer vision workshop (pp. 887\u2013895).","DOI":"10.1109\/ICCVW.2017.108"},{"key":"1485_CR24","doi-asserted-by":"crossref","unstructured":"Parikh, N., & Boyd, S. (2014). Proximal algorithms. Foundations and trends\u00ae in Optimization pp. 127\u2013239.","DOI":"10.1561\/2400000003"},{"key":"1485_CR25","doi-asserted-by":"crossref","unstructured":"Ranftl, R., Bredies, K., & Pock, T. (2014). Non-local total generalized variation for optical flow estimation. In European conference on computer vision (ECCV) (pp. 439\u2013454).","DOI":"10.1007\/978-3-319-10590-1_29"},{"key":"1485_CR26","doi-asserted-by":"crossref","unstructured":"Ranftl, R., Gehrig, S., Pock, T., & Bischof, H. (2012). Pushing the limits of stereo using variational stereo estimation. In IEEE intelligent vehicles symposium (pp. 401\u2013407).","DOI":"10.1109\/IVS.2012.6232171"},{"key":"1485_CR27","doi-asserted-by":"crossref","unstructured":"Revaud, J., Weinzaepfel, P., Harchaoui, Z., & Schmid, C. (2015). Epicflow: Edge-preserving interpolation of correspondences for optical flow. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1164\u20131172).","DOI":"10.1109\/CVPR.2015.7298720"},{"key":"1485_CR28","doi-asserted-by":"crossref","unstructured":"Riegler, G., R\u00fcther, M., & Bischof, H. (2016). ATGV-Net: Accurate depth super-resolution. In European conference on computer vision (ECCV) (pp. 268\u2013284).","DOI":"10.1007\/978-3-319-46487-9_17"},{"key":"1485_CR29","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., & Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. In International conference on medical image computing and computer-assisted intervention (MICCAI) (pp. 234\u2013241).","DOI":"10.1007\/978-3-319-24574-4_28"},{"issue":"2","key":"1485_CR30","doi-asserted-by":"publisher","first-page":"205","DOI":"10.1007\/s11263-008-0197-6","volume":"82","author":"S Roth","year":"2009","unstructured":"Roth, S., & Black, M. J. (2009). Fields of experts. International Journal of Computer Vision, 82(2), 205.","journal-title":"International Journal of Computer Vision"},{"key":"1485_CR31","doi-asserted-by":"crossref","unstructured":"Scharstein, D., Hirschm\u00fcller, H., Kitajima, Y., Krathwohl, G., Nesic, N., Wang, X., & Westling, P. (2014). High-resolution stereo datasets with subpixel-accurate ground truth. In German conference on pattern recognition (GCPR) (pp. 31\u201342).","DOI":"10.1007\/978-3-319-11752-2_3"},{"issue":"1\u20133","key":"1485_CR32","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1023\/A:1014573219977","volume":"47","author":"D Scharstein","year":"2002","unstructured":"Scharstein, D., & Szeliski, R. (2002). A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. International Journal of Computer Vision, 47(1\u20133), 7\u201342.","journal-title":"International Journal of Computer Vision"},{"key":"1485_CR33","unstructured":"Shekhovtsov, A., Reinbacher, C., Graber, G., & Pock, T.(2016). Solving dense image matching in real-time using discrete-continuous optimization. Computer vision winter workshop."},{"key":"1485_CR34","unstructured":"Tulyakov, S., Ivanov, A., & Fleuret, F. (2018). Practical deep stereo (PDS): Toward applications-friendly deep stereo matching. In Proceedings of advances in neural information processing systems (pp. 5871\u20135881)."},{"key":"1485_CR35","doi-asserted-by":"crossref","unstructured":"Vogel, C., Kn\u00f6belreiter, P., & Pock, T. (2018). Learning energy based inpainting for optical flow. In Asian conference on computer vision (ACCV) (pp. 340\u2013356).","DOI":"10.1007\/978-3-030-20876-9_22"},{"key":"1485_CR36","doi-asserted-by":"crossref","unstructured":"Vogel, C., & Pock, T.(2017). A primal dual network for low-level vision problems. In German conference on pattern recognition (GCPR) (pp. 189\u2013202).","DOI":"10.1007\/978-3-319-66709-6_16"},{"key":"1485_CR37","unstructured":"Wang, S., Fidler, S., & Urtasun, R. (2016). Proximal deep structured models. In Proceedings of advances in neural information processing systems (pp. 865\u2013873)."},{"key":"1485_CR38","doi-asserted-by":"crossref","unstructured":"Zach, C., Pock, T., & Bischof, H.(2007). A duality based approach for realtime TV-L1 optical flow. In German conference on pattern recognition (GCPR) (pp. 214\u2013223).","DOI":"10.1007\/978-3-540-74936-3_22"},{"key":"1485_CR39","unstructured":"\u017dbontar, J., & LeCun, Y. (2016). Stereo matching by training a convolutional neural network to compare image patches. Journal of Machine Learning Research, 17(1), 2287\u20132318."},{"issue":"2","key":"1485_CR40","doi-asserted-by":"publisher","first-page":"107","DOI":"10.1023\/A:1007925832420","volume":"27","author":"SC Zhu","year":"1998","unstructured":"Zhu, S. C., Wu, Y., & Mumford, D. (1998). Filters, random fields and maximum entropy (frame): Towards a unified theory for texture modeling. International Journal of Computer Vision, 27(2), 107\u2013126.","journal-title":"International Journal of Computer Vision"}],"container-title":["International Journal of Computer Vision"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-021-01485-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11263-021-01485-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-021-01485-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,30]],"date-time":"2021-07-30T03:42:43Z","timestamp":1627616563000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11263-021-01485-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,6,20]]},"references-count":40,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2021,9]]}},"alternative-id":["1485"],"URL":"https:\/\/doi.org\/10.1007\/s11263-021-01485-5","relation":{},"ISSN":["0920-5691","1573-1405"],"issn-type":[{"type":"print","value":"0920-5691"},{"type":"electronic","value":"1573-1405"}],"subject":[],"published":{"date-parts":[[2021,6,20]]},"assertion":[{"value":"7 January 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 May 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 June 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}