{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,19]],"date-time":"2025-11-19T06:59:28Z","timestamp":1763535568039,"version":"build-2065373602"},"reference-count":29,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2018,11,20]],"date-time":"2018-11-20T00:00:00Z","timestamp":1542672000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Stereo matching has been solved as a supervised learning task with convolutional neural network (CNN). However, CNN based approaches basically require huge memory use. In addition, it is still challenging to find correct correspondences between images at ill-posed dim and sensor noise regions. To solve these problems, we propose Sparse Cost Volume Net (SCV-Net) achieving high accuracy, low memory cost and fast computation. The idea of the cost volume for stereo matching was initially proposed in GC-Net. In our work, by making the cost volume compact and proposing an efficient similarity evaluation for the volume, we achieved faster stereo matching while improving the accuracy. Moreover, we propose to use weight normalization instead of commonly-used batch normalization for stereo matching tasks. This improves the robustness to not only sensor noises in images but also batch size in the training process. We evaluated our proposed network on the Scene Flow and KITTI 2015 datasets, its performance overall surpasses the GC-Net. 
Compared with GC-Net, our SCV-Net (1) reduces GPU memory cost by 73.08%; (2) reduces processing time by 61.11%; and (3) improves the three-pixel error (3PE) from 2.87% to 2.61% on the KITTI 2015 dataset.<\/jats:p>","DOI":"10.3390\/rs10111844","type":"journal-article","created":{"date-parts":[[2018,11,22]],"date-time":"2018-11-22T09:18:25Z","timestamp":1542878305000},"page":"1844","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":28,"title":["Sparse Cost Volume for Efficient Stereo Matching"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5399-5951","authenticated-orcid":false,"given":"Chuanhua","family":"Lu","sequence":"first","affiliation":[{"name":"Graduate School of Information Science and Electrical Engineering, Kyushu University, Fukuoka 819-0395, Japan"}]},{"given":"Hideaki","family":"Uchiyama","sequence":"additional","affiliation":[{"name":"Library, Kyushu University, Fukuoka 819-0395, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8525-7133","authenticated-orcid":false,"given":"Diego","family":"Thomas","sequence":"additional","affiliation":[{"name":"Faculty of Information Science and Electrical Engineering, Kyushu University, Fukuoka 819-0395, Japan"}]},{"given":"Atsushi","family":"Shimada","sequence":"additional","affiliation":[{"name":"Faculty of Information Science and Electrical Engineering, Kyushu University, Fukuoka 819-0395, Japan"}]},{"given":"Rin-ichiro","family":"Taniguchi","sequence":"additional","affiliation":[{"name":"Faculty of Information Science and Electrical Engineering, Kyushu University, Fukuoka 819-0395, Japan"}]}],"member":"1968","published-online":{"date-parts":[[2018,11,20]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Newcombe, R.A., Izadi, S., Hilliges, O., Molyneaux, D., Kim, D., Davison, A.J., Kohi, P., Shotton, J., Hodges, S., and Fitzgibbon, A. (2011, January 26\u201329). 
KinectFusion: Real-time dense surface mapping and tracking. Proceedings of the 10th IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Basel, Switzerland.","DOI":"10.1109\/ISMAR.2011.6092378"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Helmer, S., and Lowe, D. (2010, January 3\u20137). Using stereo for object recognition. Proceedings of the 2010 IEEE International Conference on Robotics and Automation, Anchorage, AK, USA.","DOI":"10.1109\/ROBOT.2010.5509826"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Howard, A. (2008, January 22\u201326). Real-time stereo visual odometry for autonomous ground vehicles. Proceedings of the IEEE\/RSJ 2008 International Conference on Intelligent Robots and Systems, Nice, France.","DOI":"10.1109\/IROS.2008.4651147"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1023\/A:1014573219977","article-title":"A taxonomy and evaluation of dense two-frame stereo correspondence algorithms","volume":"47","author":"Scharstein","year":"2002","journal-title":"Int. J. Comput. Vis."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Zbontar, J., and LeCun, Y. (2015, January 7\u201312). Computing the stereo matching cost with a convolutional neural network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298767"},{"key":"ref_6","unstructured":"Luo, W., Schwing, A.G., and Urtasun, R. (July, January 26). Efficient deep learning for stereo matching. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_7","unstructured":"Mayer, N., Ilg, E., Hausser, P., Fischer, P., Cremers, D., Dosovitskiy, A., and Brox, T. (July, January 26). A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Kendall, A., Martirosyan, H., Dasgupta, S., Henry, P., Kennedy, R., Bachrach, A., and Bry, A. (arXiv, 2017). End-to-end learning of geometry and context for deep stereo regression, arXiv.","DOI":"10.1109\/ICCV.2017.17"},{"key":"ref_9","unstructured":"Salimans, T., and Kingma, D.P. (2016). Weight normalization: A simple reparameterization to accelerate training of deep neural networks. Advances in Neural Information Processing Systems, MIT Press."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zagoruyko, S., and Komodakis, N. (2015, January 7\u201312). Learning to compare image patches via convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299064"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Seki, A., and Pollefeys, M. (2017, January 1). Sgm-nets: Semi-global matching with neural networks. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.703"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Gidaris, S., and Komodakis, N. (2017, January 21\u201326). Detect, replace, refine: Deep structured prediction for pixel wise labeling. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.760"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Pang, J., Sun, W., Ren, J., Yang, C., and Yan, Q. (2017, January 28). Cascade residual learning: A two-stage convolutional neural network for stereo matching. 
Proceedings of the International Conference on Computer Vision-Workshop on Geometry Meets Deep Learning (ICCVW 2017), Venice, Italy.","DOI":"10.1109\/ICCVW.2017.108"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Jie, Z., Wang, P., Ling, Y., Zhao, B., Wei, Y., Feng, J., and Liu, W. (arXiv, 2018). Left-Right Comparative Recurrent Model for Stereo Matching, arXiv.","DOI":"10.1109\/CVPR.2018.00404"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Liang, Z., Feng, Y., Guo, Y., and Liu, H. (arXiv, 2017). Learning for Disparity Estimation through Feature Constancy, arXiv.","DOI":"10.1109\/CVPR.2018.00297"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Chang, J.R., and Chen, Y.S. (arXiv, 2018). Pyramid Stereo Matching Network, arXiv.","DOI":"10.1109\/CVPR.2018.00567"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Wang, L., Jin, H., and Yang, R. (2008). Search Space Reduction for MRF Stereo. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-540-88682-2_44"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Veksler, O. (2006, January 4\u20137). Reducing Search Space for Stereo Correspondence with Graph Cuts. Proceedings of the British Machine Vision Conference (BMVC), Citeseer, Edinburgh, UK.","DOI":"10.5244\/C.20.73"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Geiger, A., Roser, M., and Urtasun, R. (2010). Efficient large-scale stereo matching. Computer Vision\u2013ACCV 2010, Springer.","DOI":"10.1007\/978-3-642-19315-6_3"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"G\u00fcrb\u00fcz, Y.Z., Alatan, A.A., and \u00c7\u0131\u011fla, C. (2015, January 16\u201319). Sparse recursive cost aggregation towards O(1) complexity local stereo matching. 
Proceedings of the 23rd Signal Processing and Communications Applications Conference (SIU), Malatya, Turkey.","DOI":"10.1109\/SIU.2015.7130335"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Khamis, S., Fanello, S., Rhemann, C., Kowdle, A., Valentin, J., and Izadi, S. (2018, January 8\u201314). StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01267-0_35"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Weinberger, K.Q., and van der Maaten, L. (2017, January 21\u201326). Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_23","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015, January 7\u201313). Flownet: Learning optical flow with convolutional networks. Proceedings of the IEEE International Conference on Computer Vision, Washington, DC, USA.","DOI":"10.1109\/ICCV.2015.316"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., and Brox, T. (2017, January 21\u201326). Flownet 2.0: Evolution of optical flow estimation with deep networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.179"},{"key":"ref_26","unstructured":"Bahdanau, D., Cho, K., and Bengio, Y. (arXiv, 2014). 
Neural machine translation by jointly learning to align and translate, arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Geiger, A., Lenz, P., and Urtasun, R. (2012, January 16\u201321). Are we ready for autonomous driving? The KITTI Vision Benchmark Suite. Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248074"},{"key":"ref_28","unstructured":"Menze, M., and Geiger, A. (June, January 7). Object scene flow for autonomous vehicles. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA."},{"key":"ref_29","first-page":"26","article-title":"Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude","volume":"4","author":"Tieleman","year":"2012","journal-title":"COURSERA Neural Netw. Mach. Learn."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/10\/11\/1844\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:31:00Z","timestamp":1760196660000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/10\/11\/1844"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,11,20]]},"references-count":29,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2018,11]]}},"alternative-id":["rs10111844"],"URL":"https:\/\/doi.org\/10.3390\/rs10111844","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2018,11,20]]}}}