{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,29]],"date-time":"2025-12-29T18:52:37Z","timestamp":1767034357137,"version":"build-2065373602"},"reference-count":41,"publisher":"MDPI AG","issue":"18","license":[{"start":{"date-parts":[[2021,9,8]],"date-time":"2021-09-08T00:00:00Z","timestamp":1631059200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"the Science and Technology Department of Jilin Province, China","award":["20200401123GX"],"award-info":[{"award-number":["20200401123GX"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Stereo matching networks based on deep learning have been widely developed and can produce accurate disparity estimates. In this work, we present a new fast end-to-end deep learning stereo matching network that determines the disparity from a stereo image pair. We extract the characteristics of the low-resolution feature images using a stacked hourglass feature extractor and build a multi-level detailed cost volume. We also use the edges of the left image to guide disparity optimization and sub-sample with the low-resolution data, which preserves both accuracy and speed. Furthermore, we design a multi-cross attention model for binocular stereo matching to improve the matching accuracy and achieve end-to-end disparity regression effectively. 
We evaluate our network on Scene Flow, KITTI2012, and KITTI2015 datasets, and the experimental results show that the speed and accuracy of our method are excellent.<\/jats:p>","DOI":"10.3390\/s21186016","type":"journal-article","created":{"date-parts":[[2021,9,8]],"date-time":"2021-09-08T21:28:45Z","timestamp":1631136525000},"page":"6016","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["A Fast Stereo Matching Network with Multi-Cross Attention"],"prefix":"10.3390","volume":"21","author":[{"given":"Ming","family":"Wei","sequence":"first","affiliation":[{"name":"Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ming","family":"Zhu","sequence":"additional","affiliation":[{"name":"Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yi","family":"Wu","sequence":"additional","affiliation":[{"name":"Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiaqi","family":"Sun","sequence":"additional","affiliation":[{"name":"Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiarong","family":"Wang","sequence":"additional","affiliation":[{"name":"Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Changji","family":"Liu","sequence":"additional","affiliation":[{"name":"Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,9,8]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1016\/j.image.2019.07.008","article-title":"Wide context learning network for stereo matching","volume":"78","author":"Nguyen","year":"2019","journal-title":"Signal Process. Image Commun."},{"key":"ref_2","first-page":"7","article-title":"Performance Review of the Stereo Matching Algorithms","volume":"4","author":"Mondal","year":"2017","journal-title":"Am. J. Comput. Sci. Inf. Eng."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Yao, G., Yilmaz, A., Zhang, L., Meng, F., Ai, H., and Jin, F. (2021). Matching Large Baseline Oblique Stereo Images Using an End-To-End Convolutional Neural Network. Remote Sens., 13.","DOI":"10.3390\/rs13020274"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"19651","DOI":"10.1109\/ACCESS.2021.3050540","article-title":"Bidirectional Stereo Matching Network with Double Cost Volumes","volume":"9","author":"Jia","year":"2021","journal-title":"IEEE Access"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Xu, B., Xu, Y., Yang, X., Jia, W., and Guo, Y. (2021). Bilateral Grid Learning for Stereo Matching Network. arXiv.","DOI":"10.1109\/CVPR46437.2021.01231"},{"key":"ref_6","first-page":"12926","article-title":"Adaptive Unimodal Cost Volume Filtering for Deep Stereo Matching","volume":"34","author":"Zhang","year":"2020","journal-title":"Proc. AAAI Conf. Artif. 
Intell."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"50828","DOI":"10.1109\/ACCESS.2020.2980243","article-title":"A Convolutional Attention Residual Network for Stereo Matching","volume":"8","author":"Huang","year":"2020","journal-title":"IEEE Access"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Pang, J., Sun, W., Ren, J., Yang, C., Yang, Q., and Yan, Q. (2017, January 22\u201329). Cascade Residual Learning: A Two-Stage Convolutional Neural Network for Stereo Matching. Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCV), Venice, Italy.","DOI":"10.1109\/ICCVW.2017.108"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Gu, X., Fan, Z., Zhu, S., Dai, Z., Tan, F., and Tan, P. (2020, January 14\u201319). Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00257"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"G\u00fcney, F., and Geiger, A. (2015, January 7\u201312). Displets: Resolving stereo ambiguities using object knowledge. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299044"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Gidaris, S., and Komodakis, N. (2017, January 21\u201326). Detect, Replace, Refine: Deep Structured Prediction for Pixel Wise Labeling. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.760"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Zhang, F., Prisacariu, V., Yang, R., and Torr, P.H.S. (2019, January 16\u201320). GA-Net: Guided Aggregation Net for End-To-End Stereo Matching. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00027"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Khamis, S., Fanello, S.R., Rhemann, C., Valentin, J., and Izadi, S. (2018, January 8\u201314). StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01267-0_35"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Mei, X., Sun, X., Zhou, M., Jiao, S., Wang, H., and Zhang, X. (2011, January 6\u201313). On building an accurate stereo matching system on graphics hardware. Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCV), Barcelona, Spain.","DOI":"10.1109\/ICCVW.2011.6130280"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Tao, R., Xiang, Y., and You, H. (2020). An Edge-Sense Bidirectional Pyramid Network for Stereo Matching of VHR Remote Sensing Images. Remote Sens., 12.","DOI":"10.3390\/rs12244025"},{"key":"ref_16","first-page":"2287","article-title":"Stereo matching by training a convolutional neural network to compare image patches","volume":"17","author":"Zbontar","year":"2016","journal-title":"J. Mach. Learn. Res."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Luo, W., Schwing, A.G., and Urtasun, R. (2016, January 27\u201330). Efficient Deep Learning for Stereo Matching. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.614"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"328","DOI":"10.1109\/TPAMI.2007.1166","article-title":"Stereo Processing by Semiglobal Matching and Mutual Information","volume":"30","author":"Hirschmuller","year":"2007","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Seki, A., and Pollefeys, M. (2017, January 21\u201326). SGM-Nets: Semi-Global Matching with Neural Networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.703"},{"key":"ref_20","unstructured":"S\u00e9bastien, D., Serge, B., Michel, B., Maxime, M., and Lo\u00efc, S. (2017, January 15\u201317). Sparse Stereo Disparity Map Densification using Hierarchical Image Segmentation. Proceedings of the 13th International Symposium, Fontainebleau, France."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Kendall, A., Martirosyan, H., Dasgupta, S., Henry, P., Kennedy, R., Bachrach, A., and Bry, A. (2017, January 22\u201329). End-to-End Learning of Geometry and Context for Deep Stereo Regression. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.17"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Chang, J., and Chen, Y. (2018, January 18\u201322). Pyramid Stereo Matching Network. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00567"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Song, X., Zhao, X., Hu, H., and Fang, L. (2018, January 2\u20136). EdgeStereo: A Context Integrated Residual Pyramid Network for Stereo Matching. Proceedings of the Asian Conference on Computer Vision (ACCV), Perth, Australia.","DOI":"10.1007\/978-3-030-20873-8_2"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Yang, G., Zhao, H., Shi, J., Deng, Z., and Jia, J. (2018, January 8\u201314). SegStereo: Exploiting Semantic Information for Disparity Estimation. 
Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_39"},{"key":"ref_25","unstructured":"Duggal, S., Wang, S., Ma, W., Hu, R., and Urtasun, R. (November, January 27). DeepPruner: Learning Efficient Stereo Matching via Differentiable PatchMatch. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea."},{"key":"ref_26","unstructured":"Bleyer, M., Rhemann, C., and Rother, C. (September, January 29). PatchMatch Stereo\u2014Stereo Matching with Slanted Support Windows. Proceedings of the British Machine Vision Conference, Dundee, UK."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"2361","DOI":"10.1109\/TPAMI.2019.2947374","article-title":"Learning Depth with Convolutional Spatial Propagation Network","volume":"42","author":"Cheng","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"3885","DOI":"10.1109\/TIP.2019.2903318","article-title":"Segment-Based Disparity Refinement with Occlusion Handling for Stereo Matching","volume":"28","author":"Yan","year":"2019","journal-title":"IEEE Trans. Image Process."},{"key":"ref_29","unstructured":"Wu, Z., Wu, X., Zhang, X., Wang, S., and Ju, L. (November, January 27). Semantic Stereo Matching with Pyramid Cost Volumes. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"4353","DOI":"10.1109\/LRA.2021.3068108","article-title":"PVStereo: Pyramid Voting Module for End-to-End Self-Supervised Stereo Matching","volume":"6","author":"Wang","year":"2021","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Xu, H., and Zhang, J. (2020, January 14\u201319). AANet: Adaptive Aggregation Network for Efficient Stereo Matching. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00203"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Melekhov, I., Kannala, J., and Rahtu, E. (2016, January 4\u20138). Siamese network features for image matching. Proceedings of the 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico.","DOI":"10.1109\/ICPR.2016.7899663"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Wang, Q., Wu, B., Zhu, P., Li, P., Zuo, W., and Hu, Q. (2020, January 14\u201319). ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01155"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Li, Z., Liu, X., Creighton, F., Taylor, R., and Unberath, M. (2020). Revisiting Stereo Depth Estimation from a Sequence-to-Sequence Perspective with Transformers. arXiv.","DOI":"10.1109\/ICCV48922.2021.00614"},{"key":"ref_35","unstructured":"Huang, Z., Wang, X., Wei, Y., Huang, L., Shi, H., Liu, W., and Huang, T.S. (November, January 27). CCNet: Criss-Cross Attention for Semantic Segmentation. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Wang, X., Girshick, R., Gupta, A., and He, K. (2018, January 18\u201322). Non-local Neural Networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00813"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Sun, K., Xiao, B., Liu, D., and Wang, J. (2019, January 16\u201320). Deep High-Resolution Representation Learning for Human Pose Estimation. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00584"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Aleotti, F., Tosi, F., Zhang, L., Poggi, M., and Mattoccia, S. (2020). Reversing the cycle: Self-supervised deep stereo through enhanced monocular distillation. arXiv.","DOI":"10.1007\/978-3-030-58621-8_36"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Mayer, N., Ilg, E., Hausser, P., and Fischer, P. (2016, January 27\u201330). A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.438"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Geiger, A., Lenz, P., and Urtasun, R. (2012, January 16\u201321). Are we ready for autonomous driving? The KITTI vision benchmark suite. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248074"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Menze, M., and Geiger, A. (2015, January 7\u201312). Object scene flow for autonomous vehicles. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298925"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/18\/6016\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:58:59Z","timestamp":1760165939000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/18\/6016"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,8]]},"references-count":41,"journal-issue":{"issue":"18","published-online":{"date-parts":[[2021,9]]}},"alternative-id":["s21186016"],"URL":"https:\/\/doi.org\/10.3390\/s21186016","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2021,9,8]]}}}