{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,29]],"date-time":"2025-12-29T18:59:00Z","timestamp":1767034740330,"version":"build-2065373602"},"reference-count":36,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2021,2,18]],"date-time":"2021-02-18T00:00:00Z","timestamp":1613606400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["No. 2018YFB0204301"],"award-info":[{"award-number":["No. 2018YFB0204301"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"name":"General Program of National Natural Science Foundation of China","award":["81973244"],"award-info":[{"award-number":["81973244"]}]},{"name":"Science and Technology Program Projects of Shenzhen","award":["JCYJ20170818110101726"],"award-info":[{"award-number":["JCYJ20170818110101726"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Stereo matching is an important research field of computer vision. Due to the dimension of cost aggregation, current neural network-based stereo methods are difficult to trade-off speed and accuracy. To this end, we integrate fast 2D stereo methods with accurate 3D networks to improve performance and reduce running time. We leverage a 2D encoder-decoder network to generate a rough disparity map and construct a disparity range to guide the 3D aggregation network, which can significantly improve the accuracy and reduce the computational cost. We use a stacked hourglass structure to refine the disparity from coarse to fine. We evaluated our method on three public datasets. According to the KITTI official website results, Our network can generate an accurate result in 80 ms on a modern GPU. Compared to other 2D stereo networks (AANet, DeepPruner, FADNet, etc.), our network has a big improvement in accuracy. Meanwhile, it is significantly faster than other 3D stereo networks (5\u00d7 than PSMNet, 7.5\u00d7 than CSN and 22.5\u00d7 than GANet, etc.), demonstrating the effectiveness of our method.<\/jats:p>","DOI":"10.3390\/s21041430","type":"journal-article","created":{"date-parts":[[2021,2,18]],"date-time":"2021-02-18T21:59:58Z","timestamp":1613685598000},"page":"1430","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["A Joint 2D-3D Complementary Network for Stereo Matching"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8068-3635","authenticated-orcid":false,"given":"Xiaogang","family":"Jia","sequence":"first","affiliation":[{"name":"College of Computer, National University of Defense Technology, Changsha 410073, China"}]},{"given":"Wei","family":"Chen","sequence":"additional","affiliation":[{"name":"College of Computer, National University of Defense Technology, Changsha 410073, China"}]},{"given":"Zhengfa","family":"Liang","sequence":"additional","affiliation":[{"name":"College of Computer, National University of Defense Technology, Changsha 410073, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1641-5713","authenticated-orcid":false,"given":"Xin","family":"Luo","sequence":"additional","affiliation":[{"name":"College of Computer, National University of Defense Technology, Changsha 410073, China"}]},{"given":"Mingfei","family":"Wu","sequence":"additional","affiliation":[{"name":"College of Computer, National University of Defense Technology, Changsha 410073, China"}]},{"given":"Chen","family":"Li","sequence":"additional","affiliation":[{"name":"College of Computer, National University of Defense Technology, Changsha 410073, China"}]},{"given":"Yulin","family":"He","sequence":"additional","affiliation":[{"name":"College of Computer, National University of Defense Technology, Changsha 410073, China"}]},{"given":"Yusong","family":"Tan","sequence":"additional","affiliation":[{"name":"College of Computer, National University of Defense Technology, Changsha 410073, China"}]},{"given":"Libo","family":"Huang","sequence":"additional","affiliation":[{"name":"College of Computer, National University of Defense Technology, Changsha 410073, China"}]}],"member":"1968","published-online":{"date-parts":[[2021,2,18]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Li, P., Chen, X., and Shen, S. (2019, January 16\u201320). Stereo r-cnn based 3d object detection for autonomous driving. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00783"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Geiger, A., Ziegler, J., and Stiller, C. (2011, January 5\u20139). Stereoscan: Dense 3d reconstruction in real-time. Proceedings of the 2011 IEEE Intelligent Vehicles Symposium (IV), Baden-Baden, Germany.","DOI":"10.1109\/IVS.2011.5940405"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"3739","DOI":"10.1016\/j.proeng.2011.08.700","article-title":"A human-machine interaction technique: Hand gesture recognition based on hidden Markov models with trajectory of hand motion","volume":"15","author":"Kao","year":"2011","journal-title":"Procedia Eng."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"298","DOI":"10.1016\/j.measurement.2019.05.004","article-title":"Model based design of a stereo vision system for intelligent deep-sea operations","volume":"144","author":"Pehle","year":"2019","journal-title":"Measurement"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1109\/MC.2008.479","article-title":"Autonomy for mars rovers: Past, present, and future","volume":"41","author":"Bajracharya","year":"2008","journal-title":"Computer"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Hrabar, S., Sukhatme, G.S., Corke, P., Usher, K., and Roberts, J. (2005, January 2\u20136). Combined optic-flow and stereo-based navigation of urban canyons for a UAV. Proceedings of the 2005 IEEE\/RSJ International Conference on Intelligent Robots and Systems, Edmonton, AB, Canada.","DOI":"10.1109\/IROS.2005.1544998"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Zeng, K., Ning, M., Wang, Y., and Guo, Y. (2020, January 14\u201319). Hierarchical clustering with hard-batch triplet loss for person re-identification. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Virtual Conference, Seattle, DC, USA.","DOI":"10.1109\/CVPR42600.2020.01367"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"103913","DOI":"10.1016\/j.imavis.2020.103913","article-title":"Energy clustering for unsupervised person re-identification","volume":"98","author":"Zeng","year":"2020","journal-title":"Image Vis. Comput."},{"key":"ref_9","unstructured":"Kanade, T., Kano, H., Kimura, S., Yoshida, A., and Oda, K. (1995, January 5\u20139). Development of a video-rate stereo machine. Proceedings of the 1995 IEEE\/RSJ International Conference on Intelligent Robots and Systems, Human Robot Interaction and Cooperative Robots, Pittsburgh, PA, USA."},{"key":"ref_10","unstructured":"Kim, J., and Kolmogorov, Z. (2003, January 13\u201316). Visual correspondence using energy minimization and mutual information. Proceedings of the Ninth IEEE International Conference on Computer Vision, Nice, France."},{"key":"ref_11","unstructured":"Heo, Y.S., Lee, K.M., and Lee, S.U. (October, January 27). Mutual Information as a Stereo Correspondence Measure. Proceedings of the IEEE International Conference on Computer Vision, Kyoto, Japan."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Ma, L., Li, J., Ma, J., and Zhang, H. (2013, January 26\u201328). A Modified Census Transform Based on the Neighborhood Information for Stereo Matching Algorithm. Proceedings of the 2013 Seventh International Conference on Image and Graphics, Qingdao, China.","DOI":"10.1109\/ICIG.2013.113"},{"key":"ref_13","unstructured":"Balk, Y.K., Jo, J.H., and LEE, K.M. (2006, January 2\u20133). Fast Census Transform-based Stereo Algorithm using SSE2. Proceedings of the 12th Korea-Japan Joint Workshop on Frontiers of Computer Vision, Tokushima, Japan."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1230","DOI":"10.1016\/j.patrec.2008.01.032","article-title":"Local stereo matching with adaptive support-weight, rank transform and disparity calibration","volume":"29","author":"Gu","year":"2008","journal-title":"Pattern Recognit. Lett."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Banks, J., Bennamoun, M., Kubik, K., and Corke, P. (1999, January 15\u201319). A constraint to improve the reliability of stereo matching using the rank transform. Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP99), Phoenix, AZ, USA.","DOI":"10.1109\/ICASSP.1999.757552"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"401","DOI":"10.1109\/34.677269","article-title":"A pixel dissimilarity measure that is insensitive to image sampling","volume":"20","author":"Birchfield","year":"1998","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_17","first-page":"1","article-title":"Stereo matching by training a convolutional neural network to compare image patches","volume":"17","author":"Zbontar","year":"2016","journal-title":"Mach. Learn. Res."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Xu, H., and Zhang, J. (2020, January 14\u201319). AANet: Adaptive Aggregation Network for Efficient Stereo Matching. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Virtual Conference, Seattle, DC, USA.","DOI":"10.1109\/CVPR42600.2020.00203"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Lee, H., and Shin, Y. (2019, January 22\u201325). Real-Time Stereo Matching Network with High Accuracy. Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan.","DOI":"10.1109\/ICIP.2019.8803514"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Wang, Q., Shi, S., Zheng, S., Zhao, K., and Chu, X. (2020). FADNet: A Fast and Accurate Network for Disparity Estimation. arXiv.","DOI":"10.1109\/ICRA40945.2020.9197031"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Yee, K., and Chakrabarti, A. (2020, January 1\u20135). Fast Deep Stereo with 2D Convolutional Processing of Cost Signatures. Proceedings of the IEEE Winter Conference on Applications of Computer Vision, Snowmass Village, CO, USA.","DOI":"10.1109\/WACV45572.2020.9093273"},{"key":"ref_22","unstructured":"Mayer, N., Ilg, E., and Hausser, P. (July, January 26). A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_23","unstructured":"Duggal, S., Wang, S., Ma, W.C., Hu, R., and Urtasun, R. (November, January 27). Deeppruner: Learning efficient stereo matching via differentiable patchmatch. Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Kendall, A., Martirosyan, H., and Dasgupta, S. (2017, January 22\u201329). End-to-end learning of geometry and context for deep stereo regression. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.17"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Chang, J.R., and Chen, Y.S. (2018, January 18\u201322). Pyramid stereo matching network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00567"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Gu, X., Fan, Z., Zhu, S., Dai, Z., Tan, F., and Tan, P. (2020, January 14\u201319). Cascade cost volume for high-resolution multi-view stereo and stereo matching. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Virtual Conference, Seattle, DC, USA.","DOI":"10.1109\/CVPR42600.2020.00257"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Zhang, F., Prisacariu, V., Yang, R., and Torr, P.H. (2019, January 16\u201320). Ga-net: Guided aggregation net for end-to-end stereo matching. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00027"},{"key":"ref_28","unstructured":"Scharstein, D., Szeliski, R., and Zabih, R. (2001, January 9\u201310). A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. Proceedings of the IEEE Workshop on Stereo and Multi-Baseline Vision (SMBV 2001), Kauai, HI, USA."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 21\u201326). Pyramid scene parsing network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Guo, X., Yang, K., Yang, W., Wang, X., and Li, H. (2019, January 16\u201320). Group-wise correlation stereo network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00339"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"E18","DOI":"10.1017\/ATSIP.2020.16","article-title":"NLCA-Net: A non-local context attention network for stereo matching","volume":"9","author":"Rao","year":"2020","journal-title":"APSIPA Trans. Signal Inf. Process."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"177823","DOI":"10.1109\/ACCESS.2020.3027205","article-title":"3D Correspondence and Point Projection Method for Structures Deformation Analysis","volume":"8","author":"Melo","year":"2020","journal-title":"IEEE Access"},{"key":"ref_34","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Geiger, A., Lenz, P., and Urtasun, R. (2012, January 16\u201321). Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248074"},{"key":"ref_36","unstructured":"Menze, M., Heipke, C., and Geiger, A. (October, January 28). Joint 3D Estimation of Vehicles and Scene Flow. Proceedings of the ISPRS Geospatial Week, La Grande Motte, France."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/4\/1430\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:25:47Z","timestamp":1760160347000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/4\/1430"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,18]]},"references-count":36,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2021,2]]}},"alternative-id":["s21041430"],"URL":"https:\/\/doi.org\/10.3390\/s21041430","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2021,2,18]]}}}