{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T14:31:56Z","timestamp":1773930716057,"version":"3.50.1"},"reference-count":35,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2023,2,15]],"date-time":"2023-02-15T00:00:00Z","timestamp":1676419200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Stairs are common vertical traffic structures in buildings, and stair detection tasks are important in environmental perception for autonomous mobile robots. Most existing algorithms have difficulty combining the visual information from binocular sensors effectively and ensuring reliable detection at night and in the case of extremely fuzzy visual clues. To solve these problems, we propose a stair detection network with red-green-blue (RGB) and depth inputs. Specifically, we design a selective module, which can make the network learn the complementary relationship between the RGB feature maps and the depth feature maps and fuse the features effectively in different scenes. In addition, we propose several postprocessing algorithms, including a stair line clustering algorithm and a coordinate transformation algorithm, to obtain the stair geometric parameters. Experiments show that our method has better performance than the existing state-of-the-art deep learning method, and the accuracy, recall, and runtime are improved by 5.64%, 7.97%, and 3.81 ms, respectively. The improved indexes show the effectiveness of the multimodal inputs and the selective module. The estimation values of stair geometric parameters have root mean square errors within 15 mm when ascending stairs and 25 mm when descending stairs. 
Our method also has extremely fast detection speed, which can meet the requirements of most real-time applications.<\/jats:p>","DOI":"10.3390\/s23042175","type":"journal-article","created":{"date-parts":[[2023,2,15]],"date-time":"2023-02-15T02:31:23Z","timestamp":1676428283000},"page":"2175","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["RGB-D-Based Stair Detection and Estimation Using Deep Learning"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6798-0312","authenticated-orcid":false,"given":"Chen","family":"Wang","sequence":"first","affiliation":[{"name":"School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7748-8591","authenticated-orcid":false,"given":"Zhongcai","family":"Pei","sequence":"additional","affiliation":[{"name":"School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2214-5089","authenticated-orcid":false,"given":"Shuang","family":"Qiu","sequence":"additional","affiliation":[{"name":"School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3974-1271","authenticated-orcid":false,"given":"Zhiyong","family":"Tang","sequence":"additional","affiliation":[{"name":"School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,2,15]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1016\/j.jvcir.2013.11.005","article-title":"RGB-D image-based detection of stairs, pedestrian crosswalks and traffic signs","volume":"10","author":"Wang","year":"2014","journal-title":"J. Vis. Commun. 
Image Represent."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Krausz, N.E., and Hargrove, L.J. (2015, January 22\u201324). Recognition of ascending stairs from 2D images for control of powered lower limb prostheses. Proceedings of the 2015 7th International IEEE\/EMBS Conference on Neural Engineering, Montpellier, France.","DOI":"10.1109\/NER.2015.7146698"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Harms, H., Rehder, E., Schwarze, T., and Lauer, M. (October, January 28). Detection of ascending stairs using stereo vision. Proceedings of the 2015 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany.","DOI":"10.1109\/IROS.2015.7353716"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"16124","DOI":"10.1038\/s41598-022-20667-w","article-title":"Deep leaning-based ultra-fast stair detection","volume":"12","author":"Wang","year":"2022","journal-title":"Sci. Rep."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Westfechtel, T., Ohno, K., Mertsching, B., Nickchen, D., Kojima, S., and Tadokoro, S. (2016, January 9\u201314). 3D graph based stairway detection and localization for mobile robots. Proceedings of the 2016 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Republic of Korea.","DOI":"10.1109\/IROS.2016.7759096"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Zhao, X., Chen, W., Yan, X., Wang, J., and Wu, X. (2018, January 9\u201311). Real-Time Stairs Geometric Parameters Estimation for Lower Limb Rehabilitation Exoskeleton. Proceedings of the 2018 Chinese Control And Decision Conference (CCDC), Shenyang, China.","DOI":"10.1109\/CCDC.2018.8408001"},{"key":"ref_7","first-page":"327","article-title":"Staircase Detection to Guide Visually Impaired People: A Hybrid Approach","volume":"33","author":"Habib","year":"2019","journal-title":"Rev. 
D\u2019Intelligence Artif."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Patil, U., Gujarathi, A., Kulkarni, A., Jain, A., Malke, L., Tekade, R., Paigwar, K., and Chaturvedi, P. (2019, January 25\u201327). Deep Learning Based Stair Detection and Statistical Image Filtering for Autonomous Stair Climbing. Proceedings of the 2019 Third IEEE International Conference on Robotic Computing (IRC), Naples, Italy.","DOI":"10.1109\/IRC.2019.00031"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Yu, S.-H., Yang, B.-R., Lee, H.-H., and Tanaka, E. (2021, January 11\u201314). A Ground-Stair Walking Strategy of the Assistive Device Based on the RGB-D Camera. Proceedings of the 2021 IEEE\/SICE International Symposium on System Integration (SII), Iwaki, Japan.","DOI":"10.1109\/IEEECONF49454.2021.9382668"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Khaliluzzaman, M., Deb, K., and Jo, K.-H. (2018, January 21\u201323). Geometrical Feature Based Stairways Detection and Recognition Using Depth Sensor. Proceedings of the IECON 2018\u201444th Annual Conference of the IEEE Industrial Electronics Society, Washington, DC, USA.","DOI":"10.1109\/IECON.2018.8591340"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Murakami, S., Shimakawa, M., Kivota, K., and Kato, T. (2014, January 3\u20136). Study on stairs detection using RGB-depth images. Proceedings of the 2014 Joint 7th International Conference on Soft Computing and Intelligent Systems (SCIS) and 15th International Symposium on Advanced Intelligent Systems (ISIS), Kitakyushu, Japan.","DOI":"10.1109\/SCIS-ISIS.2014.7044705"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Khaliluzzaman, M., Yakub, M., and Chakraborty, N. (2018, January 27\u201328). Comparative Analysis of Stairways Detection Based on RGB and RGB-D Image. 
Proceedings of the 2018 International Conference on Innovations in Science, Engineering and Technology (ICISET), Chittagong, Bangladesh.","DOI":"10.1109\/ICISET.2018.8745624"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1007\/978-3-031-17576-3_4","article-title":"Salak Image Classification Method Based Deep Learning Technique Using Two Transfer Learning Models","volume":"Volume 1071","author":"Abualigah","year":"2023","journal-title":"Classification Applications with Deep Learning and Machine Learning Technologies"},{"key":"ref_14","unstructured":"Wang, J., and Zhang, K. (2019). Unsupervised Domain Adaptation Learning Algorithm for RGB-D Staircase Recognition. arXiv."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Takahashi, M., Ji, Y., Umeda, K., and Moro, A. (2020, January 9\u201311). Expandable YOLO: 3D Object Detection from RGB-D Images. Proceedings of the 2020 21st International Conference on Research and Education in Mechatronics (REM), Cracow, Poland.","DOI":"10.1109\/REM49740.2020.9313886"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"5001512","DOI":"10.1109\/TIM.2022.3145388","article-title":"Image Segmentation of Cabin Assembly Scene Based on Improved RGB-D Mask R-CNN","volume":"71","author":"Fu","year":"2022","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Kumar, A., Shrivatsav, S.N., Subrahmanyam, G.R.K.S., and Mishra, D. (2016, January 21\u201324). Application of transfer learning in RGB-D object recognition. Proceedings of the 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Jaipur, India.","DOI":"10.1109\/ICACCI.2016.7732108"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Sharma, P., and Valles, D. (2020, January 28\u201331). Backbone Neural Network Design of Single Shot Detector from RGB-D Images for Object Detection. 
Proceedings of the 2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), New York, NY, USA.","DOI":"10.1109\/UEMCON51285.2020.9298175"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1109\/TCDS.2018.2866587","article-title":"Canonical Correlation Analysis Regularization: An Effective Deep Multiview Learning Baseline for RGB-D Object Recognition","volume":"11","author":"Tang","year":"2019","journal-title":"IEEE Trans. Cogn. Dev. Syst."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1887","DOI":"10.1109\/TMM.2015.2476655","article-title":"Large-Margin Multi-Modal Deep Learning for RGB-D Object Recognition","volume":"17","author":"Wang","year":"2015","journal-title":"IEEE Trans. Multimed."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Eitel, A., Springenberg, J.T., Spinello, L., Riedmiller, M., and Burgard, W. (October, January 28). Multimodal deep learning for robust RGB-D object recognition. Proceedings of the 2015 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany.","DOI":"10.1109\/IROS.2015.7353446"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"5558","DOI":"10.1109\/LRA.2020.3007457","article-title":"Real-Time Fusion Network for RGB-D Semantic Segmentation Incorporating Unexpected Obstacle Detection for Road-Driving Images","volume":"5","author":"Sun","year":"2020","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Barchid, S., Mennesson, J., and Dj\u00e9raba, C. (2021, January 28\u201330). Review on Indoor RGB-D Semantic Segmentation with Deep Convolutional Neural Networks. Proceedings of the 2021 International Conference on Content-Based Multimedia Indexing (CBMI), Lille, France.","DOI":"10.1109\/CBMI50038.2021.9461875"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Aakerberg, A., Nasrollahi, K., and Heder, T. (December, January 28). 
Improving a deep learning based RGB-D object recognition model by ensemble learning. Proceedings of the 2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA), Montreal, QC, Canada.","DOI":"10.1109\/IPTA.2017.8310101"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zheng, Z., Xie, D., Chen, C., and Zhu, Z. (November, January 30). Multi-resolution Cascaded Network with Depth-similar Residual Module for Real-time Semantic Segmentation on RGB-D Images. Proceedings of the 2020 IEEE International Conference on Networking, Sensing and Control (ICNSC), Nanjing, China.","DOI":"10.1109\/ICNSC48988.2020.9238079"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs","volume":"40","author":"Chen","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"2011","DOI":"10.1109\/TPAMI.2019.2913372","article-title":"Squeeze-and-Excitation Networks","volume":"42","author":"Hu","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Xie, S., Girshick, R., Doll\u00e1r, P., Tu, Z., and He, K. (2017, January 21\u201326). Aggregated Residual Transformations for Deep Neural Networks. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.634"},{"key":"ref_29","unstructured":"Ultralytics (2020, April 01). YOLOv5. Available online: https:\/\/github.com\/ultralytics\/yolov5."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Khaliluzzaman, M., Deb, K., and Jo, K.-H. (2016, January 6\u20138). Stairways detection and distance estimation approach based on three connected point and triangular similarity. 
Proceedings of the 2016 9th International Conference on Human System Interactions (HSI), Portsmouth, UK.","DOI":"10.1109\/HSI.2016.7529653"},{"key":"ref_31","first-page":"24","article-title":"An Algorithm for String Searching Based on Brute-Force Algorithm","volume":"11","author":"Abdeen","year":"2011","journal-title":"Int. J. Comput. Sci. Netw. Secur."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"105798","DOI":"10.1016\/j.optlaseng.2019.105798","article-title":"Robust three-dimensional face reconstruction by one-shot structured light line pattern","volume":"124","author":"Wang","year":"2020","journal-title":"Opt. Lasers Eng."},{"key":"ref_33","first-page":"84","article-title":"Three-dimensional point piecewise linear fitting method based on least square method","volume":"31","author":"Xue","year":"2015","journal-title":"J. Qiqihar Univ."},{"key":"ref_34","unstructured":"(2013, December 18). Depth Camera D435i. Available online: https:\/\/www.intelrealsense.com\/depth-camera-d435i\/."},{"key":"ref_35","unstructured":"Garcia, G.A., Escolano, O.S., Oprea, S., Martinez, V.V., and Rodriguez, G.J. (2017). A Review on Deep Learning Techniques Applied to Semantic Segmentation. 
arXiv."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/4\/2175\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:36:01Z","timestamp":1760121361000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/4\/2175"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,15]]},"references-count":35,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2023,2]]}},"alternative-id":["s23042175"],"URL":"https:\/\/doi.org\/10.3390\/s23042175","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,15]]}}}