{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,25]],"date-time":"2026-02-25T09:17:55Z","timestamp":1772011075014,"version":"3.50.1"},"reference-count":56,"publisher":"MDPI AG","issue":"14","license":[{"start":{"date-parts":[[2022,7,13]],"date-time":"2022-07-13T00:00:00Z","timestamp":1657670400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"INTERREG VA FMA ADAPT"},{"name":"European Regional Development Fund (ERDF)"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>The real-time segmentation of sidewalk environments is critical to achieving autonomous navigation for robotic wheelchairs in urban territories. A robust and real-time video semantic segmentation offers an apt solution for advanced visual perception in such complex domains. The key to this proposition is to have a method with lightweight flow estimations and reliable feature extractions. We address this by selecting an approach based on recent trends in video segmentation. Although these approaches demonstrate efficient and cost-effective segmentation performance in cross-domain implementations, they require additional procedures to put their striking characteristics into practical use. We use our method for developing a visual perception technique to perform in urban sidewalk environments for the robotic wheelchair. We generate a collection of synthetic scenes in a blending target distribution to train and validate our approach. Experimental results show that our method improves prediction accuracy on our benchmark with tolerable loss of speed and without additional overhead. Overall, our technique serves as a reference to transfer and develop perception algorithms for any cross-domain visual perception applications with less downtime.<\/jats:p>","DOI":"10.3390\/s22145241","type":"journal-article","created":{"date-parts":[[2022,7,14]],"date-time":"2022-07-14T00:12:40Z","timestamp":1657757560000},"page":"5241","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Self-Supervised Sidewalk Perception Using Fast Video Semantic Segmentation for Robotic Wheelchairs in Smart Mobility"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4688-481X","authenticated-orcid":false,"given":"Vishnu","family":"Pradeep","sequence":"first","affiliation":[{"name":"Normandie University, UNIROUEN, ESIGELEC, IRSEEM, 76000 Rouen, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6230-2966","authenticated-orcid":false,"given":"Redouane","family":"Khemmar","sequence":"additional","affiliation":[{"name":"Normandie University, UNIROUEN, ESIGELEC, IRSEEM, 76000 Rouen, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8570-3781","authenticated-orcid":false,"given":"Louis","family":"Lecrosnier","sequence":"additional","affiliation":[{"name":"Normandie University, UNIROUEN, ESIGELEC, IRSEEM, 76000 Rouen, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4377-1414","authenticated-orcid":false,"given":"Yann","family":"Duchemin","sequence":"additional","affiliation":[{"name":"Normandie University, UNIROUEN, ESIGELEC, IRSEEM, 76000 Rouen, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5130-4798","authenticated-orcid":false,"given":"Romain","family":"Rossi","sequence":"additional","affiliation":[{"name":"Normandie University, UNIROUEN, ESIGELEC, IRSEEM, 76000 Rouen, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4037-2880","authenticated-orcid":false,"given":"Benoit","family":"Decoux","sequence":"additional","affiliation":[{"name":"Normandie University, UNIROUEN, ESIGELEC, IRSEEM, 76000 Rouen, France"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,7,13]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Eitel, A., Springenberg, J.T., Spinello, L., Riedmiller, M., and Burgard, W. (October, January 28). Multimodal deep learning for robust RGB-D object recognition. Proceedings of the 2015 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany.","DOI":"10.1109\/IROS.2015.7353446"},{"key":"ref_2","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2015). Deep Residual Learning for Image Recognition. arXiv.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., and Bernstein, M. (2014). ImageNet Large Scale Visual Recognition Challenge. arXiv.","DOI":"10.1007\/s11263-015-0816-y"},{"key":"ref_5","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2014). Fully Convolutional Networks for Semantic Segmentation. arXiv.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Oliveira, G.L., Burgard, W., and Brox, T. (2016, January 9\u201314). Efficient deep models for monocular road segmentation. Proceedings of the 2016 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Korea.","DOI":"10.1109\/IROS.2016.7759717"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Chang, Y.H., Chung, P.L., and Lin, H.W. (2018, January 13\u201317). Deep learning for object identification in ROS-based mobile robots. Proceedings of the 2018 IEEE International Conference on Applied System Invention (ICASI), Chiba, Japan.","DOI":"10.1109\/ICASI.2018.8394348"},{"key":"ref_10","unstructured":"Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L.D., Monfort, M., Muller, U., and Zhang, J. (2016). End to End Learning for Self-Driving Cars. arXiv."},{"key":"ref_11","unstructured":"Volpi, R., Namkoong, H., Sener, O., Duchi, J., Murino, V., and Savarese, S. (2018). Generalizing to Unseen Domains via Adversarial Data Augmentation. arXiv."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Volpi, R., Morerio, P., Savarese, S., and Murino, V. (2018, January 18\u201323). Adversarial Feature Augmentation for Unsupervised Domain Adaptation. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00576"},{"key":"ref_13","unstructured":"Morerio, P., Cavazza, J., and Murino, V. (2017). Minimal-Entropy Correlation Alignment for Unsupervised Deep Domain Adaptation. arXiv."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Sun, B., and Saenko, K. (2016). Deep CORAL: Correlation Alignment for Deep Domain Adaptation. arXiv.","DOI":"10.1007\/978-3-319-49409-8_35"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Tzeng, E., Hoffman, J., Saenko, K., and Darrell, T. (2017, January 21\u201326). Adversarial Discriminative Domain Adaptation. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.316"},{"key":"ref_16","unstructured":"Ganin, Y., and Lempitsky, V. (2014). Unsupervised Domain Adaptation by Backpropagation. arXiv."},{"key":"ref_17","unstructured":"(2022, June 02). Office Website ADAPT Project. Available online: http:\/\/adapt-project.com."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Wang, H., Jiang, X., Ren, H., Hu, Y., and Bai, S. (2021). SwiftNet: Real-time Video Object Segmentation. arXiv.","DOI":"10.1109\/CVPR46437.2021.00135"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Yu, C., Wang, J., Peng, C., Gao, C., Yu, G., and Sang, N. (2018). BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation. arXiv.","DOI":"10.1007\/978-3-030-01261-8_20"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Lee, S.P., Chen, S.C., and Peng, W.H. (2021). GSVNet: Guided Spatially-Varying Convolution for Fast Semantic Segmentation on Video. arXiv.","DOI":"10.1109\/ICME51207.2021.9428381"},{"key":"ref_21","first-page":"353","article-title":"Adequacy of power wheelchair control interfaces for persons with severe disabilities a clinical survey","volume":"37","author":"Fehr","year":"2000","journal-title":"J. Rehabil. Res. Dev."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1179\/otb.2014.69.1.012","article-title":"User perspectives on assistive technology: A qualitative analysis of 55 letters from citizens applying for assistive technology","volume":"69","author":"Jensen","year":"2014","journal-title":"World Fed. Occup. Ther. Bull."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Gong, K., and Green, R. (2009, January 23\u201325). Ground-plane detection using stereo depth values for wheelchair guidance. Proceedings of the 2009 24th International Conference Image and Vision Computing New Zealand, Wellington, New Zealand.","DOI":"10.1109\/IVCNZ.2009.5378352"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Dayangac, E., and Hirtz, G. (2014, January 7\u201310). Object recognition for human behavior analysis. Proceedings of the 2014 IEEE Fourth International Conference on Consumer Electronics Berlin (ICCE-Berlin), Berlin, Germany.","DOI":"10.1109\/ICCE-Berlin.2014.7034218"},{"key":"ref_25","unstructured":"Matsumototi, Y., Inot, T., and Ogsawara, T. (2001, January 18\u201321). Development of intelligent wheelchair system with face and gaze based interface. Proceedings of the 10th IEEE International Workshop on Robot and Human Interactive Communication, ROMAN 2001 (Cat. No.01TH8591), Paris, France."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Manta, L.F., Cojocaru, D., Vladu, I.C., Dragomir, A., and Mariniuc, A.M. (2019, January 26\u201329). Wheelchair control by head motion using a noncontact method in relation to the pacient. Proceedings of the 2019 20th International Carpathian Control Conference (ICCC), Krakow-Wieliczka, Poland.","DOI":"10.1109\/CarpathianCC.2019.8765982"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Arora, P., Sharma, A., Soni, A.S., and Garg, A. (2015, January 17\u201320). Control of wheelchair dummy for differently abled patients via iris movement using image processing in MATLAB. Proceedings of the 2015 Annual IEEE India Conference (INDICON), New Delhi, India.","DOI":"10.1109\/INDICON.2015.7443610"},{"key":"ref_28","unstructured":"Rascanu, G.C., and Solea, R. (2011, January 14\u201316). Electric wheelchair control for people with locomotor disabilities using eye movements. Proceedings of the 15th International Conference on System Theory, Control and Computing, Sinaia, Romania."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Akanto, J.M., Islam, M.K., Hakim, A., Sojun, M.A.H., and Shikder, K. (2021, January 5\u20137). Eye Pupil Controlled Transport Riding Wheelchair. Proceedings of the 2021 2nd International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST), Dhaka, Bangladesh.","DOI":"10.1109\/ICREST51555.2021.9331133"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Banerjee, C., Gupta, H., and Sushobhan, K. (2010, January 26\u201328). Low cost speech and vision based wheel chair for physically challenged. Proceedings of the 2010 The 2nd International Conference on Computer and Automation Engineering (ICCAE), Singapore.","DOI":"10.1109\/ICCAE.2010.5451281"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Li, Z., Xiong, Y., and Zhou, L. (2017, January 9\u201310). ROS-Based Indoor Autonomous Exploration and Navigation Wheelchair. Proceedings of the 2017 10th International Symposium on Computational Intelligence and Design (ISCID), Hangzhou, China.","DOI":"10.1109\/ISCID.2017.55"},{"key":"ref_32","unstructured":"Viswanathan, P., Little, J., Mackworth, A.K., and Mihailidis, A. (2011, January 26). Adaptive navigation assistance for visually-impaired wheelchair users. Proceedings of the IROS 2011 Workshop on New and Emerging Technologies in Assistive Robotics, San Francisco, CA, USA."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"4386","DOI":"10.1109\/LRA.2019.2932874","article-title":"Self-Supervised Drivable Area and Road Anomaly Segmentation Using RGB-D Data For Robotic Wheelchairs","volume":"4","author":"Wang","year":"2019","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"102680","DOI":"10.1109\/ACCESS.2020.2998427","article-title":"Lightweight Semantic Segmentation for Road-Surface Damage Recognition Based on Multiscale Learning","volume":"8","author":"Shim","year":"2020","journal-title":"IEEE Access"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Sakai, Y., Nakayama, Y., Lu, H., Li, Y., and Kim, H. (2019, January 15\u201318). Recognition of Surrounding Environment for Electric Wheelchair Based on WideSeg. Proceedings of the 2019 19th International Conference on Control, Automation and Systems (ICCAS), Jeju, Korea.","DOI":"10.23919\/ICCAS47443.2019.8971608"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Zagoruyko, S., and Komodakis, N. (2016). Wide Residual Networks. arXiv.","DOI":"10.5244\/C.30.87"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"He, Y., Chiu, W.C., Keuper, M., and Fritz, M. (2016). STD2P: RGBD Semantic Segmentation Using Spatio-Temporal Data-Driven Pooling. arXiv.","DOI":"10.1109\/CVPR.2017.757"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Kundu, A., Vineet, V., and Koltun, V. (2016, January 27\u201330). Feature Space Optimization for Semantic Video Segmentation. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.345"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Tripathi, S., Belongie, S., Hwang, Y., and Nguyen, T. (2015, January 2\u20135). Semantic video segmentation: Exploring inference efficiency. Proceedings of the 2015 International SoC Design Conference (ISOCC), Gyeongju, Korea.","DOI":"10.1109\/ISOCC.2015.7401766"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Gadde, R., Jampani, V., and Gehler, P.V. (2017). Semantic Video CNNs through Representation Warping. arXiv.","DOI":"10.1109\/ICCV.2017.477"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Jin, X., Li, X., Xiao, H., Shen, X., Lin, Z., Yang, J., Chen, Y., Dong, J., Liu, L., and Jie, Z. (2016). Video Scene Parsing with Predictive Feature Learning. arXiv.","DOI":"10.1109\/ICCV.2017.595"},{"key":"ref_42","unstructured":"Nilsson, D., and Sminchisescu, C. (2016). Semantic Video Segmentation by Gated Recurrent Flow Propagation. arXiv."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Zhu, X., Xiong, Y., Dai, J., Yuan, L., and Wei, Y. (2016). Deep Feature Flow for Video Recognition. arXiv.","DOI":"10.1109\/CVPR.2017.441"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Shelhamer, E., Rakelly, K., Hoffman, J., and Darrell, T. (2016). Clockwork Convnets for Video Semantic Segmentation. arXiv.","DOI":"10.1007\/978-3-319-49409-8_69"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Li, Y., Shi, J., and Lin, D. (2018). Low-Latency Video Semantic Segmentation. arXiv.","DOI":"10.1109\/CVPR.2018.00628"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Jain, S., Wang, X., and Gonzalez, J. (2018). Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video. arXiv.","DOI":"10.1109\/CVPR.2019.00907"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Carreira, J., Patraucean, V., Mazare, L., Zisserman, A., and Osindero, S. (2018). Massively Parallel Video Networks. arXiv.","DOI":"10.1007\/978-3-030-01225-0_40"},{"key":"ref_48","unstructured":"Yu, F., and Koltun, V. (2015). Multi-Scale Context Aggregation by Dilated Convolutions. arXiv."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs","volume":"40","author":"Chen","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_50","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., and Schiele, B. (2016, January 27\u201330). The Cityscapes Dataset for Semantic Urban Scene Understanding. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.350"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Brostow, G.J., Shotton, J., Fauqueur, J., and Cipolla, R. (2008, January 12\u201318). Segmentation and Recognition Using Structure from Motion Point Clouds. Proceedings of the ECCV (1), Marseille, France.","DOI":"10.1007\/978-3-540-88682-2_5"},{"key":"ref_53","unstructured":"Dosovitskiy, A., Ros, G., Codevilla, F., L\u00f3pez, A., and Koltun, V. (2017, January 13\u201315). CARLA: An Open Urban Driving Simulator. Proceedings of the 1st Annual Conference on Robot Learning, Mountain View, CA, USA."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1007\/s11263-014-0733-5","article-title":"The Pascal Visual Object Classes Challenge: A Retrospective","volume":"111","author":"Everingham","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_55","unstructured":"Wallach, H., Larochelle, H., Beygelzimer, A., d\u2019Alch\u00e9-Buc, F., Fox, E., and Garnett, R. (2019). PyTorch: An Imperative Style, High-Performance Deep Learning Library. Advances in Neural Information Processing Systems 32, Curran Associates, Inc."},{"key":"ref_56","unstructured":"(2022, June 02). CRIANN. Available online: https:\/\/www.criann.fr\/."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/14\/5241\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:49:40Z","timestamp":1760140180000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/14\/5241"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,13]]},"references-count":56,"journal-issue":{"issue":"14","published-online":{"date-parts":[[2022,7]]}},"alternative-id":["s22145241"],"URL":"https:\/\/doi.org\/10.3390\/s22145241","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,7,13]]}}}