{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,6]],"date-time":"2025-11-06T20:05:28Z","timestamp":1762459528263,"version":"build-2065373602"},"reference-count":27,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2017,11,16]],"date-time":"2017-11-16T00:00:00Z","timestamp":1510790400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>This paper describes three coarse image description strategies, which are meant to promote a rough perception of surrounding objects for visually impaired individuals, with application to indoor spaces. The described algorithms operate on images (grabbed by the user, by means of a chest-mounted camera), and provide in output a list of objects that likely exist in his context across the indoor scene. In this regard, first, different colour, texture, and shape-based feature extractors are generated, followed by a feature learning step by means of AutoEncoder (AE) models. Second, the produced features are fused and fed into a multilabel classifier in order to list the potential objects. The conducted experiments point out that fusing a set of AE-learned features scores higher classification rates with respect to using the features individually. Furthermore, with respect to reference works, our method: (i) yields higher classification accuracies, and (ii) runs (at least four times) faster, which enables a potential full real-time application.<\/jats:p>","DOI":"10.3390\/s17112641","type":"journal-article","created":{"date-parts":[[2017,11,16]],"date-time":"2017-11-16T11:10:02Z","timestamp":1510830602000},"page":"2641","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["Real-Time Indoor Scene Description for the Visually Impaired Using Autoencoder Fusion Strategies with Visible Cameras"],"prefix":"10.3390","volume":"17","author":[{"given":"Salim","family":"Malek","sequence":"first","affiliation":[{"name":"Department of Information Engineering and Computer Science, University of Trento, Via Sommarive 9, I-38123 Trento, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9745-3732","authenticated-orcid":false,"given":"Farid","family":"Melgani","sequence":"additional","affiliation":[{"name":"Department of Information Engineering and Computer Science, University of Trento, Via Sommarive 9, I-38123 Trento, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mohamed","family":"Mekhalfi","sequence":"additional","affiliation":[{"name":"Department of Information Engineering and Computer Science, University of Trento, Via Sommarive 9, I-38123 Trento, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9287-0596","authenticated-orcid":false,"given":"Yakoub","family":"Bazi","sequence":"additional","affiliation":[{"name":"College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2017,11,16]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Pundlik, S., Tomasi, M., and Luo, G. (2013, January 23\u201328). Collision Detection for Visually Impaired from a Body-Mounted Camera. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Portland, OR, USA.","DOI":"10.1109\/CVPRW.2013.11"},{"key":"ref_2","unstructured":"Balakrishnan, G., Sainarayanan, G., Nagarajan, R., and Yaacob, S. (2004, January 18\u201320). Stereopsis method for visually impaired to identify obstacles based on distance. Proceedings of the 3rd International Conference on Image and Graphics (ICIG\u201904), Hong Kong, China."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1109\/3468.911370","article-title":"The GuideCane-applying mobile robot technologies to assist the visually impaired","volume":"31","author":"Ulrich","year":"2001","journal-title":"IEEE Trans. Syst. Man Cybern. Part Syst. Hum."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"459","DOI":"10.1109\/5326.704589","article-title":"Auditory guidance with the Navbelt-a computerized travel aid for the blind","volume":"28","author":"Shoval","year":"1998","journal-title":"IEEE Trans. Syst. Man Cybern. Part C Appl. Rev."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1007\/s10846-011-9555-7","article-title":"A Navigation Aid for Blind People","volume":"64","author":"Bettayeb","year":"2011","journal-title":"J. Intell. Robot. Syst."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"3047","DOI":"10.1109\/TIM.2012.2202169","article-title":"Experimental Investigation of Electromagnetic Obstacle Detection for Visually Impaired Users: A Comparison with Ultrasonic Sensing","volume":"61","author":"Scalise","year":"2012","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"L\u00f3pez-De-Ipi\u00f1a, D., Lorido, T., and L\u00f3pez, U. (2011, January 20\u201322). BlindShopping: Enabling Accessible Shopping for Visually Impaired People through Mobile Technologies. Proceedings of the Toward Useful Services for Elderly and People with Disabilities, Montreal, QC, Canada.","DOI":"10.1007\/978-3-642-21535-3_39"},{"key":"ref_8","unstructured":"Pan, H., Yi, C., and Tian, Y. (2013, January 15\u201319). A primary travelling assistant system of bus detection and recognition for visually impaired people. Proceedings of the IEEE International Conference on Multimedia and Expo Workshops (ICMEW), San Jose, CA, USA."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1021","DOI":"10.1109\/TSMCC.2011.2178120","article-title":"Robust and Effective Component-Based Banknote Recognition for the Blind","volume":"42","author":"Hasanuzzaman","year":"2012","journal-title":"IEEE Trans. Syst. Man Cybern. Part C Appl. Rev."},{"key":"ref_10","unstructured":"Tang, T.J.J., Lui, W.L.D., and Li, W.H. (2012, January 3\u20135). Plane-based detection of staircases using inverse depth. Proceedings of the Australasian Conference on Robotics and Automation, Wellington, New Zealand."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Yang, X., and Tian, Y. (2010, January 13\u201318). Robust door detection in unfamiliar environments by combining edge and corner features. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition\u2014Workshops, San Francisco, CA, USA.","DOI":"10.1109\/CVPRW.2010.5543830"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Wang, S., and Tian, Y. (2012, January 11\u201313). Camera-Based Signage Detection and Recognition for Blind Persons. Proceedings of the 13th International Conference Computers Helping People with Special Needs, Linz, Austria.","DOI":"10.1007\/978-3-642-31534-3_3"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Mocanu, B., Tapu, R., and Zaharia, T. (2016). When Ultrasonic Sensors and Computer Vision Join Forces for Efficient Obstacle Detection and Recognition. Sensors, 16.","DOI":"10.3390\/s16111807"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"2907","DOI":"10.1016\/j.eswa.2014.11.017","article-title":"Toward an assisted indoor scene perception for blind people with image multilabeling strategies","volume":"42","author":"Mekhalfi","year":"2015","journal-title":"Expert Syst. Appl."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1246","DOI":"10.1109\/TCSVT.2014.2372371","article-title":"A Compressive Sensing Approach to Describe Indoor Scenes for Blind People","volume":"25","author":"Mekhalfi","year":"2015","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_16","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201325). Histograms of oriented gradients for human detection. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201905), San Diego, CA, USA."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1007\/s13042-010-0001-0","article-title":"Understanding bag-of-words model: A statistical framework","volume":"1","author":"Zhang","year":"2010","journal-title":"Int. J. Mach. Learn. Cybern."},{"key":"ref_18","first-page":"265","article-title":"Object Tracking with Multi-View Support Vector Machines","volume":"17","author":"Zhang","year":"2015","journal-title":"IEEE Trans. Multimed."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"6356","DOI":"10.1109\/TGRS.2013.2296351","article-title":"Detecting Cars in UAV Images with a Catalog-Based Approach","volume":"52","author":"Moranduzzo","year":"2014","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"971","DOI":"10.1109\/TPAMI.2002.1017623","article-title":"Multiresolution gray-scale and rotation invariant texture classification with local binary patterns","volume":"24","author":"Ojala","year":"2002","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1657","DOI":"10.1109\/TIP.2010.2044957","article-title":"A Completed Modeling of Local Binary Pattern Operator for Texture Classification","volume":"19","author":"Guo","year":"2010","journal-title":"IEEE Trans. Image Process."},{"key":"ref_22","first-page":"3371","article-title":"Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion","volume":"11","author":"Vincent","year":"2010","journal-title":"J. Mach. Learn. Res."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1038\/323533a0","article-title":"Learning representations by back-propagating errors","volume":"323","author":"Rumelhart","year":"1986","journal-title":"Nature"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_26","unstructured":"Simonyan, K., and Zisserman, A. (arXiv, 2014). Very Deep Convolutional Networks for Large-Scale Image Recognition, arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"694","DOI":"10.1109\/LGRS.2017.2671922","article-title":"A Deep Learning Approach to UAV Image Multilabeling","volume":"14","author":"Zeggada","year":"2017","journal-title":"IEEE Geosci. Remote Sens. Lett."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/17\/11\/2641\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T18:49:44Z","timestamp":1760208584000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/17\/11\/2641"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,11,16]]},"references-count":27,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2017,11]]}},"alternative-id":["s17112641"],"URL":"https:\/\/doi.org\/10.3390\/s17112641","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2017,11,16]]}}}